Calling _findfirst with non-ASCII file names

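【Question title】: Calling _findfirst with non-ASCII file names
【Posted】: 2016-10-11 05:17:11
【Question】:

I have a multibyte Windows project in which I try to access a file whose name may contain any character that modern Windows allows. But I fail miserably whenever the file name contains non-ASCII characters (Japanese, Swedish, Russian, etc.).

For example: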

const char * filename_ = "C:\\testÖ.txt";
struct _finddata_t fd;
intptr_t fh = _findfirst(filename_, &fd);  /* _findfirst() returns intptr_t, not long */

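At this point _findfirst() fails.

What is the best solution for supporting all possible file names here? I have read that _findfirst() depends on the system locale that is set when the program starts. Well, I could change it for a particular file, but in that case how do I determine which locale a given file name requires?

The project must remain multibyte.

Has anyone solved this before?

I also tried a wide-character conversion, with no luck either. Code sample below: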

debug_prnt("DEBUG: Checking existance of a file: %s\n", filename_);
struct _wfinddata_t ff;
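size_t requiredSize = mbstowcs(NULL, filename_, 0);
if (requiredSize == (size_t)(-1))
{
    debug_prnt("ERROR: Couldn't convert string--invalid multibyte character.\n");
    return FALSE;
}
wchar_t * filename = (wchar_t *)malloc((requiredSize + 1) * sizeof(wchar_t));
if (!filename)
{
    debug_prnt("ERROR: Memory allocation failed\n");
    return FALSE;
}
size_t size = mbstowcs(filename, filename_, requiredSize + 1);
if (size == (size_t)(-1))
{
    debug_prnt("ERROR: Couldn't convert string--invalid multibyte character.\n");
    free(filename);  /* do not leak the buffer on failure */
    return FALSE;
}

intptr_t fh = _wfindfirst(filename, &ff);  /* the handle is intptr_t, not long */
if (fh != -1)  /* -1 is the only failure value */
{
    debug_prnt("DEBUG: File exists\n");
    _findclose(fh);  /* release the search handle */
}
else
    debug_prnt("DEBUG: File does not exist %ls\n", filename);
free(filename);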

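【Comments】:

The documentation for _findfirst() and its variants is at msdn.microsoft.com/en-us/library/zyzxfzac.aspx, and it looks to me like you should be using _wfindfirst(). In general, these days I stick with UNICODE and wide characters, since that is what the Windows API expects. Why are you using strlen()? That implies your original filename_ holds char text rather than wchar_t text, so that may be where your problem lies.
The strlen was a mistake. I found the correct length calculation on an IBM forum and updated the code here, but I still can't find the file. I am also using _wfindfirst, but no luck so far.
Is this the actual code you are using? The example at cplusplus.com/reference/cstdlib/mblen for mblen() and mbtowc() shows a reset of both functions, and it does things differently from the way you are doing them.
I think you want to use mbstowcs() instead. cplusplus.com/reference/cstdlib/mbstowcs
I was using the IBM example ibm.com/support/knowledgecenter/ssw_ibm_i_71/rtref/mbtowc.htm

Tags: c localization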

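【Answer 1】:

Here is a short but complete Windows console application that uses the functions you want to use.

What this program does is create a file in the current working folder to serve as something to find, and then list the files in the current working folder that have the extension .txt.

For the search criteria I am using a hard-coded string, "*.txt", which the program converts to wide characters with mbstowcs(). In your case you would probably accept the string as a multibyte string, convert it to wide characters, and then use it with _wfindfirst().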
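The conversion step described above could be wrapped up roughly as follows. This is only a sketch: mb_to_wide is a hypothetical helper name, not part of the CRT, and it relies on mbstowcs() interpreting the bytes according to the current LC_CTYPE locale, so it only gives the right result when the input really is encoded in that locale's code page.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: converts a multibyte string to a newly allocated
   wide string using the current LC_CTYPE locale. Returns NULL on failure;
   the caller releases the result with free(). */
wchar_t *mb_to_wide(const char *mb)
{
    assert(mb != NULL);

    /* First pass: count the wide characters needed (terminator excluded). */
    size_t needed = mbstowcs(NULL, mb, 0);
    if (needed == (size_t)-1)
        return NULL;    /* byte sequence is invalid in the current locale */

    wchar_t *wide = (wchar_t *)malloc((needed + 1) * sizeof(wchar_t));
    if (wide == NULL)
        return NULL;

    /* Second pass: convert, including the terminating null. */
    if (mbstowcs(wide, mb, needed + 1) == (size_t)-1) {
        free(wide);
        return NULL;
    }
    return wide;
}
```

On Windows the returned buffer can be handed straight to _wfindfirst(). If the source bytes are UTF-8 rather than the active ANSI code page, this conversion will not produce the intended characters.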

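However, in my setup printf() seems to have a text conversion problem, so the non-ASCII text printed to the console contains an odd character. The debugger shows it fine, though.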

// multibyte_file_search.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"

#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <locale.h>
#include <io.h>

int _tmain(int argc, _TCHAR* argv[])
{
    const char * filename_ = "testÖ.txt";
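    FILE *fp = fopen (filename_, "w");
    if (fp == NULL) {
        // nothing to search for if the test file could not be created
        printf ("Could not create %s\n", filename_);
        return 1;
    }
    fclose(fp);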

    // test out mbstowcs()
    wchar_t *wcsFileName_ = new wchar_t[512];
    size_t requiredSize = mbstowcs(NULL,filename_,0);   // (size_t)-1 means an invalid multibyte sequence
    size_t xsize = mbstowcs(wcsFileName_,filename_,512);
    printf ("mbstowcs() required %d, return %d\n", (int)requiredSize, (int)xsize);

    // do an actual directory search on the current working directory.
    printf ("\n\n Directory search begins.\n");
    struct _wfinddata_t ff = {0};
    char *csFileName_ = new char[512];
    strcpy (csFileName_, "*.txt");
    xsize = mbstowcs(wcsFileName_,csFileName_,512);  // convert search to wide character.
    intptr_t  fh = _wfindfirst(wcsFileName_, &ff);

    if (fh != -1) {
        do {
            wcstombs (csFileName_, ff.name, 512);
            printf (" ff.name %S and converted name %s \n", ff.name, csFileName_);
            wprintf (L"     ff.name %s and converted name %S \n", ff.name, csFileName_);
        } while (_wfindnext (fh, &ff) == 0);
        _findclose (fh);
    } else {
        printf ("No files in directory.\n");
    }

    delete [] wcsFileName_;
    delete [] csFileName_;
    return 0;
}

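【Comments】:

But you are not using the wcsFileName_ produced by mbstowcs in the _wfindfirst call. Try giving it an actual file name instead of a mask. Well, mine failed with DEBUG: Checking existance of a file: C:\testÖ.txt Handle: -1 DEBUG: File does not exist C:\testÖ.txt, errno: 2; here I printed the file handle and errno after calling _wfindfirst.
There you go. Good luck.
Dude, if you already know the file name then you don't need _findfirst(). Just use the file name you already know. _findfirst() is for searching a directory with a criterion in order to build a list of the files in the directory that match it.
AFAIK, I know the file name, but I don't know whether it exists at the time this function is called. I found the solution. wcstombs is the STL implementation of the conversion, and it fails if a UTF-8 string is passed as the source inside a char pointer. The Microsoft function works better, so you need to use msdn.microsoft.com/en-us/library/windows/desktop/… and it works.
Why ask about multibyte text when it is UTF-8 text? See Difference between MBCS and UTF-8 on Windows, which explains the difference and mentions that you have to use MultiByteToWideChar().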
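For readers landing here, the fix the comments converge on might look roughly like the sketch below, assuming the char string really does hold UTF-8. file_exists_utf8 is a hypothetical name chosen for illustration; the Windows branch uses MultiByteToWideChar() with CP_UTF8 followed by _wfindfirst(), and the non-Windows branch exists only so the sketch compiles elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#ifdef _WIN32
#include <windows.h>
#include <io.h>
#endif

/* Hypothetical helper: returns 1 if the file named by a UTF-8 path exists,
   0 otherwise (including on conversion or allocation failure). */
int file_exists_utf8(const char *utf8_path)
{
    assert(utf8_path != NULL);
#ifdef _WIN32
    /* First call: ask for the required length in wide characters,
       terminator included because the source length is given as -1. */
    int n = MultiByteToWideChar(CP_UTF8, 0, utf8_path, -1, NULL, 0);
    if (n <= 0)
        return 0;

    wchar_t *wide = (wchar_t *)malloc((size_t)n * sizeof(wchar_t));
    if (wide == NULL)
        return 0;

    int found = 0;
    /* Second call: perform the actual UTF-8 to UTF-16 conversion. */
    if (MultiByteToWideChar(CP_UTF8, 0, utf8_path, -1, wide, n) > 0) {
        struct _wfinddata_t fd;
        intptr_t handle = _wfindfirst(wide, &fd);
        if (handle != -1) {     /* -1 is the only failure value */
            found = 1;
            _findclose(handle);
        }
    }
    free(wide);
    return found;
#else
    /* Fallback for other platforms, where paths are plain byte strings. */
    FILE *fp = fopen(utf8_path, "rb");
    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
#endif
}
```

Note that _wfindfirst() treats * and ? in the name as wildcards, so this check is only meaningful for literal file names. If the string is in the ANSI code page instead of UTF-8, CP_ACP would be the code page to pass.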