Reading a large text file in C on iOS: answers

【Question title】: Read large text file in C on iOS
【Posted】: 2024-01-08 13:38:01
【Question description】:

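I have a VB6 program that reads data from SQL Server and writes it to text files. Each record is terminated by a newline. These files (which can be larger than 200 MB) have to be read on an iPad and written into an SQLite database. To avoid memory warnings, I read the file one line at a time with the C function below.

"strRet" is the string read in C.

"NSString *stringa" is the C string converted to an NSString.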

NSDictionary *readLineAsNSString(FILE *f,int pospass,BOOL testata,int primorecord  )
{
    char *strRet = malloc(BUFSIZ); // (char *) cast removed, because in C it could return an int
    if (strRet==NULL)
    {
        return nil;
    }

    int size = BUFSIZ;

    BOOL finito=NO;
    int pos = 0;
    int c;
    fseek(f,pospass,SEEK_SET);

    do{ // read one line

        c = fgetc(f);

        if (pos >= size-1)
        {
            size=size+BUFSIZ;
            strRet = realloc(strRet, size);
            if (strRet==NULL)
            {
                return nil;
            }

        }

        if(c != EOF)
        {
            strRet[pos] = c;
            pos=pos+1;
        }
        else
        {
            finito=YES;
        }

    }while(c != EOF && c != '\n');

    if (pos!=0)
    {
        strRet[pos] = '\0';
    }

    NSString *stringa=[NSString stringWithCString:strRet encoding:NSASCIIStringEncoding];

    if (pos==0)
    {
        stringa=@"";
    }

    long long sizerecord;
    if (pos!=0)
    {
        sizerecord=   (long long) [[NSString stringWithFormat:@"%ld",sizeof(char)*(pos)] longLongValue];
    }
    else
    {
        sizerecord=0;
    }
    pos = pospass + pos;

    NSDictionary *risultatoc = @{st_risultatofunzione: stringa,
                                 st_criterio: [NSString stringWithFormat:@"%d",pos],
                                 st_finito: [NSNumber numberWithBool:finito],
                                 st_size: [NSNumber numberWithLongLong: sizerecord]
                                 };

    // Free the buffer
    free(strRet);
    // free(tmpStr);
    strRet=NULL;

    return risultatoc;

}

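However, when the file contains special characters (for example the € sign, accented letters, or letters used in some Nordic countries), the record is not read correctly and I end up with an NSString full of random characters instead of the right ones. Do you know how to help me? Thanks!

【Question discussion】:

In the stringWithCString:encoding: method, try changing the encoding to NSWindowsCP1252StringEncoding. See madore.org/~david/computers/unicode/cstab.html#CP1252

Tags: ios c text ascii non-ascii-characters

【Solution 1】:

The following line tells iOS that you have ASCII data: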

NSString *stringa= [NSString stringWithCString:strRet encoding:NSASCIIStringEncoding];

However, the € sign and accented letters are not part of ASCII. So you evidently have a different encoding.
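A quick way to confirm that a record really contains non-ASCII data is to look for any byte above 0x7F. The following is a minimal sketch in plain C; the helper name `is_pure_ascii` is made up for illustration and is not part of the original code.

```c
#include <stddef.h>

/* Hypothetical helper (not from the original post): returns 1 if every byte
   lies in the 7-bit ASCII range 0x00..0x7F, and 0 as soon as a byte above
   0x7F is found. */
static int is_pure_ascii(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] > 0x7F)
            return 0;
    }
    return 1;
}
```

If this returns 0 for a line read from the file, decoding that line with NSASCIIStringEncoding cannot be expected to produce the right characters.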

Find out which encoding it is (e.g. UTF-8, Windows ANSI, ISO-8859-1) and update the line accordingly, for example:

NSString *stringa= [NSString stringWithCString:strRet encoding: NSWindowsCP1252StringEncoding];

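Update

Figuring out which encoding is in use can be tricky.

In my experience, VB6 and SQL Server are a good combination in that they usually don't mess up the encoding. The weak part is the text file, which depends on an encoding but carries no explicit information about which one was used. VB6 probably uses the Windows default, which depends on your language settings. Unfortunately, I don't know where you can look up the default encoding in Windows.

In Western countries the encoding is usually Windows ANSI, aka code page 1252 (that is where the constant NSWindowsCP1252StringEncoding comes from).

You can more or less verify it. If you open a text file that contains the euro sign (€) and it is encoded in CP 1252, the sign must be stored as the value 80 (hex). In Latin-1 (aka ISO-8859-1) the euro sign cannot be represented. In Latin-9 (aka ISO-8859-15) it is A4 (hex). In UTF-8 it takes three bytes: E2 82 AC.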
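The byte values above can be turned into a rough check. This is only a sketch: the function name `guess_euro_encoding` and the returned labels are made up for illustration, and the result is a hint that is meaningful only for a buffer known to contain a euro sign (for instance, 0xA4 is also a valid but different character in CP 1252, and 0x80 can appear as a continuation byte in other UTF-8 sequences).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: guesses the encoding of a buffer that is known to
   contain a euro sign, based on the byte patterns listed in the answer. */
static const char *guess_euro_encoding(const unsigned char *buf, size_t len)
{
    int saw_80 = 0; /* 0x80: euro sign in CP 1252 */
    int saw_a4 = 0; /* 0xA4: euro sign in ISO-8859-15 (Latin-9) */

    for (size_t i = 0; i < len; i++) {
        /* E2 82 AC: the three-byte UTF-8 form of the euro sign */
        if (i + 2 < len && buf[i] == 0xE2 && buf[i + 1] == 0x82 && buf[i + 2] == 0xAC)
            return "UTF-8";
        if (buf[i] == 0x80)
            saw_80 = 1;
        if (buf[i] == 0xA4)
            saw_a4 = 1;
    }
    if (saw_80)
        return "CP1252";
    if (saw_a4)
        return "ISO-8859-15";
    return "unknown";
}
```

The UTF-8 pattern is checked first because it is the most specific of the three; the single-byte checks are only consulted when no such sequence is found.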

So check it yourself. If you are unsure, add a hex dump of the relevant part of the text file.
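If no hex editor is at hand, a dump can be produced with a few lines of C. The sketch below assumes the file was opened in binary mode; the name `dump_hex` and its parameters are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: prints up to `count` bytes of `f`, starting at byte
   `offset`, as two-digit hex values, 16 per line. Returns the number of
   bytes printed, or -1 if seeking fails. */
static long dump_hex(FILE *f, long offset, long count)
{
    if (fseek(f, offset, SEEK_SET) != 0)
        return -1;

    long n = 0;
    int c;
    while (n < count && (c = fgetc(f)) != EOF) {
        /* newline after every 16th byte, otherwise a space */
        printf("%02X%c", (unsigned)c, (n % 16 == 15) ? '\n' : ' ');
        n++;
    }
    if (n % 16 != 0)
        putchar('\n');
    return n;
}
```

Calling it with the offset of a record that contains a € sign shows directly whether the character is stored as 80, A4, or E2 82 AC.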

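【Discussion】:

How do I find the right encoding? The file is written on Windows by a VB6 program, and the data is read from SQL Server.
Thanks, the right encoding is Windows 1252. The "€" is recognized as 80, and my app can now correctly read all 255 characters in the file. en.wikipedia.org/wiki/Windows-1252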