What is wrong with this sorting algorithm in C? Answers

【Question title】: What is wrong with this sorting algorithm in C?
【Posted】: 2019-01-20 18:25:46
【Question】:

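I am working on a problem in which I have to sort a list of 5000 names alphabetically (the names are stored in the text file "names.txt"). As the code below shows, I created a 2D array names[n][m] to store the names.

For each name, I compare it alphabetically against every other name. Whenever the i-th name is alphabetically greater than another name, its alphabetical rank, stored in the array element rank[i], is incremented. For example, when "Mary" is compared with "Denise", Mary's rank increases by 1 because it comes after Denise alphabetically. All ranks start at 1.

This seems to work, since it succeeded when I tested it with the example given in the problem. However, the final answer I get is wrong. More importantly, I found that several names share the same rank (for example, I checked and found that "Columbus" and "Colt" have the same rank). I am not sure why or where my algorithm is flawed, since it looks logically sound to me (?). I tried to make my code more readable by adding some comments, and I would be grateful if someone could explain my mistake. I have only been coding for a few days, so please forgive any beginner errors. Thank you for your time!

Link to the problem: https://projecteuler.net/problem=22

Edit: The code is slightly truncated (I left out the final step, where I simply add all the scores together). However, the error I am describing relates only to the code provided. Thanks!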

#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>

int main() {
    FILE *fp;
    int i, j;
    int a = 0;
    int b = 0;
    fp = fopen("names.txt", "r");
    char names[5200][30] = { 0 };
    int rank[5200] = { 0 }; //Rank corresponds to their alphabetical positions
    unsigned long long score[5200] = { 0 };
    unsigned long long sum = 0;
    for (i = 0; i < 5200; i++) {
        (rank[i])++;  //All the rankings start from 1.
    }
    for (i = 0; !feof(fp); i++) {
        fscanf(fp, "\"%[A-Z]\",", &names[i]); //Scanning and storing the names from the file into the array.
    }

    for (i = 0; i < 5200; i++) {
        for (j = 0; j < 5200; j++) {
            if (i != j && names[i][0] != 0 && names[j][0] != 0) {
                while (names[i][a] == names[j][a]) {  //If the ith word and jth word have the same letter, then increment a (which advances to the next letter).
                    a++;
                }
                if (names[i][a] > names[j][a]) { 
                    (rank[i])++; //If the ith word has a larger letter than the jth word, there will be an increase in its rank.
                } else
                if (names[j][a] == 0 && names[i][a] != 0) { 
                    (rank[i])++; //If the jth word is shorter than the ith word, then i also increases its rank.
                }
            }
            a = 0;
        }
        for (a = 0; a < 30; a++) {
            if (names[i][a] != 0 && names[i][0] != 0) {
                score[i] += (names[i][a] - 64); //Sum up the alphabetical value (as per the question) for each name.
            }
        }
        score[i] = (rank[i]) * (score[i]);
    }

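【Comments】:

strcmp() (#include <string.h>) does most of the work you are doing by hand. Use its return value in your algorithm and the code will be shorter and clearer. It is also likely to be better optimized. pubs.opengroup.org/onlinepubs/9699919799/functions/strcmp.html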
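As a minimal sketch of what this comment suggests, the rank of one name can be found by counting how many names strcmp() reports as smaller. The helper name rank_of and the fixed row width of 30 (taken from the question's array) are assumptions made for illustration; they are not part of the original post.

```c
#include <string.h>

/* Hypothetical helper (not from the original post): returns the 1-based
   alphabetical rank of names[i] among the first n names by counting how
   many names compare strictly less than it. */
int rank_of(char names[][30], int n, int i) {
    int rank = 1;  /* ranks start at 1, as in the question */
    for (int j = 0; j < n; j++) {
        /* strcmp() returns a positive value when names[i] sorts after names[j];
           comparing a name with itself yields 0, so no i != j test is needed. */
        if (strcmp(names[i], names[j]) > 0)
            rank++;
    }
    return rank;
}
```

For the three names "MARY", "DENISE", "ANN" stored in that order, rank_of(names, 3, 0) gives 3, since both other names sort before "MARY".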

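Tags: c algorithm sorting

【Solution 1】:

Your algorithm works, but there are some implementation problems:

The test for (i = 0; !feof(fp); i++) { is incorrect. fscanf() may fail to convert the file contents before reaching the end of the file, causing an infinite loop. You should instead test whether fscanf() returns 1 to indicate success.
You should count the number of words read into the array and limit the loops to that range.
You should not assume that the names in the file are unique. If names i and j have identical contents, the loop while (names[i][a] == names[j][a]) { a++; } runs past the null terminators and has undefined behavior. Indeed, comparing names with strcmp() is simpler and safer.
There is no need to keep the ranks and scores of all names; you can compute the sum on the fly inside the outer loop. This also removes the need for the initialization code.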
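To illustrate the point about duplicate names, here is a sketch of a hand-written comparison that stops at the null terminator, so two identical names cannot send the index past the end of the strings. The function name compare_names is hypothetical; in practice strcmp() already provides this behavior.

```c
/* Hypothetical helper (not from the original post): compares two
   null-terminated names character by character. Returns a negative value,
   zero, or a positive value, following the same convention as strcmp(). */
int compare_names(const char *x, const char *y) {
    int a = 0;
    /* Advance only while the characters match AND the end of x has not been
       reached; the original loop lacked the terminator check. */
    while (x[a] != '\0' && x[a] == y[a])
        a++;
    /* The first differing position (or the shared terminator) decides the order. */
    return (unsigned char)x[a] - (unsigned char)y[a];
}
```

With this check in place, compare_names("ANN", "ANN") returns 0 instead of reading beyond both strings, and a shorter name such as "ANN" sorts before "ANNA" because its terminator compares lower than any letter.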

Here is a corrected and simplified version:

#include <stdio.h>
#include <string.h>

int main() {
    char names[5200][30];
    FILE *fp;
    int i, j, n, a, rank, score;
    unsigned long long sum = 0;

    fp = fopen("p022_names.txt", "r");
    if (fp == NULL)
        return 1;

    // Scan and store the names from the file into the array.
    for (n = 0; n < 5200; n++) {
        if (fscanf(fp, "\"%29[A-Z]\",", names[n]) != 1)
            break;
    }
    fclose(fp);

    // n first names read from file.
    for (i = 0; i < n; i++) {
        rank = 1;
        for (j = 0; j < n; j++) {
            if (strcmp(names[i], names[j]) > 0)
                rank++;
        }
        score = 0;
        for (a = 0; names[i][a] != '\0'; a++) {
            // Sum up the alphabetical value (as per the question) for each name.
            score += names[i][a] - 'A' + 1;
        }
        sum += rank * score;
    }
    printf("Total score of %d names: %llu\n", n, sum);
    return 0;
}

Output:

Total score of 5163 names: 871198282
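The corrected version compares every pair of names, which is fine for about 5000 entries. As an alternative that the answer does not use, the array could be sorted with the standard qsort() function, after which a name's rank is simply its position plus one. The sketch below assumes the same 30-byte rows as the code above; total_score, cmp_names, and NAME_LEN are hypothetical names. Note that it sorts the array in place, and that duplicate names would receive consecutive ranks here, whereas the counting approach gives them equal ranks.

```c
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 30  /* assumed row width, matching names[5200][30] above */

/* qsort() callback: each element is one NAME_LEN-byte row holding a string. */
static int cmp_names(const void *x, const void *y) {
    return strcmp((const char *)x, (const char *)y);
}

/* Hypothetical helper: sorts the first n names in place and returns the sum
   of (alphabetical position) * (letter-value score) over all names. */
unsigned long long total_score(char names[][NAME_LEN], int n) {
    unsigned long long sum = 0;
    qsort(names, (size_t)n, sizeof names[0], cmp_names);
    for (int i = 0; i < n; i++) {
        unsigned long long score = 0;
        /* Letter values: 'A' counts as 1, 'B' as 2, and so on. */
        for (int a = 0; names[i][a] != '\0'; a++)
            score += (unsigned long long)(names[i][a] - 'A' + 1);
        sum += (unsigned long long)(i + 1) * score;  /* rank is index + 1 */
    }
    return sum;
}
```

For the three names "MARY", "DENISE", "ANN", this returns 29 * 1 + 56 * 2 + 57 * 3 = 312.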
