【Question title】: Reading more than one text file and storing the content in an array
【Posted】: 2013-12-20 15:10:26
【Question】:

I wrote a C# program that reads data from 5 text files and counts how often given keywords occur in them.

        string[] word_1 = File.ReadAllText(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment\D1_H1.txt").Split(' ');
        string[] word_2 = File.ReadAllText(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment\D2_H1.txt").Split(' ');
        string[] word_3 = File.ReadAllText(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment\D3_H2.txt").Split(' ');
        string[] word_4 = File.ReadAllText(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment\D4_H2.txt").Split(' ');
        string[] word_5 = File.ReadAllText(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment\D5_H2.txt").Split(' ');
        string[] given_doc = File.ReadAllText(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment\Given_doc.txt").Split(' ');

This is how I read the text files. After reading them, I use a for loop and if statements to count each keyword in those files:

        for (int i = 0; i < word_1.Length; i++)
        {
            string s = word_1[i];

            if ("Red".Equals(word_1[i]))
            {
                //Console.WriteLine(word[i]);
                h1_r++;
            }
            if ("Green".Equals(word_1[i]))
            {
                h1_g++;
            }
            if ("Blue".Equals(word_1[i]))
            {
                h1_b++;
            }
        }

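This is the loop I use to get the counts from one file, and it works fine. I repeat this process 5 times to read all the files. My question is: how can I read these 5 files with a single for loop and store the results (the count of each keyword) in an array?

Thanks in advance!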

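【Comments on the question】:

  • Do the file names matter? Or do you just read all files in that directory?
  • Does your first code block even compile?! ReadAllText() returns a string, not an array.
  • Actually, the number of text files matters, not the file names. I want to get data from multiple text files.
  • Why arrays? Use a List.
  • Actually, I wonder why you have to store them at all instead of reading, counting and processing...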

Tags: c# arrays loops for-loop


【Solution 1】:

A LINQ query is your simplest solution:

var filenames = new[] { "D1_H1.txt", "D2_H1.txt", "D3_H2.txt" };
var words = new[] { "Red", "Green", "Blue" };
var counters = 
  filenames.Select(filename => Path.Combine(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment", filename))
           .SelectMany(filepath => File.ReadAllLines(filepath))
           .SelectMany(line => line.Split(new[] { ' ' }))
           .Where(word => words.Contains(word))
           .GroupBy(word => word, (key, values) => new
              {
                 Word = key,
                 Count = values.Count()
              })
           .ToDictionary(g => g.Word, g => g.Count);

You then have a dictionary of word counters covering all the files:

int redCount = counters["Red"];
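One caveat with the indexer: `GroupBy` only produces groups for words that actually occur, so `counters["Red"]` throws a `KeyNotFoundException` if none of the files contains "Red". A minimal sketch of a safer lookup follows. The class `CounterLookup` and the helper `CountOf` are hypothetical names and are not part of the answer above.

```csharp
using System.Collections.Generic;

static class CounterLookup
{
    // Hypothetical helper: returns 0 instead of throwing when the word never occurred.
    public static int CountOf(IDictionary<string, int> counters, string word)
    {
        int count;
        return counters.TryGetValue(word, out count) ? count : 0;
    }
}
```

With this in place, `int redCount = CounterLookup.CountOf(counters, "Red");` yields 0 for a keyword that is missing from every file.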

If you want to store the counters per file, you can use a slightly modified query:

var filenames = new[] { "D1_H1.txt", "D2_H1.txt", "D3_H2.txt" };
var words = new[] { "Red", "Green", "Blue" };
var counters =
  filenames.Select(filename => Path.Combine(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment", filename))
           .Select(filepath => new
           {
              Filepath = filepath,
              Count = File.ReadAllLines(filepath)
                          .SelectMany(line => line.Split(new[] { ' ' }))
                          .Where(word => words.Contains(word))
                          .GroupBy(word => word, (key, values) => new
                           {
                              Word = key,
                              Count = values.Count()
                           })
                          .ToDictionary(g => g.Word, g => g.Count)
            })
            .ToDictionary(g => g.Filepath, g => g.Count);

Then use it accordingly:

int redCount = counters[@"C:\Users\(...)\D1_H1.txt"]["Red"];

【Comments】:

    【Solution 2】:

    Copy-pasting code is generally a bad idea. It leads to code that violates the Don't Repeat Yourself (DRY) rule. Refactor your code:

    const string path = @"C:\Users\Niyomal N\Desktop\Assignment\Assignment";
    string[] files = new string[] { "D1_H1.txt", "D2_H1.txt", "D3_H1.txt", ... };
    
    foreach (string file in files) {
        string fullPath = Path.Combine(path, file);
        //TODO: count words of file `fullPath`
    }
    

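    Storing the word counts in an array is not the best choice, because you would have to scan the array for every word encountered in the files. Use a dictionary instead, which has constant lookup time. That is much faster.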

    var wordCount = new Dictionary<string, int>();
    

    You can then count the words like this:

    int count;
    if (wordCount.TryGetValue(word, out count)) {
        wordCount[word] = count + 1;
    } else {
        wordCount[word] = 1;
    }
    

    Update

    You can test for keywords like this:

    var keywords = new HashSet<string> { "Red", "Green", "Blue" };
    
    string word = "Green";
    if (keywords.Contains(word)) {
        ...
    }
    

    A HashSet is as fast as a dictionary.

    Mind the letter case. HashSets are case-sensitive by default. If "red", "Red" and "RED" must all be found, initialize the HashSet like this:

    var keywords = new HashSet<string>(StringComparer.InvariantCultureIgnoreCase)
        { "Red", "Green", "Blue" };
    
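Putting the pieces above together, here is a minimal sketch that counts only the keywords, separately for each file. The names `KeywordCounter`, `CountKeywords` and `CountInFiles` are hypothetical, and splitting on whitespace is an assumption about how the files are laid out. Instead of a separate HashSet, the sketch seeds a case-insensitive dictionary with the keywords, so the dictionary doubles as the keyword set and "red", "Red" and "RED" all land in the same entry.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class KeywordCounter
{
    // Counts how often each keyword occurs in the given text, ignoring case.
    public static Dictionary<string, int> CountKeywords(string text, IEnumerable<string> keywords)
    {
        // Seed every keyword with 0 so the result always contains all of them.
        var wordCount = new Dictionary<string, int>(StringComparer.InvariantCultureIgnoreCase);
        foreach (string keyword in keywords)
        {
            wordCount[keyword] = 0;
        }

        // A null separator makes Split break on any whitespace (spaces, tabs, line breaks).
        string[] words = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        foreach (string word in words)
        {
            int count;
            // Only keywords are present in the dictionary, so other words are skipped.
            if (wordCount.TryGetValue(word, out count))
            {
                wordCount[word] = count + 1;
            }
        }
        return wordCount;
    }

    // Reads each file once and returns one keyword dictionary per file name.
    public static Dictionary<string, Dictionary<string, int>> CountInFiles(
        string directory, IEnumerable<string> files, IEnumerable<string> keywords)
    {
        var result = new Dictionary<string, Dictionary<string, int>>();
        foreach (string file in files)
        {
            string fullPath = Path.Combine(directory, file);
            result[file] = CountKeywords(File.ReadAllText(fullPath), keywords);
        }
        return result;
    }
}
```

For the files in the question, a call could look like `var result = KeywordCounter.CountInFiles(path, files, new[] { "Red", "Green", "Blue" });`, after which `result["D1_H1.txt"]["Red"]` holds the count for that file.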

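    【Comments】:

    • How can I use this to count only specific keywords? I mean, I only want the counts of the keywords "Red", "Green" and "Blue" in each document. The documents can contain other words; I only want the counts of those keywords.
    【Solution 3】: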
    // Requires: using System; using System.Collections.Generic; using System.IO; using System.Linq;
    const string dir = @"C:\Users\Niyomal N\Desktop\Assignment\Assignment";
    string[] files = { "D1_H1.txt", "D2_H1.txt", "D3_H2.txt", "D4_H2.txt", "D5_H2.txt", "Given_doc.txt" };

    // Each entry pairs a file name (Key) with one word read from that file (Value).
    var completeList = new List<KeyValuePair<string, string>>();
    foreach (string file in files)
    {
        completeList.AddRange(
            File.ReadAllText(Path.Combine(dir, file)).Split(' ')
                .Select(word => new KeyValuePair<string, string>(file, word)));
    }

    // Group the words by file and count each keyword (case-sensitive, spelled as in the question).
    var result = completeList.GroupBy(r => r.Key).Select(r => new
    {
        File = r.Key,
        Red = r.Count(s => s.Value == "Red"),
        Green = r.Count(s => s.Value == "Green"),
        Blue = r.Count(s => s.Value == "Blue")
    });
    foreach (var itm in result)
    {
        Console.WriteLine(itm.File);
        Console.WriteLine(itm.Red);
        Console.WriteLine(itm.Green);
        Console.WriteLine(itm.Blue);
    }
    

    【Comments】:

    • I want to get the count of each keyword in each file separately and store them in an array.
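None of the answers above stores the result in an array, which is what this comment asks for. A minimal sketch of one way to do it, assuming a two-dimensional `int[,]` indexed by file and by keyword. The names `KeywordTable`, `CountWords` and `CountFiles` are hypothetical, matching is case-sensitive as in the question's loop, and the text is split on whitespace rather than on single spaces only.

```csharp
using System;
using System.IO;

static class KeywordTable
{
    // counts[f, k] = number of times keywords[k] occurs in texts[f] (case-sensitive).
    public static int[,] CountWords(string[] texts, string[] keywords)
    {
        var counts = new int[texts.Length, keywords.Length];
        for (int f = 0; f < texts.Length; f++)
        {
            // A null separator makes Split break on any whitespace.
            string[] words = texts[f].Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
            foreach (string word in words)
            {
                // Array.IndexOf returns -1 when the word is not one of the keywords.
                int k = Array.IndexOf(keywords, word);
                if (k >= 0)
                {
                    counts[f, k]++;
                }
            }
        }
        return counts;
    }

    // Reads every file and delegates the counting to CountWords.
    public static int[,] CountFiles(string directory, string[] files, string[] keywords)
    {
        var texts = new string[files.Length];
        for (int f = 0; f < files.Length; f++)
        {
            texts[f] = File.ReadAllText(Path.Combine(directory, files[f]));
        }
        return CountWords(texts, keywords);
    }
}
```

With the keywords given as `{ "Red", "Green", "Blue" }`, `counts[0, 0]` is the number of "Red" words in the first file, which corresponds to `h1_r` in the question's loop. Scanning the keyword array for every word is fine for three keywords; for long keyword lists the dictionary approach from Solution 2 scales better.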