【Question title】: Reading a line from a StreamReader without consuming?
【Posted】: 2010-10-24 22:56:19
【Question description】:

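Is there a way to read ahead one line to test whether the next line contains specific tag data?

I'm dealing with a format that has a start tag but no end tag.

I would like to read a line, add it to a structure, then test the line below to make sure it is not a new "node". If it isn't, I keep adding; if it is, I close off that structure and make a new one.

The only solution I can think of is to have two stream readers going at the same time, shuffling along in lock step, but that seems wasteful (if it will even work).

I need something like Peek, but PeekLine.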

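【Question comments】:

  • I don't think a PeekLine method is a good way to deal with the "no end tag" problem, because you always have to peek at the line and test whether a new structure starts. I would rather set the stream position back to the previous line, so that the next ReadLine returns the line you already read.

Tags: c# readline streamreader gedcom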


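【Solution 1】:

The problem is that the underlying stream may not even be seekable. If you look at the StreamReader implementation, it uses a buffer, so it can implement TextReader.Peek() even when the stream is not seekable.

You could write a simple adapter that reads the next line and buffers it internally, something like this: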

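    using System.Collections.Generic;
    using System.IO;

    public class PeekableStreamReaderAdapter
    {
        private readonly StreamReader Underlying;
        private readonly Queue<string> BufferedLines;

        public PeekableStreamReaderAdapter(StreamReader underlying)
        {
            Underlying = underlying;
            BufferedLines = new Queue<string>();
        }

        // Reads one more line ahead and buffers it. Each call advances the
        // look-ahead, so repeated calls return successive lines, not the same one.
        public string PeekLine()
        {
            string line = Underlying.ReadLine();
            if (line == null)
                return null;
            BufferedLines.Enqueue(line);
            return line;
        }

        // Returns buffered look-ahead lines first, then reads from the underlying reader.
        public string ReadLine()
        {
            if (BufferedLines.Count > 0)
                return BufferedLines.Dequeue();
            return Underlying.ReadLine();
        }
    }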
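
If repeated calls to PeekLine() should keep returning the same upcoming line, which is what the name suggests, the adapter can cache a single line instead of queueing. The following is only a minimal sketch of that idea; the class name `PeekableLineReader` and the `hasPeeked` flag are illustrative and not part of the original answer.

```csharp
using System;
using System.IO;

// Hypothetical variant of the adapter: PeekLine() is idempotent, so calling it
// repeatedly returns the same upcoming line until ReadLine() consumes it.
public class PeekableLineReader
{
    private readonly TextReader underlying;
    private string peeked;   // line read ahead but not yet consumed
    private bool hasPeeked;  // distinguishes "nothing cached" from a cached null (end of input)

    public PeekableLineReader(TextReader underlying)
    {
        if (underlying == null) throw new ArgumentNullException("underlying");
        this.underlying = underlying;
    }

    // Returns the next line without consuming it, or null at end of input.
    public string PeekLine()
    {
        if (!hasPeeked)
        {
            peeked = underlying.ReadLine();
            hasPeeked = true;
        }
        return peeked;
    }

    // Returns the next line and advances past it.
    public string ReadLine()
    {
        if (hasPeeked)
        {
            hasPeeked = false;
            string line = peeked;
            peeked = null;
            return line;
        }
        return underlying.ReadLine();
    }
}
```

A parsing loop can then call PeekLine() to check whether the next line starts a new node, and call ReadLine() only for lines that belong to the current one.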

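【Comments】:

  • I would initialize BufferedLines before using it :) Also, I would use another name for PeekLine(), since the name suggests it always returns the same line (the next line from the position of the last ReadLine). Already voted +1.
  • Thanks, added the initializer. I never even compiled the code. Maybe something like LookAheadReadLine() would be more appropriate.
  • I expanded on this slightly so that the class inherits from TextReader: gist.github.com/1317325
  • @AndyEdinborough Love the PeekableTextReader.
  • @AndyEdinborough You just saved me two hours. Nice work, thanks a lot!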
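【Solution 2】:

You could store the position by reading StreamReader.BaseStream.Position, then read the next line, do your test, and then seek back to the position from before you read the line: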

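            // Peek at the next line
            long peekPos = reader.BaseStream.Position;
            string line = reader.ReadLine();

            if (line != null && line.StartsWith("<tag start>"))
            {
                // This is a new tag, so we reset the position
                reader.BaseStream.Seek(peekPos, SeekOrigin.Begin);
                // The reader buffers its input, so drop the stale buffer after seeking.
                // Caveat: because of that buffering, BaseStream.Position can be ahead of
                // the line just read, so peekPos is not guaranteed to be a line start.
                reader.DiscardBufferedData();
            }
            else
            {
                // This is part of the same node.
            }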

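That is a lot of seeking and re-reading of the same lines. With a little logic you can avoid it altogether: for instance, when you see a new tag start, close out the existing structure and start a new one. Here is a basic algorithm: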

        SomeStructure myStructure = null;
        while (!reader.EndOfStream)
        {
            string currentLine = reader.ReadLine();
            if (currentLine.StartsWith("<tag start>"))
            {
                // Close out existing structure.
                if (myStructure != null)
                {
                    // Close out the existing structure.
                }

                // Create a new structure and add this line.
                myStructure = new SomeStructure();
                // Append to myStructure.
            }
            else
            {
                // Add to the existing structure.
                if (myStructure != null)
                {
                    // Append to existing myStructure
                }
                else
                {
                    // This means the first line was not part of a structure.
                    // Either handle this case, or throw an exception.
                }
            }
        }
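
As a runnable illustration of the same idea, the sketch below groups lines into nodes without any look-ahead. The names `NodeGrouper`, `Group` and `isNodeStart` are hypothetical, and a node is represented simply as a list of its lines; a real parser would build its own structure at the same points.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: splits a stream of lines into nodes, where a node
// begins at every line for which isNodeStart returns true.
public static class NodeGrouper
{
    public static List<List<string>> Group(TextReader reader, Func<string, bool> isNodeStart)
    {
        var nodes = new List<List<string>>();
        List<string> current = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (isNodeStart(line))
            {
                // The previous node is already stored in nodes, so starting a
                // new list is all that is needed to "close" it.
                current = new List<string>();
                nodes.Add(current);
            }
            else if (current == null)
            {
                // The first line was not part of any node.
                throw new InvalidDataException("Line found before the first node: " + line);
            }
            current.Add(line);
        }
        return nodes;
    }
}
```

For example, `NodeGrouper.Group(new StringReader("0 A\n1 x\n0 B"), l => l.StartsWith("0 "))` produces two nodes: the first holds "0 A" and "1 x", the second holds "0 B".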

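【Comments】:

【Solution 3】:

Why the difficulty? Read the next line regardless. Check whether it is a new node; if it is not, add it to the current structure. If it is, create a new structure.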

// Struct, IsNode and reader are placeholders for your own types and helpers.
var structs = new List<Struct>();
Struct current = null;
string line;
while ((line = reader.ReadLine()) != null)
{
    if (IsNode(line))
    {
        // A new node starts: store the finished one and begin another.
        if (current != null) structs.Add(current);
        current = new Struct();
        continue;
    }
    // Whatever processing you need to do
    if (current != null) current.AddLine(line);
}
if (current != null) structs.Add(current); // Add the last one to the collection

// Use your structures here
foreach (Struct s in structs)
{

}

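【Comments】:

    【Solution 4】:

    Here is what I have done so far. I took more of a split route than the line-by-line StreamReader route.

    I'm sure there are a few places that could be more elegant, but for now it seems to be working.

    Please let me know what you think.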

    struct INDI
        {
            public string ID;
            public string Name;
            public string Sex;
            public string BirthDay;
            public bool Dead;
    
    
        }
        struct FAM
        {
            public string FamID;
            public string type;
            public string IndiID;
        }
        List<INDI> Individuals = new List<INDI>();
        List<FAM> Family = new List<FAM>();
        private void button1_Click(object sender, EventArgs e)
        {
            string path = @"C:\mostrecent.ged";
            ParseGedcom(path);
        }
    
        private void ParseGedcom(string path)
        {
            //Open path to GED file and read the entire block; the using block closes the file afterwards
            string[] Holder;
            using (StreamReader SR = new StreamReader(path))
            {
                //Split on "0 @" for individuals and families (no other info is needed for this instance)
                Holder = SR.ReadToEnd().Replace("0 @", "\u0646").Split('\u0646');
            }
    
            //For each new cell in the holder array look for individuals and families
            foreach (string Node in Holder)
            {
    
                //Sub Split the string on the returns to get a true block of info
                string[] SubNode = Node.Replace("\r\n", "\r").Split('\r');
                //If an individual is found
                if (SubNode[0].Contains("INDI"))
                {
                    //Create new Structure
                    INDI I = new INDI();
                    //Add the ID number and remove extra formatting
                    I.ID = SubNode[0].Replace("@", "").Replace(" INDI", "").Trim();
                    //Find the name and remove extra formatting for the last name
                    I.Name = SubNode[FindIndexinArray(SubNode, "NAME")].Replace("1 NAME", "").Replace("/", "").Trim();
                    //Find the sex and remove extra formatting
                    I.Sex = SubNode[FindIndexinArray(SubNode, "SEX")].Replace("1 SEX ", "").Trim();
    
                    //Determine whether there is a birthday; -1 means no
                    if (FindIndexinArray(SubNode, "1 BIRT ") != -1)
                    {
                        // add birthday to Struct 
                        I.BirthDay = SubNode[FindIndexinArray(SubNode, "1 BIRT ") + 1].Replace("2 DATE ", "").Trim();
                    }
    
                    // Determine whether there is a death tag; FindIndexinArray returns -1 if not found
                    if (FindIndexinArray(SubNode, "1 DEAT ") != -1)
                    {
                        //Convert Y or N to true or false (defaults to false, so no change is needed unless Y is found)
                        if (SubNode[FindIndexinArray(SubNode, "1 DEAT ")].Replace("1 DEAT ", "").Trim() == "Y")
                        {
                            //set death
                            I.Dead = true;
                        }
                    }
                    //add the Struct to the list for later use
                    Individuals.Add(I);
                }
    
                // Start Family section
                else if (SubNode[0].Contains("FAM"))
                {
                    //grab Fam id from node early on to keep from doing it over and over
                    string FamID = SubNode[0].Replace("@ FAM", "").Trim();
    
                    // Multiple children can exist for each family, so this section had to be a bit more dynamic
    
                    // Look at each line of node
                    foreach (string Line in SubNode)
                    {
                        // If node is HUSB
                        if (Line.Contains("1 HUSB "))
                        {
    
                            FAM F = new FAM();
                            F.FamID = FamID;
                            F.type = "PAR";
                            F.IndiID = Line.Replace("1 HUSB ", "").Replace("@","").Trim();
                            Family.Add(F);
                        }
                            //If node for Wife
                        else if (Line.Contains("1 WIFE "))
                        {
                            FAM F = new FAM();
                            F.FamID = FamID;
                            F.type = "PAR";
                            F.IndiID = Line.Replace("1 WIFE ", "").Replace("@", "").Trim();
                            Family.Add(F);
                        }
                            //if node for multi children
                        else if (Line.Contains("1 CHIL "))
                        {
                            FAM F = new FAM();
                            F.FamID = FamID;
                            F.type = "CHIL";
                            F.IndiID = Line.Replace("1 CHIL ", "").Replace("@", "").Trim();
                            Family.Add(F);
                        }
                    }
                }
            }
        }
    
        private int FindIndexinArray(string[] Arr, string search)
        {
            int Val = -1;
            for (int i = 0; i < Arr.Length; i++)
            {
                if (Arr[i].Contains(search))
                {
                    Val = i;
                }
            }
            return Val;
        }
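
The chained Replace calls above depend on the exact text of each line. As an alternative, here is a minimal sketch that splits one line into its parts, assuming every line has the shape `level [@xref@] TAG [value]`, as the tags used in the code above suggest (`0 @...@ INDI`, `1 NAME ...`, `2 DATE ...`). The `GedcomLine` class and its members are hypothetical and not part of the original answer.

```csharp
using System;

// Hypothetical helper: one parsed line of the form "level [@xref@] TAG [value]".
public sealed class GedcomLine
{
    public int Level;
    public string XRef;   // e.g. "I1" for "0 @I1@ INDI", otherwise null
    public string Tag;    // e.g. "INDI", "NAME", "DATE"
    public string Value;  // rest of the line, or "" when absent

    public static GedcomLine Parse(string line)
    {
        // Split into at most three parts: level, tag-or-xref, remainder.
        string[] parts = line.Trim().Split(new[] { ' ' }, 3);
        if (parts.Length < 2) throw new FormatException("Not a level/tag line: " + line);

        var result = new GedcomLine { Level = int.Parse(parts[0]) };
        string rest = parts.Length > 2 ? parts[2] : "";

        if (parts[1].Length > 2 && parts[1].StartsWith("@") && parts[1].EndsWith("@"))
        {
            // "0 @I1@ INDI": the second field is a cross-reference id, the tag follows it.
            result.XRef = parts[1].Trim('@');
            string[] tagAndValue = rest.Split(new[] { ' ' }, 2);
            result.Tag = tagAndValue[0];
            result.Value = tagAndValue.Length > 1 ? tagAndValue[1] : "";
        }
        else
        {
            // "1 NAME John /Smith/": the second field is the tag itself.
            result.Tag = parts[1];
            result.Value = rest;
        }
        return result;
    }
}
```

With something like this, the node type can be tested with `Tag == "INDI"` rather than `Contains("INDI")`, which avoids matching the same letters elsewhere in a line.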
    

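    【Comments】:

    • FAM and INDI are terrible names for those structures (if anyone else may need to read or use your code).
    • They are the names of the tags; I thought that was pretty self-explanatory.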