Ruby: parsing an input file and putting it into a hash

【Question title】: Ruby parsing input file and put in hash
【Posted】: 2023-03-18 17:56:01
【Question】:

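I have a text file that represents the adjacency list of the nodes in a graph, and I need to parse it. The first line is the total number of nodes. The second line belongs to node1 and lists the nodes it is connected to (the graph is undirected). For example: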

7
2 3 -1
1 3 4 5 7 -1
1 2 -1
2 6 -1
2 6 -1
4 5 -1
2 -1

line1: The graph has 7 nodes in total.
line2: Node1 is connected to Node2 and Node3.
line3: Node2 is connected to Node1, Node3, Node4, Node5 and Node7.

The -1 is kind of useless.

Here is my current Ruby implementation. I'm looking for a way to set it up:

def parse_file(filename)
  total_nodes = `wc -l "#{filename}"`.strip.split(' ')[0].to_i
  node_hash = Hash.new

  File.foreach(filename).with_index do |line, line_num|
    # convert each line into an array
    line = line.strip.split(" ")
    # take out weird -1 at the end of txt file in each line
    line = line[0...-1]
    #puts "#{line_num}: #{line}"

    # how come node_hash[Node.new(line_num)] = line does not work?
    node_hash[Node.new(line_num)] = line
  end
end

parse_file('test_data.txt')

My Node class has an adjacent_nodes array that I can push node2 and node3 into. For example: node1.adjacent_nodes

class Node
  attr_accessor :id, :state, :adjacent_nodes, :graph

  def initialize(id)
    @id = id
    @adjacent_nodes = []
  end

  def to_s
    "node #{@id}"
  end
end

What is the cleanest way to loop through this text file, create new nodes, store them in a hash, and push all of their adjacent nodes?

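【Comments】:

I'm not a Ruby person at all, but I have done similar things, so let me have a go.
If you can modify the contents of the file, you might consider a few changes. First, I'd suggest removing the node count on the first line, since it can be determined by counting the lines. Second, start each line with the node number. That makes the file easier for humans to read and avoids the need to keep the lines ordered by line number. Third, for each node, list only the adjacent nodes with higher numbers. Because the graph is undirected, the reverse connections can easily be derived. This is especially useful when testing with hand-made files.
I agree, the input file is too weird. It didn't occur to me to edit it directly instead of parsing it with Ruby!
Developers design weird input formats for their programs all the time. The flexibility to reverse-engineer them and to write readers and writers for them is useful, too.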
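
The file layout suggested in the first comment could be parsed roughly as follows. This is only a sketch under assumptions: the sample text, the `COMPACT_SAMPLE` constant and the `parse_compact` method are made up for illustration, and the layout (node number first, then only the higher-numbered neighbours, no count line and no -1 terminator) is one reading of that comment, not the question's actual file.

```ruby
# Hypothetical compact format: each line starts with the node's own number,
# followed only by its higher-numbered neighbours.
COMPACT_SAMPLE = <<~TEXT
  1 2 3
  2 3 4 5 7
  4 6
  5 6
TEXT

# Returns a hash of node id => sorted array of neighbour ids.
def parse_compact(text)
  adjacency = {}

  text.each_line do |line|
    id, *neighbours = line.split.map(&:to_i)
    next if id.nil? # skip blank lines

    adjacency[id] ||= [] # register the node even if it lists no neighbours
    neighbours.each do |other|
      # the graph is undirected, so record the edge in both directions
      adjacency[id] << other
      (adjacency[other] ||= []) << id
    end
  end

  adjacency.transform_values { |ids| ids.uniq.sort }
end

puts parse_compact(COMPACT_SAMPLE)[2].inspect # prints [1, 3, 4, 5, 7]
```

Nodes that never start a line (3, 6 and 7 in the sample) are still created, because they show up as neighbours of lower-numbered nodes.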

Tags: ruby algorithm file hash nodes

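【Solution 1】:

The use of a system call is odd; you really don't need it to get the first line of the file.

The first line gives the number of nodes.

Each line after that lists the adjacent nodes of one node: line n holds the neighbours of node (n-1).

So you can go through it line by line: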

def parse_file(path)

  # start
  f = File.open(path, 'r')

  # get node count. Convert to integer
  num_nodes = f.readline.to_i

  # create your nodes
  nodes = {}
  1.upto(num_nodes) do |id|
    node = Node.new(id)
    nodes[id] = node
  end

  # join them and stuff
  1.upto(num_nodes) do |id|
    node = nodes[id]

    # for each line, read it, strip it, then split it
    tokens = f.readline.strip.split(" ")
    tokens.each do |other_id|
      other_id = other_id.to_i
      break if other_id == -1

      # grab the node object, using the ID as key
      other_node = nodes[other_id]
      node.adjacent_nodes << other_node
    end
  end

  # done
  f.close

  # return the id => node hash so the caller gets the graph
  nodes
end
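
As a variant of the code above, the block form of File.open closes the file automatically, and take_while stops at the -1 terminator. A minimal sketch, assuming a reduced copy of the question's Node class; the method name `parse_adjacency_file` and the constant `GRAPH_TEXT` are hypothetical, and the Tempfile only stands in for test_data.txt.

```ruby
require 'tempfile'

# Reduced copy of the question's Node class so the sketch runs on its own.
class Node
  attr_accessor :id, :adjacent_nodes

  def initialize(id)
    @id = id
    @adjacent_nodes = []
  end
end

# Hypothetical name; returns a hash of node id => Node.
def parse_adjacency_file(path)
  File.open(path, 'r') do |f|
    num_nodes = f.readline.to_i
    nodes = (1..num_nodes).to_h { |id| [id, Node.new(id)] }

    (1..num_nodes).each do |id|
      # keep only the ids that come before the -1 terminator
      ids = f.readline.split.map(&:to_i).take_while { |n| n != -1 }
      nodes[id].adjacent_nodes.concat(ids.map { |n| nodes.fetch(n) })
    end

    nodes # value of the block, and therefore of File.open
  end
end

# The sample data from the question.
GRAPH_TEXT = "7\n2 3 -1\n1 3 4 5 7 -1\n1 2 -1\n2 6 -1\n2 6 -1\n4 5 -1\n2 -1\n"

Tempfile.create('graph') do |tmp|
  tmp.write(GRAPH_TEXT)
  tmp.flush
  nodes = parse_adjacency_file(tmp.path)
  puts nodes[2].adjacent_nodes.map(&:id).inspect # prints [1, 3, 4, 5, 7]
end
```

Using `nodes.fetch(n)` instead of `nodes[n]` is a design choice of this sketch: an id that is not in the hash raises a KeyError instead of silently pushing nil.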

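【Comments】:

【Solution 2】:

This has all the hallmarks of a homework problem, but I'll try to help.

My Node class has an adjacent_nodes array that I can push node2 and node3 into. For example: node1.adjacent_nodes

Do you want to push node IDs into the array, or the node references themselves?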

# how come node_hash[Node.new(line_num)] = line does not work?

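What do you mean by "does not work"? Does it not add the line to your hash?

You are building a hash whose keys are node references and whose values are the adjacent nodes. You are not actually modifying each node's adjacent_nodes attribute. Is that what you want to do? Also, if two lines refer to the same node ID, e.g. 2, you will instantiate that node twice, i.e. Node.new(2) will be called twice. Is that what you want to do?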
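
The point about keys can be seen in a few lines. By default, two separately created objects are different hash keys even when they carry the same id, so a hash keyed by Node.new(...) cannot be queried later with a fresh Node. A small sketch, using a reduced stand-in for the question's Node class; the variable names are illustrative.

```ruby
# Reduced stand-in for the question's Node class.
class Node
  attr_reader :id

  def initialize(id)
    @id = id
  end
end

by_node = {}
by_node[Node.new(2)] = %w[1 3]

# A second Node.new(2) is a different object, so the lookup misses.
puts by_node[Node.new(2)].inspect # prints nil

# Keying by the integer id avoids the problem.
by_id = { 2 => Node.new(2) }
puts by_id[2].id # prints 2
```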

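Looking at what you wrote, I notice a few things:

You use String#strip when you really only want String#chomp to take the line break off the end of the string.
You ignore an important piece of information at the top of the file: the total number of nodes.
Worse, you call a shell command to get that information (just do it in Ruby!), and you get it wrong: you count the line containing the node count as a node definition, so your total_nodes variable is set to eight (not seven).
You ignore an important piece of information at the end of each line. Yes, the -1 terminator is a bit odd, but you can use it to decide when to stop processing the line. I suspect your professor plans to send some bad input, e.g. 2 3 4 -1 5, to see whether your code breaks. In that case, your code should only consider the adjacent nodes 2, 3 and 4.
In all cases, you never convert the strings containing numeric values to proper integers.
What do you want your parse_file method to return? As written, it probably does not return what you think it returns.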
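
Several of these points can be checked directly. The following sketch uses made-up sample strings, and the helper name `adjacent_ids` is hypothetical.

```ruby
# Hypothetical helper: the ids on a line, up to the -1 terminator.
def adjacent_ids(line)
  line.chomp.split(' ').map(&:to_i).take_while { |id| id != -1 }
end

# chomp removes only the trailing line break; strip also removes
# leading and trailing spaces.
puts " 2 3 -1 \n".chomp.inspect # prints " 2 3 -1 "
puts " 2 3 -1 \n".strip.inspect # prints "2 3 -1"

# Anything after the terminator is ignored.
puts adjacent_ids("2 3 4 -1 5\n").inspect # prints [2, 3, 4]

# Without to_i the ids stay strings.
puts "2 3 -1".split(' ').inspect # prints ["2", "3", "-1"]
```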

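With that in mind, let's make some changes:

Let's read the first line of the input to determine the total number of nodes.
Let's preallocate our nodes so that each node is instantiated only once.
Let's use an array (indexed from zero) to hold our node references. We have to keep that in mind when instantiating nodes and when doing lookups in the array (node IDs start at 1).
Let's stop processing adjacent nodes when we see an invalid node ID.
Let's use String#chomp and String#to_i on all input.
Let's actually add the adjacent nodes...
Let's return the array of nodes at the end of the method so the caller gets something useful.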

def parse_file(filename)
  # open the file in read-only mode
  file = File.new(filename, 'r')
  # read the first line as an integer to determine the number of nodes
  num_nodes = file.readline.chomp.to_i
  # preallocate our nodes so we can store adjacent node references
  nodes = Array.new(num_nodes) { |i| Node.new(i + 1) }

  # read the remaining lines containing node definitions
  file.each_line.with_index do |line, i|
    # parse the adjacent node ids as integers
    line.chomp.split(' ').map(&:to_i).each do |node_id|
      # a sentinel node id of -1 means stop processing
      break if node_id < 0

      # TODO: What's supposed to happen when the node doesn't exist?
      #raise "Unknown node ID: #{node_id}" if node_id == 0 || node_id > num_nodes

      # add the node reference to the list of adjacent nodes
      nodes[i].adjacent_nodes << nodes[node_id - 1]
    end
  end

  # close the file handle before returning the node array
  file.close

  nodes
end
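
The TODO in the code above leaves open what should happen with an unknown node ID. One option, sketched here as an assumption rather than as the answer's intent, is to raise an error. The helper name `validate_node_id!` is hypothetical.

```ruby
# Hypothetical helper: returns the id when it names an existing node,
# otherwise raises ArgumentError.
def validate_node_id!(node_id, num_nodes)
  return node_id if node_id.between?(1, num_nodes)

  raise ArgumentError, "Unknown node ID: #{node_id} (expected 1..#{num_nodes})"
end

puts validate_node_id!(3, 7) # prints 3

begin
  validate_node_id!(9, 7)
rescue ArgumentError => e
  puts e.message # prints Unknown node ID: 9 (expected 1..7)
end
```

Inside the parsing loop such a check would go after the `break if node_id < 0` line, so that the -1 terminator is never reported as an error.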

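【Comments】:

Thanks. I don't think I need to worry about those cases. Assume all input files are correct, so there is no 1 -1 4 7 or anything like that. The goal is just to create nodes that each have a list of their adjacent nodes. I'll use that in other parts to build the graph. Cleaning up the original file itself would be much easier, but it is more fun to keep it as it is and see whether I can work with it in Ruby.

【Solution 3】:

One might take advantage of the fact that Ruby supports technically unlimited cross-nesting of objects: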

class Node
  attr_accessor :id, :adjacents
  def initialize(id)
    @id = id
    @adjacents = []
  end
  def to_s
    "<#Node #{@adjacents.map(&:id).inspect}>"
  end
end

class Graph
  attr_accessor :nodes
  def initialize(count)
    @nodes = (1..count).map(&Node.method(:new))
  end
  def to_s
    "<#Graph nodes: {#{@nodes.map(&:to_s)}}>"
  end
end

input = "7\n2 3 -1\n1 3 4 5 7 -1\n1 2 -1\n2 6 -1\n2 6 -1\n4 5 -1\n2 -1"

graph, *nodes = input.split($/)
count = graph.to_i

result =
  nodes.
    each.
    with_index.
    with_object(Graph.new(count)) do |(line, idx), graph|
      graph.nodes[idx].adjacents |=
        line.split.map(&:to_i).
          select { |e| e >= 1 && e <= count }.
          map { |e| graph.nodes[e - 1] }
    end

Now you have an infinitely nested graph (you can call adjacents on any node to get the correct result).

The top-level graph structure can be displayed with:

puts result.to_s
#⇒ <#Graph nodes: {["<#Node [2, 3]>",
#                   "<#Node [1, 3, 4, 5, 7]>",
#                   "<#Node [1, 2]>",
#                   "<#Node [2, 6]>",
#                   "<#Node [2, 6]>",
#                   "<#Node [4, 5]>",
#                   "<#Node [2]>"]}>
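
Because every entry in adjacents is itself a Node, the calls can be chained to walk the graph. A minimal sketch with a reduced copy of the Node class and a hand-built three-node graph; the `build_chain` helper and the variable names are illustrative and not part of the answer.

```ruby
# Reduced copy of the answer's Node class.
class Node
  attr_accessor :id, :adjacents

  def initialize(id)
    @id = id
    @adjacents = []
  end
end

# Hand-built undirected graph: 1 - 2 - 3.
def build_chain
  n1, n2, n3 = (1..3).map { |id| Node.new(id) }
  n1.adjacents << n2
  n2.adjacents << n1 << n3
  n3.adjacents << n2
  [n1, n2, n3]
end

n1, = build_chain

# Each hop returns Node objects, so the calls nest as deep as needed.
puts n1.adjacents.first.adjacents.map(&:id).inspect # prints [1, 3]

# Following the references back leads to the very same object.
puts n1.adjacents.first.adjacents.first.equal?(n1) # prints true
```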

【Comments】: