【Question title】: Iterating through a nested table/spreadsheet
【Posted】: 2017-06-27 14:40:55
【Question】:

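I'm not sure how to get started. I have a number of tab-delimited files that I would like to load into a database. The hard part is that the tables are not laid out in the best way. For example, a parent row is marked with a letter (D), and the rows below it belong to that parent row until the next D row is listed.

Ideally, I would like all the child rows to be on the same row as their parent row, so that I can load the data into a database and query the results (unless there is another way).

Here is a link to the data: http://www.gasnom.com/ip/vector/archive.cfm?type=4

For a better visual representation of the data, see the page below. Before anyone mentions it: I can't scrape the HTML instead, because this is the only data file that has a corresponding website.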

http://www.vector-pipeline.com/Informational-Postings/Index-of-Customers.aspx

【Question discussion】:

    Tags: python python-2.7 csv pandas


    【Answer 1】:

    I think this works. It simply appends a list of "child" rows to the end of each "parent" row in the list of "parent" rows.

    database = []                                                 # all data ends up here
    with open('index_of_customers.txt', 'r') as customer_file:    # closes the file automatically; add error handling as needed
        for each_line in customer_file:                           # reads one line at a time
            each_line = each_line.rstrip('\r\n')                  # removes the trailing newline
            each_line = each_line.split('\t')                     # split the line into a list; empty columns are kept
            if each_line[0] == 'D':                               # line is a parent ("D") row
                each_line.append([])                              # add a list at the end of the D row to hold its child rows
                database.append(each_line)                        # add the D row to the "database"
            elif database:                                        # child row; rows before the first D row are skipped
                database[-1][-1].append(each_line)                # attach the row to the most recent D row
    for each_D_line in database:                                  # prints out the database in a crude way
        print(str(each_D_line[:-1]))                              # first the D row
        for each_other_line in each_D_line[-1]:
            print('\t' + str(each_other_line))                    # then each of its child rows
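
The nested structure above keeps the child rows in a list under each parent. If the goal is one flat row per child with the parent's columns repeated in front (which tends to be easier to load into a database table), a variation might look like the sketch below. The function name `flatten_rows`, the `parent_marker` parameter, and the sample data are made up for illustration; the real file's column layout is not shown in the question, so treat this as a starting point only.

```python
import csv
import io


def flatten_rows(lines, parent_marker="D"):
    """Yield one flat row per child line, prefixed with its parent's columns.

    `lines` is any iterable of text lines (an open file works too).
    """
    parent = None
    for fields in csv.reader(lines, delimiter="\t"):
        if not fields:
            continue                    # skip blank lines
        if fields[0] == parent_marker:
            parent = fields             # remember the most recent parent row
        elif parent is not None:
            yield parent + fields       # parent columns followed by child columns
        # rows before the first parent (e.g. a header) are ignored


# Hypothetical sample data, NOT the real file layout.
sample = io.StringIO(
    "H\theader\n"
    "D\tShipper A\t100\n"
    "P\tPoint 1\t40\n"
    "P\tPoint 2\t60\n"
    "D\tShipper B\t50\n"
    "P\tPoint 3\t50\n"
)

flat = list(flatten_rows(sample))
for row in flat:
    print("\t".join(row))
```

Note that a parent row with no child rows produces no output in this sketch; if those parents matter, they would need to be emitted separately. The flat rows could then be written out with `csv.writer` or inserted into a table, depending on the database being used.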
    

    【Discussion】:
