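【Question title】: Find first line of text according to value in Python
【Posted】: 2019-05-22 03:39:50
【Question description】:

How can I search the list in "file.txt" for the first "latitude, longitude" coordinate that matches a value in Python, and get the 3 lines above it and the 3 lines below it?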

Value

37.0459

file.txt

37.04278,-95.58895
37.04369,-95.58592
37.04369,-95.58582
37.04376,-95.58557
37.04376,-95.58546
37.04415,-95.58429
37.0443,-95.5839
37.04446,-95.58346
37.04461,-95.58305
37.04502,-95.58204
37.04516,-95.58184
37.04572,-95.58139
37.04597,-95.58127
37.04565,-95.58073
37.04546,-95.58033
37.04516,-95.57948
37.04508,-95.57914
37.04494,-95.57842
37.04483,-95.5771
37.0448,-95.57674
37.04474,-95.57606
37.04467,-95.57534
37.04462,-95.57474
37.04458,-95.57396
37.04454,-95.57274
37.04452,-95.57233
37.04453,-95.5722
37.0445,-95.57164
37.04448,-95.57122
37.04444,-95.57054
37.04432,-95.56845
37.04432,-95.56834
37.04424,-95.5668
37.044,-95.56251
37.04396,-95.5618

Expected result

37.04502,-95.58204
37.04516,-95.58184
37.04572,-95.58139
37.04597,-95.58127
37.04565,-95.58073
37.04546,-95.58033
37.04516,-95.57948

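Additional information

On Linux I can get the closest line and do the processing I need with grep, sed, cut and so on, but I would like to do it in Python.

Any help would be greatly appreciated! Thanks.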

【Question comments】:

    Tags: python search text filter find


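    【Solution 1】:

    "How can I search the list in "file.txt" for the first "latitude, longitude" coordinate that matches a value in Python, and get the 3 lines above it and the 3 lines below it?"


    You can try: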

    with open("text_filter.txt") as f:
        text = f.readlines()  # read all lines into a list

    target = "37.0459"  # renamed from `filter` to avoid shadowing the built-in
    # indices of every line that contains the target as a substring
    match = [i for i, x in enumerate(text) if target in x]
    if match:
        m = match[0]  # first matching line
        if m >= 3:  # print the 3 lines before the match, if there are that many
            print("".join(text[m-3:m]).strip())
        print(text[m].strip())
        if len(text) >= m + 4:  # print the 3 lines after the match, if there are that many
            print("".join(text[m+1:m+4]).strip())
    

    Output:

    37.04502,-95.58204
    37.04516,-95.58184
    37.04572,-95.58139
    37.04597,-95.58127
    37.04565,-95.58073
    37.04546,-95.58033
    37.04516,-95.57948
    

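    【Comments】:

    • This solution has a few problems: 1/ searching for "37.044" matches the line "37.04415,-95.58429"; 2/ if the match is within the last 3 lines of the file, it returns a wrong result.
    • Sometimes what users ask for isn't what they need. The OP seems happy with the result, and I'm fine with that. Thanks for pointing it out anyway. GL.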
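
    The points raised in the comments above can be illustrated with a minimal sketch. It is not any poster's code: the function name find_context and the sample list are hypothetical. It compares only against the start of the latitude field and clamps the window at both ends of the list. Note that a prefix comparison still matches "37.04415" when searching for "37.044", which is consistent with the question's own example (37.0459 matching 37.04597); if exact matching is wanted, replace startswith with an equality check.

```python
def find_context(lines, prefix, before=3, after=3):
    """Return the first line whose latitude starts with `prefix`,
    plus up to `before` lines above and `after` lines below it."""
    for i, line in enumerate(lines):
        # Compare against the latitude field only, and only at its start,
        # so the prefix cannot match in the middle of a line.
        lat = line.split(",", 1)[0].strip()
        if lat.startswith(prefix):
            start = max(i - before, 0)  # clamp at the top of the list
            return lines[start:i + after + 1]  # slicing clamps at the bottom
    return []  # no match


# Hypothetical sample: a few rows taken from the question's file.txt
sample = [
    "37.04461,-95.58305",
    "37.04502,-95.58204",
    "37.04516,-95.58184",
    "37.04572,-95.58139",
    "37.04597,-95.58127",
    "37.04565,-95.58073",
]
print(find_context(sample, "37.0459"))
```

    With this sample the match is the second-to-last row, so only one line follows it and the result simply contains fewer trailing lines.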
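    【Solution 2】:

    You can use pandas to import the data into a DataFrame and then manipulate it easily. According to your question, the value to check is not an exact match, so I convert it to a string.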

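    import pandas as pd
    # import the text file as a DataFrame
    data = pd.read_csv("file.txt", header=None, names=["latitude", "longitude"])
    value_to_check = 37.0459  # user defined
    for i in range(len(data)):
        # prefix match on the string form of the latitude
        if str(data.iloc[i, 0]).startswith(str(value_to_check)):
            # clamp the start so a match in the first 3 rows does not give an empty slice
            print(data.iloc[max(i-3, 0):i+4, :])
            break  # stop at the first match; nothing is printed if there is none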
    

    Output

        latitude  longitude
    9   37.04502  -95.58204
    10  37.04516  -95.58184
    11  37.04572  -95.58139
    12  37.04597  -95.58127
    13  37.04565  -95.58073
    14  37.04546  -95.58033
    15  37.04516  -95.57948
    

    【Comments】:

      【Solution 3】:

      A solution using an iterator, which keeps only the necessary lines in memory and does not read the unneeded part of the file:

      from collections import deque
      from itertools import islice
      
      
      def find_in_file(file, target, before=3, after=3):
      
          queue = deque(maxlen=before)
          with open(file) as f:
              for line in f:
                  if target in map(float, line.split(',')):
                      out = list(queue) + [line] + list(islice(f, after))  # use `after`, not a hard-coded 3
                      return out
                  queue.append(line)
              else:
                  raise ValueError('target not found')
      

      Some tests:

      print(find_in_file('test.txt', 37.04597))
      
      # ['37.04502,-95.58204\n', '37.04516,-95.58184\n', '37.04572,-95.58139\n', '37.04597,-95.58127\n',
      #  '37.04565,-95.58073\n', '37.04546,-95.58033\n', '37.04516,-95.57948\n']
      
      print(find_in_file('test.txt', 37.044))  # Only one line after the match
      
      # ['37.04432,-95.56845\n', '37.04432,-95.56834\n', '37.04424,-95.5668\n', '37.044,-95.56251\n', 
      #   '37.04396,-95.5618\n']
      

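      It also works when there are fewer lines than expected before or after the match. We match floats rather than strings, because otherwise '37.04' would wrongly match '37.0444'.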
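
      One consequence of comparing floats is that the question's value 37.0459 is not equal to any latitude in the file, so an exact comparison finds nothing for it. The sketch below is a hedged variation, not the answer's code: the name find_first, the matches callback and the in-memory sample are assumptions made for illustration. It keeps the same deque/islice streaming idea but lets the caller choose between an exact float comparison and a prefix comparison on the latitude text.

```python
from collections import deque
from itertools import islice
import io


def find_first(lines, matches, before=3, after=3):
    """Stream `lines`; return the first line for which `matches(line)` is
    true, with up to `before` lines above and `after` lines below it."""
    lines = iter(lines)
    queue = deque(maxlen=before)  # keeps only the last `before` lines
    for line in lines:
        if matches(line):
            # islice consumes at most `after` further lines from the stream
            return list(queue) + [line] + list(islice(lines, after))
        queue.append(line)
    raise ValueError("target not found")


def exact(line):
    # Exact float comparison on the latitude field.
    return float(line.split(",")[0]) == 37.0459


def prefix(line):
    # Prefix comparison on the latitude text: 37.0459 matches 37.04597.
    return line.split(",")[0].startswith("37.0459")


# Hypothetical sample: three rows from the question's file, held in memory
data = "37.04572,-95.58139\n37.04597,-95.58127\n37.04565,-95.58073\n"

print(find_first(io.StringIO(data), prefix))
```

      Calling find_first(io.StringIO(data), exact) raises ValueError on this sample, since no latitude equals 37.0459 exactly.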

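      【Comments】:

        【Solution 4】:

        This solution prints the elements before and after the match even when there are fewer than 3 of them. I also use strings, because the question implies that you want partial matches too, i.e. 37.0459 will match 37.04597.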

        search_term = '37.04462'
        with open('file.txt') as f:
            # strip the trailing '\n' and split each line into [lat, lon]
            lines = [line.strip().split(',') for line in f]
        index = None
        for i, (lat, lon) in enumerate(lines):
            if search_term in lat:  # substring match on the latitude
                index = i
                break
        if index is not None:  # avoid a NameError when nothing matches
            # clamp the window to the bounds of the list
            start = max(index - 3, 0)
            stop = min(index + 4, len(lines))  # stop is exclusive
            for lat, lon in lines[start:stop]:
                print(lat, lon)
        

        【Comments】:
