【Question title】: Finding maximum temperature for every month in a csv file?
【Posted】: 2016-12-17 17:21:55
【Question】:

I need some help. I have a large csv file (8785+ lines).

What I basically need is the maximum temperature for each month. For example (output):

Month Max Temperature

January 5.3
February 6.1
March 25.5
...

This is what I wrote:

temp = open("weather_2012.csv","r")
total = 0
maxt = 0.0

for line in temp:
    try:
        p = float(line.split(",")[1])
        total += 1
        maxt = max(maxt,p)
    except:
        pass

print("Maximum:",maxt)

But it gives only one maximum temperature for all months combined:

Maximum: 33.0

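【Comments】:

  • You are not using the Date values in the file (the first column) to filter by month, and the sample provided only shows 5 days of January.
  • Sorry, my file is too big and the online csv viewer only shows the first 100 rows. How do I filter by date? That is my problem, because each month has a different number of days.

Tags: python python-3.x csv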
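To address the follow-up question in the comments: the month can be read directly from each timestamp, so the differing number of days per month does not matter. A minimal sketch, assuming the first column has the form `2012-01-01 00:00:00`; `month_of` is a hypothetical helper name, not something from the thread:

```python
from datetime import datetime

def month_of(timestamp):
    """Return the month number (1-12) of a 'YYYY-MM-DD HH:MM:SS' string."""
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").month

print(month_of("2012-01-01 00:00:00"))  # 1
print(month_of("2012-02-29 23:00:00"))  # 2
```

Each row can then be assigned to its month by this number, and a separate maximum kept per month.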


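【Solution 1】:

I think this is a good way to do it because it avoids hard-coding many (if not most) of the values into the code (so it works for any year and uses locale-specific month names):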

from calendar import month_name
import csv
from datetime import datetime
import sys

filename = 'weather_2012.csv'
max_temps = [-sys.maxsize] * 13  # has extra [0] entry

with open(filename, 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)
    next(reader)  # skip header row
    for date, high_temp, *_ in reader:
        month = datetime.strptime(date, '%Y-%m-%d %H:%M:%S').month
        max_temps[month] = max(max_temps[month], float(high_temp))

print('Monthly Max Temperatures\n')
longest = max(len(month) for month in month_name)  # length of longest month name
for month, temp in enumerate(max_temps[1:], 1):
    print('{:>{width}}: {:5.1f}'.format(month_name[month], temp, width=longest))

Output:

Monthly Max Temperatures

  January:   5.3
 February:   6.1
    March:  25.5
    April:  27.8
      May:  31.2
     June:  33.0
     July:  33.0
   August:  32.8
September:  28.4
  October:  21.1
 November:  17.5
 December:  11.9
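The "extra [0] entry" in the list of maxima mirrors how `calendar.month_name` is laid out: index 0 is an empty placeholder, so month numbers 1 to 12 can be used as indices directly. A small sketch of that layout (the actual month strings depend on the active locale, so they are not printed here):

```python
from calendar import month_name

print(len(month_name))       # 13: a placeholder at index 0 plus twelve months
print(repr(month_name[0]))   # '' (the empty placeholder)

# Drop the placeholder to get just the twelve names, in calendar order.
names = list(month_name)[1:]
print(len(names))            # 12
```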


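    【Solution 2】:

    You have to find not one maximum but all twelve. You can start from a list of month names and find the maximum for each month in that list. In your csv file, the month is at character positions 5 to 6 of the first field.

    With this data format…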

    Date/Time,Temp (C),Dew Point Temp (C),Rel Hum (%),Wind Spd (km/h),Visibility (km),Stn Press (kPa),Weather
    2012-01-01 00:00:00,-1.8,-3.9,86,4,8.0,101.24,Fog
    2012-01-01 01:00:00,-1.8,-3.7,87,4,8.0,101.24,Fog
    2012-01-01 02:00:00,-1.8,-3.4,89,7,4.0,101.26,"Freezing Drizzle,Fog"
    2012-01-01 03:00:00,-1.5,-3.2,88,6,4.0,101.27,"Freezing Drizzle,Fog"
    2012-01-01 04:00:00,-1.5,-3.3,88,7,4.8,101.23,Fog
    … to be continued
    

    ...you can find the maxima with this program:

    month=["January","February","March","April","May","June","July",
           "August","September","October","November","December"]
    maxt = {}
    with open("weather_2012.csv","r") as temp:
        for line in temp:
            try: # is there valid data in line?
                m0, p0, *junk = line.split(",")
                p = float(p0)
                m = month[int(m0[5:7])-1]
                try: # do we already have data for this month?
                    maxt[m] = max(p, maxt[m])
                except KeyError: # first data of this month
                    maxt[m] = p
            except (ValueError, IndexError): # skip header or malformed line
                pass
    
    print("Maxima:")        
    for m in month:
        print("%s: %g"%(m,maxt[m]))
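To illustrate the character positions mentioned above, here is a short sketch using one of the sample rows. It also shows, as a side note, that a plain `split(",")` breaks the quoted weather field in two, whereas `csv.reader` keeps it whole; that does not affect the program above, because only the first two fields are used.

```python
import csv

line = '2012-01-01 02:00:00,-1.8,-3.4,89,7,4.0,101.26,"Freezing Drizzle,Fog"'

# Characters 5 and 6 (0-based) of the timestamp hold the two-digit month.
month_number = int(line[5:7])
print(month_number)    # 1

# Naive splitting also cuts inside the quoted last field.
naive = line.split(",")
print(len(naive))      # 9

# csv.reader respects the quotes and returns the 8 real columns.
parsed = next(csv.reader([line]))
print(len(parsed))     # 8
print(parsed[-1])      # Freezing Drizzle,Fog
```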
    


      【Solution 3】:

      Another solution could look like this:

      #-*- coding: utf-8 -*-
      import csv
      import datetime
      import itertools
      import collections
      
      # Python 3: open in text mode with newline='' as the csv module expects
      with open('weather_2012.csv', 'r', newline='') as fd:
          reader = csv.DictReader(fd, delimiter=',')
          rows = []
          for row in reader:
              row['yearmonth'] = datetime.datetime.strptime(row['Date/Time'], '%Y-%m-%d %H:%M:%S').strftime('%Y%m')
              rows.append(row)
      # sort them, because groupby only groups consecutive rows
      rows.sort(key=lambda r: r['yearmonth'])
      ans = collections.OrderedDict()
      for yearmonth, values in itertools.groupby(rows, lambda r: r['yearmonth']):
          ans[yearmonth] = max(float(r['Temp (C)']) for r in values)
      
      print(ans)
      

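      This solution first sorts the data by the year-month string and then uses itertools.groupby. The sort is required because groupby only groups consecutive items that share the same key.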
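Why the sort matters can be seen on a tiny example. The sketch below uses made-up `(yearmonth, temperature)` pairs rather than rows from the real file: on unsorted input `groupby` starts a new group every time the key changes, so the same month shows up more than once.

```python
from itertools import groupby

# Illustrative values only, deliberately out of order.
readings = [("201201", 5.3), ("201202", 6.1), ("201201", -1.8), ("201202", 2.0)]

# Without sorting, each change of key starts a new group.
unsorted_keys = [key for key, _ in groupby(readings, key=lambda r: r[0])]
print(unsorted_keys)  # ['201201', '201202', '201201', '201202']

# After sorting by key, every month forms exactly one group.
readings.sort(key=lambda r: r[0])
monthly_max = {key: max(temp for _, temp in group)
               for key, group in groupby(readings, key=lambda r: r[0])}
print(monthly_max)    # {'201201': 5.3, '201202': 6.1}
```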


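        【Solution 4】:

        First you have to group the values by month using the first column; then you can find the maximum temperature for each month.

        I hope the following code helps: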

        import csv
        months= {
            "01": "January",
            "02": "February",
            "03": "March",
            "04": "April",
            "05": "May",
            "06": "June",
            "07": "July",
            "08": "August",
            "09": "September",
            "10": "October",
            "11": "November",
            "12": "December"
        }
        
        weather_file = csv.DictReader(open("weather_2012.csv", 'r'), delimiter=',', quotechar='"')
        
        results = {}
        
        for row in weather_file:
            # get month
            month = row["Date/Time"].split(" ")[0].split("-")[1]
            if not (month in results):
                results[month] = {
                    "max": float(row["Temp (C)"])
                }
                continue
        
            if float(row["Temp (C)"]) > results[month]["max"]:
                results[month]["max"] = float(row["Temp (C)"])
        
        # ordering and showing
        print("Max temp by month:")
        for month in sorted(results):
            # process each month as needed; in this case just print it
            print("%s: %.1f" % (months[month], results[month]["max"]))
        

        Output:

        Max temp by month:
        January: 5.3
        February: 6.1
        March: 25.5
        April: 27.8
        May: 31.2
        June: 33.0
        July: 33.0
        August: 32.8
        September: 28.4
        October: 21.1
        November: 17.5
        December: 11.9
