【Question title】: How to sort a list of student records by categories in Python
【Posted】: 2020-10-09 01:46:41
【Question】:

Suppose I have a nested list that looks like this:

# Student records
data_list=[ # person, subject, grade
   ["John", "Physics", 5], ["John", "PC", 7], ["John", "Math", 8], 
   ["Mary", "Physics", 6], ["Mary", "PC", 10], ["Mary", "Algebra", 7], 
   ["Helen", "Physics", 7], ["Helen","PC", 6], ["Helen", "Algebra", 8], 
   ["Helen", "Analysis", 10], ["Bill", "PC", 10], ["Bill", "Analysis", 6], 
   ["Bill", "Math", 8], ["Bill", "Biology", 6], ["Michael", "Analysis", 10]
]

How can I write code that prints the subjects each student takes? The output should look like this:

# Subjects taken by person
[["John", "Physics", "PC", "Math"],
["Mary", "Physics", "PC", "Algebra"],
...]

I would prefer a solution that uses only Python lists.

【Question discussion】:

    Tags: python list nested names


    【Solution 1】:

    You can try the following function:

    def consolidate(data):
        # list of names from  data_list
        names = list(set([i[0] for i in data])) 
        consolidated = [] # empty list to get the sublists
    
        # Looping for each name
        for name in names:
            subList = [name] # sublist for each name
            for i in data: # loop for each item within data_list
                if i[0] == name: # if name is the same, append the subject
                    subList.append(i[1])
            consolidated.append(subList) # append sublists in consolidated list
        return consolidated
    

    Calling the function:

    consolidate(data_list)
    

    Output:

    [['Mary', 'Physics', 'PC', 'Algebra'], ['Helen', 'Physics', 'PC', 'Algebra', 'Analysis'],
    ['John', 'Physics', 'PC', 'Math'], ['Bill', 'PC', 'Analysis', 'Math', 'Biology'],
    ['Michael', 'Analysis']]
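
    Note that the names come back in a different order than they appear in data_list, because a set has no defined order. If the original order matters, one possible variant (a sketch only; the name consolidate_ordered is hypothetical and not part of the answer) collects the names in a list instead:

```python
data_list = [  # person, subject, grade
    ["John", "Physics", 5], ["John", "PC", 7], ["John", "Math", 8],
    ["Mary", "Physics", 6], ["Mary", "PC", 10], ["Mary", "Algebra", 7],
    ["Helen", "Physics", 7], ["Helen", "PC", 6], ["Helen", "Algebra", 8],
    ["Helen", "Analysis", 10], ["Bill", "PC", 10], ["Bill", "Analysis", 6],
    ["Bill", "Math", 8], ["Bill", "Biology", 6], ["Michael", "Analysis", 10],
]


def consolidate_ordered(data):  # hypothetical name, not from the answer
    # Collect names in order of first appearance, using a list only
    names = []
    for record in data:
        if record[0] not in names:
            names.append(record[0])
    # One sublist per name: the name followed by that person's subjects
    return [[name] + [r[1] for r in data if r[0] == name] for name in names]


result = consolidate_ordered(data_list)
print(result)
```

    With this variant the first sublist is John's, since John is the first name in data_list.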
    

    【Discussion】:

      【Solution 2】:

      How about using a dictionary to collect the subjects for each person:

      from collections import defaultdict
      
      subjects = defaultdict(list)
      for record in data_list: 
          subjects[record[0]].append(record[1])
      
      # subjects
      defaultdict(list,
                  {'John': ['Physics', 'PC', 'Math'],
                   'Mary': ['Physics', 'PC', 'Algebra'],
                   'Helen': ['Physics', 'PC', 'Algebra', 'Analysis'],
                   'Bill': ['PC', 'Analysis', 'Math', 'Biology'],
                   'Michael': ['Analysis']})
      

      This can then be converted to a list of lists:

      subjects_list = [[x] + y for x, y in subjects.items()]
      
      # subjects_list
      [['John', 'Physics', 'PC', 'Math'],
       ['Mary', 'Physics', 'PC', 'Algebra'],
       ['Helen', 'Physics', 'PC', 'Algebra', 'Analysis'],
       ['Bill', 'PC', 'Analysis', 'Math', 'Biology'],
       ['Michael', 'Analysis']]
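
      If you would rather avoid the import, a plain dict with setdefault is one possible variant. This is a sketch rather than part of the original answer, and it assumes Python 3.7 or later, where dicts preserve insertion order, so the result follows the order in which each person first appears:

```python
data_list = [  # person, subject, grade
    ["John", "Physics", 5], ["John", "PC", 7], ["John", "Math", 8],
    ["Mary", "Physics", 6], ["Mary", "PC", 10], ["Mary", "Algebra", 7],
    ["Helen", "Physics", 7], ["Helen", "PC", 6], ["Helen", "Algebra", 8],
    ["Helen", "Analysis", 10], ["Bill", "PC", 10], ["Bill", "Analysis", 6],
    ["Bill", "Math", 8], ["Bill", "Biology", 6], ["Michael", "Analysis", 10],
]

subjects = {}
for person, subject, _grade in data_list:
    # setdefault creates the empty list the first time a person is seen
    subjects.setdefault(person, []).append(subject)

# Convert the mapping to a list of lists: [person, subject, subject, ...]
subjects_list = [[person] + taken for person, taken in subjects.items()]
print(subjects_list)
```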
      

      Edit: As requested by the OP, here is a solution that uses only lists:

      subject_list = []  # output
      persons = []       # track unique persons
      for record in data_list:
          if record[0] not in persons:
              persons.append(record[0])           # Track new person
              subject_list.append([record[0]])    # Add new person to output
              subject_list[-1].append(record[1])  # Add subject for person
          else:
              subject_list[persons.index(record[0])].append(record[1])
      
      # subject_list
      [['John', 'Physics', 'PC', 'Math'],
       ['Mary', 'Physics', 'PC', 'Algebra'],
       ['Helen', 'Physics', 'PC', 'Algebra', 'Analysis'],
       ['Bill', 'PC', 'Analysis', 'Math', 'Biology'],
       ['Michael', 'Analysis']]
      

      EDIT 2: You can generalize this approach to filter out a category (e.g. subject, grade, ...) by any index (e.g. person, ...):

      def groupby(data, index, category):
          """Sort list of records by index and category
          """
          output = []
          indices = []
          for record in data:
              if record[index] not in indices:
                  indices.append(record[index])
                  output.append([record[index]])
                  output[-1].append(record[category])
              else:
                  output[indices.index(record[index])].append(record[category])
          return output
      

      This lets you list the subjects per person like this:

      # index 0 -> person
      # category 1 -> subject
      subject_list = groupby(data_list, 0, 1)
      

      Or you can list the grades per person like this:

      # index 0 -> person
      # category 2 -> grade
      grade_list = groupby(data_list, 0, 2)
      
      # grade_list
      [['John', 5, 7, 8],
       ['Mary', 6, 10, 7],
       ['Helen', 7, 6, 8, 10],
       ['Bill', 10, 6, 8, 6],
       ['Michael', 10]]
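
      The index does not have to be the person column. As an illustration only (the variable name persons_by_subject is made up for this sketch), the same groupby function can group persons by subject by swapping the two arguments:

```python
data_list = [  # person, subject, grade
    ["John", "Physics", 5], ["John", "PC", 7], ["John", "Math", 8],
    ["Mary", "Physics", 6], ["Mary", "PC", 10], ["Mary", "Algebra", 7],
    ["Helen", "Physics", 7], ["Helen", "PC", 6], ["Helen", "Algebra", 8],
    ["Helen", "Analysis", 10], ["Bill", "PC", 10], ["Bill", "Analysis", 6],
    ["Bill", "Math", 8], ["Bill", "Biology", 6], ["Michael", "Analysis", 10],
]


def groupby(data, index, category):
    """Group the values in column `category` by the values in column `index`."""
    output = []
    indices = []
    for record in data:
        if record[index] not in indices:
            # First time this key is seen: start a new group
            indices.append(record[index])
            output.append([record[index], record[category]])
        else:
            # Key already seen: append to its existing group
            output[indices.index(record[index])].append(record[category])
    return output


# index 1 -> subject
# category 0 -> person
persons_by_subject = groupby(data_list, 1, 0)
print(persons_by_subject)
```

      Each sublist then starts with a subject, followed by the persons taking it, in the order the subjects first appear in data_list.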
      

      You can then get the number of subjects taken or the average grade per person like this:

      import statistics
      
      subjects_taken = [len(x) - 1 for x in subject_list]
      average_grade = [statistics.mean(x[1:]) for x in grade_list]
      

      Putting it all together gives you:

      persons = [x[0] for x in subject_list]
      final_list = list(zip(persons, subjects_taken, average_grade))
      
      # final_list
      [('John', 3, 6.666666666666667),
       ('Mary', 3, 7.666666666666667),
       ('Helen', 4, 7.75),
       ('Bill', 4, 7.5),
       ('Michael', 1, 10)]
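
      If you would rather have a list of lists with the average rounded to two decimals, such as ['John', 3, 6.67], one way to sketch it is shown below. The names grades and summary are assumptions made for this example, and the grouping is done with lists only:

```python
import statistics

data_list = [  # person, subject, grade
    ["John", "Physics", 5], ["John", "PC", 7], ["John", "Math", 8],
    ["Mary", "Physics", 6], ["Mary", "PC", 10], ["Mary", "Algebra", 7],
    ["Helen", "Physics", 7], ["Helen", "PC", 6], ["Helen", "Algebra", 8],
    ["Helen", "Analysis", 10], ["Bill", "PC", 10], ["Bill", "Analysis", 6],
    ["Bill", "Math", 8], ["Bill", "Biology", 6], ["Michael", "Analysis", 10],
]

persons = []  # unique persons, in order of first appearance
grades = []   # grades[i] holds the grades of persons[i]
for person, _subject, grade in data_list:
    if person not in persons:
        persons.append(person)
        grades.append([])
    grades[persons.index(person)].append(grade)

# One row per person: [name, number of subjects, average grade rounded to 2 decimals]
summary = [[p, len(g), round(statistics.mean(g), 2)] for p, g in zip(persons, grades)]
print(summary)
```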
      

      【Discussion】:

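      • I just need the code to take its data from data_list and to use only lists
      • @Nontas That code only uses data from data_list. I have added a solution that uses only lists, although you really should take advantage of the other Python container types where possible.
      • Thank you very much! I will try the dictionary solution as well. The code I wrote was very similar to yours, but in the subject list I used 0 instead of -1
      • How can I get a result for each name based on the number of classes and the overall average? It should look like this [['John', 3, 6.67],.....]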