【Question Title】: Formatting JSON output
【Posted】: 2016-04-21 22:09:00
【Question】:

I have a JSON file containing key-value data. It looks like this:

{
    "professors": [
        {
            "first_name": "Richard", 
            "last_name": "Saykally", 
            "helpfullness": "3.3", 
            "url": "http://www.ratemyprofessors.com/ShowRatings.jsp?tid=111119", 
            "reviews": [
                {
                    "attendance": "N/A", 
                    "class": "CHEM 1A", 
                    "textbook_use": "It's a must have", 
                    "review_text": "Tests were incredibly difficult (averages in the 40s) and lectures were essentially useless. I attended both lectures every day and still was unable to grasp most concepts on the midterms. Scope out a good GSI to get help and ride the curve."
                }, 
                {
                    "attendance": "N/A", 
                    "class": "CHEMISTRY1A", 
                    "textbook_use": "Essential to passing", 
                    "review_text": "Saykally really isn't as bad as everyone made him out to  be. If you go to his lectures he spends about half the time blowing things up, but if you actually read the texts before his lectures and pay attention to what he's writing/saying, you'd do okay. He posts practice tests that were representative of actual tests and curves the class nicely!"
                }]
        },
    {
        "first_name": "Laura", 
        "last_name": "Stoker", 
        "helpfullness": "4.1", 
        "url": "http://www.ratemyprofessors.com/ShowRatings.jsp?tid=536606", 
        "reviews": [
            {
                "attendance": "N/A", 
                "class": "PS3", 
                "textbook_use": "You need it sometimes", 
                "review_text": "Stoker is by far the best professor.  If you put in the effort, take good notes, and ask questions, you will be fine in the class. As far as her lecture, she does go a bit fast, but her lecture is in the form of an outline. As long as you take good notes, you will have everything you need for exams. She is funny and super nice if you speak with her"
            }, 
            {
                "attendance": "Mandatory", 
                "class": "164A", 
                "textbook_use": "Barely cracked it open", 
                "review_text": "AMAZING professor.  She has a good way of keeping lectures interesting.  Yes, she can be a little everywhere and really quick with her lecture, but the GSI's are useful to make sure you understand the material.  Oh, and did I mention she's hilarious!"
            }]
    }]
}

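I'm trying to do several things. First, I want to find the most-mentioned ['class'] key under reviews, then get each class name and the number of times it was mentioned. I would then like to output the result in the format below, with a professors array under each course that holds just the professors who teach it. For example, for CHEM 1A and CHEMISTRY1A that is Richard Saykally.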

{
    "courses": [
        {
            "course_name": # class name
            "course_mentioned_times": # number of times the class was mentioned
            "professors": [ # professors who teach this class, taken from the JSON file shown above
                {
                    "first_name": "professor first name",
                    "last_name": "professor last name"
                }
            ]
        }
    ]
}

So I want to sort the values from my JSON file from highest to lowest. This is what I have been able to figure out so far:

import json

if __name__ == "__main__":
    open_json = open('result.json')
    load_as_json = json.load(open_json)['professors']
    outer_arr = []
    outer_dict = {}
    for items in load_as_json:

        output_dictionary = {}
        all_classes = items['reviews']
        for classes in all_classes:
            arr_info = []
            output_dictionary['class'] = classes['class']
            output_dictionary['first_name'] = items['first_name']
            output_dictionary['last_name'] = items['last_name']
            #output_dictionary['department'] = items['department']
            output_dictionary['reviews'] = classes['review_text']
            with open('output_info.json','wb') as outfile:
                json.dump(output_dictionary,outfile,indent=4)

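【Question Discussion】:

  • Your question title mentions formatting, but it sounds like you are sorting the data in the JSON file. Is that right? You also need to be clearer (more explicit) about what your input and desired output are.
  • Benji, Stack Overflow is a question-and-answer site. Readers like you ask questions, and other readers try to answer them. Your post contains a lot of information, but it lacks the one thing that makes Stack Overflow work: a question. Do you have a specific programming question?
  • My apologies, @Rob. Yes, I'm having trouble printing the output in the format I want, and I don't know how to approach it, for example where to start a new dictionary or a new array. My output keeps repeating. I'm having trouble deciding where in the script to initialize the dictionaries and arrays, that is, which for loop they belong under. That is why I included my code.
  • What is your question?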
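
A minimal sketch of the container-placement issue raised in the comments above, assuming the goal is one flat record per review. The function name `flatten_reviews` and the inline sample data are hypothetical and not part of the original post. The result list is created once before the loops, a fresh dictionary is created inside the inner loop for each review, and serialization happens once after both loops have finished.

```python
import json


def flatten_reviews(professors):
    """Return one record per review, each in its own dictionary."""
    records = []  # created once, before any loop
    for professor in professors:
        for review in professor['reviews']:
            # A new dict per review, so earlier records are not overwritten.
            record = {
                'class': review['class'],
                'first_name': professor['first_name'],
                'last_name': professor['last_name'],
                'review': review['review_text'],
            }
            records.append(record)
    return records


# Hypothetical sample data with the same shape as the 'professors' list.
sample = [
    {
        'first_name': 'Richard',
        'last_name': 'Saykally',
        'reviews': [
            {'class': 'CHEM 1A', 'review_text': 'first review'},
            {'class': 'CHEMISTRY1A', 'review_text': 'second review'},
        ],
    },
]

records = flatten_reviews(sample)
# Serialize once, after both loops have finished.
output_text = json.dumps(records, indent=4)
print(output_text)
```

Reusing a single dictionary across iterations, or reopening the output file in write mode inside the loop, would leave only the last record in the result, which is one plausible reading of the symptom described.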

Tags: python arrays json dictionary output


【Solution 1】:

I think this program does what you want:

import json


# Load the input file.
with open('result.json') as open_json:
    load_as_json = json.load(open_json)

# Map each class name to its summary record.
courses = {}
for professor in load_as_json['professors']:
    for review in professor['reviews']:
        # Create the record the first time a class is seen, then reuse it.
        course = courses.setdefault(review['class'], {})
        course.setdefault('course_name', review['class'])
        course.setdefault('course_mentioned_times', 0)
        course['course_mentioned_times'] += 1
        course.setdefault('professors', [])
        prof_name = {
            'first_name': professor['first_name'],
            'last_name': professor['last_name'],
        }
        # List each professor only once per class.
        if prof_name not in course['professors']:
            course['professors'].append(prof_name)

# Sort the records by mention count, highest first.
courses = {
    'courses': sorted(courses.values(),
                      key=lambda x: x['course_mentioned_times'],
                      reverse=True)
}
with open('output_info.json', 'w') as outfile:
    json.dump(courses, outfile, indent=4)

Result, using the sample input from the question:

{
    "courses": [
        {
            "professors": [ 
                {
                    "first_name": "Laura",
                    "last_name": "Stoker"
                }
            ], 
            "course_name": "PS3", 
            "course_mentioned_times": 1
        }, 
        {
            "professors": [
                {
                    "first_name": "Laura", 
                    "last_name": "Stoker"
                }
            ],
            "course_name": "164A", 
            "course_mentioned_times": 1
        },
        {
            "professors": [
                {
                    "first_name": "Richard", 
                    "last_name": "Saykally"
                }
            ], 
            "course_name": "CHEM 1A", 
            "course_mentioned_times": 1
        }, 
        {
            "professors": [
                {
                    "first_name": "Richard", 
                    "last_name": "Saykally"
                }
            ], 
            "course_name": "CHEMISTRY1A", 
            "course_mentioned_times": 1
        }
    ]
}
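
In the sample input every class is mentioned exactly once, so neither the sorting nor the one-entry-per-professor rule is visible in the result above. The following is a hedged sketch of the same aggregation on in-memory data where one class is reviewed twice. It swaps in `collections.Counter` for the counting step; the function name `summarize_courses` and the sample records (including the class name 'CHEM 4B') are hypothetical.

```python
from collections import Counter


def summarize_courses(professors):
    """Count class mentions and list each professor once per class."""
    mentions = Counter()
    teachers = {}  # class name -> list of unique (first, last) tuples
    for professor in professors:
        name = (professor['first_name'], professor['last_name'])
        for review in professor['reviews']:
            course = review['class']
            mentions[course] += 1
            names = teachers.setdefault(course, [])
            if name not in names:
                names.append(name)
    # most_common() yields (class, count) pairs, highest count first.
    return {
        'courses': [
            {
                'course_name': course,
                'course_mentioned_times': count,
                'professors': [
                    {'first_name': first, 'last_name': last}
                    for first, last in teachers[course]
                ],
            }
            for course, count in mentions.most_common()
        ]
    }


# Hypothetical data: 'CHEM 1A' is reviewed twice for the same professor.
sample = [
    {'first_name': 'Richard', 'last_name': 'Saykally',
     'reviews': [{'class': 'CHEM 1A'}, {'class': 'CHEM 4B'},
                 {'class': 'CHEM 1A'}]},
    {'first_name': 'Laura', 'last_name': 'Stoker',
     'reviews': [{'class': 'PS3'}]},
]

summary = summarize_courses(sample)
top = summary['courses'][0]
print(top['course_name'], top['course_mentioned_times'], top['professors'])
```

Here 'CHEM 1A' comes first with a count of 2, and Richard Saykally appears only once in its professors list even though he has two reviews for that class.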

【Discussion】:

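  • Now my output looks like this, but the professors' names are duplicated: "courses": [{"professors": [{"first_name": "Richard", "last_name": "Saykally"}, {"first_name": "Richard", "last_name": "Saykally"}, I would like each professor's name to be printed only once, with no replicas, so that only one Richard Saykally appears in the professors array for that particular course. Multiple professors are fine, as long as their names are not duplicated.
  • You are a lifesaver. One last question. I have formatted my course names so that the letters and numbers are separated. I want to compare the letters and print out the most-mentioned course. For example, given CHEM 1A and CHEM 214, I compare the leading letters, CHEM and CHEM, and they are the same. So I would append only the more frequently mentioned of those two courses to my dictionary.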
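
One possible way to approach the follow-up in the last comment, sketched under assumptions: course names are non-empty, the letter prefix is everything before the first space (as in 'CHEM 1A'), and on a tie the first course seen is kept. The function name `top_course_per_prefix` and the mention counts are hypothetical.

```python
def top_course_per_prefix(courses):
    """Keep the most-mentioned course for each leading prefix (e.g. 'CHEM')."""
    best = {}
    for course in courses:
        # Assumption: the prefix is everything before the first space.
        prefix = course['course_name'].split()[0]
        current = best.get(prefix)
        if current is None or (course['course_mentioned_times']
                               > current['course_mentioned_times']):
            best[prefix] = course
    return best


# Hypothetical entries in the same shape as the 'courses' list above.
courses = [
    {'course_name': 'CHEM 1A', 'course_mentioned_times': 5},
    {'course_name': 'CHEM 214', 'course_mentioned_times': 2},
    {'course_name': 'PS 3', 'course_mentioned_times': 1},
]

best = top_course_per_prefix(courses)
print(best['CHEM']['course_name'])
```

A name without a space, such as 'CHEMISTRY1A' or '164A', would be treated as its own prefix under this rule, so the names would need to be normalized first, as the comment says was already done.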