【Question title】: StopIteration error while using the scholarly.pprint function
【Posted】: 2021-04-17 12:20:02
【Question】:

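I am trying to extract the public Google Scholar profiles of certain professors.

I have a list of professor names, which I use with the scholarly package to scrape their public profile information. However, I am running into an error: I can only retrieve the information for the first name in professor_list, not for the subsequent ones.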

for name in professor_list:
    search_query = scholarly.search_author(name)
    scholarly.pprint(next(search_query))

Output:

{'affiliation': 'Deakin University',
 'citedby': 2528,
 'email_domain': '@deakin.edu.au',
 'filled': False,
 'interests': ['Lynn Batten'],
 'name': 'Lynn Batten',
 'scholar_id': 'Tmg0T9sAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=Tmg0T9sAAAAJ'}
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-242-5b96571c0972> in <module>
      1 for name in professor_list:
      2     search_query = scholarly.search_author(name)
----> 3     scholarly.pprint(next(search_query))

StopIteration:

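【Comments】:

  • Can anyone reply?
  • Any help would be much appreciated.
  • You should iterate over search_query instead of calling next(). You can also convert it to a list with list(search_query) and print that. For more on Python iterators, see this link.

Tags: python web-scraping web-crawler google-scholar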


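【Answer 1】:

Although scholarly.pprint(next(search_query)) should work, next() raises StopIteration when the search returns no results. You can pass None as the default value to the built-in next() in case nothing is found, e.g. next(search_query, None):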

from scholarly import scholarly

professor_list = ["Marty Banks, Berkeley",
                  "Adam Lobel, Blizzard",
                  "Daniel Blizzard, Blizzard",
                  "Shuo Chen, Blizzard",
                  "Ian Livingston, Blizzard",
                  "Minli Xu, Blizzard"]

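for professor_name in professor_list:
    search_query = scholarly.search_author(name=professor_name)
    # next() returns the default (None) instead of raising StopIteration
    # when the search yields no results.
    author = next(search_query, None)
    # Skip names with no match rather than passing None to pprint.
    if author is not None:
        scholarly.pprint(author)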

More information about StopIteration by Martijn Pieters.

Full output:

{'affiliation': 'Professor of Vision Science, UC Berkeley',
 'citedby': 22559,
 'email_domain': '@berkeley.edu',
 'filled': False,
 'interests': ['vision science', 'psychology', 'human factors', 'neuroscience'],
 'name': 'Martin Banks',
 'scholar_id': 'Smr99uEAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=Smr99uEAAAAJ'}
{'affiliation': 'Blizzard Entertainment',
 'citedby': 3050,
 'email_domain': '@AdamLobel.com',
 'filled': False,
 'interests': ['Gaming', 'Emotion regulation'],
 'name': 'Adam Lobel',
 'scholar_id': '_xwYD2sAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=_xwYD2sAAAAJ'}
{'affiliation': '',
 'citedby': 873,
 'email_domain': '',
 'filled': False,
 'interests': ['Daniel Blizzard'],
 'name': 'Daniel Blizzard',
 'scholar_id': 'dk4LWEgAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=dk4LWEgAAAAJ'}
{'affiliation': 'Senior Data Scientist, Blizzard Entertainment',
 'citedby': 656,
 'email_domain': '@cs.cornell.edu',
 'filled': False,
 'interests': ['Machine Learning', 'Data Mining', 'Artificial Intelligence'],
 'name': 'Shuo Chen',
 'scholar_id': 'OBf4YnkAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=OBf4YnkAAAAJ'}
{'affiliation': 'Blizzard Entertainment',
 'citedby': 620,
 'email_domain': '@usask.ca',
 'filled': False,
 'interests': ['Human-computer interaction',
               'User Experience',
               'Player Experience',
               'User Research',
               'Games'],
 'name': 'Ian Livingston',
 'scholar_id': 'xBHVqNIAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=xBHVqNIAAAAJ'}
{'affiliation': 'Blizzard Entertainment',
 'citedby': 502,
 'email_domain': '@blizzard.com',
 'filled': False,
 'interests': ['Game', 'Machine Learning', 'Data Science', 'Bioinformatics'],
 'name': 'Minli Xu',
 'scholar_id': 'QST5iogAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=QST5iogAAAAJ'}
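
The difference between next() with and without a default can be checked offline, without calling Google Scholar. The following is only a minimal sketch: fake_search_author is a hypothetical stand-in for scholarly.search_author (it is not part of the scholarly package), and it knows a single profile taken from the question's output.

```python
def fake_search_author(name):
    """Hypothetical stand-in for scholarly.search_author: yields matches lazily."""
    known = {
        "Lynn Batten": {"name": "Lynn Batten", "affiliation": "Deakin University"},
    }
    if name in known:
        yield known[name]


def first_result(name):
    """Return the first match for name, or None when the search is empty."""
    return next(fake_search_author(name), None)


# Without a default, next() on an exhausted (here: empty) iterator raises.
try:
    next(fake_search_author("No Such Person"))
    raised = False
except StopIteration:
    raised = True

# With a default, the same situation simply returns the default.
found = first_result("Lynn Batten")
missing = first_result("No Such Person")

print(raised, found["name"], missing)  # True Lynn Batten None
```

The traceback in the question is consistent with this: the loop works until it reaches a name for which the search yields nothing.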

Alternatively, you can iterate over the scholarly.search_author() results with a second loop, which also works when a search returns nothing:

from scholarly import scholarly
import json

professor_list = ["Marty Banks, Berkeley",
                  "Adam Lobel, Blizzard",
                  "Daniel Blizzard, Blizzard",
                  "Shuo Chen, Blizzard",
                  "Ian Livingston, Blizzard",
                  "Minli Xu, Blizzard"]

professor_results = []

for professor_name in professor_list:
    for professor_result in scholarly.search_author(name=professor_name):
        professor_results.append({
            "name": professor_result.get("name"),
            "affiliations": professor_result.get("affiliation"),
            "email_domain": professor_result.get("email_domain"),
            "interests": professor_result.get("interests"),
            "citedby": professor_result.get("citedby")
        })

print(json.dumps(professor_results, indent=2, ensure_ascii=False))

Full output:

[
  {
    "name": "Martin Banks",
    "affiliations": "Professor of Vision Science, UC Berkeley",
    "email_domain": "@berkeley.edu",
    "interests": [
      "vision science",
      "psychology",
      "human factors",
      "neuroscience"
    ],
    "citedby": 22559
  },
  {
    "name": "Adam Lobel",
    "affiliations": "Blizzard Entertainment",
    "email_domain": "@AdamLobel.com",
    "interests": [
      "Gaming",
      "Emotion regulation"
    ],
    "citedby": 3050
  },
  {
    "name": "Daniel Blizzard",
    "affiliations": "",
    "email_domain": "",
    "interests": [
      "Daniel Blizzard"
    ],
    "citedby": 873
  },
  {
    "name": "Shuo Chen",
    "affiliations": "Senior Data Scientist, Blizzard Entertainment",
    "email_domain": "@cs.cornell.edu",
    "interests": [
      "Machine Learning",
      "Data Mining",
      "Artificial Intelligence"
    ],
    "citedby": 656
  },
  {
    "name": "Ian Livingston",
    "affiliations": "Blizzard Entertainment",
    "email_domain": "@usask.ca",
    "interests": [
      "Human-computer interaction",
      "User Experience",
      "Player Experience",
      "User Research",
      "Games"
    ],
    "citedby": 620
  },
  {
    "name": "Minli Xu",
    "affiliations": "Blizzard Entertainment",
    "email_domain": "@blizzard.com",
    "interests": [
      "Game",
      "Machine Learning",
      "Data Science",
      "Bioinformatics"
    ],
    "citedby": 502
  }
]
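
One side effect of the nested loop is that a name with no matches is skipped silently, while a name with several matches contributes several entries. The sketch below shows one possible way to record the names that returned nothing and to cap the number of matches per name with itertools.islice. The helper collect_profiles, its max_per_name parameter, and the fake_search stand-in are all hypothetical names introduced for illustration; they are not part of scholarly.

```python
from itertools import islice


def collect_profiles(names, search, max_per_name=1):
    """Collect up to max_per_name results per name; also return names with no results."""
    results, not_found = [], []
    for name in names:
        # islice stops early, so at most max_per_name items are pulled from the iterator.
        matches = list(islice(search(name), max_per_name))
        if not matches:
            not_found.append(name)
        for match in matches:
            results.append({
                "query": name,
                "name": match.get("name"),
                "citedby": match.get("citedby"),
            })
    return results, not_found


def fake_search(name):
    """Hypothetical stand-in for scholarly.search_author, backed by a small dict."""
    data = {
        "Adam Lobel, Blizzard": [{"name": "Adam Lobel", "citedby": 3050}],
        "Minli Xu, Blizzard": [{"name": "Minli Xu", "citedby": 502}],
    }
    return iter(data.get(name, []))


results, not_found = collect_profiles(
    ["Adam Lobel, Blizzard", "Nobody, Nowhere", "Minli Xu, Blizzard"],
    fake_search,
)
print([r["name"] for r in results])  # ['Adam Lobel', 'Minli Xu']
print(not_found)                     # ['Nobody, Nowhere']
```

With the real package, the search argument could presumably be something like lambda n: scholarly.search_author(name=n), which matches the call used in the answer's code.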

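Another option is to use the Google Scholar Profiles API from SerpApi. It is a paid API with a free plan. It handles scaling and bypasses search-engine blocks through dedicated proxies and a CAPTCHA-solving service. Check out the playground.

Example code to integrate: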

from serpapi import GoogleScholarSearch
import json

professor_list = ["Marty Banks, Berkeley",
                  "Adam Lobel, Blizzard",
                  "Daniel Blizzard, Blizzard",
                  "Shuo Chen, Blizzard",
                  "Ian Livingston, Blizzard",
                  "Minli Xu, Blizzard"]

for professor_name in professor_list:
    params = {
        "api_key": "Your SerpApi API key",
        "engine": "google_scholar_profiles",
        "hl": "en",
        "mauthors": professor_name
    }

    search = GoogleScholarSearch(params)
    results = search.get_dict()

    # Fall back to an empty list if the response contains no "profiles" key.
    for result in results.get("profiles", []):
        print(json.dumps(result, indent=2))

Full output:

{
  "name": "Martin Banks",
  "link": "https://scholar.google.com/citations?hl=en&user=Smr99uEAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=Smr99uEAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "Smr99uEAAAAJ",
  "affiliations": "Professor of Vision Science, UC Berkeley",
  "email": "Verified email at berkeley.edu",
  "cited_by": 22559,
  "interests": [
    {
      "title": "vision science",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Avision_science",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:vision_science"
    },
    {
      "title": "psychology",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Apsychology",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:psychology"
    },
    {
      "title": "human factors",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Ahuman_factors",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:human_factors"
    },
    {
      "title": "neuroscience",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aneuroscience",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:neuroscience"
    }
  ],
  "thumbnail": "https://scholar.google.com/citations/images/avatar_scholar_56.png"
}
{
  "name": "Adam Lobel",
  "link": "https://scholar.google.com/citations?hl=en&user=_xwYD2sAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=_xwYD2sAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "_xwYD2sAAAAJ",
  "affiliations": "Blizzard Entertainment",
  "email": "Verified email at AdamLobel.com",
  "cited_by": 3050,
  "interests": [
    {
      "title": "Gaming",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Agaming",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:gaming"
    },
    {
      "title": "Emotion regulation",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aemotion_regulation",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:emotion_regulation"
    }
  ],
  "thumbnail": "https://scholar.googleusercontent.com/citations?view_op=small_photo&user=_xwYD2sAAAAJ&citpid=3"
}
{
  "name": "Daniel Blizzard",
  "link": "https://scholar.google.com/citations?hl=en&user=dk4LWEgAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=dk4LWEgAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "dk4LWEgAAAAJ",
  "affiliations": "",
  "cited_by": 873,
  "thumbnail": "https://scholar.google.com/citations/images/avatar_scholar_56.png"
}
{
  "name": "Shuo Chen",
  "link": "https://scholar.google.com/citations?hl=en&user=OBf4YnkAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=OBf4YnkAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "OBf4YnkAAAAJ",
  "affiliations": "Senior Data Scientist, Blizzard Entertainment",
  "email": "Verified email at cs.cornell.edu",
  "cited_by": 656,
  "interests": [
    {
      "title": "Machine Learning",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Amachine_learning",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:machine_learning"
    },
    {
      "title": "Data Mining",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Adata_mining",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:data_mining"
    },
    {
      "title": "Artificial Intelligence",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aartificial_intelligence",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:artificial_intelligence"
    }
  ],
  "thumbnail": "https://scholar.googleusercontent.com/citations?view_op=small_photo&user=OBf4YnkAAAAJ&citpid=1"
}
{
  "name": "Ian Livingston",
  "link": "https://scholar.google.com/citations?hl=en&user=xBHVqNIAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=xBHVqNIAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "xBHVqNIAAAAJ",
  "affiliations": "Blizzard Entertainment",
  "email": "Verified email at usask.ca",
  "cited_by": 620,
  "interests": [
    {
      "title": "Human-computer interaction",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Ahuman_computer_interaction",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:human_computer_interaction"
    },
    {
      "title": "User Experience",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Auser_experience",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:user_experience"
    },
    {
      "title": "Player Experience",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aplayer_experience",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:player_experience"
    },
    {
      "title": "User Research",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Auser_research",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:user_research"
    },
    {
      "title": "Games",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Agames",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:games"
    }
  ],
  "thumbnail": "https://scholar.google.com/citations/images/avatar_scholar_56.png"
}
{
  "name": "Minli Xu",
  "link": "https://scholar.google.com/citations?hl=en&user=QST5iogAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=QST5iogAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "QST5iogAAAAJ",
  "affiliations": "Blizzard Entertainment",
  "email": "Verified email at blizzard.com",
  "cited_by": 502,
  "interests": [
    {
      "title": "Game",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Agame",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:game"
    },
    {
      "title": "Machine Learning",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Amachine_learning",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:machine_learning"
    },
    {
      "title": "Data Science",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Adata_science",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:data_science"
    },
    {
      "title": "Bioinformatics",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Abioinformatics",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:bioinformatics"
    }
  ],
  "thumbnail": "https://scholar.googleusercontent.com/citations?view_op=small_photo&user=QST5iogAAAAJ&citpid=14"
}
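
Note that the profiles in the output above do not all have the same keys: the "Daniel Blizzard" entry has no "email" and no "interests". A minimal sketch of flattening such a profile defensively is shown below. summarize_profile is a hypothetical helper, not part of the SerpApi client, and the two sample dicts are trimmed copies of entries from the output above.

```python
def summarize_profile(profile):
    """Flatten one profile dict, tolerating keys that are missing."""
    return {
        "name": profile.get("name"),
        "author_id": profile.get("author_id"),
        "affiliations": profile.get("affiliations", ""),
        "email": profile.get("email"),  # None when the profile has no email
        "cited_by": profile.get("cited_by"),
        # Keep only the interest titles; default to an empty list.
        "interests": [item.get("title") for item in profile.get("interests", [])],
    }


sample_full = {
    "name": "Adam Lobel",
    "author_id": "_xwYD2sAAAAJ",
    "affiliations": "Blizzard Entertainment",
    "email": "Verified email at AdamLobel.com",
    "cited_by": 3050,
    "interests": [{"title": "Gaming"}, {"title": "Emotion regulation"}],
}
sample_sparse = {
    "name": "Daniel Blizzard",
    "author_id": "dk4LWEgAAAAJ",
    "affiliations": "",
    "cited_by": 873,
}

full_summary = summarize_profile(sample_full)
sparse_summary = summarize_profile(sample_sparse)
print(full_summary["interests"])    # ['Gaming', 'Emotion regulation']
print(sparse_summary["email"], sparse_summary["interests"])  # None []
```

Using dict.get with a default avoids a KeyError on sparse profiles, at the cost of having to handle None or empty values downstream.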

Disclaimer: I work for SerpApi.
