【Question title】: Why am I receiving this JSON Decode Error?
【Posted】: 2021-05-19 02:08:58
【Description】:

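Here is what I'm doing.

I'm sending a GET request to the reddit oEmbed endpoint. I want to parse the returned JSON and extract the raw HTML so that I can embed a reddit post in my Django page. When I try this, I get the following error: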

json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 1)

Here is a sample of the code (it runs inside a function):

 endpoint = requests.get("https://www.reddit.com/oembed?url=https://www.reddit.com/r/nba/comments/n6l2zu/the_crew_lock_in_their_predictions_and_ernie_has/")

 return endpoint.json()['html']

This is the HTML it should return. I was thinking that maybe I have to reformat it? Can someone help me? Thanks!

 '\n    <blockquote class="reddit-card" >\n      <a href="https://www.reddit.com/r/nba/comments/n6l2zu/the_crew_lock_in_their_predictions_and_ernie_has/?ref_source=embed&amp;ref=share">The crew lock in their predictions and Ernie has the Jazz going to the Finals</a> from\n      <a href="https://www.reddit.com/r/nba/">nba</a>\n    </blockquote>\n    <script async src="https://embed.redditmedia.com/widgets/platform.js" charset="UTF-8"></script>\n'

Edit:

Here is the result of printing endpoint.json():

{
   "provider_url":"https://www.reddit.com/",
   "version":"1.0",
   "title":"The crew lock in their predictions and Ernie has the Jazz going to the Finals",
   "provider_name":"reddit",
   "type":"rich",
   "html":"\n    <blockquote class=\"reddit-card\" >\n      <a href=\"https://www.reddit.com/r/nba/comments/n6l2zu/the_crew_lock_in_their_predictions_and_ernie_has/?ref_source=embed&amp;ref=share\">The crew lock in their predictions and Ernie has the Jazz going to the Finals</a> from\n      <a href=\"https://www.reddit.com/r/nba/\">nba</a>\n    </blockquote>\n    <script async src=\"https://embed.redditmedia.com/widgets/platform.js\" charset=\"UTF-8\"></script>\n",
   "author_name":"tanookiben"
}

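【Comments】:

  • Can you print endpoint.json() and share the result?
  • Edited.
  • The value of your "html" key is not a string enclosed in "" quotes. That is the problem; you need to parse it before using the JSON decoder.
  • Is there a way to format the value so that it is enclosed in double quotes?
  • Also, the data type returned by endpoint.json()['html'] is a string. Stranger still, when I run this code to see the raw HTML I get the JSON decode error, but after running it 7 or 8 times I eventually get the HTML I want.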

Tags: python json decode oembed


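【Solution 1】:
import json

import requests

def get_response():
    """Return the decoded oEmbed JSON, or an empty dict on a non-200 reply."""
    endpoint = requests.get("https://www.reddit.com/oembed?url=https://www.reddit.com/r/nba/comments/n6l2zu/the_crew_lock_in_their_predictions_and_ernie_has/")
    # Only decode the body when reddit answered 200; its error pages are HTML, not JSON.
    if endpoint.status_code == 200:
        return json.loads(endpoint.text)

    return {}

print(get_response())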

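When you make the request, reddit appears to respond with an error page like the one below instead of JSON, so it is best to check the response status code first.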

<!doctype html>
<html>
  <head>
    <title>Too Many Requests</title>
    <style>
      body {
          font: small verdana, arial, helvetica, sans-serif;
          width: 600px;
          margin: 0 auto;
      }

      h1 {
          height: 40px;
          background: transparent url(//www.redditstatic.com/reddit.com.header.png) no-repeat scroll top right;
      }
    </style>
  </head>
  <body>
    <h1>whoa there, pardner!</h1>
    


<p>we're sorry, but you appear to be a bot and we've seen too many requests
from you lately. we enforce a hard speed limit on requests that appear to come
from bots to prevent abuse.</p>

<p>if you are not a bot but are spoofing one via your browser's user agent
string: please change your user agent string to avoid seeing this message
again.</p>

<p>please wait 6 second(s) and try again.</p>

    <p>as a reminder to developers, we recommend that clients make no
    more than <a href="http://github.com/reddit/reddit/wiki/API">one
    request every two seconds</a> to avoid seeing this message.</p>
  </body>
</html>
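
Because the body above is HTML, any attempt to decode it as JSON fails with the "Expecting value" error from the question. A minimal sketch of a defensive check follows. The helper name `extract_embed_html` is hypothetical, and it takes the status code and body text as plain arguments so that it can be fed from `endpoint.status_code` and `endpoint.text`.

```python
import json


def extract_embed_html(status_code, body):
    """Return the 'html' field of an oEmbed reply, or None if it is unusable.

    Hypothetical helper: returns None for non-200 replies, for bodies that
    are not valid JSON (such as the rate-limit page), and for JSON without
    an 'html' key.
    """
    if status_code != 200:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # The server sent something other than JSON, e.g. an HTML error page.
        return None
    if not isinstance(payload, dict):
        return None
    return payload.get("html")
```

With requests, this could be called as `extract_embed_html(endpoint.status_code, endpoint.text)`, and the caller then only has to handle a `None` result.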

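【Comments】:

  • I still get the same error here. I need to get the html from the returned JSON response.
  • It returns empty JSON data when the server responds with an error.
  • I see, but the status code returned is '429'.
  • You have to check the response code before converting to JSON. With code 429, the server responds with HTML text, not JSON. So check the status code before converting the response to JSON. See: developer.mozilla.org/en-US/docs/Web/HTTP/Status/429
  • I wasn't sending the correct headers. You're right, I was getting a 429 error. Thanks a lot!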
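
The 429 page asks the client to wait and try again, which would also explain why the request succeeded after several runs. A rough sketch of retrying on 429 follows. The name `fetch_with_retry` and its parameters are hypothetical. It assumes the caller supplies a `fetch` callable returning `(status_code, headers, body)`. It also assumes, without confirmation from the thread, that the server may send a `Retry-After` header in seconds. If that header is missing, it waits two seconds, the interval the error page recommends.

```python
import time


def fetch_with_retry(fetch, max_attempts=4, default_wait=2.0, sleep=time.sleep):
    """Call fetch() until it stops returning 429 or attempts run out.

    fetch must return a (status_code, headers, body) tuple. The last
    (status_code, body) pair seen is returned to the caller.
    """
    for attempt in range(max_attempts):
        status, headers, body = fetch()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Assumption: Retry-After, when present, is a whole number of seconds.
            retry_after = headers.get("Retry-After", "")
            wait = float(retry_after) if retry_after.isdigit() else default_wait
            sleep(wait)
    return status, body
```

With requests, `fetch` could be a small function that performs the GET and returns `(resp.status_code, resp.headers, resp.text)`.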
【Solution 2】:

I wasn't sending the correct headers. I kept getting a 429 error, and because no JSON was returned, I got the JSON decoder error.
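
The answer does not say which headers were missing. Going by the error page's advice to change the user agent string, the sketch below assumes a descriptive `User-Agent` is what was needed. The `USER_AGENT` value and the function names are hypothetical, and `http_get` stands for `requests.get` or anything with the same call shape.

```python
from urllib.parse import urlencode

OEMBED_ENDPOINT = "https://www.reddit.com/oembed"
# Hypothetical identifier; replace it with one that describes your own app.
USER_AGENT = "django-reddit-embed/0.1 (contact: admin@example.com)"


def build_oembed_request(post_url):
    """Return (url, headers) for the oEmbed lookup of one reddit post."""
    query = urlencode({"url": post_url})
    return f"{OEMBED_ENDPOINT}?{query}", {"User-Agent": USER_AGENT}


def get_embed_html(post_url, http_get):
    """Fetch the embed HTML for post_url, or None on a non-200 reply."""
    url, headers = build_oembed_request(post_url)
    response = http_get(url, headers=headers, timeout=10)
    if response.status_code != 200:
        # For example 429: the body is an HTML error page, so do not decode it.
        return None
    return response.json().get("html")
```

In the Django view this could be called as `get_embed_html(post_url, requests.get)` after `import requests`. Passing the HTTP function in as an argument also makes the helper easy to exercise without touching the network.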

【Comments】:
