【Question title】: Can't unshorten bit.ly URLs?
【Posted】: 2013-10-20 23:59:51
【Question】:

I'm using the code from this Stack Overflow post to unshorten URLs...

import httplib
import urlparse

def unshorten_url(url):
    parsed = urlparse.urlparse(url)
    h = httplib.HTTPConnection(parsed.netloc)
    resource = parsed.path
    if parsed.query != "":
        resource += "?" + parsed.query
    h.request('HEAD', resource )
    response = h.getresponse()
    if response.status/100 == 3 and response.getheader('Location'):
        return unshorten_url(response.getheader('Location')) # changed to process chains of short urls
    else:
        return url

Every shortened link gets unshortened, except for newly created bit.ly URLs.

I get this error:

>>> unshorten_url("bit.ly/1atTViN")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in unshorten_url
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 955, in request
    self._send_request(method, url, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 989, in _send_request
    self.endheaders(body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 951, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 811, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 773, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 754, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 571, in create_connection
    raise err
socket.error: [Errno 61] Connection refused

What gives?

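【Comments】:

  • Downvoted the originally posted answer, the one that got 6 upvotes.
  • @user2799617: Then downvote that answer, not this question.
  • ankitpanda.com/tweeting-with-python, but I also tried with another URL... youtube.com/watch?v=eeAjkbNq4xI
  • Oddly, this older bit.ly URL, bit.ly/GVBQJS, does unshorten, while the new ones don't.

Tags: python url header http-headers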


【Answer 1】:

You forgot to include the URL scheme:

unshorten_url("http://bit.ly/1atTViN")

Note the http:// there; it matters. Without it, the URL is not parsed correctly:

>>> import urlparse
>>> urlparse.urlparse('bit.ly/1atTViN')
ParseResult(scheme='', netloc='', path='bit.ly/1atTViN', params='', query='', fragment='')
>>> urlparse.urlparse('http://bit.ly/1atTViN')
ParseResult(scheme='http', netloc='bit.ly', path='/1atTViN', params='', query='', fragment='')

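See how the netloc field is empty when http:// is missing; the whole string ends up in path instead. With an empty host you end up trying to connect to your own machine, and since you aren't running a web server there, the connection is refused.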
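One way to guard against this is to normalize the input before parsing it. The following is only a minimal sketch, assuming Python 3 (where the urlparse function lives in urllib.parse rather than in the urlparse module used in the question); ensure_scheme and its default_scheme parameter are hypothetical names, not part of the original code.

```python
from urllib.parse import urlparse  # Python 3 location of urlparse


def ensure_scheme(url, default_scheme="http"):
    """Return url with a scheme, so that urlparse fills in netloc.

    Hypothetical helper: it only covers the simple cases discussed here
    (bare "host/path" strings and scheme-relative "//host/path" strings).
    """
    parsed = urlparse(url)
    if parsed.scheme and parsed.netloc:
        # Already a full URL such as http://bit.ly/1atTViN
        return url
    if parsed.netloc:
        # Scheme-relative URL such as //bit.ly/1atTViN
        return default_scheme + ":" + url
    # Bare string such as bit.ly/1atTViN, which urlparse treats as a path
    return default_scheme + "://" + url


fixed = ensure_scheme("bit.ly/1atTViN")
print(fixed)                   # http://bit.ly/1atTViN
print(urlparse(fixed).netloc)  # bit.ly
```

Calling such a helper at the top of unshorten_url would let the function accept both forms, though passing a complete URL in the first place remains the simpler fix.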

【Comments】:

    【Answer 2】:

    Perhaps bit.ly refuses connections from tools like httplib. You could try changing the user agent like this:

    h.putheader('User-Agent','Mozilla/5.0 (X11; U; Linux i686; pl-PL; rv:1.7.10) Gecko/20050717 Firefox/1.0.6')
    

    【Comments】:

    • The connection is refused before any headers are sent.