【Question title】: No Content-Disposition header in response from mechanize
【Posted】: 2013-02-10 20:29:36
【Question】:

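During a Celery task I have to download a CSV file from the AdWords billing page. I can't figure out what is wrong with my implementation, so I need your help.

Logging in: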

browser = mechanize.Browser()
browser.open('https://accounts.google.com/ServiceLogin')
browser.select_form(nr=0)
browser['Email'] = g_email
browser['Passwd'] = g_password
browser.submit()

browser.set_handle_robots(False)
billing_resp = browser.open('https://adwords.google.com/')

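That works, and I am now on the billing page. Next I parsed the resulting page for the token and the ids, inspected the request headers and the action URL in the Chrome debugger, and now I want to make a POST request and receive my CSV file. The response headers (in Chrome) are: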

content-disposition:attachment; filename="myclientcenter.csv.gz"
content-length:307479
content-type:application/x-gzip; charset=UTF-8
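To compare what a client actually receives against the headers Chrome shows, a small check along these lines may help. It is only a sketch: `attachment_info` is a hypothetical helper, it is written for Python 3, and it assumes the response headers have already been copied into a plain dict.

```python
from email.message import Message


def attachment_info(headers):
    """Return (filename, length) parsed from a mapping of response headers.

    filename is None when Content-Disposition carries no filename;
    length is 0 when Content-Length is missing.
    """
    msg = Message()
    for name, value in headers.items():
        msg[name] = value  # header lookups on Message are case-insensitive
    filename = msg.get_filename()
    length = int(msg.get("Content-Length", "0"))
    return filename, length


# The headers Chrome reported for the working download.
chrome_headers = {
    "content-disposition": 'attachment; filename="myclientcenter.csv.gz"',
    "content-length": "307479",
    "content-type": "application/x-gzip; charset=UTF-8",
}
filename, length = attachment_info(chrome_headers)
```

A response without a filename, or with a length of 0, is a sign that the server did not return the file at all.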

With mechanize:

data = {
    '__u': effectiveUserId,
    '__c': customerId,
    'token': token,
}

browser.addheaders = [
    ('accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
    ('content-type', 'application/x-www-form-urlencoded'),
    ("accept-encoding", "gzip,deflate,sdch"),
    ('user-agent', "Mozilla/5.0"),
    ('referer', "https://adwords.google.com/mcm/Mcm?__u=8183865359&__c=3069937889"),
    ('origin', "https://adwords.google.com"),
]

browser.set_handle_refresh(True)
browser.set_debug_responses(True)
browser.set_debug_redirects(True)
browser.set_handle_referer(True)
browser.set_debug_http(True)
browser.set_handle_equiv(True)
browser.set_handle_gzip(True)

response = browser.open(
    'https://adwords.google.com/mcm/file/ClientSummary/',
    data='&'.join(['='.join(pair) for pair in data.items()]),
)

But the Content-Length header in this response is 0, and there is no Content-Disposition header. Why? And what can I do to make it work?
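A side note on the request body in the snippet above: the pairs are joined by hand, so the values are sent unescaped, and `'='.join(pair)` raises a `TypeError` if any id is an integer rather than a string. Whether this contributes to the empty response is not established here, but encoding the body with the standard library avoids that class of problem. The sketch below uses Python 3's `urllib.parse.urlencode` (in Python 2 the same function is `urllib.urlencode`); `build_form_body` and the sample values are hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def build_form_body(fields):
    """URL-encode a mapping as an application/x-www-form-urlencoded body."""
    return urlencode({key: str(value) for key, value in fields.items()})


# Placeholder values, not real AdWords identifiers.
body = build_form_body({"__u": 123, "__c": 456, "token": "abc&def=ghi"})

# Decoding the body gives back the original value, including the "&" and "=".
decoded = parse_qs(body)
```
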

I also tried Requests, but I could not even get past the login stage...

【Comments】:

    Tags: python http-headers mechanize google-ads-api


    【Solution 1】:

    I now have the answer to my own question (thanks to my team lead).

    The main mistake was this incorrect request data:

    data = {
        '__u': effectiveUserId,
        '__c': customerId,
        'token': token,
    }
    

    Let's try again and work out a proper solution.

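    # Python 2 imports used by the snippets in this answer.
    import datetime
    import gzip
    import re
    import StringIO
    import urllib
    from urllib2 import HTTPError  # assumed source of HTTPError

    import mechanize


    class AdWordsException(Exception):
        """Error type used below; the original answer does not show its definition."""


    # Open Google login page and log in.
    browser = mechanize.Browser()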
    try:
        browser.open('https://accounts.google.com/ServiceLogin')
        browser.select_form(nr=0)
        browser['Email'] = 'email@adwords.login'
        browser['Passwd'] = 'password'
        browser.submit()
    except HTTPError:
        raise AdWordsException("Can't find the Google login form")
    

    We are now logged in and can go deeper.

    try:
        browser.set_handle_robots(False)
        billing_resp = browser.open('https://adwords.google.com/')
    except HTTPError:
        raise AdWordsException("Can't open AdWords dashboard page")
    
    # Welcome to the AdWords billing dashboard. This page contains a
    # session-unique token that we need for the later POST request.
    token_re = re.search(r"token:\'(.{41})\'", billing_resp.read())
    if token_re is None:
        raise AdWordsException("Can't parse the token")
    
    # It's time for some magic now. We have to construct proper mcsSelector
    # serialized data structure. This is GWT-RPC wire protocol hell.
    # Paste your specific version from web debugger.
    MCS_TEMPLATE = (
        "7|0|49|https://adwords.google.com/mcm/gwt/|18FBB090A5C26E56AC16C9DF0689E720|"
        "com.google.ads.api.services.common.selector.Selector/1054041135|"
        "com.google.ads.api.services.common.date.DateRange/1118087507|"
        "com.google.ads.api.services.common.date.Date/373224763|"
        "java.util.ArrayList/4159755760|java.lang.String/2004016611|ClientName|"
        "ExternalCustomerId|PrimaryUserLogin|PrimaryCompanyName|IsManager|"
        "SalesChannel|Tier|AccountSettingTypes|Labels|Alerts|CostWithCurrency|"
        "CostUsd|Clicks|Impressions|Ctr|Conversions|ConversionRate|SearchCtr|"
        "ContentCtr|BudgetAmount|BudgetStartDate|BudgetEndDate|BudgetPercentSpent|"
        "BudgetType|RemainingBudget|ClientDateTimeZoneId|"
        "com.google.ads.api.services.common.selector.OrderBy/524388450|"
        "SearchableData|"
        "com.google.ads.api.services.common.sorting.SortOrder/2037387810|"
        "com.google.ads.api.services.common.pagination.Paging/363399854|"
        "com.google.ads.api.services.common.selector.Predicate/451365360|"
        "SeedObfuscatedCustomerId|"
        "com.google.ads.api.services.common.selector.Predicate$Operator/2293561107|"
        "java.util.Arrays$ArrayList/2507071751|[Ljava.lang.String;/2600011424|"
        "3069937889|ExcludeSeeds|true|ClientTraversal|DIRECT|"
        "com.google.ads.api.services.common.selector.Summary/3224078220|included|1|"
        "2|3|4|5|"
        "{report_date}|5|{report_date}"  # take a note of this
        "|6|26|7|8|7|9|7|10|7|11|7|12|7|13|7|14|7|15|7|16|7|17|7|18|7|19|7|20|7|21|"
        "7|22|7|23|7|24|7|25|7|26|7|27|7|28|7|29|7|30|7|31|7|32|7|33|6|0|0|0|6|2|34|"
        "35|36|0|34|9|-35|37|100|0|6|0|6|3|38|39|40|2|41|42|1|43|38|44|40|0|41|42|1|"
        "45|38|46|-45|41|42|1|47|0|0|6|0|6|1|48|6|0|49|6|0|0|"
    )
    
    # To take stats for today
    report_date = datetime.date.today()
    mcs_selector = MCS_TEMPLATE.format(
        report_date='%s|%s|%s' % (
            report_date.day,
            report_date.month,
            report_date.year
        ),
    )
    data = urllib.urlencode({
        'token': token_re.group(1),
        'mcsSelector': mcs_selector,
    })
    
    # And... it finally works! The token and a proper mcsSelector are all we need.
    # A POST request with this data returns a gzipped CSV file with the current
    # balance state and other info that is not available via the AdWords API.
    zipped_csv = browser.open(
        'https://adwords.google.com/mcm/file/ClientSummary', 
        data=data
    )
    # Unpack it and use as you wish.
    with gzip.GzipFile(mode='r', fileobj=zipped_csv) as csv_io:
        try:
            csv = StringIO.StringIO(csv_io.read())
        except IOError:
            raise AdWordsException("Can't get CSV file from response")
        finally:
            browser.close()
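The two parsing steps in the code above (pulling the token out of the page and formatting the report date for the template) can be tried offline. This is a Python 3 sketch, while the answer itself is Python 2; `extract_token` and `gwt_date_field` are hypothetical helper names, and the sample page text is made up. In Python 3 a response body arrives as bytes and would need decoding before the regex is applied.

```python
import datetime
import re

# Same pattern as in the answer: a 41-character token in single quotes.
TOKEN_RE = re.compile(r"token:'(.{41})'")


def extract_token(page_text):
    """Return the 41-character token embedded in the page, or None."""
    match = TOKEN_RE.search(page_text)
    return match.group(1) if match else None


def gwt_date_field(day):
    """Format a date as day|month|year, the order used in MCS_TEMPLATE."""
    return "%d|%d|%d" % (day.day, day.month, day.year)


# Made-up page fragment standing in for the dashboard HTML.
sample_page = "config = {token:'" + "A" * 41 + "'};"
token = extract_token(sample_page)
date_field = gwt_date_field(datetime.date(2013, 2, 10))
```
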
    

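For the final unpacking step, a Python 3 equivalent might look like the sketch below. It assumes the whole response body has been read into bytes and that the CSV is UTF-8 encoded (the Chrome headers in the question mention `charset=UTF-8`, but the actual file encoding is not confirmed). `read_gzipped_csv` is a hypothetical helper and the sample data is invented.

```python
import csv
import gzip
import io


def read_gzipped_csv(raw_bytes, encoding="utf-8"):
    """Decompress gzip bytes and return the CSV rows as lists of strings."""
    text = gzip.decompress(raw_bytes).decode(encoding)
    return list(csv.reader(io.StringIO(text)))


# Made-up sample standing in for the downloaded file.
sample = gzip.compress("ClientName,Clicks\nAcme,42\n".encode("utf-8"))
rows = read_gzipped_csv(sample)
```

Returning plain rows keeps the download code separate from whatever processing the CSV needs afterwards.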
    【Comments】:
