【Question Title】: Scraping a table based on the option value from a drop down
【Posted】: 2021-03-03 23:05:13
【Question】:

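I am trying to scrape the values from a table after changing the option value in a dropdown list. I found the Stack Overflow post "scraping a response from a selected option in dropdown list", but I still have not managed to get it working.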

The website I want to scrape is: https://www.myfxbook.com/forex-market/correlation

I am trying to get the "5 Minutes" table, but the request still returns the default "1 Day" table.

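import requests
import pandas as pd
from bs4 import BeautifulSoup

url = "https://www.myfxbook.com/forex-market/correlation"
with requests.Session() as session:
    response = session.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    # Attempt to select the "5 Minutes" option by posting the dropdown value
    data = {"timeScales": "5"}
    response = session.post(url, data=data)
    soup = BeautifulSoup(response.content, "html.parser")

# Still yields the default "1 Day" table
pd.read_html(str(soup))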

【Question Comments】:

    Tags: pandas dataframe beautifulsoup python-requests


    【Solution 1】:

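    To scrape the table, you can send a POST request with the correct headers and data to the JSON endpoint used in the code below (https://www.myfxbook.com/updateCorrelationSymbolMenu.json) instead of posting to the page URL.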

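    To select a different timeframe from the dropdown, change the value of timeScale in the data dictionary (below). The value appears to be expressed in minutes: for example, use "timeScale": "5" for the "5 Minutes" table, or "timeScale": "10080" for the "1 Week" table.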
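Since the value appears to be a number of minutes, a small helper can compute it instead of hard-coding it. This is only a sketch: the names `MINUTES_PER_UNIT` and `time_scale` are made up for illustration, and only "5" and "10080" are confirmed by this answer, so the site may well reject timeframes that are not offered in its dropdown.

```python
# Minutes in each unit. Assumption: the endpoint expects the timeframe in
# minutes, as the "5" (5 minutes) and "10080" (1 week) examples suggest.
MINUTES_PER_UNIT = {"minute": 1, "hour": 60, "day": 1440, "week": 10080}


def time_scale(count: int, unit: str) -> str:
    """Return the timeframe as a string of minutes, e.g. (1, "week") -> "10080"."""
    return str(count * MINUTES_PER_UNIT[unit])


print(time_scale(5, "minute"))  # 5
print(time_scale(1, "week"))    # 10080
```

The result can then be placed in the request payload, for example `data["timeScale"] = time_scale(1, "week")`.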

    For your example, selecting the "5 Minutes" table:

    import requests
    from bs4 import BeautifulSoup
    
    
    headers = {
        "x-requested-with": "XMLHttpRequest",
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36",
        "referer": "https://www.myfxbook.com/",
    }
    
    data = {
        "colSymbols": "8,9,10,6,7,1,4,2,5,3",
        "rowSymbols": "8,47,9,10,1234,11,103,12,46,1245,6,13,14,15,17,18,7,2114,19,20,21,22,1246,23,1,1233,107,24,25,4,2872,137,48,1236,1247,2012,2,1863,3240,26,49,27,28,2090,131,5,29,5779,31,34,3,36,37,38,2076,40,41,42,43,45,3005,3473,50,2115,2603,2119,1815,2521,51,12755,5435,5079,10064,1893",
        # Change the value of `timeScale` to get the different Timeframe, the value should be by minutes, for example to get 1 week, use `10080`
        "timeScale": "5",
        "z": "0.6367404250506281",
    }
    
    response = requests.post(
        "https://www.myfxbook.com/updateCorrelationSymbolMenu.json",
        headers=headers,
        data=data,
    ).json()
    
    table_html = response["content"]["marketCorrelationTable"]
    soup = BeautifulSoup(table_html, "html.parser")
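The snippet above stops once the table HTML has been parsed. One possible way to turn that HTML into rows of cell text is sketched here. It assumes the returned markup is an ordinary table made of `<tr>` rows with `<th>`/`<td>` cells, which this answer does not confirm. The helper name `table_rows` and the sample markup are hypothetical stand-ins for `response["content"]["marketCorrelationTable"]`.

```python
from bs4 import BeautifulSoup


def table_rows(table_html: str) -> list[list[str]]:
    """Return the stripped text of every cell, one list per <tr>."""
    soup = BeautifulSoup(table_html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:  # skip rows that contain no cells
            rows.append(cells)
    return rows


# Hypothetical sample markup, not real data from the site.
sample = (
    "<table>"
    "<tr><th>Symbol</th><th>EURUSD</th></tr>"
    "<tr><td>GBPUSD</td><td>87.5%</td></tr>"
    "</table>"
)
print(table_rows(sample))  # [['Symbol', 'EURUSD'], ['GBPUSD', '87.5%']]
```

If a DataFrame is needed, as in the question, the rows can be passed to `pd.DataFrame(rows[1:], columns=rows[0])`, provided the first row really is the header. Calling `pd.read_html` on the table HTML is another option.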
    

    【Comments】:
