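[Question title]: How can I scrape data from a URL in the Network tab?
[Posted]: 2020-12-24 12:27:37
[Question]:

I want to write a program that scrapes data from https://www.futbin.com/21/player/560/aubameyang. At the bottom of that page there is a daily and hourly graph section, and the hourly graph is what I want. In the browser inspector's Network tab it shows up as a request named https://www.futbin.com/21/playerPrices?player=188567&rids=84074647&_=1608811830598, which returns a list of recent sales history for all platforms (ps, xbox, pc) using LCPrice, LCPrice2, and so on. That is what I would like to scrape/extract.

Each player is also identified by an id. In this example the player's id is 188567, which I found through the Network tab request that provides the price list. My current code is below. It does not print or return anything; any help would be appreciated.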

import requests
from datetime import datetime

player_ids = {
  'Arturo Vidal': 181872,
  'Pierre-Emerick Aubameyang': 188567,
  'Robert Lewandowski': 188545,
  'Jerome Boateng': 183907,
  'Sergio Ramos': 155862,
  'Antoine Griezmann': 194765,
  'David Alaba': 197445,
  'Paulo Dybala': 211110,
  'Radja Nainggolan': 178518
}

for (name,id) in player_ids.items():
    r = requests.get('https://www.futbin.com/21/playerPrices?player={0}'.format(id))
    data = r.json()

    print(name)
    print("-"*20)
    #Change ps to xbox or pc to get other prices
    for price in data['ps']:
        price = price[1]
        print(price)

[Comments]:

  • You can use Selenium or BeautifulSoup

Tags: python html web-scraping python-requests-html


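[Solution 1]:

The question could be improved, but as I understand it, you are looking for something like the following example.

What is different

The data is accessed correctly, by player id first and then by console: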

data[str(id)]['prices']['ps'].values()
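To make the nesting concrete, here is a minimal sketch that runs without a network call. The `sample` dictionary is hand-written to mimic the shape implied by the expression above; it is not real API output, and the `updated` key and the helper name `platform_prices` are assumptions for illustration. Only `LCPrice` and `LCPrice2` are field names mentioned in the question.

```python
# Hand-written sample that mimics the nesting used above (NOT real API output).
# 'updated' is an assumed key name; 'LCPrice'/'LCPrice2' come from the question.
sample = {
    "188567": {
        "prices": {
            "ps": {"LCPrice": "59,000", "LCPrice2": "59,000", "updated": "13 mins ago"},
            "xbox": {"LCPrice": "57,000", "LCPrice2": "57,500", "updated": "14 mins ago"},
        }
    }
}


def platform_prices(data, player_id, platform):
    """Return the price fields for one player on one platform (hypothetical helper)."""
    # JSON object keys are always strings, so the integer id must be converted first.
    return data[str(player_id)]["prices"][platform]


ps = platform_prices(sample, 188567, "ps")
print(ps["LCPrice"])      # 59,000
print("ps" in sample)     # False: the top level is keyed by player id, not by platform
```

The last line shows why indexing the response with `data['ps']` at the top level cannot work in a structure like this: the platform names only appear two levels down.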

Example:

import requests
from datetime import datetime

player_ids = {
  'Arturo Vidal': 181872,
  'Pierre-Emerick Aubameyang': 188567,
  'Robert Lewandowski': 188545,
  'Jerome Boateng': 183907,
  'Sergio Ramos': 155862,
  'Antoine Griezmann': 194765,
  'David Alaba': 197445,
  'Paulo Dybala': 211110,
  'Radja Nainggolan': 178518
}

for (name,id) in player_ids.items():
    r = requests.get('https://www.futbin.com/21/playerPrices?player={0}'.format(id))
    data = r.json()

    print(name)
    print("-"*20)

    # The response is keyed by the player id (as a string), then 'prices', then the platform
    psPrices = list(data[str(id)]['prices']['ps'].values())
    print(psPrices)
    xboxPrices = list(data[str(id)]['prices']['xbox'].values())
    print(xboxPrices)

Output:

Arturo Vidal
--------------------
['0', '0', '0', '0', '0', '10 weeks ago', '3,600', '65,000', '0']
['0', '0', '0', '0', '0', '10 weeks ago', '2,100', '37,500', '100']
Pierre-Emerick Aubameyang
--------------------
['59,000', '59,000', '0', '0', '0', '13 mins ago', '12,250', '230,000', '21']
['57,000', '57,500', '58,000', '58,000', '58,000', '14 mins ago', '11,000', '210,000', '23']
Robert Lewandowski
--------------------
['72,500', '72,500', '72,500', '72,500', '72,500', '14 mins ago', '6,000', '110,000', '63']
['73,500', '73,500', '73,500', '73,500', '73,500', '2 mins ago', '7,400', '140,000', '49']
Jerome Boateng
--------------------
['1,400', '1,400', '1,400', '1,400', '1,400', '15 mins ago', '700', '10,000', '7']
['1,300', '1,300', '1,300', '1,300', '1,300', '4 mins ago', '700', '10,000', '6']
Sergio Ramos
--------------------
['50,000', '50,500', '50,500', '50,500', '50,500', '19 mins ago', '8,000', '150,000', '29']
['51,000', '51,000', '51,000', '51,000', '0', '15 mins ago', '7,200', '140,000', '32']
Antoine Griezmann
--------------------
['29,250', '29,250', '29,250', '29,250', '29,250', '35 mins ago', '2,800', '50,000', '56']
['32,750', '32,750', '33,000', '33,000', '33,000', '37 mins ago', '2,900', '55,000', '57']
David Alaba
--------------------
['0', '0', '0', '0', '0', '14 mins ago', '700', '10,000', '100']
['0', '0', '0', '0', '0', '16 mins ago', '700', '11,000', '100']
Paulo Dybala
--------------------
['36,000', '36,000', '36,000', '36,250', '36,500', '19 mins ago', '3,600', '65,000', '52']
['37,500', '37,500', '37,500', '38,000', '38,000', '1 min ago', '3,100', '55,000', '66']
Radja Nainggolan
--------------------
['2,100', '2,100', '2,100', '2,100', '2,100', '21 mins ago', '700', '10,000', '15']
['1,900', '1,900', '1,900', '1,900', '1,900', '32 mins ago', '700', '10,000', '12']
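The values come back as strings with thousands separators, mixed with a non-numeric entry such as '13 mins ago'. If numbers are needed for further processing, one possible approach is sketched below. The helper names `parse_price` and `numeric_prices` are hypothetical, and the sketch only assumes the string formats visible in the output above.

```python
def parse_price(text):
    """Convert a string such as '59,000' to the integer 59000."""
    return int(text.replace(",", ""))


def numeric_prices(values):
    """Return the numeric entries as integers, skipping text such as '13 mins ago'."""
    result = []
    for value in values:
        cleaned = value.replace(",", "")
        # str.isdigit() is True only when every character is a digit
        if cleaned.isdigit():
            result.append(int(cleaned))
    return result


# One row copied from the output above
row = ['59,000', '59,000', '0', '0', '0', '13 mins ago', '12,250', '230,000', '21']
print(numeric_prices(row))
# [59000, 59000, 0, 0, 0, 12250, 230000, 21]
```

Whether a '0' entry means "no listing" or an actual price is not stated in the source, so the sketch keeps zeros as they are rather than filtering them out.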
