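[Posted]: 2018-05-29 11:25:48
[Question]:
I am trying to extract a table with Python, but I cannot get rid of the \n characters, even though I have tried the replace, remove, rsplit, and lsplit functions. Please help.
My code is below.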
from urllib.request import urlopen
from bs4 import BeautifulSoup
import requests
import pandas as pd
url = "https://shared.websol.barchart.com/quotes/quote.php?page=quote&sym=ng&x=13&y=8&domain=if&display_ice=1&enabled_ice_exchanges=&tz=0&ed=0"
res = requests.get(url)
soup = BeautifulSoup(res.text, 'lxml')
soup.prettify()
Header = soup.findAll('tr', limit=2)[1].findAll('th')
column_headers = [th.getText() for th in soup.findAll('tr', limit=2)[1].findAll('th')]
print(column_headers)
data_rows = soup.findAll('tr')[2:]
i = range(len(data_rows))
for td in data_rows:
    row = td.get_text()
    print(row)
The output of my code is shown below. Only the first few rows are copied.
['Contract', 'Last', 'Change', 'Open', 'High', 'Low', 'Volume', 'Prev. Stl.', 'Time', 'Links']
\n Cash (NGY00)\n 2.890s\n +0.020\n 0.000\n 2.890\n 2.890\n 0\n 2.870\n 05/25/18\n Q / C / O\n
\n Jun \'18 (NGM18)\n 2.946\n +0.007\n 2.946\n 2.968\n 2.908\n 2331\n 2.939\n 19:13\n Q / C / O\n
\n Jul \'18 (NGN18)\n 2.974\n +0.011\n 2.974\n 3.000\n 2.937\n 23859\n 2.963\n 19:37\n Q / C / O\n
\n Aug \'18 (NGQ18)\n 2.989\n +0.006\n 2.983\n 3.016\n 2.957\n 4434\n 2.983\n 18:25\n Q / C / O\n
\n Sep \'18 (NGU18)\n 2.977\n +0.010\n 2.970\n 2.998\n 2.942\n 2313\n 2.967\n 18:07\n Q / C / O\n
\n Oct \'18 (NGV18)\n 2.975\n +0.005\n 2.969\n 2.999\n 2.944\n 2259\n 2.970\n 19:01\n Q / C / O\n
\n Nov \'18 (NGX18)\n 3.013\n +0.005\n 3.007\n 3.034\n 2.983\n 1774\n 3.008\n 19:18\n Q / C / O\n
\n Dec \'18 (NGZ18)\n 3.113\n +0.007\n 3.106\n 3.131\n 3.082\n 1287\n 3.106\n 17:59\n Q / C / O\n
\n Jan \'19 (NGF19)\n 3.198\n +0.011\n 3.177\n 3.212\n 3.165\n 1737\n 3.187\n 17:51\n Q / C / O\n
\n Feb \'19 (NGG19)\n 3.156\n +0.008\n 3.137\n 3.170\n 3.126\n 776\n 3.148\n 17:39\n Q / C / O\n
\n Mar \'19 (NGH19)\n 3.042\n +0.002\n 3.042\n 3.063\n 3.017\n 2891\n 3.040\n 18:27\n Q / C / O\n
\n Apr \'19 (NGJ19)\n 2.672\n +0.018\n 2.662\n 2.676\n 2.648\n 2403\n 2.654\n 11:00\n Q / C / O\n
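[Discussion]:
- So I assume your variable row is not really a row but a single field. Have you tried print(row.strip())? Note that strip() does not modify the string in place; it returns a new string.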
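The point about strip() can be illustrated with a minimal sketch. It assumes the text of one row contains real newline characters (rather than the two literal characters \ and n); the sample string is copied from the first row of the output in the question, and the names raw_row and split_row are hypothetical, introduced only for this example.

```python
# Hypothetical sample: text shaped like the first row of the output in the
# question, assumed to contain real newline characters.
raw_row = (
    "\n Cash (NGY00)\n 2.890s\n +0.020\n 0.000\n 2.890\n 2.890"
    "\n 0\n 2.870\n 05/25/18\n Q / C / O\n"
)

# Calling strip() and discarding the result changes nothing:
# Python strings are immutable, so raw_row still starts with "\n".
raw_row.strip()

# Keep the return value to get the cleaned string.
stripped = raw_row.strip()


def split_row(text):
    """Split newline-separated row text into a list of trimmed fields.

    Hypothetical helper: splits on newlines, strips each piece, and
    drops pieces that are empty after stripping.
    """
    return [part.strip() for part in text.split("\n") if part.strip()]


fields = split_row(raw_row)
print(fields)
```

Note that strip() only trims the ends of the whole string, so the newlines between the fields remain in stripped; splitting on "\n" and stripping each piece is one way to handle those. If the data stays in BeautifulSoup, calling get_text(strip=True) on each td cell of a row, instead of get_text() on the whole tr, is another option that may be cleaner.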
Tags: python