【Question title】: Removing names/characters won't allow me
【Posted】: 2015-07-25 17:33:57
【Question】:

I tried string.replace("'","H"), but that returns an error:

AttributeError: 'list' object has no attribute 'replace'

I could also use re.sub, but that produces a similar error.


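I may have found a way around the problem:

    25 July 2015
    Scottish Football
    East Stirling 2 - Stenhousemuir 3
    [u" Donaldson 30' ", u" McKenna 77' "]
    [u" Stirling 35', 45' ", u" McMenamin 59' "]

My output is shown above. How do I strip the surrounding [u" ] and then replace ' with H on the top line and with A on the second line?

I'm trying to make the bottom two lines look like this:

    25 July 2015
    Scottish Football
    East Stirling 2 - Stenhousemuir 3
    30H, 77H,
    35A, 45A, 59A,

and then remove all the names from the text.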

import requests
from bs4 import BeautifulSoup
import csv
import re
from collections import OrderedDict

def parse_page(data):
    subsoup = BeautifulSoup(data)
    rs = requests.get("http://www.bbc.co.uk/sport/0/football/33578498")
    ssubsoup = BeautifulSoup(rs.content)
    matchoverview = subsoup.find('div', attrs={'id':'match-overview'})
    print '--------------'
    date = ssubsoup.find('div', attrs={'id':'article-sidebar'}).findNext('span').text
    league = ssubsoup.find('a', attrs={'class':'secondary-nav__link'}).findNext('span').findNext('span').text
    #HomeTeam info printing
    homeTeam = matchoverview.find('div', attrs={'class':'team-match-details'}).findNext('span').findNext('a').text
    homeScore = matchoverview.find('div', attrs={'class':'team-match-details'}).findNext('span').findNext('span').text
    homeGoalScorers = []

    for goals in matchoverview.find('div', attrs={'class':'team-match-details'}).findNext('p').find_all('span'):
        homeGoalScorers.append(goals.text.replace(u'\u2032', "'"))
    homeGoals = homeGoalScorers

    #AwayTeam info printing
    awayTeam = matchoverview.find('div', attrs={'id': 'away-team'}).find('div', attrs={'class':'team-match-details'}).findNext('span').findNext('a').text
    awayScore = matchoverview.find('div', attrs={'id': 'away-team'}).find('div', attrs={'class':'team-match-details'}).findNext('span').findNext('span').text
    awayGoalScorers = []
    for goals in matchoverview.find('div', attrs={'id': 'away-team'}).find('div', attrs={'class':'team-match-details'}).findNext('p').find_all('span'):
        awayGoalScorers.append(goals.text.replace(u'\u2032', "'"))
    awayGoals = awayGoalScorers

    #Printouts
    print date
    print league
    print '{0} {1} - {2} {3}'.format(homeTeam, homeScore, awayTeam, awayScore)
    print homeGoals
    print awayGoals
    if len(homeTeam) >1:
        with open('score.txt', 'a') as f:
            writer = csv.writer(f)
            writer.writerow([league,date,homeTeam,awayTeam])

def all_league_results():
    r = requests.get("http://www.bbc.co.uk/sport/football/league-one/results")
    soup = BeautifulSoup(r.content)

    # Save Teams
    for link in soup.find_all("a", attrs={'class': 'report'}):
        fullLink = 'http://www.bbc.com' + link['href']
        subr = requests.get(fullLink)
        parse_page(subr.text)

def specific_game_results(url):
    subr = requests.get(url)
    parse_page(subr.text)

#get specific games results
specific_game_results('http://www.bbc.co.uk/sport/0/football/33578498')

【Question discussion】:

    Tags: python csv beautifulsoup


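    【Solution 1】:

    I believe you can change the code here:

    for goals in matchoverview.find('div', attrs={'class':'team-match-details'}).findNext('p').find_all('span'):
        homeGoalScorers.append(goals.text.replace(u'\u2032', "'") + 'H')
    homeGoals = ",".join(homeGoalScorers)

    This appends H to each entry and joins the list into a single string, so the [u" ] wrapper no longer appears when it is printed. The names and the apostrophe are still present in each entry.

    Removed homeGoals = "H".join(homeGoalScorers)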
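
To get closer to the output the question asks for (minutes only, tagged H or A), one possible approach is to pull the minutes out of each entry with a regular expression instead of replacing characters. This is only a sketch: the helper name goal_minutes is hypothetical, the sample lists are copied from the printed output in the question, and the pattern assumes every minute is written as digits (optionally with stoppage time such as 90+2) followed by a plain apostrophe.

```python
import re

def goal_minutes(scorers, suffix):
    """Hypothetical helper: keep only the minutes and tag each with suffix."""
    minutes = []
    for entry in scorers:
        # One entry can hold several minutes, e.g. " Stirling 35', 45' ".
        # Assumed format: digits, optional "+digits", then an apostrophe.
        minutes.extend(re.findall(r"(\d+(?:\+\d+)?)'", entry))
    return ", ".join(m + suffix for m in minutes)

# Sample data taken from the output shown in the question.
home = [u" Donaldson 30' ", u" McKenna 77' "]
away = [u" Stirling 35', 45' ", u" McMenamin 59' "]

home_line = goal_minutes(home, "H")
away_line = goal_minutes(away, "A")
print(home_line)  # 30H, 77H
print(away_line)  # 35A, 45A, 59A
```

In parse_page this would be called on homeGoalScorers and awayGoalScorers once the loops have filled them. The desired output in the question ends each line with a trailing comma; appending "," to the returned string would reproduce that.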

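    【Discussion】:

    • In that case, you should join the list into a string and then replace the characters you want.
    • I can't see how I would do that :/
    • @footystattowannab The error message says you are calling the replace method on a list object, but list objects don't have one, so you should convert/join the list into a string first.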
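
The point made in the comments can be illustrated with a short sketch. The sample list is copied from the output in the question; the variable names are illustrative only. It shows the failing call and two ways around it: joining the list into one string first, or calling replace on each element.

```python
scorers = [u" Donaldson 30' ", u" McKenna 77' "]

# A list has no .replace() method, so this raises AttributeError.
try:
    scorers.replace("'", "H")
except AttributeError as exc:
    error_text = str(exc)
    print(error_text)

# Option 1: join the list into a single string, then replace.
joined = ",".join(scorers).replace("'", "H")
print(joined)

# Option 2: replace inside each element and keep a list.
per_item = [s.replace("'", "H") for s in scorers]
print(per_item)
```

Either option removes the error; neither strips the player names, which needs a separate step.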