【Question title】: Parsing soccer stats stored as XML [closed]
【Posted】: 2021-11-07 16:09:33
【Question description】:

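I'm trying to help a friend track his soccer team's progress by building visualizations from player statistics. I have the .xml files that are used to post stats on the conference website, but I can't figure out how to actually read the data. Here is a snippet of what I'm working with. Thanks very much!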

<?xml version="1.0" encoding="UTF-8"?>
<teams>
  <team vh="V" id="COLLEGE" name="College" code="conf125" record="0-2-0"> 
    <linescore periods="2" line="0,0" score="0" shotline="1,3" shots="4"> 
      <lineprd prd="1" score="0" shots="1" saves="1" fouls="3" corners="0" offsides="0"></lineprd>  
      <lineprd prd="2" score="0" shots="3" saves="2" fouls="5" corners="2" offsides="0"></lineprd> 
    </linescore>  
    <totals> 
      <shots g="0" a="0" sh="4" sog="1" ps="0" psatt="0"></shots>  
      <goaltype gw="0" ua="0" fg="0" ot="0" en="0" hat="0" gt="0" so="0"></goaltype>  
      <penalty count="0" red="0" yellow="0" green="0" fouls="8"></penalty>  
      <misc minutes="957" dsave="0"></misc>  
      <goalie minutes="90:00" ga="3" saves="3" sf="6" shutout="0" savebyprd="1,2"> 
        <savesbyprd prd="1" saves="1"></savesbyprd>  
        <savesbyprd prd="2" saves="2"></savesbyprd> 
      </goalie> 
    </totals> 
    <player uni="4" code="4" name="First Last" checkname="LAST,FIRST" gp="1" playerId="jg777sdg276512tqj"> 
      <shots g="0" a="0" sh="0" sog="0" ps="0" psatt="0"></shots>  
      <goaltype gw="0" ua="0" fg="0" ot="0" en="0" hat="0" gt="0" so="0"></goaltype>  
      <penalty count="0" red="0" yellow="0" green="0" fouls="0"></penalty>  
      <misc minutes="39" dsave="0"></misc> 
    </player> 
    </team>
</teams> 

【Comments】:

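  • What do you want to extract from the XML? What have you tried so far?
  • Each file has two <team> items, each with several <player> items. I want to extract uni, name, and gp from each <player>, plus everything in its child items. I tried pd.read_xml(), which shows 'player' as NaN. I also tried following the example here (medium.com/@robertopreste/…) using et.parse(), but again couldn't "access" the player data.
  • Could you share what you have done so far?
  • Sure. pd.read_xml() returns a (2,8) DataFrame with the following columns: vh id name code record linescore totals player. 'player' has the data I'm looking for, but it comes back as NaN with this method. et.parse() returns an ElementTree, and I can then use iterfind() to print the 'team' elements. { for team in plyrs.iterfind('team'): print(team) } I don't know how to dig any further from there. If I try to parse 'team', I get TypeError: expected str, bytes or os.PathLike...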
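
The TypeError described in the last comment is consistent with passing an already-parsed Element back to et.parse(), which expects a file name or file object. A minimal sketch of drilling into each team element directly instead, assuming a trimmed-down copy of the snippet above (the SAMPLE string and the variable names are illustrative, not from the original post):

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative copy of the question's XML (not the full file).
SAMPLE = """<teams>
  <team vh="V" id="COLLEGE" name="College">
    <player uni="4" name="First Last" gp="1">
      <shots g="0" a="0" sh="0" sog="0"></shots>
      <misc minutes="39" dsave="0"></misc>
    </player>
  </team>
</teams>"""

root = ET.fromstring(SAMPLE)  # for a file on disk: ET.parse(path).getroot()

found = []
for team in root.iterfind("team"):
    # `team` is already an Element, so query it directly rather than re-parsing it.
    for player in team.iterfind("player"):
        minutes = player.find("misc").get("minutes")
        found.append((team.get("name"), player.get("uni"), player.get("name"), minutes))

print(found)
```

Each Element supports find(), findall(), iterfind(), and get(), so no second parse step is needed once the tree has been loaded.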

Tags: python pandas xml statistics


【Solution 1】:

See below.

import xml.etree.ElementTree as ET
from collections import defaultdict

xml = '''<teams>
  <team vh="V" id="COLLEGE" name="College" code="conf125" record="0-2-0"> 
    <linescore periods="2" line="0,0" score="0" shotline="1,3" shots="4"> 
      <lineprd prd="1" score="0" shots="1" saves="1" fouls="3" corners="0" offsides="0"></lineprd>  
      <lineprd prd="2" score="0" shots="3" saves="2" fouls="5" corners="2" offsides="0"></lineprd> 
    </linescore>  
    <totals> 
      <shots g="0" a="0" sh="4" sog="1" ps="0" psatt="0"></shots>  
      <goaltype gw="0" ua="0" fg="0" ot="0" en="0" hat="0" gt="0" so="0"></goaltype>  
      <penalty count="0" red="0" yellow="0" green="0" fouls="8"></penalty>  
      <misc minutes="957" dsave="0"></misc>  
      <goalie minutes="90:00" ga="3" saves="3" sf="6" shutout="0" savebyprd="1,2"> 
        <savesbyprd prd="1" saves="1"></savesbyprd>  
        <savesbyprd prd="2" saves="2"></savesbyprd> 
      </goalie> 
    </totals> 
    <player uni="4" code="4" name="First Last" checkname="LAST,FIRST" gp="1" playerId="jg777sdg276512tqj"> 
      <shots g="0" a="0" sh="0" sog="0" ps="0" psatt="0"></shots>  
      <goaltype gw="0" ua="0" fg="0" ot="0" en="0" hat="0" gt="0" so="0"></goaltype>  
      <penalty count="0" red="0" yellow="0" green="0" fouls="0"></penalty>  
      <misc minutes="39" dsave="0"></misc> 
    </player> 
    </team>
</teams> '''

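# Map each team name to a list of per-player dicts.
data = defaultdict(list)
root = ET.fromstring(xml)
for team in root.findall('.//team'):
    team_name = team.attrib['name']
    for player in team.findall('player'):
        # Copy the attributes so the parsed tree itself is not modified.
        record = dict(player.attrib)
        # Nest each stat element's attributes under its tag name.
        for entry in ['shots', 'goaltype', 'penalty', 'misc']:
            record[entry] = player.find(entry).attrib
        data[team_name].append(record)

print(data)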

Output

defaultdict(<class 'list'>, {'College': [{'uni': '4', 'code': '4', 'name': 'First Last', 'checkname': 'LAST,FIRST', 'gp': '1', 'playerId': 'jg777sdg276512tqj', 'shots': {'g': '0', 'a': '0', 'sh': '0', 'sog': '0', 'ps': '0', 'psatt': '0'}, 'goaltype': {'gw': '0', 'ua': '0', 'fg': '0', 'ot': '0', 'en': '0', 'hat': '0', 'gt': '0', 'so': '0'}, 'penalty': {'count': '0', 'red': '0', 'yellow': '0', 'green': '0', 'fouls': '0'}, 'misc': {'minutes': '39', 'dsave': '0'}}]})
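
Since the goal in the question is visualization, a flat one-row-per-player layout may be easier to work with than the nested dicts above. A rough sketch under stated assumptions: the SAMPLE string is a trimmed copy of the question's XML, player_rows is a hypothetical helper name, and the tag_attribute column naming is just one possible convention:

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative copy of the question's XML (not the full file).
SAMPLE = """<teams>
  <team vh="V" id="COLLEGE" name="College">
    <player uni="4" name="First Last" gp="1">
      <shots g="0" a="0" sh="0" sog="0"></shots>
      <misc minutes="39" dsave="0"></misc>
    </player>
  </team>
</teams>"""


def player_rows(root):
    """Return one flat dict per <player>, prefixing child attributes with the child tag."""
    rows = []
    for team in root.iter("team"):
        for player in team.findall("player"):
            row = {"team": team.get("name")}
            row.update(player.attrib)  # uni, name, gp, ... stay as strings
            for child in player:
                for key, value in child.attrib.items():
                    # Convert plain digit strings to int; leave anything else untouched.
                    row[f"{child.tag}_{key}"] = int(value) if value.isdigit() else value
            rows.append(row)
    return rows


rows = player_rows(ET.fromstring(SAMPLE))
print(rows[0])
```

If pandas is installed, passing the list to pd.DataFrame(rows) should give one row per player with columns such as shots_sog and misc_minutes, ready for plotting.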
