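【Posted】: 2014-07-15 18:35:27
【Question】:
I have a list of BeautifulSoup objects, and I'm trying to parse the contents of the table cells further. The output should be a list of lists, each with 3 items, since the table has 3 columns.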
file = <html><p><center><h1> Interference Report </h1></center><p>
<b> Interference Report Project File: </b>C:\Users\ksobon\Documents\test_project_03_ksobon.rvt <br> <b> Created: </b> Monday, May 26, 2014 7:52:32 PM <br> <b> Last Update: </b> <br>
<p><table border=on> <tr> <td></td> <td ALIGN="center">A</td> <td ALIGN="center">B</td> </tr>
<tr> <td> 1 </td> <td> Workset1 : Walls : Basic Wall : E103-CON 100mm : id 469021 </td> <td> Workset1 : Furniture : FUR_BoardroomTable10Chairs_gm : Board Room Layout : id 482259 </td> </tr>
<tr> <td> 2 </td> <td> Workset1 : Walls : Basic Wall : E103-CON 100mm : id 469021 </td> <td> Workset1 : Walls : Basic Wall : E103-CON 100mm : id 483442 </td> </tr>
<tr> <td> 3 </td> <td> Workset1 : Walls : Basic Wall : E103-CON 100mm : id 469060 </td> <td> Workset1 : Furniture : FUR_Sofa_gm : 2100mm : id 475041 </td> </tr>
<tr> <td> 4 </td> <td> Workset1 : Walls : Basic Wall : E103-CON 100mm : id 469109 </td> <td> Workset1 : Furniture : FUR_Sofa_gm : 2100mm : id 475273 </td> </tr>
<tr> <td> 5 </td> <td> Workset1 : Walls : Basic Wall : E103-CON 100mm : id 469178 </td> <td> Workset1 : Furniture : FUR_Sofa_gm : 2100mm : id 475510 </td> </tr>
<tr> <td> 6 </td> <td> Workset1 : Walls : Basic Wall : E103-CON 100mm : id 469178 </td> <td> Workset1 : Furniture : FUR_Sofa_gm : 2100mm : id 482306 </td> </tr>
<tr> <td> 7 </td> <td> whatever : Doors : DOR_Single_gm : 800w, 2100h (720Leaf) - Mark 102B : id 472052 </td> <td> Workset1 : Windows : WIN-ConceptWindowFixed_gm : 1200 H x 1200 W - Mark 102B : id 472822 </td> </tr>
<tr> <td> 8 </td> <td> whatever : Doors : DOR_Single_gm : 800w, 2100h (720Leaf) - Mark 101A : id 472376 </td> <td> Workset1 : Windows : WIN-ConceptWindowFixed_gm : 1200 H x 1200 W - Mark 101C : id 472720 </td> </tr>
<tr> <td> 9 </td> <td> Workset1 : Windows : WIN-ConceptWindowFixed_gm : 1800 H x 1200 W 2 - Mark 101B : id 472688 </td> <td> Workset1 : Furniture : FUR_Sofa_gm : 2100mm : id 482306 </td> </tr>
</table>
<p><b> End of Interference Report </b>
</html>
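from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 import path
soup = BeautifulSoup(file)  # file holds the HTML report shown above
tag = soup.findAll('tr')  # one entry per table row
txt = []  # will hold one list of <td> tags per row
for i in tag:
    txt.append(i.findAll('td'))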
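Now I want to convert each element of every sub-list to text, so I tried:
txt1 = [i.text for x in txt for i in x]
However, txt1 comes out as a flat list rather than a list of lists. What am I doing wrong?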
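One likely cause, sketched below: a comprehension with two `for` clauses walks every cell of every row and emits a single flat sequence, whereas nesting one comprehension inside another keeps one sub-list per row. To keep the sketch runnable without BeautifulSoup installed, `Cell` is a hypothetical stand-in for a `<td>` tag (only its `.text` attribute is used), and the names `flat` and `nested` are illustrative, not from the post.

```python
from collections import namedtuple

# Hypothetical stand-in for a BeautifulSoup <td> tag; only .text is needed here.
Cell = namedtuple("Cell", "text")

# Same shape as txt in the question: one list of cells per table row.
txt = [
    [Cell(" 1 "),
     Cell(" Workset1 : Walls : Basic Wall : E103-CON 100mm : id 469021 "),
     Cell(" Workset1 : Furniture : FUR_BoardroomTable10Chairs_gm : Board Room Layout : id 482259 ")],
    [Cell(" 2 "),
     Cell(" Workset1 : Walls : Basic Wall : E103-CON 100mm : id 469021 "),
     Cell(" Workset1 : Walls : Basic Wall : E103-CON 100mm : id 483442 ")],
]

# Two for-clauses in ONE comprehension: rows are flattened into a single list.
flat = [i.text for x in txt for i in x]

# An inner comprehension per row: the row structure is preserved.
# .strip() is an optional extra that trims the padding around each cell.
nested = [[i.text.strip() for i in x] for x in txt]
```

With real tags from `findAll('td')`, the same nested form should apply unchanged, since each tag exposes `.text` in the same way the question already relies on.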
【Comments】:
Tags: python list beautifulsoup