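Concatenate span results from Python BeautifulSoup into a string: answers

【Question title】: Concatenate span results from Python beautifulsoup into string
【Posted】: 2021-07-24 00:56:56
【Question】:

The snippet below works as intended, but as an improvement I would like to join the item results into a single comma-separated string. I have been trying, but no luck so far.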

from bs4 import BeautifulSoup
from urllib import request
from urllib.request import Request, urlopen

url = 'https://bscscan.com/tx/0xb9044e77ae66b6f128866e049d55f09b3501de6fc75478e406e4c32d1de4bd6a'
headers = {'User-Agent': 'Mozilla/5.0'}

req = Request(url, headers=headers)
html = urlopen(req).read()
soup = BeautifulSoup(html, 'html.parser')

main_data = soup.select("ul#wrapperContent div.media-body")
for item in main_data:
    all_span = item.find_all("span", class_='mr-1')
    last_span = all_span[-1]
    all_a = item.find_all("a")
    last_a = all_a[-1]
    print("{:>35} | {:18} | https://bscscan.com{}".format(last_span.get_text(strip=True), last_a.get_text(strip=True), last_a['href']))

Current output:

                    2 ($598.51) | Wrapped BNB (WBNB) | https://bscscan.com/token/0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c
          13.684565595242991082 | MoMo KEY (KEY)     | https://bscscan.com/token/0x85c128ee1feeb39a59490c720a9c563554b51d33
                              4 | Chi Gastoken...(CHI) | https://bscscan.com/token/0x0000000000004946c0e9f43f4dee607b0ef1fa1c

Desired improvement:

                    2 ($598.51) | Wrapped BNB (WBNB) | https://bscscan.com/token/0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c
          13.684565595242991082 | MoMo KEY (KEY)     | https://bscscan.com/token/0x85c128ee1feeb39a59490c720a9c563554b51d33
                              4 | Chi Gastoken...(CHI) | https://bscscan.com/token/0x0000000000004946c0e9f43f4dee607b0ef1fa1c
         -> Wrapped BNB (WBNB) , MoMo KEY (KEY) , Chi Gastoken...(CHI) #-- Concatenated String

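【Comments】:

Tags: python python-3.x string beautifulsoup python-requests

【Solution 1】:

First, the strings you are trying to concatenate appear to be the text of the links, not of the spans.

Second, initialize a string before the loop (in your case it will not be empty, because you want it to start with '->'), then append the desired text on each iteration, and you will end up with the final result. Try the following: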

from bs4 import BeautifulSoup
from urllib import request
from urllib.request import Request, urlopen

url = 'https://bscscan.com/tx/0xb9044e77ae66b6f128866e049d55f09b3501de6fc75478e406e4c32d1de4bd6a'
headers = {'User-Agent': 'Mozilla/5.0'}

req = Request(url, headers=headers)
html = urlopen(req).read()
soup = BeautifulSoup(html, 'html.parser')

main_data = soup.select("ul#wrapperContent div.media-body")
link_texts = '->'    # initialize a new string
for item in main_data:
    all_span = item.find_all("span", class_='mr-1')
    last_span = all_span[-1]
    all_a = item.find_all("a")
    last_a = all_a[-1]
    print("{:>35} | {:18} | https://bscscan.com{}".format(last_span.get_text(strip=True), last_a.get_text(strip=True), last_a['href']))
    link_texts += last_a.get_text(strip=True) + ","    # append this row's link text, followed by a comma
link_texts = link_texts.rstrip(",")    # drop the trailing comma; unlike [:-1], this leaves '->' intact when no rows matched
print(link_texts)

Here is the output:

                        2 ($597.04) | Wrapped BNB (WBNB) | https://bscscan.com/token/0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c
              13.684565595242991082 | MoMo KEY (KEY)     | https://bscscan.com/token/0x85c128ee1feeb39a59490c720a9c563554b51d33
                                  4 | Chi Gastoken...(CHI) | https://bscscan.com/token/0x0000000000004946c0e9f43f4dee607b0ef1fa1c
->Wrapped BNB (WBNB),MoMo KEY (KEY),Chi Gastoken...(CHI)
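
The accumulate-then-trim pattern can be tried without network access. The following is a minimal sketch, not the answerer's code: `sample_texts` stands in for the values that `last_a.get_text(strip=True)` returns on each row, and `accumulate` is a hypothetical helper name. It trims the trailing comma only when something was appended, since a blind `[:-1]` would cut into the '->' prefix if the selector matched nothing.

```python
# Sample link texts, hard-coded in place of the scraped values (assumption).
sample_texts = ["Wrapped BNB (WBNB)", "MoMo KEY (KEY)", "Chi Gastoken...(CHI)"]


def accumulate(texts, prefix="->"):
    """Append each text plus a comma, then drop the single trailing comma."""
    result = prefix
    for text in texts:
        result += text + ","
    # Trim only if at least one item was appended; otherwise the
    # slice would remove the last character of the prefix.
    if texts:
        result = result[:-1]
    return result


line = accumulate(sample_texts)
print(line)
```

With an empty list, `accumulate([])` returns the prefix unchanged.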

【Discussion】:

【Solution 2】:

You should store the values in a list (declared before the for loop) and join them with ', '.join(list_variable).

Something like:

temp_list = []
main_data = soup.select("ul#wrapperContent div.media-body")
for item in main_data:
    all_span = item.find_all("span", class_='mr-1')
    last_span = all_span[-1]
    all_a = item.find_all("a")
    last_a = all_a[-1]
    print("{:>35} | {:18} | https://bscscan.com{}".format(last_span.get_text(strip=True), last_a.get_text(strip=True), last_a['href']))
    temp_list.append(last_a.get_text(strip=True))

print(', '.join(temp_list))
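
To reproduce the exact line requested in the question (a '-> ' prefix and ' , ' between items), the same list can be joined with a different separator. This is a small sketch under assumptions: `sample_texts` is hard-coded in place of `temp_list`, and `join_with_prefix` is a hypothetical helper, not part of BeautifulSoup.

```python
# Sample link texts, standing in for the contents of temp_list (assumption).
sample_texts = ["Wrapped BNB (WBNB)", "MoMo KEY (KEY)", "Chi Gastoken...(CHI)"]


def join_with_prefix(texts, prefix="-> ", sep=" , "):
    """Join texts with sep and put prefix in front of the result."""
    return prefix + sep.join(texts)


joined = join_with_prefix(sample_texts)
print(joined)
```

Because `str.join` places the separator only between items, no trailing comma has to be removed afterwards.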

【Discussion】:

I was doing it wrong. I was creating the list inside the loop.
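
The mistake described in this comment is presumably the one sketched below: when the list is created inside the loop, it is reset on every iteration and only the last item survives. The function names and `sample_texts` are hypothetical and used only for illustration.

```python
# Sample link texts, hard-coded in place of the scraped values (assumption).
sample_texts = ["Wrapped BNB (WBNB)", "MoMo KEY (KEY)", "Chi Gastoken...(CHI)"]


def collect_inside(texts):
    """Buggy variant: the list is recreated on every iteration."""
    names = []
    for text in texts:
        names = []  # bug: discards everything collected so far
        names.append(text)
    return ", ".join(names)


def collect_outside(texts):
    """Working variant: the list is created once, before the loop."""
    names = []
    for text in texts:
        names.append(text)
    return ", ".join(names)


print(collect_inside(sample_texts))
print(collect_outside(sample_texts))
```

The first call prints only the last token name; the second prints all three, separated by commas.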