【Question title】: Adding a new column with pandas
【Posted】: 2022-01-26 23:25:41
【Question description】:

I want to scrape Google Play apps and add a new column "Google" whose value is True or False depending on whether each app is available there.

This is my CSV file, "apkmonk.csv":

Id,Genre,LastUpdated,Name,Package
0, Adventure,"Dec 16, 2021",Merge Mermaids-design home&create magic fish life. apk,com.xjoy.mermaid
1, Adventure,"Dec 10, 2021",Nob's World - Super Run Game apk,org.game69studio.nobworld
2, Adventure,"Dec 15, 2021",Fps Shooting Strike: Gun Games apk,com.mizo.fps.shooting.strike
3, Adventure,"Dec 12, 2021",Ostrich Air Jet Robot Car Game apk,com.cgs.us.police.flying.transform.robot.bike.game

My code:

from bs4 import BeautifulSoup
import requests 
import pandas as pd 

df = pd.read_csv('apkmonk.csv')

def googleplay(package):
    url = f"https://play.google.com/store/apps/details?id={package}"
    html_content = requests.get(url).text
    soup = BeautifulSoup(html_content, "lxml")
    title = soup.title.text
    if "Not Found" in title:
        print("not found")
        return False
    else:
        print("found")
        return True
            
for package in df["Package"]:
    if googleplay(package) is True:
        df["Google"] = "True"
    else:
        df["Google"] = "False"

df.to_csv("new.csv", sep=',')

【Comments on the question】:

  • What is your question? In other words, what is not working in the current code?

Tags: pandas csv beautifulsoup


【Solution 1】:

Use Series.apply:

df["Google"] = df["Package"].apply(googleplay)
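
For anyone wondering why the original loop did not work, here is a minimal sketch that needs no network access. It assumes a hypothetical stand-in, `fake_googleplay`, with a hard-coded set of "listed" packages in place of the real `googleplay` function; the package names come from the question's CSV. Assigning a scalar to `df["Google"]` inside the loop fills every row with that value, so only the result for the last package survives, whereas `Series.apply` keeps one result per row.

```python
import pandas as pd

# Two package names taken from the question's CSV.
df = pd.DataFrame({
    "Package": ["com.xjoy.mermaid", "org.game69studio.nobworld"],
})

# Hypothetical stand-in for googleplay(): pretend only this package is listed.
LISTED = {"com.xjoy.mermaid"}

def fake_googleplay(package):
    return package in LISTED

# The question's loop: each assignment broadcasts one scalar to every row,
# so the column ends up holding only the result for the last package.
looped = df.copy()
for package in looped["Package"]:
    looped["Google"] = "True" if fake_googleplay(package) else "False"

# Series.apply calls the function once per element and keeps each result.
applied = df.copy()
applied["Google"] = applied["Package"].apply(fake_googleplay)

print(looped["Google"].tolist())   # ['False', 'False']
print(applied["Google"].tolist())  # [True, False]
```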

If you need the strings 'True' and 'False' instead of booleans:

df["Google"] = df["Package"].apply(googleplay).astype(str)
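
One related detail when saving the result: the question's CSV already has an `Id` column, and `to_csv` writes the DataFrame index as an extra unnamed column unless `index=False` is passed. The sketch below illustrates this under some assumptions: it reads two rows of the question's data from an in-memory string, uses placeholder True/False values instead of real `googleplay` results, and writes to a `StringIO` buffer rather than "new.csv".

```python
import io
import pandas as pd

# Two rows copied from the question's apkmonk.csv, read from memory.
csv_text = '''Id,Genre,LastUpdated,Name,Package
0, Adventure,"Dec 16, 2021",Merge Mermaids-design home&create magic fish life. apk,com.xjoy.mermaid
1, Adventure,"Dec 10, 2021",Nob's World - Super Run Game apk,org.game69studio.nobworld
'''
df = pd.read_csv(io.StringIO(csv_text))

# Placeholder results (an assumption); in the real script these would come
# from df["Package"].apply(googleplay).
df["Google"] = pd.Series([True, False]).astype(str)

# index=False avoids writing the row index as an extra leading column.
out = io.StringIO()
df.to_csv(out, index=False)
header = out.getvalue().splitlines()[0]
print(header)  # Id,Genre,LastUpdated,Name,Package,Google
```

Without `index=False`, the header line would start with an empty field for the index column.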

【Comments】:

  • That's really clever and it works great! Thanks
  • @siad - Great! If my answer helped, please don't forget to accept it. Thanks.