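【Question title】: Pandas dataset filter (search) [closed]
【Posted】: 2019-01-31 20:16:53
【Question】:

I'm working with raw data extracted from Sage accounting, basically a bunch of invoices with their details. My question: how can I filter my CSV file against a list of invoice numbers (ideally producing a new file) and drop the invoices that are not listed? Here is my CSV file: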

CARRIER,DEVISION,WEIGHT,CLIENT,DATE,ITEMS,PRODUCT,VOLUME,NUMBER OF PACKAGES,COMMAND NUMBER,INVOICE NUMBER,CLIENT ADDRESS,ZIP CODE
UPS,DEV PARIS,0,MIROR SABI ,18/01/19,1,EXONERATION TVA ART.262 TER I CGI,0,0,CN1010090,IN1008889,VIA PO 13,20031
UPS,DEV PARIS,0,MIROR SABI ,18/01/19,1,FRAIS DE TRANSPORT / PORT AVANCE,0,0,CN1010090,IN1008889,VIA PO 13,20031
UPS,DEV PARIS,9,MIROR SABI ,18/01/19,1,MIROR SABI  56x51 VIOLET ET VERT,"0,02",1,CN1010090,IN1008889,VIA PO 13,20031
FEDEX,DEV SHANGHAI,0,CONGRES,25/01/19,1,FRAIS DE TRANSPORT/ PORT AVANCE,0,0,CN1008735,IN1008984,15 LOT DU STILETTO,20090
FEDEX,DEV SHANGHAI,17,CONGRES,25/01/19,1,ALOX BOUT DE CANAPE 65X46,"0,25",1,CN1008735,IN1008984,15 LOT DU STILETTO,20090
FEDEX,DEV SHANGHAI,33,CONGRES,25/01/19,1,ALOX TABLE BASSE 110X36,"0,53",1,CN1008735,IN1008984,15 LOT DU STILETTO,20090
DHL,DEV ATLANTA,0,EDWARDS,26/01/19,1,FRAIS D'EMBALLAGE,0,0,CN1010248,IN1009120,DEV ATLANTA,TX 77063
DHL,DEV ATLANTA,0,EDWARDS,27/01/19,1,FRAIS DE TRANSPORT/ PORT AVANCE,0,0,CN1010248,IN1009120,DEV ATLANTA,TX 77063
DHL,DEV ATLANTA,0,EDWARDS,28/01/19,1,MARCHANDISES DESTINEES A,0,0,CN1010248,IN1009120,DEV ATLANTA,TX 77063
DHL,DEV ATLANTA,0,SHOFFNER,29/01/19,1,FRAIS D'EMBALLAGE,0,0,CN1009294,IN1009119,DEV ATLANTA,TX 77063
DHL,DEV ATLANTA,0,SHOFFNER,30/01/19,1,FRAIS DE TRANSPORT/ PORT AVANCE,0,0,CN1009294,IN1009119,DEV ATLANTA,TX 77063
DHL,DEV ATLANTA,0,SHOFFNER,31/01/19,1,MARCHANDISES DESTINEES A,0,0,CN1009294,IN1009119,DEV ATLANTA,TX 77063
DHL,DEV ATLANTA,25,SHOFFNER,01/02/19,1,"Sceptre 32"" Class HD (720P) LED TV","0,09",1,CN1009294,IN1009119,DEV ATLANTA,TX 77063
DHL,DEV ATLANTA,134,EDWARDS,02/02/19,1,VIRAX TABLE REPAS 200XH74X100,"0,59",2,CN1010248,IN1009120,DEV ATLANTA,TX 77063
FEDEX,DEV MIAMI,0,ALBERTINI GERARD 100106169,25/01/19,1,FRAIS DE TRANSPORT/ PORT AVANCE,0,0,CN1010207,IN1009046,TRANSIT EXPRESS,20620
FEDEX,DEV MIAMI,0,SANTOS MARC 100106157,11/01/19,1,FRAIS DE TRANSPORT/ PORT AVANCE,0,0,CN1010049,IN1008870,TRANSIT EXPRESS,20620
FEDEX,DEV MIAMI,28,SANTOS MARC 100106158,11/01/19,2,IRON TREE TABLE BASSE 70XH26 FIL INOX,"0,32",2,CN1010049,IN1008870,TRANSIT EXPRESS,20620
FEDEX,DEV MIAMI,79,ALBERTINI HELENE 100106169,25/01/19,1,TRAME TABLE BASSE 140X85 CARRARE ET MIROIR OR,"0,58",2,CN1010207,IN1009046,TRANSIT EXPRESS,20620
TNT,DEV BERLIN,0,GEEVE EDDY 102002796PS#2796,26/01/19,1,EXONERATION TVA ART.262 TER I CGI,0,0,CN1010210,IN1009098,INTERIOR HILLS,85609

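To explain: at the end of each week I have to send each carrier (DHL, FEDEX, TNT, etc.) an Excel sheet with all the details, based on a list of invoice numbers taken from the CSV above.

My attempt: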

    import pandas as pd

    df = pd.read_csv("invo.csv", encoding="latin")
    ready_to_ship = ["IN1008889", "IN1009120", "IN1009098"]
    df.filter(ready_to_ship)

    # Expected: df reduced to the rows whose INVOICE NUMBER
    # appears in the ready_to_ship list
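
For reference, `DataFrame.filter` selects by index or column labels rather than by cell values, so passing invoice numbers to it matches no columns. Below is a minimal sketch of value-based row filtering with `Series.isin`. It assumes the column names from the CSV above and uses a reduced inline excerpt of the data so that it runs on its own.

```python
import io

import pandas as pd

# Reduced inline excerpt of the question's CSV (three columns only).
csv_text = """CARRIER,CLIENT,INVOICE NUMBER
UPS,MIROR SABI,IN1008889
FEDEX,CONGRES,IN1008984
DHL,EDWARDS,IN1009120
DHL,SHOFFNER,IN1009119
TNT,GEEVE EDDY,IN1009098
"""
df = pd.read_csv(io.StringIO(csv_text))
ready_to_ship = ["IN1008889", "IN1009120", "IN1009098"]

# filter(items=...) matches column labels, so no column is selected here.
by_label = df.filter(items=ready_to_ship)

# A boolean mask on the column's values keeps only the listed invoices.
filtered = df[df["INVOICE NUMBER"].isin(ready_to_ship)]

# With no path argument, to_csv returns the CSV text as a string;
# pass a file name instead to write a new file.
csv_out = filtered.to_csv(index=False)
print(csv_out)
```

Negating the mask with `~` would give the complementary set, that is, the invoices that are not yet ready to ship.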

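【Question comments】:

  • What have you tried so far?
  • Welcome to Stack Overflow. You need to provide more detail when posting a question: what have you tried? Show us the code you currently have, and be sure to research your problem before asking.
  • Possible duplicate of Filter Pandas Data frame

Tags: python pandas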


【Solution 1】:

We need to see what you have done. You can't just ask others to do all your work for you.
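
For readers with the same task, one possible approach is to keep the rows whose invoice number is listed and then split them by carrier. This is a sketch under assumptions: the column names come from the CSV in the question, the inline data is a reduced excerpt, and the `<carrier>_invoices.xlsx` file naming is hypothetical.

```python
import io

import pandas as pd

# Reduced inline excerpt of the question's CSV.
csv_text = """CARRIER,CLIENT,PRODUCT,VOLUME,INVOICE NUMBER
UPS,MIROR SABI,MIROR SABI 56x51 VIOLET ET VERT,"0,02",IN1008889
FEDEX,CONGRES,ALOX TABLE BASSE 110X36,"0,53",IN1008984
DHL,EDWARDS,VIRAX TABLE REPAS 200XH74X100,"0,59",IN1009120
DHL,SHOFFNER,FRAIS D'EMBALLAGE,0,IN1009119
TNT,GEEVE EDDY,EXONERATION TVA ART.262 TER I CGI,0,IN1009098
"""
# decimal="," parses values such as "0,02" as floats.
df = pd.read_csv(io.StringIO(csv_text), decimal=",")
ready_to_ship = ["IN1008889", "IN1009120", "IN1009098"]

# Keep only the invoices that are ready to ship.
ready = df[df["INVOICE NUMBER"].isin(ready_to_ship)]

# One DataFrame per carrier, keyed by carrier name.
per_carrier = {carrier: rows for carrier, rows in ready.groupby("CARRIER")}

for carrier, rows in per_carrier.items():
    print(carrier, len(rows))
    # To write one sheet per carrier (needs an Excel engine such as openpyxl):
    # rows.to_excel(f"{carrier}_invoices.xlsx", index=False)
```

Grouping after filtering keeps each carrier's file limited to the listed invoices. Carriers with no listed invoice (FEDEX in this excerpt) produce no group.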

【Discussion】:
