[Posted]: 2015-12-04 01:24:32
[Question]:
I have the following in a pandas DataFrame, in the columns Message_A and Message_B:
Message_A
"(Live Storage: 20.00 included in Plan for $15.00 - Exceess of 10.0 @ $6.0)"
"(Live Storage: 5.00 included in Plan for $5.00 - Exceess of 11.0 @ $40.0)"
"(Live Storage: 10.0 out of 150.00 included in Plan for $10.00)"
"(Live Storage: 146.0 out of 200.00 included in Plan for $150.00)"
"(Live Storage: 150.0 - Tier 1501 to 2000 @ $350)"
"(PY Solution -Flat Fee- of $30.00 applied)"
"(Live Storage: 17.0 out of 40.00 included in Plan for $20.00)"
"(Live Storage: 67.0 @ $5.00)"
"(Live Storage: 5.00 included in Plan for $55.00 - Exceess of 13.0 @ $6.0)"
"(Live Storage: 741.0 @ $3.00)"
"(Live Storage: 30.00 included in Plan for $150.00 - Exceess of 39.0 @ $6.0)"
"(Live Storage: 65.0 - Tier 51 to 75 @ $250)"
"(Live Storage: 567.0 - Tier 501 to 750 @ $1750)"
Message_B
"(! Price for Live Storage not found in Pricing Plan !)"
"(! Price for Live Storage not found in Pricing Plan !) ( ABC Storage: 141.0 @ $2.00) (Discount of 10.0% applied to storage amount)"
"(! Price for Live Storage not found in Pricing Plan !)"
"(! Price for Live Storage not found in Pricing Plan !) ( ABC Storage: 1.0 @ $3.00)"
"( ABC Storage: 137.0 - Tier 1251 to 150 @ $100) (! ABC Storage Limit of 00 Exceeded !) (Local Allocated Storage: 20.00 @ $0.40) (Live Storage: 16.0 @ $??)"
"(Discount of 10.0% applied to storage amount) (! Price for Live Storage not found in Pricing Plan !)"
"(! Live Storage not found in Pricing Plan !) (Discount of 10.0% applied to storage amount)"
"(! Price for Live Storage not found in Pricing Plan !) (Local Allocated Storage: 100.00 @ $0.50)"
"(! Price for Storage not found in Pricing Plan !) (Live Storage: 18.0 @ $??)"
"(! Price for Storage not found in Pricing Plan !)(Live Storage: 69.0 @ $??) ( ABC Storage: 401.0 @ $1.50)"
"(Live Storage: 6.0 @ $??) (! Price for Storage not found in Pricing Plan !)"
"(! Price for Live Storage not found in Pricing Plan !) (Discount of 10.0% applied to storage amount)"
"(! Price for Live Storage not found in Pricing Plan !) ( ABC Storage: 270.0 - Tier 201 to 300 @ $400)"
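I want to remove the error messages from Message_B. The text of some of these messages varies, but every error message contains either '!' or '$??'. I then want to join the result with Message_A to get a single column of messages. For clarity, the intermediate step looks like this: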
Message_B
Nan
"( ABC Storage: 141.0 @ $2.00) (Discount of 10.0% applied to storage amount)"
Nan
"( ABC Storage: 1.0 @ $3.00)"
"( ABC Storage: 137.0 - Tier 1251 to 150 @ $100)(Local Allocated Storage: 20.00 @ $0.40)"
"(Discount of 10.0% applied to storage amount)"
"(Discount of 10.0% applied to storage amount)"
"(Local Allocated Storage: 100.00 @ $0.50)"
Nan
"( ABC Storage: 401.0 @ $1.50)"
Nan
"(Discount of 10.0% applied to storage amount)"
"( ABC Storage: 270.0 - Tier 201 to 300 @ $400)"
The final result is just a single column of strings (with the NaN values dropped).
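The whole transformation could be sketched roughly as follows. This is only an illustration under a few assumptions: each message is a single "(...)" segment with no nested parentheses, any segment containing '!' or '$??' counts as an error, and "join" means stacking the two columns into one. The names `df`, `SEGMENT`, and `strip_errors`, and the two sample rows, are made up for the example.

```python
import re

import numpy as np
import pandas as pd

# Assumption: each message is one "(...)" segment with no nested parentheses.
SEGMENT = re.compile(r"\([^()]*\)")


def strip_errors(text):
    """Keep only the segments that contain neither '!' nor '$??'."""
    if not isinstance(text, str):
        return np.nan
    kept = [s for s in SEGMENT.findall(text) if "!" not in s and "$??" not in s]
    # A row made up entirely of error segments becomes NaN.
    return " ".join(kept) if kept else np.nan


# Hypothetical two-row sample built from messages shown above.
df = pd.DataFrame({
    "Message_A": [
        "(Live Storage: 67.0 @ $5.00)",
        "(PY Solution -Flat Fee- of $30.00 applied)",
    ],
    "Message_B": [
        "(! Price for Live Storage not found in Pricing Plan !) ( ABC Storage: 1.0 @ $3.00)",
        "(! Price for Storage not found in Pricing Plan !) (Live Storage: 18.0 @ $??)",
    ],
})

cleaned_b = df["Message_B"].map(strip_errors)

# Stack both columns into one Series and drop the rows that became NaN.
messages = pd.concat([df["Message_A"], cleaned_b], ignore_index=True).dropna()
print(messages.tolist())
```

Filtering whole segments, rather than listing each error message, means the changing numbers inside the messages do not matter.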
I have been able to split Message_B by removing '(' and replacing ')' with '|', which gives me a delimiter to split on.
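For reference, the split described here might look something like this in plain Python. The helper name `split_segments` is hypothetical, and the sketch assumes '|' never occurs inside a message.

```python
def split_segments(text, sep="|"):
    """Split "(a) (b)" into ["a", "b"] by turning ")" into a delimiter."""
    marked = text.replace("(", "").replace(")", sep)
    # Discard the empty piece left after the final delimiter.
    return [part.strip() for part in marked.split(sep) if part.strip()]


row = ("(! Price for Storage not found in Pricing Plan !)"
       "(Live Storage: 69.0 @ $??) ( ABC Storage: 401.0 @ $1.50)")

parts = split_segments(row)

# Drop the error segments, then restore the parentheses.
kept = ["(" + p + ")" for p in parts if "!" not in p and "$??" not in p]
print(kept)
```

Once a row is a list of segments, the unwanted ones can be filtered out without dropping the row itself.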
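I have split Message_B into a (new) separate DataFrame, but how do I go through the full DF and remove the unwanted messages? (I don't want to drop entire rows.)
I tried df[df['Message_B'].str.contains("(Live Storage: 18.0 @ $??)")==False], but I would need to do this for every type of message, and the numbers in the messages change.
Also, I now realize that I can't use .str.contains on the full DF.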
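Two details seem relevant to that attempt. `.str` is an accessor on a Series, so on a DataFrame it has to be applied column by column; and the search text contains `(`, `)`, `$` and `?`, which are regex metacharacters, so a literal search needs `regex=False` (or an escaped pattern). A minimal sketch, with a made-up `df` and `needle`:

```python
import pandas as pd

# Hypothetical sample; the real frame is not shown in full.
df = pd.DataFrame({
    "Message_A": ["(Live Storage: 67.0 @ $5.00)", "(Live Storage: 741.0 @ $3.00)"],
    "Message_B": ["(Live Storage: 18.0 @ $??)", "( ABC Storage: 1.0 @ $3.00)"],
})

needle = "(Live Storage: 18.0 @ $??)"

# One column: regex=False treats the needle as a literal substring.
mask_b = df["Message_B"].str.contains(needle, regex=False)

# Whole frame: apply the same Series method to each column in turn.
mask_all = df.apply(lambda col: col.str.contains(needle, regex=False))

print(mask_b.tolist())
print(mask_all.any(axis=1).tolist())
```

This only locates rows containing one exact message; it does not handle the changing numbers.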
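Any help would be much appreciated. Sorry about the way I have laid out the DF in this post; I found it the easiest to read. Thanks
EDIT: I have been able to remove the standard error message with the following: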
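# Literal text of the fixed error message (capitalization must match the data).
error_msg1 = "(! Price for Live Storage not found in Pricing Plan !)"
replace_with = ''
# Remove the message from every row of Message_B.
bumi_output['Message_B'] = [i.replace(error_msg1, replace_with) for i in bumi_output['Message_B']]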
Is there a way to use this approach to remove error messages where part of the message can change? For example: (Live Storage: 18.0 @ $??) (Live Storage: 69.0 @ $??)
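One possible way to handle the changing part is a regular expression in which only the number is variable. The following is a sketch, not a tested solution for the full data set; the name `UNPRICED` and the sample `rows` are illustrative.

```python
import re

# \( \) \$ \? are escaped so they match literally; [\d.]+ matches the number.
UNPRICED = re.compile(r"\s*\(Live Storage: [\d.]+ @ \$\?\?\)")

rows = [
    "(Live Storage: 18.0 @ $??)",
    "(Live Storage: 69.0 @ $??) ( ABC Storage: 401.0 @ $1.50)",
    "(Live Storage: 67.0 @ $5.00)",
]

# Remove every unpriced "Live Storage" segment, whatever its number.
cleaned = [UNPRICED.sub("", r).strip() for r in rows]
print(cleaned)
```

On a pandas column, the same pattern string could presumably be passed to `Series.str.replace(pattern, "", regex=True)` instead of the list comprehension.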
Thanks.
[Comments]:
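- You don't need to include snippets; you can turn content into a code block by indenting each line (4 spaces), or, while you are writing, by selecting the block and clicking the {} symbol in the editor.
- Thanks @DilithiumMatrix, I did something similar using {}, but I didn't have the rows on separate lines. I had ',' delimiters, which made it look long and hard to read. I haven't seen much interest, so I may be out of luck with this one.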
Tags: python regex string pandas dataframe