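【Posted】: 2018-03-03 22:18:03
【Question】:
I want to convert my SQL code into a Python (pandas) filter function, but I'm having trouble with it. Any idea how to filter the data according to the SQL condition without looping over the records? Note that the check differs for Desc = 'Bla1': that case tests Value.
If john_doe: keep the records with Hello = 1; else: keep the records with Hello = 0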
SQL
Hello =
    CASE
        WHEN
            (
                Desc = 'Bla1'
                AND Value = 'True'
            )
            OR
            (
                Desc IN ('Bla2', 'Bla3')
                AND Active = 'True'
            )
            AND Enabled = 'True'
        THEN 1
        ELSE 0
    END
Python (with pandas)
def get_it(john_doe, df):
    sentences = {
        'Bla1': 'Value',
        'Bla2': 'Active',
        'Bla3': 'Active'
    }
    if john_doe:
        df = df[HOW TO KEEP ALL RECORDS THAT HAVE Hello = 1?]
    else:
        df = df[HOW TO KEEP ALL RECORDS THAT HAVE Hello = 0?]
    return df
Input DataFrame
id | Desc | Active | Enabled | Value | [A LOT OF OTHER COLUMNS]
1 | Bla2 | 1 | 0 | 1 | [A LOT OF OTHER COLUMNS]
2 | Bla3 | 1 | 1 | 1 | [A LOT OF OTHER COLUMNS]
3 | Bla3 | 1 | 1 | 0 | [A LOT OF OTHER COLUMNS]
4 | Bla4 | 1 | 1 | 1 | [A LOT OF OTHER COLUMNS]
5 | Bla6 | 1 | 1 | 0 | [A LOT OF OTHER COLUMNS]
6 | Bla7 | 0 | 0 | 1 | [A LOT OF OTHER COLUMNS]
7 | Bla1 | 0 | 1 | 1 | [A LOT OF OTHER COLUMNS]
8 | Bla1 | 1 | 1 | 0 | [A LOT OF OTHER COLUMNS]
Required output DataFrame IF john_doe
id | Desc | Active | Enabled | Value | [A LOT OF OTHER COLUMNS]
2 | Bla3 | 1 | 1 | 1 | [A LOT OF OTHER COLUMNS]
3 | Bla3 | 1 | 1 | 0 | [A LOT OF OTHER COLUMNS]
7 | Bla1 | 0 | 1 | 1 | [A LOT OF OTHER COLUMNS]
Required output DataFrame ELSE
id | Desc | Active | Enabled | Value | [A LOT OF OTHER COLUMNS]
1 | Bla2 | 1 | 0 | 1 | [A LOT OF OTHER COLUMNS]
4 | Bla4 | 1 | 1 | 1 | [A LOT OF OTHER COLUMNS]
5 | Bla6 | 1 | 1 | 0 | [A LOT OF OTHER COLUMNS]
6 | Bla7 | 0 | 0 | 1 | [A LOT OF OTHER COLUMNS]
8 | Bla1 | 1 | 1 | 0 | [A LOT OF OTHER COLUMNS]
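One possible way to get both outputs above without looping over records is to build a single boolean mask that mirrors the CASE expression, then keep either the matching rows or their complement. The sketch below is not a definitive answer and rests on two assumptions: the flag columns hold the integers 1/0 shown in the tables (not the string 'True' used in the SQL), and the SQL is read with its standard precedence, where AND binds tighter than OR, so Enabled only constrains the Bla2/Bla3 branch. The helper name `hello_mask` is made up for illustration.

```python
import pandas as pd

# Sample rows copied from the input table in the question (other columns omitted).
df = pd.DataFrame({
    "id":      [1, 2, 3, 4, 5, 6, 7, 8],
    "Desc":    ["Bla2", "Bla3", "Bla3", "Bla4", "Bla6", "Bla7", "Bla1", "Bla1"],
    "Active":  [1, 1, 1, 1, 1, 0, 0, 1],
    "Enabled": [0, 1, 1, 1, 1, 0, 1, 1],
    "Value":   [1, 1, 0, 1, 0, 1, 1, 0],
})


def hello_mask(df):
    """Boolean Series that is True where the CASE expression would yield 1.

    Assumption: flags are stored as 1/0, as in the sample tables.
    """
    # First branch: Desc = 'Bla1' AND Value is set.
    bla1 = (df["Desc"] == "Bla1") & (df["Value"] == 1)
    # Second branch: Desc IN ('Bla2', 'Bla3') AND Active AND Enabled.
    # AND binds tighter than OR in SQL, so Enabled belongs to this branch only.
    bla23 = (
        df["Desc"].isin(["Bla2", "Bla3"])
        & (df["Active"] == 1)
        & (df["Enabled"] == 1)
    )
    return bla1 | bla23


def get_it(john_doe, df):
    mask = hello_mask(df)
    # ~mask negates the whole condition, i.e. the rows where Hello = 0.
    return df[mask] if john_doe else df[~mask]


print(get_it(True, df)["id"].tolist())
print(get_it(False, df)["id"].tolist())
```

If the intended SQL actually requires Enabled for the Bla1 case as well, the `& (df["Enabled"] == 1)` term would move outside the `|`; the sample data cannot distinguish the two readings because no Bla1 row has Value = 1 and Enabled = 0.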
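【Comments】:
-
This question is confusing. Can you attach a sample of your df? Are you trying to mimic the CASE statement, or do you just need to know what to put in the if statement?
-
I want to filter my df in Python/pandas according to the SQL condition. In the if branch I want to keep all records that satisfy the SQL condition (THEN 1). In the else branch I want to keep all records that do not satisfy it (ELSE 0).
-
The sentences dict contains all the SQL cases: Bla1 checks the Value field, and the other two check the Active field.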
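The mapping described in the last comment could also drive the filter directly, so that adding a new Desc case only means adding a dict entry. The following is a rough sketch under the same assumptions as the tables in the question (flags stored as 1/0); `mask_from_dict` is a hypothetical helper name, and the special handling that exempts Bla1 from the Enabled check is an interpretation of the SQL's AND/OR precedence, not something the question states outright. It loops over the handful of distinct column names, not over records.

```python
import pandas as pd

# Sample rows copied from the input table in the question (other columns omitted).
df = pd.DataFrame({
    "id":      [1, 2, 3, 4, 5, 6, 7, 8],
    "Desc":    ["Bla2", "Bla3", "Bla3", "Bla4", "Bla6", "Bla7", "Bla1", "Bla1"],
    "Active":  [1, 1, 1, 1, 1, 0, 0, 1],
    "Enabled": [0, 1, 1, 1, 1, 0, 1, 1],
    "Value":   [1, 1, 0, 1, 0, 1, 1, 0],
})

# The dict from the question: which column each Desc value has to check.
sentences = {
    "Bla1": "Value",
    "Bla2": "Active",
    "Bla3": "Active",
}


def mask_from_dict(df, sentences):
    # Per row, the name of the column to check; NaN when Desc is not in the dict.
    col_to_check = df["Desc"].map(sentences)

    # True where the row's designated column equals 1.
    checked = pd.Series(False, index=df.index)
    for col in set(sentences.values()):
        checked = checked | ((col_to_check == col) & (df[col] == 1))

    # Assumption: per SQL precedence, only the non-Bla1 cases require Enabled.
    enabled_ok = (df["Desc"] == "Bla1") | (df["Enabled"] == 1)
    return checked & enabled_ok


mask = mask_from_dict(df, sentences)
print(df[mask]["id"].tolist())   # rows with Hello = 1
print(df[~mask]["id"].tolist())  # rows with Hello = 0
```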
Tags: python sql pandas dataframe filter