Psycopg2: how to insert and update on conflict using psycopg2 with Python? Answers

【Question title】: Psycopg2: how to insert and update on conflict using psycopg2 with python?
【Posted】: 2021-02-08 15:35:53
【Question】:

I am using psycopg2 to insert rows into a Postgres database, and when there is a conflict I just want to update the other column values.

Here is the query:

        insert_sql = '''
        INSERT INTO tablename (col1, col2, col3,col4)
        VALUES (%s, %s, %s, %s) (val1,val2,val3,val4)
        ON CONFLICT (col1)
        DO UPDATE SET
        (col2, col3, col4)
        = (val2, val3, val4) ; '''

        cur.excecute(insert_sql)

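What am I doing wrong? Note that val1, val2, val3 are variables, not literal values.

【Comments】:

Tags: postgresql psycopg2

【Solution 1】:

Quoting from psycopg2's documentation:

Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.

Now, for an upsert operation you can do this: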

insert_sql = '''
    INSERT INTO tablename (col1, col2, col3, col4)
    VALUES (%s, %s, %s, %s)
    ON CONFLICT (col1) DO UPDATE SET
    (col2, col3, col4) = (EXCLUDED.col2, EXCLUDED.col3, EXCLUDED.col4);
'''
cur.execute(insert_sql, (val1, val2, val3, val4))

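Notice that the parameters for the query are passed as a tuple to the execute call (this ensures psycopg2 takes care of adapting them to SQL while shielding you from injection attacks).

The EXCLUDED part lets you reuse the values of the row proposed for insertion, so you do not have to specify them twice in the data parameter.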
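To see the EXCLUDED behaviour end to end without a PostgreSQL server, the sketch below runs the same kind of upsert against an in-memory SQLite database through Python's built-in sqlite3 module. This is a stand-in rather than psycopg2: SQLite uses ? placeholders where psycopg2 uses %s, it needs SQLite 3.24 or newer for ON CONFLICT ... DO UPDATE, and the table definition is made up for illustration.

```python
import sqlite3

# In-memory database; the schema is a hypothetical stand-in for "tablename".
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE tablename ("
    "col1 INTEGER PRIMARY KEY, col2 TEXT, col3 TEXT, col4 TEXT)"
)

# Same shape as the psycopg2 query above, with ? instead of %s.
upsert_sql = """
    INSERT INTO tablename (col1, col2, col3, col4)
    VALUES (?, ?, ?, ?)
    ON CONFLICT (col1) DO UPDATE SET
        col2 = excluded.col2,
        col3 = excluded.col3,
        col4 = excluded.col4
"""

# First call inserts; second call hits the conflict on col1 and updates.
cur.execute(upsert_sql, (1, "a", "b", "c"))
cur.execute(upsert_sql, (1, "x", "y", "z"))
conn.commit()

rows = cur.execute(
    "SELECT col1, col2, col3, col4 FROM tablename"
).fetchall()
print(rows)
```

After the second call the table still holds a single row for col1 = 1, now carrying the values from the second tuple.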

【Comments】:

Very helpful! Small note: it should be (col2, col3, col4) = (EXCLUDED.col2, EXCLUDED.col3, EXCLUDED.col4);

【Solution 2】:

Using:

INSERT INTO members (member_id, customer_id, subscribed, customer_member_id, phone, cust_atts) VALUES (%s, %s, %s, %s, %s, %s) ON CONFLICT (customer_member_id) DO UPDATE SET (phone) = (EXCLUDED.phone);

I got the following error:

psycopg2.errors.FeatureNotSupported: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression
LINE 1: ...ICT (customer_member_id) DO UPDATE SET (phone) = (EXCLUDED.p...

Changing it to:

INSERT INTO members (member_id, customer_id, subscribed, customer_member_id, phone, cust_atts) VALUES (%s, %s, %s, %s, %s, %s) ON CONFLICT (customer_member_id) DO UPDATE SET (phone) = ROW(EXCLUDED.phone);

solved the problem.
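Another way around this error is to skip the parenthesized column list and write plain column = EXCLUDED.column assignments, which PostgreSQL accepts for one column or many. A minimal sketch of a helper that produces such a SET list follows; the name excluded_assignments is hypothetical, and the column names are interpolated as-is, so they must come from trusted code rather than user input.

```python
def excluded_assignments(columns):
    """Return 'col = EXCLUDED.col' pairs for a DO UPDATE SET clause.

    Column names are inserted unquoted, so pass only trusted identifiers.
    """
    return ", ".join("{0} = EXCLUDED.{0}".format(col) for col in columns)


single = excluded_assignments(["phone"])
several = excluded_assignments(["phone", "cust_atts"])

# The result is appended after "ON CONFLICT (...) DO UPDATE SET ".
print(single)
print(several)
```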

【Comments】:

【Solution 3】:

Try:

INSERT INTO tablename (col1, col2, col3, col4)
VALUES (val1, val2, val3, val4)
ON CONFLICT (col1)
DO UPDATE SET
(col2, col3, col4)
= (val2, val3, val4);

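【Comments】:

It gives the error that column val1 does not exist. I don't know why it treats val1 as a column.
You need to supply literals or constants, such as 10 or 'abc', for val1, val2, etc.
I am using variables.
Should I put {},{} in place of val1, val2 and then call cur.execute(insert_sql.format(val1, val2))? Is that right?
Can you help?

【Solution 4】: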

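The function below takes a DataFrame (df), the table's schema name, the table name, the column to use as the conflict target (conflictcolumn), and an engine created with SQLAlchemy's create_engine. It upserts the table's rows based on the conflict column. It extends @Ionut Ticus's solution.

Do not combine it with pandas.to_sql(): pandas.to_sql() destroys the primary key setting, so the primary key then has to be set again with an ALTER query, which is what the function below suggests. pandas does not necessarily destroy the primary key; it may simply never have been set. In that case the error is: "there is no unique constraint matching given keys for referenced table". The function suggests executing the following:

engine.execute('ALTER TABLE {schemaname}.{tablename} ADD PRIMARY KEY ({conflictcolumn});')

Function: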

def update_query(df, schemaname, tablename, conflictcolumn, engine):
    """
    Upsert the rows of a DataFrame into schemaname.tablename.

    df: DataFrame whose columns match the table's columns.
    schemaname, tablename: for weatherofcities.sanfrancisco,
        schemaname = weatherofcities and tablename = sanfrancisco.
    conflictcolumn: column used as the conflict target; when a row with the
        same value already exists, only its other columns are changed.
    engine: database engine, for example
        create_engine('postgresql://user:password@host/dbname')
    """
    columns = df.columns.tolist()

    # Every column except the conflict column is overwritten on conflict.
    update_columns = columns.copy()
    update_columns.remove(conflictcolumn)

    # One %s placeholder per column; the values are bound by cursor.execute.
    replacestring = ','.join(['%s'] * len(columns))
    excluded = ','.join("EXCLUDED.{}".format(column) for column in update_columns)

    # Identifiers are interpolated with str.format, so they must come from
    # trusted code, never from user input.
    insert_sql = """ INSERT INTO {schemaname}.{tablename} ({allcolumns})
    VALUES ({replacestring})
    ON CONFLICT ({conflictcolumn}) DO UPDATE SET
    ({deleteprimary}) = ({excluded})""".format(
        tablename=tablename, schemaname=schemaname,
        allcolumns=','.join(columns), replacestring=replacestring,
        conflictcolumn=conflictcolumn,
        deleteprimary=','.join(update_columns), excluded=excluded)

    conn = engine.raw_connection()
    conn.autocommit = True
    cursor = conn.cursor()

    print("------------------------" * 5)
    print("If below error happens:")
    print("there is no unique constraint matching given keys for referenced table?")
    print("Primary key is not set,you can execute:")
    print("engine.execute('ALTER TABLE {}.{} ADD PRIMARY KEY ({});')".format(schemaname, tablename, conflictcolumn))
    print("------------------------" * 5)

    for i, (index, row) in enumerate(df.iterrows()):
        cursor.execute(insert_sql, tuple(row.values))
        conn.commit()
        if i == 0:
            # Show the statement once, for inspection only.
            print("Order of Columns in Operated SQL Query for Rows")
            print(insert_sql % tuple(columns))
            print("----")
            print("Example of Operated SQL Query for Rows")
            print(insert_sql % tuple(row.values))
            print("---")

    conn.close()

【Comments】:

Please do not recommend building dynamic SQL with Python string interpolation like this. See Parameters and Dynamic SQL for the reasons.
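As a rough illustration of the concern in this comment, the sketch below builds the upsert statement with every identifier double-quoted and leaves the values as %s placeholders for cursor.execute to bind. psycopg2 ships a psycopg2.sql module (sql.SQL, sql.Identifier) for composing dynamic statements; the hand-rolled quote_ident here is only a stand-in so the snippet runs without a database driver, and both function names are hypothetical.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, doubling embedded double quotes."""
    if not name or "\x00" in name:
        raise ValueError("invalid identifier: {!r}".format(name))
    return '"' + name.replace('"', '""') + '"'


def build_upsert_sql(schemaname, tablename, columns, conflictcolumn):
    """Build an INSERT ... ON CONFLICT statement with %s value placeholders."""
    if conflictcolumn not in columns:
        raise ValueError("conflict column must be one of the columns")
    update_columns = [c for c in columns if c != conflictcolumn]
    if not update_columns:
        raise ValueError("need at least one column to update")

    column_list = ", ".join(quote_ident(c) for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    # Plain assignments work for one column or many.
    assignments = ", ".join(
        "{0} = EXCLUDED.{0}".format(quote_ident(c)) for c in update_columns
    )
    return (
        "INSERT INTO {}.{} ({}) VALUES ({}) "
        "ON CONFLICT ({}) DO UPDATE SET {}"
    ).format(
        quote_ident(schemaname), quote_ident(tablename),
        column_list, placeholders,
        quote_ident(conflictcolumn), assignments,
    )


statement = build_upsert_sql(
    "weatherofcities", "sanfrancisco", ["day", "temp"], "day"
)
print(statement)
```

The resulting string would be passed to cursor.execute together with a tuple of values, as in Solution 1. Note that quoted identifiers are case-sensitive in PostgreSQL, so the names must match the table definition exactly.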