Using variables in a PL/pgSQL function

【Question title】: Using variables in a PL/pgSQL function
【Posted】: 2017-05-13 01:43:41
【Question】:

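For any SQL command that does not return rows, for example INSERT without a RETURNING clause, you can execute the command within a PL/pgSQL function just by writing the command.

Any PL/pgSQL variable name appearing in the command text is treated as a parameter, and then the current value of the variable is provided as the parameter value at run time.

But when I use variable names in my query, I get an error: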

ERROR:  syntax error at or near "email"
LINE 16: ...d,email,password) values(identity_id,current_ts,''email'',''...

Here is my function:

```
CREATE OR REPLACE FUNCTION app.create_identity(email varchar,passwd varchar)
RETURNS integer as $$
DECLARE
    current_ts          integer;
    new_identity_id     integer;
    int_max             integer;
    int_min             integer;
BEGIN
    SELECT extract(epoch FROM now())::integer INTO current_ts;
    int_min:=-2147483648;
    int_max:= 2147483647;
    LOOP
        BEGIN
            SELECT floor(int_min + (int_max - int_min + 1) * random()) INTO new_identity_id;
            IF new_identity_id != 0 THEN
                INSERT into app.identity(identity_id,date_inserted,email,password) values(identity_id,current_ts,''email'',''passwd'');
                RETURN new_identity_id;
            END IF;
        EXCEPTION
            WHEN unique_violation THEN
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

Why does Postgres throw an error when I use the variables in the query? How should this be written?

【Comments】:

Tags: postgresql parameter-passing naming-conventions plpgsql quotes

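【Answer 1】:

You cannot put parameter names in single quotes (''email''), and you cannot use the parameter email "as is", because it has the same name as a column in the table. This kind of name clash is one of the reasons it is strongly recommended not to use variables or parameters that share a name with a column in one of your tables. You have three options to deal with this:

1. Rename the variable. A common naming convention is to prefix parameters with p_, e.g. p_email, and then use the unambiguous names in the INSERT:
```
-- new_identity_id is the local variable that holds the generated id
INSERT INTO app.identity(identity_id,date_inserted,email,password)
VALUES (new_identity_id,current_ts,p_email,p_password);
```

2. Use $1 for the first parameter and $2 for the second:
```
INSERT INTO app.identity(identity_id,date_inserted,email,password)
VALUES (new_identity_id,current_ts,$1,$2);
```

3. Qualify the parameter name with the function name:
```
INSERT INTO app.identity(identity_id,date_inserted,email,password)
VALUES (new_identity_id,current_ts,create_identity.email,create_identity.passwd);
```

I strongly recommend option 1.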
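
As a rough sketch only, this is how option 1 might look when applied to the whole function from the question. It assumes that the schema app and the table app.identity exist and that identity_id carries a unique constraint. The parameter names p_email and p_password are illustrative renames, and the random-number expression is swapped for bigint arithmetic (not part of this answer) so that the sketch does not fail with an out-of-range error.

```sql
CREATE OR REPLACE FUNCTION app.create_identity(p_email varchar, p_password varchar)
RETURNS integer AS $$
DECLARE
    current_ts      integer := extract(epoch FROM now())::integer;
    new_identity_id integer;
BEGIN
    LOOP
        BEGIN
            -- bigint arithmetic keeps the intermediate result in range
            new_identity_id := (bigint '-2147483648'
                                + trunc(bigint '4294967296' * random())::bigint)::integer;
            IF new_identity_id <> 0 THEN
                -- unquoted names: PL/pgSQL passes the current values as parameters
                INSERT INTO app.identity (identity_id, date_inserted, email, password)
                VALUES (new_identity_id, current_ts, p_email, p_password);
                RETURN new_identity_id;
            END IF;
        EXCEPTION
            WHEN unique_violation THEN
                NULL;  -- id already taken: fall through and retry
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

A call would then look like `SELECT app.create_identity('alice@example.com', 'secret');`, with made-up argument values.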

Unrelated, but: you don't need a SELECT statement to assign a value to a variable if you are not retrieving that value from a table.

```
SELECT extract(epoch FROM now())::integer INTO current_ts;
```

can be simplified to:

```
current_ts := extract(epoch FROM now())::integer;
```

and

```
SELECT floor(int_min + (int_max - int_min + 1) * random()) INTO new_identity_id;
```

to

```
new_identity_id := floor(int_min + (int_max - int_min + 1) * random());
```

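【Comments】:

It doesn't work!! It inserted the word "email" into the table, but I want the value of the variable email.
To make this easier for future readers, the first paragraph should probably state that what is actually needed is *no* quotes. If the parameter were p_email, you would not write it as 'p_email'. This is not a problem of dollar quoting and escaping, just a basic misunderstanding of how variables are interpreted.
@IMSoP: Actually it's both, because the OP *also* misunderstood dollar quoting. And there are a few more misconceptions ...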
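
To illustrate the two points raised in these comments, here is a small hedged sketch. The function name quote_demo and its column names are made up for the demonstration. Inside a dollar-quoted body, an unquoted parameter name stands for its value, a single-quoted word is an ordinary string literal, and quotes are doubled only to place an apostrophe inside a literal. That is also why the question's `''email''` fails: within `$$ ... $$` it reads as an empty string `''`, followed by the bare identifier email, followed by another `''`, which matches the reported syntax error at or near "email".

```sql
-- Hypothetical demo function: p_email is a parameter, 'p_email' is a string literal.
CREATE OR REPLACE FUNCTION quote_demo(p_email text)
RETURNS TABLE (as_variable text, as_literal text, with_apostrophe text) AS
$$
BEGIN
    RETURN QUERY
    SELECT p_email,            -- unquoted: the parameter's current value
           'p_email'::text,    -- quoted: just the characters p_email
           'it''s'::text;      -- doubled quote = one apostrophe inside a literal
END
$$ LANGUAGE plpgsql;

SELECT * FROM quote_demo('alice@example.com');
```

The query should return one row in which only the first column contains the address that was passed in.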

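【Answer 2】:

@a_horse answers your actual question and clarifies the quoting issue and the naming conflict.

About quoting:

Insert text with single quotes in PostgreSQL

About naming conflicts (the behavior of plpgsql changed slightly over time):

Better solution

I suggest a completely different approach to begin with: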

```
CREATE OR REPLACE FUNCTION app.create_identity(_email text, _passwd text
                                             , OUT new_identity_id int) AS
$func$
DECLARE
   _current_ts int := extract(epoch FROM now());
BEGIN
   LOOP
      --+ Generate completely random int4 numbers +-----------------------------
      -- integer (= int4) in Postgres is a signed integer occupying 4 bytes   --
      -- int4 ranges from -2147483648 to +2147483647, i.e. -2^31 to 2^31 - 1  --
      -- Multiply bigint 4294967296 (= 2^32) with random() (0.0 <= x < 1.0)   --
      --   trunc() the resulting (positive!) float8 - cheaper than floor()    --
      --   add result to -2147483648 and cast the next result back to int4    --
      -- The result fits the int4 range *exactly*                             --
      --------------------------------------------------------------------------
      INSERT INTO app.identity
            (identity_id, date_inserted,  email ,  password)
      SELECT _random_int, _current_ts  , _email , _passwd
      FROM  (SELECT (bigint '-2147483648'       -- could be int, but sum is bigint anyway
                   + trunc(bigint '4294967296' * random())::bigint)::int
            ) AS t(_random_int)                 -- random int
      WHERE  _random_int <> 0                   -- exclude 0 (no insert)
      ON     CONFLICT (identity_id) DO NOTHING  -- no exception raised!
      RETURNING identity_id                     -- return *actually* inserted identity_id
      INTO   new_identity_id;                   -- OUT parameter, returned at end

      EXIT WHEN FOUND;                          -- exit after success
      -- maybe add counter and raise exception when exceeding n (100?) iterations
   END LOOP;
END
$func$  LANGUAGE plpgsql;
```

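Key points

Your calculation of random integers would result in an integer out of range error, because the intermediate term int_max - int_min + 1 is computed as integer, but the result does not fit. I suggest the cheaper, correct algorithm above.
Entering a block with an exception clause is considerably more expensive than entering one without. Luckily, you don't need to raise an exception in the first place. Use an UPSERT (INSERT ... ON CONFLICT ... DO NOTHING) to solve this cheaply and elegantly (Postgres 9.5+).
The manual:

Tip: A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one. Therefore, don't use EXCEPTION without need.
You don't need the additional IF construct either. Use SELECT with WHERE.
Make new_identity_id an OUT parameter to simplify.
Use the RETURNING clause and insert the resulting identity_id directly into the OUT parameter. Besides simpler code and faster execution, there is an additional, subtle benefit: you get the value that was actually inserted. If there are triggers or rules on the table, this may differ from what you sent with the INSERT.
Assignments are comparatively expensive in PL/pgSQL. Reduce them to a minimum for efficient code.
You could also remove the last remaining variable _current_ts and do the calculation in the subquery; then you would not need DECLARE at all. I left that one in, since it may make sense to compute it once if the function loops multiple times ...
What remains is a single SQL command, wrapped in a LOOP to retry until success.
If there is a chance that your table might overflow (using all or most int4 numbers), and strictly speaking there always is a chance, I would add a counter and raise an exception after maybe 100 iterations to avoid an endless loop.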
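
The counter suggested in the last point could be sketched roughly as follows. This is an illustration, not the answer's own code: the variable _attempts, the limit of 100 and the error message are assumptions, the random-number expression spells out the trunc() step described in the code comments, and ON CONFLICT (identity_id) presumes a unique index or constraint on app.identity.identity_id.

```sql
CREATE OR REPLACE FUNCTION app.create_identity(_email text, _passwd text
                                             , OUT new_identity_id int) AS
$func$
DECLARE
   _current_ts int := extract(epoch FROM now());
   _attempts   int := 0;            -- hypothetical retry counter
BEGIN
   LOOP
      INSERT INTO app.identity (identity_id, date_inserted, email, password)
      SELECT _random_int, _current_ts, _email, _passwd
      FROM  (SELECT (bigint '-2147483648'
                   + trunc(bigint '4294967296' * random())::bigint)::int
            ) AS t(_random_int)
      WHERE  _random_int <> 0
      ON     CONFLICT (identity_id) DO NOTHING
      RETURNING identity_id
      INTO   new_identity_id;

      EXIT WHEN FOUND;              -- a row was inserted: done

      _attempts := _attempts + 1;   -- no row inserted (0 drawn or id taken)
      IF _attempts >= 100 THEN      -- assumed limit, per the rough figure above
         RAISE EXCEPTION 'create_identity: no free identity_id after % attempts', _attempts;
      END IF;
   END LOOP;
END
$func$ LANGUAGE plpgsql;
```

The extra assignment runs only on a failed attempt, so the common case of a first-try success still costs a single SQL command.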

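【Comments】:

Wow! That's what I call engineering! I guess you are using a single query (all in one) because queries execute very fast once they are parsed. Now I have to upgrade my 9.4 installation. Thanks!
@Nulik: Basically, yes. Under the hood, every assignment in plpgsql is a separate SELECT (very basic and fast, but still). As long as it doesn't get too complicated, doing everything in a single query is typically faster.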