Capturing group regex with grep: answers

【Question title】: Capturing group regex with grep
【Posted】: 2022-01-07 10:19:23
【Question】:

I am trying to capture the SQL DDL "CREATE" statement from a PostgreSQL schema dump that looks like this:

SET default_table_access_method = heap;

CREATE TABLE schema_name.table_name (
    col1 bigint,
    col2 text
);

ALTER TABLE schema_name.table_name OWNER TO user;

CREATE INDEX index ON schema_name.table_name USING btree (col1);

What I want is:

CREATE TABLE schema_name.table_name (
    col1 bigint,
    col2 text
);

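Why doesn't grep -Po "(CREATE TABLE)[\S\s]*(;)" dump.sql work?

In PCRE2, /CREATE TABLE [\w]*\.[\w]*[\S\s]*(;)/U matches correctly.

Thanks.

【Comments on the question】:

"Why doesn't the regex work?": grep matches one line at a time by default, so a pattern can never span the line breaks inside the statement. You need to add the -z option, which makes grep treat the input as NUL-separated records, so a file without NUL bytes is handled as a single multi-line record. Also, if you want .*? to match any character including newlines, you need the s modifier (see Modifiers in perlre). For example: grep -zPo "(?s)CREATE TABLE.*?;" dump.sql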
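The difference between the two modes can be tried on a small copy of the question's dump. This is a minimal sketch, assuming GNU grep built with PCRE support (-P); the temporary file and the variable names line_mode and file_mode are illustrative and not part of the original post.

```shell
#!/bin/sh
# Recreate the sample dump from the question in a temporary file.
dump=$(mktemp)
cat > "$dump" <<'EOF'
SET default_table_access_method = heap;

CREATE TABLE schema_name.table_name (
    col1 bigint,
    col2 text
);

ALTER TABLE schema_name.table_name OWNER TO user;

CREATE INDEX index ON schema_name.table_name USING btree (col1);
EOF

# Default mode: each line is matched on its own, and no single line
# contains both "CREATE TABLE" and ";", so nothing is printed.
line_mode=$(grep -Po '(CREATE TABLE)[\S\s]*(;)' "$dump")

# -z turns the whole file into one record; (?s) lets "." cross newlines,
# and the lazy ".*?" stops at the first ";".
# With -z each match is terminated by NUL, so convert it for display.
file_mode=$(grep -zPo '(?s)CREATE TABLE.*?;' "$dump" | tr '\0' '\n')

printf 'without -z: [%s]\n' "$line_mode"
printf 'with -z:\n%s\n' "$file_mode"
rm -f "$dump"
```

The tr step is only there because the matches come out NUL-terminated under -z; it can be dropped when the output is fed to another NUL-aware tool.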

Tags: regex perl awk sed grep

【Solution 1】:

sed would be a better tool for this:

sed -n '/^CREATE TABLE/,/;$/p' file.sql

CREATE TABLE schema_name.table_name (
    col1 bigint,
    col2 text
);
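The range address in the sed command can be checked on a file with more than one table. The sketch below assumes every statement ends with ";" at the end of a line, as pg_dump output in the question does; the second table, other_table, is a hypothetical addition for illustration.

```shell
#!/bin/sh
# Hypothetical sample with two CREATE TABLE statements.
sample=$(mktemp)
cat > "$sample" <<'EOF'
SET default_table_access_method = heap;

CREATE TABLE schema_name.table_name (
    col1 bigint,
    col2 text
);

ALTER TABLE schema_name.table_name OWNER TO user;

CREATE TABLE schema_name.other_table (
    id integer
);
EOF

# -n suppresses the default output; p prints only the lines inside a range
# that opens on a line starting with "CREATE TABLE" and closes on the next
# line ending in ";". The range re-arms after it closes, so every table
# in the file is printed.
tables=$(sed -n '/^CREATE TABLE/,/;$/p' "$sample")
printf '%s\n' "$tables"
rm -f "$sample"
```

Because the range is line-based, it relies on the closing ");" sitting on its own line; a dump formatted differently may need one of the grep -z patterns instead.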

If you really want a GNU grep solution, use:

grep -zPo "(?m)^CREATE TABLE[^;]+;\R" file.sql

CREATE TABLE schema_name.table_name (
    col1 bigint,
    col2 text
);

【Discussion】:

【Solution 2】:

I'm not sure about your regex, but this works:

grep -Poz "CREATE TABLE[^;]*;" dump.sql

which gives:

CREATE TABLE schema_name.table_name (
    col1 bigint,
    col2 text
);

【Discussion】:

【Solution 3】:

Since this is tagged perl... here is a quick script that parses the SQL dump using SQL::Script, an old but nifty module I found:

#!/usr/bin/env perl
use strict;
use warnings;
use feature qw/say/;
use SQL::Script; # Install with your favorite CPAN client

# Pass the dump file name as the command-line argument

my $script = SQL::Script->new;
$script->read($ARGV[0]);
foreach my $stmt ($script->statements) {
    say "$stmt;" if $stmt =~ /^CREATE TABLE/i;
}

Example:

$ ./dump_tables test.sql
CREATE TABLE schema_name.table_name (
    col1 bigint,
    col2 text
);

【Discussion】:

【Solution 4】:

With GNU awk, you could try the following awk program.

awk -v RS='\nCREATE[^)]*\n\\);' 'RT{gsub(/(^|$)\n/,"",RT);print RT}' Input_file

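Explanation: With GNU awk, the record separator RS is set to the regex \nCREATE[^)]*\n\\); so that the separator itself matches exactly the part wanted from the sample shown. The main program then checks that RT (the text that matched RS) is non-empty, strips the newline at its start or end, and prints it, leaving only the wanted part of the SQL dump.

【Discussion】: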