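【Posted】: 2020-08-04 19:06:32
【Question】:
This question is similar to: Find and Extract value after specific String from a file using bash shell script?
I am running a Hive query from a shell script and need to extract some values into a variable. The query is as follows: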
sql="show create table dev.emp"
partition_col=$(beeline -u "$Beeline_URL" -e "$sql" | grep 'PARTITIONED BY' | cut -d "'" -f2)
The output of the SQL query is as follows:
+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
| CREATE EXTERNAL TABLE `dv.par_kst`( |
| `col1` string, |
| `col2` string, |
| `col3` string) |
| PARTITIONED BY ( |
| `part_col1` int, |
| `part_col2` int) |
| ROW FORMAT SERDE |
| 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' |
| STORED AS INPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' |
| OUTPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' |
| LOCATION |
| 'hdfs://nameservicets1/dv/hdfsdata/par_kst' |
| TBLPROPERTIES ( |
| 'spark.sql.create.version'='2.2 or prior', |
| 'spark.sql.sources.schema.numPartCols'='2', |
| 'spark.sql.sources.schema.numParts'='1', |
| 'spark.sql.sources.schema.part.0'='{"type":"struct","fields":[{"name":"col1","type":"string","nullable":true,"metadata":{}},{"name":"col2","type":"string","nullable":true,"metadata":{}},{"name":"col3","type":"integer","nullable":true,"metadata":{}},{"name":"part_col2","type":"integer","nullable":true,"metadata":{}}]}', |
| 'spark.sql.sources.schema.partCol.0'='part_col1', |
| 'spark.sql.sources.schema.partCol.1'='part_col2', |
| 'transient_lastDdlTime'='1587487456') |
+----------------------------------------------------+
From the output above, I want to extract the PARTITIONED BY details.
Desired output:
part_col1 , part_col2
I tried the following code, but it does not return the correct value:
partition_col=$(beeline -u "$Beeline_URL" -e "$sql" | grep 'PARTITIONED BY' | cut -d "'" -f2)
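The PARTITIONED BY columns are not fixed: for some other files there may be 3 or more of them, so I want to extract all of the PARTITIONED BY columns.
That is, I need all values between PARTITIONED BY and ROW FORMAT SERDE, with the spaces, backticks (`), and data types removed.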
【Discussion】:
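One possible approach, sketched under assumptions rather than offered as a definitive answer: `grep 'PARTITIONED BY'` only returns the single line that contains the marker, while the column names sit on the lines after it, and those lines contain no single quotes for `cut -d "'"` to split on. A small awk filter can instead collect every backtick-quoted name that appears between the `PARTITIONED BY` line and the `ROW FORMAT SERDE` line. The function name `extract_partition_cols` and the `sample` variable are hypothetical, introduced only for illustration. The sketch assumes the output layout shown in the question (one partition column per line, each name wrapped in backticks) and that `ROW FORMAT SERDE` follows the partition list.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads `show create table` output on stdin and prints
# the partition column names as a comma-separated list.
extract_partition_cols() {
  awk '
    /ROW FORMAT SERDE/ { in_part = 0 }   # end marker: stop collecting
    in_part {
      # The column name is the text inside the first pair of backticks;
      # the data type and table borders after it are ignored.
      if (match($0, /`[^`]+`/)) {
        name = substr($0, RSTART + 1, RLENGTH - 2)
        out = out (out == "" ? "" : ",") name
      }
    }
    /PARTITIONED BY/ { in_part = 1 }     # start marker: collect from the next line
    END { print out }
  '
}

# Sample input copied from the relevant part of the output in the question.
sample='| PARTITIONED BY ( |
| `part_col1` int, |
| `part_col2` int) |
| ROW FORMAT SERDE |'

printf '%s\n' "$sample" | extract_partition_cols
# prints: part_col1,part_col2

# With the real query (assumes Beeline_URL and sql are set as in the question):
# partition_col=$(beeline -u "$Beeline_URL" -e "$sql" | extract_partition_cols)
```

Because the names are accumulated in a loop, the same filter handles three or more partition columns. To get `part_col1 , part_col2` exactly as in the desired output, change the `","` separator to `" , "`. Note that `"$sql"` has to be quoted, otherwise the shell splits the query into separate words before `beeline` sees it.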