Answers to: replace lines with lines from another text file by matching patterns [duplicate]

【Question title】: replace lines with lines from another text file by matching patterns [duplicate]
【Posted】: 2013-06-23 12:41:12
【Question description】:

I have a file, file_pattern, that looks like this:

SET  default_parallel 10
SET  pig.splitCombination true
SET  pig.maxCombinedSplitSize 134217728
register 'hdfs:///usr/lib/pig/piggybank.jar'; 
define LENGTH org.apache.pig.piggybank.evaluation.string.LENGTH();

//Some other stuff goes here

And a second file, insert_file, that looks like this:

ld_DW_D_INSTALLATION_PRODUCTS = load '/dan/data/dwh/dw_d_installation_products' using PigStorage ('|') as (inst_prod_wid , bac_wid , district_code , billing_account_no , inst_sequence_no , product_code , contract_type , maintenance_contract , exchange_line_indicator , product_type , quantity , first_cph_date , last_cph_date , first_cph_term_expiry_date , last_cph_term_expiry_date , last_cph_order_no ,ts_last_updated , data_owner , source_system , etl_created_dt);
ld_DEDUP_PROD_TPC_EXTRACT = load '/dan/data/dedup/dedup_prod_tpc_extract' using PigStorage ('|') as ( productfamilyid , productfamily , productgroupid , productgroup , grouplobid , productgroupowninglob , newproductid , newproductname , productowner , lifecycleid , lifecycle , buildgroupid , buildgroupname , ukbreleaseno , gs_productbuildstatusid , gs_productbuildstatus , ab_code , ab_codename , codelobid , codeowninglob , ab_codedestinyid , ab_codedestiny , ab_code_treatmentid , ab_code_treatmentstatus , gs_mappingstatusid , gs_mappingstatus , consumercount , btb_count , gs_count , otherbu_count , operateflagid , operateflagdescription , withdrawalprojectid , withdrawalproject , line_type , note );

Now I want a script that gives me output like this:

SET  default_parallel 10
SET  pig.splitCombination true
SET  pig.maxCombinedSplitSize 134217728

ld_DW_D_INSTALLATION_PRODUCTS = load '/dan/data/dwh/dw_d_installation_products' using PigStorage ('|') as (inst_prod_wid , bac_wid , district_code , billing_account_no , inst_sequence_no , product_code , contract_type , maintenance_contract , exchange_line_indicator , product_type , quantity , first_cph_date , last_cph_date , first_cph_term_expiry_date , last_cph_term_expiry_date , last_cph_order_no ,ts_last_updated , data_owner , source_system , etl_created_dt);
ld_DEDUP_PROD_TPC_EXTRACT = load '/dan/data/dedup/dedup_prod_tpc_extract' using PigStorage ('|') as ( productfamilyid , productfamily , productgroupid , productgroup , grouplobid , productgroupowninglob , newproductid , newproductname , productowner , lifecycleid , lifecycle , buildgroupid , buildgroupname , ukbreleaseno , gs_productbuildstatusid , gs_productbuildstatus , ab_code , ab_codename , codelobid , codeowninglob , ab_codedestinyid , ab_codedestiny , ab_code_treatmentid , ab_code_treatmentstatus , gs_mappingstatusid , gs_mappingstatus , consumercount , btb_count , gs_count , otherbu_count , operateflagid , operateflagdescription , withdrawalprojectid , withdrawalproject , line_type , note );

In short, I want the contents of the second file to be inserted after the last occurrence of a SET statement.

Thanks in advance,
Raghavendra

【Question comments】:

I found an answer using sed: skip_set=10; sed "${skip_set}r file_to_insert" input_file. This inserts the contents of file_to_insert after line 10 of input_file.
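The line number in the comment above is typed in by hand. It can instead be looked up, so that the insert always lands after the last SET line. The following is only a minimal sketch: the function name insert_after_last_set is hypothetical, and it assumes SET statements begin at the start of a line and that the name of the file to insert contains no newline.

```shell
#!/bin/sh
# insert_after_last_set FILE INSERT   (hypothetical helper name)
# Prints FILE with the contents of INSERT added after the last line
# of FILE that starts with "SET".
insert_after_last_set() {
    # grep -n prefixes each matching line with "N:";
    # keep only the number of the last match.
    last_set=$(grep -n '^SET' "$1" | tail -n 1 | cut -d: -f1)
    if [ -n "$last_set" ]; then
        # "Nr file" tells sed to output the file after line N.
        sed "${last_set}r $2" "$1"
    else
        # No SET line found: pass FILE through unchanged.
        cat "$1"
    fi
}
```

With the files from the question it would be called as insert_after_last_set file_pattern insert_file, with the result redirected to a new file.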

Tags: regex perl sed awk pattern-matching

【Solution 1】:

Code for GNU sed:

$ sed -r '/^SET/H;$bk;d;:k;x;s#.*\n(.*)\'#/\1/{\na\nr file2\n}#' file1 |sed -f - file1
SET  default_parallel 10
SET  pig.splitCombination true
SET  pig.maxCombinedSplitSize 134217728
ld_DW_D_INSTALLATION_PRODUCTS = load '/dan/data/dwh/dw_d_installation_products' using PigStorage ('|') as (inst_prod_wid , bac_wid , district_code , billing_account_no , inst_sequence_no , product_code , contract_type , maintenance_contract , exchange_line_indicator , product_type , quantity , first_cph_date , last_cph_date , first_cph_term_expiry_date , last_cph_term_expiry_date , last_cph_order_no ,ts_last_updated , data_owner , source_system , etl_created_dt);
ld_DEDUP_PROD_TPC_EXTRACT = load '/dan/data/dedup/dedup_prod_tpc_extract' using PigStorage ('|') as ( productfamilyid , productfamily , productgroupid , productgroup , grouplobid , productgroupowninglob , newproductid , newproductname , productowner , lifecycleid , lifecycle , buildgroupid , buildgroupname , ukbreleaseno , gs_productbuildstatusid , gs_productbuildstatus , ab_code , ab_codename , codelobid , codeowninglob , ab_codedestinyid , ab_codedestiny , ab_code_treatmentid , ab_code_treatmentstatus , gs_mappingstatusid , gs_mappingstatus , consumercount , btb_count , gs_count , otherbu_count , operateflagid , operateflagdescription , withdrawalprojectid , withdrawalproject , line_type , note );
register 'hdfs:///usr/lib/pig/piggybank.jar';
define LENGTH org.apache.pig.piggybank.evaluation.string.LENGTH();

//Some other stuff goes here
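The two-pass idea behind this answer (first find the last SET line, then insert after it) can also be sketched in awk, one of the tags on the question. This is an alternative under stated assumptions, not the answer's own sed script: the function name insert_after_last_set_awk and the awk variable names are hypothetical, and the input file is read twice, so it must be a regular file rather than a pipe.

```shell
#!/bin/sh
# insert_after_last_set_awk FILE INSERT   (hypothetical helper name)
# Prints FILE with the contents of INSERT added after the last line
# of FILE that starts with "SET".
insert_after_last_set_awk() {
    awk -v insert="$2" '
        # First pass over FILE: remember the number of the last SET line.
        NR == FNR { if (/^SET/) last = FNR; next }
        # Second pass: print every line of FILE as it is.
        { print }
        # Right after the remembered line, copy INSERT to the output.
        FNR == last {
            while ((getline line < insert) > 0) print line
            close(insert)
        }
    ' "$1" "$1"
}
```

Called as insert_after_last_set_awk file1 file2, it prints file1 with file2 placed after the last SET line. If file1 contains no SET line, it is printed unchanged.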

【Comments】: