【Posted】:2015-10-20 15:01:08
【Question】:
Suppose I have two local files, file1.txt and file2.txt.
Contents of file1.txt:
1,a
3,c
Contents of file2.txt:
2,b
4,d
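For anyone who wants to reproduce this, the two input files can be recreated locally with something like the following. This is only a sketch; it assumes a POSIX shell and writes into the current directory.

```shell
# Recreate the two sample input files from the question in the current directory.
# Each file holds two comma-separated rows, one row per line.
printf '1,a\n3,c\n' > file1.txt
printf '2,b\n4,d\n' > file2.txt
```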
I have put the files on Hadoop like this:
hadoop fs -rm -r /user/cloudera/repart2/*
hadoop fs -mkdir -p /user/cloudera/repart2/20150401
hadoop fs -put file1.txt /user/cloudera/repart2/20150401/
hadoop fs -mkdir -p /user/cloudera/repart2/20150402
hadoop fs -put file2.txt /user/cloudera/repart2/20150402/
I have created a Hive table:
-- Select a test database
use training;
-- Create the table
create external table repart (
col1 int, col2 string)
PARTITIONED BY (Test int)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
location '/user/cloudera/repart2';
-- Add partitions
ALTER TABLE repart ADD PARTITION (Test='20150401') LOCATION '/user/cloudera/repart2/20150401/';
ALTER TABLE repart ADD PARTITION (Test='20150402') LOCATION '/user/cloudera/repart2/20150402/';
When I run a select statement
select * from repart;
it shows
1 a 20150401
3 c 20150401
2 b 20150402
4 d 20150402
I want my table to end up looking like this:
1 a 20150401
2 b 20150401
3 c 20150401
4 d 20150401
2 b 20150402
4 d 20150402
But when I try this insert query
INSERT INTO TABLE repart PARTITION (Test='20150401') select col1, col2 FROM repart where Test = 20150402;
the query leaves the table looking like this. The original data in partition 20150401 has been overwritten.
2 b 20150401
4 d 20150401
2 b 20150402
4 d 20150402
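One way to narrow this down is to look at the files under the partition directory before and after the insert, to see whether file1.txt is still there. The helper below is only a sketch: the function name list_partition_files and the HADOOP_CMD override are hypothetical and not part of Hadoop, while the path comes from the commands above.

```shell
# Hypothetical helper: list the files stored under one partition directory.
# HADOOP_CMD is an illustrative override for the client binary; it defaults to "hadoop".
list_partition_files() {
  "${HADOOP_CMD:-hadoop}" fs -ls "/user/cloudera/repart2/$1"
}
```

Running `list_partition_files 20150401` once before and once after the INSERT should show whether the original file was removed or a new file was added next to it.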
The "hive --version" command returns 0.12.0-cdh5.0.0. I noticed this jira, but my table is already all lowercase, so I'm not sure what is going wrong.
【Comments】: