Connecting and reading data from Elasticsearch into Hive: answer

【Question title】: Connection and read data from elasticsearch to hive
【Posted】: 2021-04-03 11:35:18
【Question description】:

I want to connect Hive to Elasticsearch. I followed the instructions from here and performed the following steps:

1. start-dfs.sh
2. start-yarn.sh
3. launch elasticsearch
4. launch kibana
5. launch hive
inside hive 
a- create a database
b- create a table
c- load data into the table (LOAD DATA LOCAL INPATH '/home/myuser/Documents/datacsv/myfile.csv' OVERWRITE INTO TABLE students; )
d- add jar /home/myuser/elasticsearch-hadoop-7.10.1/dist/elasticsearch-hadoop-hive-7.10.1.jar
e- create a table for Elastic. 
create table students_es (stt int not null, mahocvien varchar(10), tenho string, ten string, namsinh date, gioitinh string, noisinh string, namvaodang date, trinhdochuyenmon string, hesoluong float, phucaptrachnhiem float, chucvudct string, chucdqh string, dienuutien int, ghichu int) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.nodes' = '127.0.0.1', 'es.port' = '9201', 'es.resource' = 'students/student');

f- insert overwrite table students_es select * from students;

Then I get the following error:

FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. org/apache/commons/httpclient/protocol/ProtocolSocketFactory

I am using the following component versions: Kibana: 7.10.1, Hive: 3.1.2, Hadoop: 3.1.2

【Comments on the question】:

Tags: java apache-spark elasticsearch hadoop hive

【Solution 1】:

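I finally found a solution. You need to download the jar file commons-httpclient-3.1.jar and put it in your Hive lib directory. The class named in the error, org/apache/commons/httpclient/protocol/ProtocolSocketFactory, comes from that jar.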
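
For anyone who wants the fix as a command, here is a minimal sketch of copying the jar into place. It assumes the jar has already been downloaded and that your Hive lib directory is `$HIVE_HOME/lib`. The original answer does not give that path, and the function name `install_httpclient_jar` is only a hypothetical helper for illustration.

```shell
# Sketch only: copy commons-httpclient-3.1.jar into the Hive lib directory.
# Assumptions (not stated in the original answer): the jar is already
# downloaded, and the Hive lib directory is passed as the second argument.
install_httpclient_jar() {
    jar_path="$1"
    lib_dir="$2"

    # Refuse to continue if the jar is not where we expect it.
    if [ ! -f "$jar_path" ]; then
        echo "jar not found: $jar_path" >&2
        return 1
    fi

    # Refuse to continue if the target lib directory does not exist.
    if [ ! -d "$lib_dir" ]; then
        echo "Hive lib directory not found: $lib_dir" >&2
        return 1
    fi

    cp "$jar_path" "$lib_dir/"
    echo "installed $(basename "$jar_path") into $lib_dir"
}

# Example call with hypothetical paths:
# install_httpclient_jar ./commons-httpclient-3.1.jar "$HIVE_HOME/lib"
```

After copying the jar, start a new Hive session so that it is picked up, then rerun the `insert overwrite table students_es ...` statement.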

【Comments】: