The basic steps below are adapted from 大鱼-瓶邪's blog:
https://blog.csdn.net/qq_25948717/article/details/81167631
Base environment
java jdk1.8.0_181
hadoop 2.7.2 (one master node, two data nodes)
zookeeper 3.4.9
hbase 1.2.6
hive 2.3.4
mysql 5.7.25
sqoop 1.4.7
Sqoop mirror download
sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz:
http://mirrors.hust.edu.cn/apache/sqoop/1.4.7/
Extract
tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
mv sqoop-1.4.7.bin__hadoop-2.6.0 sqoop-1.4.7
Configure environment variables
vim ~/.profile and add:
export SQOOP_HOME=/home/hadoop/sqoop-1.4.7
export PATH=$PATH:$SQOOP_HOME/bin
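After saving ~/.profile, run `source ~/.profile` so the variables take effect in the current shell. A quick self-contained check (using the same paths as configured above) confirms Sqoop's bin directory ends up on the PATH:

```shell
# These exports mirror the ~/.profile additions above; after editing the
# file itself, `source ~/.profile` applies them to the current shell.
export SQOOP_HOME=/home/hadoop/sqoop-1.4.7
export PATH=$PATH:$SQOOP_HOME/bin
# Confirm the Sqoop bin directory is now on the PATH:
echo "$PATH" | grep -q "$SQOOP_HOME/bin" && echo "sqoop bin on PATH"
```

If the final line prints nothing, the profile edit was not applied.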
Edit the configuration file
cp sqoop-1.4.7/conf/sqoop-env-template.sh sqoop-1.4.7/conf/sqoop-env.sh
vim sqoop-1.4.7/conf/sqoop-env.sh
Fill in the paths:
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/hadoop/hadoop-2.7.2
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoop/hadoop-2.7.2
#set the path to where bin/hbase is available
export HBASE_HOME=/home/hadoop/hbase-1.2.6
#Set the path to where bin/hive is available
export HIVE_HOME=/home/hadoop/apache-hive-2.3.4-bin
#Set the path for where zookeeper config dir is
export ZOOCFGDIR=/home/hadoop/zookeeper-3.4.9
Copy the MySQL JDBC driver jar into Sqoop's lib directory
cp mysql-connector-java-6.0.6.jar /home/hadoop/sqoop-1.4.7/lib/
Grant MySQL remote login privileges
Log in to MySQL as root and run:
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'root';
GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY 'root';
Create a database:
create database sqoop;
use sqoop;
Create a table and insert data:
create table dept(id int,name varchar(20),primary key(id));
insert into dept values(610213,'ApplicationsCould');
insert into dept values(610215,'ApplicationsBigData');
insert into dept values(590108,'SoftwareTech');
Exit MySQL:
quit
Testing Sqoop against MySQL
List the MySQL databases:
sqoop list-databases --connect jdbc:mysql://master:3306/ --username root --password root
Result: the command prints the list of databases on master.
Import a MySQL table into HDFS:
Start the Hadoop cluster: start-all.sh
Run:
sqoop import --connect jdbc:mysql://master:3306/sqoop --username root --password root --table dept -m 1 --target-dir /sqoop/dept --delete-target-dir
Note: --delete-target-dir deletes the target directory if it already exists; make sure you really want that directory removed, or you may lose data.
Open http://master:50070, or query the directory from the command line:
hdfs dfs -cat /sqoop/dept/part-m-00000
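By default Sqoop writes the import as plain text with comma-separated fields, so part-m-00000 can be inspected with ordinary shell tools. A local simulation of what the file should contain (sample rows copied from the inserts above; the real file lives in HDFS, not /tmp):

```shell
# Simulated contents of /sqoop/dept/part-m-00000: Sqoop's default text
# output is one row per line with fields separated by commas.
printf '590108,SoftwareTech\n610213,ApplicationsCould\n610215,ApplicationsBigData\n' > /tmp/part-m-00000
# Pull out just the name column, as a quick sanity check would:
cut -d',' -f2 /tmp/part-m-00000
```

This prints the three department names, one per line.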
Log in to MySQL and empty the dept table:
use sqoop;
truncate dept;
Export the HDFS data back into MySQL:
sqoop export --connect jdbc:mysql://master:3306/sqoop --username root --password root --table dept -m 1 --export-dir /sqoop/dept
Query the dept table in MySQL to confirm the rows have been exported back.
Sqoop incremental import into HDFS:
Add a row to the dept table in MySQL:
insert into dept values(590101,'ComputerTech');
Exit MySQL and run:
sqoop import --connect jdbc:mysql://master:3306/sqoop --username root --password root --table dept -m 1 --target-dir /sqoop/dept --incremental append --check-column id
Open http://master:50070, or query the directory from the command line:
hdfs dfs -cat /sqoop/dept/part-m-00001
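The incremental command above supplies no `--last-value`, so Sqoop has no lower bound for the id check column and will generally pick up every row again on repeated runs. In append mode a row qualifies only when its check column is strictly greater than `--last-value`. The sketch below simulates that comparison against the sample ids, assuming a previous last value of 610215 (the largest id from the first import):

```shell
# Simulate Sqoop's append-mode filter: only ids strictly greater than
# the assumed previous --last-value (610215) would be imported.
last=610215
for id in 590101 590108 610213 610215; do
  if [ "$id" -gt "$last" ]; then
    echo "$id appended"
  else
    echo "$id skipped"
  fi
done
```

Every sample id is skipped against that bound, including the newly inserted 590101. This is why a monotonically increasing column (or a timestamp column with `--incremental lastmodified`) makes a better check column than an arbitrary id.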
Import the MySQL dept table into Hive:
Copy hive-exec-*.jar from Hive's lib directory into Sqoop's lib directory:
cp apache-hive-2.3.4-bin/lib/hive-exec-2.3.4.jar sqoop-1.4.7/lib/
Run:
sqoop import --connect jdbc:mysql://master:3306/sqoop --username root --password root --table dept -m 1 --hive-import
Log in to Hive; you can see the table has been imported.
Export from Hive into MySQL
Empty the dept table in MySQL, then export the Hive data with:
sqoop export --connect jdbc:mysql://master:3306/sqoop --username root --password root --table dept -m 1 --export-dir /user/hive/warehouse/dept --input-fields-terminated-by '\0001'
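The `--input-fields-terminated-by '\0001'` flag tells Sqoop that the Hive warehouse files use Hive's default field delimiter, the non-printing byte 0x01 (Ctrl-A). A local illustration of what such a line looks like (simulated file, not the actual warehouse data):

```shell
# Simulate one line of a Hive warehouse file: fields are joined by the
# \x01 (Ctrl-A) byte, Hive's default field terminator.
printf '610213\x01ApplicationsCould\n' > /tmp/hive_dept_line
# Split on that byte, as Sqoop does during the export:
cut -d $'\x01' -f2 /tmp/hive_dept_line
```

This prints `ApplicationsCould`; without the delimiter flag, Sqoop would try to split on commas and the export would fail or mis-parse the rows.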
Transferring data between MySQL and HBase:
Start the Hadoop cluster: start-all.sh
Start ZooKeeper on all three nodes: zkServer.sh start
Start HBase: start-hbase.sh
Enter the HBase shell: hbase shell
Create an HBase table: create 'hbase_dept','col_family'
Exit the shell and run:
sqoop import --connect jdbc:mysql://master:3306/sqoop --username root --password root --table 'dept' --hbase-create-table --hbase-table hbase_dept --column-family col_family --hbase-row-key id
Open the HBase shell and check the table: scan 'hbase_dept'
Done.