The basic operations below are adapted from the blog of 大鱼-瓶邪:
https://blog.csdn.net/qq_25948717/article/details/81167631

Environment

java jdk1.8.0_181
hadoop 2.7.2 (one master node, two data nodes)
zookeeper 3.4.9
hbase 1.2.6
hive 2.3.4
mysql 5.7.25
sqoop 1.4.7

Sqoop download mirror

sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz:
http://mirrors.hust.edu.cn/apache/sqoop/1.4.7/

Extract

tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
mv sqoop-1.4.7.bin__hadoop-2.6.0 sqoop-1.4.7

Configure environment variables

Edit ~/.profile (vim ~/.profile) and append:

export SQOOP_HOME=/home/hadoop/sqoop-1.4.7
export PATH=$PATH:$SQOOP_HOME/bin
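After saving, apply the variables with `source ~/.profile`. As a quick sanity check that PATH now contains Sqoop's bin directory (paths are the tutorial's; adjust to your layout), you can run:

```shell
# Re-create the two exports from ~/.profile and confirm PATH picks them up
export SQOOP_HOME=/home/hadoop/sqoop-1.4.7
export PATH=$PATH:$SQOOP_HOME/bin
echo "$PATH" | tr ':' '\n' | grep sqoop
# prints /home/hadoop/sqoop-1.4.7/bin
```

`sqoop` commands should then resolve from any directory once the install is complete.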

Edit the configuration file

cp sqoop-1.4.7/conf/sqoop-env-template.sh sqoop-1.4.7/conf/sqoop-env.sh

vim sqoop-1.4.7/conf/sqoop-env.sh

Fill in the paths:
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/hadoop/hadoop-2.7.2

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoop/hadoop-2.7.2

#set the path to where bin/hbase is available
export HBASE_HOME=/home/hadoop/hbase-1.2.6

#Set the path to where bin/hive is available
export HIVE_HOME=/home/hadoop/apache-hive-2.3.4-bin

#Set the path for where zookeeper config dir is
export ZOOCFGDIR=/home/hadoop/zookeeper-3.4.9

Copy the MySQL JDBC driver

Copy the MySQL Connector/J jar into Sqoop's lib directory. (Connector/J 5.1.x is the series most commonly paired with Sqoop 1.4.7; the 6.x driver used here registers as com.mysql.cj.jdbc.Driver and may need extra JDBC URL parameters such as serverTimezone.)

cp mysql-connector-java-6.0.6.jar /home/hadoop/sqoop-1.4.7/lib/

Allow remote root login to MySQL

Log in to MySQL as root and run:

GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'root';
GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY 'root';

Create the database:

create database sqoop;

use sqoop;

Create a table and insert some rows:

create table dept(id int,name varchar(20),primary key(id));

insert into dept values(610213,'ApplicationsCould');

insert into dept values(610215,'ApplicationsBigData');

insert into dept values(590108,'SoftwareTech');

Exit MySQL:

quit;

Testing Sqoop with MySQL

List the MySQL databases:

sqoop list-databases --connect jdbc:mysql://master:3306/ --username root --password root

The command prints one database name per line.
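The original post showed the result as a screenshot. On a stock MySQL 5.7 server the listing should look roughly like this (the system schemas plus the sqoop database created above; your list will vary):

```shell
# Illustrative output of sqoop list-databases (actual list depends on the server)
cat <<'EOF'
information_schema
mysql
performance_schema
sys
sqoop
EOF
```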
Import a MySQL table into HDFS:

Start the Hadoop cluster: start-all.sh

Run:

sqoop import --connect jdbc:mysql://master:3306/sqoop --username root --password root --table dept -m 1 --target-dir /sqoop/dept --delete-target-dir

Note: --delete-target-dir removes the target directory if it already exists. Be sure you actually want that directory deleted, or you may lose data.
Open http://master:50070 in a browser, or query the directory from the command line:
hdfs dfs -cat /sqoop/dept/part-m-00000
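The screenshot is missing from this copy, but with Sqoop's default settings the part file holds the table as comma-separated text, one row per line, so the cat should print something close to:

```shell
# Illustrative content of /sqoop/dept/part-m-00000
# (Sqoop's default comma-delimited text output; row order may differ)
cat <<'EOF'
590108,SoftwareTech
610213,ApplicationsCould
610215,ApplicationsBigData
EOF
```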
Log in to MySQL and empty the dept table:

use sqoop;
truncate dept;

Export the HDFS data back into MySQL:

sqoop export --connect jdbc:mysql://master:3306/sqoop --username root --password root --table dept -m 1 --export-dir /sqoop/dept

Query the dept table in MySQL; the rows exported from HDFS should be back.

Incremental import into HDFS:

Insert one more row into the dept table:

insert into dept values(590101,'ComputerTech');

Exit MySQL and run:

sqoop import --connect jdbc:mysql://master:3306/sqoop --username root --password root --table dept -m 1 --target-dir /sqoop/dept --incremental append --check-column id
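As a rough sketch of what --incremental append does: Sqoop adds a filter of the form WHERE id > <last-value> to the generated query and appends only the matching rows as a new part file (part-m-00001 here). The filter can be simulated locally; the last-value of 600000 below is purely illustrative, since the command above does not pass --last-value:

```shell
# Simulate the append-mode filter (check-column id, illustrative last-value)
printf '590101\n590108\n610213\n610215\n' | awk -v last_value=600000 '$1 > last_value'
# prints 610213 and 610215
```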

Open http://master:50070, or query from the command line:
hdfs dfs -cat /sqoop/dept/part-m-00001

Import the MySQL dept table into Hive:

Copy hive-exec-*.jar from Hive's lib directory into Sqoop's lib directory:

cp apache-hive-2.3.4-bin/lib/hive-exec-2.3.4.jar sqoop-1.4.7/lib/

Then run:

sqoop import --connect jdbc:mysql://master:3306/sqoop --username root --password root --table dept -m 1 --hive-import

Log in to Hive; the dept table should now be there (show tables; then select * from dept;).

Exporting from Hive to MySQL

Empty MySQL's dept table again, then export the Hive data back:

sqoop export --connect jdbc:mysql://master:3306/sqoop --username root --password root --table dept -m 1 --export-dir /user/hive/warehouse/dept --input-fields-terminated-by '\0001'
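The --input-fields-terminated-by '\0001' matters because Hive's default field delimiter is the non-printing Ctrl-A character (octal 001), not a comma. You can make that delimiter visible on a sample line with cat -v:

```shell
# A dept row as Hive stores it in its warehouse file: fields joined by Ctrl-A.
# cat -v renders the control character as ^A so it can be seen.
printf '610213\001ApplicationsCould\n' | cat -v
# prints 610213^AApplicationsCould
```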


Moving data between MySQL and HBase:

Start the cluster: start-all.sh
Start ZooKeeper on all three nodes: zkServer.sh start
Start HBase: start-hbase.sh
Enter the HBase shell: hbase shell
Create an HBase table: create 'hbase_dept','col_family'
Exit the shell, then run:

sqoop import --connect jdbc:mysql://master:3306/sqoop --username root --password root --table 'dept' --hbase-create-table --hbase-table hbase_dept --column-family col_family --hbase-row-key id

Open the HBase shell again and scan the table (scan 'hbase_dept') to verify the import.

The End.
