Today I'd like to share a small test of external data loading with Greenplum and Deepgreen.

The prerequisites are:

1) Greenplum 4.3 and Deepgreen 16.x are installed.

2) The xdrive and gpfdist environments are already set up.

3) Prepare the test file: write 100 million rows into number.csv. For example: for((i=1;i<=100000000;i++));do echo '1,2,3' >> number.csv;done     File size after writing: 573M

4) Put the test file into the local HDFS and serve it through a local gpfdist:

    hdfs dfs -put /home/hadoop/number.csv /home/hadoop/input

    gpfdist -d /home/hadoop -p 8081 &

5) Create two external tables, one for each access method (xdrive and gpfdist):

    create external table number_xdrive(n1 int,n2 int,n3 int) location ('xdrive://localhost:50000/dw/number.csv') format 'csv';

    create external table number_gpfdist(n1 int,n2 int,n3 int) location ('gpfdist://localhost:8081/number.csv') format 'csv';

6) Run a select ... limit 10 statement against each table to confirm that the data is accessible.
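Appending one line per iteration with `>>` reopens the file on every write, which can be slow for 100 million rows. Steps 3 and 6 could be sketched as the helpers below. This is only a sketch: the function names `gen_numbers`, `expected_bytes`, `limit_query`, and `check_tables` are made up for illustration, and the database name passed to `check_tables` is an assumption that depends on your setup.

```shell
# Hypothetical helpers for steps 3 and 6; not part of Greenplum or Deepgreen.

# gen_numbers FILE COUNT: write COUNT lines of "1,2,3" to FILE in one pass.
gen_numbers() {
    awk -v n="$2" 'BEGIN { for (i = 0; i < n; i++) print "1,2,3" }' > "$1"
}

# expected_bytes COUNT: each line is 6 bytes ("1,2,3" plus a newline).
expected_bytes() {
    echo $(( 6 * $1 ))
}

# limit_query TABLE: print the step 6 sanity-check statement for one table.
limit_query() {
    printf 'select * from %s limit 10;\n' "$1"
}

# check_tables DB: run the statement against both external tables from step 5.
# DB is a placeholder for your database name; this needs a running cluster.
check_tables() {
    for t in number_xdrive number_gpfdist; do
        psql -d "$1" -c "$(limit_query "$t")" || return 1
    done
}
```

For the full data set one would call `gen_numbers number.csv 100000000`; `expected_bytes 100000000` gives 600000000 bytes, roughly in line with the 573M reported in step 3.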

 

Test scenarios and timing comparison:

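1. count test:

1) Deepgreen Xdrive

[Image: Greenplum vs Deepgreen - gpfdist external table vs. xdrive HDFS comparison test]

2) Deepgreen gpfdist

[Image: Greenplum vs Deepgreen - gpfdist external table vs. xdrive HDFS comparison test]

3) Greenplum gpfdist

[Image: Greenplum vs Deepgreen - gpfdist external table vs. xdrive HDFS comparison test]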

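2. select * test:

1) Deepgreen Xdrive

[Image: Greenplum vs Deepgreen - gpfdist external table vs. xdrive HDFS comparison test]

2) Deepgreen gpfdist

[Image: Greenplum vs Deepgreen - gpfdist external table vs. xdrive HDFS comparison test]

3) Greenplum gpfdist

[Image: Greenplum vs Deepgreen - gpfdist external table vs. xdrive HDFS comparison test]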

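3. group by test:

1) Deepgreen Xdrive

[Image: Greenplum vs Deepgreen - gpfdist external table vs. xdrive HDFS comparison test]

2) Deepgreen gpfdist

[Image: Greenplum vs Deepgreen - gpfdist external table vs. xdrive HDFS comparison test]

3) Greenplum gpfdist

[Image: Greenplum vs Deepgreen - gpfdist external table vs. xdrive HDFS comparison test]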

4. Query with a where condition:

1) Deepgreen Xdrive & gpfdist

[Image: Greenplum vs Deepgreen - gpfdist external table vs. xdrive HDFS comparison test]

2) Greenplum gpfdist

[Image: Greenplum vs Deepgreen - gpfdist external table vs. xdrive HDFS comparison test]
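The exact statements behind the four scenarios are not given in the text, so the following is only a sketch of how such a comparison might be reproduced. The statements, the grouping column `n1`, the predicate `n1 = 1`, and the helper names `bench_queries` and `run_bench` are all assumptions for illustration, not the queries the author ran.

```shell
# Hypothetical timing harness; statements are assumed, not taken from the post.

# bench_queries TABLE: print one statement per scenario, one per line.
bench_queries() {
    cat <<EOF
select count(*) from $1;
select * from $1;
select n1, count(*) from $1 group by n1;
select * from $1 where n1 = 1;
EOF
}

# run_bench DB TABLE: run each statement through psql with timing enabled.
# -X skips ~/.psqlrc so that \timing (a toggle, off by default) turns timing on;
# -o /dev/null discards the result rows so mainly the "Time:" lines remain.
run_bench() {
    bench_queries "$2" | while IFS= read -r q; do
        echo "== $q"
        printf '\\timing\n%s\n' "$q" | psql -X -q -d "$1" -o /dev/null
    done
}
```

With a running cluster, `run_bench your_database number_xdrive` and `run_bench your_database number_gpfdist` would be executed on each system in turn, where `your_database` is a placeholder.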

 

Reposted from: https://my.oschina.net/javacy/blog/2998514
