1.先在hive-site.xml中设置小文件的标准.

<property>
  <name>hive.merge.smallfiles.avgsize</name>
  <value>536870912</value>
  <description>When the average output file size of a job is less than this number, Hive will start an additional map-reduce job to merge the output files into bigger files.  This is only done for map-only jobs if hive.merge.mapfiles is true, and for map-reduce jobs if hive.merge.mapredfiles is true.</description>
</property>

2.为仅仅有map的mapreduce的输出并合并小文件.


<property>
  <name>hive.merge.mapfiles</name>
  <value>true</value>
  <description>Merge small files at the end of a map-only job</description>
</property>

2.为含有reduce的mapreduce的输出并合并小文件.

<property>
  <name>hive.merge.mapredfiles</name>
  <value>true</value>
  <description>Merge small files at the end of a map-reduce job</description>
</property>




相关文章:

  • 2021-10-15
  • 2022-12-23
  • 2021-09-27
  • 2021-12-22
  • 2021-04-17
  • 2022-12-23
  • 2022-12-23
  • 2022-12-23
猜你喜欢
  • 2022-12-23
  • 2021-08-10
  • 2022-12-23
  • 2021-10-26
  • 2022-12-23
  • 2022-12-23
  • 2021-11-29
相关资源
相似解决方案