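While using BytesWritable to merge small files, I found that the length of the stored data did not match the original content: some extra whitespace appeared at the end.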

 

Test code

@Test
public void test() {
    String str = "aaa";

    BytesWritable v = new BytesWritable();
    v.set(str.getBytes(), 0, str.getBytes().length);

    System.out.println("*" + new String(v.getBytes()) + "*");
}

Result: one extra blank character appears before the closing *.

BytesWritable length problem (extra space)

 

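Looking at the BytesWritable source shows that set() resizes the backing array, so it can be larger than the data copied into it. The actual content length is stored in the size field.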

public void set(byte[] newData, int offset, int length) {
    setSize(0);
    setSize(length);
    System.arraycopy(newData, offset, bytes, 0, size);
}
public void setSize(int size) {
    if (size > getCapacity()) {
        // Avoid overflowing the int too early by casting to a long.
        long newSize = Math.min(Integer.MAX_VALUE, (3L * size) / 2L);
        setCapacity((int) newSize);
    }
    this.size = size;
}
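To see why the backing array ends up longer than the content, here is a minimal sketch that mimics the growth rule in setSize() without depending on Hadoop. The class `PaddedBuffer` and its reallocation step are hypothetical stand-ins, not Hadoop's API; only the 3/2 growth formula is taken from the source quoted above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for BytesWritable. Only the growth formula in
// setSize() comes from the Hadoop source quoted above.
final class PaddedBuffer {
    private byte[] bytes = new byte[0];
    private int size = 0;

    public int getCapacity() { return bytes.length; }
    public int getLength() { return size; }
    public byte[] getBytes() { return bytes; }

    public void setSize(int newSize) {
        if (newSize > getCapacity()) {
            // Grow to 1.5x the requested size, as in the quoted setSize().
            long newCapacity = Math.min(Integer.MAX_VALUE, (3L * newSize) / 2L);
            byte[] grown = new byte[(int) newCapacity];
            System.arraycopy(bytes, 0, grown, 0, size);
            bytes = grown;
        }
        this.size = newSize;
    }

    public void set(byte[] newData, int offset, int length) {
        setSize(0);
        setSize(length);
        System.arraycopy(newData, offset, bytes, 0, size);
    }

    public static void main(String[] args) {
        PaddedBuffer v = new PaddedBuffer();
        byte[] data = "aaa".getBytes(StandardCharsets.UTF_8);
        v.set(data, 0, data.length);
        System.out.println("length   = " + v.getLength());
        System.out.println("capacity = " + v.getCapacity());
    }
}
```

For "aaa" the requested size is 3, so the capacity becomes (3 * 3) / 2 = 4. The fourth byte is an unused 0x00, which shows up as the extra blank character when the whole array is decoded.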

 

Since the valid length is known, just pass it along when converting the bytes to a String.

@Test
public void test() {
    String str = "aaa";

    BytesWritable v = new BytesWritable();
    v.set(str.getBytes(), 0, str.getBytes().length);

    // getSize() is deprecated; use getLength() instead
    System.out.println("*" + new String(v.getBytes(),0,v.getLength()) + "*");
}
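Another option is to trim the padded array down to its valid length before handing it to other code; BytesWritable also provides copyBytes() for that purpose. The sketch below shows the same idea with plain JDK calls only. The class `TrimExample`, its helper names, and the hard-coded padded array are assumptions for illustration, not part of Hadoop.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helpers illustrating "use only the valid bytes".
final class TrimExample {
    // Copy only the first validLength bytes, dropping unused capacity.
    static byte[] trim(byte[] backing, int validLength) {
        return Arrays.copyOf(backing, validLength);
    }

    // Decode only the valid range, with an explicit charset.
    static String decode(byte[] backing, int validLength) {
        return new String(backing, 0, validLength, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for v.getBytes(): 3 valid bytes plus 1 padding byte.
        byte[] backing = {'a', 'a', 'a', 0};
        int validLength = 3; // stand-in for v.getLength()

        System.out.println("*" + decode(backing, validLength) + "*");
        System.out.println("trimmed length = " + trim(backing, validLength).length);
    }
}
```

Passing the charset explicitly also avoids depending on the platform default, which both getBytes() on a String and new String(byte[]) otherwise use.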



http://hadoop.apache.org/docs/r2.9.2/api/org/apache/hadoop/io/BytesWritable.html
