【Question title】: Automation of data format conversion to parent-child format
【Posted】: 2011-01-24 08:37:58
【Question】:

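This is an Excel worksheet in which each row has exactly one column filled in. (Explanation: all the CITY categories belong to V21, all the handset categories belong to CITYJ, and so on.)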

   V21                  
       CITYR
       CITYJ
           HandsetS
           HandsetHW
           HandsetHA
               LOWER_AGE<=20
               LOWER_AGE>20     
                   SMS_COUNT<=0 
                       RECHARGE_MRP<=122
                       RECHARGE_MRP>122
                   SMS_COUNT>0

I need to convert this into a two-column parent-child format, so the output sheet would be:

    V21           CITYR
    V21           CITYJ
    CITYJ         HandsetS
    CITYJ         HandsetHW
    CITYJ         HandsetHA
    HandsetHA     LOWER_AGE<=20
    HandsetHA     LOWER_AGE>20      
    LOWER_AGE>20    SMS_COUNT<=0    
    SMS_COUNT<=0    RECHARGE_MRP<=122
    SMS_COUNT<=0    RECHARGE_MRP>122
    LOWER_AGE>20    SMS_COUNT>0

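The data is huge, so I can't do this manually. How can I automate it?

【Comments】:

  • You mention that the data is huge. Since this is an Excel document, I assume it has no more than 65535 rows. Is that right?

Tags: java excel vbscript automation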


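【Solution 1】:

The task has three parts, so I'd like to know which one you need help with.

  1. Read the Excel sheet data into Java
  2. Process the data
  3. Write the data back to an Excel sheet

You said the sheet is huge and cannot be pulled into memory as a whole. How many top-level elements do you have, i.e. how many V21s? If there is only one, how many CITYR/CITYJ entries are there?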
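The three steps listed above can be sketched end to end without holding the sheet in memory. This is only a minimal sketch under assumptions that are not in the question: the sheet has been saved from Excel as tab-delimited text (so each empty leading cell becomes one tab), and the result is written as a two-column tab-delimited file that Excel can open again. The class name `ParentChildExport` and both method names are hypothetical. It keeps only the most recent value per level, so memory use depends on the depth of the hierarchy rather than on the number of rows.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class ParentChildExport {

    /** Counts leading tab characters; one tab is assumed to mean one column. */
    static int levelOf(String line) {
        int tabs = 0;
        while (tabs < line.length() && line.charAt(tabs) == '\t') {
            tabs++;
        }
        return tabs;
    }

    /** Reads the indented export line by line and writes one "parent<TAB>child" row per non-top-level value. */
    static void convert(Path in, Path out) throws IOException {
        // ancestors.get(i) holds the most recent value seen at level i.
        List<String> ancestors = new ArrayList<String>();
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String value = line.trim();
                if (value.isEmpty()) {
                    continue; // skip blank rows
                }
                // Clamp the level in case a row is indented more than one column past its predecessor.
                int level = Math.min(levelOf(line), ancestors.size());
                // Forget everything at this depth or deeper; what remains are the true ancestors.
                while (ancestors.size() > level) {
                    ancestors.remove(ancestors.size() - 1);
                }
                if (level > 0) {
                    writer.write(ancestors.get(level - 1) + "\t" + value);
                    writer.newLine();
                }
                ancestors.add(value);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: tab-delimited export of the sheet, args[1]: output file
        convert(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Top-level values such as V21 produce no row of their own, which matches the output table in the question. Reading and writing native Excel files would need a spreadsheet library instead of the plain-text steps shown here.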

--

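Adding some source code to my previous answer to show how the data can be manipulated. I fed it a tab-separated input file (4 spaces equal one column in Excel), and the code below prints the result out neatly. Note the empty condition for level == 1: if you think your JVM is holding too many objects, you can clear the entries and the stack at that point :)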

package com.ekanathk;

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Stack;
import java.util.logging.Logger;

import org.junit.Test;

class Entry {
    private String input;
    private int level;
    public Entry(String input, int level) {
        this.input = input;
        this.level = level;
    }
    public String getInput() {
        return input;
    }
    public int getLevel() {
        return level;
    }
    @Override
    public String toString() {
        return "Entry [input=" + input + ", level=" + level + "]";
    }
}

public class Tester {

    private static final Logger logger = Logger.getLogger(Tester.class.getName());

    @SuppressWarnings("unchecked")
    @Test
    public void testSomething() throws Exception {

        InputStream is = Thread.currentThread().getContextClassLoader().getResourceAsStream("samplecsv.txt");
        BufferedReader b = new BufferedReader(new InputStreamReader(is));
        String input = null;
        List<String[]> entries = new ArrayList<String[]>();
        Stack<Entry> stack = new Stack<Entry>();
        stack.push(new Entry("ROOT", -1));
        while((input = b.readLine()) != null){
            int level = whatIsTheLevel(input);
            input = input.trim();
            logger.info("input = " + input + " at level " + level); 
            Entry entry = new Entry(input, level);
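            if(level == 1) {
                // Hook: periodically flush the collected entries to another Excel sheet here
            }
            // Pop siblings and deeper nodes until the top of the stack is a true ancestor.
            // A single pop picks the wrong parent when the indentation drops by more than one level.
            while (stack.peek().getLevel() >= entry.getLevel()) {
                stack.pop();
            }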
            Entry parent = stack.peek();
            logger.info("parent = " + parent);
            entries.add(new String[]{parent.getInput(), entry.getInput()});
            stack.push(entry);
        }
        for(Object entry : entries) {
            System.out.println(Arrays.toString((String[])entry));
        }
    }

    private int whatIsTheLevel(String input) {
        int numberOfSpaces = 0;
        for(int i = 0 ; i < input.length(); i++) {
            if(input.charAt(i) != ' ') {
                return numberOfSpaces/4;
            } else {
                numberOfSpaces++;
            }
        }
        return numberOfSpaces/4;
    }
}
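To check the stack idea against the sample from the question without a test framework or an input file, here is a self-contained variant. It is a sketch rather than the answer's exact code: the names `StackPairs`, `Node` and `toPairs` are made up for illustration, the input is assumed to use 4 leading spaces per column, and it pops every entry at the same or a deeper level before reading the parent from the top of the stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class StackPairs {

    /** A value together with its indentation depth. */
    static final class Node {
        final String value;
        final int level;

        Node(String value, int level) {
            this.value = value;
            this.level = level;
        }
    }

    /** Depth of a line, assuming 4 leading spaces per column. */
    static int levelOf(String line) {
        int spaces = 0;
        while (spaces < line.length() && line.charAt(spaces) == ' ') {
            spaces++;
        }
        return spaces / 4;
    }

    /** Returns one {parent, child} pair for every line that has a parent. */
    static List<String[]> toPairs(List<String> lines) {
        List<String[]> pairs = new ArrayList<String[]>();
        Deque<Node> stack = new ArrayDeque<Node>();
        for (String line : lines) {
            String value = line.trim();
            if (value.isEmpty()) {
                continue; // ignore blank lines
            }
            int level = levelOf(line);
            // Discard siblings and deeper nodes so that the top of the stack is the parent.
            while (!stack.isEmpty() && stack.peek().level >= level) {
                stack.pop();
            }
            if (!stack.isEmpty()) {
                pairs.add(new String[] {stack.peek().value, value});
            }
            stack.push(new Node(value, level));
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "V21",
            "    CITYR",
            "    CITYJ",
            "        HandsetS",
            "        HandsetHW",
            "        HandsetHA",
            "            LOWER_AGE<=20",
            "            LOWER_AGE>20",
            "                SMS_COUNT<=0",
            "                    RECHARGE_MRP<=122",
            "                    RECHARGE_MRP>122",
            "                SMS_COUNT>0");
        for (String[] pair : toPairs(sample)) {
            System.out.println(pair[0] + "\t" + pair[1]);
        }
    }
}
```

The interesting row is the last one: SMS_COUNT>0 follows two deeper RECHARGE_MRP rows, so both RECHARGE_MRP>122 and SMS_COUNT<=0 have to come off the stack before LOWER_AGE>20 shows up as the parent.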

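【Comments】:

    【Solution 2】:

    This assumes your file is small enough to fit in the computer's memory. Even a 10 MB file should be fine.

    It has two parts:

    DataTransformer does all the required data transformation

    TreeNode is a simple custom tree data structure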

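    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.Serializable;
    import java.util.ArrayList;
    import java.util.List;

    public class DataTransformer {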
    
        public static void main(String[] args) throws IOException {
            InputStream in = DataTransformer.class
                    .getResourceAsStream("source_data.tab");
            BufferedReader br = new BufferedReader(
                    new InputStreamReader(in));
            String line;
            TreeNode root = new TreeNode("ROOT", Integer.MIN_VALUE);
            TreeNode currentNode = root;
            while ((line = br.readLine()) != null) {
                int level = getLevel(line);
                String value = line.trim();
                TreeNode nextNode = new TreeNode(value, level);
                relateNextNode(currentNode, nextNode);
                currentNode = nextNode;
            }
            printAll(root);
        }
    
        public static int getLevel(String line) {
            final char TAB = '\t';
            int numberOfTabs = 0;
            for (int i = 0; i < line.length(); i++) {
                if (line.charAt(i) != TAB) {
                    break;
                }
                numberOfTabs++;
            }
            return numberOfTabs;
        }
    
        public static void relateNextNode(
                TreeNode currentNode, TreeNode nextNode) {
            if (currentNode.getLevel() < nextNode.getLevel()) {
                currentNode.addChild(nextNode);
            } else {
                relateNextNode(currentNode.getParent(), nextNode);
            }
        }
    
        public static void printAll(TreeNode node) {
            if (!node.isRoot() && !node.getParent().isRoot()) {
                System.out.println(node);
            }
            for (TreeNode childNode : node.getChildren()) {
                printAll(childNode);
            }
        }
    }
    
    class TreeNode implements Serializable {
    
        private static final long serialVersionUID = 1L;
    
        private TreeNode parent;
        private List<TreeNode> children = new ArrayList<TreeNode>();
        private String value;
        private int level;
    
        public TreeNode(String value, int level) {
            this.value = value;
            this.level = level;
        }
    
        public void addChild(TreeNode child) {
            child.parent = this;
            this.children.add(child);
        }
    
        public void addSibling(TreeNode sibling) {
            TreeNode parent = this.parent;
            parent.addChild(sibling);
        }
    
        public TreeNode getParent() {
            return parent;
        }
    
        public List<TreeNode> getChildren() {
            return children;
        }
    
        public String getValue() {
            return value;
        }
    
        public int getLevel() {
            return level;
        }
    
        public boolean isRoot() {
            return this.parent == null;
        }
    
        public String toString() {
            String str;
            if (this.parent != null) {
                str = this.parent.value + '\t' + this.value;
            } else {
                str = this.value;
            }
            return str;
        }
    }
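One thing to watch when choosing between the two answers: Solution 1 derives the level from leading spaces in groups of four, while getLevel in Solution 2 counts leading tabs, so each only works on input indented the way it expects. A small helper that accepts either form might look like the sketch below. The class name `IndentLevels` and the `spacesPerLevel` parameter are hypothetical, and treating mixed tab and space indentation as additive is an assumption.

```java
class IndentLevels {

    /**
     * Returns the depth of a line. Each leading tab counts as one column,
     * and every spacesPerLevel leading spaces count as one further column.
     */
    static int levelOf(String line, int spacesPerLevel) {
        int tabs = 0;
        int spaces = 0;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\t') {
                tabs++;
            } else if (c == ' ') {
                spaces++;
            } else {
                break; // first character of the actual value
            }
        }
        return tabs + spaces / spacesPerLevel;
    }

    public static void main(String[] args) {
        System.out.println(levelOf("\t\tHandsetS", 4));       // tab-indented line
        System.out.println(levelOf("        HandsetS", 4));   // space-indented line
    }
}
```

Both calls in main report the same depth for HandsetS, so either parsing loop could call this helper in place of its own level function.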
    

    【Comments】:
