【问题标题】:Spring data neo4j -- tx.close() lasts too longSpring data neo4j -- tx.close() 持续时间过长
【发布时间】:2014-11-27 03:36:30
【问题描述】:

我正在尝试将数据从关系数据库导入 neo4j。流程是这样的(稍微简化了一点):

while (!sarBatchService.isFinished()) {

        logger.info("New batch started");

        Date loadDeklFrom = sarBatchService.getStartDateForNewBatch();
        Date loadDeklTo = sarBatchService
                .getEndDateForNewBatch(loadDeklFrom);   // = loadDeklFrom + 2 hours

        logger.info("Dates calculated");

        Date startTime = new Date();

        List<Dek> deks = dekLoadManager
                .loadAllDeks(loadDeklFrom, loadDeklTo); // loading data from the relational database (POINT A)

        Date endLoadTime = new Date();

        logger.info("Deks loaded");

        GraphDatabase gdb = template.getGraphDatabase();
        Transaction tx = gdb.beginTx();

        logger.info("Transaction started!");

        try {

            for (Deks dek : deks) {

                /* transform dek into nodes, and save 
                this nodes with Neo4jTemplate.save() */
            }

            logger.info("Deks saved");

            Date endImportTime = new Date();

            int aff = sarBatchService.insertBatchData(loadDeklFrom,
                    loadDeklTo, startTime, endLoadTime, endImportTime,
                    deks.size()); // (POINT B)
            if (aff != 1) {
                String msg = "Something went wrong",

                throw new RuntimeException(msg);
            }

            logger.info("Batch data saved into relational database");

            tx.success();

            logger.info("Transaction marked as success.");
        } catch (NoSuchFieldException | SecurityException
                | IllegalArgumentException | IllegalAccessException
                | NoSuchMethodException | InstantiationException
                | InvocationTargetException e1) {

            logger.error("Something bad happend :(");
            logger.error(e1.getStackTrace().toString());
        } finally {
            logger.info("Closing transaction...");

            tx.close(); // (POINT C)

            logger.info("Transaction closed!");
            logger.info("Need more work? " + !sarBatchService.isFinished());
        }
    }

因此,关系数据库中的数据有一个时间戳,它指示它的存储时间,我以两小时两小时的时间间隔加载它(代码中的POINT A)。之后,我将遍历加载的数据,将其转换为节点(spring-data-neo4j 节点),存储在 neo4j 中并将有关当前批次的信息(POINT B)存储在关系数据库中。我正在记录几乎每一步以更轻松地进行调试。

程序成功完成 158 个批次。问题从第 159 批开始。程序在代码 (tx.close()) 中的 POINT C 处停止并在那里等待 4 小时(通常持续几秒钟)。之后继续正常。

我尝试在 10GB 堆大小和 4GB 堆大小的 tomcat 7 上运行它。结果是相同的(第 159 批的块)。一个事务的最大节点数在 10k 到 15k 之间(平均为 7k),第 159 批的节点数少于 10k。

有趣的是,如果数据以 4 x 4 小时或 12 x 12 小时加载,一切都会顺利进行。此外,如果我重新启动 Tomcat 或只执行第 159 批,一切都会顺利通过。

我正在使用 spring 3.2.8 和 spring-data-neo4j 3.0.2。

这是neo4j的message.log:

...
2014-11-24 15:21:38.080+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application    threads blocked for an additional 418ms [total block time: 150.973s]
2014-11-24 15:21:45.722+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 377ms [total block time: 151.35s]
...
2014-11-24 15:23:57.381+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 392ms [total block time: 156.593s]
2014-11-24 15:24:06.758+0000 INFO  [o.n.k.i.t.x.XaLogicalLog]: Rotating     [/home/pravila/data/neo4j/nioneo_logical.log.1] @ version=22 to   /home/pravila/data/neo4j/nioneo_logical.log.2 from position 26214444
2014-11-24 15:24:06.763+0000 INFO  [o.n.k.i.t.x.XaLogicalLog]: Rotate log first start entry @ pos=24149878 out of [339=Start[339,xid=GlobalId[NEOKERNL|5889317606667601380|364|-1],  BranchId[ 52 49 52 49 52 49 ],master=-1,me=-1,time=2014-11-24    15:23:13.021+0000/1416842593021,lastCommittedTxWhenTransactionStarted=267]]
2014-11-24 15:24:07.401+0000 INFO  [o.n.k.i.t.x.XaLogicalLog]: Rotate: old log scanned, newLog @ pos=2064582
2014-11-24 15:24:07.402+0000 INFO  [o.n.k.i.t.x.XaLogicalLog]: Log rotated, newLog @ pos=2064582, version 23 and last tx 267
2014-11-24 15:24:07.684+0000 INFO  [o.n.k.i.t.x.XaLogicalLog]: Rotating [/home/pravila/data/neo4j/index/lucene.log.1] @ version=6 to /home/pravila/data/neo4j/index/lucene.log.2 from position 26214408
2014-11-24 15:24:07.772+0000 INFO  [o.n.k.i.t.x.XaLogicalLog]: Rotate log first start entry @ pos=25902494 out of [134=Start[134,xid=GlobalId[NEOKERNL|5889317606667601380|364|-1], BranchId[ 49 54 50 51 55 52 ],master=-1,me=-1,time=2014-11-24     15:23:13.023+0000/1416842593023,lastCommittedTxWhenTransactionStarted=133]]
2014-11-24 15:24:07.871+0000 INFO  [o.n.k.i.t.x.XaLogicalLog]: Rotate: old log scanned, newLog @ pos=311930
2014-11-24 15:24:07.878+0000 INFO  [o.n.k.i.t.x.XaLogicalLog]: Log rotated, newLog @ pos=311930, version 7 and last tx 133
2014-11-24 15:24:10.919+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 214ms [total block time: 156.807s]
2014-11-24 15:24:17.486+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 405ms [total block time: 157.212s]
...
2014-11-24 15:25:28.692+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 195ms [total block time: 159.316s]
2014-11-24 15:25:33.238+0000 INFO  [o.n.k.i.t.x.XaLogicalLog]: Rotating [/home/pravila/data/neo4j/nioneo_logical.log.2] @ version=23 to /home/pravila/data/neo4j/nioneo_logical.log.1 from position 26214459
2014-11-24 15:25:33.242+0000 INFO  [o.n.k.i.t.x.XaLogicalLog]: Rotate log first start entry @ pos=24835943 out of [349=Start[349,xid=GlobalId[NEOKERNL|-6436474643536791121|374|-1], BranchId[ 52 49 52 49 52 49 ],master=-1,me=-1,time=2014-11-24 15:25:27.038+0000/1416842727038,lastCommittedTxWhenTransactionStarted=277]]
2014-11-24 15:25:33.761+0000 INFO  [o.n.k.i.t.x.XaLogicalLog]: Rotate: old log scanned, newLog @ pos=1378532
2014-11-24 15:25:33.763+0000 INFO  [o.n.k.i.t.x.XaLogicalLog]: Log rotated, newLog @ pos=1378532, version 24 and last tx 277
2014-11-24 15:25:37.031+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 148ms [total block time: 159.464s]
2014-11-24 15:25:45.891+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 153ms [total block time: 159.617s]
....    
2014-11-24 15:26:48.447+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 221ms [total block time: 161.641s]

我不知道这里发生了什么......

请帮忙。

【问题讨论】:

    标签: neo4j spring-data-neo4j


    【解决方案1】:

    看起来你有一个泄露的外部事务。

    这样您显示的内部事务实际上完成了,但外部事务继续累积状态。由于 Neo 不会暂停外部事务,而是纯粹嵌套它们,因此在您点击外部 tx.success(); tx.close();

    之前不会有真正的提交

    如果你在线程阻塞时进行线程转储,看看它是否真的卡在提交中,你应该会看到它。

    【讨论】:

    • 嗨迈克尔,tnx 为您的回复。我写的while循环基本上就是整个函数体(调度函数)。除了循环内的事务外,我没有明确打开任何其他事务。是否还有可能存在外部 neo4j 事务?另外,我从来没有读过线程转储。我执行了 kill -3 [tomcat pid],现在我有一个大文件,我现在不知道如何阅读 :) 我应该搜索什么?
    • 我尝试运行 4 x 4 小时批处理,但在第 145 笔交易时它也停止了。看起来程序在连续打开和关闭大量事务时遇到了困难。
    【解决方案2】:

    经过一个小时的搜索和测试后,我尝试以 4 x 4 小时的时间间隔重新运行整个批次。在第 145 批(交易)之后也停止了。不同之处在于它引发了错误(打开的文件太多)。我将打开文件的 ulimit 设置为无限制,现在它可以工作了。唯一的谜团是为什么程序第一次没有抛出错误。

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      相关资源
      最近更新 更多