[Posted at]: 2017-10-02 11:40:44
[Problem description]:
I wrote a program that does some data processing on a list of objects (up to 800). The work done on this list mainly consists of the following:
- a large number of SQL queries
- processing the queried data
- grouping and matching
- writing the results to a CSV file
Everything worked fine, but the data-processing workload and the size of the SQL data grow every day, and the program started running out of memory and crashing frequently.
To avoid this, I decided to split the big list into several smaller chunks and do the same work on each smaller list (clearing and nulling the current small list before moving on to the next one), hoping that would solve the problem. But it did not help at all; the program still runs out of memory.
The program does not run out of memory on the first iteration of the for loop, but around the second or third.
Am I clearing and nulling all the lists and objects in the for loop correctly, so that memory is freed for the next iteration?
How can I fix this? I have put my code below.
Any suggestions/solutions would be greatly appreciated.
Thanks in advance. Cheers!
List<someObject> unchoppedList = new ArrayList<someObject>();
for (String pb : listOfNames) {
someObject tccw = null;
tccw = new someObject(...);
unchoppedList.add(tccw);
}
Collections.shuffle(unchoppedList);
List<List<someObject>> master = null;
if (unchoppedList.size() > 0 && unchoppedList.size() <= 175) {
master = chopped(unchoppedList, 1);
} else if (unchoppedList.size() > 175 && unchoppedList.size() <= 355) {
master = chopped(unchoppedList, 2);
} else if (unchoppedList.size() > 355 && unchoppedList.size() <= 535) {
master = chopped(unchoppedList, 3);
} else if (unchoppedList.size() > 535 && unchoppedList.size() <= 800) {
master = chopped(unchoppedList, 4);
}
for (int i = 0 ; i < master.size() ; i++) {
List<someObject> m = master.get(i);
System.gc(); // I inserted this statement to force GC
executor1 = Executors.newFixedThreadPool(Configuration.getNumberOfProcessors());
generalList = new ArrayList<ProductBean>();
try {
m.parallelStream().forEach(work -> {
try {
generalList.addAll(executor1.submit(work).get());
work = null;
} catch (Exception e) {
logError(e);
}
});
} catch (Exception e) {
logError(e);
}
executor1.shutdown();
executor1.awaitTermination(30, TimeUnit.SECONDS);
m.clear();
m = null;
executor1 = null;
//once the general list is produced the program randomly matches some "good" products to highly similar "not-so-good" products
List<ProductBean> controlList = new ArrayList<ProductBean>();
List<ProductBean> tempKaseList = new ArrayList<ProductBean>();
for (ProductBean kase : generalList) {
if (kase.getGoodStatus() == 0 && kase.getBadStatus() == 1) {
controlList.add(kase);
} else if (kase.getGoodStatus() == 1 && kase.getBadStatus() == 0) {
tempKaseList.add(kase);
}
}
generalList = new ArrayList<ProductBean>(tempKaseList);
tempKaseList.clear();
tempKaseList = null;
Collections.shuffle(generalList);
Collections.shuffle(controlList);
final List<List<ProductBean>> compliCases = chopped(generalList, 3);
final List<List<ProductBean>> compliControls = chopped(controlList, 3);
generalList.clear();
controlList.clear();
generalList = null;
controlList = null;
final List<ProductBean> remainingCases = Collections.synchronizedList(new ArrayList<ProductBean>());
IntStream.range(0, compliCases.size()).parallel().forEach(i -> {
compliCases.get(i).forEach(c -> {
TheRandomMatchWorker tRMW = new TheRandomMatchWorker(compliControls.get(i), c);
List<String[]> reportData = tRMW.generateReport();
writeToCSVFile(reportData);
// if the program cannot find the required number of products to match, the case is added to a new list to look for matching candidates elsewhere
if (tRMW.getTheKase().isEverythingMathced == false) {
remainingCases.add(tRMW.getTheKase());
}
compliControls.get(i).removeAll(tRMW.getTheMatchedControls());
tRMW = null;
stuff.clear();
});
});
controlList = new ArrayList<ProductBean>();
for (List<ProductBean> c10 : compliControls) {
controlList.addAll(c10);
}
compliCases.clear();
compliControls.clear();
//last sweep where the program for last time tries to match some "good" products to highly similar "not-so-good" products
try {
for (ProductBean kase : remainingCases) {
if (kase.getNoOfContrls() < ccv.getNoofctrl()) {
TheRandomMatchWorker tRMW = new TheRandomMatchWorker(controlList, kase );
List<String[]> reportData = tRMW.generateReport();
writeToCSVFile(reportData);
if (tRMW.getTheKase().isEverythingMathced == false) {
remainingCases.add(tRMW.getTheKase());
}
compliControls.get(i).removeAll(tRMW.getTheMatchedControls());
tRMW = null;
stuff.clear();
}
}
} catch (Exception e) {
logError(e);
}
remainingCases.clear();
controlList.clear();
controlList = null;
master.get(i).clear();
master.set(i, null);
System.gc();
}
master.clear();
master = null;
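One likely source of trouble in the loop above is that `generalList.addAll(...)` is called from inside a parallel stream: `ArrayList` is not thread-safe, and wrapping `executor1.submit(work).get()` in `parallelStream()` only blocks fork-join threads while the pool does the work. A possible alternative (a sketch only, with a hypothetical `Callable<List<T>>` task type standing in for `someObject`, whose declaration is not shown above) is to hand everything to the executor via `invokeAll` and merge the results on a single thread:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CollectResults {
    // Runs all tasks on the pool, then merges their results sequentially,
    // so no two threads ever touch the result list at the same time.
    static <T> List<T> runAll(List<? extends Callable<List<T>>> tasks, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<T>>> futures = pool.invokeAll(tasks); // blocks until all tasks finish
            List<T> merged = new ArrayList<>();
            for (Future<List<T>> f : futures) {
                merged.addAll(f.get()); // single-threaded collection: no race on merged
            }
            return merged;
        } finally {
            pool.shutdown(); // lets the pool's threads die so the pool itself can be GC'd
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<List<Integer>>> tasks = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            final int n = i;
            tasks.add(() -> Arrays.asList(n, n * 10)); // dummy work for illustration
        }
        List<Integer> out = runAll(tasks, 2);
        System.out.println(out.size()); // 8
    }
}
```

This also removes the need for the `work = null;` assignments: once `invokeAll` returns and the futures go out of scope, the tasks become unreachable on their own.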
Here is the chopped method:
static <T> List<List<T>> chopped(List<T> list, final int L) {
List<List<T>> parts = new ArrayList<List<T>>();
final int N = list.size();
int y = N / L, m = 0, c = y;
int r = c * L;
for (int i = 1; i <= L; i++) {
if (i == L) {
c += (N - r);
}
parts.add(new ArrayList<T>(list.subList(m, c)));
m = c;
c += y;
}
return parts;
}
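For what it's worth, the chopped helper itself behaves as intended: it splits the list into L parts of roughly equal size, folds any remainder into the last part, and each part is an independent `ArrayList` copy (so clearing a part does not keep the source list alive). A quick standalone check of that behaviour, reproducing the method above:

```java
import java.util.ArrayList;
import java.util.List;

public class ChoppedCheck {
    // Same algorithm as the chopped method above: L parts, remainder in the last one.
    static <T> List<List<T>> chopped(List<T> list, final int L) {
        List<List<T>> parts = new ArrayList<>();
        final int N = list.size();
        int y = N / L, m = 0, c = y;
        int r = c * L;
        for (int i = 1; i <= L; i++) {
            if (i == L) {
                c += (N - r); // last part absorbs the remainder
            }
            parts.add(new ArrayList<>(list.subList(m, c))); // independent copy, not a subList view
            m = c;
            c += y;
        }
        return parts;
    }

    public static void main(String[] args) {
        List<Integer> src = new ArrayList<>();
        for (int i = 0; i < 10; i++) src.add(i);
        System.out.println(chopped(src, 3)); // [[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
    }
}
```

So the chopping is not where the memory is retained; the chunks themselves are fine.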
Here is the requested stack trace:
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at Controller.MasterStudyController.lambda$1(MasterStudyController.java:212)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.execLocalTasks(ForkJoinPool.java:1040)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1058)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.postgresql.core.Encoding.decode(Encoding.java:204)
at org.postgresql.core.Encoding.decode(Encoding.java:215)
at org.postgresql.jdbc.PgResultSet.getString(PgResultSet.java:1913)
at org.postgresql.jdbc.PgResultSet.getString(PgResultSet.java:2484)
at Controller.someObject.findControls(someObject.java:214)
at Controller.someObject.call(someObject.java:81)
at Controller.someObject.call(someObject.java:1)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[19:13:35][ERROR] Jarvis: Exception:
java.util.concurrent.ExecutionException: java.lang.AssertionError: Failed generating bytecode for <eval>:-1
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at Controller.MasterStudyController.lambda$1(MasterStudyController.java:212)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.execLocalTasks(ForkJoinPool.java:1040)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1058)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.AssertionError: Failed generating bytecode for <eval>:-1
at jdk.nashorn.internal.codegen.CompilationPhase$BytecodeGenerationPhase.transform(CompilationPhase.java:431)
at jdk.nashorn.internal.codegen.CompilationPhase.apply(CompilationPhase.java:624)
at jdk.nashorn.internal.codegen.Compiler.compile(Compiler.java:655)
at jdk.nashorn.internal.runtime.Context.compile(Context.java:1317)
at jdk.nashorn.internal.runtime.Context.compileScript(Context.java:1251)
at jdk.nashorn.internal.runtime.Context.compileScript(Context.java:627)
at jdk.nashorn.api.scripting.NashornScriptEngine.compileImpl(NashornScriptEngine.java:535)
at jdk.nashorn.api.scripting.NashornScriptEngine.compileImpl(NashornScriptEngine.java:524)
at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:402)
at jdk.nashorn.api.scripting.NashornScriptEngine.eval(NashornScriptEngine.java:155)
at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:264)
at Controller.someObject.findCases(someObject.java:108)
at Controller.someObject.call(someObject.java:72)
at Controller.someObject.call(someObject.java:1)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
[19:13:52][ERROR] Jarvis: Exception:
[19:51:41][ERROR] Jarvis: Exception:
org.postgresql.util.PSQLException: Ran out of memory retrieving query results.
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2157)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:300)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:428)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354)
at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169)
at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:117)
at Controller.someObject.lookForSomething(someObject.java:763)
at Controller.someObject.call(someObject.java:70)
at Controller.someObject.call(someObject.java:1)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
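The last trace ("Ran out of memory retrieving query results") is characteristic of the PostgreSQL JDBC driver, which by default buffers an entire result set in memory. The driver only streams rows through a cursor when autocommit is disabled and a fetch size is set on the statement. A sketch of that pattern follows; the URL, credentials, query and per-row handling are placeholders, not taken from the code above:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class StreamingQuery {
    static void streamRows(String url, String user, String pass) throws Exception {
        try (Connection conn = DriverManager.getConnection(url, user, pass)) {
            conn.setAutoCommit(false);     // required for cursor-based fetching in PgJDBC
            try (Statement st = conn.createStatement()) {
                st.setFetchSize(1000);     // fetch 1000 rows per round trip instead of all at once
                try (ResultSet rs = st.executeQuery("SELECT name FROM products")) { // placeholder query
                    while (rs.next()) {
                        String name = rs.getString(1);
                        // process one row at a time instead of materialising everything
                    }
                }
            }
            conn.commit();
        }
    }
}
```

If the queries in `someObject.findControls` / `lookForSomething` pull large result sets without a fetch size, this alone could explain OOMEs that strike at different points from run to run.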
[Discussion]:
-
Can you provide the stack trace?
-
How much memory did you give the JVM? Have you tried using VisualVM to look at the memory usage?
-
48GB is the size I allocated before starting this program. Yes, I have used VisualVM, but I couldn't identify anything from it.
-
The stack traces you gave do not seem to appear in the code above, so I don't think they help much. The given code is also a bit too large, yet still incomplete. You should reduce it to a minimal reproducible example showing that running just this code reproduces the OOME. Finally, there appear to be several multithreading problems, e.g. parallel additions to an unsynchronized ArrayList.
-
Didier L - I don't think adding to an unsynchronized ArrayList is the problem. Yes, the stack traces don't line up with the code, because the OOME happens somewhere in the middle of the loop. In other runs it happened in other places; it doesn't occur at the same spot every time. The biggest problem here is that memory is not being freed, even though I am clearing all the lists and nulling them.
Tags: java arraylist java-8 out-of-memory