MongoDB: How can I optimize the aggregation query with big data - Answers

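【Question title】: MongoDB: How can I optimize the aggregation query with big data
【Posted】: 2019-09-21 15:21:13
【Question】:

A MongoDB collection holds 30 million records. When the query is for a value that exists, it returns the result within 1 second. When the query is for a value that does not exist, it takes 40 to 45 seconds to return null or 0. Why does this happen?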

String cDate = dateFormat.format(date);
String pastTime = timeFormatMin.format(new Date(System.currentTimeMillis() - 3600 * 1000));
Criteria dateQuery = Criteria.where("Date").is(cDate);
Criteria timeQuery = Criteria.where("Time").gt(pastTime);
Criteria appQuery = Criteria.where("appID").is(appId).andOperator(Criteria.where("appID").exists(true));
Criteria criteria = new Criteria().andOperator(dateQuery, timeQuery, appQuery);

MatchOperation matchOperation = match(criteria);
GroupOperation groupOperation = group("appID").count().as("idcount");
ProjectionOperation projection = project()
            .andExpression("_id").as("appID")
            .andExpression("idcount").as("Count");
SortOperation sortOperation = sort(new Sort(Sort.Direction.DESC, "_id"));
LimitOperation limitOperation = limit(1);

Aggregation aggregation = newAggregation(matchOperation, sortOperation, limitOperation);
AggregationResults<CommonLogic> logResult = mongoTemplate.aggregate(aggregation, "commonLogic", CommonLogic.class);
List<CommonLogic> list = logResult.getMappedResults();

【Question comments】:

Tags: java mongodb spring-boot

【Answer 1】:

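Assuming the field 'appID' is indexed in the database (which helps aggregation pipelines a lot): for the data size mentioned, the query above is most likely spending its time in the sort operation. Try passing the aggregation option 'allowDiskUse: true'. Have a look at this SO question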
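As a rough illustration of that suggestion, the sketch below builds the raw `aggregate` command for the pipeline the question actually runs (match, sort, limit) with `allowDiskUse` set to true. It uses plain Java maps so it runs without a driver. The class name `AllowDiskUseSketch`, the method names, and the sample argument values are hypothetical. In Spring Data MongoDB the same option is usually set by passing `AggregationOptions.builder().allowDiskUse(true).build()` to `Aggregation.withOptions(...)`, assuming a version that ships that builder.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: shows the shape of the aggregate command only.
class AllowDiskUseSketch {

    // Wraps a stage body under its operator name, e.g. {"$limit": 1}.
    static Map<String, Object> stage(String operator, Object body) {
        Map<String, Object> stage = new LinkedHashMap<>();
        stage.put(operator, body);
        return stage;
    }

    // Builds {aggregate, pipeline, allowDiskUse, cursor} for the question's query.
    static Map<String, Object> buildAggregateCommand(
            String collection, String date, String pastTime, String appId) {
        // $match: same three conditions as the Criteria in the question.
        Map<String, Object> match = new LinkedHashMap<>();
        match.put("Date", date);
        match.put("Time", stage("$gt", pastTime));
        match.put("appID", appId);

        List<Map<String, Object>> pipeline = new ArrayList<>();
        pipeline.add(stage("$match", match));
        pipeline.add(stage("$sort", stage("_id", -1))); // descending by _id
        pipeline.add(stage("$limit", 1));

        Map<String, Object> command = new LinkedHashMap<>();
        command.put("aggregate", collection);
        command.put("pipeline", pipeline);
        // Lets memory-heavy stages such as $sort spill to temporary files.
        command.put("allowDiskUse", true);
        command.put("cursor", new LinkedHashMap<String, Object>());
        return command;
    }

    public static void main(String[] args) {
        // Sample values are placeholders, not data from the question.
        System.out.println(
            buildAggregateCommand("commonLogic", "2019-09-21", "14:21", "app-1"));
    }
}
```

Whether this removes the 40 to 45 second delay depends on where the time is actually spent; running the same pipeline with explain output is one way to check whether the sort is the bottleneck.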

【Comments】:

Thanks for your support. Yes, the field 'appID' is already indexed in the DB. OK, I will check...