【Question Title】: Cosmos DB pagination giving multiplied page records
【Posted】: 2021-10-27 07:31:54
【Question】:

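I have a scenario where I need to filter a collection based on elements present in an array inside each document. Can anyone suggest how to use OFFSET LIMIT with a nested array in the document?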

{
  "id": "abcd",
  "pqrs": 1,
  "xyz": "UNKNOWN_594",
  "arrayList": [
    {
      "Id": 2,
      "def": true
    },
    {
      "Id": 302,
      "def": true
    }
  ]
}

Now I need to filter the collection and fetch the records 10 at a time. I tried the following query:

SELECT * FROM collections c
WHERE ARRAY_CONTAINS(c.arrayList , {"Id":302 },true) or ARRAY_CONTAINS(c.arrayList , {"Id":2 },true)
ORDER BY c._ts DESC 
OFFSET 10 LIMIT 10

When I run this query, it returns 40 records.

【Discussion】:

    Tags: azure azure-cosmosdb azure-cosmosdb-sqlapi


    【Solution 1】:

    With OFFSET, the RU charge keeps increasing with every step to the next offset. You can use a ContinuationToken instead:

            private static async Task QueryWithPagingAsync(Uri collectionUri)
            {
                // The .NET client automatically iterates through all the pages of query results 
                // Developers can explicitly control paging by creating an IDocumentQueryable 
                // using the IQueryable object, then by reading the ResponseContinuationToken values 
                // and passing them back as RequestContinuationToken in FeedOptions.
    
                List<Family> families = new List<Family>();
    
                // tell server we only want 1 record
                FeedOptions options = new FeedOptions { MaxItemCount = 1, EnableCrossPartitionQuery = true };
    
                // using AsDocumentQuery you get access to whether or not the query HasMoreResults
                // If it does, just call ExecuteNextAsync until there are no more results
                // No need to supply a continuation token here as the server keeps track of progress
                var query = client.CreateDocumentQuery<Family>(collectionUri, options).AsDocumentQuery();
                while (query.HasMoreResults)
                {
                    foreach (Family family in await query.ExecuteNextAsync())
                    {
                        families.Add(family);
                    }
                }
    
                // The above sample works fine whilst in a loop as above, but 
                // what if you load a page of 1 record and then in a different 
                // Session at a later stage want to continue from where you were?
                // well, now you need to capture the continuation token 
                // and use it on subsequent queries
    
                query = client.CreateDocumentQuery<Family>(
                    collectionUri,
                    new FeedOptions { MaxItemCount = 1, EnableCrossPartitionQuery = true }).AsDocumentQuery();
    
                var feedResponse = await query.ExecuteNextAsync<Family>();
                string continuation = feedResponse.ResponseContinuation;
    
                foreach (var f in feedResponse.AsEnumerable().OrderBy(f => f.Id))
                {
                   
                }
    
                // Now the second time around use the continuation token you got
                // and start the process from that point
                query = client.CreateDocumentQuery<Family>(
                    collectionUri,
                    new FeedOptions
                    {
                        MaxItemCount = 1,
                        RequestContinuation = continuation,
                        EnableCrossPartitionQuery = true
                    }).AsDocumentQuery();
    
                feedResponse = await query.ExecuteNextAsync<Family>();
    
                foreach (var f in feedResponse.AsEnumerable().OrderBy(f => f.Id))
                {
                   
                }
            }
    

    To skip to a specific page, please find the code below:

    private static async Task QueryPageByPage(int currentPageNumber = 1, int documentNumber = 1)
        {
            // Number of documents per page
            const int PAGE_SIZE = 3; // configurable
    
          
    
            // Continuation token for subsequent queries (NULL for the very first request/page)
            string continuationToken = null;
    
            do
            {
                Console.WriteLine($"----- PAGE {currentPageNumber} -----");
    
                // Loads ALL documents for the current page
                KeyValuePair<string, IEnumerable<Family>> currentPage = await QueryDocumentsByPage(currentPageNumber, PAGE_SIZE, continuationToken);
    
                foreach (Family celeryTask in currentPage.Value)
                {
                   
                    documentNumber++;
                }
    
                // Ensure the continuation token is kept for the next page query execution
                continuationToken = currentPage.Key;
                currentPageNumber++;
            } while (continuationToken != null);
    
            Console.WriteLine("\n--- END: Finished Querying ALL Documents ---");
        }
    
    

    The QueryDocumentsByPage function is as follows:

        private static async Task<KeyValuePair<string, IEnumerable<Family>>> QueryDocumentsByPage(int pageNumber, int pageSize, string continuationToken)
        {
            DocumentClient documentClient = new DocumentClient(new Uri("https://{CosmosDB/SQL Account Name}.documents.azure.com:443/"), "{CosmosDB/SQL Account Key}");
    
            var feedOptions = new FeedOptions {
                MaxItemCount = pageSize,
                EnableCrossPartitionQuery = true,
    
                // IMPORTANT: Set the continuation token (NULL for the first ever request/page)
                RequestContinuation = continuationToken 
            };
    
            IQueryable<Family> filter = documentClient.CreateDocumentQuery<Family>("dbs/{Database Name}/colls/{Collection Name}", feedOptions);
            IDocumentQuery<Family> query = filter.AsDocumentQuery();
    
            // Fetch exactly one page of results
            FeedResponse<Family> feedResponse = await query.ExecuteNextAsync<Family>();
    
            List<Family> documents = new List<Family>();
            foreach (Family t in feedResponse)
            {
                documents.Add(t);
            }
    
            // IMPORTANT: Ensure the continuation token is kept for the next requests
            return new KeyValuePair<string, IEnumerable<Family>>(feedResponse.ResponseContinuation, documents);
        }
    
    

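    【Discussion】:

    • When we use a continuation token, suppose we need to go to page 10. With this approach we would then need to run the query 10 times. Please correct me if I'm wrong.
    • No, I have edited the answer and added another piece of code just for skipping to a specific page.
    • I assume QueryDocumentsByPage is the name of the first method. Am I right?
    • Sorry, I missed adding the QueryDocumentsByPage function. It is added now.
    • The do { ... } while () goes into an infinite loop. Please advise.
    【Solution 2】: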

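    Are you actually receiving 40 items in the result? Or are you getting 10 documents back while your Cosmos DB container holds 40 documents that match this query?

    With an ORDER BY clause, the query retrieves all matching documents, sorts them in the database, and only then applies the OFFSET and LIMIT values to deliver the final result.

    I have illustrated this with the snapshot below.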

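    • My Cosmos account has 14 documents that match the query, which is what the Retrieved Document Count shows.
    • The Output Document Count is 10, because the database has to skip the first 5 and then deliver the next 5.
    • But my actual result contains only 5 documents, because that is what I asked for.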
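
The three counts above can be reproduced with a small in-memory model. This is only a sketch of the behaviour described in this answer, not the Cosmos DB engine itself: `OffsetLimitModel` and `Run` are hypothetical names, and the timestamps stand in for `c._ts`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OffsetLimitModel
{
    // Hypothetical model of ORDER BY ... OFFSET ... LIMIT as described above.
    // Returns how many documents were retrieved, output, and finally returned.
    public static (int Retrieved, int Output, int Returned) Run(
        IReadOnlyList<long> matchingTimestamps, int offset, int limit)
    {
        // 1. Every document that matches the WHERE clause is retrieved.
        int retrieved = matchingTimestamps.Count;

        // 2. The matches are sorted, newest first (like ORDER BY c._ts DESC).
        List<long> sorted = matchingTimestamps.OrderByDescending(ts => ts).ToList();

        // 3. offset + limit documents have to be produced ...
        List<long> output = sorted.Take(offset + limit).ToList();

        // 4. ... but only the last `limit` of them reach the caller.
        List<long> returned = output.Skip(offset).ToList();

        return (retrieved, output.Count, returned.Count);
    }
}
```

With 14 matching documents, `OffsetLimitModel.Run(timestamps, offset: 5, limit: 5)` gives `(14, 10, 5)`, the same three numbers as in the list above. In this model the skipped documents are still processed, so the work grows with the offset even though the page size stays the same.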

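    Continuation tokens work well for pagination, but they have a limitation: you cannot use them to jump straight to a page (for example from page 1 to page 10). You have to walk the pages starting from the first document and keep using each token to move to the next page. Because of this limitation, they are generally recommended when a single query returns a large number of documents.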
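
To make the "walk the pages" step concrete, here is a minimal sketch that reaches page N by following continuation tokens one page at a time. `TokenPager`, `GetPageAsync`, and the `fetchPage` delegate are hypothetical names introduced for illustration. In a real application `fetchPage` would presumably wrap a single `ExecuteNextAsync` call with `RequestContinuation` set to the incoming token, much like `QueryDocumentsByPage` in Solution 1.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class TokenPager
{
    // fetchPage (hypothetical): takes a continuation token (null for the first page)
    // and returns that page's items plus the token for the next page
    // (null when there are no more results).
    public static async Task<IReadOnlyList<T>> GetPageAsync<T>(
        Func<string, Task<(IReadOnlyList<T> Items, string Continuation)>> fetchPage,
        int pageNumber)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException(nameof(pageNumber));

        string token = null;
        for (int current = 1; ; current++)
        {
            // One round trip per page: there is no way to skip ahead.
            var (items, next) = await fetchPage(token);

            if (current == pageNumber) return items;   // reached the requested page
            if (next == null) return Array.Empty<T>(); // ran out of pages first
            token = next;
        }
    }
}
```

Reaching page 10 this way costs 10 round trips, which is the trade-off raised in the comments on Solution 1. Caching the tokens of pages already visited would avoid repeating the walk for later requests, assuming the tokens remain valid for the application's purposes.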

    Another suggestion is to use indexes when using ORDER BY, to reduce RU consumption. See this link.

    【Discussion】:
