【Question title】: Bing Search API v7 Pagination
【Posted】: 2020-04-07 11:27:28
【Question description】:

I am working on a Bing News API v7 integration. More precisely, I use the https://api.cognitive.microsoft.com/bing/v7.0/news/search API endpoint.

I have run into some "unexpected" paging behavior. (The expected behavior is that every page has a constant size.)

The documentation page "How to page through results" explains how to page.

I follow that approach. I use 30 as the page size, so the offset values are 0, 30, 60, and so on.

For example, with these parameters: query "Java 14", market "en-US", sorted by date, and offset values of 0, 30, 60, 90, 120, 150 (/bing/v7.0/news/search?q=Java 14&count=30&offset=0&mkt=en-US&sortBy=date).
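The requests described above can be sketched as follows. This is only an illustration of the fixed-offset scheme from the question: the helper name `build_page_url` is hypothetical, and the parameter values are the ones quoted in the example.

```python
from urllib.parse import urlencode

# Endpoint path as quoted in the question.
NEWS_SEARCH_PATH = "/bing/v7.0/news/search"


def build_page_url(query, page, page_size=30, market="en-US"):
    """Return the request path for a zero-based page using a fixed-size offset.

    Hypothetical helper: offset is simply page * page_size.
    """
    params = {
        "q": query,
        "count": page_size,
        "offset": page * page_size,
        "mkt": market,
        "sortBy": "date",
    }
    # urlencode escapes the space in the query as "+".
    return NEWS_SEARCH_PATH + "?" + urlencode(params)


if __name__ == "__main__":
    # Offsets 0, 30, 60, 90, 120, 150, as in the question.
    for page in range(6):
        print(build_page_url("Java 14", page))
```

Keeping the offset a pure function of the page index makes it easy to compare against an alternative scheme (such as advancing by the number of results actually returned) without touching the rest of the request.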

I get six pages of results, each containing fewer than 30 URLs.

Page: 0 Total: 27 results
Page: 1 Total: 26 results
Page: 2 Total: 26 results
Page: 3 Total: 29 results
Page: 4 Total: 29 results
Page: 5 Total: 7 results
...

The related question "What's the expected behavior of the Bing Search API v5 when deeply paginating?" concerns Bing API v5. There, the offset values do not follow a fixed-size sequence but the formula previous result size + 1.

So my question is: which value should I use as the offset for the second page (Page: 1)? Is it 28 or 30? And for the third page (Page: 2), is it 54 or 60?

【Question discussion】:

    Tags: bing bing-api bing-news-search-api


    【Solution 1】:

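    Make a first call to the API to determine totalEstimatedMatches. Divide totalEstimatedMatches by 25 (or whatever page size you use) to get the number of API calls to make. For example, if totalEstimatedMatches = 100, make 4 API calls, each of which should return 25 URLs. I play it safe and reduce the count by 1, but you could wrap the calls in a try/catch instead. s.Count in this example is 25. The solution is in VB.Net, but you get the idea.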

            'the secret key 
            Dim accessKey As String = "xxxxxxxxxxxxxxxxxxxxxxxxx"
            Dim endpoint As String = "https://api.cognitive.microsoft.com/bing/v7.0/news/search?"
    
            Dim queryString = HttpUtility.ParseQueryString(String.Empty)
            queryString("q") = search_criteria 'Uri.EscapeDataString(search_criteria)
            queryString("mkt") = market
            queryString("count") = "25"
            queryString("offset") = "0"
            queryString("freshness") = freshness
            queryString("SafeSearch") = "strict"
    
            ' Construct the URI of the search request
            uriQuery = endpoint & queryString.ToString
    
            ' Perform the Web request and get the response
            request = HttpWebRequest.Create(uriQuery)
            request.Headers.Add("Ocp-Apim-Subscription-Key", accessKey)
    
            response = CType(request.GetResponseAsync.Result, HttpWebResponse)
            json = (New StreamReader(response.GetResponseStream)).ReadToEnd
    
            'create json object
            Dim converter = New ExpandoObjectConverter()
            Dim message As Object = JsonConvert.DeserializeObject(Of ExpandoObject)(json, converter)
    
            'get top level object and its sub objects
            s = message.value
    
            Try
                totalEstimatedMatches = CInt(message.totalEstimatedMatches)
                total_available_for_processing = s.Count
            Catch ex As Exception
            End Try
    
            'get total number of pages available at 25 records per page, so we page through 25 records per api call
            'integer division (\) rounds down, so a partial last page is skipped
            Dim page_count As Integer = totalEstimatedMatches \ 25
    
            'loop through the pages, making one api call per page
            For p As Integer = 0 To page_count - 1
    
                'offset is p * 25 (0 for the first page); count stays at 25
                queryString("count") = "25"
                queryString("offset") = (p * 25).ToString()
    
                ' Construct the URI of the search request
                uriQuery = endpoint & queryString.ToString
    
                ' Perform the Web request and get the response
                request = HttpWebRequest.Create(uriQuery)
                request.Headers.Add("Ocp-Apim-Subscription-Key", accessKey)
    
                response = CType(request.GetResponseAsync.Result, HttpWebResponse)
                json = (New StreamReader(response.GetResponseStream)).ReadToEnd
    
                'create json object
                message = JsonConvert.DeserializeObject(Of ExpandoObject)(json, converter)
    
                'get top level object and its sub objects
                s = message.value
    
                For i As Integer = 0 To s.Count - 1
    
                    Dim myuri As Uri = New Uri(s(i).url.ToString)
                    Dim vendor_domain As String = myuri.Host
    
                    System.Diagnostics.Debug.WriteLine(icount & "," & myuri.ToString & "," & vendor_domain)
                    icount = icount + 1
                Next
                System.Threading.Thread.Sleep(100)
    
            Next
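
The paging arithmetic above can also be sketched in Python. This is a minimal sketch under stated assumptions, not the answer's code: `page_offsets`, `collect_urls`, and the `fetch_page(offset, count)` callable are hypothetical names, and the response is assumed to be a dict with the `totalEstimatedMatches` and `value` fields used in the VB.Net snippet. Unlike the snippet, it rounds the page count up, reuses the first response instead of requesting offset 0 twice, and stops early if a page comes back empty.

```python
import math


def page_offsets(total_estimated_matches, page_size=25):
    """Return the offset of every call needed to cover the estimated matches."""
    page_count = math.ceil(total_estimated_matches / page_size)
    return [page * page_size for page in range(page_count)]


def collect_urls(fetch_page, page_size=25):
    """Collect result URLs using a caller-supplied fetch_page(offset, count).

    fetch_page is a hypothetical callable that performs the HTTP request and
    returns the decoded JSON body as a dict.
    """
    first = fetch_page(0, page_size)
    urls = [item["url"] for item in first.get("value", [])]
    total = first.get("totalEstimatedMatches", 0)

    # Skip offset 0: the first response has already been consumed.
    for offset in page_offsets(total, page_size)[1:]:
        page = fetch_page(offset, page_size)
        items = page.get("value", [])
        if not items:
            # The total is only an estimate, so stop when a page is empty.
            break
        urls.extend(item["url"] for item in items)
    return urls
```

Passing the fetch function in keeps the paging logic separate from the HTTP client, so the same loop can be exercised against canned responses before it is pointed at the real endpoint.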
    

    【Discussion】: