Query on all contiguous subarrays of size K

【Question title】: Query on all contiguous subarrays of size K
【Posted】: 2017-08-06 06:50:05
【Question】:

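Define an operation on an array B[1:K] of size K: count the elements of the subarray B[2:K] that are smaller than B[1].

Now I have an array A[1:N] of size N, and my goal is to perform this operation on every contiguous subarray of size K.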

Example

For A = [4, 3, 6, 2, 1] and K = 3, there are 3 contiguous subarrays of size 3.

B = [4, 3, 6], count = 1 [(3 < 4)]
B = [3, 6, 2], count = 1 [(2 < 3)]
B = [6, 2, 1], count = 2 [(2 < 6), (1 < 6)]

The brute-force approach takes O((N-K+1)*K) time, because performing the operation on a single contiguous subarray of size K is O(K).
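As a baseline, the brute-force approach can be sketched as follows. The function name `countSmallerBrute` and the 0-based indexing are choices made for this illustration, not something from the question.

```cpp
#include <cstddef>
#include <vector>

// For every window a[s .. s+k-1], count the elements after the first one
// that are strictly smaller than the first one.
// Runs in O((N-K+1)*K) time overall.
std::vector<int> countSmallerBrute(const std::vector<int>& a, std::size_t k) {
    std::vector<int> result;
    if (k == 0 || a.size() < k) return result;  // no window of size k exists
    for (std::size_t s = 0; s + k <= a.size(); ++s) {
        int count = 0;
        for (std::size_t j = s + 1; j < s + k; ++j) {
            if (a[j] < a[s]) ++count;
        }
        result.push_back(count);
    }
    return result;
}
```

For the example above, `countSmallerBrute({4, 3, 6, 2, 1}, 3)` returns `{1, 1, 2}`.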

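I could do this efficiently, in O(N log M), if I could design a data structure with the following properties:

Insertion in O(log M)
Deletion in O(log M)
Counting the elements smaller than X in O(log M)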

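I am a C++ user, and I don't think any existing data structure satisfies all of these requirements. Is there another way to improve on this? Please help.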

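【Question comments】:

The first two are covered by std::set, but the last operation will be O(M), even though finding the upper bound is itself O(log M).
If your goal is only to count, an algorithm that runs in O(n log n) comes to mind.
@StoryTeller Yes, I know.
@marvel308 Yes, counting is enough.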
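The first comment's point can be illustrated with a small sketch. It assumes `std::multiset` rather than `std::set` so that repeated values are kept, and it uses `lower_bound` because the question asks for strictly smaller elements. The helper name `countLessLinear` is made up for this example.

```cpp
#include <cstddef>
#include <iterator>
#include <set>

// Locating the boundary is O(log M), but std::multiset iterators are only
// bidirectional, so std::distance has to walk the elements one by one: O(M).
std::size_t countLessLinear(const std::multiset<int>& s, int x) {
    auto boundary = s.lower_bound(x);  // first element that is not less than x
    return static_cast<std::size_t>(std::distance(s.begin(), boundary));
}
```

Insertion and deletion stay logarithmic, so only this counting step prevents the O(N log M) total.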

Tags: c++ algorithm data-structures

【Solution 1】:

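You may want to use a set with an additional operation that counts the elements smaller than a given value. It can be implemented as a binary search tree (the classic set implementation) in which every node carries an extra statistic: basically the size of the subtree rooted at that node.

More details: https://stackoverflow.com/a/15321444/1391392 and some implementations here: https://sourceforge.net/projects/orderstatistics/

Another option, which may look more straightforward, is a skip list: https://en.wikipedia.org/wiki/Skip_list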
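One readily available size-augmented tree is the order-statistics tree in GCC's policy-based data structures extension. This is a GCC/libstdc++ extension rather than standard C++, and it is not the library linked above, so treat the following as a sketch under that assumption. The names `Key`, `OrderedSet` and `countSmallerTree` are invented here, and keys are (value, index) pairs so that repeated values stay distinct.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

// Red-black tree whose nodes also track subtree sizes (GCC-specific).
using Key = std::pair<int, int>;  // (value, index in the array)
using OrderedSet = __gnu_pbds::tree<Key, __gnu_pbds::null_type, std::less<Key>,
                                    __gnu_pbds::rb_tree_tag,
                                    __gnu_pbds::tree_order_statistics_node_update>;

std::vector<int> countSmallerTree(const std::vector<int>& a, std::size_t k) {
    std::vector<int> result;
    const std::size_t n = a.size();
    if (k == 0 || n < k) return result;  // no window of size k exists
    OrderedSet window;                   // holds a[s+1 .. s+k-1] for the current s
    for (std::size_t j = 1; j < k; ++j) window.insert(Key(a[j], static_cast<int>(j)));
    for (std::size_t s = 0; s + k <= n; ++s) {
        // Keys below (a[s], -1) are exactly the ones whose value is < a[s].
        result.push_back(static_cast<int>(window.order_of_key(Key(a[s], -1))));
        if (s + k < n) {
            // Slide: a[s+k] enters, a[s+1] becomes the next window's first element.
            window.insert(Key(a[s + k], static_cast<int>(s + k)));
            window.erase(Key(a[s + 1], static_cast<int>(s + 1)));
        }
    }
    return result;
}
```

Each window costs one `order_of_key`, one `insert` and one `erase`, all logarithmic in the window size, which gives the O(N log K) total the question is after.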

【Comments】:

【Solution 2】:

Does this help?

#include <cstdio>

// Values are assumed to lie in [0, 100004], and n is assumed to be at most 100005.
int bit[100006] = {0};
int a[100005];

// using BIT since numbers can repeat and set won't work
// add val to the count kept for value idx (1-based); n is the largest index in use
void update(int idx, int val, int n){
    while(idx <= n){
        bit[idx] += val;
        idx += (idx & -idx);
    }
}
// number of stored elements whose value is <= idx
int get(int idx){
    int ret = 0;
    while(idx > 0){
        ret += bit[idx];
        idx -= (idx & -idx);
    }
    return ret;
}
int main() {
    int n, k, i, maxx = 0;
    long long ans = 0;
    if(scanf("%d%d", &n, &k) != 2) return 1;
    for(i=0; i<n; i++){
        if(scanf("%d", &a[i]) != 1) return 1;
        a[i]++;   // shift values to [1, 100005], since BIT indices start at 1
        if(maxx < a[i]){
            maxx = a[i];
        }
    }

    // the BIT starts with a[1..k-1], everything in the first window except its first element
    for(i=1; i<k && i<n; i++){
        update(a[i], 1, maxx);
    }

    for(i=0; i+k<=n; i++){
        // the BIT holds a[i+1..i+k-1]; count the elements strictly less than a[i]
        int cnt = get(a[i]-1);
        printf("%d ", cnt);
        ans += cnt;
        if(i+k < n){
            // slide the window: a[i+k] enters, a[i+1] becomes the next first element
            update(a[i+k], 1, maxx);
            update(a[i+1], -1, maxx);
        }
    }
    // per-window counts are printed above; the total over all windows follows
    printf("\n%lld\n", ans);
    return 0;
}

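【Comments】:

This code assumes that all elements of the array lie in the range [0, 100004], which may not hold. It can be adapted to arbitrary element values by mapping all numbers into the range [0, N - 1] before running the algorithm.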
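The mapping mentioned in this comment is usually called coordinate compression. A minimal sketch, assuming the input fits in `long long` (the function name `compress` is made up for this example):

```cpp
#include <algorithm>
#include <vector>

// Replace each value by its 0-based rank among the distinct values.
// Relative order, including ties, is preserved, so every "less than"
// comparison gives the same result as on the original values.
std::vector<int> compress(const std::vector<long long>& a) {
    std::vector<long long> sorted(a);
    std::sort(sorted.begin(), sorted.end());
    sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());
    std::vector<int> ranks;
    ranks.reserve(a.size());
    for (long long v : a) {
        auto pos = std::lower_bound(sorted.begin(), sorted.end(), v);
        ranks.push_back(static_cast<int>(pos - sorted.begin()));
    }
    return ranks;
}
```

The ranks lie in [0, N - 1], so a BIT of size N suffices after shifting them by one. The compression itself costs O(N log N).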