【Question title】: Algorithm for finding the maximum difference in an array of numbers
【Posted】: 2010-09-13 23:07:03
【Question】:

I have an array of a few million numbers.

double* const data = new double[3600000];

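I need to iterate through the array and find the range (the largest value in the array minus the smallest value). However, there is a catch: I only want to find the range where the smallest and largest values are within 1,000 samples of each other.

So I need to find the maximum of: range(data + 0, data + 1000), range(data + 1, data + 1001), range(data + 2, data + 1002), ...., range(data + 3599000, data + 3600000).

I hope that makes sense. Basically I could do it like the above, but I'm looking for a more efficient algorithm if one exists. I think the above algorithm is O(n), but I feel that it can be optimized. An idea I'm playing with is to keep track of the most recent maximum and minimum and how far back they are, and then only backtrack when necessary.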
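
For reference, the straightforward approach described above might look like the following minimal sketch. The function name `max_window_range_naive` and the use of `std::vector` are assumptions made for illustration; the sketch scans every window in full, once per start position.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Brute force (hypothetical helper): for every run of `window` consecutive
// samples, find its minimum and maximum and keep the largest (max - min).
double max_window_range_naive(const std::vector<double>& data, std::size_t window)
{
    assert(window > 0 && window <= data.size());
    double best = 0.0;
    for (std::size_t start = 0; start + window <= data.size(); ++start)
    {
        const auto first = data.begin() + start;
        // One full scan of the window per start position.
        const auto mm = std::minmax_element(first, first + window);
        best = std::max(best, *mm.second - *mm.first);
    }
    return best;
}
```

Calling `max_window_range_naive(samples, 1000)` on the 3,600,000-sample array would give the number being asked for, at the cost of one 1,000-element scan per window.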

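I'll be coding this in C++, but a nice algorithm in pseudocode would be just fine. Also, if this number I'm trying to find has a name, I'd love to know what it is.

Thanks.

【Comments】:

  • More properly, your algorithm is O(n*m), where m is the size of the range you're looking at.

Tags: c++ algorithm statistics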


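【Answer 1】:

This type of question belongs to a branch of algorithms called streaming algorithms. It is the study of problems that require not only an O(n) solution but also need to work in a single pass over the data. The data is fed to the algorithm as a stream; the algorithm can't save all of the data, and once an item has gone by it is lost for good. The algorithm needs to produce some answer about the data, such as the minimum or the median.

Specifically, you are looking for the maximum (or, more commonly in the literature, the minimum) in a window over a stream.

Here's a presentation on an article that mentions this problem as a sub-problem of what the authors are trying to solve. It might give you some ideas.

I think the outline of the solution is something like this: a window is maintained over the stream, and in each step one element is inserted into the window and one is removed from the other side (a sliding window). The items you actually keep in memory aren't all of the 1,000 items in the window but a selection of representatives that are good candidates for being the minimum (or maximum).

Read the article. It's a bit complex, but after 2-3 reads you can get the hang of it.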
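
One common way to keep only "candidate" elements, as outlined above, is a monotonic deque. The sketch below is an assumption about how that idea might look in C++ and is not taken from the article; `window_ranges` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Returns max - min for every full window of `window` consecutive samples.
// Each deque holds indices of candidate samples only: a sample is dropped as
// soon as a newer sample makes it impossible for it to be the max (or min).
std::vector<double> window_ranges(const std::vector<double>& data, std::size_t window)
{
    assert(window > 0);
    std::vector<double> ranges;
    std::deque<std::size_t> max_candidates;  // values decrease front to back
    std::deque<std::size_t> min_candidates;  // values increase front to back

    for (std::size_t i = 0; i < data.size(); ++i)
    {
        // Older samples that are no larger (no smaller) than data[i] can never
        // be the window maximum (minimum) again, so discard them.
        while (!max_candidates.empty() && data[max_candidates.back()] <= data[i])
            max_candidates.pop_back();
        while (!min_candidates.empty() && data[min_candidates.back()] >= data[i])
            min_candidates.pop_back();
        max_candidates.push_back(i);
        min_candidates.push_back(i);

        // Drop a front candidate that has slid out of [i - window + 1, i].
        if (max_candidates.front() + window <= i) max_candidates.pop_front();
        if (min_candidates.front() + window <= i) min_candidates.pop_front();

        // Once the first full window has been seen, record its range.
        if (i + 1 >= window)
            ranges.push_back(data[max_candidates.front()] - data[min_candidates.front()]);
    }
    return ranges;
}
```

Every index is pushed and popped at most once per deque, so the total work is linear in the number of samples; the largest entry of the returned vector is the value the question asks for.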

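【Comments】:

    【Answer 2】:

    The idea of the algorithm:

    Take the first 1,000 values of data and sort them.
    The last in the sort minus the first is range(data + 0, data + 999).
    Then remove from the sorted pile the first element, with the value data[0],
    and add the element data[1000].
    Now, the last in the sort minus the first is range(data + 1, data + 1000).
    Repeat until done.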

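    // This should run in O((DATA_LEN - RANGE_WIDTH) * log(RANGE_WIDTH))
    #include <set>
    #include <algorithm>
    using namespace std;
    
    const int DATA_LEN = 3600000;
    double* const data = new double[DATA_LEN];
    
    ....
    ....
    
    const int RANGE_WIDTH = 1000;
    const int RANGE_COUNT = DATA_LEN - RANGE_WIDTH + 1;  // number of windows
    double* const range = new double[RANGE_COUNT];
    // Seed the set with the first window, data[0] .. data[RANGE_WIDTH - 1].
    multiset<double> data_set(data, data + RANGE_WIDTH);
    
    int i;
    for (i = 0 ; i < RANGE_COUNT - 1 ; i++)
    {
       // Largest minus smallest; end() is past-the-end and must not be dereferenced.
       range[i] = *data_set.rbegin() - *data_set.begin();
       // Erase exactly one copy of the value that is leaving the window.
       multiset<double>::iterator iter = data_set.find(data[i]);
       data_set.erase(iter);
       // Add the value that is entering the window.
       data_set.insert(data[i + RANGE_WIDTH]);
    }
    range[i] = *data_set.rbegin() - *data_set.begin();
    
    // range now holds the values you seek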
    

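    You should probably check this for off-by-one errors, but the idea is there.

    【Comments】:

      【Answer 3】:

      I decided to see what the most efficient algorithm I could think of to solve this problem was, using actual code and actual timings. I first created a simple solution, one that tracks the min/max for the previous n entries using a circular buffer, and a test harness to measure the speed. In the simple solution, each data value is compared against the set of min/max values, so that's about window_size * count tests (where the window size in the original question is 1000 and count is 3600000).

      I then thought about how to make it faster. First off, I created a solution that used a FIFO queue to store window_size values and a linked list to store the values in ascending order, where each node in the linked list was also a node in the queue. To process a data value, the item at the end of the FIFO was removed from the linked list and the queue. The new value was added to the start of the queue, and a linear search was used to find its position in the linked list. The min and max values could then be read from the start and end of the linked list. This was quick, but wouldn't scale well with increasing window_size (i.e. linearly).

      So I decided to add a binary tree to the system to try to speed up the search part of the algorithm. The final timings (in clock ticks) for window_size = 1000 and count = 3600000 were: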

      Simple: 106875
      Quite Complex: 1218
      Complex: 1219
      

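      This was both expected and unexpected. Expected in that using a sorted linked list helped; unexpected in that the overhead of having a self-balancing tree didn't offset the advantage of a quicker search. I tried the latter two with an increased window size and found that they were always nearly identical up to a window_size of 100000.

      Which all goes to show that theorising about algorithms is one thing, implementing them is something else.

      Anyway, for those interested, here's the code I wrote (there's quite a bit!):

      Range.h: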

      #include <algorithm>
      #include <cstdlib>   //  rand, srand
      #include <iostream>
      #include <ctime>
      
      using namespace std;
      
      //  Callback types.
      typedef void (*OutputCallback) (int min, int max);
      typedef int (*GeneratorCallback) ();
      
      //  Declarations of the test functions.
      clock_t Simple (int, int, GeneratorCallback, OutputCallback);
      clock_t QuiteComplex (int, int, GeneratorCallback, OutputCallback);
      clock_t Complex (int, int, GeneratorCallback, OutputCallback);
      

      main.cpp:

      #include "Range.h"
      
      int
        checksum;
      
      //  This callback is used to get data.
      int CreateData ()
      {
        return rand ();
      }
      
      //  This callback is used to output the results.
      void OutputResults (int min, int max)
      {
        //cout << min << " - " << max << endl;
        checksum += max - min;
      }
      
      //  The program entry point.
      int main ()
      {
        int
          count = 3600000,
          window = 1000;
      
        srand (0);
        checksum = 0;
        std::cout << "Simple: Ticks = " << Simple (count, window, CreateData, OutputResults) << ", checksum = " << checksum << std::endl;
        srand (0);
        checksum = 0;
        std::cout << "Quite Complex: Ticks = " << QuiteComplex (count, window, CreateData, OutputResults) << ", checksum = " << checksum << std::endl;
        srand (0);
        checksum = 0;
        std::cout << "Complex: Ticks = " << Complex (count, window, CreateData, OutputResults) << ", checksum = " << checksum << std::endl;
      }
      

      Simple.cpp:

      #include "Range.h"
      
      //  Function to actually process the data.
      //  A circular buffer of min/max values for the current window is filled
      //  and once full, the oldest min/max pair is sent to the output callback
      //  and replaced with the newest input value. Each value inputted is 
      //  compared against all min/max pairs.
      void ProcessData
      (
        int count,
        int window,
        GeneratorCallback input,
        OutputCallback output,
        int *min_buffer,
        int *max_buffer
      )
      {
        int
          i;
      
        for (i = 0 ; i < window ; ++i)
        {
          int
            value = input ();
      
          min_buffer [i] = max_buffer [i] = value;
      
          for (int j = 0 ; j < i ; ++j)
          {
            min_buffer [j] = min (min_buffer [j], value);
            max_buffer [j] = max (max_buffer [j], value);
          }
        }
      
        for ( ; i < count ; ++i)
        {
          int
            index = i % window;
      
          output (min_buffer [index], max_buffer [index]);
      
          int
            value = input ();
      
          min_buffer [index] = max_buffer [index] = value;
      
          for (int k = (i + 1) % window ; k != index ; k = (k + 1) % window)
          {
            min_buffer [k] = min (min_buffer [k], value);
            max_buffer [k] = max (max_buffer [k], value);
          }
        }
      
        output (min_buffer [count % window], max_buffer [count % window]);
      }
      
      //  A simple method of calculating the results.
      //  Memory management is done here outside of the timing portion.
      clock_t Simple
      (
        int count,
        int window,
        GeneratorCallback input,
        OutputCallback output
      )
      {
        int
          *min_buffer = new int [window],
          *max_buffer = new int [window];
      
        clock_t
          start = clock ();
      
        ProcessData (count, window, input, output, min_buffer, max_buffer);
      
        clock_t
          end = clock ();
      
        delete [] max_buffer;
        delete [] min_buffer;
      
        return end - start;
      }
      

      QuiteComplex.cpp:

      #include "Range.h"
      
      template <class T>
      class Range
      {
      private:
        //  Class Types
      
        //  Node Data
        //  Stores a value and its position in various lists.
        struct Node
        {
          Node
            *m_queue_next,
            *m_list_greater,
            *m_list_lower;
      
          T
            m_value;
        };
      
      public:
        //  Constructor
        //  Allocates memory for the node data and adds all the allocated
        //  nodes to the unused/free list of nodes.
        Range
        (
          int window_size
        ) :
          m_nodes (new Node [window_size]),
          m_queue_tail (m_nodes),
          m_queue_head (0),
          m_list_min (0),
          m_list_max (0),
          m_free_list (m_nodes)
        {
          for (int i = 0 ; i < window_size - 1 ; ++i)
          {
            m_nodes [i].m_list_lower = &m_nodes [i + 1];
          }
      
          m_nodes [window_size - 1].m_list_lower = 0;
        }
      
        //  Destructor
        //  Tidy up allocated data.
        ~Range ()
        {
          delete [] m_nodes;
        }
      
        //  Function to add a new value into the data structure.
        void AddValue
        (
          T value
        )
        {
          Node
            *node = GetNode ();
      
          //  clear links
          node->m_queue_next = 0;
      
          //  set value of node
          node->m_value = value;
      
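          //  find place to add node into linked list, searching down from the
          //  maximum for the first value smaller than the new one
          Node
            *search;
      
          for (search = m_list_max ; search ; search = search->m_list_lower)
          {
            if (search->m_value < value)
            {
              //  link the new node directly above 'search'
              node->m_list_greater = search->m_list_greater;
      
              if (search->m_list_greater)
              {
                search->m_list_greater->m_list_lower = node;
              }
              else
              {
                m_list_max = node;
              }
      
              node->m_list_lower = search;
              search->m_list_greater = node;
              break;
            }
          }
      
          if (!search)
          {
            //  no smaller value found (or the list is empty): new minimum
            node->m_list_lower = 0;
            node->m_list_greater = m_list_min;
      
            if (m_list_min)
            {
              m_list_min->m_list_lower = node;
            }
            else
            {
              m_list_max = node;
            }
      
            m_list_min = node;
          }
        }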
        }
      
        //  Accessor to determine if the first output value is ready for use.
        bool RangeAvailable ()
        {
          return !m_free_list;
        }
      
        //  Accessor to get the minimum value of all values in the current window.
        T Min ()
        {
          return m_list_min->m_value;
        }
      
        //  Accessor to get the maximum value of all values in the current window.
        T Max ()
        {
          return m_list_max->m_value;
        }
      
      private:
        //  Function to get a node to store a value into.
        //  This function gets nodes from one of two places:
        //    1. From the unused/free list
        //    2. From the end of the fifo queue, this requires removing the node from the list and tree
        Node *GetNode ()
        {
          Node
            *node;
      
          if (m_free_list)
          {
            //  get new node from unused/free list and place at head
            node = m_free_list;
      
            m_free_list = node->m_list_lower;
      
            if (m_queue_head)
            {
              m_queue_head->m_queue_next = node;
            }
      
            m_queue_head = node;
          }
          else
          {
            //  get node from tail of queue and place at head
            node = m_queue_tail;
      
            m_queue_tail = node->m_queue_next;
            m_queue_head->m_queue_next = node;
            m_queue_head = node;
      
            //  remove node from linked list
            if (node->m_list_lower)
            {
              node->m_list_lower->m_list_greater = node->m_list_greater;
            }
            else
            {
              m_list_min = node->m_list_greater;
            }
      
            if (node->m_list_greater)
            {
              node->m_list_greater->m_list_lower = node->m_list_lower;
            }
            else
            {
              m_list_max = node->m_list_lower;
            }
          }
      
          return node;
        }
      
        //  Member Data.
        Node
          *m_nodes,
          *m_queue_tail,
          *m_queue_head,
          *m_list_min,
          *m_list_max,
          *m_free_list;
      };
      
      //  A reasonably complex but more efficient method of calculating the results.
      //  Memory management is done here outside of the timing portion.
      clock_t QuiteComplex
      (
        int size,
        int window,
        GeneratorCallback input,
        OutputCallback output
      )
      {
        Range <int>
          range (window);
      
        clock_t
          start = clock ();
      
        for (int i = 0 ; i < size ; ++i)
        {   
          range.AddValue (input ());
      
          if (range.RangeAvailable ())
          {
            output (range.Min (), range.Max ());
          }
        }
      
        clock_t
          end = clock ();
      
        return end - start;
      }
      

      Complex.cpp:

      #include "Range.h"
      
      template <class T>
      class Range
      {
      private:
        //  Class Types
      
        //  Red/Black tree node colours.
        enum NodeColour
        {
          Red,
          Black
        };
      
        //  Node Data
        //  Stores a value and its position in various lists and trees.
        struct Node
        {
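          //  Function to get the sibling of a node.
          //  Because leaves are stored as null pointers, it must be possible
          //  to get the sibling of a null pointer. If the object is a null pointer
          //  then the parent pointer is used to determine the sibling.
          //  Caution: calling a member function through a null pointer is undefined
          //  behaviour in standard C++, so the 'if (this)' test below may be
          //  optimised away by the compiler.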
          Node *Sibling
          (
            Node *parent
          )
          {
            Node
              *sibling;
      
            if (this)
            {
              sibling = m_tree_parent->m_tree_less == this ? m_tree_parent->m_tree_more : m_tree_parent->m_tree_less;
            }
            else
            {
              sibling = parent->m_tree_less ? parent->m_tree_less : parent->m_tree_more;
            }
      
            return sibling;
          }
      
          //  Node Members
          Node
            *m_queue_next,
            *m_tree_less,
            *m_tree_more,
            *m_tree_parent,
            *m_list_greater,
            *m_list_lower;
      
          NodeColour
            m_colour;
      
          T
            m_value;
        };
      
      public:
        //  Constructor
        //  Allocates memory for the node data and adds all the allocated
        //  nodes to the unused/free list of nodes.
        Range
        (
          int window_size
        ) :
          m_nodes (new Node [window_size]),
          m_queue_tail (m_nodes),
          m_queue_head (0),
          m_tree_root (0),
          m_list_min (0),
          m_list_max (0),
          m_free_list (m_nodes)
        {
          for (int i = 0 ; i < window_size - 1 ; ++i)
          {
            m_nodes [i].m_list_lower = &m_nodes [i + 1];
          }
      
          m_nodes [window_size - 1].m_list_lower = 0;
        }
      
        //  Destructor
        //  Tidy up allocated data.
        ~Range ()
        {
          delete [] m_nodes;
        }
      
        //  Function to add a new value into the data structure.
        void AddValue
        (
          T value
        )
        {
          Node
            *node = GetNode ();
      
          //  clear links
          node->m_queue_next = node->m_tree_more = node->m_tree_less = node->m_tree_parent = 0;
      
          //  set value of node
          node->m_value = value;
      
          //  insert node into tree
          if (m_tree_root)
          {
            InsertNodeIntoTree (node);
            BalanceTreeAfterInsertion (node);
          }
          else
          {
            m_tree_root = m_list_max = m_list_min = node;
            node->m_tree_parent = node->m_list_greater = node->m_list_lower = 0;
          }
      
          m_tree_root->m_colour = Black;
        }
      
        //  Accessor to determine if the first output value is ready for use.
        bool RangeAvailable ()
        {
          return !m_free_list;
        }
      
        //  Accessor to get the minimum value of all values in the current window.
        T Min ()
        {
          return m_list_min->m_value;
        }
      
        //  Accessor to get the maximum value of all values in the current window.
        T Max ()
        {
          return m_list_max->m_value;
        }
      
      private:
        //  Function to get a node to store a value into.
        //  This function gets nodes from one of two places:
        //    1. From the unused/free list
        //    2. From the end of the fifo queue, this requires removing the node from the list and tree
        Node *GetNode ()
        {
          Node
            *node;
      
          if (m_free_list)
          {
            //  get new node from unused/free list and place at head
            node = m_free_list;
      
            m_free_list = node->m_list_lower;
      
            if (m_queue_head)
            {
              m_queue_head->m_queue_next = node;
            }
      
            m_queue_head = node;
          }
          else
          {
            //  get node from tail of queue and place at head
            node = m_queue_tail;
      
            m_queue_tail = node->m_queue_next;
            m_queue_head->m_queue_next = node;
            m_queue_head = node;
      
            //  remove node from tree
            node = RemoveNodeFromTree (node);
            RebalanceTreeAfterDeletion (node);
      
            //  remove node from linked list
            if (node->m_list_lower)
            {
              node->m_list_lower->m_list_greater = node->m_list_greater;
            }
            else
            {
              m_list_min = node->m_list_greater;
            }
      
            if (node->m_list_greater)
            {
              node->m_list_greater->m_list_lower = node->m_list_lower;
            }
            else
            {
              m_list_max = node->m_list_lower;
            }
          }
      
          return node;
        }
      
        //  Rebalances the tree after insertion
        void BalanceTreeAfterInsertion
        (
          Node *node
        )
        {
          node->m_colour = Red;
      
          while (node != m_tree_root && node->m_tree_parent->m_colour == Red)
          {
            if (node->m_tree_parent == node->m_tree_parent->m_tree_parent->m_tree_more)
            {
              Node
                *uncle = node->m_tree_parent->m_tree_parent->m_tree_less;
      
              if (uncle && uncle->m_colour == Red)
              {
                node->m_tree_parent->m_colour = Black;
                uncle->m_colour = Black;
                node->m_tree_parent->m_tree_parent->m_colour = Red;
                node = node->m_tree_parent->m_tree_parent;
              }
              else
              {
                if (node == node->m_tree_parent->m_tree_less)
                {
                  node = node->m_tree_parent;
                  LeftRotate (node);
                }
      
                node->m_tree_parent->m_colour = Black;
                node->m_tree_parent->m_tree_parent->m_colour = Red;
                RightRotate (node->m_tree_parent->m_tree_parent);
              }
            }
            else
            {
              Node
                *uncle = node->m_tree_parent->m_tree_parent->m_tree_more;
      
              if (uncle && uncle->m_colour == Red)
              {
                node->m_tree_parent->m_colour = Black;
                uncle->m_colour = Black;
                node->m_tree_parent->m_tree_parent->m_colour = Red;
                node = node->m_tree_parent->m_tree_parent;
              }
              else
              {
                if (node == node->m_tree_parent->m_tree_more)
                {
                  node = node->m_tree_parent;
                  RightRotate (node);
                }
      
                node->m_tree_parent->m_colour = Black;
                node->m_tree_parent->m_tree_parent->m_colour = Red;
                LeftRotate (node->m_tree_parent->m_tree_parent);
              }
            }
          }
        }
      
        //  Adds a node into the tree and sorted linked list
        void InsertNodeIntoTree
        (
          Node *node
        )
        {
          Node
            *parent = 0,
            *child = m_tree_root;
      
          bool
            greater;
      
          while (child)
          {
            parent = child;
            child = (greater = node->m_value > child->m_value) ? child->m_tree_more : child->m_tree_less;
          }
      
          node->m_tree_parent = parent;
      
          if (greater)
          {
            parent->m_tree_more = node;
      
            //  insert node into linked list
            if (parent->m_list_greater)
            {
              parent->m_list_greater->m_list_lower = node;
            }
            else
            {
              m_list_max = node;
            }
      
            node->m_list_greater = parent->m_list_greater;
            node->m_list_lower = parent;
            parent->m_list_greater = node;
          }
          else
          {
            parent->m_tree_less = node;
      
            //  insert node into linked list
            if (parent->m_list_lower)
            {
              parent->m_list_lower->m_list_greater = node;
            }
            else
            {
              m_list_min = node;
            }
      
            node->m_list_lower = parent->m_list_lower;
            node->m_list_greater = parent;
            parent->m_list_lower = node;
          }
        }
      
        //  Red/Black tree manipulation routine, used for removing a node
        Node *RemoveNodeFromTree
        (
          Node *node
        )
        {
          if (node->m_tree_less && node->m_tree_more)
          {
            //  the complex case, swap node with a child node
            Node
              *child;
      
            if (node->m_tree_less)
            {
              // find largest value in lesser half (node with no greater pointer)
              for (child = node->m_tree_less ; child->m_tree_more ; child = child->m_tree_more)
              {
              }
            }
            else
            {
              // find smallest value in greater half (node with no lesser pointer)
              for (child = node->m_tree_more ; child->m_tree_less ; child = child->m_tree_less)
              {
              }
            }
      
            swap (child->m_colour, node->m_colour);
      
            if (child->m_tree_parent != node)
            {
              swap (child->m_tree_less, node->m_tree_less);
              swap (child->m_tree_more, node->m_tree_more);
              swap (child->m_tree_parent, node->m_tree_parent);
      
              if (!child->m_tree_parent)
              {
                m_tree_root = child;
              }
              else
              {
                if (child->m_tree_parent->m_tree_less == node)
                {
                  child->m_tree_parent->m_tree_less = child;
                }
                else
                {
                  child->m_tree_parent->m_tree_more = child;
                }
              }
      
              if (node->m_tree_parent->m_tree_less == child)
              {
                node->m_tree_parent->m_tree_less = node;
              }
              else
              {
                node->m_tree_parent->m_tree_more = node;
              }
            }
            else
            {
              child->m_tree_parent = node->m_tree_parent;
              node->m_tree_parent = child;
      
              Node
                *child_less = child->m_tree_less,
                *child_more = child->m_tree_more;
      
              if (node->m_tree_less == child)
              {
                child->m_tree_less = node;
                child->m_tree_more = node->m_tree_more;
                node->m_tree_less = child_less;
                node->m_tree_more = child_more;
              }
              else
              {
                child->m_tree_less = node->m_tree_less;
                child->m_tree_more = node;
                node->m_tree_less = child_less;
                node->m_tree_more = child_more;
              }
      
              if (!child->m_tree_parent)
              {
                m_tree_root = child;
              }
              else
              {
                if (child->m_tree_parent->m_tree_less == node)
                {
                  child->m_tree_parent->m_tree_less = child;
                }
                else
                {
                  child->m_tree_parent->m_tree_more = child;
                }
              }
            }
      
            if (child->m_tree_less)
            {
              child->m_tree_less->m_tree_parent = child;
            }
      
            if (child->m_tree_more)
            {
              child->m_tree_more->m_tree_parent = child;
            }
      
            if (node->m_tree_less)
            {
              node->m_tree_less->m_tree_parent = node;
            }
      
            if (node->m_tree_more)
            {
              node->m_tree_more->m_tree_parent = node;
            }
          }
      
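          Node
            *child = node->m_tree_less ? node->m_tree_less : node->m_tree_more;
      
          if (!node->m_tree_parent)
          {
            //  removing the root: its remaining child (if any) becomes the root
            m_tree_root = child;
          }
          else if (node->m_tree_parent->m_tree_less == node)
          {
            node->m_tree_parent->m_tree_less = child;
          }
          else
          {
            node->m_tree_parent->m_tree_more = child;
          }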
      
          if (child)
          {
            child->m_tree_parent = node->m_tree_parent;
          }
      
          return node;
        }
      
        //  Red/Black tree manipulation routine, used for rebalancing a tree after a deletion
        void RebalanceTreeAfterDeletion
        (
          Node *node
        )
        {
          Node
            *child = node->m_tree_less ? node->m_tree_less : node->m_tree_more;
      
          if (node->m_colour == Black)
          {
            if (child && child->m_colour == Red)
            {
              child->m_colour = Black;
            }
            else
            {
              Node
                *parent = node->m_tree_parent,
                *n = child;
      
              while (parent)
              {
                Node
                  *sibling = n->Sibling (parent);
      
                if (sibling && sibling->m_colour == Red)
                {
                  parent->m_colour = Red;
                  sibling->m_colour = Black;
      
                  if (n == parent->m_tree_more)
                  {
                    LeftRotate (parent);
                  }
                  else
                  {
                    RightRotate (parent);
                  }
                }
      
                sibling = n->Sibling (parent);
      
                if (parent->m_colour == Black &&
                  sibling->m_colour == Black &&
                  (!sibling->m_tree_more || sibling->m_tree_more->m_colour == Black) &&
                  (!sibling->m_tree_less || sibling->m_tree_less->m_colour == Black))
                {
                  sibling->m_colour = Red;
                  n = parent;
                  parent = n->m_tree_parent;
                  continue;
                }
                else
                {
                  if (parent->m_colour == Red &&
                    sibling->m_colour == Black &&
                    (!sibling->m_tree_more || sibling->m_tree_more->m_colour == Black) &&
                    (!sibling->m_tree_less || sibling->m_tree_less->m_colour == Black))
                  {
                    sibling->m_colour = Red;
                    parent->m_colour = Black;
                    break;
                  }
                  else
                  {
                    if (n == parent->m_tree_more &&
                      sibling->m_colour == Black &&
                      (sibling->m_tree_more && sibling->m_tree_more->m_colour == Red) &&
                      (!sibling->m_tree_less || sibling->m_tree_less->m_colour == Black))
                    {
                      sibling->m_colour = Red;
                      sibling->m_tree_more->m_colour = Black;
                      RightRotate (sibling);
                    }
                    else
                    {
                      if (n == parent->m_tree_less &&
                        sibling->m_colour == Black &&
                        (!sibling->m_tree_more || sibling->m_tree_more->m_colour == Black) &&
                        (sibling->m_tree_less && sibling->m_tree_less->m_colour == Red))
                      {
                        sibling->m_colour = Red;
                        sibling->m_tree_less->m_colour = Black;
                        LeftRotate (sibling);
                      }
                    }
      
                    sibling = n->Sibling (parent);
                    sibling->m_colour = parent->m_colour;
                    parent->m_colour = Black;
      
                    if (n == parent->m_tree_more)
                    {
                      sibling->m_tree_less->m_colour = Black;
                      LeftRotate (parent);
                    }
                    else
                    {
                      sibling->m_tree_more->m_colour = Black;
                      RightRotate (parent);
                    }
                    break;
                  }
                }
              }
            }
          }
        }
      
        //  Red/Black tree manipulation routine, used for balancing the tree
        void LeftRotate
        (
          Node *node
        )
        {
          Node
            *less = node->m_tree_less;
      
          node->m_tree_less = less->m_tree_more;
      
          if (less->m_tree_more)
          {
            less->m_tree_more->m_tree_parent = node;
          }
      
          less->m_tree_parent = node->m_tree_parent;
      
          if (!node->m_tree_parent)
          {
            m_tree_root = less;
          }
          else
          {
            if (node == node->m_tree_parent->m_tree_more)
            {
              node->m_tree_parent->m_tree_more = less;
            }
            else
            {
              node->m_tree_parent->m_tree_less = less;
            }
          }
      
          less->m_tree_more = node;
          node->m_tree_parent = less;
        }
      
        //  Red/Black tree manipulation routine, used for balancing the tree
        void RightRotate
        (
          Node *node
        )
        {
          Node
            *more = node->m_tree_more;
      
          node->m_tree_more = more->m_tree_less;
      
          if (more->m_tree_less)
          {
            more->m_tree_less->m_tree_parent = node;
          }
      
          more->m_tree_parent = node->m_tree_parent;
      
          if (!node->m_tree_parent)
          {
            m_tree_root = more;
          }
          else
          {
            if (node == node->m_tree_parent->m_tree_less)
            {
              node->m_tree_parent->m_tree_less = more;
            }
            else
            {
              node->m_tree_parent->m_tree_more = more;
            }
          }
      
          more->m_tree_less = node;
          node->m_tree_parent = more;
        }
      
        //  Member Data.
        Node
          *m_nodes,
          *m_queue_tail,
          *m_queue_head,
          *m_tree_root,
          *m_list_min,
          *m_list_max,
          *m_free_list;
      };
      
      //  A complex but more efficient method of calculating the results.
      //  Memory management is done here outside of the timing portion.
      clock_t Complex
      (
        int count,
        int window,
        GeneratorCallback input,
        OutputCallback output
      )
      {
        Range <int>
          range (window);
      
        clock_t
          start = clock ();
      
        for (int i = 0 ; i < count ; ++i)
        {   
          range.AddValue (input ());
      
          if (range.RangeAvailable ())
          {
            output (range.Min (), range.Max ());
          }
        }
      
        clock_t
          end = clock ();
      
        return end - start;
      }
      

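      【Comments】:

        【Answer 4】:
        1. Read in the first 1,000 numbers.
        2. Create a 1,000-element linked list that tracks the current 1,000 numbers.
        3. Create a 1,000-element array of pointers to the linked-list nodes, mapped 1-1.
        4. Sort the pointer array based on the linked-list nodes' values. This rearranges the array but keeps the linked list intact.
        5. You can now calculate the range for the first 1,000 numbers by examining the first and last elements of the pointer array.
        6. Remove the first inserted element, which is either the head or the tail depending on how you built your linked list. Using the node's value, perform a binary search on the pointer array to find the pointer to the node being removed, and shift the array over by one to remove it.
        7. Add the 1,001st element to the linked list, and insert a pointer to it at the correct position in the array by performing one step of an insertion sort. This keeps the array sorted.
        8. Now you have the min. and max. values of the numbers between 1 and 1001, and can calculate the range using the first and last elements of the pointer array.
        9. It should be obvious by now what you need to do for the rest of the array.

        The algorithm should be O(n), since the delete and insert are bounded by log(1e3) and everything else takes constant time.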
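
The steps above can be sketched roughly as follows. This is a simplified variant under stated assumptions, not the exact structure described: instead of a linked list plus a pointer array, it keeps a sorted `std::vector` of the window's values and uses the original array itself to know which value leaves the window. The name `max_window_range_sorted` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sliding window kept as a sorted vector of values (simplified from the
// linked list + pointer array described in the steps).
double max_window_range_sorted(const std::vector<double>& data, std::size_t window)
{
    assert(window > 0 && window <= data.size());

    // Steps 1-5: sort a copy of the first window and read off its range.
    std::vector<double> sorted(data.begin(), data.begin() + window);
    std::sort(sorted.begin(), sorted.end());
    double best = sorted.back() - sorted.front();

    for (std::size_t i = window; i < data.size(); ++i)
    {
        // Step 6: binary-search the value leaving the window and erase it
        // (erase shifts the elements after it down by one).
        sorted.erase(std::lower_bound(sorted.begin(), sorted.end(), data[i - window]));
        // Step 7: binary-search the insertion point for the new value
        // (insert shifts the elements after it up by one).
        sorted.insert(std::upper_bound(sorted.begin(), sorted.end(), data[i]), data[i]);
        // Step 8: first and last elements are the window's min and max.
        best = std::max(best, sorted.back() - sorted.front());
    }
    return best;
}
```

The binary searches are logarithmic in the window size, but the element shifting done by `erase` and `insert` is linear in the window size in the worst case.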

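        【Comments】:

        • Your insertion sort and item removal (shifting the higher values down by one) will kill performance here: there is a lot of memory copying involved, which will be the major bottleneck.
        • Since the whole array should fit in the L2 cache of a modern CPU, I don't think it's that much of a problem.
        【Answer 5】: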
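        // Needs <set> and <algorithm>; data is the array from the question.
        std::multiset<double> range;
        double currentmax = 0.0;
        for (int i = 0;  i < 3600000;  ++i)
        {
            // drop the sample that has just left the 1000-wide window
            if (i >= 1000)
                range.erase(range.find(data[i-1000]));
            range.insert(data[i]);
            // once the window is full, compare its range (max - min) with the best so far
            if (i >= 999)
                currentmax = std::max(currentmax, *range.rbegin() - *range.begin());
        }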
        

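        Note: untested code.

        Edit: fixed a bug.

        【Comments】:

          【Answer 6】:

          This is a good application of a min-queue: a queue (First-In, First-Out = FIFO) which can simultaneously keep track of the minimum element it contains, with amortized constant-time updates. Of course, a max-queue is basically the same thing.

          Once you have this data structure in place, you can consider CurrentMax (of the past 1,000 elements) minus CurrentMin, store that as BestSoFar, and then push a new value, pop the old value, and check again. In this way, keep updating BestSoFar until the final value is the solution to your question. Each single step takes amortized constant time, so the whole thing is linear, and the implementation I know of has a good scalar constant (it's fast).

          I don't know of any documentation on min-queues; this is a data structure I came up with in collaboration with a coworker. You can implement it by internally tracking a binary tree of the least elements within each contiguous sub-sequence of your data. It simplifies the problem that you only ever pop data from one end of the structure.

          If you're interested in more details, I can try to provide them. I was thinking of writing this data structure up as a paper for arxiv. Also note that Tarjan and others previously arrived at a more powerful min-deque structure that would work here, but the implementation is much more complex. You can google for "mindeque" to read about Tarjan et al.'s work.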
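
As a rough illustration of the push/pop/BestSoFar loop described above, here is a minimal sketch. It is not the binary-tree implementation mentioned in this answer: it swaps in the well-known two-stack technique for a queue with amortized constant-time min and max. The names `MinMaxQueue` and `best_range` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// FIFO queue that also reports the min and max of its contents.
// Built from two stacks; each entry stores its value together with the
// running min (lo) and max (hi) of everything at or below it in its stack.
class MinMaxQueue
{
    struct Entry { double value, lo, hi; };
    std::vector<Entry> in_, out_;

    static void push_onto(std::vector<Entry>& s, double v)
    {
        if (s.empty())
            s.push_back({v, v, v});
        else
            s.push_back({v, std::min(v, s.back().lo), std::max(v, s.back().hi)});
    }

public:
    void push(double v) { push_onto(in_, v); }

    // Removes the oldest element. Each element moves from in_ to out_ at
    // most once, which is where the amortized O(1) cost comes from.
    void pop()
    {
        assert(!empty());
        if (out_.empty())
        {
            while (!in_.empty())
            {
                push_onto(out_, in_.back().value);
                in_.pop_back();
            }
        }
        out_.pop_back();
    }

    bool empty() const { return in_.empty() && out_.empty(); }
    std::size_t size() const { return in_.size() + out_.size(); }

    double min() const
    {
        assert(!empty());
        if (in_.empty()) return out_.back().lo;
        if (out_.empty()) return in_.back().lo;
        return std::min(in_.back().lo, out_.back().lo);
    }

    double max() const
    {
        assert(!empty());
        if (in_.empty()) return out_.back().hi;
        if (out_.empty()) return in_.back().hi;
        return std::max(in_.back().hi, out_.back().hi);
    }
};

// Largest (max - min) over all windows of `window` consecutive samples.
double best_range(const std::vector<double>& data, std::size_t window)
{
    assert(window > 0 && window <= data.size());
    MinMaxQueue q;
    double best_so_far = 0.0;
    for (double v : data)
    {
        q.push(v);                        // push the new value
        if (q.size() > window) q.pop();   // pop the value that left the window
        if (q.size() == window)
            best_so_far = std::max(best_so_far, q.max() - q.min());
    }
    return best_so_far;
}
```

For the question's data this would be called as `best_range(samples, 1000)`, where `samples` holds the 3,600,000 values.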

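          【Comments】:

          • The data structure you describe sounds a lot like a heap: en.wikipedia.org/wiki/Heap_(data_structure%29
          • Similar, but not the same. A heap does not allow you to remove elements in amortized constant time.
          • "Each single step takes amortized constant time, so the whole thing is linear". Does this mean that the elements can be sorted in linear time by iteratively popping the min element?
          • No, because you can only pop from the back of the structure, and only find (not remove/pop) the min element. Sorry, my description was ambiguous.
          【Answer 7】:

          The algorithm you describe is really O(N), but I think the constant is too high. Another solution which looks reasonable is to use an O(N*log(N)) algorithm in the following way: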

          * create sorted container (std::multiset) of first 1000 numbers
          * in loop (j=1, j<(3600000-1000); ++j)
             - calculate range
             - remove from the set number which is now irrelevant (i.e. in index *j - 1* of the array)
             - add to set new relevant number  (i.e. in index *j+1000-1* of the array)
          

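          I believe it should be faster, because the constant is much lower.

          【Comments】:

          • I don't think this would be any faster than the simple approach in practice, because you're moving the complexity into the operation of the sorted set. If the set implementation does any memory allocation, that would be a big overhead.
          • How do people come up with these clever ideas? I never would have thought of using a set to find the max and min. Anyway, this looks simple and your explanation is good. I'm going to try it.
          • This algorithm is also O(N), since maintaining the set should take constant time once it holds 1,000 items. I'm going to benchmark this against the naive solution today.
          • Skizz - instead of using std::multiset, which does one heap allocation per node, you could use boost::intrusive::multiset, allocate space for only the initial 1,000 elements, and reuse the space of removed elements for the inserted ones.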