Forcing a list to be recomputed

【Question title】: Forcing a list to be recomputed
【Posted】: 2012-06-07 02:45:17
【Question】:

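The function search below looks for two inputs that produce the same output under some function. During the search it traverses the input list xs twice, and this list can be very large, e.g. [0..1000000000]. I would rather spend memory on the HashSet built by collision than on storing the elements of xs. My understanding is that even though xs can be computed lazily, it will be retained in memory because the later call to find still needs it.

Questions:

Is this understanding correct?
If I keep it as a list, is there a way to have xs recomputed when it is passed to find?
Is there an alternative data structure I could use for xs that lets me control the space used? xs is only used to specify which inputs to check.

Note that there are no type restrictions on xs - it can be a collection of any type.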

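-- Qualified import so HashSet names do not clash with Prelude/Data.List.
import qualified Data.HashSet as Set
import Data.Hashable (Hashable)
import Data.List (find)

-- Find two inputs with the same image under h.
-- Note that xs is traversed twice: once by collision, once by find.
search :: (Hashable b, Eq b) => (a -> b) -> [a] -> Maybe (a, a)
search h xs =
  do x0 <- collision h xs
     let h0 = h x0
     x1 <- find (\x -> h x == h0) xs
     return (x0, x1)

-- Return the first input whose image under h has already been seen.
collision :: (Hashable b, Eq b) => (a -> b) -> [a] -> Maybe a
collision h = go Set.empty
  where
    go _ [] = Nothing
    go s (x:rest) =
      if y `Set.member` s
        then Just x
        else go (Set.insert y s) rest
      where y = h x

main :: IO ()
main = print $ search (\x -> x `mod` 21) ([10,20..2100] :: [Int])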

【Comments on the question】:

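Did you really mean x1 <- find (\x -> (h x) `Set.member` s) xs rather than h x == h0?
Good point - that is much simpler.
You might be able to adapt the ideas from Beautiful Folding to get an equally beautiful scan.
By the way, if I understand what you are doing, you should be able to do it in a single pass over xs. Just switch from a set to a map. That takes more memory, of course, but it should not change the space complexity...
Yes, but this is a deliberate time/space trade-off. In this case I do not mind spending extra time in order to build a larger HashSet and increase the probability of finding a collision.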
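
The single-pass idea from the comments could look roughly like the sketch below. It is only an illustration: the name searchOnePass is hypothetical, and it assumes Data.HashMap.Strict from unordered-containers is available. Instead of a set of hash values it keeps a map from each hash value to the first input that produced it, so the earlier input is at hand as soon as a collision shows up.

```haskell
import qualified Data.HashMap.Strict as Map
import Data.Hashable (Hashable)

-- Single pass over xs: remember the first input seen for each hash value.
-- The result is ordered like the original search: (later input, earlier input).
searchOnePass :: (Hashable b, Eq b) => (a -> b) -> [a] -> Maybe (a, a)
searchOnePass h = go Map.empty
  where
    go _ [] = Nothing
    go seen (x : rest) =
      let y = h x
      in case Map.lookup y seen of
           Just x0 -> Just (x, x0)
           Nothing -> go (Map.insert y x seen) rest

main :: IO ()
main = print (searchOnePass (`mod` 21) ([10, 20 .. 2100] :: [Int]))
-- prints: Just (220,10)
```

Because the list is consumed only once, nothing forces it to be retained, at the cost of storing one input per distinct hash value.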

Tags: list haskell memory-management

【Solution 1】:

I basically answered this question here: https://stackoverflow.com/a/6209279/371753

Here is the relevant code.

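{-# LANGUAGE DeriveFunctor #-}  -- needed for "deriving Functor" below
import Data.Stream.Branching(Stream(..))
import qualified Data.Stream.Branching as S
import Control.Arrow
import Control.Applicative
import Data.List

-- A step that, given some environment s, may or may not yield a value.
data UM s a = UM (s -> Maybe a) deriving Functor
type UStream s a = Stream (UM s) a

runUM s (UM f) = f s
liftUM x = UM $ const (Just x)   -- a step that always succeeds
nullUM = UM $ const Nothing      -- a step that ends the stream

-- Describes the sequence start, start + 1, ..., end without building a list.
buildUStream :: Int -> Int -> Stream (UM ()) Int
buildUStream start end = S.unfold (\x -> (x, go x)) start
    where go x
           | x < end = liftUM (x + 1)
           | otherwise = nullUM

-- Turn the description into an ordinary lazy list.
-- As written, the argument is a UM-wrapped stream, e.g. liftUM (buildUStream 1 5).
usToList x = unfoldr (\um -> (S.head &&& S.tail) <$> runUM () um) x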

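Long story short: instead of passing a list, pass a data type that describes how to generate the list. You can then either write functions directly on the stream, or use usToList to keep using the list functions you already have.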
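
To make the idea concrete without the streams package, here is a minimal sketch applied to the search from the question. All names in it (Gen, rangeGen, findGen, collisionGen, searchGen) are hypothetical and not part of any library, and it swaps the branching stream for a plain seed-plus-step-function recipe. Each traversal restarts from the seed, so the inputs are recomputed rather than retained.

```haskell
import qualified Data.HashSet as Set
import Data.Hashable (Hashable)

-- Hypothetical type: a seed plus a step function, i.e. a recipe for a sequence.
data Gen s a = Gen s (s -> Maybe (a, s))

-- Inputs from, from + step, ... up to and including to (step assumed positive).
rangeGen :: Int -> Int -> Int -> Gen Int Int
rangeGen from step to = Gen from next
  where
    next n
      | n > to = Nothing
      | otherwise = Just (n, n + step)

-- First element satisfying the predicate, walking the recipe from its seed.
findGen :: (a -> Bool) -> Gen s a -> Maybe a
findGen p (Gen s0 next) = go s0
  where
    go s = case next s of
      Nothing -> Nothing
      Just (x, s')
        | p x -> Just x
        | otherwise -> go s'

-- First element whose hash was already produced by an earlier element.
collisionGen :: (Hashable b, Eq b) => (a -> b) -> Gen s a -> Maybe a
collisionGen h (Gen s0 next) = go Set.empty s0
  where
    go seen s = case next s of
      Nothing -> Nothing
      Just (x, s')
        | y `Set.member` seen -> Just x
        | otherwise -> go (Set.insert y seen) s'
        where y = h x

-- Same logic as search, but each pass regenerates the inputs from the seed.
searchGen :: (Hashable b, Eq b) => (a -> b) -> Gen s a -> Maybe (a, a)
searchGen h g = do
  x0 <- collisionGen h g
  let h0 = h x0
  x1 <- findGen (\x -> h x == h0) g
  return (x0, x1)

main :: IO ()
main = print (searchGen (`mod` 21) (rangeGen 10 10 2100))
-- prints: Just (220,10)
```

Since only the seed and the step function are shared between the two passes, the memory that stays live is the HashSet, which is the trade-off the question asks for.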

【Discussion】: