Haskell function that returns the elements of a list that occur more than a given number of times (answers)

【Question title】: Haskell function that returns a list of elements in a list with more than given amount of occurrences
【Posted】: 2021-09-01 01:11:05
【Question description】:

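I tried making a function that, as in the title, takes 2 arguments: a number that specifies how many times an element must occur, and the list we are working on. I made a function that counts the number of occurrences of a given element in a list and tried using it in my main function, but I cannot comprehend how if/else and indentation work in Haskell; it's so much harder to fix errors than in other languages. I think I'm missing an else statement, but even so I don't know what to put in there.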

count el list = count el list 0
     where count el list output
             | list==[] = output
             | head(list)==el = count el (tail(list)) output+1
             | otherwise = count el (tail(list)) output


moreThan :: Eq a => Int -> [a] -> [a]
moreThan a [] = []
moreThan a list = moreThan a list output i
    where moreThan a list [] 0
            if i == length (list)
                then output
            else if elem (list!!i) output
                 then moreThan a list output i+1
            else if (count (list!!i) list) >= a 
                then moreThan a list (output ++ [list!!i]) i+1

All I'm getting now is

parse error (possibly incorrect indentation or mismatched brackets)

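【Question comments】:

Tags: if-statement haskell indentation parse-error

【Solution 1】:

You just forgot the = sign and some parentheses, as well as the final else case. But you also switched the order of the inner function's declaration and its call: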

    moreThan :: Eq a => Int -> [a] -> [a]
    moreThan a [] = []
    moreThan a list = go a list [] 0   -- call
        where go a list output i =      --  declaration  =
                if i == length (list)
                    then output
                else if elem (list!!i) output
                     then go a list output (i+1)    -- (i+1) !
                else if (count (list!!i) list) >= a 
                    then go a list (output ++ [list!!i]) (i+1)   -- (i+1) !
                else
                    undefined

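I did rename your inner function to go, as is the custom.

As for how to fix errors in general, just read the error messages slowly and carefully; they usually say what went wrong and where.

That takes care of the syntax issues you asked about.

As for what to put in the missing else clause, you have just dealt with this in the line above it: you include the i-th element in the output if its count in the list is greater than or equal to the given parameter a. What to do otherwise is what we say in the else clause.

And that is, most probably, to not include that element in the output: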

                    then go a list (output ++ [list!!i]) (i+1)
                else               ---------------------
                    undefined

So just keep output as it is in place of the underlined part, and put the resulting line, go a list output (i+1), in place of undefined.
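Putting that together, the completed index-based function might look like the sketch below. The count here is a simple stand-in written for this example (an assumption, not the asker's original definition); the rest follows the corrected code above with the else branch filled in.

```haskell
-- Stand-in for the asker's count (assumption: any correct
-- occurrence counter would do here).
count :: Eq a => a -> [a] -> Int
count el list = length (filter (== el) list)

moreThan :: Eq a => Int -> [a] -> [a]
moreThan a [] = []
moreThan a list = go a list [] 0
  where
    go a list output i =
      if i == length list
        then output
      else if elem (list !! i) output
        then go a list output (i + 1)          -- already reported, skip
      else if count (list !! i) list >= a
        then go a list (output ++ [list !! i]) (i + 1)
      else
        go a list output (i + 1)               -- too few occurrences, skip
```

With these definitions, moreThan 2 [1,2,1,3,2,1] evaluates to [1,2].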

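More importantly, accessing list elements by index is an anti-pattern. It is much better to "slide" along the list by taking the tail at each recursive step and always deal only with the head element, like you do in your count code (but preferably using pattern matching rather than those functions directly). That way the traversal itself becomes linear, instead of quadratic as it is now.

【Comments】: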
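A minimal sketch of that "sliding" style, assuming the same semantics as the code above (each qualifying element reported once, in order of first appearance, with the >= comparison from the question). The names go and output follow the answer; everything else is illustrative.

```haskell
-- Count occurrences by pattern matching instead of head/tail.
count :: Eq a => a -> [a] -> Int
count _ [] = 0
count el (x:xs)
  | x == el   = 1 + count el xs
  | otherwise = count el xs

moreThan :: Eq a => Int -> [a] -> [a]
moreThan a list = go list []
  where
    -- output is built in reverse (consing is cheap), then flipped once.
    go [] output = reverse output
    go (x:xs) output
      | x `elem` output   = go xs output        -- already reported
      | count x list >= a = go xs (x : output)  -- occurs often enough
      | otherwise         = go xs output
```

No index is ever used, so walking the list costs one step per element. The calls to count and elem still scan a list for every element, so the function as a whole remains quadratic in the worst case.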

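【Solution 2】:

Will Ness's answer is correct. I just want to offer some general advice on Haskell and some tips for improving your code.

First, I would always avoid using guards. The syntax is quite inconsistent with Haskell's usual fare, and guards are not composable in the way that other Haskell syntax is. If I were you, I would stick to let, if/then/else, and pattern matching.

Second, an if statement in Haskell is very often not the right answer. In many cases it is better to avoid if statements entirely (or at least as much as possible). For example, a more readable version of count would look like this: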

count el list = go list 0 where
    go [] output = output
    go (x:xs) output = go xs (if x == el
                              then 1 + output
                              else output)

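However, this code is still flawed, because it is not strict enough in output. For example, consider the evaluation of the expression count 1 [1, 1, 1, 1], which proceeds as follows: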

count 1 [1, 1, 1, 1]
go [1, 1, 1, 1] 0
go [1, 1, 1] (1 + 0)
go [1, 1] (1 + (1 + 0))
go [1] (1 + (1 + (1 + 0)))
go [] (1 + (1 + (1 + (1 + 0))))
(1 + (1 + (1 + (1 + 0))))
(1 + (1 + (1 + 1)))
(1 + (1 + 2))
(1 + 3)
4

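Notice the ballooning space usage of this evaluation: a chain of unevaluated additions builds up before anything is added. We need to force go to make sure output is evaluated before making the recursive call. We can do this using seq. The expression seq a b is evaluated as follows: first, a is partially evaluated (to weak head normal form). Then, seq a b evaluates to b. For numbers, "partially evaluated" is the same as fully evaluated.

So the code should actually be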

count el list = go list 0 where
    go [] output = output
    go (x:xs) output = 
        let new_output = if x == el
                         then 1 + output
                         else output
        in seq new_output (go xs new_output)

With this definition, we can trace the execution again:

go [1, 1, 1, 1] 0
go [1, 1, 1] 1
go [1, 1] 2
go [1] 3
go [] 4
4

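This is a far more efficient way to evaluate the expression. Without using library functions, this is basically as good as it gets for writing the count function.

But we are actually using a very common pattern, one so common that there is a higher-order function named for it. We are using foldl' (which must be imported from Data.List with the statement import Data.List (foldl')). For lists, this function is defined essentially as follows: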
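The same forcing can also be spelled with the strict application operator ($!) from the Prelude, where f $! x behaves like seq x (f x). The sketch below is an equivalent rewording of the seq version, not a different algorithm; countStrict is a hypothetical name chosen to keep it apart from count.

```haskell
countStrict :: Eq a => a -> [a] -> Int
countStrict el list = go list 0
  where
    go [] output = output
    -- ($!) evaluates the new accumulator before go is called again,
    -- so no chain of pending additions builds up.
    go (x:xs) output = go xs $! (if x == el then 1 + output else output)
```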

foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f = go where
    go output [] = output
    go output (x:xs) =
       let new_output = f output x
       in seq new_output (go new_output xs)

So we can further rewrite our count function as

count el list = foldl' f 0 list where
    f output x = if x == el
                 then 1 + output
                 else output

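This is good, but we can actually improve this code even further by breaking the count step into two parts.

count el list should be the number of times el occurs in list. We can break this computation into two conceptual steps. First, construct the list list', which consists of all the elements of list that are equal to el. Then, compute the length of list'.

In code: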

count el list = length (filter (el ==) list)

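In my view, this is by far the most readable version. Thanks to laziness, it is also just as efficient as the foldl' version of count. Here, Haskell's length function takes care of finding the best way to do the counting part of count, while filter (el ==) takes care of the part of the loop where we check whether to increment output. In general, if you are iterating over a list and have an if P x statement, you can very often replace it with a call to filter P.

We can rewrite count once more in "point-free style" as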
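As a small illustration of that last rule, here is a hand-written loop with an if on each element next to the filter call that replaces it. Both function names are hypothetical and exist only for this comparison.

```haskell
-- Explicit recursion: test each element with an if.
evensLoop :: [Int] -> [Int]
evensLoop [] = []
evensLoop (x:xs) =
  if even x
    then x : evensLoop xs
    else evensLoop xs

-- The same selection expressed as filter P, with P = even.
evensFilter :: [Int] -> [Int]
evensFilter = filter even
```

Both return [2,4,6,8,10] for the input [1..10].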

count el = length . filter (el ==)

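This is most likely how the function would be written in a library. The . stands for function composition. It means the following:

To apply the function count el to a list, we first filter the list to keep only the elements that satisfy (el ==), and then take the length.

Incidentally, the filter function is exactly what we need to write moreThan compactly: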

moreThan a list = filter occursOften list where
    occursOften x = count x list >= a
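One difference from the question's code is worth noting: the filter-based version keeps every occurrence of a qualifying element, whereas the original skipped elements already in the output. If each element should appear only once, a possible sketch is to post-process with nub from Data.List. The name moreThanUnique is hypothetical.

```haskell
import Data.List (nub)

count :: Eq a => a -> [a] -> Int
count el = length . filter (el ==)

-- The filter-based version from the text: keeps every occurrence.
moreThan :: Eq a => Int -> [a] -> [a]
moreThan a list = filter occursOften list
  where
    occursOften x = count x list >= a

-- Variant that reports each qualifying element once,
-- in order of first appearance.
moreThanUnique :: Eq a => Int -> [a] -> [a]
moreThanUnique a = nub . moreThan a
```

For example, moreThan 2 [1,2,1,3,2,1] gives [1,2,1,2,1], while moreThanUnique 2 [1,2,1,3,2,1] gives [1,2].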

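Moral of the story: use higher-order functions whenever possible.

Whenever you solve a list problem in Haskell, the first tools you should reach for are the functions defined in Data.List, especially map, foldl'/foldr, filter, and concatMap. Most list problems come down to map/fold/filter. These should be your go-to replacement for loops. If you are replacing a nested loop, you should use concatMap.

【Comments】:

【Solution 3】:

In a functional way, ;)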
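To make the nested-loop remark concrete, here is a minimal sketch in which concatMap plays the role of the outer loop and map the inner one. The function pairs is a hypothetical example, not something from the question.

```haskell
-- All ordered pairs (x, y) with x from xs and y from ys.
-- Outer loop over xs: concatMap. Inner loop over ys: map.
pairs :: [a] -> [b] -> [(a, b)]
pairs xs ys = concatMap (\x -> map (\y -> (x, y)) ys) xs
```

For instance, pairs [1,2] "ab" yields [(1,'a'),(1,'b'),(2,'a'),(2,'b')].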

import Data.List (group, nub, sort)
moreThan n xs = nub $ concat [ x | x <- ( group(sort(xs))), length x > n ]

...or in a fancy way, lol

moreThan n xs = map head [ x | x <- ( group(sort(xs))), length x > n ]

...

mt1 n xs =  [ head x | x <- ( group(sort(xs))), length x > n ]

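【Comments】:

Yes, you found the twin of the other question. Still, why nub after concat when you could have started with map head? Why create work and then do it when we can avoid the work in the first place. :) I.e. [ head x | x <- ( group(sort(xs))), length x > n ].
Also, this changes the order, which may or may not be a requirement.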
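The ordering point can be seen by comparing two sketches side by side. Both use the strict > comparison from this answer; the names moreThanSorted and moreThanInOrder are hypothetical. The first follows the group/sort approach and needs an Ord constraint; the second needs only Eq and keeps the order of first appearance, at the cost of rescanning the list for every element.

```haskell
import Data.List (group, nub, sort)

-- group/sort approach: results come out in sorted order.
moreThanSorted :: Ord a => Int -> [a] -> [a]
moreThanSorted n xs = [ x | g@(x:_) <- group (sort xs), length g > n ]

-- Order-preserving alternative: results follow first appearance.
moreThanInOrder :: Eq a => Int -> [a] -> [a]
moreThanInOrder n xs = nub [ x | x <- xs, length (filter (== x) xs) > n ]
```

For the input [3,1,3,1,2] with n = 1, moreThanSorted returns [1,3] and moreThanInOrder returns [3,1].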