【Question title】: Is it possible to map a function over a Vec without allocating a new Vec?
【Posted】: 2017-01-24 15:30:38
【Question】:

I have the following:

enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

fn transform(x: SomeType) -> SomeType {
    // very complicated transformation, reusing parts of x in order to produce result:
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}

fn main() {
    let mut data = vec![
        SomeType::VariantA("hello".to_string()),
        SomeType::VariantA("bye".to_string()),
        SomeType::VariantB("asdf".to_string(), 34),
    ];
}

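I would now like to call transform on each element of data and store the resulting value back in data. I could do something like data.into_iter().map(transform).collect(), but this will allocate a new Vec. Is there a way to do this in place, reusing the allocated memory of data? There once was Vec::map_in_place in Rust, but it was removed some time ago.

As a workaround, I added a Dummy variant to SomeType and then do the following: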

for x in &mut data {
    let original = ::std::mem::replace(x, SomeType::Dummy);
    *x = transform(original);
}

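This does not feel right, and I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop. Is there a better way of doing this?

【Comments】:

See also Using map with Vectors.
You should change the signature of transform() to take a mutable reference instead of consuming SomeType.
Yes, it would be easy with mutable references, but transform() does some very complicated transformations, in particular re-using parts of the passed-in x to produce the result, which is much easier to do in this "functional" style (if it is possible at all with mutable references).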

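Tags: rust

【Answer 1】:

Your first problem is not map, it is transform.

transform takes ownership of its argument, while the Vec has ownership of its elements. One of the two has to give, and poking a hole in the Vec would be a bad idea: what if transform panics?

The best fix is therefore to change the signature of transform to: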

fn transform(x: &mut SomeType) { ... }

Then you can just do:

for x in &mut data { transform(x) }

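Other solutions will be clunky, as they need to deal with the fact that transform might panic.

【Comments】:

【Answer 2】:

No, it is not possible in general, because the size of each element might change as the mapping is performed (fn transform(u8) -> u32).

Even when the sizes are the same, it is non-trivial.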
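To make the size argument concrete, here is a small sketch. The function name widen is made up for illustration; it stands in for the fn transform(u8) -> u32 case mentioned above.

```rust
use std::mem::size_of;

// Hypothetical size-changing transform, standing in for `fn transform(u8) -> u32`.
fn widen(x: u8) -> u32 {
    u32::from(x) * 1000
}

fn main() {
    let bytes: Vec<u8> = vec![1, 2, 3];

    // Each input element occupies 1 byte and each output element 4 bytes,
    // so an output cannot be written back into the slot of its input.
    assert_eq!(size_of::<u8>(), 1);
    assert_eq!(size_of::<u32>(), 4);

    // A size-changing map therefore collects into a new Vec.
    let wide: Vec<u32> = bytes.into_iter().map(widen).collect();
    assert_eq!(wide, [1000, 2000, 3000]);
}
```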

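In this case, you don't need to create a Dummy variant because creating an empty String is cheap; it is only 3 pointer-sized values and involves no heap allocation: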

impl SomeType {
    fn transform(&mut self) {
        use SomeType::*;

        let old = std::mem::replace(self, VariantA(String::new()));

        // Note this line for the detailed explanation

        *self = match old {
            VariantA(s) => VariantB(s, 0),
            VariantB(s, i) => VariantB(s, 2 * i),
        };
    }
}

for x in &mut data {
    x.transform();
}

An alternative implementation that replaces only the String:

impl SomeType {
    fn transform(&mut self) {
        use SomeType::*;

        *self = match self {
            VariantA(s) => {
                let s = std::mem::replace(s, String::new());
                VariantB(s, 0)
            }
            VariantB(s, i) => {
                let s = std::mem::replace(s, String::new());
                VariantB(s, 2 * *i)
            }
        };
    }
}

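In general, yes, you have to create some dummy value to do this generically and with safe code. Many times, you can wrap your whole element in an Option and call Option::take to achieve the same effect.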
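A minimal sketch of the Option approach, assuming the vector can be declared as Vec<Option<SomeType>>. The helper name transform_all is hypothetical; transform is the by-value function from the question.

```rust
#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

fn transform(x: SomeType) -> SomeType {
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}

// Hypothetical helper: every slot is an Option, so `None` serves as the
// temporary "hole" while the by-value `transform` runs.
fn transform_all(data: &mut [Option<SomeType>]) {
    for slot in data.iter_mut() {
        // `take` moves the value out of the slot and leaves `None` behind.
        if let Some(old) = slot.take() {
            *slot = Some(transform(old));
        }
    }
}

fn main() {
    let mut data = vec![
        Some(SomeType::VariantA("hello".to_string())),
        Some(SomeType::VariantB("asdf".to_string(), 34)),
    ];
    transform_all(&mut data);
    assert_eq!(data[0], Some(SomeType::VariantB("hello".to_string(), 0)));
    assert_eq!(data[1], Some(SomeType::VariantB("asdf".to_string(), 68)));
}
```

The trade-off is similar to the Dummy variant: the rest of the code has to handle None, but the "hole" state lives in Option rather than in SomeType itself.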

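See also:

Change enum variant while moving the field to the new variant

Why is it so complicated?

See this proposed and now-closed RFC for lots of related discussion. My understanding of that RFC (and the complexities behind it) is that there is a period of time during which your value would be undefined, which is not safe. If a panic happened at that exact moment, then dropping your value could trigger undefined behavior, which is a bad thing.

If your code panics at the commented line, the value of self is a concrete, known value. If it were some unknown value, dropping that String would try to drop that unknown value, and we are back in C. That is the purpose of the Dummy value: to always have a known-good value stored.

You even hinted at this (emphasis mine):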

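"I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop"

That "should" is the problem. During a panic, that dummy value is visible.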
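The following sketch shows the dummy becoming observable. It assumes a transform that panics on the made-up input "boom" and uses std::panic::catch_unwind so the vector can be inspected afterwards; the helper name transform_all is hypothetical.

```rust
use std::mem;
use std::panic::{self, AssertUnwindSafe};

#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
    Dummy,
}

// Hypothetical transform that panics on one particular input.
fn transform(x: SomeType) -> SomeType {
    match x {
        SomeType::VariantA(s) => {
            if s == "boom" {
                panic!("transform failed");
            }
            SomeType::VariantB(s, 0)
        }
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
        SomeType::Dummy => SomeType::Dummy,
    }
}

// The workaround loop from the question, wrapped in a function.
fn transform_all(data: &mut [SomeType]) {
    for x in data.iter_mut() {
        let original = mem::replace(x, SomeType::Dummy);
        *x = transform(original);
    }
}

fn main() {
    let mut data = vec![
        SomeType::VariantA("hello".to_string()),
        SomeType::VariantA("boom".to_string()),
    ];

    // Catch the unwind so the vector can be inspected after the panic.
    let result = panic::catch_unwind(AssertUnwindSafe(|| transform_all(&mut data)));

    assert!(result.is_err());
    assert_eq!(data[0], SomeType::VariantB("hello".to_string(), 0));
    // The slot that was being transformed when the panic hit holds the dummy.
    assert_eq!(data[1], SomeType::Dummy);
}
```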

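See also:

The now-removed implementation of Vec::map_in_place spans almost 175 lines of code, most of which deals with unsafe code and the reasoning for why it is actually safe! Some crates have re-implemented this concept and attempted to make it safe; you can see an example in Sebastian Redl's answer.

【Comments】:

【Answer 3】:

You can write a map_in_place in terms of the take_mut or replace_with crates: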

fn map_in_place<T, F>(v: &mut [T], f: F)
where
    F: Fn(T) -> T,
{
    for e in v {
        // Pass `&f` so the closure is not moved on the first iteration.
        take_mut::take(e, &f);
    }
}

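However, if the supplied function panics, the program aborts completely; you cannot recover from the panic.

Alternatively, you could supply a placeholder element that sits in the empty slot while the inner function executes: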

use std::mem;

fn map_in_place_with_placeholder<T, F>(v: &mut [T], f: F, mut placeholder: T)
where
    F: Fn(T) -> T,
{
    for e in v {
        let mut tmp = mem::replace(e, placeholder);
        tmp = f(tmp);
        placeholder = mem::replace(e, tmp);
    }
}

If there is a panic, the placeholder you supplied will sit in the slot that was being processed when the panic occurred.
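As a usage sketch, assuming an empty String is an acceptable placeholder (the function is repeated so the example compiles on its own):

```rust
use std::mem;

// Same function as above, repeated so this sketch is self-contained.
fn map_in_place_with_placeholder<T, F>(v: &mut [T], f: F, mut placeholder: T)
where
    F: Fn(T) -> T,
{
    for e in v {
        let mut tmp = mem::replace(e, placeholder);
        tmp = f(tmp);
        // Put the result back and recover the placeholder for the next slot.
        placeholder = mem::replace(e, tmp);
    }
}

fn main() {
    let mut words = vec!["hello".to_string(), "bye".to_string()];

    // An empty String is a cheap placeholder: creating it does not allocate.
    map_in_place_with_placeholder(
        &mut words,
        |mut s| {
            s.push('!');
            s
        },
        String::new(),
    );

    assert_eq!(words, ["hello!", "bye!"]);
}
```

Only one placeholder value is ever needed, because it is swapped back out of each slot once the result is stored.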

Finally, you could produce the placeholder on demand; basically, replace take_mut::take with take_mut::take_or_recover in the first version.
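For comparison, here is a std-only sketch of the on-demand idea. It is not take_mut::take_or_recover: it assumes T implements Default and creates a default value for every element, not only when a panic occurs. The name map_in_place_default is made up for illustration.

```rust
use std::mem;

// Hypothetical std-only variant: the placeholder comes from `Default`,
// so the caller does not have to pass one in.
fn map_in_place_default<T, F>(v: &mut [T], f: F)
where
    T: Default,
    F: Fn(T) -> T,
{
    for e in v {
        // `mem::take` leaves `T::default()` in the slot while `f` runs.
        let old = mem::take(e);
        *e = f(old);
    }
}

fn main() {
    let mut data = vec![vec![1, 2], vec![3]];
    map_in_place_default(&mut data, |mut inner| {
        inner.push(0);
        inner
    });
    assert_eq!(data, [vec![1, 2, 0], vec![3, 0]]);
}
```

If f panics, the affected slot is left holding T::default(), which is a valid value that can be dropped safely.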

【Comments】: