Serializing a case class with a lazy val causes a StackOverflow: answers

【Question title】: Serializing a case class with a lazy val causes a StackOverflow
【Posted】: 2021-03-01 17:28:12
【Question description】:

Suppose I define the following case class:

case class C(i: Int) {
    lazy val incremented = copy(i = i + 1)
}

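and then try to serialize it to JSON:

import java.io.StringWriter
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule

val mapper = new ObjectMapper()
mapper.registerModule(DefaultScalaModule)
val out = new StringWriter
mapper.writeValue(out, C(4))
val json = out.toString()
println("Json is: " + json)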

The following exception is thrown:

Exception in thread "main" com.fasterxml.jackson.databind.JsonMappingException: Infinite recursion (StackOverflowError) (through reference chain: C["incremented"]->C["incremented"]->C["incremented"]->C["incremented"]->C["incremented"]->C["incremented"]->...

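I don't understand why it tries to serialize the lazy val by default in the first place. That does not seem like a logical approach to me.

Can I disable this behavior?

【Question comments】:

If you are not limited to jackson-module-scala, try jsoniter-scala. It doesn't serialize fields which are not defined in the constructor.
I am using this within akka, so if it can be integrated with akka it is well worth a try.
That said, this is not a problem when using Json4s with Jackson. So if serialization of lazy vals cannot be disabled, the better option may be to use Json4s as a custom serializer in akka.
You can try the @transient annotation.
I would simply suggest using a real Scala library rather than a Java one. For example, circe supports this out of the box.

Tags: scala jackson-databind akka-cluster json4s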

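【Solution 1】:

You cannot serialize that class because the value is infinitely recursive (hence the stack overflow). Specifically, the value of incremented for C(4) is an instance of C(5). For C(5), the value of incremented is C(6). For C(6), it is C(7), and so on.

Since an instance of C(n) contains an instance of C(n+1), it can never be fully serialized.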
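To make the chain concrete, here is a minimal sketch in plain Scala (no Jackson involved). The `ChainDemo` object and its `follow` helper are hypothetical names introduced only for illustration. The helper stops after a fixed number of steps, whereas a serializer that treats `incremented` as a property has no such bound.

```scala
// Same shape as the class in the question.
case class C(i: Int) {
  lazy val incremented: C = copy(i = i + 1)
}

object ChainDemo {
  // Follow `incremented` a fixed number of times. Each step forces one more
  // lazy val, which creates one more instance that has its own lazy val.
  def follow(start: C, steps: Int): C =
    (1 to steps).foldLeft(start)((c, _) => c.incremented)

  def main(args: Array[String]): Unit = {
    println(follow(C(4), 3)) // prints C(7)
  }
}
```

Every instance reached this way exposes yet another `incremented`, so a traversal without a step limit never reaches a leaf.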

If you don't want a member to appear in the JSON, make it a method (def) instead of a val:

case class C(i: Int) {
    def incremented = copy(i = i + 1)
}

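The root of this problem is trying to serialize a class that also implements application logic, which violates separation of concerns (closely related to the single-responsibility principle, the S in SOLID).

It is better to have separate classes for serialization and to populate them from the application data as needed. This allows different forms of serialization to be used without changing the application logic.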
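One way this separation could look is sketched below. `CDto`, `fromDomain`, and `toDomain` are hypothetical names, not part of the original answer. The idea is that only `CDto`, which has no members beyond its constructor parameter, is ever handed to the serializer.

```scala
// Domain class: carries logic (the cached lazy val) and stays inside the application.
case class C(i: Int) {
  lazy val incremented: C = copy(i = i + 1)
}

// Hypothetical wire-format class: plain data only, nothing derived.
case class CDto(i: Int)

object CDto {
  // Convert at the boundary, just before serializing.
  def fromDomain(c: C): CDto = CDto(c.i)

  // Convert back after deserializing.
  def toDomain(dto: CDto): C = C(dto.i)
}
```

The cost is an extra conversion at the serialization boundary; the benefit is that the domain class can keep its lazy val without any serializer-specific annotations.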

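【Discussion】:

I understand why there is an overflow. I just don't understand why it is serializing lazy vals at all. Never mind that. It defeats the whole purpose of lazy vals, and by converting to a function I lose all the benefit in my code. My question is: can I disable the serialization of lazy vals?
@user79074 Whether lazy vals get serialized is not the point. The point is that you should serialize only pure data objects, not application logic. Stop trying to serialize code and the problem goes away.

【Solution 2】:

The solution I found was to use json4s for serialization instead of jackson-databind. My problem arose when using akka cluster, so I had to add a custom serializer to my project. My complete implementation follows for reference: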

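// The imports were not part of the original post and are assumptions,
// in particular the choice of the json4s Jackson backend for read/write.
import java.nio.charset.StandardCharsets

import akka.actor.ExtendedActorSystem
import akka.actor.typed.{ActorRef, ActorRefResolver}
import akka.actor.typed.scaladsl.adapter._
import akka.serialization.Serializer
import org.json4s.{CustomSerializer, DefaultFormats, Formats, JString}
import org.json4s.jackson.Serialization.{read, write}
import scala.reflect.ManifestFactory

class Json4sSerializer(system: ExtendedActorSystem) extends Serializer {

    private val actorRefResolver = ActorRefResolver(system.toTyped)

    // Typed ActorRefs are written as, and read back from, their string form.
    object ActorRefSerializer extends CustomSerializer[ActorRef[_]](format => (
        {
            case JString(str) =>
                actorRefResolver.resolveActorRef[AnyRef](str)
        },
        {
            case actorRef: ActorRef[_] =>
                JString(actorRefResolver.toSerializationFormat(actorRef))
        }
    ))

    implicit private val formats: Formats = DefaultFormats + ActorRefSerializer

    // The manifest (target class) is required by fromBinary below.
    def includeManifest: Boolean = true
    def identifier = 1234567

    def toBinary(obj: AnyRef): Array[Byte] = {
        write(obj).getBytes(StandardCharsets.UTF_8)
    }

    def fromBinary(bytes: Array[Byte], clazz: Option[Class[_]]): AnyRef = clazz match {
        case Some(cls) =>
            read[AnyRef](new String(bytes, StandardCharsets.UTF_8))(formats, ManifestFactory.classType(cls))
        case None =>
            throw new RuntimeException("Specified includeManifest but it was never passed")
    }
}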

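【Discussion】:

【Solution 3】:

This is because Jackson is designed for Java. Specifically, note that:

Java has no notion of a lazy val.
Java's normal semantics around fields and constructors do not allow fields to be partitioned into "needed for construction" and "derived from construction" (neither of these is a technical term), which is what Scala provides by combining vals in the primary constructor (implicitly present in a case class) with vals in the body of the class.

The consequence of the second point is that (sometimes with the exception of beans) Java-oriented serialization approaches tend to assume that anything that is a field in the object needs to be serialized (including private fields, since the Java idiom is to make fields private by default), with the ability to opt out through the @transient annotation.

The first point, in turn, means that lazy vals are implemented by the compiler in a way that includes a private field.

So to a Java-oriented serializer like Jackson, a lazy val without a @transient annotation gets serialized.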
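The compiler encoding described above can be inspected with plain Java reflection. The sketch below uses only the JDK reflection API; `LazyValLayout` and its two helpers are hypothetical names for illustration. The exact field names depend on the Scala version (for example, the backing field may carry a suffix), so the code only looks for names that start with `incremented`.

```scala
import java.lang.reflect.Modifier

case class C(i: Int) {
  lazy val incremented: C = copy(i = i + 1)
}

object LazyValLayout {
  // Fields the compiler generated for C: the constructor parameter, the
  // storage for the lazy val, and its initialization flag or marker.
  def fieldNames: List[String] =
    classOf[C].getDeclaredFields.toList.map(_.getName)

  // Public zero-argument methods, which look like property accessors
  // to a tool that knows nothing about Scala.
  def accessorNames: List[String] =
    classOf[C].getDeclaredMethods.toList
      .filter(m => m.getParameterCount == 0 && Modifier.isPublic(m.getModifiers))
      .map(_.getName)

  def main(args: Array[String]): Unit = {
    println(fieldNames)
    println(accessorNames)
  }
}
```

From the JVM's point of view, `incremented` has both backing storage and a public accessor, just like `i`, which is why a reflection-based serializer has no obvious reason to treat it differently.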

Scala-oriented serialization approaches (e.g. circe, play-json, etc.) tend to serialize case classes by serializing only the constructor parameters.
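The constructor-parameters-only approach can be illustrated without any library. The following is a toy sketch, not how circe or play-json are implemented: it assumes Scala 2.13 or later (for `productElementNames`), handles only numeric and string values, and does no string escaping. `ConstructorOnlyJson` and `toJson` are hypothetical names.

```scala
case class C(i: Int) {
  lazy val incremented: C = copy(i = i + 1)
}

object ConstructorOnlyJson {
  // A case class is a Product whose elements are exactly its constructor
  // parameters, so members declared in the body (such as the lazy val) are
  // never visited here.
  def toJson(p: Product): String = {
    val fields = p.productElementNames.zip(p.productIterator).map {
      case (name, s: String) => "\"" + name + "\":\"" + s + "\"" // no escaping: toy only
      case (name, value)     => "\"" + name + "\":" + value
    }
    fields.mkString("{", ",", "}")
  }

  def main(args: Array[String]): Unit = {
    println(toJson(C(4))) // prints {"i":4}
  }
}
```

Because the traversal is driven by the constructor signature rather than by fields or getters, the recursion through `incremented` never starts.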

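【Discussion】:

As a side note, having a lazy val cache a transformation may be more pain than gain, because the transformed object stays alive for as long as the source object is alive. Since you are using Akka and clustering, the extra memory consumption is very likely (with probability eventually approaching 100%) to cause GC pressure, which will make your cluster fall apart.
If you want to cache the result of an expensive transformation, it is almost certainly better to use the sealed abstract case class trick, which lets you override the apply method to keep the created instances in a cache (you would also have to provide a reimplementation of copy).
Another option is a clever reimplementation of lazy val that incorporates weak references. To be clear, though, lazy val as a performance optimization is not only not free but quite expensive, and it is often not a net gain.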
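The sealed abstract case class trick mentioned in the comments could look roughly like the sketch below, written with Scala 2.13 in mind. Because the class is abstract, the compiler does not synthesize `apply` or `copy`, so both are supplied by hand. The unbounded `TrieMap` cache is an assumption made for brevity; a real cache would need an eviction policy or weak references to avoid the memory problem the comments warn about.

```scala
import scala.collection.concurrent.TrieMap

sealed abstract case class C private (i: Int) {
  // Derived values go through the companion, so they come from the cache.
  def incremented: C = C(i + 1)

  // Hand-written replacement for the copy method that is not generated.
  def copy(i: Int = this.i): C = C(i)
}

object C {
  // Illustration only: this cache grows without bound.
  private val cache = TrieMap.empty[Int, C]

  // The only way to obtain an instance; equal arguments share one instance.
  def apply(i: Int): C = cache.getOrElseUpdate(i, new C(i) {})
}
```

With this shape, `incremented` is a plain method, so a Java-oriented serializer sees no extra field to descend into, while repeated calls still reuse cached instances.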