[Question]: Spark streaming Elasticsearch dependencies
[Posted]: 2015-05-12 23:51:17
[Description]:

I am trying to integrate Spark and Elasticsearch in Scala, as described in the Elasticsearch Guide.

I am running into dependency resolution errors at compile time:

[trace] Stack trace suppressed: run last *:update for the full output.
[error] (*:update) sbt.ResolveException: unresolved dependency: cascading#cascading-local;2.5.6: not found
[error] unresolved dependency: clj-time#clj-time;0.4.1: not found
[error] unresolved dependency: compojure#compojure;1.1.3: not found
[error] unresolved dependency: hiccup#hiccup;0.3.6: not found
[error] unresolved dependency: ring#ring-devel;0.3.11: not found
[error] unresolved dependency: ring#ring-jetty-adapter;0.3.11: not found
[error] unresolved dependency: com.twitter#carbonite;1.4.0: not found
[error] unresolved dependency: cascading#cascading-hadoop;2.5.6: not found
[error] Total time: 86 s, completed 19 nov. 2014 08:42:58

My build.sbt file looks like this:

name := "twitter-sparkstreaming-elasticsearch"

version := "0.0.1"

scalaVersion := "2.10.4"

// additional libraries
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.1.0",
  "org.apache.spark" %% "spark-streaming" % "1.1.0",
  "org.apache.spark" %% "spark-streaming-twitter" % "1.1.0",
  "org.elasticsearch" % "elasticsearch-hadoop" % "2.1.0"
)

Any help? Thanks.

[Discussion]:

    Tags: scala twitter elasticsearch streaming apache-spark


    [Solution 1]:

    Cascading and its dependencies are not available in Maven Central but live in their own repositories (which es-hadoop cannot specify through its pom).

    I solved this by using elasticsearch-spark_2.10 instead:

    http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/install.html
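A minimal build.sbt sketch of this approach, reusing the question's project settings; the elasticsearch-spark_2.10 artifact and the 2.1.0.Beta3 version follow the install guide linked above, so check for the latest release when you try it:

```scala
// build.sbt — sketch: swap elasticsearch-hadoop for the Spark-only artifact
name := "twitter-sparkstreaming-elasticsearch"

version := "0.0.1"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.1.0",
  "org.apache.spark" %% "spark-streaming" % "1.1.0",
  "org.apache.spark" %% "spark-streaming-twitter" % "1.1.0",
  // Spark-only binary: it does not drag in the Cascading/Storm transitive
  // dependencies, so no extra resolvers are required
  "org.elasticsearch" % "elasticsearch-spark_2.10" % "2.1.0.Beta3"
)
```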

    [Discussion]:

      [Solution 2]:

      Sbt cannot resolve some of the dependencies because they are not part of the Maven repository. However, you can find them on clojars and conjars. You need to add the following lines so that sbt can resolve them:

      resolvers += "clojars" at "https://clojars.org/repo"
      resolvers += "conjars" at "http://conjars.org/repo"
      

      Also, the elasticsearch-hadoop "2.1.0" dependency does not exist (yet?); you should use "2.1.0.Beta4" (or whatever the latest version is when you read this).

      Your sbt file should look like this:

      name := "twitter-sparkstreaming-elasticsearch"
      
      version := "0.0.1"
      
      scalaVersion := "2.10.4"
      
      libraryDependencies ++= Seq(
          "org.apache.spark" %% "spark-core" % "1.1.0",
          "org.apache.spark" %% "spark-streaming" % "1.1.0",
          "org.apache.spark" %% "spark-streaming-twitter" % "1.1.0",
          "org.elasticsearch" % "elasticsearch-hadoop" % "2.1.0.Beta4"
      )
      
      resolvers += "clojars" at "https://clojars.org/repo"
      resolvers += "conjars" at "http://conjars.org/repo"
      

      This has been tested (with spark-core 1.3.1 and without spark-streaming, but it should work for you). Hope this helps.

      [Discussion]:

        [Solution 3]:

        This is because Cascading and its dependencies are not in Maven. You have to add a resolver in order to fetch them.

        Add this line to your build.sbt:

        resolvers += "conjars.org" at "http://conjars.org/repo"
        

        Your build.sbt should look like this:

        name := "twitter-sparkstreaming-elasticsearch"
        
        version := "0.0.1"
        
        scalaVersion := "2.10.4"
        
        // additional libraries
        libraryDependencies ++= Seq(
          "org.apache.spark" %% "spark-core" % "1.1.0",
          "org.apache.spark" %% "spark-streaming" % "1.1.0",
          "org.apache.spark" %% "spark-streaming-twitter" % "1.1.0",
          "org.elasticsearch" % "elasticsearch-hadoop" % "2.1.0"
        )
        
        resolvers += "conjars.org" at "http://conjars.org/repo"
        

        Note: this issue was raised and closed at https://github.com/elasticsearch/elasticsearch-hadoop/issues/304, with the same solution as above.

        [Discussion]:

          [Solution 4]:

          "org.elasticsearch" % "elasticsearch-hadoop" % "2.1.0" is not available yet. You can use "org.elasticsearch" % "elasticsearch-hadoop" % "2.1.0.Beta2" instead.

          See here:

          EDIT 1:

          Even after using "org.elasticsearch" % "elasticsearch-hadoop" % "2.1.0.Beta2", I got the same errors:

          [info] Resolving org.fusesource.jansi#jansi;1.4 ...
          [warn]  ::::::::::::::::::::::::::::::::::::::::::::::
          [warn]  ::          UNRESOLVED DEPENDENCIES         ::
          [warn]  ::::::::::::::::::::::::::::::::::::::::::::::
          [warn]  :: cascading#cascading-local;2.5.6: not found
          [warn]  :: clj-time#clj-time;0.4.1: not found
          [warn]  :: compojure#compojure;1.1.3: not found
          [warn]  :: hiccup#hiccup;0.3.6: not found
          [warn]  :: ring#ring-devel;0.3.11: not found
          [warn]  :: ring#ring-jetty-adapter;0.3.11: not found
          [warn]  :: com.twitter#carbonite;1.4.0: not found
          [warn]  :: cascading#cascading-hadoop;2.5.6: not found
          [warn]  ::::::::::::::::::::::::::::::::::::::::::::::
          [trace] Stack trace suppressed: run last *:update for the full output.
          [error] (*:update) sbt.ResolveException: unresolved dependency: cascading#cascading-local;2.5.6: not found
          [error] unresolved dependency: clj-time#clj-time;0.4.1: not found
          [error] unresolved dependency: compojure#compojure;1.1.3: not found
          [error] unresolved dependency: hiccup#hiccup;0.3.6: not found
          [error] unresolved dependency: ring#ring-devel;0.3.11: not found
          [error] unresolved dependency: ring#ring-jetty-adapter;0.3.11: not found
          [error] unresolved dependency: com.twitter#carbonite;1.4.0: not found
          [error] unresolved dependency: cascading#cascading-hadoop;2.5.6: not found
          [error] Total time: 41 s, completed Nov 19, 2014 8:44:04 PM
          

          My build.sbt looks like this:

          name := "twitter-sparkstreaming-elasticsearch"
          
          version := "0.0.1"
          
          scalaVersion := "2.10.4"
          
          // additional libraries
          libraryDependencies ++= Seq(
            "org.apache.spark" %% "spark-core" % "1.1.0",
            "org.apache.spark" %% "spark-streaming" % "1.1.0",
            "org.apache.spark" %% "spark-streaming-twitter" % "1.1.0",
            "org.elasticsearch" % "elasticsearch-hadoop" % "2.1.0.Beta2",
            "org.elasticsearch" % "elasticsearch-hadoop-cascading" % "2.1.0.Beta2"
          )
          
          resolvers += "sonatype-oss" at "http://oss.sonatype.org/content/repositories/snapshots"
          
          resolvers += "Typesafe Repo" at "http://repo.typesafe.com/typesafe/releases/"
          

          [Discussion]:

            [Solution 5]:

            For Spark-only support, you can use the minimalistic binary instead. Add the following to libraryDependencies in build.sbt:

            "org.elasticsearch" % "elasticsearch-spark_2.10" % "2.1.0.Beta3"
            

            Note: the "2.10" refers to the compatible Scala version!

            and remove

            "org.elasticsearch" % "elasticsearch-hadoop" % "2.1.0.Beta3"
            

            This avoids the unresolved dependencies listed in the question.
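As a side note, since the "_2.10" suffix is just the Scala binary version, sbt can append it for you via the %% operator; a small sketch, assuming the artifact follows sbt's standard cross-version naming (which elasticsearch-spark_2.10 does):

```scala
// %% appends the project's Scala binary version ("_2.10" for scalaVersion
// 2.10.4) to the artifact name, so this resolves to elasticsearch-spark_2.10
libraryDependencies += "org.elasticsearch" %% "elasticsearch-spark" % "2.1.0.Beta3"
```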

            [Discussion]:
