【问题标题】:Google secret manager API and google storage API not working with Apache Spark谷歌秘密管理器 API 和谷歌存储 API 不适用于 Apache Spark
【发布时间】:2021-09-25 12:15:00
【问题描述】:

StakOverflowers 同学您好,我目前正在开发一个使用 Scala 编写的基于 Apache Spark 的应用程序。我使用 Maven 作为构建工具。该应用程序运行良好,直到我有一个需要从 Google Secrets 管理器中提取一些秘密的用例。为此,我使用了 secret manager API,并使用了 maven 依赖项来导入 JAR。

当我在本地 (Intellij) 上运行该代码时,它似乎工作正常,但一旦我将它部署到我的 Spark 集群上,它就会开始失败并出现以下错误:

Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/util/concurrent/internal/InternalFutureFailureAccess
    at java.base/java.lang.ClassLoader.defineClass1(Native Method)
    at java.base/java.lang.ClassLoader.defineClass(Unknown Source)
    at java.base/java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(Unknown Source)
    at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(Unknown Source)
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(Unknown Source)
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.defineClass1(Native Method)
    at java.base/java.lang.ClassLoader.defineClass(Unknown Source)
    at java.base/java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.base/java.net.URLClassLoader.defineClass(Unknown Source)
    at java.base/java.net.URLClassLoader$1.run(Unknown Source)
    at java.base/java.net.URLClassLoader$1.run(Unknown Source)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.base/java.net.URLClassLoader.findClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.defineClass1(Native Method)
    at java.base/java.lang.ClassLoader.defineClass(Unknown Source)
    at java.base/java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.base/java.net.URLClassLoader.defineClass(Unknown Source)
    at java.base/java.net.URLClassLoader$1.run(Unknown Source)
    at java.base/java.net.URLClassLoader$1.run(Unknown Source)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.base/java.net.URLClassLoader.findClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at com.google.api.gax.retrying.ScheduledRetryingExecutor.createFuture(ScheduledRetryingExecutor.java:102)
    at com.google.api.gax.rpc.RetryingCallable.futureCall(RetryingCallable.java:61)
    at com.google.api.gax.rpc.RetryingCallable.futureCall(RetryingCallable.java:41)
    at com.google.api.gax.tracing.TracedUnaryCallable.futureCall(TracedUnaryCallable.java:75)
    at com.google.api.gax.rpc.UnaryCallable$1.futureCall(UnaryCallable.java:126)
    at com.google.api.gax.rpc.UnaryCallable.futureCall(UnaryCallable.java:87)
    at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112)
    at com.google.cloud.secretmanager.v1.SecretManagerServiceClient.accessSecretVersion(SecretManagerServiceClient.java:1185)
    at com.google.cloud.secretmanager.v1.SecretManagerServiceClient.accessSecretVersion(SecretManagerServiceClient.java:1125)
    at cl.falabella.symphony.commons.IngestionOracleSparkApplication$.main(IngestionOracleSparkApplication.scala:104)
    at cl.falabella.symphony.commons.IngestionOracleSparkApplication.main(IngestionOracleSparkApplication.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.google.common.util.concurrent.internal.InternalFutureFailureAccess
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)

我知道这是因为与 Guava 的依赖项冲突而发生的,所以我从 $SPARK_HOME/jars 中删除了旧的 guava 版本并将其更改为 guava-30.1.1-JRE,但没有用。

我在此处附上我的 pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.example.ingestion</groupId>
    <artifactId>fwk-ingestion-spark-extraction-core</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <encoding>UTF-8</encoding>
        <scala.version>2.12.10</scala.version>
        <scala-maven-plugin.version>4.0.0</scala-maven-plugin.version>
    </properties>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.7</version>
                <configuration>
                    <skipTests>true</skipTests>
                </configuration>
            </plugin>
            <plugin>
                <!-- see http://davidb.github.com/scala-maven-plugin -->
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>4.3.0</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                        <configuration>
                            <args>
                                <!-- <arg>-make:transitive</arg> -->
                                <arg>-dependencyfile</arg>
                                <arg>${project.build.directory}/.scala_dependencies</arg>
                            </args>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>3.3.0</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <mainClass>com.example.symphony.commons.EncryptApplication</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <groupId>org.scalatest</groupId>
                <artifactId>scalatest-maven-plugin</artifactId>
                <version>1.0</version>
                <configuration>
                    <reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
                    <junitxml>.</junitxml>
                    <filereports>WDF TestSuite.txt</filereports>
                </configuration>
                <executions>
                    <execution>
                        <id>test</id>
                        <goals>
                            <goal>test</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <groupId>org.jacoco</groupId>
                <artifactId>jacoco-maven-plugin</artifactId>
                <version>0.8.5</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>prepare-agent</goal>
                        </goals>
                    </execution>
                    <execution>
                        <id>report</id>
                        <phase>test</phase>
                        <goals>
                            <goal>report</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.1</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <transformers>
                                <transformer
                                        implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer"/>
                            </transformers>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <shadedArtifactAttached>true</shadedArtifactAttached>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

        </plugins>
    </build>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>com.google.cloud</groupId>
                <artifactId>libraries-bom</artifactId>
                <version>20.8.0</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>

        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scalap</artifactId>
            <version>2.12.10</version>
        </dependency>
        <dependency>
            <groupId>com.typesafe.scala-logging</groupId>
            <artifactId>scala-logging_2.12</artifactId>
            <version>3.9.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-avro_2.12</artifactId>
            <version>3.1.1</version>
        </dependency>
        <dependency>
            <groupId>com.google.cloud.bigdataoss</groupId>
            <artifactId>gcs-connector</artifactId>
            <version>hadoop2-1.9.17</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.12</artifactId>
            <version>3.1.1</version>
        </dependency>
        <dependency>
            <groupId>com.google.cloud.bigdataoss</groupId>
            <artifactId>bigquery-connector</artifactId>
            <version>hadoop2-0.13.9</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.security</groupId>
            <artifactId>spring-security-core</artifactId>
            <version>5.2.1.RELEASE</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang.modules</groupId>
            <artifactId>scala-xml_2.12</artifactId>
            <version>1.3.0</version>
        </dependency>
        <dependency>
            <groupId>org.scalactic</groupId>
            <artifactId>scalactic_2.12</artifactId>
            <version>3.2.5</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.scalamock</groupId>
            <artifactId>scalamock-scalatest-support_2.12</artifactId>
            <version>3.6.0</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.scalatest</groupId>
            <artifactId>scalatest_2.12</artifactId>
            <version>3.2.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.12</artifactId>
            <version>3.1.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.13.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-catalyst_2.12</artifactId>
            <version>3.1.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api-scala_2.12</artifactId>
            <version>11.0</version>
        </dependency>
        <dependency>
            <groupId>com.typesafe</groupId>
            <artifactId>config</artifactId>
            <version>1.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.scalacheck</groupId>
            <artifactId>scalacheck_2.12</artifactId>
            <version>1.15.3</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>net.liftweb</groupId>
            <artifactId>lift-json_2.12</artifactId>
            <version>3.4.3</version>
        </dependency>
        <dependency>
            <groupId>com.google.cloud</groupId>
            <artifactId>google-cloud-storage</artifactId>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>8.0.23</version>
        </dependency>
        <dependency>
            <groupId>com.google.cloud</groupId>
            <artifactId>google-cloud-secretmanager</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5.13</version>
        </dependency>
        <dependency>
            <groupId>com.google.code.gson</groupId>
            <artifactId>gson</artifactId>
            <version>2.8.7</version>
        </dependency>



    </dependencies>
</project>

我们将不胜感激任何帮助或领导。另外,我正在使用带阴影的罐子。

【问题讨论】:

    标签: java scala maven apache-spark guava


    【解决方案1】:

    您需要向 google-cloud-secretmanager 添加排除项:

    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>google-cloud-secretmanager</artifactId>
      <exclusions>
        <exclusion>
          <groupId>com.google.api</groupId>
          <artifactId>gax</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    

    并添加以下依赖项:

    <dependency>
      <groupId>com.google.api</groupId>
      <artifactId>gax</artifactId>
      <version>1.66.0</version>
      <classifier>guavashaded</classifier>
    </dependency>
    

    FWIW 前段时间我提交了一个 PR 来升级 Apache Spark 中的 guava,但显然其中的 guava 情况由于其自身的依赖问题而变得复杂。

    【讨论】:

    • 您好@Steve,感谢您的回答。我认为这可能会有所帮助,但我已经通过将 jars 添加到 spark jars 文件夹来解决问题。不过,我有一个问题:为什么 Spark 无法引用与代码一起打包的 jar?即使我使用了阴影插件。
    • 堆栈跟踪中的所有代码看起来都没有阴影,否则包名称会不同
    • 您需要小心地将 jars 添加到 spark/jars 中,特别是如果其中之一是 Guava。这将在最负担不起的情况下破坏 Spark
    猜你喜欢
    • 1970-01-01
    • 2020-11-13
    • 1970-01-01
    • 1970-01-01
    • 2020-09-06
    • 2021-01-05
    • 2014-08-29
    • 2017-09-05
    • 1970-01-01
    相关资源
    最近更新 更多