【问题标题】:Problems while compiling a spark project in eclipse在eclipse中编译spark项目时出现问题
【发布时间】:2015-08-17 13:40:39
【问题描述】:

我已经根据spark website 配置了所有内容,以启动一个简单的 spark 应用程序读取和计算文件中的行数并在另一个文件中显示数字 .但我无法运行该应用程序,因为我遇到了很多错误而且我不明白出了什么问题。

这是我的项目结构:

sparkExamples
|-- pom.xml
`-- src
    |-- main/java/org/sparkExamplex/App.java
`-- resources
          |-- readLine
          |-- outputReadLine

pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.sparkexamples</groupId>
    <artifactId>sparkExamples</artifactId>
    <version>0.0.1-SNAPSHOT</version>

    <name>sparkExamples</name>
    <url>http://maven.apache.org</url>


    <dependencies>
        <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.3.1</version>
        </dependency>
    </dependencies>
</project>

App.java

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;

import scala.Tuple2;

import java.util.Arrays;
import java.util.regex.Pattern;

public final class App {
    private static final Pattern SPACE = Pattern.compile(" ");

    public static void main(String[] args) throws Exception {

        String inputFile = "resources/readLine";
        String outputFile = "resources/outputReadLine";
        // Create a Java Spark Context.
        SparkConf conf = new SparkConf().setAppName("wordCount").setMaster("spark://127.0.0.1:7077");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // Load our input data.
        JavaRDD<String> input = sc.textFile(inputFile);
        // Split up into words.
        JavaRDD<String> words = input.flatMap(new FlatMapFunction<String, String>() {
            public Iterable<String> call(String x) {
                return Arrays.asList(x.split(" "));
            }
        });
        // Transform into word and count.
        JavaPairRDD<String, Integer> counts = words.mapToPair(new PairFunction<String, String, Integer>() {
            public Tuple2<String, Integer> call(String x) {
                return new Tuple2(x, 1);
            }
        }).reduceByKey(new Function2<Integer, Integer, Integer>() {
            public Integer call(Integer x, Integer y) {
                return x + y;
            }
        });
        // Save the word count back out to a text file, causing evaluation.
        counts.saveAsTextFile(outputFile);
    }
}


    }

显示的错误:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/06/03 13:11:46 INFO SparkContext: Running Spark version 1.3.1
15/06/03 13:11:47 INFO SecurityManager: Changing view acls to: Administrator
15/06/03 13:11:47 INFO SecurityManager: Changing modify acls to: Administrator
15/06/03 13:11:47 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Administrator); users with modify permissions: Set(Administrator)
15/06/03 13:11:47 INFO Slf4jLogger: Slf4jLogger started
15/06/03 13:11:47 INFO Remoting: Starting remoting
15/06/03 13:11:47 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@mvettos.romelab.it.ibm.com:9164]
15/06/03 13:11:47 INFO Utils: Successfully started service 'sparkDriver' on port 9164.
15/06/03 13:11:47 INFO SparkEnv: Registering MapOutputTracker
15/06/03 13:11:47 INFO SparkEnv: Registering BlockManagerMaster
15/06/03 13:11:47 INFO DiskBlockManager: Created local directory at C:\Users\ADMINI~1\AppData\Local\Temp\spark-e6bb5cfc-6b96-4105-9a1c-843832ba60f9\blockmgr-dea7bb85-954c-4a4d-b3fb-74d7b6b1d9f5
15/06/03 13:11:47 INFO MemoryStore: MemoryStore started with capacity 467.6 MB
15/06/03 13:11:47 INFO HttpFileServer: HTTP File server directory is C:\Users\ADMINI~1\AppData\Local\Temp\spark-6e11c6bc-2743-4172-8d74-f3abc08d9f46\httpd-2bfa61a2-a1fd-4bd3-85c2-bcbc05d2ec27
15/06/03 13:11:47 INFO HttpServer: Starting HTTP Server
15/06/03 13:11:48 INFO Server: jetty-8.y.z-SNAPSHOT
15/06/03 13:11:48 INFO AbstractConnector: Started SocketConnector@0.0.0.0:9165
15/06/03 13:11:48 INFO Utils: Successfully started service 'HTTP file server' on port 9165.
15/06/03 13:11:48 INFO SparkEnv: Registering OutputCommitCoordinator
15/06/03 13:11:48 INFO Server: jetty-8.y.z-SNAPSHOT
15/06/03 13:11:48 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/06/03 13:11:48 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/06/03 13:11:48 INFO SparkUI: Started SparkUI at http://mvettos.romelab.it.ibm.com:4040
15/06/03 13:11:48 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@127.0.0.1:7077/user/Master...
15/06/03 13:11:49 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@127.0.0.1:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@127.0.0.1:7077
15/06/03 13:11:49 WARN Remoting: Tried to associate with unreachable remote 
address [akka.tcp://sparkMaster@127.0.0.1:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: no further information: /127.0.0.1:7077
15/06/03 13:12:08 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@127.0.0.1:7077/user/Master...
15/06/03 13:12:09 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@127.0.0.1:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@127.0.0.1:7077
15/06/03 13:12:09 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@127.0.0.1:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: no further information: /127.0.0.1:7077
15/06/03 13:12:28 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@127.0.0.1:7077/user/Master...
15/06/03 13:12:29 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@127.0.0.1:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@127.0.0.1:7077
15/06/03 13:12:29 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@127.0.0.1:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: no further information: /127.0.0.1:7077
15/06/03 13:12:48 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
15/06/03 13:12:48 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet.
15/06/03 13:12:48 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.

有人可以告诉我如何计算这个吗?

提前致谢。

【问题讨论】:

  • 您的程序似乎可以正确连接到火花。查看spark over 7077的方向和服务。我认为spark服务没有正常运行。
  • @JTejedor 你能说得更具体点吗?
  • 假设您使用的是 Spark 独立模式。在手册link。您可以通过“spark://HOST:PORT” ping 检查服务是否正常工作。如果不是,问题是因为s​​park的服务宕机了。
  • 我没有下载spark包,我只创建了一个maven项目,我链接了依赖项,我应该下载this吗? p.s 我希望 spark 独立运行,没有任何 hadoop 实例化。
  • 所以你需要安装 spark 独立包(是的,那个链接),因为你的程序现在正在尝试连接到不存在的 spark 服务

标签: java eclipse maven apache-spark


【解决方案1】:

经过几次cmet,我们可以总结出如下答案:

  • 组件配置良好,您可以正确遵循 cloudera 教程步骤。
  • 该应用无法与 Spark 连接,因为您的系统中当前安装了任何 Spark 服务。
  • 所以,您现在需要一个教程来在 Windows 环境中将 Spark 作为独立安装。在stackoverflow中搜索,我发现a question等于这个。

简而言之,这个问题是重复的。 我希望这个答案可以帮助你。

【讨论】:

    猜你喜欢
    • 1970-01-01
    • 2011-04-27
    • 1970-01-01
    • 1970-01-01
    • 2021-12-14
    • 2018-01-12
    • 1970-01-01
    • 1970-01-01
    • 2016-02-27
    相关资源
    最近更新 更多