【发布时间】:2020-01-01 17:10:06
【问题描述】:
我正在使用 spark-sql-2.4.1 ,spark-cassandra-connector_2.11-2.4.1 和 java8。
我正在做如下简单的查询来获取 C* 表的行数。
JavaSparkContext sc = new JavaSparkContext(spark.sparkContext());
long recCount = javaFunctions(sc).cassandraTable(keyspace, columnFamilyName).cassandraCount();
但它正在超时并出现以下错误。
java.io.IOException: Exception during execution of SELECT count(*) FROM "radata"."model_vals" WHERE token("model_id", "type", "value", "code") > ? AND token("model_id", "type", "value", "code") <= ? ALLOW FILTERING: Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.com$datastax$spark$connector$rdd$CassandraTableScanRDD$$fetchTokenRange(CassandraTableScanRDD.scala:350)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$17.apply(CassandraTableScanRDD.scala:367)
Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded)
我正在使用具有以下设置的 Cassandra 6 节点集群:
cassandra.output.consistency.level=ANY
cassandra.concurrent.writes=1500
cassandra.output.batch.size.bytes=2056
cassandra.output.batch.grouping.key=partition
cassandra.output.batch.grouping.buffer.size=3000
cassandra.output.throughput_mb_per_sec=128
cassandra.connection.keep_alive_ms=30000
cassandra.read.timeout_ms=600000
1) 为什么在解释之前附加“ALLOW FILTERING” 实际执行?
2) 即使我设置了“cassandra.output.consistency.level=ANY”为什么会这样 使用“一致性 LOCAL_ONE”执行?
如何解决这些问题?
【问题讨论】:
标签: cassandra apache-spark-sql datastax-java-driver