【Posted】:2019-07-06 21:12:44
【Question】:
I am reading about Cassandra's flush strategy and came across the following statement:
If the data to be flushed exceeds the memtable_cleanup_threshold, Cassandra blocks writes until the next flush succeeds.
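My question is: suppose we are writing heavily to Cassandra, at roughly 10K records per second, and the application runs 24*7. What values should we set for the following parameters to avoid blocked writes?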
memtable_heap_space_in_mb
memtable_offheap_space_in_mb
memtable_cleanup_threshold
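For reference, the cassandra.yaml documentation for the 3.x line describes how these three settings interact. The heap and off-heap allowances are added together and multiplied by memtable_cleanup_threshold. Once the non-flushing memtables exceed that amount, the largest memtable is flushed. The threshold defaults to 1 / (memtable_flush_writers + 1). A minimal sketch of that arithmetic follows; the 2048 MB allowances and the 2 flush writers are placeholder values, not recommendations, and the function names are hypothetical.

```python
def default_cleanup_threshold(memtable_flush_writers: int) -> float:
    """Documented default: 1 / (memtable_flush_writers + 1)."""
    return 1.0 / (memtable_flush_writers + 1)


def flush_trigger_mb(heap_space_mb: float, offheap_space_mb: float,
                     cleanup_threshold: float) -> float:
    """Memtable usage (MB) above which the largest memtable is flushed."""
    return (heap_space_mb + offheap_space_mb) * cleanup_threshold


# Placeholder inputs (assumptions, not tuned values for this cluster).
threshold = default_cleanup_threshold(memtable_flush_writers=2)
trigger = flush_trigger_mb(heap_space_mb=2048, offheap_space_mb=2048,
                           cleanup_threshold=threshold)
print(f"cleanup threshold = {threshold:.3f}, flush trigger = {trigger:.0f} MB")
```

Under this reading, raising the memtable space or the number of flush writers changes when flushes start; it does not by itself make the disk absorb flushes faster.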
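Also, since this is time-series data, do I need to change the compaction strategy as well? If so, which one would suit my case best?
My Spark application, which reads data from Kafka and continuously inserts it into Cassandra, hangs after a certain time. When I analyzed it at that point, nodetool compactionstats showed a lot of pending tasks.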
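For time-series tables, one commonly suggested option is TimeWindowCompactionStrategy (TWCS), which was added in Cassandra 3.0.8 and 3.8, so it is worth checking against the cluster version. The sketch below only builds the CQL text and does not connect to a cluster. The 30-day retention and the target of about 30 windows are assumptions for illustration (the question does not state a TTL), and the helper name is hypothetical.

```python
import math


def twcs_alter_statement(keyspace: str, table: str, retention_days: int,
                         target_windows: int = 30) -> str:
    """Build an ALTER TABLE statement that switches a table to TWCS.

    The window size is chosen so that the retention period spans roughly
    `target_windows` windows (an assumed rule of thumb, not a hard limit).
    """
    window_days = max(1, math.ceil(retention_days / target_windows))
    return (
        f"ALTER TABLE {keyspace}.{table} WITH compaction = {{"
        f"'class': 'TimeWindowCompactionStrategy', "
        f"'compaction_window_unit': 'DAYS', "
        f"'compaction_window_size': '{window_days}'}};"
    )


# Keyspace and table names come from the nodetool output in this post;
# retention_days=30 is a placeholder.
stmt = twcs_alter_statement("trackfleet_db", "locationinfo", retention_days=30)
print(stmt)
```

Whether this helps depends on the data being written mostly in time order and expiring via TTL; it is a starting point to test, not a guaranteed fix for the pending compactions.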
nodetool tablehistograms
%     SSTables  WL          RL         P Size   Cell Count
                (ms)        (ms)       (bytes)
50%   642.00    88.15       25109.16   310      24
75%   770.00    263.21      668489.53  535      50
95%   770.00    4055.27     668489.53  3311     310
98%   770.00    8409.01     668489.53  73457    6866
99%   770.00    12108.97    668489.53  219342   20501
Min   4.00      11.87       20924.30   150      9
Max   770.00    1996099.05  668489.53  4866323  454826
Keyspace : trackfleet_db
Read Count: 7183347
Read Latency: 15.153115504235004 ms
Write Count: 2402229293
Write Latency: 0.7495135263492935 ms
Pending Flushes: 1
Table: locationinfo
SSTable count: 3307
Space used (live): 62736956804
Space used (total): 62736956804
Space used by snapshots (total): 10469827269
Off heap memory used (total): 56708763
SSTable Compression Ratio: 0.38214618375483633
Number of partitions (estimate): 493571
Memtable cell count: 2089
Memtable data size: 1168808
Memtable off heap memory used: 0
Memtable switch count: 88033
Local read count: 765497
Local read latency: 162.880 ms
Local write count: 782044138
Local write latency: 1.859 ms
Pending flushes: 0
Percent repaired: 0.0
Bloom filter false positives: 368
Bloom filter false ratio: 0.00000
Bloom filter space used: 29158176
Bloom filter off heap memory used: 29104216
Index summary off heap memory used: 7883835
Compression metadata off heap memory used: 19720712
Compacted partition minimum bytes: 150
Compacted partition maximum bytes: 4866323
Compacted partition mean bytes: 7626
Average live cells per slice (last five minutes): 3.5
Maximum live cells per slice (last five minutes): 6
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1
Dropped Mutations: 359
After changing the compaction strategy:
Keyspace : trackfleet_db
Read Count: 8568544
Read Latency: 15.943608060365916 ms
Write Count: 2568676920
Write Latency: 0.8019530641630868 ms
Pending Flushes: 1
Table: locationinfo
SSTable count: 5843
SSTables in each level: [5842/4, 0, 0, 0, 0, 0, 0, 0, 0]
Space used (live): 71317936302
Space used (total): 71317936302
Space used by snapshots (total): 10469827269
Off heap memory used (total): 105205165
SSTable Compression Ratio: 0.3889946058934169
Number of partitions (estimate): 542002
Memtable cell count: 235
Memtable data size: 131501
Memtable off heap memory used: 0
Memtable switch count: 93947
Local read count: 768148
Local read latency: NaN ms
Local write count: 839003671
Local write latency: 1.127 ms
Pending flushes: 1
Percent repaired: 0.0
Bloom filter false positives: 1345
Bloom filter false ratio: 0.00000
Bloom filter space used: 54904960
Bloom filter off heap memory used: 55402400
Index summary off heap memory used: 14884149
Compression metadata off heap memory used: 34918616
Compacted partition minimum bytes: 150
Compacted partition maximum bytes: 4866323
Compacted partition mean bytes: 4478
Average live cells per slice (last five minutes): NaN
Maximum live cells per slice (last five minutes): 0
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0
Dropped Mutations: 660
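The two tablestats snapshots above can be compared with a little arithmetic. The sketch below hard-codes the SSTable counts and live sizes from the output and derives the mean SSTable size. It reads the `5842/4` entry as "SSTables present / expected maximum" for level 0, which is how nodetool marks an over-full level under LeveledCompactionStrategy. That the new strategy is LCS is an inference from the `SSTables in each level` line, not something stated in the post, and the function names are hypothetical.

```python
def mean_sstable_mb(space_used_bytes: int, sstable_count: int) -> float:
    """Average on-disk SSTable size in MB (1 MB = 1,000,000 bytes)."""
    return space_used_bytes / sstable_count / 1_000_000


def parse_levels(line: str):
    """Parse 'SSTables in each level: [a/b, c, ...]' into (present, limit) pairs.

    `limit` is None when the level is within its expected size and nodetool
    prints only a single number.
    """
    inner = line.split("[", 1)[1].rsplit("]", 1)[0]
    levels = []
    for item in inner.split(","):
        present, _, limit = item.strip().partition("/")
        levels.append((int(present), int(limit) if limit else None))
    return levels


# Values copied from the two tablestats outputs in this post.
before = mean_sstable_mb(62736956804, 3307)
after = mean_sstable_mb(71317936302, 5843)
levels = parse_levels("SSTables in each level: [5842/4, 0, 0, 0, 0, 0, 0, 0, 0]")
l0_present, l0_limit = levels[0]

print(f"mean SSTable size before: {before:.2f} MB, after: {after:.2f} MB")
print(f"level 0 holds {l0_present} SSTables (expected at most {l0_limit})")
```

Read this way, almost every SSTable is still waiting in level 0 after the change, which is consistent with the pending compaction tasks mentioned earlier.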
Thanks,
【Comments】:
-
Can you add nodetool tablestats for your table?
Tags: scala cassandra spark-streaming cassandra-3.0