Counting the rows in a single Cassandra table
When a Cassandra table holds a large amount of data, select count(*) is essentially unusable. The query times out and returns an error similar to the following:

Cassandra timeout during read query at consistency ONE (1 responses were required but only 0 replica responded)

In this situation the cassandra-count tool can do the counting instead. Note that while it runs, the tool puts varying degrees of load on the CPU and memory of the Cassandra servers, so avoid running count operations against a production cluster where possible. Cassandra is not well suited to count queries.

1. Download the cassandra-count tool from GitHub: brianmhess/cassandra-count (Count rows in Cassandra Table).

2. Run the command below. For very large tables, increase the numSplits value so that the count is spread over more, smaller queries, which helps avoid the read timeout:

    ./cassandra-count -host xx.xx.xx.xx -keyspace ks -table table1 -numSplits 1024

PS: Option reference

| Switch | Option | Default | Description |
|---|---|---|---|
| -host | IP Address | | Cassandra connection point - required. |
| -keyspace | Keyspace Name | | Cassandra keyspace - required. |
| -table | Table Name | | Cassandra table name - required. |
| -configFile | Filename | none | Filename of configuration options |
| -port | Port Number | 9042 | Cassandra native protocol port number |
| -user | Username | none | Cassandra username |
| -pw | Password | none | Cassandra password |
| -ssl-truststore-path | Truststore Path | none | Path to SSL truststore |
| -ssl-truststore-pwd | Truststore Password | none | Password to SSL truststore |
| -ssl-keystore-path | Keystore Path | none | Path to SSL keystore |
| -ssl-keystore-pwd | Keystore Password | none | Password to SSL keystore |
| -consistencyLevel | Consistency Level | LOCAL_ONE | CQL Consistency Level |
| -numSplits | Number of Splits | Number of Token Ranges | Number of splits/queries to create |
| -numFutures | Number of Futures | 1000 | Number of Java driver futures in flight. |
| -splitSize | Size of Split in MB | 16 | Split size in MB |
| -debug | Debug mode | 0 | Debug printing verbosity (0=none, 1=some, 2=verbose) |

PS: If the tool above does not work, try lkycxb/cassandra-count on GitHub (Count rows in Cassandra Table), whose description states "Test data count reached 3 billion."
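To illustrate why a larger numSplits helps, here is a minimal Python sketch of the general idea of counting by token range: the token ring is cut into many small ranges, and each range gets its own bounded COUNT(*) query that is far less likely to time out than one query over the whole table. This is not the actual implementation of cassandra-count. It assumes the cluster uses Murmur3Partitioner (token range -2^63 to 2^63-1), and the partition key column name `id` is hypothetical. The sketch only builds the CQL statements and does not connect to a cluster.

```python
# Sketch only: builds per-token-range COUNT(*) statements.
# Assumes Murmur3Partitioner; the partition key column "id" is hypothetical.

MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_token_ring(num_splits):
    """Return num_splits contiguous, inclusive (start, end) token ranges."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    total = MAX_TOKEN - MIN_TOKEN + 1  # number of tokens on the ring (2**64)
    # Lower bound of each split, plus one exclusive upper bound at the end.
    bounds = [MIN_TOKEN + (total * i) // num_splits for i in range(num_splits)]
    bounds.append(MAX_TOKEN + 1)
    return [(bounds[i], bounds[i + 1] - 1) for i in range(num_splits)]


def count_queries(keyspace, table, partition_key, num_splits):
    """Build one bounded COUNT(*) statement per token range."""
    template = (
        "SELECT COUNT(*) FROM {ks}.{tbl} "
        "WHERE token({pk}) >= {lo} AND token({pk}) <= {hi}"
    )
    return [
        template.format(ks=keyspace, tbl=table, pk=partition_key, lo=lo, hi=hi)
        for lo, hi in split_token_ring(num_splits)
    ]


if __name__ == "__main__":
    # Print the statements for a small split count to show their shape.
    for query in count_queries("ks", "table1", "id", 4):
        print(query)
```

A client would run each statement separately and add up the partial counts. Raising the split count makes each query scan less data, at the price of issuing more queries.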