Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
569 views
in Technique[技术] by (71.8m points)

hadoop - Hbase quickly count number of rows

Right now I implement row count over ResultScanner like this

for (Result rs = scanner.next(); rs != null; rs = scanner.next()) {
    number++;
}

If data reaching millions time computing is large.I want to compute in real time that i don't want to use Mapreduce

How to quickly count number of rows.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Use RowCounter in HBase RowCounter is a mapreduce job to count all the rows of a table. This is a good utility to use as a sanity check to ensure that HBase can read all the blocks of a table if there are any concerns of metadata inconsistency. It will run the mapreduce all in a single process but it will run faster if you have a MapReduce cluster in place for it to exploit.

$ hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename>

Usage: RowCounter [options] 
    <tablename> [          
        --starttime=[start] 
        --endtime=[end] 
        [--range=[startKey],[endKey]] 
        [<column1> <column2>...]
    ]

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

56.9k users

...