In the world of NoSQL

My HBase cluster refused to start after upgrading from CDH3 to CDH4. This is a known issue according to the Cloudera documentation, and the workaround is to delete the /hbase ZNode.

— During an upgrade from CDH3 to CDH4, regions in transition may cause HBase startup failures.

Bug: None
Severity: Medium
Anticipated Resolution: To be fixed in a future release.
Workaround: Delete the /hbase ZNode in ZooKeeper before starting up CDH4.

So, to delete the ZNode, I did the following:

[root@hbase1 ~]# /usr/lib/zookeeper/bin/
Connecting to localhost:2181
... log entries
[zk: localhost:2181(CONNECTED) 0] rmr /hbase
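The same deletion can also be scripted rather than done interactively. A sketch, assuming the standard zkCli.sh client that ships with ZooKeeper (the install path varies by distribution, so verify it on your system):

```shell
# Hedged sketch: delete the /hbase ZNode non-interactively.
# zkCli.sh accepts a single command after the -server flag.
/usr/lib/zookeeper/bin/zkCli.sh -server localhost:2181 rmr /hbase
```

This is handy when the cleanup needs to happen as part of an upgrade script instead of a manual session.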

After doing this the cluster started as it should.


§567 · October 2, 2013 · Hadoop, HBase, ZooKeeper

I ran out of space on the server running the NameNode, HBase Master, HBase RegionServer and a DataNode, and during the subsequent restarts the HBase Master wouldn't start.
During log splitting it died with the following error:

2013-07-02 19:52:12,269 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader$WALReaderFSDataInputStream.getPos(
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(
        at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLogAfterStartup(
        at org.apache.hadoop.hbase.master.HMaster.finishInitialization(
2013-07-02 19:52:12,271 INFO org.apache.hadoop.hbase.master.HMaster: Aborting

I found two ways to get it to start up again. The first one I tried was to move the log-splitting directory aside in HDFS with the following command (doing this is strongly discouraged, since it can discard unreplayed edits):

$ hadoop fs -mv /hbase/.logs/,60020,1367325077343-splitting /user/hdfs

After some help from the #hbase IRC channel, I moved it back and tried starting the HBase Master with Java assertions disabled, and that solved the issue.

To disable assertions in the JVM, make sure that the parameter -da (or -disableassertions) is passed to java when it is invoked.

I did this by editing the HBase configuration under /etc/hbase/conf/ and adding -da to the HBASE_MASTER_OPTS environment variable.
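Concretely, the change looks something like this (the file name hbase-env.sh is an assumption based on a standard CDH layout, since the post does not name the file):

```shell
# Append -da (disable assertions) to the HBase Master JVM options.
# Assumed file: /etc/hbase/conf/hbase-env.sh
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -da"
```

Appending to the existing value, rather than overwriting it, preserves any heap or GC flags already set for the Master.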


HBase crashed for me last night, due to the extra leap second inserted (2012-06-30 23:59:60).


When I attempted to restart HBase, it just didn't start. I found a resource with a tip that might have worked to get it up again (although I found it after rebooting my servers, so I didn't try it).


All Java processes (including all HDFS-related ones) were using 100% CPU, together with ksoftirqd. I turned off ntpd autostart (chkconfig ntpd off), rebooted the servers, and then started my HBase cluster back up. This solved the issue.
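For reference, the workaround that circulated at the time for this leap-second bug (which I can't vouch for first-hand, since I rebooted instead) was to stop ntpd and set the clock to itself, which clears the kernel's leap-second state:

```shell
# Commonly reported leap-second workaround (requires root):
# stop ntpd, then re-set the system clock to clear the kernel state.
service ntpd stop
date -s "$(date)"
```

This avoids a full reboot, which matters when the affected machines are serving live traffic.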


§163 · July 1, 2012 · Hadoop, HBase

I found a neat trick to enable a history file for the HBase shell: put the following into ~/.irbrc:

require 'irb/ext/save-history'
IRB.conf[:SAVE_HISTORY] = 100
IRB.conf[:HISTORY_FILE] = "#{ENV['HOME']}/.irb-save-history"

This enabled history saving for me when running irb directly, but it didn't work in the HBase shell, so I also added the following to the end of ~/.irbrc:

Kernel.at_exit do
  IRB.conf[:AT_EXIT].each do |i|
    i.call
  end
end

On CentOS you also need to make sure that the package ruby-irb is installed; on Debian the package is named irb1.8.


§150 · June 28, 2012 · HBase · History in the HBase shell

This is an example of how to import data into HBase with importtsv and completebulkload:

Step 1, run the TSV file through importtsv to create the HFiles:

[root@hbase1 bulktest]# HADOOP_CLASSPATH=$(hbase classpath) sudo -u hdfs -E hadoop jar \
/usr/lib/hbase/hbase-0.90.4-cdh3u3.jar importtsv \
-Dimporttsv.bulk.output=/bulktest-hfiles \
-Dimporttsv.columns=HBASE_ROW_KEY,a:b,a:c bulktest /bulktest-tsv

This will generate HFiles from /bulktest-tsv and store them in /bulktest-hfiles.

I have three columns in the TSV files: the first is the row key, the second is what I want stored in column family a with qualifier b, and the third is stored with qualifier c (this mapping is controlled by importtsv.columns).
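As a hedged illustration (the data values here are made up), a matching tab-separated input row would look like this:

```shell
# Each line: <row key> TAB <value for a:b> TAB <value for a:c>,
# matching -Dimporttsv.columns=HBASE_ROW_KEY,a:b,a:c.
printf 'row1\tvalue-b\tvalue-c\n' > sample.tsv
cat sample.tsv
```

A file like this, uploaded into /bulktest-tsv, is what the importtsv job above consumes.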

After that job is done, you need to change the permissions of /bulktest-hfiles so that the HBase user owns the HFiles, and then run completebulkload so HBase finds the HFiles:

[root@hbase1 bulktest]# sudo -u hdfs hadoop dfs -chown -R hbase /bulktest-hfiles
[root@hbase1 bulktest]# HADOOP_CLASSPATH=$(hbase classpath) sudo -u hdfs -E hadoop jar \
/usr/lib/hbase/hbase-0.90.4-cdh3u3.jar completebulkload /bulktest-hfiles bulktest

HBase should now see the new data. For usage help, run importtsv or completebulkload without any parameters.
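To sanity-check the load afterwards, one option is a bounded scan from the HBase shell (a sketch; the LIMIT syntax is as in the 0.90-era shell):

```shell
# Scan a single row of the bulktest table to confirm the
# bulk-loaded data is visible to HBase.
echo "scan 'bulktest', {LIMIT => 1}" | hbase shell
```
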



§134 · May 23, 2012 · HBase · Bulk load in HBase with importtsv and completebulkload

I've previously found a great add-on to Hadoop Streaming called "hadoop hbase streaming", which enables you to use an HBase table as the input or output format for your Hadoop Streaming MapReduce jobs, but it hasn't been working since a recent API change.

The error was:

Error: java.lang.ClassNotFoundException:

I just found a fork of it on GitHub by David Maust that has been updated for newer versions of HBase.

You can find the fork here:
And the original branch here:


§98 · May 2, 2012 · Hadoop, HBase · Fork of hadoop-hbase-streaming with support for CDH3u3

I got the following exceptions when running heavy MapReduce jobs against my HBase tables:

INFO mapred.JobClient: Task Id : attempt_201204240028_0048_m_000015_2, Status : FAILED
      lease '-8170712910260484725' does not exist
 at org.apache.hadoop.hbase.regionserver.Leases.removeLease(
 at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(
 at java.lang.reflect.Method.invoke(

Most often they were severe enough to cause the entire job to fail. This indicated that I needed to raise the scanner lease period (hbase.regionserver.lease.period), which says how long a scanner lease lives between calls to ResultScanner.next(). This alone didn't help, however: apparently you also need to raise hbase.rpc.timeout (the exception that indicated this was hidden at log level DEBUG, so it took a while to realise that).

So, adding the following to hbase-site.xml solved it:

    <property>
      <name>hbase.regionserver.lease.period</name>
      <value>900000</value> <!-- 900 000 ms, 15 minutes -->
    </property>
    <property>
      <name>hbase.rpc.timeout</name>
      <value>900000</value> <!-- 15 minutes -->
    </property>


§90 · May 2, 2012 · HBase · HBase scanner LeaseException

Recently I have been playing around with HBase for a project that will need to store billions of rows (long scale), with a column count varying from 1 to 1 million. The test data (13.3 million rows, 130.8 million columns) resulted in 27 GB of storage without compression. After activating compression it only took 6.6 GB.

I followed some guides on the net on how to activate LZO (which can't be enabled by default due to license terms), but all the ones I tried had minor faults in them (probably due to version differences).

Anyhow, this is how I did it (assuming Debian or Ubuntu):

apt-get install liblzo2-dev sun-java6-jdk ant
svn checkout hadoop-gpl-compression
cd hadoop-gpl-compression
export CFLAGS="-m64"
export JAVA_HOME=/usr/lib/jvm/java6-sun/
export HBASE_HOME=/path/to/hbase/
ant compile-native
ant jar
cp build/hadoop-gpl-compression-*.jar $HBASE_HOME/lib/
cp build/native/Linux-amd64-64/lib/* /usr/local/lib/
echo "export HBASE_LIBRARY_PATH=/usr/local/lib/" >> $HBASE_HOME/conf/
mkdir -p $HBASE_HOME/build
cp -r build/native $HBASE_HOME/build/native

Then verify that it works with:

./bin/hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/testfile lzo
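Once the CompressionTest passes, LZO can be enabled per column family from the HBase shell. A hedged sketch (the table name mytable and family a are made up here; on HBase versions of this era the table must be disabled before altering):

```shell
# Hypothetical example: enable LZO on column family 'a' of 'mytable'.
# Disable the table first, as required on 0.90-era HBase.
echo "disable 'mytable'
alter 'mytable', {NAME => 'a', COMPRESSION => 'LZO'}
enable 'mytable'" | hbase shell
```

Existing data is rewritten with the new compression gradually, as regions go through compactions.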


§24 · September 12, 2011 · HBase · Activating LZO compression in HBase