I previously found a great add-on for Hadoop Streaming called "hadoop hbase streaming", which lets you use an HBase table as the input or output format for your Hadoop Streaming MapReduce jobs, but it has not worked since a recent HBase API change.
The error it reported was:
Error: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.io.RowResult
I just found a fork of it on GitHub by David Maust that has been updated for newer versions of HBase.
You can find the fork here:
https://github.com/dmaust/hadoop-hbase-streaming
And the original repository here:
https://github.com/wanpark/hadoop-hbase-streaming
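
If you want to check whether your own setup is affected before switching to the fork, a quick test is to see whether the class named in the error can be loaded from your classpath. This is only a rough sketch: the class name RowResultCheck and the helper isOnClasspath are my own, not part of either project, and you would need to run it with your HBase jars on the classpath for the result to mean anything.

```java
public class RowResultCheck {

    // Hypothetical helper: returns true if the named class can be loaded
    // from the current classpath, false otherwise.
    static boolean isOnClasspath(String className) {
        try {
            // 'false' avoids running static initializers; we only care
            // whether the class can be located and loaded.
            Class.forName(className, false, RowResultCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class named in the error message above.
        String name = "org.apache.hadoop.hbase.io.RowResult";
        System.out.println(name + " available: " + isOnClasspath(name));
    }
}
```

If it reports that the class is not available, the original add-on will most likely fail with the same ClassNotFoundException on that HBase version, and the updated fork is the one to try.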