I got the following exception whenever I ran heavy MapReduce jobs against my HBase tables:
INFO mapred.JobClient: Task Id : attempt_201204240028_0048_m_000015_2, Status : FAILED
org.apache.hadoop.hbase.regionserver.LeaseException: org.apache.hadoop.hbase.regionserver.LeaseException: lease '-8170712910260484725' does not exist
    at org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:230)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1879)
    at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    ...
More often than not, the failures were severe enough to cause the entire job to fail. This indicated that I needed to raise hbase.regionserver.lease.period, which controls how long a scanner's lease lives between calls to scanner.next(). That alone didn't help, however: apparently you also need to raise hbase.rpc.timeout. (The exception that pointed to this was only logged at DEBUG level, so it took me a while to realise that.)
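To picture why a slow map task trips this error, here is a toy model of a scanner lease in plain Java. It is only a sketch of the idea described above, not HBase's actual Leases class: the class name ToyLeases, the scanner id, and the 60 000 ms and 5-minute figures are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of scanner leases (hypothetical, NOT the HBase implementation). */
public class ToyLeases {
    private final long periodMs;
    // scanner id -> time (ms) at which its lease expires
    private final Map<String, Long> expiry = new HashMap<>();

    public ToyLeases(long periodMs) {
        this.periodMs = periodMs;
    }

    /** Opens a scanner and starts its lease. */
    public void open(String scannerId, long nowMs) {
        expiry.put(scannerId, nowMs + periodMs);
    }

    /**
     * Simulates one next() call. Returns true and renews the lease if it is
     * still alive; returns false if it has expired (the point where the real
     * region server reports that the lease does not exist).
     */
    public boolean next(String scannerId, long nowMs) {
        Long deadline = expiry.get(scannerId);
        if (deadline == null || nowMs > deadline) {
            expiry.remove(scannerId);
            return false;
        }
        expiry.put(scannerId, nowMs + periodMs);
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical: the mapper spends 5 minutes processing one batch of rows.
        long gapBetweenNextCallsMs = 5 * 60 * 1000;

        ToyLeases shortLease = new ToyLeases(60_000);   // illustrative short period
        shortLease.open("scanner-1", 0);
        System.out.println("short lease survives: "
                + shortLease.next("scanner-1", gapBetweenNextCallsMs));

        ToyLeases longLease = new ToyLeases(900_000);   // the 15 minutes used in this post
        longLease.open("scanner-1", 0);
        System.out.println("long lease survives: "
                + longLease.next("scanner-1", gapBetweenNextCallsMs));
    }
}
```

The point of the model is that the lease is renewed only by next(), so what matters is the longest gap between two consecutive calls, not the total runtime of the job.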
So, adding the following to hbase-site.xml solved it:
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>900000</value> <!-- 900 000, 15 minutes -->
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>900000</value> <!-- 15 minutes -->
</property>
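Since forgetting one of the two settings cost me a lot of time, a small sanity check can be handy. The following is a minimal sketch using only the JDK's XML parser, not any HBase API. The class name HBaseTimeoutCheck is hypothetical, and the rule it enforces (both properties present, RPC timeout at least as large as the lease period) is my own assumption based on the values above, not an official HBase requirement.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper that checks the two timeouts in an hbase-site.xml string. */
public class HBaseTimeoutCheck {

    /** Reads <property><name/><value/></property> entries into a map. */
    public static Map<String, String> parseProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new HashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            // Malformed XML (for example a broken closing tag) ends up here.
            throw new IllegalArgumentException("could not parse configuration", e);
        }
    }

    /** Assumed rule: both timeouts are set and the RPC timeout is not shorter than the lease. */
    public static boolean timeoutsConsistent(Map<String, String> props) {
        String lease = props.get("hbase.regionserver.lease.period");
        String rpc = props.get("hbase.rpc.timeout");
        if (lease == null || rpc == null) {
            return false;
        }
        return Long.parseLong(rpc) >= Long.parseLong(lease);
    }

    public static void main(String[] args) {
        String xml = "<configuration>"
                + "<property><name>hbase.regionserver.lease.period</name><value>900000</value></property>"
                + "<property><name>hbase.rpc.timeout</name><value>900000</value></property>"
                + "</configuration>";
        System.out.println("timeouts consistent: " + timeoutsConsistent(parseProperties(xml)));
    }
}
```

In practice you would read the real hbase-site.xml from disk instead of the inline string. Keep in mind that the region servers normally have to be restarted before changes to that file take effect.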