
I got the following exception when restarting the datanode after it had terminated due to a disk failure (the server itself had not been rebooted):

2013-10-11 11:24:02,122 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain Problem binding to [] Address already in use; For more details see:
	at org.apache.hadoop.ipc.Server.bind(
	at org.apache.hadoop.ipc.Server.bind(
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initDataXceiver(
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(
	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(
2013-10-11 11:24:02,126 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2013-10-11 11:24:02,128 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: 
SHUTDOWN_MSG: Shutting down DataNode at

After an application crashes, it can leave a lingering socket on its port, and to bind to that port again before the lingering socket clears, the new socket must have the SO_REUSEADDR option set before binding. The HDFS datanode doesn't set it, and I didn't want to restart the HBase regionserver (which was holding the port with a connection it hadn't realized was dead).
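
For illustration, here is a minimal Java sketch of what binding with SO_REUSEADDR looks like. The port 50010 is the datanode's default data-transfer port; the class name is made up for the example, and this is not the datanode's actual code:

import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ReuseAddrBind {
    public static void main(String[] args) throws Exception {
        ServerSocket ss = new ServerSocket();   // create an unbound server socket
        ss.setReuseAddress(true);               // set SO_REUSEADDR *before* bind()
        ss.bind(new InetSocketAddress(50010));  // can bind while an old socket lingers (e.g. in TIME_WAIT)
        System.out.println("Bound to " + ss.getLocalSocketAddress());
        ss.close();                             // closing releases the port
    }
}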
The solution was to bind to the port with an application that sets SO_REUSEADDR and then stop that application; I used netcat for that:

[root@hbase10 ~]#  nc -l 50010
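
netcat binds its listening socket with SO_REUSEADDR set, so the bind succeeds despite the lingering socket; once netcat is stopped (Ctrl-C), the port is free and the datanode can be started normally. Note that the exact invocation depends on the netcat variant installed: some versions require the port to be given with the -p flag, as in nc -l -p 50010.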

