I am using an HDInsight Storm cluster for real-time processing of messages sent by IoT devices in the field.
Recently I've started to notice a distinct slow-down in the receipt of messages, which is causing flow-on effects for my application stack.
Trawling through the logs, I noticed the following exception:
2016-10-27 06:21:58.288 c.m.e.c.EventHubReceiver [ERROR] Error{condition=detach-forced,description=Force detach the link because the session is remotely ended.}
2016-10-27 06:21:58.289 c.m.e.c.EventHubClient [INFO] Recovering with offset filter 1357552
2016-10-27 06:21:58.288 STDIO [ERROR] java.net.SocketException: Connection timed out
2016-10-27 06:21:58.289 STDIO [ERROR] at java.net.SocketInputStream.socketRead0(Native Method)
2016-10-27 06:21:58.289 STDIO [ERROR] at java.net.SocketInputStream.read(SocketInputStream.java:152)
2016-10-27 06:21:58.290 STDIO [ERROR] at java.net.SocketInputStream.read(SocketInputStream.java:122)
2016-10-27 06:21:58.290 STDIO [ERROR] at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
2016-10-27 06:21:58.290 STDIO [ERROR] at sun.security.ssl.InputRecord.read(InputRecord.java:480)
2016-10-27 06:21:58.291 STDIO [ERROR] at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:944)
2016-10-27 06:21:58.291 STDIO [ERROR] at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:901)
2016-10-27 06:21:58.292 STDIO [ERROR] at sun.security.ssl.AppInputStream.read(AppInputStream.java:102)
2016-10-27 06:21:58.292 STDIO [ERROR] at java.io.InputStream.read(InputStream.java:101)
2016-10-27 06:21:58.298 STDIO [ERROR] at org.apache.qpid.amqp_1_0.client.TCPTransportProvider.doRead(TCPTransportProvider.java:234)
2016-10-27 06:21:58.299 STDIO [ERROR] at org.apache.qpid.amqp_1_0.client.TCPTransportProvider.access$000(TCPTransportProvider.java:47)
2016-10-27 06:21:58.299 STDIO [ERROR] at org.apache.qpid.amqp_1_0.client.TCPTransportProvider$1.run(TCPTransportProvider.java:185)
2016-10-27 06:21:58.299 STDIO [ERROR] at java.lang.Thread.run(Thread.java:745)
Starting a new cluster on my development machine got the messages flowing again at the correct rate, and everything returned to normal.
I am also seeing this appear sometimes:
c.m.e.c.EventHubReceiver - Error{condition=com.microsoft:container-close,description=The message container is being closed (26525). TrackingId:d822e0e4-5f2b-4344-988b-c7f0bc7999f8_B6, SystemTracker:NoSystemTracker, Timestamp:10/27/2016 7:59:08 AM}
c.m.e.c.EventHubReceiver - Error{condition=com.microsoft:container-close,description=The message container is being closed (26523). TrackingId:fd9288d5-16ed-47ed-ac8a-e4175d69ef95_B4, SystemTracker:NoSystemTracker, Timestamp:10/27/2016 7:59:08 AM}
I am running:
storm-core 0.10.0 (HDInsight version) on Linux
storm-eventhubs 0.10.2