I was running a jstack dump on my project's app server this afternoon & I saw Okio Watchdog among the threads:
"Okio Watchdog" #1972 daemon prio=5 os_prio=0 tid=0x00007f4a8c01e800 nid=0x362 in Object.wait() [0x00007f48f2351000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at okio.AsyncTimeout.awaitTimeout(AsyncTimeout.java:297)
    - locked <0x0000000733eaa0e0> (a java.lang.Class for okio.AsyncTimeout)
    at okio.AsyncTimeout.access$000(AsyncTimeout.java:40)
    at okio.AsyncTimeout$Watchdog.run(AsyncTimeout.java:272)
The JDK's Network Timeouts
There are two timeout mechanisms in Java's networking stacks:
Socket.connect() and Socket.setSoTimeout() implement connect and read timeouts respectively. They work with blocking I/O.
Selector.select() implements connect, read and write timeouts for non-blocking I/O.
Notably, there's no API for write timeouts on blocking I/O. Oracle advises you switch to NIO if you need that.
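Here's a minimal sketch of the blocking I/O timeouts in action. The class name BlockingTimeoutDemo and its readTimesOut helper are made up for illustration. The local server accepts the connection but never writes anything, so the read has nothing to do but time out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; only the java.net calls come from the JDK.
class BlockingTimeoutDemo {
  /**
   * Connects to a local server that never writes, then returns true if the
   * blocking read gives up with a SocketTimeoutException.
   */
  static boolean readTimesOut(int connectTimeoutMillis, int readTimeoutMillis) {
    try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
         Socket socket = new Socket()) {
      // Connect timeout: bounds how long connect() may block.
      socket.connect(
          new InetSocketAddress(server.getInetAddress(), server.getLocalPort()),
          connectTimeoutMillis);
      // Read timeout: bounds each subsequent blocking read on this socket.
      socket.setSoTimeout(readTimeoutMillis);
      try (Socket peer = server.accept()) {
        InputStream in = socket.getInputStream();
        in.read(); // The peer stays silent, so this blocks until the timeout fires.
        return false;
      } catch (SocketTimeoutException expected) {
        return true;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("read timed out: " + readTimesOut(1000, 200));
  }
}
```

Note what's missing: there's no matching setter that would bound a blocked write on the socket's output stream.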
Okio uses a watchdog thread to implement read & write timeouts for blocking I/O. When the watchdog sees that an operation has gone on too long, it kills the offending stream so that the application can either give up or retry.
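To make the idea concrete, here's a rough sketch of a watchdog in plain Java. It is not Okio's actual implementation: WatchdogSketch, its "Sketch Watchdog" thread and the scheduled-executor approach are all assumptions chosen for brevity. The one thing it leans on is documented JDK behavior: closing a socket from another thread makes any I/O blocked on it throw SocketException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not Okio's code.
class WatchdogSketch {
  // One shared daemon thread plays the watchdog for every timed operation.
  static final ScheduledExecutorService WATCHDOG =
      Executors.newSingleThreadScheduledExecutor(runnable -> {
        Thread thread = new Thread(runnable, "Sketch Watchdog");
        thread.setDaemon(true);
        return thread;
      });

  /** Returns true if the watchdog broke a read that would otherwise block forever. */
  static boolean watchdogInterruptsRead(long timeoutMillis) {
    try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
         Socket socket = new Socket(server.getInetAddress(), server.getLocalPort());
         Socket peer = server.accept()) {
      // If the operation is still running at the deadline, kill the socket.
      Runnable closeSocket = () -> {
        try {
          socket.close();
        } catch (IOException ignored) {
          // Nothing useful to do if closing fails.
        }
      };
      ScheduledFuture<?> timeout =
          WATCHDOG.schedule(closeSocket, timeoutMillis, TimeUnit.MILLISECONDS);
      try {
        socket.getInputStream().read(); // No SO_TIMEOUT set; the peer never writes.
        return false;
      } catch (SocketException closedByWatchdog) {
        return true;
      } finally {
        timeout.cancel(false); // The operation is over, so stand the watchdog down.
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("watchdog interrupted the read: " + watchdogInterruptsRead(200));
  }
}
```

The same close-from-another-thread trick should apply to a blocked write, which is the case the JDK gives you no timeout for.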
When you're doing HTTP/1.1, there's a one-to-one relationship between the application-layer stream and the network socket. If a read hangs for 10 seconds, the socket will time out and the failure can be reported to the application.
But with SPDY and HTTP/2, the stream can fail even if its host socket is working perfectly. For example, suppose your HTTP/2 server uses a database to satisfy a particular request, and that database is unreachable. Meanwhile, other requests multiplexed on the same socket may be working just fine: they don't use that database to produce their response.
You can't use socket timeouts on multiplexed streams; the sockets aren't the problem! But Okio's watchdog works on streams and can interrupt a problematic stream without breaking the entire connection.
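As a toy model of that, the sketch below fakes two logical streams sharing one connection and lets a watchdog give up on just one of them. Everything here (StreamTimeoutSketch, LogicalStream, the frame queue) is hypothetical and far simpler than a real HTTP/2 implementation; it only shows that a per-stream timeout needn't touch the other streams.

```java
import java.io.InterruptedIOException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical toy model; no real sockets or HTTP/2 framing involved.
class StreamTimeoutSketch {
  /** One logical stream multiplexed on a shared connection. */
  static final class LogicalStream {
    private final Deque<String> frames = new ArrayDeque<>();
    private boolean timedOut;

    /** Called when the connection hands this stream a frame. */
    synchronized void receive(String frame) {
      frames.add(frame);
      notifyAll();
    }

    /** Called by the watchdog: fails this stream only and wakes its reader. */
    synchronized void timeOut() {
      timedOut = true;
      notifyAll();
    }

    /** Blocks until a frame arrives or the watchdog gives up on this stream. */
    synchronized String read() throws InterruptedIOException {
      try {
        while (frames.isEmpty() && !timedOut) {
          wait();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new InterruptedIOException("interrupted");
      }
      if (frames.isEmpty()) throw new InterruptedIOException("stream timed out");
      return frames.remove();
    }
  }

  /** Returns true if the stuck stream fails while its sibling still delivers. */
  static boolean onlyStuckStreamFails(long timeoutMillis) {
    ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();
    try {
      LogicalStream healthy = new LogicalStream();
      LogicalStream stuck = new LogicalStream();
      healthy.receive("response body"); // This request doesn't need the database.
      Runnable giveUp = stuck::timeOut;
      watchdog.schedule(giveUp, timeoutMillis, TimeUnit.MILLISECONDS);

      boolean stuckFailed = false;
      try {
        stuck.read(); // No frame ever arrives; only the watchdog ends this wait.
      } catch (InterruptedIOException e) {
        stuckFailed = true;
      }
      String body;
      try {
        body = healthy.read();
      } catch (InterruptedIOException e) {
        return false;
      }
      return stuckFailed && "response body".equals(body);
    } finally {
      watchdog.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("only the stuck stream failed: " + onlyStuckStreamFails(200));
  }
}
```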
So that's the Okio Watchdog thread that you might have seen in your application's thread dump. If any of your I/O operations stop responding, it'll lick their face until they wake up.