Discussion:
Thrift Java Client - TTransportException (SocketException: Connection reset)
Ayush Gupta
2011-02-25 02:53:15 UTC
Hi! I'm having some trouble running queries from a Java client against a
remote Thrift Hive server. It's all set up, and quicker queries run through
fine.

But queries that run longer than about 10 minutes disconnect the client
with a "TTransportException: Connection reset" exception. The query
continues to run on the Hive server, but since the client is disconnected the
results are "lost". The complete stack trace is below. Does this sound
familiar to anyone?

org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:314)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:262)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:192)
    at org.apache.hadoop.hive.service.ThriftHive$Client.recv_execute(ThriftHive.java:72)
    at org.apache.hadoop.hive.service.ThriftHive$Client.execute(ThriftHive.java:57)
    at com.wordnik.analytics.data.ReportsRunner$.refreshReport(ReportsRunner.scala:105)
    at com.wordnik.analytics.data.ReportsRunner$.refreshDailyReport(ReportsRunner.scala:34)
    at com.wordnik.analytics.data.ReportsRunner.refreshDailyReport(ReportsRunner.scala)
    at com.wordnik.analytics.util.Temp.main(Temp.java:11)
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:168)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:125)
    ... 10 more

-ayush
Ayush Gupta
2011-02-25 05:36:45 UTC
Probing this further reveals that the connection is reset by the server in
exactly 10 minutes every time.

I'm running Hive 0.6. I do not see anything relevant at
http://wiki.apache.org/hadoop/Hive/AdminManual/Configuration but is there
some configuration property which controls this?

-ayush
Adarsh Sharma
2011-02-25 06:44:56 UTC
Did you start the hiveserver service before running the client program?


Cheers, Adarsh
Ayush Gupta
2011-02-25 06:49:57 UTC
Yes, the hiveserver was started and running before the client program
was run.

-ayush
Viral Bajaria
2011-02-25 06:47:04 UTC
What do the logs of the Thrift server say? If they don't give any
relevant information, I would enable DEBUG-level logging on the console.

Also, a point to remember is the single-threaded nature of the Hive Thrift
server (at least up to v0.5).

But looking at the logs is the first thing that I would do.

The query (map/reduce job) will continue to run even if you shut down the
server, since a shutdown does not kill the job submitted to the JobTracker.
Ayush Gupta
2011-02-25 06:52:01 UTC
Post by Viral Bajaria
What do the logs of the Thrift server say? If they don't give any
relevant information, I would enable DEBUG-level logging on the console.
The hiveserver is pretty quiet; the connection appears to be terminated
silently. I'll raise the logging level to DEBUG, thanks for that suggestion.
Post by Viral Bajaria
Also, a point to remember is the single-threaded nature of the Hive Thrift
server (at least up to v0.5).
Yeah, there is only this one client connected in this scenario.
Post by Viral Bajaria
But looking at the logs is the first thing that I would do.
The query (map/reduce job) will continue to run even if you shut down the
server, since a shutdown does not kill the job submitted to the JobTracker.
Sure.
Carl Steinbach
2011-02-25 07:21:55 UTC
Hi Ayush,

I suspect you're running into Thrift's default socket timeout setting. I
recommend checking out a copy of the Hive source code, and modifying the
Thrift setup code in HiveServer.java to explicitly set the socket timeout on
the TServerSocket, e.g. in HiveServer.main() change

TServerTransport serverTransport = new TServerSocket(port);

to

TServerTransport serverTransport = new TServerSocket(port, timeout);

The Thrift javadoc doesn't specify whether timeout is in seconds or
milliseconds, so you'll probably have to play around with this value.

Hope this helps.

Carl
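
On the units question, a small experiment with plain JDK sockets may help. The sketch below does not use Thrift or Hive at all; it only shows that `java.net.Socket.setSoTimeout` takes milliseconds and that a blocked read then fails with `SocketTimeoutException`. In the JDK, a value of 0 means "wait indefinitely". Whether a given libthrift version passes the `TServerSocket` timeout argument straight through to `setSoTimeout` is an assumption to verify against that version's source. The class name `SoTimeoutDemo` and the method `readTimesOut` are made up for this illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Illustration only: plain java.net sockets, no Thrift or Hive classes.
class SoTimeoutDemo {

    // Opens a loopback connection, applies a read timeout (in milliseconds)
    // to the accepted socket, and reports whether the read timed out
    // because the peer never sent anything.
    static boolean readTimesOut(int timeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            accepted.setSoTimeout(timeoutMillis); // unit: milliseconds; 0 = no timeout
            InputStream in = accepted.getInputStream();
            try {
                in.read(); // the client stays silent, so this blocks
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        boolean timedOut = readTimesOut(200);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("timed out: " + timedOut + " after ~" + elapsedMillis + " ms");
    }
}
```

If the elapsed time printed for a value of 200 is close to 200 ms rather than 200 s, the unit is milliseconds, and the same kind of timing check could be applied to the patched server.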
Carl Steinbach
2011-02-25 07:24:56 UTC
I filed a JIRA ticket to track the task of making the Thrift socket timeout
configurable:

https://issues.apache.org/jira/browse/HIVE-2006
Ayush Gupta
2011-02-25 07:27:54 UTC
Thanks Carl, I'll check that.

But surely I can't be the only one running Hive queries that last more
than 10 minutes over a Thrift client! The Hive model is somewhat intended to
work with large data sets, and long-running queries should be expected. I
wonder why there is no discussion of this on the mailing list, at least none
that I could find.

-ayush
Viral Bajaria
2011-02-25 07:31:21 UTC
Carl,

Do you think this issue was not there before 0.6? We run our Thrift servers
for hours and have never faced this issue. I don't think I have restarted
any of my Thrift servers for days.

My Hive wrapper does have logic to handle timeouts: it reconnects whenever
it sees that the Thrift connection has died. I think I will add some logging
to my wrapper code to check whether my setup does hit regular timeouts that
I never notice because of the retry logic.

-Viral
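
For readers wondering what such a wrapper might look like, here is a minimal sketch of reconnect-and-retry logic with logging. It is not Viral's actual code. `RetryingCaller`, `callWithReconnect`, and the `reconnect` hook are hypothetical names, and the Thrift call itself is replaced by a generic `Callable` so that the example runs without Hive on the classpath.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Hive or Thrift: retries a call after
// running a caller-supplied reconnect step, logging every failure.
class RetryingCaller {

    static <T> T callWithReconnect(Callable<T> call, Runnable reconnect, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                // Logging here makes otherwise silent timeouts visible.
                System.err.println("attempt " + attempt + " failed: " + e);
                if (attempt < maxAttempts) {
                    reconnect.run(); // e.g. close and reopen the transport
                }
            }
        }
        throw new RuntimeException("all " + maxAttempts + " attempts failed", last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated call: fails twice with a connection reset, then succeeds.
        String result = callWithReconnect(() -> {
            if (++calls[0] < 3) {
                throw new IOException("Connection reset");
            }
            return "ok";
        }, () -> System.err.println("reconnecting"), 5);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

Note that retrying in this way repeats the whole call. For a long-running query that is cut off after ten minutes, a retry would submit the query again rather than pick up the results of the first run, so this pattern suits short or repeatable calls better than the case Ayush describes.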
Carl Steinbach
2011-02-25 07:43:02 UTC
Hi Viral,

Hive 0.5.0 and 0.6.0 use the same version of libthrift, so the problem is
more likely related to some difference in the way 0.5.0 and 0.6.0
configure/initialize Thrift, or to some other issue related to the way the
Thrift connection is managed on the client or server side (though it looks
like the connection is getting dropped on the server side).

Carl
Post by Viral Bajaria
Carl,
Do you think this issue was not there before 0.6 ? We run our thrift
servers for hours and have never faced this issue. I don't think I have
restarted any of my thrift servers for days.
My hive wrapper does have logic to handle timeouts, it reconnects whenever
it sees that the thrift connection has died. I think I will add some logging
in my wrapper code to see if my setup does see regular timeouts but I never
see any issues due to the retry logic.
-Viral
Post by Carl Steinbach
I filed a JIRA ticket to track the task of making the Thrift socket
https://issues.apache.org/jira/browse/HIVE-2006
Post by Carl Steinbach
Hi Ayush,
I suspect you're running into Thrift's default socket timeout setting. I
recommend checking out a copy of the Hive source code, and modifying the
Thrift setup code in HiveServer.java to explicitly set the socket timeout on
the TServerSocket, e.g. in HiveServer.main() change
TServerTransport serverTransport = new TServerSocket(port);
to
TServerTransport serverTransport = new TServerSocket(port, timeout);
The Thrift javadoc doesn't specify whether timeout is in seconds or
milliseconds, so you'll probably have to play around with this value.
Hope this helps.
Carl
On Fri, Feb 25, 2011 at 12:17 PM, Viral Bajaria <
Post by Viral Bajaria
What do the logs of the thrift server say ?? If it does not give any
relevant information, I would enable DEBUG level logging on the console.
the hiveserver is pretty quiet, the connection appears to be terminated
silently. I'll up the logging to DEBUG, thanks for that suggestion.
Post by Viral Bajaria
Also a point to remember is the single-threaded nature of the hive
thrift server (atleast upto v0.5)
yeah, there is only this one client connected in this scenario.
Post by Viral Bajaria
But looking at the logs is what will be the first thing that I would do.
The query (map/reduce job) will continue to run even if you shut down
the server, since a shutdown does not kill the job submitted to the
JobTracker.
sure
Post by Viral Bajaria
Post by Ayush Gupta
Probing this further reveals that the connection is reset by the
server in exactly 10 minutes every time.
I'm running Hive 0.6. I do not see anything relevant at
http://wiki.apache.org/hadoop/Hive/AdminManual/Configuration but is
there some configuration property which controls this?
-ayush
Post by Ayush Gupta
Hi! I'm having some trouble running queries from a java client
against a remote Thrift Hive server. It's all set up and quicker queries do
run through fine.
But queries which run longer than about 10 minutes disconnect the
client with a "TTransportException: Connection reset" exception. The query
continues to run on the Hive server but since the client is disconnected the
results are "lost". The complete stack trace is below. Does this sound
familiar to anyone?
org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:314)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:262)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:192)
at org.apache.hadoop.hive.service.ThriftHive$Client.recv_execute(ThriftHive.java:72)
at org.apache.hadoop.hive.service.ThriftHive$Client.execute(ThriftHive.java:57)
at com.wordnik.analytics.data.ReportsRunner$.refreshReport(ReportsRunner.scala:105)
at com.wordnik.analytics.data.ReportsRunner$.refreshDailyReport(ReportsRunner.scala:34)
at com.wordnik.analytics.data.ReportsRunner.refreshDailyReport(ReportsRunner.scala)
at com.wordnik.analytics.util.Temp.main(Temp.java:11)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:168)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:125)
... 10 more
-ayush
Ning Zhang
2011-02-25 18:32:57 UTC
Permalink
I tried on the latest trunk (through CLI connecting to Hive Server) and there is no disconnection after 10 mins for a long query.

@Ayush, is this Java client using a JDBC connection? If so, the client may have set a timeout for JDBC queries. I suspect the disconnection comes from the Java client you are using.

On Feb 24, 2011, at 11:43 PM, Carl Steinbach wrote:

Hi Viral,

Hive 0.5.0 and 0.6.0 use the same version of libthrift, so the problem is more likely related to some difference in the way 0.5.0 and 0.6.0 configure/initialize Thrift, or to some other issue related to the way the Thrift connection is managed on the client or server side (though it looks like the connection is getting dropped on the server side).

Carl

On Thu, Feb 24, 2011 at 11:31 PM, Viral Bajaria <viral.bajaria-***@public.gmane.org> wrote:
Carl,

Do you think this issue was not there before 0.6? We run our thrift servers for hours and have never faced this issue. I don't think I have restarted any of my thrift servers for days.

My hive wrapper does have logic to handle timeouts: it reconnects whenever it sees that the thrift connection has died. I think I will add some logging in my wrapper code to see whether my setup does see regular timeouts, but I never see any issues because of the retry logic.

-Viral
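
For readers wondering what such a wrapper might look like, here is a minimal sketch of the reconnect-and-retry idea described above. It is not Viral's actual code: the class name RetryOnDisconnect, the method withReconnect, and the choice to treat any exception as a dropped connection are all assumptions made for illustration. The logging line is there so that silent timeouts become visible.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Hive or Thrift.
final class RetryOnDisconnect {
    /**
     * Runs `call`; if it throws, logs the failure, invokes `reconnect`,
     * and tries again, up to `maxAttempts` attempts in total.
     */
    static <T> T withReconnect(Callable<T> call, Runnable reconnect, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                // Assumption: every failure is treated as a dead connection.
                last = e;
                System.err.println("attempt " + attempt + " failed: " + e);
                if (attempt < maxAttempts) {
                    reconnect.run(); // e.g. close and reopen the Thrift transport
                }
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts", last);
    }
}
```

Note that a retry re-submits the whole call, so for a query that is cut off after a fixed interval every attempt would presumably fail the same way; a wrapper like this hides short-lived drops but would not by itself help with the 10-minute reset discussed in this thread.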


On Thu, Feb 24, 2011 at 11:24 PM, Carl Steinbach <carl-psgPW5cihnJWk0Htik3J/***@public.gmane.org> wrote:
I filed a JIRA ticket to track the task of making the Thrift socket timeout configurable:

https://issues.apache.org/jira/browse/HIVE-2006


On Thu, Feb 24, 2011 at 11:21 PM, Carl Steinbach <carl-psgPW5cihnJWk0Htik3J/***@public.gmane.org> wrote:
Hi Ayush,

I suspect you're running into Thrift's default socket timeout setting. I recommend checking out a copy of the Hive source code, and modifying the Thrift setup code in HiveServer.java to explicitly set the socket timeout on the TServerSocket, e.g. in HiveServer.main() change

TServerTransport serverTransport = new TServerSocket(port);

to

TServerTransport serverTransport = new TServerSocket(port, timeout);

The Thrift javadoc doesn't specify whether timeout is in seconds or milliseconds, so you'll probably have to play around with this value.

Hope this helps.

Carl
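
On the units question, a small JDK-only experiment may help. As far as I know, the Java TServerSocket applies its client timeout to each accepted socket via java.net.Socket.setSoTimeout, which takes milliseconds; that is an assumption about the Thrift version in use and is worth confirming in the source. The sketch below does not touch Thrift at all: SoTimeoutDemo and millisUntilReadTimeout are made-up names, and the code only shows how SO_TIMEOUT behaves on a plain socket whose peer stays silent.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; uses only the JDK, no Thrift or Hive classes.
final class SoTimeoutDemo {
    /**
     * Opens a loopback connection whose peer never writes, sets SO_TIMEOUT on
     * the accepted side, and returns how many milliseconds the blocked read
     * waited before SocketTimeoutException was thrown.
     */
    static long millisUntilReadTimeout(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0); // 0 = any free port
             Socket silentPeer = new Socket(InetAddress.getLoopbackAddress(),
                                            server.getLocalPort());
             Socket accepted = server.accept()) {
            accepted.setSoTimeout(timeoutMillis); // unit is milliseconds
            InputStream in = accepted.getInputStream();
            long start = System.nanoTime();
            try {
                in.read(); // blocks, because silentPeer never sends anything
                throw new IllegalStateException("read returned without a timeout");
            } catch (SocketTimeoutException expected) {
                return (System.nanoTime() - start) / 1_000_000;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out after about "
                + millisUntilReadTimeout(500) + " ms");
    }
}
```

If the assumption holds, a value such as 600000 passed as the TServerSocket timeout would correspond to the 10 minutes reported in this thread, and a timeout of 0 on a plain Java socket means "wait indefinitely".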


On Thu, Feb 24, 2011 at 10:52 PM, Ayush Gupta <ayush-tiZJL1Yx/***@public.gmane.org> wrote:

On Fri, Feb 25, 2011 at 12:17 PM, Viral Bajaria <viral.bajaria-***@public.gmane.org> wrote:
What do the logs of the thrift server say? If they do not give any relevant information, I would enable DEBUG level logging on the console.
the hiveserver is pretty quiet, the connection appears to be terminated silently. I'll up the logging to DEBUG, thanks for that suggestion.



Also a point to remember is the single-threaded nature of the hive thrift server (at least up to v0.5)
yeah, there is only this one client connected in this scenario.

But looking at the logs is what will be the first thing that I would do.

The query (map/reduce job) will continue to run even if you shut down the server, since a shutdown does not kill the job submitted to the JobTracker.
sure


On Thu, Feb 24, 2011 at 9:36 PM, Ayush Gupta <ayush-tiZJL1Yx/***@public.gmane.org> wrote:
Probing this further reveals that the connection is reset by the server in exactly 10 minutes every time.

I'm running Hive 0.6. I do not see anything relevant at http://wiki.apache.org/hadoop/Hive/AdminManual/Configuration but is there some configuration property which controls this?

-ayush


On Fri, Feb 25, 2011 at 8:23 AM, Ayush Gupta <ayush-tiZJL1Yx/***@public.gmane.org> wrote:
Hi! I'm having some trouble running queries from a java client against a remote Thrift Hive server. It's all set up and quicker queries do run through fine.

But queries which run longer than about 10 minutes disconnect the client with a "TTransportException: Connection reset" exception. The query continues to run on the Hive server but since the client is disconnected the results are "lost". The complete stack trace is below. Does this sound familiar to anyone?

org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:314)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:262)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:192)
at org.apache.hadoop.hive.service.ThriftHive$Client.recv_execute(ThriftHive.java:72)
at org.apache.hadoop.hive.service.ThriftHive$Client.execute(ThriftHive.java:57)
at com.wordnik.analytics.data.ReportsRunner$.refreshReport(ReportsRunner.scala:105)
at com.wordnik.analytics.data.ReportsRunner$.refreshDailyReport(ReportsRunner.scala:34)
at com.wordnik.analytics.data.ReportsRunner.refreshDailyReport(ReportsRunner.scala)
at com.wordnik.analytics.util.Temp.main(Temp.java:11)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:168)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:125)
... 10 more

-ayush
