java8964
2014-10-02 14:14:00 UTC
Hi,
Our production cluster currently runs Hive 0.9.0. A complex Hive query already runs daily on Hadoop and generates millions of output records. I want to transfer this result to Cassandra.
I tried to do this in a UDF, so that I can send the data at the reducer level and maximize the transfer speed. Everything seemed to work until I tested it on the cluster.
We use the Netflix Astyanax Cassandra client driver to write the data, and it requires Google's Guava library, version 14. I found out that Hive 0.9.0 already includes Guava 9, which leads to the following exception in my Hive CLI:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/ibm/biginsights/hive/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/ibm/biginsights/IHC/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Exception in thread "main" java.lang.NoSuchMethodError: com/google/common/util/concurrent/MoreExecutors.listeningDecorator(Ljava/util/concurrent/ExecutorService;)Lcom/google/common/util/concurrent/ListeningExecutorService;
    at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.<init>(ThriftKeyspaceImpl.java:114)
    at com.netflix.astyanax.thrift.ThriftFamilyFactory.createKeyspace(ThriftFamilyFactory.java:41)
    at com.netflix.astyanax.AstyanaxContext$Builder.buildKeyspace(AstyanaxContext.java:146)
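A NoSuchMethodError like this usually means an older copy of the class was loaded ahead of the one the caller was compiled against. One way to confirm which jar actually supplies a class is to ask the JVM for its code source. The following is only a minimal sketch: the class name WhichJar and the helper locationOf are hypothetical names chosen for illustration, and it assumes it is run with the same classpath as the failing process.

```java
import java.security.CodeSource;

public class WhichJar {
    /** Returns where the given class was loaded from, or a note if the JVM does not report it. */
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            // Classes from the JDK itself typically have no code source.
            return "(bootstrap or unknown source)";
        }
        return src.getLocation().toString();
    }

    /** Looks the class up by name without initializing it, then reports its location. */
    static String locationOf(String className) {
        try {
            Class<?> cls = Class.forName(className, false, WhichJar.class.getClassLoader());
            return locationOf(cls);
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        } catch (LinkageError e) {
            return "(found but could not be linked: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Defaults to the Guava class named in the stack trace above.
        String name = args.length > 0
                ? args[0]
                : "com.google.common.util.concurrent.MoreExecutors";
        System.out.println(name + " -> " + locationOf(name));
    }
}
```

If the printed location points into the Hive lib directory rather than at the Guava 14 jar, the older copy is the one winning on the classpath.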
I now understand what the problem is, but I cannot find a good solution. In Hadoop, I can use "mapreduce.job.user.classpath.first" to make my version of Guava get picked up first, but when I ran "set mapreduce.job.user.classpath.first=true" in the Hive CLI, it didn't work. I then googled around and found another setting that looks like it is meant for Hive, so I ran "set mapreduce.task.classpath.user.precedence=true;" in the Hive CLI, but that still didn't work.
It doesn't look like Hive has a way to put a user's third-party jar files at the front of the classpath for a UDF.
Has anyone faced this kind of problem before? What is the best solution?
Thanks
Yong