SemanticException (10014): Cannot find UDF class in classpath

This topic contains 6 replies, has 3 voices, and was last updated by  Jason Dere 2 months ago.

  • #59000

    Matt Parker
    Participant

    TWIMC:

    I’ve been trying to integrate ESRI’s geospatial UDF classes with Hive, but it cannot find the classes on the classpath. I have tried loading the classes using the Hue interface and the Hive CLI (both as a script and using the auxpath parameter settings), and I have dropped the jar files into /usr/lib/hadoop/lib and /usr/lib/hive/lib. Nothing seems to work. I’ve confirmed the classes are in the jar, the UDF class has public scope, and a method signature exists on the UDF for the two fields (double, double) I pass to it. If I ask the system to “describe function ST_Point”, it finds two methods on the class that are implementations of the evaluate method. I can run Hive queries if I’m not using their UDF classes.
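A jar is a zip archive, so one way to double-check that a class is packaged in it is to list the archive's entries. This is a minimal sketch in Python; the helper name `jar_has_class` is hypothetical and not part of Hive or the ESRI tooling, and any jar path passed to it is whatever is actually deployed on the cluster.

```python
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar contains a .class entry for class_name.

    class_name is a dotted Java name, e.g. 'com.esri.hadoop.hive.ST_Point'.
    """
    # A class com.example.Foo is stored in the jar as com/example/Foo.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

A check like this only confirms that the class file is present in the archive. It says nothing about whether the classes that the UDF itself depends on can be loaded.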

    It’s all detailed here:

    https://github.com/Esri/spatial-framework-for-hadoop/issues/64

    I was wondering whether anyone else is having this same issue using HDP 2.1?

    TIA,

    M.

Viewing 6 replies - 1 through 6 (of 6 total)

  • #59061

    Jason Dere
    Participant

    Without the explain, I get the following results from the query:
    OK
    1 0
    Time taken: 33.54 seconds, Fetched: 1 row(s)

    Not really sure why the order of add jar would matter, but it is something I have seen with the ESRI UDFs.

    #59037

    Matt Parker
    Participant

    Interesting. What happens when you remove “explain” from the select statement? Not sure your example is executing the actual query.

    Also, I wouldn’t think it would matter what order you added jar files to the classpath for the system to find them as long as they were there. Classpath order typically only matters when you have two classes with the same name and package structure, where the first one on the classpath wins out.
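The first-match rule described above can be illustrated outside the JVM. This is a rough sketch in Python, assuming a simple loader that searches the jars in order and stops at the first hit. `first_provider` is a hypothetical helper, and real class loaders (parent delegation, Hive's own session class loader) may behave differently.

```python
import zipfile


def first_provider(jar_paths, class_name):
    """Return the first jar in jar_paths that contains class_name, or None."""
    # Dotted Java class name -> path of the .class entry inside a jar
    entry = class_name.replace(".", "/") + ".class"
    for path in jar_paths:
        with zipfile.ZipFile(path) as jar:
            if entry in jar.namelist():
                return path  # earlier jars shadow later ones
    return None
```

If only one jar in the list contains the class, the result is the same for any ordering of the list, which is why the order would not normally be expected to matter.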

    Back to your main question: I add them in the same order.

    As an aside, I created my own custom UDF and it runs on the cluster just fine. Not sure why ESRI’s won’t run on my cluster.

    #59014

    Jason Dere
    Participant

    In what order do you add the JARs to your classpath? This seems to make a difference. This example works for me:


    add jar /tmp/esri-geometry-api.jar;
    add jar /tmp/spatial-sdk-hive-1.0.3-SNAPSHOT.jar;

    create temporary function ST_Point as 'com.esri.hadoop.hive.ST_Point';
    create temporary function ST_Contains as 'com.esri.hadoop.hive.ST_Contains';
    create temporary function ST_Polygon as 'com.esri.hadoop.hive.ST_Polygon';

    create external table if not exists geotest (
    id INT,
    latitude DOUBLE,
    longitude DOUBLE) location '/tmp/geotest';

    explain select "1", count(*) from geotest where ST_Contains( ST_Polygon("polygon((0 0, 0 3, 3 3, 3 0, 0 0))"), ST_Point(geotest.longitude,geotest.latitude));

    If I swap the order of the ADD JAR commands, I see the following error (using a build that includes HIVE-6995):

    java.lang.NoClassDefFoundError: com/esri/core/geometry/Geometry
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.hadoop.hive.ql.exec.FunctionTask.getUdfClass(FunctionTask.java:313)
    at org.apache.hadoop.hive.ql.exec.FunctionTask.createTemporaryFunction(FunctionTask.java:181)
    at org.apache.hadoop.hive.ql.exec.FunctionTask.execute(FunctionTask.java:81)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1555)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1322)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1136)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:960)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:950)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:265)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:217)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:427)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:363)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:460)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:476)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

    #59013

    Jason Dere
    Participant

    Unfortunately, prior to HIVE-6995 Hive did not log the actual cause of this error and only gave a generic “not in class path” message. The underlying cause would be helpful for troubleshooting here.

    #59010

    Matt Parker
    Participant

    If you follow the link above, you’ll see the output returned by Hue and Hive CLI.

    #59009

    Carter Shanklin
    Participant

    Can you try it on the shell to see if you can get it to work there?

    Also, did you have a look at https://github.com/cartershanklin/hive-spatial-uber/blob/master/queries/queries.hive?

    Two ESRI JARs are needed, and the create function statements are needed as well.

    It has been a while since I did this but I was able to make this work with HUE back in the day. If you can try with the shell it might help isolate the issue.
