
Hive / HCatalog Forum

SemanticException (10014): Cannot find UDF class in classpath

  • #59000
    Matt Parker


    I’ve been trying to integrate ESRI’s geospatial UDF classes with Hive, but Hive cannot find the classes on the classpath. I have tried loading the classes through the Hue interface and through the Hive CLI (both as a script and using the auxpath parameter settings), and I have dropped the jar files into /usr/lib/hadoop/lib and /usr/lib/hive/lib. Nothing seems to work. I’ve confirmed that the classes are in the jar, that the UDF class has public scope, and that the UDF’s method signature exists for the two fields (double,double) I call it with. If I ask the system to “describe function ST_Point”, it finds two methods on the class that are implementations of the evaluate method. I can run Hive queries as long as I’m not using the ESRI UDF classes.

    It’s all detailed here:

    I was wondering whether anyone else is having this same issue using HDP 2.1?
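The checks described in the post above (public class, an evaluate method with the expected parameters) can also be run outside Hive with plain reflection. The following is a minimal sketch, not part of the original thread: `UdfInspector` and `SampleUdf` are hypothetical names, and `SampleUdf` is only a stand-in so the example runs without any jars.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class UdfInspector {

    // Hypothetical stand-in for a UDF class, used only so the sketch runs without any jars.
    public static class SampleUdf {
        public String evaluate(double x, double y) { return x + "," + y; }
        public String evaluate(String wkt) { return wkt; }
    }

    /** Lists the parameter types of every public method named "evaluate". */
    static List<String> evaluateSignatures(Class<?> udfClass) {
        List<String> signatures = new ArrayList<>();
        for (Method m : udfClass.getMethods()) {   // public methods only, inherited ones included
            if (!m.getName().equals("evaluate")) continue;
            StringBuilder sb = new StringBuilder();
            for (Class<?> p : m.getParameterTypes()) {
                if (sb.length() > 0) sb.append(", ");
                sb.append(p.getSimpleName());
            }
            signatures.add(sb.toString());
        }
        return signatures;
    }

    /** True when the class itself is declared public. */
    static boolean isPublic(Class<?> c) {
        return Modifier.isPublic(c.getModifiers());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // With no argument, inspect the stand-in; otherwise inspect the named class.
        Class<?> c = args.length > 0 ? Class.forName(args[0]) : SampleUdf.class;
        System.out.println(c.getName() + " public=" + isPublic(c));
        for (String s : evaluateSignatures(c)) {
            System.out.println("  evaluate(" + s + ")");
        }
    }
}
```

To inspect a real UDF, pass its class name (for example com.esri.hadoop.hive.ST_Point) as the argument, with the relevant jars on the JVM classpath. Note that the parameter types reported for a real UDF may be wrapper or Writable types rather than primitive doubles.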



  • #59009
    Carter Shanklin

    Can you try it on the shell to see if you can get it to work there?

    Also, did you have a look at

    Two ESRI jars are needed, and the create function statements are needed as well.

    It has been a while since I did this but I was able to make this work with HUE back in the day. If you can try with the shell it might help isolate the issue.

    Matt Parker

    If you follow the link above, you’ll see the output returned by Hue and Hive CLI.

    Jason Dere

    Unfortunately, prior to HIVE-6995, Hive did not log the actual cause of this error and just gave a generic “not in class path” message. The underlying cause would be helpful for troubleshooting here.

    Jason Dere

    In what order do you add the JARs to your classpath? The order seems to make a difference. This example works for me:

    add jar /tmp/esri-geometry-api.jar;
    add jar /tmp/spatial-sdk-hive-1.0.3-SNAPSHOT.jar;

    create temporary function ST_Point as 'com.esri.hadoop.hive.ST_Point';
    create temporary function ST_Contains as 'com.esri.hadoop.hive.ST_Contains';
    create temporary function ST_Polygon as 'com.esri.hadoop.hive.ST_Polygon';

    create external table if not exists geotest (
    id INT,
    latitude DOUBLE,
    longitude DOUBLE) location '/tmp/geotest';

    explain select "1", count(*) from geotest where ST_Contains( ST_Polygon("polygon((0 0, 0 3, 3 3, 3 0, 0 0))"), ST_Point(geotest.longitude,geotest.latitude));

    If I swap the order of the ADD JAR commands, I see the following error (using a build that includes HIVE-6995):

    java.lang.NoClassDefFoundError: com/esri/core/geometry/Geometry
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(
    at org.apache.hadoop.hive.ql.exec.FunctionTask.getUdfClass(
    at org.apache.hadoop.hive.ql.exec.FunctionTask.createTemporaryFunction(
    at org.apache.hadoop.hive.ql.exec.FunctionTask.execute(
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(
    at org.apache.hadoop.hive.ql.Driver.launchTask(
    at org.apache.hadoop.hive.ql.Driver.execute(
    at org.apache.hadoop.hive.ql.Driver.runInternal(
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(
    at org.apache.hadoop.hive.cli.CliDriver.processLine(
    at org.apache.hadoop.hive.cli.CliDriver.processLine(
    at org.apache.hadoop.hive.cli.CliDriver.processReader(
    at org.apache.hadoop.hive.cli.CliDriver.processFile(
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(
    at org.apache.hadoop.hive.cli.CliDriver.main(
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke(
    at org.apache.hadoop.util.RunJar.main(
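The trace above shows the failure surfacing inside Class.forName as a NoClassDefFoundError for a class that the UDF depends on, rather than for the UDF class itself. On a Hive build without HIVE-6995, one way to see that underlying cause is to attempt the same load in a small standalone program. This is a sketch under assumptions: `UdfLoadProbe` is a hypothetical helper, not part of Hive or the ESRI libraries, and it only reflects what the plain JVM classpath can load, not what Hive's session class loader sees after ADD JAR.

```java
class UdfLoadProbe {

    /** Tries to load and initialize the named class and describes the outcome in one line. */
    static String probe(String className) {
        try {
            Class<?> c = Class.forName(className);
            return "OK: " + c.getName();
        } catch (ClassNotFoundException e) {
            // The named class itself is not on the classpath.
            return "NOT FOUND: " + e.getMessage();
        } catch (LinkageError e) {
            // Covers NoClassDefFoundError: the class was located, but something
            // it depends on could not be loaded or initialized.
            return "FOUND BUT NOT LOADABLE: " + e;
        }
    }

    public static void main(String[] args) {
        // Each argument is a fully qualified class name to probe.
        for (String name : args) {
            System.out.println(probe(name));
        }
    }
}
```

Run it with both ESRI jars on the classpath and a class name such as com.esri.hadoop.hive.ST_Point as the argument. The UDF classes presumably also need Hive's and Hadoop's own jars on the classpath; if those are absent, the probe will report them as the missing dependency instead.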

    Matt Parker

    Interesting. What happens when you remove “explain” from the select statement? Not sure your example is executing the actual query.

    Also, I wouldn’t think it would matter what order you added jar files to the classpath for the system to find them as long as they were there. Classpath order typically only matters when you have two classes with the same name and package structure, where the first one on the classpath wins out.

    Back to your main question -> I have them in the same order.

    As an aside, I created my own custom UDF and it runs on the cluster just fine. Not sure why ESRI’s won’t run on my cluster.

    Jason Dere

    Without the explain, I get the following results from the query:
    1 0
    Time taken: 33.54 seconds, Fetched: 1 row(s)

    Not really sure why the order of add jar would matter, but it is something I have seen with the ESRI UDFs.
