Spark Forum

Exception over HDP: 2.0.6

  • #57455
    Abel Coronado

    Hi Everybody :)
    I'm trying to run an app on a Spark installation over HDP 2.0.6 and Hadoop 2.2.0, but something is wrong. Can anybody give me a hint?

    The example runs OK:
    SPARK_JAR=lib/spark-assembly_2.10- ./bin/spark-class org.apache.spark.deploy.yarn.Client --jar examples/lib/spark-examples_2.10- --class org.apache.spark.examples.SparkPi --args yarn-standalone --num-workers 20 --master-memory 512m --worker-memory 512m --worker-cores 1

    But my JAR fails:
    SPARK_JAR=/usr/lib/spark/lib/spark-assembly_2.10- /usr/lib/spark/bin/spark-class org.apache.spark.deploy.yarn.Client --jar /home/abel/spark/geoProcessingSpark-0.9.1/target/scala-2.10/SparkGeoprocessing-0.9.1-assembly-1.0.jar --class SimpleApp --args yarn-standalone --num-workers 2 --master-memory 512m --worker-memory 512m --worker-cores 1

    appDiagnostics: Application application_1403012744101_0477 failed 2 times due to AM Container for appattempt_1403012744101_0477_000002 exited with exitCode: 1 due to: Exception from container-launch:
    at org.apache.hadoop.util.Shell.runCommand(

    I think the issue is in the configuration. Here is my build.sbt:

    import AssemblyKeys._


    excludedJars in assembly <<= (fullClasspath in assembly) map { cp =>
    cp filter { c => List("asm-3.2.jar","javax.servlet-2.5.0.v201103041518.jar","hadoop-yarn-common-2.2.0.jar","jcl-over-slf4j-1.7.5.jar","xsd-2.6.0.jar","ecore-2.6.1.jar","jt-zonalstats-1.3.1.jar","javax.transaction-1.1.1.v201105210645.jar","javax.servlet-3.0.0.v201112011016.jar","javax.mail.glassfish-1.4.1.v201005082020.jar","javax.activation-1.1.0.v201105071233.jar","commons-collections-3.1.jar","hsqldb-","commons-beanutils-1.7.0.jar","commons-collections-3.2.1.jar") exists { c.data.getName contains _ } }
    }

    name := "SparkGeoprocessing-0.9.1"

    version := "1.0"

    scalaVersion := "2.10.0"

    libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.2.0"

    libraryDependencies ++= Seq(
    ("org.apache.spark" %% "spark-core" % "0.9.1").
    exclude("org.mortbay.jetty", "servlet-api").
    exclude("commons-beanutils", "commons-beanutils-core").
    exclude("commons-collections", "commons-collections").
    exclude("com.esotericsoftware.minlog", "minlog")
    )

    libraryDependencies ++= Seq(
    "com.vividsolutions" % "jts" % "1.13",
    "org.geotools" % "gt-main" % "11.1",
    "org.geotools" % "gt-epsg-hsql" % "11.1",
    "org.geotools" % "gt-shapefile" % "11.1",
    "org.geotools" % "gt-render" % "11.1",
    "org.geotools" % "gt-xml" % "11.1",
    "org.geotools" % "gt-geojson" % "11.1",
    "org.geotools.jdbc" % "gt-jdbc-postgis" % "11.1",
    "org.geotools.jdbc" % "gt-jdbc-spatialite" % "11.1",
    "org.geotools" % "gt-coverage" % "11.1",
    "org.geotools" % "gt-geotiff" % "11.1",
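
To make the intent of the excludedJars setting in this build.sbt concrete, here is a minimal plain-Scala sketch of the name filter, assuming it is meant to compare each classpath entry's file name against the list by substring. The object name ExcludeFilterSketch, the helper isExcluded, and the sample classpath are hypothetical, and only a few patterns from the list are repeated.

```scala
// Plain-Scala sketch of a substring-based jar exclusion filter.
// No sbt is needed; the classpath entries below are hypothetical file names.
object ExcludeFilterSketch {
  // A few of the patterns from the build.sbt list (not the full list).
  val excluded: List[String] = List(
    "asm-3.2.jar",
    "hadoop-yarn-common-2.2.0.jar",
    "commons-collections-3.1.jar"
  )

  // True when the jar's file name contains any of the excluded patterns.
  def isExcluded(jarName: String): Boolean =
    excluded.exists(pattern => jarName.contains(pattern))

  def main(args: Array[String]): Unit = {
    val classpath = List(
      "asm-3.2.jar",
      "hadoop-yarn-common-2.2.0.jar",
      "spark-core_2.10-0.9.1.jar",
      "jts-1.13.jar"
    )
    // Split the classpath into jars left out of the assembly and jars kept.
    val (dropped, kept) = classpath.partition(isExcluded)
    println(s"dropped: $dropped")
    println(s"kept: $kept")
  }
}
```

Because the match is by substring, a short pattern matches every jar whose file name contains it, so broad patterns can drop more jars from the assembly than intended.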


  • #57456
    Abel Coronado

    At this moment the code is very simple:

    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._
    import org.apache.spark.SparkConf
    import org.geoscript.feature._
    import org.geoscript.geometry._
    import org.geoscript.geometry.builder._
    import com.vividsolutions.jts._
    import org.geoscript.layer.Shapefile
    import org.geotools.feature.FeatureCollection

    object SimpleApp {
    def main(args: Array[String]){
    val conf = new SparkConf().setMaster("local").setAppName("Csv Clipper").set("spark.executor.memory", "1g")
    val sc = new SparkContext(conf)
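
One thing that may be worth checking: the launch command passes yarn-standalone through --args, while the code above calls setMaster with a hard-coded "local". A possible variation is to take the master URL from the first program argument. This is a sketch under that assumption, not a confirmed fix; the names MasterFromArgs and resolveMaster are hypothetical, and the fallback to "local" is only an illustrative choice.

```scala
object MasterFromArgs {
  // Use the first program argument as the master URL; fall back to "local"
  // when no argument is given (for example, when running on a workstation).
  def resolveMaster(args: Array[String]): String =
    args.headOption.getOrElse("local")

  def main(args: Array[String]): Unit = {
    val master = resolveMaster(args)
    // In SimpleApp this value would replace the hard-coded "local":
    //   new SparkConf().setMaster(master).setAppName("Csv Clipper")
    println(s"master = $master")
  }
}
```

With this shape, the same jar can be started locally without arguments and on the cluster with the master supplied through --args.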


    Abel Coronado

    Maybe you could show me the build.sbt used to assemble the SparkPi example?
