HDP on Linux – Installation Forum

Install Talend for HDP on EC2

  • #27666
    Anupam Gupta
    Participant

    Hi,
    I have already installed HDP 1.2.0 on EC2 using Ambari. Now I want to upload data using Talend Open Studio. Can anybody guide me on how to install Talend on EC2?

  • #27724
    Anupam Gupta
    Participant

    I am trying to install Talend on an Amazon EC2 CentOS instance. Is there any documentation that describes the installation steps?

    #27726
    Anupam Gupta
    Participant

    Hi,
    I am trying to install Talend on an Amazon EC2 CentOS instance. Is there any documentation that describes the installation steps?

    Thanks

    #27790
    tedr
    Moderator

    Hi Agupta,

    Have you looked at the Talend documentation available at http://www.talend.com/download/big-data ?

    Thanks,
    Ted.

    #27884
    Anupam Gupta
    Participant

    Hi All,
    I have already unzipped Talend Open Studio (HDP-ETL-TOS_BD-V5.1.1) on EC2 CentOS. When I tried to run TOS_BD-linux-gtk-x86.sh, I got the following errors:
    (TOS_BD-linux-gtk-x86_64:11518): Gtk-CRITICAL **: gtk_style_detach: assertion `style->attach_count > 0' failed

    (TOS_BD-linux-gtk-x86_64:11518): Gdk-CRITICAL **: gdk_window_set_user_data: assertion `GDK_IS_WINDOW (window)' failed

    (TOS_BD-linux-gtk-x86_64:11518): Gdk-CRITICAL **: _gdk_window_destroy_hierarchy: assertion `GDK_IS_WINDOW (window)' failed

    (TOS_BD-linux-gtk-x86_64:11518): GLib-GObject-CRITICAL **: g_object_unref: assertion `G_IS_OBJECT (object)' failed

    (TOS_BD-linux-gtk-x86_64:11517): GLib-GObject-WARNING **: invalid (NULL) pointer instance

    (TOS_BD-linux-gtk-x86_64:11517): GLib-GObject-CRITICAL **: g_signal_connect_data: assertion `G_TYPE_CHECK_INSTANCE (instance)' failed

    (TOS_BD-linux-gtk-x86_64:11517): Gtk-CRITICAL **: gtk_settings_get_for_screen: assertion `GDK_IS_SCREEN (screen)' failed

    (TOS_BD-linux-gtk-x86_64:11517): GLib-GObject-CRITICAL **: g_object_get: assertion `G_IS_OBJECT (object)' failed

    (TOS_BD-linux-gtk-x86_64:11517): GLib-GObject-WARNING **: value "TRUE" of type `gboolean' is invalid or out of range for property `visible' of type `gboolean'

    (TOS_BD-linux-gtk-x86_64:11517): Gtk-CRITICAL **: gtk_settings_get_for_screen: assertion `GDK_IS_SCREEN (screen)' failed

    (TOS_BD-linux-gtk-x86_64:11517): GLib-GObject-CRITICAL **: g_object_get: assertion `G_IS_OBJECT (object)' failed

    (TOS_BD-linux-gtk-x86_64:11517): Gtk-WARNING **: Screen for GtkWindow not set; you must always set a screen for a GtkWindow before using the window

    (TOS_BD-linux-gtk-x86_64:11517): Gdk-CRITICAL **: gdk_pango_context_get_for_screen: assertion `GDK_IS_SCREEN (screen)' failed

    (TOS_BD-linux-gtk-x86_64:11517): Pango-CRITICAL **: pango_context_set_font_description: assertion `context != NULL' failed

    Thanks in Advance

    #27887
    tedr
    Moderator

    Hi Agupta,

    Talend Open Studio is a GUI application. The trace you posted looks like it is having trouble displaying the GUI.

    Thanks,
    Ted.

    #28384
    Anupam Gupta
    Participant

    Hi Ted,
    We went through the Talend website for our issue, but we were unable to find any help regarding it. Could you please provide a link that explains how to install Talend and run it on EC2?

    #28526
    Sasha J
    Moderator

    Agupta,
    It looks like you do not have a graphical desktop installed on your box.
    Please make sure you have one.

    Thank you!
    Sasha
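
A quick way to test this diagnosis before installing anything is to check whether the shell that launches Talend has an X display at all. The sketch below is only a first check, not a full diagnosis: the function name `check_display` is made up for illustration, and it assumes the GUI would be reached through the standard `DISPLAY` environment variable.

```shell
# Report whether this shell session has an X display available to GUI programs.
# check_display is a hypothetical helper name, not part of Talend or CentOS.
check_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set: no X session is attached to this shell"
        return 1
    fi
    echo "DISPLAY is set to $DISPLAY"
    return 0
}
```

Running `check_display` in a plain ssh session on a server without a desktop would normally report that `DISPLAY` is not set, which is consistent with the `GDK_IS_SCREEN` assertions in the trace above.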

    #28541
    Anupam Gupta
    Participant

    Hi Sasha,

    Thanks for your reply.

    Actually, we are new to Hadoop, AWS, and EC2. Our goal is to load data into Hadoop and HBase. We have already loaded a CSV file into HDFS (/user/root/datanode) from the command line using the "hadoop fs -copyFromLocal source destination" command. Now we want to read/write that file in HBase, but we have been unable to find a way to do so. That is why we want to use Talend: it is said to provide a GUI in which we can simply drag our file and drop it where we want it.

    Now the problem we are facing is that we have downloaded and unzipped the Talend archive on EC2 but are unable to run it. We have run it on our Windows desktop, where it works fine, but it does not run on EC2.

    We posted the errors that occur when running Talend earlier in this thread. Please have a look at them.

    Please help us to sort it out.
    Thanks in advance.
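
For reference, the HDFS upload step described in this post can be sketched as a small wrapper. This is an illustration under assumptions: the function name `hdfs_put` and the `HADOOP_CMD` override are hypothetical, and the paths passed to it are whatever source and destination apply on your cluster. Note the argument order: `fs` comes first, then the `-copyFromLocal` option.

```shell
# Copy a local file into HDFS, then list the destination to confirm it arrived.
# HADOOP_CMD is a hypothetical override so the command line can be inspected
# (for example with HADOOP_CMD=echo) on a machine without Hadoop installed.
hdfs_put() {
    src=$1
    dest=$2
    hadoop_cmd=${HADOOP_CMD:-hadoop}
    if ! command -v "$hadoop_cmd" >/dev/null 2>&1; then
        echo "$hadoop_cmd: command not found" >&2
        return 127
    fi
    # Run the listing only if the copy itself succeeded.
    "$hadoop_cmd" fs -copyFromLocal "$src" "$dest" &&
        "$hadoop_cmd" fs -ls "$dest"
}
```

A call such as `hdfs_put data.csv /user/root/datanode` (with `data.csv` standing in for the real file) would perform the copy and then show the directory listing.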

    #28560
    Robert
    Participant

    Hi Agupta,
    As Sasha mentioned, the machines you have provisioned on EC2 do not seem to have a desktop GUI installed. When provisioning these nodes, did you happen to install the minimal version of CentOS, which doesn’t install the GUI modules?

    Regards,
    Robert

    #29021
    Anupam Gupta
    Participant

    Hi Robert,
    Thanks for your help. We have successfully installed a GUI for CentOS on the head node and are able to access it from our local Windows machine using the NX client. Now we have some questions:
    1) Can we access the slave nodes with this NX client (because we have installed the GUI only on the head node)?
    2) We have no space left to install the Talend data integration tool. Our instance is of type m1.medium, and the root device is an EBS volume (6 GB).
    3) Which file do we need to run to launch Talend after unzipping the Talend data integration tool?
    Please guide us.

    Thanks in advance
    Agupta

    #29035
    tedr
    Moderator

    Hi Agupta,

    To answer your questions:
    1) To access the slaves you won’t need the NX client; you can just ssh to them. Since you have not installed a GUI environment on them, there is no need to use a GUI client to access them. You will be able to access them from Talend without them having a GUI environment.
    2) Unfortunately, you will need to rebuild the master on a larger instance. We recommend that the drive space on each node be no smaller than 10 GB just for installing HDP.
    3) The file you need to run is TOS_BD-linux-gtk-x86.sh.

    Thanks,
    Ted.
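
The 10 GB guideline in point 2 can be checked on each node with standard tools. The following is a minimal sketch, assuming the size of the filesystem that holds `/` is what matters; the function name `has_min_size_gb` is made up for illustration.

```shell
# Succeed when the filesystem holding PATH is at least MIN_GB gigabytes in size.
# has_min_size_gb is a hypothetical helper name.
has_min_size_gb() {
    path=$1
    min_gb=$2
    # df -Pk prints one POSIX-format line per filesystem in 1024-byte blocks;
    # column 2 is the total size and column 4 is the space still available.
    size_kb=$(df -Pk "$path" | awk 'NR == 2 { print $2 }')
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    echo "$path: size ${size_kb} KB, available ${avail_kb} KB"
    [ "$size_kb" -ge $((min_gb * 1024 * 1024)) ]
}
```

For example, `has_min_size_gb / 10 || echo "root volume is smaller than 10 GB"` would flag a 6 GB root volume like the one described earlier in the thread.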

    #29089
    Anupam Gupta
    Participant

    Hi Ted,
    Thanks for your valuable response. We are now using two instances of type m1.large, and we have installed the CentOS GUI on one of them. This node runs the NameNode, JobTracker, and HBase Master; we successfully installed and ran Talend Open Studio on this node using the NX client.
    The Ambari server is installed on the other node (slave). Before we installed the GUI and the NX server, all Hadoop services were running perfectly, but now HDFS, MapReduce, and HBase are down. Is it because we have very little space (500 MB) left on the master node, or is it some other issue?

    Please Help
    Thanks in Advance,
    Agupta

    #29103
    Sasha J
    Moderator

    Did you reboot your nodes?
    Make sure you have ambari-server running on the Ambari server node and ambari-agent running on all nodes.
    Then you can connect to the Ambari UI and start/stop services.

    Thank you!
    Sasha
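
One way to carry out this check after a reboot is to ask each Ambari daemon for its status. This is a sketch under assumptions: the wrapper name `ambari_status` is hypothetical, and it relies on the `ambari-server` and `ambari-agent` control commands being on the PATH of the node where it runs.

```shell
# Print the status of an Ambari daemon if its control command is installed.
# ambari_status is a hypothetical helper name.
ambari_status() {
    svc=$1   # expected to be "ambari-server" or "ambari-agent"
    if command -v "$svc" >/dev/null 2>&1; then
        "$svc" status
    else
        echo "$svc: command not found on this host"
        return 127
    fi
}
```

On the Ambari server node you would run `ambari_status ambari-server`, and on every node `ambari_status ambari-agent`. If a daemon is reported as stopped, `ambari-server start` or `ambari-agent start` (run as root) should bring it back.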

    #29108
    Anupam Gupta
    Participant

    Hi Sasha,
    Thanks for the reply. We have only two instances of type m1.large. The Ambari server was running on one node; this node also hosts the DataNode, Hive Metastore, Oozie, SecondaryNameNode, etc.

    We have ambari-agents on all nodes.
    Ambari was running perfectly, and all services were up and running, before we installed the CentOS GUI and the NX server on the other node (master).
    _____________________________________________________________________________
    To enable GNOME, I did the following:

    # yum -y groupinstall "Desktop" "Desktop Platform" "X Window System" "Fonts"

    # vi /etc/yum.conf
    group_package_types=default mandatory optional

    # yum -y groupinstall "Legacy X Window System compatibility"

    # vi /etc/inittab
    change
    id:3:initdefault:
    to
    id:5:initdefault:

    # init 6

    Install FreeNX

    # yum -y install nx freenx
    # cd /etc/nxserver
    # cp node.conf node.conf.backup
    Edit the node.conf file to set ENABLE_PASSDB_AUTHENTICATION="1".

    Modify the /etc/ssh/sshd_config file
    set PasswordAuthentication yes

    # service sshd restart
    # nxserver --restart

    # nxserver --adduser root
    # nxserver --passwd

    Thanks in Advance,
    AGupta

    #29133
    tedr
    Moderator

    Hi Agupta,

    You will need to look in the Hadoop logs to determine why these services stopped. You can find the HDFS and MapReduce logs in /var/log/hadoop on the instance where these services run. The HBase logs can be found in /var/log/hbase.

    Thanks,
    Ted.
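
As a starting point for reading those logs, the most recent error lines can be pulled out with grep. This is a rough sketch, assuming the logs are plain text and mark problems with the words ERROR or FATAL; the function name `scan_logs` is made up for illustration.

```shell
# Print the last COUNT lines containing ERROR or FATAL found under a log directory.
# scan_logs is a hypothetical helper name; COUNT defaults to 20.
scan_logs() {
    dir=$1
    count=${2:-20}
    if [ ! -d "$dir" ]; then
        echo "$dir: no such directory" >&2
        return 1
    fi
    # -r searches subdirectories, -h omits file names, -E enables alternation.
    grep -rhE 'ERROR|FATAL' "$dir" | tail -n "$count"
}
```

For example, `scan_logs /var/log/hadoop` and `scan_logs /var/log/hbase 50` on the node where the services were running would show the latest failures, which may point to the low disk space mentioned above or to some other cause.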
