How To Install Hadoop on Windows with HDP 2.0

Double-click the EXE. Sorta.

Installing the Hortonworks Data Platform 2.0 for Windows is straightforward. Let’s take a look at how to install a one-node cluster on your Windows Server 2012 R2 machine.

To start, download the HDP 2.0 for Windows package. The package is under 1 GB, and will take a few moments to download depending on your internet speed. Documentation for installing a single node instance is located here. This blog post will guide you through that instruction set to get you going with HDP 2.0 for Windows!

Here’s an outline of the process you’ll work through to deploy:

  • Install the prerequisites
  • Deploy HDP on your single node machine
  • Start the services
  • Run smoke tests to validate the install

Install the Prerequisites

You’ll now install Java, Python, and the Microsoft Visual C++ runtime. Windows Server 2012 already includes an up-to-date .NET runtime, so you can skip that step.

Let’s download the C++ runtime and install it by double-clicking the downloaded MSI.

Next, download Python 2.7.x and double-click the downloaded MSI to install the package.

Once you’ve installed Python, you’ll need to ensure HDP can find it by updating the PATH system environment variable.

Go to Computer > Properties > Advanced System Settings > Environment Variables. Then append the Python install path, for example C:\Python27, to the PATH value after a ‘;’.
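
If you prefer to script the change, a roughly equivalent command from an elevated PowerShell prompt looks like the sketch below (it assumes the default C:\Python27 install location; the new PATH only takes effect in freshly opened prompts):

   > $machinePath = [Environment]::GetEnvironmentVariable("Path", "Machine")
   > [Environment]::SetEnvironmentVariable("Path", $machinePath + ";C:\Python27", "Machine")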

Verify your path is set up by opening a new PowerShell or Command Prompt window and typing python, which should start the Python interpreter. Type quit() to exit.
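
For example, a quick check looks like this (the exact version string depends on the 2.7.x release you installed):

   > python --version
   Python 2.7.x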

Set up Java, which you can get here. You will also need to set JAVA_HOME, which Hadoop requires. Make sure to install Java to a path without spaces; “Program Files” will not work!

To set JAVA_HOME, in Explorer right-click Computer > Properties > Advanced System Settings > Environment Variables. Then create a new system variable called JAVA_HOME that points to your Java install (in this case, C:\java\jdk1.6.0_31).
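
The same PowerShell approach works here too, if you prefer the command line (the JDK path below is the one used in this walkthrough; point it at your own install):

   > [Environment]::SetEnvironmentVariable("JAVA_HOME", "C:\java\jdk1.6.0_31", "Machine")

Open a new Command Prompt afterwards and run echo %JAVA_HOME% to confirm the variable is picked up.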

Install the MSI package

Now we have all the prerequisites installed. The next step is to install the HDP 2.0 for Windows package.

Extract the MSI from the zip package you downloaded earlier. Open a PowerShell prompt as Administrator (“Run as Administrator”) and run the MSI with this command:

   > msiexec /i "hdp-2.0.6.0.winpkg.msi"
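
If the installation fails without an obvious error, you can rerun the installer with standard verbose MSI logging enabled and inspect the resulting log (the log file name here is just an example):

   > msiexec /i "hdp-2.0.6.0.winpkg.msi" /l*v hdp-install.log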

The HDP Setup window appears pre-populated with the host name of the server, as well as default installation parameters. Now, complete the form with your parameters:

  • Set the Hadoop User Password. This enables you to log in as the administrative user and perform administrative actions. The password must meet your local Windows Server password requirements; we recommend a strong password. Note the password you set, as we’ll use it later.
  • Check ‘Delete Existing HDP Data’. This ensures that HDFS will be formatted and ready to use after you install.
  • Check ‘Install HDP Additional Components’. This installs ZooKeeper, Flume, and HBase as HDP services deployed to the single-node server.
  • Set the Hive and Oozie database credentials. Enter ‘hive’ for all Hive Metastore entries, and ‘oozie’ for all Oozie Metastore entries.
  • Select DERBY, not MSSQL, as the DB Flavor in the dropdown. This sets HDP up to use an embedded Derby database, which is ideal for a single-node evaluation scenario.

When you have finished setting the installation parameters, click ‘Install’ to install HDP.

The HDP Setup window will close, and a progress indicator will be displayed while the installer runs. The installation will take a few minutes; disregard the progress bar’s estimated time display.

The MSI installer window will display an info prompt when the installation is finished and successful.

Start the services and run a job

Once the install is successful, you will start the HDP services on the single node.

Open a command prompt and navigate to the HDP install directory. By default, the location is “C:\hdp”, unless you set a different location:

   > cd C:\hdp

   > start_local_hdp_services
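
Once the script finishes, you can spot-check that the Hadoop services actually started. The display-name filter below is an assumption; adjust the pattern if your service names differ:

   > powershell -Command "Get-Service | Where-Object { $_.DisplayName -match 'hadoop|hdp' } | Format-Table Name, Status"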

Validate the install by running the full suite of smoke tests. It’s easiest to run the smoke tests as the HDP super user: ‘hadoop’.

In a command prompt, switch to using the ‘hadoop’ user:

   > runas /user:hadoop cmd

When prompted, enter the password you set during install.

Run the provided smoke tests as the hadoop user to verify that the HDP 2.0 services work as expected:

   > cd C:\hdp

   > Run-SmokeTests hadoop

This will fire up a MapReduce job on your freshly set up cluster. If it fails the first time, try running it again with the same command: Run-SmokeTests hadoop.
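
If you want to go a step further, you can also submit one of the example jobs that ships with Hadoop. The jar path below is an assumption; the exact directory and version suffix vary by HDP build, so locate hadoop-mapreduce-examples-*.jar under your install directory first:

   > hadoop jar C:\hdp\hadoop\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.2.0.jar pi 4 100

This runs the bundled pi estimator with 4 map tasks and 100 samples per map; a completed job is another good sign the cluster is healthy.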

Congratulations, you are now Hadooping on Windows!

If you’d like to learn more about Hadoop, check out the Hortonworks Sandbox, a virtual machine for learning Hadoop, and sign up for our free ‘Learn Hadoop in 2 Weeks’ guided tutorial series.

Categorized by: HDP for Windows

Comments

Juan Luis Rivero | September 17, 2014 at 2:38 am

Hello guys,

I am trying to install HDP 2.3.1.0 for Windows on a single node. I have already installed Python as part of the Anaconda package, and the Anaconda directory is in the PATH environment variable. If I run python from a cmd prompt, Python 2.7.7 executes and points to Anaconda from any directory. I have also installed Java.

The problem is that I get an error: No python executable found in path. I am lost… what could have happened?

Thanks a lot.

Regards. JL

Disha | August 8, 2014 at 9:34 am

I am getting a pop-up saying “JAVA_HOME” must be set even when I have set it with a path that has no spaces.
Please help!!

dalal | August 3, 2014 at 8:19 pm

I installed the Hortonworks Data Platform 2.0 for Windows. The tutorial stops at Run-SmokeTests hadoop. After that there is no information or guide on what we can do with this Windows installation; all tutorials are related to the virtual sandbox. Is it possible to provide tutorials for Hortonworks Data Platform 2.0 for Windows similar to what you have for the virtual sandbox? http://hortonworks.com/hadoop-tutorial/hello-world-an-introduction-to-hadoop-hcatalog-hive-and-pig/

Tom | August 3, 2014 at 9:29 am

Windows 7 Ultimate, 64-bit, JDK 1.8.0.11, Python 3.4: installation fails, no log written, and no information in the system log (beyond the advice to check the non-existent log file).
To me, free was too expensive this time; I wasted about a day troubleshooting.
