HDP for Windows vs Linux

This topic contains 7 replies, has 4 voices, and was last updated by tedr 1 year, 5 months ago.

  • Creator
    Topic
  • #26948

    Tri Nguyen
    Participant

    Hi,

    Why would I choose “HDP Windows” over the “regular” HDP?

    Let’s take this imaginary scenario. My Dev Team is very proficient with the Windows environment (.NET, PowerShell, IIS, Visual Studio, etc.) but has very weak knowledge of Linux. Could “HDP for Windows” allow us to be as efficient as another team using the regular HDP? For example:

    – Administering and monitoring Hadoop using various Windows management tools and frameworks
    – Securing access to Hadoop via Active Directory
    – Integrating Pig scripts in PowerShell instead of Python
    – Writing code using .NET C# (Pig UDFs, MR jobs, HBase Client application, etc.)

    If none of the above is possible, then what is the real advantage of “HDP for Windows”? In that case very little of our knowledge of Microsoft Dev Tools, if any, could be leveraged. Would it actually be more productive to learn Linux and work directly with the “regular” HDP distro?

    Thanks in advance for any advice.

Viewing 7 replies - 1 through 7 (of 7 total)


  • Author
    Replies
  • #28675

    tedr
    Moderator

    Hi Tri,

    For more information on these, please see http://gettingstarted.hadooponazure.com/gettingStarted.html
    The information on HDInsight is transferrable to HDP on Windows.

    Thanks,
    Ted.

    #28175

    Tri Nguyen
    Participant

    Thanks for the info. Can you please tell me if the following scenarios are doable with HDP for Windows:

    1. I write a Pig script, but I need a custom UDF for some specific computation. Is it possible to write that UDF in C# and, if yes, how do you register it in the Pig script? In this scenario, Pig is the main player. The UDF written in C# is just a helper function, but it must implement the proper Hadoop interface.

    2. I would like to use a Pig script with more elaborate control-flow logic. Can I use PowerShell as the host script and embed Pig statements in it, so that PowerShell handles the control flow and Pig handles the data flow?

    3. Writing Hive queries (both DML and DDL) and capturing their results: what tool is currently used for this in interactive mode in HDP for Windows?

    4. Exporting data from HDFS to SQL Server: is it possible to use Sqoop? If yes, let’s say the export is complex and requires some custom code. Can this code be written in C# instead of Java? In other words, is Sqoop in HDP for Windows compatible with .NET?

    Thanks in advance for any insight.
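
Question 2 above (a host script handling control flow while Pig handles the data flow) can be sketched by launching the Pig command-line client once per iteration and passing values in with `-param`. This is only an illustration under stated assumptions, written in Python; the same pattern would apply from PowerShell. The script name `daily.pig`, the `DAY` parameter, and the launcher name passed as `pig_exe` are hypothetical, and the name of the launcher on an HDP for Windows install is not confirmed anywhere in this thread.

```python
import subprocess


def pig_command(script, params, pig_exe="pig"):
    """Build the argument list for one Pig run.

    Each entry in `params` becomes a `-param NAME=VALUE` pair, which the
    Pig script can reference as $NAME. `pig_exe` is an assumption: set it
    to whatever launcher your installation provides.
    """
    argv = [pig_exe]
    for name, value in sorted(params.items()):
        argv += ["-param", f"{name}={value}"]
    return argv + ["-f", script]


def run_for_each_day(script, days, run=subprocess.call, pig_exe="pig"):
    """Run the script once per day and stop at the first failure.

    `run` takes an argument list and returns an exit code; it defaults to
    subprocess.call but can be swapped out for testing.
    Returns the list of days that completed successfully.
    """
    completed = []
    for day in days:
        if run(pig_command(script, {"DAY": day}, pig_exe)) != 0:
            break  # control-flow decision made by the host script, not Pig
        completed.append(day)
    return completed
```

For example, `run_for_each_day("daily.pig", ["2013-06-01", "2013-06-02"])` would invoke `pig -param DAY=2013-06-01 -f daily.pig` and only move on to the next day if that run exits with code 0.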

    #28172

    Seth Lyubich
    Keymaster

    Hi Tri,

    Just to add to what was already said here: you should be able to join your servers to a domain and use Active Directory for user management.

    Hope this helps,

    Thanks,
    seth

    #28123

    Dan Rosanova
    Member

    Learning Linux is not something done lightly, and most experienced Windows developers are quite comfortable with their tools. HDP for Windows has some compelling things going for it, including the tooling. Almost all of the HDInsight tools work with HDP for Windows, so the .NET components and examples work fine. Using Windows system administration tools works fine too, and since the processes run as Windows services, most of your monitoring tools (like SCOM) will work with them as they are.

    You can totally write jobs in .NET or even in PowerShell. The thing to remember is that, at the lowest level, anything that reads StdIn and writes StdOut (i.e. a command-line program) will work, so a .NET console app will get the job done if nothing else. Also, the Visual Studio plug-in for Python is said to be the best Python IDE out there.

    Most importantly, something like 70% of the world’s servers run Windows – that’s a big market.

    None of this means that there is no learning curve, but the curve is smaller – you will still need to learn the Hadoop tools. If you need to learn the Hadoop tools AND the Linux OS… well, unless you have a UNIX background it’s going to be a 2x challenge.
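
The StdIn/StdOut point above is the contract used by Hadoop Streaming: the mapper and reducer are ordinary command-line programs that read lines and write tab-separated key/value lines. Here is a minimal word-count sketch of that contract in Python. The file name `wordcount.py` and the `map`/`reduce` argument convention are assumptions made for this illustration; a .NET console app following the same line-in, line-out pattern would fill the same role.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit 'word<TAB>1' for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each word.

    Relies on the input being sorted by key, which is how Hadoop
    Streaming delivers mapper output to a reducer. Lines without a
    tab separator are skipped.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if "\t" in line)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Only touch stdin when explicitly asked to act as a mapper or reducer,
# e.g. `python wordcount.py map` (hypothetical invocation).
if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1] in ("map", "reduce"):
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

With Hadoop Streaming, a script like this would be supplied through the `-mapper` and `-reducer` options of the streaming jar (for example `-mapper "python wordcount.py map"`), alongside `-input` and `-output`. The location of the streaming jar depends on the installation and is not given in this thread.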

    #27004

    tedr
    Moderator

    Hi Tri,

    Since HDP on Windows is rather new, there aren’t any examples that I know of yet. Also note that HBase is currently not included in the HDP on Windows release.

    Thanks,
    Ted.

    #26969

    Tri Nguyen
    Participant

    Hi Ted,

    Can you confirm that with “HDP for Windows” it is possible to develop in .NET C# and PowerShell and to manage Hadoop with Windows tools, with almost zero knowledge of Linux required?

    Do you know of any implementation of Pig + PowerShell UDFs, or of client app code in .NET C# working against an HBase back end?

    Thanks

    #26968

    tedr
    Moderator

    Hi Tri,

    Yes, all of those “advantages” you mention would be good reasons to choose HDP on Windows over regular HDP. Another factor in the decision is that some companies have a policy against using Linux servers, in which case HDP on Windows would be the only option left.

    Thanks,
    Ted.
