The Hortonworks Blog

Posts categorized by: Hadoop

Our SI partner Ingenious Qube worked with a customer who wanted to use Hortonworks Data Platform to price auto insurance based on driving-behavior insights obtained from sensors on cars. Rajnish Goswami, CEO of Ingenious Qube, describes the customer story below.

The Situation

Insurance companies around the world strive to provide lower insurance rates, and auto insurance is no exception. Automobile insurers are devising innovative pricing models that will help customers reduce their premiums; however, these models require an understanding of how each customer drives.…

This guest blog post is from Srikanth Venkat, director of product management at Dataguise, a Hortonworks security partner.

Plus ça change, plus c’est la même chose

As Jean-Baptiste Alphonse Karr noted, “The more things change, the more they stay the same.” Often, that’s not what we hear when looking at Hadoop security: people tend to call out how different Hadoop is, and how different its security solutions need to be.…

With YARN as its architectural center, Apache Hadoop continues to attract new engines to run within the data platform, as organizations want to efficiently store their data in a single repository and interact with it for batch, interactive and real-time streaming use cases. As more data flows into and through a Hadoop cluster to feed these engines, Apache Falcon is a crucial framework for simplifying data management and pipeline processing.

Falcon enables data architects to automate the movement and processing of datasets for ingest, pipeline, disaster recovery and data retention use cases.…

We take pride in producing valuable technical blogs and sharing them with a wider audience. Of all the blogs published on our website in 2014, the following were the most popular:

  • Improving Spark for Data Pipelines with Native YARN Integration.

    Gopal Vijayaraghavan and Oleg Zhurakousky demonstrate improvements to Apache Spark made with the help of the pluggable execution context.

  • HDP 2.2 A Major Step Forward for Enterprise Hadoop

    Tim Hall outlines six months of innovation and new features across Apache Hadoop and its related projects.

  • Introduction

    Apache Ranger provides centralized security for the Enterprise Hadoop ecosystem, including fine-grained access control and a centralized audit mechanism. This blog covers the audit framework options available with Apache Ranger Release 0.4.0 in HDP 2.2 and how they can be configured.

    The audit framework can be configured to send access audit logs generated by Apache Ranger plug-ins to one or more of the following destinations:

    • RDBMS: MySQL or Oracle
    • HDFS
    • Log4j appender

    xasecure.audit.is.enabled: setting to enable/disable audit logging in the Ranger plug-in.…
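
To make the idea of a master audit switch plus per-destination settings concrete, here is a minimal Python sketch that builds a Hadoop-style `<configuration>` document and reports which destinations would be active. Only `xasecure.audit.is.enabled` comes from the excerpt above; the three per-destination property names and the helper functions are assumptions for illustration, so check them against the Ranger documentation for your release before relying on them.

```python
import xml.etree.ElementTree as ET

# Only "xasecure.audit.is.enabled" appears in the excerpt above.
# The per-destination names below are ASSUMED for illustration.
MASTER_SWITCH = "xasecure.audit.is.enabled"
AUDIT_FLAGS = {
    MASTER_SWITCH: True,
    "xasecure.audit.db.is.enabled": False,     # assumed: RDBMS destination
    "xasecure.audit.hdfs.is.enabled": True,    # assumed: HDFS destination
    "xasecure.audit.log4j.is.enabled": False,  # assumed: Log4j destination
}


def build_audit_config(flags):
    """Build a Hadoop-style <configuration> element from boolean flags."""
    root = ET.Element("configuration")
    for name, enabled in flags.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = "true" if enabled else "false"
    return root


def enabled_destinations(root):
    """Return enabled destination properties, honoring the master switch."""
    values = {
        prop.findtext("name"): prop.findtext("value") == "true"
        for prop in root.findall("property")
    }
    if not values.get(MASTER_SWITCH, False):
        return []  # audit logging is disabled for the whole plug-in
    return sorted(
        name for name, on in values.items() if on and name != MASTER_SWITCH
    )


config = build_audit_config(AUDIT_FLAGS)
print(ET.tostring(config, encoding="unicode"))
print(enabled_destinations(config))
```

In this sketch, turning the master switch off yields an empty destination list regardless of the other flags, which mirrors the described role of the setting as an enable/disable control for audit logging in the plug-in.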

    On December 4th, Hortonworks presented the fifth of eight Discover HDP 2.2 webinars: Apache Kafka and Apache Storm for Stream Data Processing. Taylor Goetz, Rajiv Onat, and Justin Sears hosted the webinar.

    After Justin Sears set the stage for the webinar by explaining the drivers behind Modern Data Architecture (MDA), Rajiv Onat and Taylor Goetz introduced and discussed how to use Apache Kafka and Apache Storm for stream data processing.…

    With Apache Hadoop YARN as its architectural center, Apache Hadoop continues to attract new engines to run within the data platform, as organizations want to efficiently store their data in a single repository and interact with it for batch, interactive and real-time streaming use cases. Apache Storm brings real-time data processing capabilities to help capture new business opportunities by powering low-latency dashboards, security alerts, and operational enhancements integrated with other applications running in the Hadoop cluster.…

    Hortonworks introduces HDP Operations Ready, HDP Security Ready and HDP Governance Ready certifications to showcase solutions that deeply integrate with enterprise Hadoop.

    Customer adoption of Apache Hadoop continues to accelerate the pace at which the community works to meet the requirements of Enterprise Hadoop. A Hadoop cluster was once the place of HDFS and MapReduce only, but the introduction of Apache Hadoop YARN a little over a year ago has unleashed many new ways to get value from it.…

    Hortonworks architects vertically integrate the projects within our Hadoop distribution with YARN and HDFS so that HDP can span batch, interactive, and real-time workloads across both open source and other data access technologies. In HDP 2.2, we deliver work to vertically integrate Apache Storm, Apache Accumulo and Apache HBase so that all of those long-running services run in Hadoop on YARN via Apache Slider.

    The Apache Slider community recently released Apache Slider 0.60.0.…

    On November 13th, Hortonworks presented the fourth of eight Discover HDP 2.2 webinars. Rohit Bakhshi, Jitendra Pandey, and Justin Sears hosted the webinar.

    Rohit Bakhshi and Jitendra Pandey introduced HDP and discussed how to use HDFS as a reliable, scalable, cost-efficient, and fault-tolerant distributed data storage platform for your Modern Data Architecture (MDA). They also covered new HDFS data storage innovations now included in HDP 2.2:

    • Heterogeneous storage
    • Encryption
    • Operational security enhancements

    Here is the complete recording of the webinar.…

    As we approach the opening bell on Nasdaq and another milestone for open source Apache Hadoop, we at Hortonworks want to thank those who have contributed deeply to this journey. We owe you – our customers – a huge thank you. Your active collaboration with us in the Apache Hadoop community has greatly impacted the trajectory of this platform for data management and has established a path for how thousands of other enterprises can successfully build a new open data architecture that brings all data under management.…

    Many types of industries are finding new opportunities from an abundance of new types of data stored at scale in Hadoop, combined with Hadoop’s ability to process that data at lower costs than traditional platforms. Apache Hadoop and the Hortonworks Data Platform (HDP) can help enterprises turn what used to be data fumes into high-octane fuel that propels their businesses.

    Sign up for the Hadoop industry solutions email series to find out how Hortonworks customers use Hadoop to solve real-world business challenges.…

    The Stinger.next initiative, with its focus on transactions, sub-second queries and SQL:2011 Analytics, evolves Apache Hive to allow it to run most of the analytical workloads that are typical within a data warehouse, but now at petabyte scale. The first phase of Stinger.next, delivered in Apache Hive 0.14 and in HDP 2.2, delivers transactions with ACID semantics, a critical step in the evolution of Hive as the de facto standard for SQL in Hadoop.…

    The public sector is charged with protecting citizens, responding to constituents, providing services and maintaining infrastructure. In many instances, the demands of these responsibilities increase while government resources simultaneously shrink under budget pressures.

    How can Intelligence, Defense and Civilian agencies do more with less?

    Apache Hadoop is part of the answer. Within the public sector, Hadoop delivers data-driven actions in support of IT efficiency and good government.

    Download the White Paper

    In one example, the United States Internal Revenue Service had to reduce its auditor headcount due to budget cuts.…

    With Apache Hadoop YARN as its architectural center, Apache Hadoop continues to attract new engines to run within the data platform, as organizations want to efficiently store their data in a single repository and interact with it in different ways. As YARN propels Hadoop’s emergence as a business-critical data platform, the enterprise requires more stringent data security capabilities. Apache Ranger provides many of these, with central security policy administration across authorization, accounting and data protection.…