Hadoop Ecosystem

About two years ago, Hortonworks donated the entire code base of about 440,000 lines from its XA Secure acquisition to the Apache Software Foundation (ASF) to help jump-start Apache Ranger as an Apache Incubator project. Hortonworks made this decision because our enterprise customers need an extensible and robust open source security framework […]

In this blog, we discuss SAS® Grid Manager for Hadoop. There are some very compelling reasons to modernize data architectures with Hadoop. Anyone responsible for administering SAS workloads on Hadoop or considering this path should know about SAS Grid Manager for Hadoop. What is SAS Grid Computing? SAS Grid Computing has been offering […]

Today, Hortonworks announced the Hortonworks EDW Optimization Solution to help extend and accelerate return on investment for business intelligence, e.g., the data warehouse. The solution brings together technologies from Hortonworks and partners Syncsort and AtScale. But before I dig into the details of this solution, it is worth understanding the vision Hortonworks is revealing here. […]

Interested in sharing your knowledge with the best and brightest in the data community? If you are, then be sure to submit an abstract for DataWorks Summit/Hadoop Summit San Jose, which will be held June 13-15 at the San Jose McEnery Convention Center. DataWorks Summit/Hadoop Summit is the industry’s premier event focusing on next-generation big data […]

We are very excited to be bringing you DataWorks Summit/Hadoop Summit this year. It’s the industry’s premier event focusing on next-generation big data solutions. We hope that you’ll be able to attend this year and learn from your peers and industry experts about how open source technologies like Apache Hadoop, Apache Spark, and Apache NiFi […]

Bob Glithero, Analytics Product Marketing Manager, Pivotal: Over the last five years, mobile network operators (MNOs) realized 15% lower compound revenue growth on average than other types of communication service providers. With few exceptions, MNOs globally have seen a long-term decline in average revenue per user (ARPU). To reinvigorate growth, innovative MNOs are searching for […]

We were really excited to welcome a sold-out crowd at the first Hadoop Summit in Tokyo last week. This was a fantastic response, based on the huge interest around a technology that is transforming industries across Asia and the Pacific. We could not put this kind of conference on without the help of our sponsors […]

We recently hosted a webinar on the topic of HDF 2.0 and the integration between Apache NiFi, Apache Ambari and Apache Ranger. We thought we would share the questions & answers from the webinar, and also compile relevant data into a single place to make it easy to find and reference. Should you have any […]

Guest author: Jeff Kelly, Data Strategist, Pivotal. The phrase “digital transformation” gets bandied about a lot these days, but what exactly does it mean? When you strip away the hyperbole, I believe digital transformation is the process by which enterprises evolve from using traditional information technology to merely support existing business models to adopting modern […]

Provenance, Lineage & Chain of Custody: The models of Provenance, Lineage and Chain of Custody are used in fine art to determine when a piece was created, the sequence of locations where it was held, how it was touched along the way, and who has owned it since creation, all with the purpose of authenticating the piece. […]

People often think about cloud architecture in simplistic terms: you’re either public, private, or hybrid. (In fact, there’s even confusion about the meaning of the term “hybrid” itself.) In the real world, of course, virtually every implementation is hybrid: no company puts 100% of its IT environment into one single cloud. […]

Apache Hive™ is the most complete SQL on Hadoop system, supporting comprehensive SQL, a sophisticated cost-based optimizer, ACID transactions and fine-grained dynamic security. Though Hive has proven itself on multi-petabyte datasets spanning thousands of nodes, many interesting use cases demand more interactive performance on smaller datasets, requiring a shift to in-memory. Hive 2 marks the […]

Financial Regulators Are Driving a Data Evolution: Traditionally, technology moves fast and regulators react slowly. When technology leaps forward, it enables financial firms to change the nature of their business, often into unregulated territory; regulators then pass regulation to catch up. This model can work in slow-moving markets, but in today’s interconnected […]

As enterprises around the world bring more of their sensitive data into Hadoop data lakes, balancing the need for democratization of access to data without sacrificing strong security principles becomes paramount. According to a recent research report by Securosis, “Hadoop has (mostly) reached security parity with the relational platforms of old, and that’s saying a […]

User Interface and User Experience are some of the most important aspects of developing a product. No matter how many amazing features something has, a user must be able to access them in order to reap the full benefits of the product. For example, in the Apache Ambari Web UI, add-on apps called Views have, […]