Viewing posts by: Carter Shanklin

The 100% open source and community-driven innovation of Apache Hive 2.0 and LLAP (Live Long and Process) truly brings agile analytics to the next level. It enables customers to perform sub-second interactive queries without the need for additional SQL-based analytical tools, enabling rapid analytical iterations and providing significant time-to-value. TRY HIVE LLAP TODAY Read about […]

The world’s top authorities on Apache Hadoop convene at Hadoop Summit San Jose, and one of the top questions to be answered concerns the future and direction of Hadoop. Sanjay Radia, Founder and Architect at Hortonworks, led the track, which selected 13 sessions around this topic. I asked Sanjay what he hoped would […]

Our business in Europe continues to expand, and I’m excited to share this guest blog post from Geoff Cleaves, Business Intelligence Manager at Billy Mobile, a new Hortonworks customer based in Barcelona, Spain. This week at Billy Mobile we are migrating our core technology stack onto HDP 2.3 and boy are we looking forward to […]

I recently had the pleasure of visiting with Arvind Battula, Sr. Data Scientist at Schlumberger. We discussed his background as a chemical and mechanical engineer and his move onto the Data and Analytics team as a data scientist. The following is a transcript of my conversation with Arvind. We discussed his background, his interesting focus areas for […]

Yahoo! JAPAN needed a data platform that could scale to generate 100,000 reports per day and process large amounts of data. It needed to keep the last 13 months’ worth of data, which is approximately 500 billion rows, organized and easily accessible. Relational Database Management Systems (RDBMS) cannot scale […]

Are you still learning about the Data Lake? Wondering how it can help your organization manage and leverage massive amounts of data? On September 8th, VHA, the largest member-owned health care company delivering supply chain management services and clinical services to its members, will share their experience and explain how they simplified data management and […]

On July 22nd, we introduced the general availability of HDP 2.3. In part 2 of this blog series, we explore notable improvements and features related to Data Access: SQL on Hadoop; Spark 1.3.1; Stream Processing; Systems of Engagement that scale; HDP Search. We are especially excited about what these data access improvements mean for our […]

Hortonworks is always pleased to see new contributions come into the open-source community. We worked with our customer, Hotels.com, to help them develop libraries and utilities around Apache Hive, the Apache ORC format and Cascading. It’s great to see the results released for the community. In this guest blog, Adrian Woodhead, Big Data Engineering Team […]

As YARN drives Hadoop’s emergence as a business-critical data platform, the enterprise requires more stringent data security capabilities. Apache Ranger delivers a comprehensive approach to security for a Hadoop cluster. It provides a platform for centralized security policy administration across the core enterprise security requirements of authorization, audit and data protection. On June 10th, […]

Sumeet Kumar Agrawal, principal product manager for the Big Data Edition product at Informatica, is our guest blogger. In this blog, he explains how Informatica’s Big Data Edition integrates with Tez and allows for significant performance gains. Informatica Big Data Edition’s codeless visual development environment accelerates the ability of enterprises to take advantage of amazing innovations in big […]

Not a day passes without someone tweeting or re-tweeting a blog on the virtues of Apache Spark. At a Memorial Day BBQ, an old friend proclaimed: “Spark is the new rub, just as Java was two decades ago. It’s a developers’ delight.” Spark as a distributed data processing and computing platform offers much of what […]

SQL is the most popular use case for the Hadoop user community, and Apache Hive is still the de facto standard. Early this week, the Apache Hive community released Apache Hive 1.2.0. Already the third release this year, the Hive developer community continues to improve the release and grow its team, with 11 Hive contributors promoted […]

Kristen Hardwick, Vice President of Big Data Solutions at Spry, Inc., is our guest blogger. In this blog, Kristen shares performance analysis from Spry’s evaluation of Apache Hive with Tez as a fast query engine. In early 2014, Spry developed a solution that heavily utilized Hive for data transformations. When the project was complete, three […]

Historically, the strength of a platform lies in the abilities of developers to learn, try, and build against the platform APIs and capabilities. As Apache Hadoop matures as a platform, it’s the creativity and efforts of the developer community that are driving the innovation that makes Hadoop a vibrant and impactful foundation of a modern […]

Two weeks ago, Apache ORC became an Apache top-level project within the Apache Software Foundation (ASF). This is a major step forward for the project, and it reflects the momentum built by a broad community of developers. What is ORC and why is it useful? Back in January 2013, we created […]