The Hortonworks Blog

Posts categorized by: Big Data

Since its first deployment at Yahoo in 2006, HDFS has established itself as the de facto scalable, reliable and robust file system for Big Data. It has addressed several fundamental problems of distributed storage at unparalleled scales and with enterprise-grade robustness.

As more and more enterprises adopt Apache Hadoop, it is becoming a unified central store, or Data Lake, for all kinds of enterprise data. Many of these use cases involve file storage for classic big data applications, where HDFS is the perfect fit.…

Today’s guest blog comes from Matt Davies at Splunk, where he is the Director of Marketing.

You can’t really escape the fact that we’re in the age of the customer. From CRM to the “long tail” to multi-channel to social media brand sentiment to Net Promoter Scores – it is all about customer experience. Big Data has an important part to play – no great revelation there but how do you actually do it?…

Syncsort is a certified Hortonworks Technology and YARN Ready Partner and our guest blogger. Here, Tendu Yogurtcu, vice president of engineering at Syncsort, expands on Syncsort’s recent news about their integration of DMX-h and Ambari.

As Apache Hadoop YARN has transformed Hadoop from being a data processing solution to being a true data processing platform, requirements for provisioning, managing, and securing the platform have changed dramatically.

Stability, security, easy deployment, performance, management and monitoring are among many of the key attributes that make a data management platform enterprise-grade.…

With the release of Apache Hadoop YARN in October of last year, more and more solution providers are moving from single-application Hadoop clusters to a versatile, integrated Hadoop 2 data platform. This allows them to host multiple applications, eliminating silos, maximizing resources and bringing true multi-workload capabilities to Hadoop.

That is why we’re extremely excited to have Paul Kent, Vice President of Big Data at SAS, share his insights on the value of Apache Hadoop YARN and the benefits it brings to SAS and its users.…

Today we are delighted to announce the formal partnership between Accenture and Hortonworks, the next step in a collaboration between the two companies that started in 2012. With this formal agreement, Accenture and Hortonworks will collaborate on making large structured and unstructured datasets – including operational, video and sensor data – more accessible to organizations for insight-driven decision-making. Together, the two companies will continue to work on joint horizontal and vertical solutions to speed the adoption of Apache Hadoop.…

Apache Cassandra is an open source NoSQL distributed database management system designed to handle large amounts of data. It offers a scalable, real-time solution that allows users to create online applications that are “always-on, no matter what.” DataStax is the company behind Cassandra, and a new Technology Partner of Hortonworks.

Lynn Walitch leads Partner Management for DataStax and is our guest blogger today. Lynn discusses the importance of the partnership and certification with Hortonworks.…

Oscar Padilla, Vice President of Strategy at Luminar, is our guest blogger. He shares his thoughts and insights about Apache Hadoop, Hortonworks Data Platform, and Luminar’s journey to the Data Lake.

Luminar is the first big data analytics provider focused specifically on U.S. Latino consumers. Our company offers analysis based on empirical insights rather than on a sample-based approach. Apache Hadoop and Hortonworks Data Platform (HDP) make this empirical approach work at scale.…

Data Analytics Virtual Event

Hortonworks and Teradata have partnered to provide a clear path to Big Data Analytics via stable and reliable Hadoop for the enterprise. We are excited to support their upcoming Big Data Analytics virtual event, “Data Discovery in Action.” We will have experts standing by to answer questions and help ensure you have the right strategy in place for all of your big data.

At this event on July 2nd, you will learn more about how Teradata’s Unified Big Data Architecture™ provides a quick path to data discovery.…

We recently hosted the fourth of our seven Discover HDP 2.1 webinars, entitled Apache Hadoop 2.4.0, HDFS and YARN. It was well attended and very informative. The speakers outlined the new features in YARN and HDFS in HDP 2.1, including:

  • HDFS Extended ACLs
  • HTTPS support for WebHDFS and for the Hadoop web UIs
  • HDFS Coordinated DataNode Caching
  • YARN Resource Manager High Availability
  • Application Monitoring through the YARN Timeline Server
  • Capacity Scheduler Preemption

Many thanks to our presenters, Rohit Bakhshi (Hortonworks’ senior product manager), Vinod Kumar Vavilapalli (co-author of the YARN Book, PMC, Hadoop YARN Project Lead at Apache and Hortonworks), and Justin Sears (Hortonworks’ Product Marketing Manager).…

Trifacta, a Hortonworks Technology Partner and a pioneer in data transformation, was recently certified with HDP 2.1. Here, Trifacta’s CTO and Co-founder, Sean Kandel, talks about their Predictive Interaction™ solution with Hortonworks Data Platform.

“I spend more than half my time integrating, cleansing and transforming data without doing any actual analysis. Most of the time I’m lucky if I get to do any analysis.” – Data Scientist [1]

The most commonly reported use of Hadoop today is data transformation. …

Customers want to make more rapid, data-driven decisions but historically this has been challenging in the era of Big Data. Predictive analytics, machine learning and statistical algorithms are at the leading edge of where enterprises can unlock the value hidden in their data to deliver timely insights for intelligent decisions.

Zementis is a new Hortonworks Technology Partner offering a standards-based predictive analytics scoring engine for Hortonworks Data Platform (HDP) and existing data repositories as part of the Modern Data Architecture (MDA).…

In this blog, Paul Phillips, EMEA Sales Director at Hortonworks, discusses the importance of extending big data science courses to PhD students and scientists. This joint venture with KPMG provides an opportunity to “bring excellent basic skills that are useful in data science and this programme aims to commercialize these skills and ease the path to a data science profession.”

At Hortonworks, we encourage our team members to innovate, and as the Open Source community grows, it is also vital that we play our part to ensure the community is continually reinvigorated with new ideas and innovation.…

On Wednesday, May 21, Himanshu Bari (Hortonworks’ senior product manager), Venkatesh Seetharam (committer to Apache Falcon), and Justin Sears (Hortonworks’ Product Marketing Manager) hosted the third of our seven Discover HDP 2.1 webinars. Himanshu and Venkatesh discussed data governance in Hadoop through Apache Falcon, which is included in HDP 2.1. As most of you know, ingesting data into Hadoop is one thing; having data governed, by dictating and defining data-pipeline policies, is another thing, and a necessity in the enterprise.…

According to the New York Observer, there were a couple of major social reasons that spurred the genesis and growth of Meetup.com. The first was Robert Putnam’s book Bowling Alone, in which he talks about the collapse of communities in America. The second was an event that changed not only the world but also New York: the aftermath of September 11, when strangers cared about greeting, meeting, and talking.…

For years, experts in the healthcare industry have been searching for ways to detect (and possibly cure) Alzheimer’s disease, the most common form of dementia. Current estimates indicate that 35.6 million people are living with dementia, projected to jump to 135 million by 2050, according to the Global CEO Initiative on Alzheimer’s Disease. At a projected cost of over $600 billion each year, it’s a looming global health and fiscal crisis.…
