Hadoop big data gets personal
One of the reasons Hadoop big data analysis is particularly valuable is that much of it is fundamentally about people – what they do, what they buy, what they think about or what they want, etc. Analysis gleaned from big data offers a clearer picture of human conditions and enables organizations to anticipate needs and respond to wants. Many big data projects offer a more objective view of the way humanity functions and can trigger insight into how to make improvements.
The newest trend in big data insight is analysis directly targeted at people, reported Bloomberg Businessweek. In particular, 'People Analytics' is an approach through which organizations turn the focus inward and use big data comprising external research and intracompany observations to make operational improvements and interact better with employees.
"Because most communication and collaboration happens face to face, the data are critical for people analytics to take that next leap forward and become a transformative organizational tool," wrote Businessweek contributor Ben Waber.…
3 steps to making Hadoop data aerodynamic
Comparing Hadoop big data analytics to an aerodynamic vehicle produces a fairly apt parallel – both are modern concepts that harness lots of information and figure out how to concentrate it for optimal results. Big data can produce a lot of figurative weight and drag if insights aren't directed with the right focus, and the friction caused by retrieval lags can torpedo organizational growth. Here are three ways to make big data analytics soar.
1) Conquer the air
This step might sound silly, but it comes in the spirit of believing that the sheer amount of available information can be conquered. Big data arrives constantly and from various sources, continuously regenerating and offering new perspectives. According to DotNetNuke CEO Navin Nagiah, data streams will continue to get more crowded, but that doesn't have to mean analytics efforts must become cloudier. On the contrary, having the data and being able to use it will be paramount to success.
"In the business world, it is the company that has the data that has the power," wrote Nagiah.…
Hadoop adoption is key to understanding big data
The Hadoop explosion continues, with the number of organizations adopting the platform growing at a compound annual growth rate of about 60 percent, according to the International Data Corporation. However, ReadWrite's Matt Asay wrote that many of the companies adopting it are still acclimating to Hadoop and aren't using it at its optimal capacity. Currently, most organizations that use Hadoop take advantage of its storage and ETL (extract, transform, load) features, but aren't taking the crucial steps for optimizing big data analysis.
"The fact that most enterprises have yet to get to analytics in any meaningful way is simply a description of where we are in the Hadoop market's evolution," Asay suggested.
Part of the reason for the lag, Asay asserted, is that Hadoop's many components can make some users unwilling to spend time discovering what it has to offer. A recent report by CIO Insight illustrated the stratification of Hadoop users.…
Hadoop tools benefit medical IT
Hadoop big data tools can provide healthcare providers with numerous benefits, both by enhancing patient care and by driving down operational expenses. With the current state of healthcare costs in the United States, this should come as a welcome relief. According to analysis conducted by Aon Hewitt, average healthcare premiums are predicted to rise by 6.3 percent in 2013, bringing the average health plan premium cost for each employee to $11,188. With these rising costs showing no signs of abating, there has never been a more pressing need for the cost-saving applications of data analytics.
Charles Boicey, a member of the IT team at the University of California – Irvine Medical Center, recently spoke to eWeek about his organization's successful implementation of Hadoop architecture. Boicey was intrigued by the technology's ability to provide superior data analytics functions with little to no latency.…
Protecting data in Hadoop
Once an enterprise has its Hadoop platform up and running, it may want to consider employing security measures to protect all of the valuable data it is amassing on its servers. Cybercriminals can find many ways to profit from information, regardless of the industry or field of research. The effects of a data breach can damage any organization. Data analytics research projects can be compromised and sensitive information can be stolen. If a company using personal customer information had its databases breached, the public fallout and erosion of consumer confidence could be harmful. In order to prevent these unfortunate outcomes, data analytics researchers should ensure that their Hadoop big data projects are well secured.
Sarbanes-Oxley Compliance Journal contributor Manmeet Singh recently outlined several steps that organizations can take to protect their big data projects. One of the major aspects of big data security that managers should take into consideration is preparation.…
Big data spending on the rise
Big data and Hadoop software have provided enterprises from different industries with the resources to tackle a number of issues. The many success stories for data analytics appear to be spreading as more enterprises invest in the technology.
MarketsandMarkets recently released a study detailing the success of the big data market across the globe. According to researchers' findings, the worldwide Hadoop market is predicted to increase at a compound annual growth rate of 54.9 percent through 2017. Last year, the global market was worth $1.56 billion, but that figure is expected to reach $13.95 billion in 2017.
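The growth figures above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes "last year" means 2012, so that five annual compounding periods separate the $1.56 billion base from the 2017 projection. The function names are hypothetical and do not come from the report.

```python
def project_value(base: float, cagr: float, years: int) -> float:
    """Compound a starting value forward at a fixed annual growth rate."""
    return base * (1.0 + cagr) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Recover the annual growth rate implied by a start and end value."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted in the report, in billions of US dollars.
BASE_VALUE = 1.56      # market size "last year" (assumed here to be 2012)
TARGET_VALUE = 13.95   # projected market size in 2017
QUOTED_CAGR = 0.549    # 54.9 percent
YEARS = 5              # assumption: 2012 through 2017

projected = project_value(BASE_VALUE, QUOTED_CAGR, YEARS)
rate = implied_cagr(BASE_VALUE, TARGET_VALUE, YEARS)

print(f"Projected 2017 market: ${projected:.2f} billion")
print(f"Implied annual growth rate: {rate:.1%}")
```

Under that five-year assumption, the projection lands within a few hundredths of a billion dollars of the quoted $13.95 billion, so the two figures are consistent with the stated growth rate.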
According to the report, businesses are using Apache Hadoop to apply real-time analytics to a wide variety of unstructured data, covering everything from internet traffic to information collected from environmental sensors. Hadoop applications have been deployed by organizations looking to enhance their customer engagement, recruitment and retention practices, as well as by enterprises seeking more scientific applications, such as those in the healthcare industry.…
Improving retail sales through big data
The spread of ecommerce businesses has created a formidable threat to the continued prosperity of physical retail stores. More consumers are turning toward online stores because of their convenience and low prices. The proliferation of mobile devices has made this trend even more pronounced, with many shoppers having access to digital shops anytime and anywhere. In order to compete, retailers need to leverage every advantage they can get. Fortunately, big data analytics can help.
Creating a more personalized experience
In a recent ClickZ article, big data expert Krishnan Parasuraman offered several ways that retailers could improve their business with data analytics software. One method was to use big data tools to provide a more personalized experience for shoppers. Online stores have it easy in this regard since from the moment a customer logs into his or her account, the site has access to personal information, order histories and shopping preferences to craft an individualized marketing and sales approach.…
Hadoop tools predict mechanical breakdown
The manufacturing, utility and oil industries can especially benefit from big data applications. The huge volumes of data generated by these sectors can now be gathered and processed more efficiently, allowing data analysts to extract valuable information regarding many operations.
For instance, the advent of smart meters has allowed utility companies to gather information on consumer energy consumption. According to a survey conducted by Oracle about the utility industry, companies plan to leverage big data tools to predict electricity demands and minimize the scope of power outages. Enterprise CIO Forum contributor Jamal Khwaja argued that the information gleaned from smart meters could be used to determine exactly how much wasteful energy consumption practices – such as using incandescent light bulbs or leaving appliances running through the night – are costing homeowners on their electric bill.
One promising development in data analytics has been the recent announcement of Hadoop-powered software that could predict the likelihood of failure in expensive machinery, according to IT Jungle.…
Modern customer engagement requires data technology
Big data tools such as Hadoop and data virtualization are essential for customer management as businesses move toward providing more individualized experiences, according to Forrester Research. In a recent column for CIO Journal, Forrester analysts Noel Yuhanna and Mike Gualtieri explained that organizations need a holistic technology approach that includes elements such as big data, predictive analytics and data virtualization.
"You must consider every shred of customer data available for analysis, as it may contain gems that you can use to individualize experiences," they wrote. "Traditional data management solutions and approaches have difficulty consolidating and processing the array of large and unstructured data sets that defines big data. To support a customer big data platform, you need new technologies and architectures, including Hadoop, NoSQL databases, advanced enterprise data warehouses, and cloud analytic platforms."
Not only will organizations need to leverage customer information to gain better insight into how consumers are behaving, they will need to use analytics tools to build predictive models, Yuhanna and Gualtieri wrote.…
Company finds success with Hadoop and Hortonworks
Hadoop technology has been at the forefront of the big data boom from the beginning. As the trend continues to expand, more developers are finding that the analytics platform offers the right balance of scalability, operability and cost-effectiveness for meeting their big data needs. It does not appear that Hadoop's grip on the industry will loosen any time soon.
TechNavio recently released a report forecasting the growth of the worldwide Hadoop market. Researchers expect the market to expand at a compound annual growth rate of more than 55 percent through 2016. According to the report's publishers, the increasing demand for big data analytics will drive Hadoop spending in the next few years.
The freedom of an open source platform
Many businesses have already achieved success by utilizing Hadoop technology. Network World reported that Neustar, a publicly traded analytics company worth approximately $830 million, has begun building its big data solutions on Hadoop platforms.…
The retail value of big data
For years, retailers have leveraged big data tools to improve their marketing efforts and profile customer needs. Sears Holdings has been one of the most fervent believers in the retail value of data analytics in recent years. Using resources built upon Hadoop architectures, the company has significantly enhanced its data analytics program. According to InformationWeek, since adopting Apache Hadoop technology in 2010, Sears has significantly decreased the turnaround time on its marketing analysis initiatives, while expenditures have fallen to a third of the cost of comparable big data platforms.
The retail giant recently announced it has been able to improve upon those figures, reducing the amount of time needed to return meaningful insight from data analytics processes by as much as 70 percent. Sears CTO Phil Shelley presented a webinar highlighting the various benefits of the company's Hadoop programs. In addition to its cost-effectiveness, the Hadoop platform provides users with an open source foundation upon which developers can scale their projects as high or low as they please.…
Where is Hadoop headed?
The early years of Hadoop development focused on creating a platform that could handle analytics at a large scale, but the coming years will more likely focus on refining the environment to work at faster speeds, according to panelists at the recent Structure: Data 2013 conference. Reporting from several panels at the event, GigaOM noted that running queries on large data sets is now a manageable process supported by a wealth of applications. Developers are now setting their sights on introducing more interactivity to the Hadoop ecosystem.
Hadoop needs to move in the direction of offering fast, predictive capabilities so that it can fulfill the same role as a feature like Google's "I'm Feeling Lucky" search function, Omer Trajman of analysis applications company WibiData suggested. Users should be able to plug in queries and get smart, dynamic responses. For this to happen, companies will need "Hadoop high throughput, low latency," analytics executive Muddu Sudhakar said.…
Predicting brain damage with Hadoop big data
For several years now, Hadoop big data tools have provided companies with an open platform to pursue many ambitions. Although big data rose to prominence because of its ability to enhance marketing campaigns, organizations from numerous sectors have been finding new and exciting applications for the technology. Some of the most impressive developments in the data analytics field have come from the healthcare industry. Physicians have begun to leverage big data tools to diagnose patients as well as screen others for chronic illnesses. Recent developments in the field have gone one step further.
Brain injuries by the numbers
Traumatic brain injuries (TBI) are a matter of serious concern in the United States. According to data gathered by the Centers for Disease Control and Prevention, 1.7 million cases of TBI are recorded each year. In nearly half of those cases, the patient is reported to be a child under the age of 14.…
How Facebook uses Hadoop and Hive
Social media giant Facebook is one of Hadoop and big data's biggest champions, and it claims to operate the largest single Hadoop Distributed Filesystem (HDFS) cluster anywhere, with more than 100 petabytes of disk space in a single system as of July 2012. The site stores more than 250 billion photos, with 350 million new ones uploaded every day, Jay Parikh, the company's vice president of infrastructure, told InformationWeek in a recent interview. He explained that the social network must use a number of tools – among them Hadoop, Hive and HBase – to manage its user information and effectively run its business.
According to Parikh, Hadoop is used in every Facebook product and in a variety of ways. User actions such as a "like" or a status update are stored in a highly distributed, customized MySQL database, but applications such as Facebook Messaging run on top of HBase, Hadoop's NoSQL database framework.…
Week in Review: Sandboxes, HDP 2.0 Alpha 2, Hive Performance and Summits
It’s almost time for that final drive home of the week, and what a week it has been with a few new releases, a summit, and a little bit of technical fun. Here’s what happened:
New Sandbox Release. Yes, your favorite Hadoop VM image just got even better. Cheryle took us through the new features, which included Ambari integration, and Russell followed up with a quick tour of Ambari. There’s still plenty of time to download Sandbox for a weekend of data crunching fun.
HDP 2.0 Alpha 2 was released. This preview release demonstrates some of the performance improvements in store for the final HDP 2.0 release via YARN, enhancements to Hive per the Stinger Initiative, and Apache Tez. Just before the release, we posted some early test results which showed a 45X (yes, that’s forty-five) performance improvement for Hive interactive queries. But that’s just the beginning as we push to 100X, and Microsoft also talked about their contributions to the Stinger Initiative with the same aim in mind.
If you’ve downloaded Sandbox and are looking for some inspiration for a little fun, then Russell also posted a two-part series on extracting, loading, querying and analyzing your own Twitter archive with Hive. Part 1 is here, and Part 2 is here.
And finally, there was just the small matter of the Hadoop Summit in Amsterdam. We had a great time and hope you did too. Thank you for attending, contributing to the conversation and supporting Hadoop. If you’re now really excited to learn Hadoop, we posted about available training we have in Europe and Palo Alto.
And that was the week that was. Has your Sandbox downloaded yet?