November 06, 2014

Improve Insight into Your Enterprise Data with Red Hat JBoss Data Virtualization and HDP – Part 2

In part 1, Kenneth Peeples, JBoss technology evangelist and principal marketing manager for Data Virtualization and Fuse Service Works at Red Hat, gave us an overview of the Red Hat and Hortonworks webinar series and offered insights into JBoss Data Virtualization and HDP. He started with an overview of data virtualization with the Hortonworks Data Platform and went over the first use case, Sentiment and Sales Analysis. Today, he describes the three other use cases. If you missed any of the webinars in the series (or want to review them), you can find recordings of all sessions on the Red Hat partner page.

Use Case 2 from Webinar 2 – Data Firewall with Multiple Hadoop Instances

This use case describes using a data firewall with multiple Hadoop instances. The unified view in data virtualization uses roles to decide whether a user sees all data or only the data for a specific region, and whether sensitive columns are masked.

  • Objective: Secure data according to role, using row-level security and column masking.
  • Problem: Cannot hide other regions' data from region-specific users.
  • Solution: Leverage JBoss Data Virtualization to provide row-level security and masking of columns.

In this use case, we show some of the security capabilities within DV, such as Role-Based Access Control (RBAC), column masking and centralized management of VDB privileges. The security features in DV are summarized below:

  • Authentication: Kerberos, LDAP, WS-UsernameToken, HTTP Basic, SAML
  • Authorization: Virtual data views, Role based access control
  • Administration: Centralized management of VDB privileges
  • Audit: Centralized audit logging and dashboard
  • Protection: Row and column masking and SSL encryption (ODBC and JDBC)

Demonstration Detail: Our demonstration shows multiple test cases. We have a super user (admin user), a US user and an EU user. The admin user has the admin role, the US user has the usaccess role and the EU user has the euaccess role. We have a Virtual Database (VDB) with the masking and row-level security defined. Two HDP Sandboxes are set up with two tables in each: customer and customer address. The VDB contains two tables: customers and customer addresses. Our three test cases are:

  1. Superuser with admin role
    • All data is available for all the regions for both tables
    • No data is masked
  2. US user with usaccess role
    • Only the US region data is viewable
    • All of the birthdate information is masked
  3. EU user with euaccess role
    • Only the EU region data is viewable
    • All of the birthdate information is masked

The SQuirreL Client is used to run the different test cases to highlight the row-level security and column masking.
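
The behavior expected from the three test cases can be sketched in a few lines of Python. This is only an illustration of the row filtering and column masking semantics, not how JBoss Data Virtualization implements them: the sample rows, the `POLICIES` structure, the `unified_view` function and the `****` mask value are all assumptions made for the example. Only the role names come from the demonstration.

```python
# Hypothetical sample rows standing in for the customer tables
# held in the two HDP Sandboxes.
CUSTOMERS = [
    {"id": 1, "name": "Alice", "region": "US", "birthdate": "1980-04-12"},
    {"id": 2, "name": "Bruno", "region": "EU", "birthdate": "1975-09-30"},
    {"id": 3, "name": "Carla", "region": "US", "birthdate": "1991-01-05"},
]

# Role names follow the demonstration; the policy layout is an assumption.
# "region": None means no row filter; "mask" lists the columns to hide.
POLICIES = {
    "admin": {"region": None, "mask": ()},
    "usaccess": {"region": "US", "mask": ("birthdate",)},
    "euaccess": {"region": "EU", "mask": ("birthdate",)},
}


def unified_view(role, rows=CUSTOMERS):
    """Return the rows a role may see, with masked columns replaced."""
    policy = POLICIES[role]
    visible = []
    for row in rows:
        # Row-level security: drop rows outside the role's region.
        if policy["region"] is not None and row["region"] != policy["region"]:
            continue
        # Column masking: replace the value of every masked column.
        visible.append(
            {col: ("****" if col in policy["mask"] else val)
             for col, val in row.items()}
        )
    return visible


if __name__ == "__main__":
    for role in POLICIES:
        print(role, unified_view(role))
```

Calling `unified_view("admin")` returns every row unmasked, while `unified_view("usaccess")` and `unified_view("euaccess")` return only their own region's rows with the birthdate hidden, which mirrors the three test cases listed above.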

Demonstration References

Use Case 3 from Webinar 3 – Virtual Data Marts with multiple Virtual Databases and the HDP Sandbox

This use case describes using Virtual Data Marts with multiple Virtual Databases and the HDP Sandbox.

  • Objective: Purpose-oriented data views for functional teams over a rich variety of semi-structured and structured data.
  • Problem: Data lakes hold large volumes of consolidated clickstream, product and customer data that need to be constrained for multi-departmental use.
  • Solution: Leverage HDP to mash up clickstream analysis data with product and customer data to better understand customers’ behaviors on the website, and mash up customer and product data to improve product marketing strategy.

Demonstration Detail: Our demonstration uses one HDP Sandbox with two VDBs. The user, product and web log data are all stored in HDP. The two VDBs provide access for the Marketing and Product teams. The Marketing VDB combines the clickstream logs with customer data so that Marketing can find out who (what gender, what age) is accessing the site and when they drop off. The Product VDB combines the customer and product data so that the Product team can see who (location, age, gender) has been buying the products and make product plans that target those users. The Data Virtualization Dashboard is used to show the data for either the Marketing or the Product team.
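
The idea of two purpose-oriented views over the same underlying data can be sketched with SQL views. The example below uses Python's built-in `sqlite3` module as a stand-in: in the demonstration the data lives in HDP and the views are two separate VDBs, whereas here they are two views in one in-memory database. The table names, column names, sample rows and the `purchases` table are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical source tables and two views."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, gender TEXT,
                            age INTEGER, location TEXT);
        CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE weblogs (user_id INTEGER, url TEXT, ts TEXT);
        CREATE TABLE purchases (user_id INTEGER, product_id INTEGER);

        INSERT INTO users VALUES (1, 'F', 34, 'Boston');
        INSERT INTO users VALUES (2, 'M', 27, 'Berlin');
        INSERT INTO products VALUES (10, 'Widget');
        INSERT INTO products VALUES (11, 'Gadget');
        INSERT INTO weblogs VALUES (1, '/home', '2014-11-01T10:00');
        INSERT INTO weblogs VALUES (1, '/cart', '2014-11-01T10:05');
        INSERT INTO weblogs VALUES (2, '/home', '2014-11-01T11:00');
        INSERT INTO purchases VALUES (1, 10);
        INSERT INTO purchases VALUES (2, 11);
        INSERT INTO purchases VALUES (2, 10);

        -- Marketing: who is visiting which page, and when.
        CREATE VIEW marketing_view AS
            SELECT u.gender, u.age, w.url, w.ts
            FROM weblogs w
            JOIN users u ON u.user_id = w.user_id;

        -- Product: who has been buying which product.
        CREATE VIEW product_view AS
            SELECT p.name AS product, u.location, u.age, u.gender
            FROM purchases b
            JOIN users u ON u.user_id = b.user_id
            JOIN products p ON p.product_id = b.product_id;
    """)
    return conn


def site_visits(conn):
    """Marketing team query: visitor demographics per page hit, oldest first."""
    return conn.execute(
        "SELECT gender, age, url FROM marketing_view ORDER BY ts"
    ).fetchall()


def buyers_of(conn, product):
    """Product team query: demographics of the buyers of one product."""
    return conn.execute(
        "SELECT location, age, gender FROM product_view "
        "WHERE product = ? ORDER BY age",
        (product,),
    ).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(site_visits(db))
    print(buyers_of(db, "Widget"))
```

Each team queries only its own view and never touches the source tables directly, which is the constraint the Virtual Data Marts are meant to provide.
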
Demonstration References:

Use Case 4 from Webinar 3 – Materialized views to Improve Access to Data

This use case describes how to use materialized views to improve access to your data. The use case is still in progress, so the demonstration source and supporting files are not available yet. Keep a lookout for them; in the meantime, here is a brief description.

  • Objective: Improve access to data, especially operational data.
  • Problem: All the legacy and archived data are in the Hadoop data lake. We want to access the most recent, up-to-the-minute operational data often and quickly.
  • Solution: Use JBoss Data Virtualization to integrate up-to-the-minute data from multiple diverse data sources so that it can be queried quickly.
    • Use HDP for all data older than today.
    • Use JDV to materialize the data in HDP for faster access and to combine it with the operational VDB.
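
Since the demonstration files are not yet available, the following is only a conceptual sketch of the pattern described above, written in plain Python rather than with JDV. The `MaterializedView` class, the `load_from_hdp` loader, the `combined_view` function and the sample rows are all hypothetical. The sketch caches the slower historical query once and combines the cached rows with today's operational rows on each request.

```python
from datetime import date


class MaterializedView:
    """Caches the result of an expensive query until refresh() is called."""

    def __init__(self, loader):
        self._loader = loader  # callable that runs the slow source query
        self._rows = None      # cached result; None until first load
        self.loads = 0         # how many times the source was queried

    def refresh(self):
        """Re-run the source query and replace the cached rows."""
        self._rows = list(self._loader())
        self.loads += 1

    def rows(self):
        """Return cached rows, loading them on first access only."""
        if self._rows is None:
            self.refresh()
        return self._rows


def load_from_hdp():
    """Hypothetical stand-in for a slow query against historical data in HDP."""
    return [
        {"day": date(2014, 11, 4), "orders": 120},
        {"day": date(2014, 11, 5), "orders": 95},
    ]


def combined_view(historical, operational_rows, today):
    """Cached rows older than today plus live operational rows from today on."""
    old = [r for r in historical.rows() if r["day"] < today]
    new = [r for r in operational_rows if r["day"] >= today]
    return old + new


if __name__ == "__main__":
    history = MaterializedView(load_from_hdp)
    live = [{"day": date(2014, 11, 6), "orders": 7}]
    print(combined_view(history, live, date(2014, 11, 6)))
    print(combined_view(history, live, date(2014, 11, 6)))
    print("source queried", history.loads, "time(s)")
```

Repeated calls to `combined_view` reuse the cached historical rows, so the slow source is queried again only when `refresh()` is called explicitly, for example once a day.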

Demonstration Detail: This demonstration is currently in progress.

Demonstration References

  • Source and Supporting Files: To be posted
  • Tutorial: To be posted

Conclusion

In closing, DV and HDP complement each other to give your enterprise the tools and architecture it needs to get the most out of its data. Your legacy data stores and your big data can be combined into views that are easier to analyze with a wide range of analytic tools. This helps your enterprise interpret the ever-growing volumes of data it collects. Try our demonstrations with the collateral that has been created, and keep watching for more as the partnership between Red Hat and Hortonworks continues to grow.

Additional Resources

To learn more, listen to the replays of the webinars listed below:
