Data sets in Hive that are appropriate for keyword search

This topic contains 0 replies, has 1 voice, and was last updated by Wenjian XU 1 year, 8 months ago.

  • Creator
  • #44567

    Wenjian XU


    I want to do keyword search over Hive, and I would like to ask what kinds of data sets you have used in Hive.

    Actually, I want to find some use cases for keyword search. For example, there are three tables A, B and C distributed over HDFS. When the user issues a keyword query Q, the system should return results that contain Q by joining Tables A, B, and C.
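
To make the scenario concrete, here is a minimal sketch of such a query. It uses Python's built-in sqlite3 module as a local stand-in for Hive, purely so the example runs anywhere. The column names (title, body, tag), the join keys (a_id, b_id), and the sample rows are all hypothetical. I assume a similar JOIN ... ON with a LIKE filter could be written in HiveQL, but I have not run this against Hive.

```python
import sqlite3


def keyword_search(conn, keyword):
    """Join A, B and C, keeping joined rows where any text column contains the keyword."""
    pattern = f"%{keyword}%"
    sql = """
        SELECT A.id, A.title, B.body, C.tag
        FROM A
        JOIN B ON B.a_id = A.id
        JOIN C ON C.b_id = B.id
        WHERE A.title LIKE ? OR B.body LIKE ? OR C.tag LIKE ?
        ORDER BY A.id, B.id, C.id
    """
    return conn.execute(sql, (pattern, pattern, pattern)).fetchall()


# Hypothetical schema and rows; in-memory SQLite stands in for tables stored on HDFS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, body TEXT);
    CREATE TABLE C (id INTEGER PRIMARY KEY, b_id INTEGER, tag TEXT);
    INSERT INTO A VALUES (1, 'hadoop intro'), (2, 'cooking');
    INSERT INTO B VALUES (10, 1, 'hive basics'), (20, 2, 'pasta recipe');
    INSERT INTO C VALUES (100, 10, 'sql'), (200, 20, 'hive of bees');
""")

# Query Q = "hive": each returned row is one joined (A, B, C) combination containing Q.
results = keyword_search(conn, "hive")
for row in results:
    print(row)
```

In this toy data the keyword appears in a different table for each result, which is the kind of case I am interested in: the match only becomes a complete answer after the three tables are joined.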

    So, my question is: apart from TPC-H, are there any data sets (containing three or more tables) that are appropriate for the keyword search scenario?

    Thanks a lot!!!
