Distributing work in Flume 1.4



This topic contains 1 reply, has 2 voices, and was last updated by Robert 1 year, 11 months ago.

  • Creator
  • #30371


    I looked at the Apache Flume 1.4 documentation, and it's not clear how to distribute the work across nodes. I have to fetch data from multiple sources, with multiple query terms for each source, and need to poll them regularly. Older versions of Flume had a flume-master and flume-node(s). The Cloudera release notes say these concepts have been replaced by agents. But how do I distribute the agent(s)?
    Should I manually create a configuration on each node and start an agent on each node, or can this be done through a centralized mechanism?

Viewing 1 replies (of 1 total)


  • Author
  • #30504


    Hi Maruti,
    The new generation of Flume does not have a centralized mechanism. The agents are independent of one another, so you have to configure and launch an agent on each node yourself.
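
Since each agent needs its own configuration file, one way to avoid writing them by hand is to generate them from a single task list. The following is only a minimal sketch of that idea, not something Flume provides: the helper names (`assign_round_robin`, `render_agent_config`), the node names, and the poller script path `/opt/poller/poll.sh` are all hypothetical. It assumes each (source, query term) pair can be polled by a command run through Flume's exec source, with a memory channel and a logger sink as placeholders.

```python
from collections import defaultdict


def assign_round_robin(tasks, nodes):
    """Spread (source, query) tasks across nodes in round-robin order."""
    plan = defaultdict(list)
    for i, task in enumerate(tasks):
        plan[nodes[i % len(nodes)]].append(task)
    return dict(plan)


def render_agent_config(agent, tasks):
    """Render Flume properties text: one exec source per task,
    all feeding a single memory channel and a logger sink."""
    names = ["src%d" % i for i in range(len(tasks))]
    lines = [
        "%s.sources = %s" % (agent, " ".join(names)),
        "%s.channels = ch1" % agent,
        "%s.sinks = sink1" % agent,
        "%s.channels.ch1.type = memory" % agent,
        "%s.sinks.sink1.type = logger" % agent,
        "%s.sinks.sink1.channel = ch1" % agent,
    ]
    for name, (source, query) in zip(names, tasks):
        prefix = "%s.sources.%s" % (agent, name)
        lines.append("%s.type = exec" % prefix)
        # Hypothetical poller script; replace with whatever fetches your data.
        lines.append("%s.command = /opt/poller/poll.sh %s %s" % (prefix, source, query))
        lines.append("%s.channels = ch1" % prefix)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example task list and node names (assumptions for illustration).
    tasks = [("twitter", "hadoop"), ("twitter", "flume"), ("rss", "hadoop")]
    plan = assign_round_robin(tasks, ["node1", "node2"])
    for node, node_tasks in sorted(plan.items()):
        print("# ---- flume-%s.conf ----" % node)
        print(render_agent_config("agent", node_tasks))
```

Each generated file would still have to be copied to its node, and the agent started there, for example with `bin/flume-ng agent --conf conf --conf-file flume-node1.conf --name agent`. Flume itself does nothing to coordinate the agents; the script only keeps the per-node configurations in one place.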

