mapreduce - Piping data into jobs in Hadoop MR/Pig


I have 3 different types of jobs running on data in HDFS. In the current scenario, these 3 jobs have to be run separately. Now we want to run the 3 jobs by piping the output data of one job into the next job, without writing the data to HDFS, to improve the architecture and overall performance.

Any suggestions are welcome for this scenario.

PS: Oozie does not fit our workflow. The Cascading framework is also ruled out because of scalability issues. Thanks.

Hadoop inherently writes to storage (e.g. HDFS) after M/R steps. If you want to keep the intermediate data in memory, maybe you need something like Spark.
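To make the idea of piping one job into the next concrete, here is a minimal plain-Python sketch, not Hadoop or Spark code. The three job functions, the `key,value` record format, and the `minimum` threshold are all hypothetical stand-ins for the real jobs. Each stage is a generator that consumes the previous stage's output record by record, so nothing is written out between stages. Chained transformations in an engine like Spark follow a similar idea, though the sketch below does not use its API.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

Record = Tuple[str, int]


def job_parse(lines: Iterable[str]) -> Iterator[Record]:
    """Hypothetical job 1: parse 'key,value' lines, skipping malformed ones."""
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) == 2 and parts[1].isdigit():
            yield parts[0], int(parts[1])


def job_filter(records: Iterable[Record], minimum: int = 1) -> Iterator[Record]:
    """Hypothetical job 2: keep only records whose value is at least `minimum`."""
    for key, value in records:
        if value >= minimum:
            yield key, value


def job_aggregate(records: Iterable[Record]) -> Dict[str, int]:
    """Hypothetical job 3: sum the values per key."""
    totals: Counter = Counter()
    for key, value in records:
        totals[key] += value
    return dict(totals)


def run_pipeline(lines: Iterable[str], minimum: int = 1) -> Dict[str, int]:
    """Chain the three jobs; each record flows through without being stored."""
    return job_aggregate(job_filter(job_parse(lines), minimum))


if __name__ == "__main__":
    sample = ["a,1", "b,5", "a,3", "bad", "c,0"]
    print(run_pipeline(sample))
```

Only the final aggregate is materialized here. In a distributed setting the engine also has to handle shuffles and fault tolerance, which this sketch leaves out.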

