mapreduce - Piping data into jobs in Hadoop MR/Pig


I have 3 different types of jobs running on data in HDFS. Currently, these 3 jobs run separately. Now, we want to run the 3 jobs as a pipeline, feeding the output of one job into the next without writing intermediate data to HDFS, to improve the architecture and overall performance.

Any suggestions on this scenario are welcome.

PS: Oozie does not fit our workflow. The Cascading framework has been ruled out because of scalability issues. Thanks.

Hadoop inherently writes to storage (e.g. HDFS) after each M/R step. If you want to keep intermediate data in memory, you may need Spark.
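To illustrate the idea, here is a minimal Python sketch (the job names and stages are hypothetical, not from the question) of composing three map/reduce-style stages as a lazy in-memory pipeline, the way Spark chains transformations, instead of materializing each job's output to storage between steps:

```python
# Hypothetical stand-ins for the three jobs: each consumes an iterable
# of records and yields transformed records without writing to disk.
def job1_tokenize(lines):
    # Map stage: split lines into lowercase words.
    for line in lines:
        for word in line.split():
            yield word.lower()

def job2_filter(words):
    # Filter stage: keep only words longer than 3 characters.
    for word in words:
        if len(word) > 3:
            yield word

def job3_count(words):
    # Reduce stage: count occurrences of each word.
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts

lines = ["Hadoop writes to HDFS", "Spark keeps data in memory"]

# The three stages are chained lazily via generators; no intermediate
# collection is materialized until job3 consumes the stream.
result = job3_count(job2_filter(job1_tokenize(lines)))
print(result)
```

In plain Hadoop MR, each of these stages would be a separate job with an HDFS write/read between them; Spark's RDD/Dataset transformations compose in essentially this fashion and only touch storage at the ends of the pipeline.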

