mapreduce - Piping data into jobs in Hadoop MR/Pig -


I have three different types of jobs running on data in HDFS. Currently, these three jobs run separately. We now want to run them as a pipeline, feeding the output of one job into the next without writing the intermediate data to HDFS, to improve the architecture and overall performance.

Any suggestions for this scenario are welcome.

PS: Oozie does not fit our workflow, and the Cascading framework was ruled out because of scalability issues. Thanks.

Hadoop inherently writes to storage (e.g. HDFS) after each M/R step. If you want the intermediate data kept in memory, you may need Spark, which chains transformations in memory and only materializes data when an action is invoked.
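To illustrate the difference, here is a minimal, hypothetical Python sketch (not actual Hadoop or Spark API) of chaining three processing stages in memory, the way Spark pipelines transformations, instead of materializing each stage's output to storage. The stage names and the data are invented for illustration:

```python
# Conceptual sketch only: three "jobs" chained as in-memory generator
# stages, so no intermediate result is written to disk. Stage names
# are hypothetical; in Spark the same shape would be chained
# RDD/DataFrame transformations followed by a single action.

def job1_clean(records):
    # Job 1: drop empty records and strip whitespace
    for r in records:
        if r.strip():
            yield r.strip()

def job2_transform(records):
    # Job 2: normalize to lower case
    for r in records:
        yield r.lower()

def job3_aggregate(records):
    # Job 3: count occurrences -- the only step that materializes output
    counts = {}
    for r in records:
        counts[r] = counts.get(r, 0) + 1
    return counts

raw = ["Foo", " bar ", "", "FOO", "bar"]
# Each stage streams its output directly into the next; nothing is
# persisted between stages.
result = job3_aggregate(job2_transform(job1_clean(raw)))
print(result)  # {'foo': 2, 'bar': 2}
```

In Spark proper, the analogous pipeline would be a chain of `map`/`filter`/`reduceByKey` calls on an RDD; intermediate stages are never written to HDFS unless you explicitly persist them.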

