mapreduce - Piping data into jobs in Hadoop MR/Pig
I have 3 different types of jobs running on data in HDFS. Currently, these 3 jobs run separately. Now we want to run the 3 jobs together, piping the output of one job into the next job without writing the intermediate data to HDFS, to improve the architecture and overall performance.

Any suggestions are welcome for this scenario.

PS: Oozie is not fitting the workflow. The Cascading framework was ruled out because of scalability issues. Thanks.
Hadoop inherently writes to storage (e.g. HDFS) after each M/R step. If you want the intermediate data kept in memory, you may need Spark.
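As a conceptual sketch only (the job bodies, data, and threshold below are hypothetical stand-ins, not the asker's actual jobs): in plain Hadoop each job would materialize its output to HDFS, while a Spark-style pipeline composes the stages so records stream from one stage to the next in memory. Plain Python generators show the same composition idea on a small scale:

```python
def job1(records):
    # Stage 1: parse raw "key,value" lines into (key, int) pairs.
    for line in records:
        key, _, value = line.partition(",")
        yield key, int(value)

def job2(pairs):
    # Stage 2: filter out records below a (hypothetical) threshold.
    for key, value in pairs:
        if value >= 10:
            yield key, value

def job3(pairs):
    # Stage 3: aggregate values per key, a reduce-like step.
    totals = {}
    for key, value in pairs:
        totals[key] = totals.get(key, 0) + value
    return totals

raw = ["a,5", "b,12", "a,20", "b,3", "c,15"]

# No intermediate dataset is written between job1 and job2: records
# stream through the generators, analogous to in-memory pipelining.
result = job3(job2(job1(raw)))
print(result)  # {'b': 12, 'a': 20, 'c': 15}
```

Spark's lazy RDD/DataFrame transformations apply this principle at cluster scale: only the final action forces materialization, and intermediate stages stay in memory unless you explicitly persist them.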