Spark SQL: NULL handling in IF
I am trying to perform an IF (or Spark's coalesce) on top of a left-outer-joined output, but the NULLs do not seem to be handled as expected. Here are the base tables, a sample query, its output, and the expected output:
base tables:
t1:
a,100
b,101
c,102

t2:
101
query:
select a.x, a.x1, if(b.x1 is null, a.x1, b.x1) from t1 a left outer join t2 b on a.x1 = b.x1;
output:
a,100,null
b,101,101
c,102,null
expected:
a,100,100
b,101,101
c,102,102
I have also tried wrapping the above query and performing the IF on top of it, without success. Please suggest what I am missing.
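The coalesce variant of the same query (a sketch only, reusing the table and column names from the sample above) should fill in the non-matching rows the same way, since coalesce returns its first non-null argument:

select a.x, a.x1, coalesce(b.x1, a.x1) from t1 a left outer join t2 b on a.x1 = b.x1;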
This seems to be working for me:
file: tbl1

1	a
2	b
3	c

file: tbl2

1	c
3	d

// in pre-1.3 Spark this implicit lets an RDD of case classes call registerTempTable directly
import sqlContext.createSchemaRDD

case class c_tbl1(c1: String, c2: String)

sc.textFile("tbl1").map { row =>
  val parts = row.split("\t")
  c_tbl1(parts(0), parts(1))
}.registerTempTable("t_tbl1")

case class c_tbl2(c1: String, c2: String)

sc.textFile("tbl2").map { row =>
  val parts = row.split("\t")
  c_tbl2(parts(0), parts(1))
}.registerTempTable("t_tbl2")

sqlContext.sql("""select t.c1, t.c2, if(t2.c1 is null, 1, 2), t2.c2
                  from t_tbl1 t
                  left outer join t_tbl2 t2 on t.c1 = t2.c1""".stripMargin)
  .collect.foreach(println)

[1,a,2,c]
[2,b,1,null]
[3,c,2,d]
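Applying the same pattern back to the original tables would look roughly like this (a sketch only; it assumes t1 and t2 have already been registered as temporary tables with the column names used in the question):

sqlContext.sql("""select a.x, a.x1, if(b.x1 is null, a.x1, b.x1)
                  from t1 a
                  left outer join t2 b on a.x1 = b.x1""")
  .collect.foreach(println)

which should print [a,100,100], [b,101,101] and [c,102,102], matching the expected output.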