hadoop - Pig HBaseStorage - How to Generate Dynamic Column Names and a Dynamic Number of Column Qualifiers from a DataBag?
a has a 1:m relationship with b.
a = load ... as (a_id:chararray, ...);
b = load ... as (a_id:chararray, b_id:chararray, ...);
joined = join a by a_id, b by a_id;
grouped = group joined by a::a_id;
This creates a DataBag with the following schema:
{group: chararray, joined: {(a::a_id, ..., b::a_id, b::b_id, ...)}}
For example:
(1, {(1, ..., 1, 1, ...)})
(2, {(2, ..., 2, 2, ...), (2, ..., 2, 3, ...), (2, ..., 2, 4, ...)})
(3, {(3, ..., 3, 5, ...)})
For these 3 rows, the corresponding HBase results should look like:
rowkey = 1, a:a_id=1, ..., b:b1|a_id=1, b:b1|b_id=1
rowkey = 2, a:a_id=2, ..., b:b2|a_id=2, b:b2|b_id=2, ..., b:b3|a_id=2, b:b3|b_id=3, ..., b:b4|a_id=2, b:b4|b_id=4, ...
rowkey = 3, a:a_id=3, ..., b:b5|a_id=3, b:b5|b_id=5
How can I import the DataBag into HBase using the above logic?
To do this, I need to generate dynamic column qualifier names, and the number of qualifiers depends on the number of subtuples in the DataBag.
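To make the target mapping concrete, here is a minimal sketch in Python of the bag-to-qualifier logic described above. It is only an illustration under assumptions: the function name `bag_to_qualifier_map`, its parameters, and the reduced two-field tuples are hypothetical, and the elided fields ("...") are left out. The idea is that each subtuple contributes one group of qualifiers named `b<b_id>|<field>`, collected into a single map per row key. As far as I know, HBaseStorage can write a map field to a column family when the column spec is given as `b:*`, using the map keys as qualifiers, so a function like this could be the body of a UDF.

```python
def bag_to_qualifier_map(bag, field_names, id_field="b_id", prefix="b", sep="|"):
    """Flatten a bag of tuples into one {qualifier: value} dict.

    bag         -- list of tuples (how a Pig DataBag is assumed to arrive)
    field_names -- names of the tuple fields, in tuple order
    id_field    -- field whose value makes each subtuple's qualifiers unique
    """
    id_index = field_names.index(id_field)
    columns = {}
    for row in bag:
        # e.g. b_id "3" gives the qualifier prefix "b3"
        row_tag = "%s%s" % (prefix, row[id_index])
        for name, value in zip(field_names, row):
            columns[row_tag + sep + name] = value
    return columns


# The bag grouped under rowkey 2 in the example, reduced to (a_id, b_id).
bag_for_row_2 = [("2", "2"), ("2", "3"), ("2", "4")]
columns = bag_to_qualifier_map(bag_for_row_2, ["a_id", "b_id"])

for qualifier in sorted(columns):
    print("b:%s=%s" % (qualifier, columns[qualifier]))
```

To use something like this from Pig, the function would presumably need to be registered as a Jython UDF with a map output schema, and the store would need a `b:*` column spec for the map field. Both steps are assumptions here and not something the question confirms.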