hadoop - Pig HBaseStorage - How to Generate Dynamic Column Names and a Dynamic Number of Column Qualifiers from a DataBag?


Relation a has a 1:m relationship with b.

a = load ... as (
    a_id:chararray
    ,...
);
b = load ... as (
    a_id:chararray
    ,b_id:chararray
    ,...
);
joined = join a by a_id, b by a_id;
grouped = group joined by a::a_id;

This creates a DataBag with the following schema:

{group: chararray, joined: {(a::a_id, ..., b::a_id, b::b_id, ...)}}

For example:

(1, {(1, ..., 1, 1, ...)})
(2, {(2, ..., 2, 2, ...), (2, ..., 2, 3, ...), (2, ..., 2, 4, ...)})
(3, {(3, ..., 3, 5, ...)})

For these three rows, the corresponding HBase results should look like this:

rowkey = 1, a:a_id=1, ..., b:b1|a_id=1, b:b1|b_id=1
rowkey = 2, a:a_id=2, ..., b:b2|a_id=2, b:b2|b_id=2, ..., b:b3|a_id=2, b:b3|b_id=3, ..., b:b4|a_id=2, b:b4|b_id=4, ...
rowkey = 3, a:a_id=3, ..., b:b5|a_id=3, b:b5|b_id=5

How can I import the DataBag into HBase using the above logic?

To do this, I need to generate dynamic column qualifier names, and the number of qualifiers depends on the number of subtuples in the DataBag.
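To make the desired mapping concrete, here is a minimal Python sketch of the flattening step: it turns the bag of b-side tuples for one row key into a single map whose keys are the dynamic qualifiers (b1|a_id, b1|b_id, and so on). The function name bag_to_columns, its parameters, and the sample data are hypothetical and only mirror the example above; the elided "..." fields are left out.

```python
# Hypothetical helper (not part of Pig or HBase): flattens the bag of joined
# tuples for one row key into a map of dynamic column qualifiers.
def bag_to_columns(bag, b_field_names, b_id_field="b_id", sep="|"):
    """bag: iterable of tuples holding the b-side fields, in the order
    given by b_field_names. Returns {"b<b_id>|<field>": value, ...}."""
    columns = {}
    id_pos = b_field_names.index(b_id_field)
    for tup in bag:
        # One qualifier prefix per subtuple, e.g. "b3" for b_id == "3".
        prefix = "b%s" % tup[id_pos]
        for name, value in zip(b_field_names, tup):
            columns[prefix + sep + name] = value
    return columns


# Sample data mirroring the three grouped rows in the question
# (assumption: only a_id and b_id are shown).
rows = {
    "1": [("1", "1")],
    "2": [("2", "2"), ("2", "3"), ("2", "4")],
    "3": [("3", "5")],
}

for rowkey in sorted(rows):
    print(rowkey, bag_to_columns(rows[rowkey], ["a_id", "b_id"]))
```

One possible way to wire this into Pig, stated as an assumption rather than a tested recipe: register a function like this as a Jython UDF that returns a map, then store with a column spec of the form 'a:a_id b:*'. As far as I know, HBaseStorage writes each key of a map-typed field as a separate qualifier in that column family, which would give a variable number of columns per row.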

