dictionary - Python script to use coordinates from one file and add values from matching coordinates in another file
I have an original set of genomic coordinates (chrom, start, end) in a tab-delimited BED file. I also have additional tab-delimited BED files that contain some of the original genomic coordinates plus a numerical value associated with each of these coordinates. These coordinates can show up multiple times in a BED file, with a different numerical value each time. I need a final BED file that contains each of the original genomic coordinates along with the sum of the values found for that specific coordinate. Examples of the files I'm working with are below.
original file:
    chr1 2100 2300
    chr2 3300 3600
    chr1 2560 2800
other bed file:
    chr1 2100 2300 6
    chr2 3300 3600 56
    chr1 2100 2300 10
needed output file:
    chr1 2100 2300 16
    chr2 3300 3600 56
    chr1 2560 2800 0
I need to write a Python script to do this, but I'm not sure what the best way is.
    def fetch_data(filename1, filename2):
        # Map each original coordinate ("chrom start end") to a running sum.
        data = {}
        with open(filename1) as f:
            for line in f:
                if not line.strip():
                    continue  # skip blank lines
                data[' '.join(line.split())] = 0
        with open(filename2) as f:
            for line in f:
                if not line.strip():
                    continue
                arr = line.split()
                key = ' '.join(arr[:-1])
                # Ignore coordinates that are not in the original file.
                if key in data:
                    data[key] += int(arr[3])
        return data

    with open('output.txt', 'w') as f:
        # On Python 3.7+ a dict keeps insertion order, so the output follows the original file.
        for key, value in fetch_data('original.txt', 'data.txt').items():
            f.write('{0}\t{1}\n'.format('\t'.join(key.split()), value))
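If you have several value files, a variant along these lines might be easier to extend. This is only a minimal sketch under a few assumptions: Python 3.7+ (so a plain dict keeps the order of the original file), integer values in the fourth column as in your example, and input that really is tab-delimited. The function names `sum_bed_values` and `write_bed` are made up for illustration; only the standard `csv` module is used.

```python
import csv


def sum_bed_values(original_path, value_paths):
    """Return {(chrom, start, end): summed value} in the order of the original file."""
    totals = {}
    with open(original_path, newline='') as handle:
        for row in csv.reader(handle, delimiter='\t'):
            if len(row) >= 3:  # also skips blank lines
                totals[tuple(row[:3])] = 0
    for path in value_paths:
        with open(path, newline='') as handle:
            for row in csv.reader(handle, delimiter='\t'):
                if len(row) < 4:
                    continue
                key = tuple(row[:3])
                # Coordinates missing from the original file are ignored.
                if key in totals:
                    totals[key] += int(row[3])  # assumption: integer values
    return totals


def write_bed(totals, output_path):
    """Write one tab-delimited line per original coordinate."""
    with open(output_path, 'w', newline='') as handle:
        writer = csv.writer(handle, delimiter='\t', lineterminator='\n')
        for (chrom, start, end), total in totals.items():
            writer.writerow([chrom, start, end, total])
```

You would call it as `write_bed(sum_bed_values('original.bed', ['data1.bed', 'data2.bed']), 'output.bed')`, where the file names are placeholders. Keying on a tuple instead of a joined string avoids any ambiguity about which separator was used in the input.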