regex - Key confusion in python -


Hey friends, I have seen some weird code. I am new to Python programming. The code is:

    import re, collections

    mylist = [
        'probes', 'gene.symbol', 'gene.title', 'go1', 'go2', 'go3',
        'adx_kd_06.ip', 'adx_kd_24.ip', 'adx_lg_06.ip', 'adx_lg_24.ip',
        'adx_lv_06.ip', 'adx_lv_24.ip', 'adx_sp_06.ip', 'adx_sp_24.ip',
        'adx_ln_06.id', 'alm_ln_06.id', 'alm_lv_06.ip', 'alm_sp_06.ip',
        'k3spg_lv_06.ip', 'k3spg_sp_06.ip', 'kkk_ln_06.id', 'kkk_lv_06.ip',
        'kkk_sp_06.ip', 'endcn_lv_06.in', 'endcn_sp_06.in', 'bcd_lv_06.ip',
        'bcd_sp_06.ip', 'adx_lv_06.id', 'adx_sp_06.id', 'alm_lv_06.id',
        'alm_sp_06.id', 'd35_ln_06.id', 'k3spg_ln_06.id', 'k3_lv_06.id',
        'k3_sp_06.id', 'bcd_ln_06.id', 'd35_lv_06.id', 'd35_sp_06.id',
        'k3spg_lv_06.id', 'k3spg_sp_06.id', 'bcd_lv_06.id', 'bcd_sp_06.id',
        'endcn_kd_06.in', 'endcn_lg_06.in',
        'probes', 'gene.symbol',
        'adx_kd_06.ip', 'adx_kd_24.ip', 'adx_lg_06.ip', 'adx_lg_24.ip',
        'adx_lv_06.ip', 'adx_lv_24.ip', 'adx_sp_06.ip', 'adx_sp_24.ip',
        'adx_ln_06.id', 'alm_ln_06.id', 'alm_lv_06.ip', 'alm_sp_06.ip',
        'k3spg_lv_06.ip', 'k3spg_sp_06.ip', 'kkk_ln_06.id', 'kkk_lv_06.ip',
        'kkk_sp_06.ip', 'endcn_lv_06.in', 'endcn_sp_06.in', 'bcd_lv_06.ip',
        'bcd_sp_06.ip', 'adx_lv_06.id', 'adx_sp_06.id', 'alm_lv_06.id',
        'alm_sp_06.id', 'd35_ln_06.id', 'k3spg_ln_06.id', 'k3_lv_06.id',
        'k3_sp_06.id', 'bcd_ln_06.id', 'd35_lv_06.id', 'd35_sp_06.id',
        'k3spg_lv_06.id', 'k3spg_sp_06.id', 'bcd_lv_06.id', 'bcd_sp_06.id',
        'endcn_kd_06.in', 'endcn_lg_06.in',
    ]

    regex = re.compile(r'([\w\d]+)_(\w\w)_(\d\d)\.(\w\w)')
    first_part_dict = collections.defaultdict(list)
    second_part_dict = collections.defaultdict(list)

    # Find the second instance of 'probes' to separate the first and second parts
    cutoff_index = mylist.index('probes', 1)

    for i, string in enumerate(mylist):
        matched = regex.match(string)
        if not matched:
            continue
        rg1, rg2, rg3, rg4 = matched.groups()
        key = rg1 + rg3
        if i < cutoff_index:
            first_part_dict[key].append(i)
        else:
            second_part_dict[key].append(i)

As we can see, the list above is separated into 2 parts, delimited by 'probes', 'gene.symbol', 'gene.title', 'go1', 'go2', 'go3' and by 'probes', 'gene.symbol'.

The regex components for the first and second parts are:

    ([\w\d]+)_(\w\w)_(\d\d)\.(\w\w)
       rg1     rg2    rg3     rg4

which should match a string such as adx_sp_06.ip or k3spg_ln_06.id.

My question: I didn't understand the use of first_part_dict[key].append(i) in the code. I know i is the given index here. I am not good at regex, and I think key is the number from the matched portion. So key acts as a number and first_part_dict is a dictionary. Is the value of the index stored in the dictionary first_part_dict?

I am confused. Please help me in understanding this. Any help is appreciated, and sorry for the long question.

The dictionary being used is a dictionary with a text/string as key and a list as value.

What first_part_dict[key].append(i) is doing is appending (or adding) the value of i to the list corresponding to the key key of the dictionary first_part_dict.

If key is adx06, the dictionary will go from {'adx06': []} to {'adx06': [1]}, should the value of i be 1.
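A minimal sketch of that step in isolation, assuming the key adx06 and the index 1 from the sentence above (demo_dict is a name made up for this illustration, standing in for first_part_dict):

```python
import collections

# demo_dict is a hypothetical stand-in for first_part_dict
demo_dict = collections.defaultdict(list)

# Looking up a missing key makes defaultdict call list() and store the
# resulting empty list under that key; copy it to see the "before" state
before = list(demo_dict['adx06'])

# append(i) with i = 1 then adds the index to that stored list
demo_dict['adx06'].append(1)

print(before)           # []
print(dict(demo_dict))  # {'adx06': [1]}
```

With a plain dict, the same lookup on a missing key would raise KeyError, which is why the code uses collections.defaultdict(list).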


I'll put a little walkthrough to illustrate:

    mylist = ['probes', 'gene.symbol', 'gene.title', 'go1', 'go2', 'go3', 'adx_kd_06.ip', 'adx_kd_24.ip', 'adx_lg_06.ip']

    for i, string in enumerate(mylist):
        matched = regex.match(string)
        if not matched:
            continue
        rg1, rg2, rg3, rg4 = matched.groups()
        key = rg1 + rg3
        if i < cutoff_index:
            first_part_dict[key].append(i)
        else:
            second_part_dict[key].append(i)

When you pass through the loop the first time, i = 0 and string = 'probes'. Since probes doesn't match the regex, the loop skips to the next item through continue.

This time, i = 1 and string = 'gene.symbol'. Once again, the string doesn't match the regex, so you skip to the next item. This goes on until the 7th item: adx_kd_06.ip. Here, you have i = 6 and string = 'adx_kd_06.ip', which matches the regex.

From that, rg1 = adx, rg2 = kd, rg3 = 06 and rg4 = ip. key becomes adx06, and first_part_dict[key].append(i) gets executed.
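To check the groups yourself, here is a small sketch that runs the same regex on that single string and on one that does not match. It only uses names already present in the code above:

```python
import re

regex = re.compile(r'([\w\d]+)_(\w\w)_(\d\d)\.(\w\w)')

# The 7th item of the list (index 6)
matched = regex.match('adx_kd_06.ip')
rg1, rg2, rg3, rg4 = matched.groups()

# The key is the first group joined with the third group
key = rg1 + rg3

print(matched.groups())       # ('adx', 'kd', '06', 'ip')
print(key)                    # adx06

# A string without the name_xx_nn.xx shape gives None, which is what
# triggers the continue in the loop
print(regex.match('probes'))  # None
```

Note that rg2 and rg4 are captured but never used for the key, so adx_kd_06.ip and adx_lg_06.ip end up under the same key adx06.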

This creates the key adx06 in the dictionary first_part_dict and appends 6 to the value list. Right now, you have the dict as {'adx06': [6]}. The loop continues on to the next item.

This time, you have i = 7 and string = 'adx_kd_24.ip'. It matches the regex and, a couple of lines later, you have first_part_dict[key].append(i) executing.

This creates the key adx24 in the dictionary first_part_dict and appends 7 to the value list. Right now, you have the dict as {'adx06': [6], 'adx24': [7]}. The loop continues on to the next item.

This time, you have i = 8 and string = 'adx_lg_06.ip'. It matches the regex and, a couple of lines later, you have first_part_dict[key].append(i) executing again.

This creates the key adx06 in the dictionary... wait! The key already exists, so instead it appends 8 to the existing value list. Right now, you have the dict as {'adx06': [6, 8], 'adx24': [7]}.

This goes on and on until all the items in the list have been treated.
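If you want to run the whole thing and see both dictionaries, here is a sketch on a shortened list. The list is an assumption made for illustration: it is the walkthrough list with a short second part added after a second 'probes', so that mylist.index('probes', 1) has something to find.

```python
import re, collections

# Shortened list, assumed for illustration only: the walkthrough items
# plus a small second part that starts at the second 'probes'
mylist = ['probes', 'gene.symbol', 'gene.title', 'go1', 'go2', 'go3',
          'adx_kd_06.ip', 'adx_kd_24.ip', 'adx_lg_06.ip',
          'probes', 'gene.symbol', 'adx_kd_06.ip', 'adx_kd_24.ip']

regex = re.compile(r'([\w\d]+)_(\w\w)_(\d\d)\.(\w\w)')
first_part_dict = collections.defaultdict(list)
second_part_dict = collections.defaultdict(list)

# Start searching at index 1 so the first 'probes' (index 0) is skipped
cutoff_index = mylist.index('probes', 1)

for i, string in enumerate(mylist):
    matched = regex.match(string)
    if not matched:
        continue  # headers such as 'probes' or 'go1' are ignored
    rg1, rg2, rg3, rg4 = matched.groups()
    key = rg1 + rg3
    if i < cutoff_index:
        first_part_dict[key].append(i)
    else:
        second_part_dict[key].append(i)

print(cutoff_index)            # 9
print(dict(first_part_dict))   # {'adx06': [6, 8], 'adx24': [7]}
print(dict(second_part_dict))  # {'adx06': [11], 'adx24': [12]}
```

So yes, the values stored in the dictionaries are the positions (indexes) in mylist of every string that shares the same key.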

