string - Unexpected output in for loop - Python


I have a list:

t=[['universitario de deportes'],['lancaster'],['universitario de'],['juan aurich'],['muni'],['juan']] 

I want to reorder the list according to Jaccard distance. If I reorder t, the expected output should be:

[['universitario de deportes'],['universitario de'],['lancaster'],['juan aurich'],['juan'],['muni']] 

The Jaccard distance code works fine, but the rest of the code doesn't give the expected output. The code is below:

def jack(a,b):
    x=a.split()
    y=b.split()
    k=float(len(set(x)&set(y)))/float(len((set(x) | set(y))))
    return k

t=[['universitario de deportes'],['lancaster'],['universitario de'],['juan aurich'],['muni'],['juan']]

import copy as cp

b=cp.deepcopy(t)

c=[]

while (len(b)>0):
    c.append(b[0][0])
    d=b[0][0]
    del b[0]
    for m in range (0 , len(b)+1):
        if m > len(b):
            break
            if jack(d,b[m][0])>0.3:
                c.append(b[m][0])
                del b[m]

Unfortunately, the output is not what I expected; it is the same list:

print c
['universitario de deportes', 'lancaster', 'universitario de', 'juan aurich', 'muni', 'juan']

Edit:

I tried to correct the code. It still didn't work, but I got a little closer to the expected output:

t=[['universitario de deportes'],['lancaster'],['universitario de'],['juan aurich'],['muni'],['juan']]

import copy as cp

b=cp.deepcopy(t)

c=[]

while (len(b)>0):
    c.append(b[0][0])
    d=b[0][0]
    del b[0]
    for m in range(0,len(b)-1):
        if jack(d,b[m][0])>0.3:
            c.append(b[m][0])
            del b[m]

The "close" output is:

['universitario de deportes', 'universitario de', 'lancaster', 'juan aurich', 'muni', 'juan'] 

Second edit:

Finally, I came up with a solution that is quite fast computationally. I'll use this code to order 60 thousand names. The code is below:

t=['universitario de deportes','lancaster','lancaste','juan aurich','lancaster','juan','universitario','juan franco']

import copy as cp

b=cp.deepcopy(t)

c=[]

while (len(b)>0):
    c.append(b[0])
    e=b[0]
    del b[0]
    for val in b:
        if jack(e,val)>0.3:
            c.append(val)
            b.remove(val)

print c
['universitario de deportes', 'universitario', 'lancaster', 'lancaster', 'lancaste', 'juan aurich', 'juan', 'juan franco']
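One caveat with this last version, offered as a side note rather than a fix: b.remove(val) runs while the for loop is still iterating over b. In Python, removing an item from a list during iteration makes the loop skip the element that slides into the removed slot, so on a long list of names some matches could be missed. A minimal sketch of the effect follows; the helper names are hypothetical and not part of the code above:

```python
def remove_while_iterating(values, unwanted):
    """Remove matches from the same list that is being iterated (skips items)."""
    remaining = list(values)
    for value in remaining:
        if value == unwanted:
            # After this removal the next element shifts left and is skipped.
            remaining.remove(value)
    return remaining


def filter_into_new_list(values, unwanted):
    """Build a separate list instead, so nothing is skipped."""
    return [value for value in values if value != unwanted]


sample = ['x', 'x', 'y']
print(remove_while_iterating(sample, 'x'))  # one 'x' survives
print(filter_into_new_list(sample, 'x'))    # no 'x' survives
```

Iterating over a copy (for val in b[:]) or collecting the leftovers into a new list are two common ways around this.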

Firstly, I'm not sure why you've got everything in single-item lists; I suggest flattening it out first:

t = [l[0] for l in t]

This gets rid of the [0] indices everywhere, and means you only need shallow copies (as strings are immutable).
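As a small sketch of that point (the variable names nested, flat and working are mine, chosen for illustration): once the list holds plain strings, list(flat) or flat[:] gives an independent list, and copy.deepcopy is no longer needed because the strings themselves cannot be modified.

```python
nested = [['universitario de deportes'], ['lancaster'], ['universitario de']]

# Flatten: take the single string out of each inner list.
flat = [inner[0] for inner in nested]

# A shallow copy is a new list object holding the same (immutable) strings.
working = list(flat)

# Mutating the copy leaves the flattened original untouched.
del working[0]

print(flat)
print(working)
```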

Secondly, the last three lines of your code never run:

if m > len(b):
    break # nothing after this will happen
    if jack(d,b[m][0])>0.3:
        c.append(b[m][0])
        del b[m]
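To make that concrete, here is a stripped-down sketch of the same loop shape. The function names are hypothetical, and the body just records which items it visits instead of calling jack. Because the inner statement sits at the same indentation as break, it belongs to the if block and can never be reached, which is why the original loop copies the list unchanged.

```python
def visits_with_misplaced_body(items):
    """Same layout as the question: the body is indented under the break."""
    visited = []
    for m in range(0, len(items) + 1):
        if m > len(items):
            break
            visited.append(items[m])  # unreachable: it follows break in the same block
    return visited


def visits_with_dedented_body(items):
    """Body moved out of the if block, with the range kept inside the list."""
    visited = []
    for m in range(len(items)):
        visited.append(items[m])
    return visited


print(visits_with_misplaced_body(['a', 'b', 'c']))
print(visits_with_dedented_body(['a', 'b', 'c']))
```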

I think what you want is:

out = [] # sorted list
for index, val1 in enumerate(t): # work through each item in the original list
    if val1 not in out: # if we haven't already put this item in the new list
        out.append(val1) # put this item in the new list
    for val2 in t[index+1:]: # search the rest of the list
        if val2 not in out: # if we haven't already put this item in the new list
            if jack(val1, val2) > 0.3: # and the new item is close to the current item
                out.append(val2) # add the new item too

This gives me:

out == ['universitario de deportes', 'universitario de',
        'lancaster', 'juan aurich', 'juan', 'muni']

I would also recommend using better variable names than a, b, c, etc.
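As an illustration of that last point, here is a sketch of the same reordering with descriptive names. The names jaccard_similarity and group_similar, the threshold parameter and the index-based placed set are my own assumptions, not part of the original code. Tracking indices rather than values is a deliberate change so that duplicate names are kept. Note also that jack really computes the Jaccard similarity (1.0 for identical word sets), not the distance.

```python
def jaccard_similarity(first, second):
    """Share of distinct words common to both strings (0.0 to 1.0)."""
    first_words = set(first.split())
    second_words = set(second.split())
    all_words = first_words | second_words
    if not all_words:  # both strings empty: avoid dividing by zero
        return 0.0
    return len(first_words & second_words) / float(len(all_words))


def group_similar(names, threshold=0.3):
    """Reorder names so each one is followed by the later names similar to it."""
    ordered = []
    placed = set()  # indices of names already moved into `ordered`
    for index, name in enumerate(names):
        if index in placed:
            continue
        ordered.append(name)
        placed.add(index)
        # Pull forward every later, not yet placed name that is close enough.
        for later_index in range(index + 1, len(names)):
            if later_index in placed:
                continue
            if jaccard_similarity(name, names[later_index]) > threshold:
                ordered.append(names[later_index])
                placed.add(later_index)
    return ordered


teams = ['universitario de deportes', 'lancaster', 'universitario de',
         'juan aurich', 'muni', 'juan']
print(group_similar(teams))
```

This compares every pair of names in the worst case, so for 60 thousand names it may be slow; that trade-off is the same as in the loops above.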

