parsing - Create a regex that accepts a name but not the word "to"


I am working on parsing commentary from ESPNcricinfo and want to parse statements of the following form.

Example 1: Yuvraj Singh to Nasir Jamshed

Example 2: Kumar to Shoaib Malik

I wrote the same regex for both the bowler's and the batsman's name:

Regex: [a-zA-Z[-]*]*\s[a-zA-Z[-]*]*\s

Example 1 parses correctly, but I am facing a problem with example 2:

"Kumar to" is treated as the bowler's name.

I need to get rid of the word "to" in the bowler's name.

You can try the following regex:

(?<=to |^).*?(?= to|$) 

It works on a string such as "Yuvraj Singh to Nasir Jamshed to Kumar to Shoaib Malik", where each name is separated from the next by the word "to".

Ex.:

string[] names = Regex.Matches("Yuvraj Singh to Nasir Jamshed to Kumar to Shoaib Malik", "(?<=to |^).*?(?= to|$)")
                      .Cast<Match>()
                      .Select(m => m.Value)
                      .ToArray();

Another option: since you know that every part of a name starts with a capital letter, enforce that rule ("to" won't be matched by it, but trailing whitespace will):

([A-Z][\w-]*\s*)+
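
A minimal sketch of this second option in C#, assuming the commentary keeps its original capitalization (names capitalized, "to" in lowercase). The NameParser class and ExtractNames method are hypothetical names used only for illustration. Trim() is added to drop the trailing whitespace that the pattern also captures.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NameParser
{
    // Every name part must start with an uppercase letter,
    // so the lowercase word "to" can never be part of a match.
    private static readonly Regex NamePattern = new Regex(@"([A-Z][\w-]*\s*)+");

    // Hypothetical helper: returns each run of capitalized words as one name.
    public static string[] ExtractNames(string commentary)
    {
        return NamePattern.Matches(commentary)
                          .Cast<Match>()
                          .Select(m => m.Value.Trim()) // remove captured trailing whitespace
                          .ToArray();
    }

    public static void Main()
    {
        foreach (var name in ExtractNames("Kumar to Shoaib Malik"))
        {
            Console.WriteLine(name);
        }
    }
}
```

For "Kumar to Shoaib Malik" this should print "Kumar" and "Shoaib Malik" on separate lines. The match is case-sensitive by default, so do not pass RegexOptions.IgnoreCase, or "to" will be matched as well.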
