Matching email IDs to people's names


I have a database (say 5000 records) full of people's names (first and last name). I also have a huge set of email IDs (say around 30000). I have to match these email IDs to people's names wherever possible and discard the other IDs. What I am doing is, I have made patterns like:

1. firstname.lastname@something.com
2. lastname.firstname@something.com
3. firstname_lastname@something.com
4. lastname_firstname@something.com
etc.

I am trying to use fuzzy search on both first and last names following the above patterns. But people tend to use a lot of patterns in their email IDs, and some of them tend to give more than one result for some people. Is there a better way to increase the probability of matching emails correctly? I have been searching a lot and didn't find any solid ideas.
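For reference, the exact-pattern part of this approach could look roughly like the sketch below. It assumes names are held as (first, last) tuples and only covers the four patterns listed above. The names `candidate_local_parts` and `match_emails` are made up for illustration. Emails that fit more than one person are kept apart instead of being guessed.

```python
from collections import defaultdict

# Separators taken from the four patterns listed above (assumption: no others).
SEPARATORS = [".", "_"]

def candidate_local_parts(first, last):
    """Build the local parts (text before '@') that a name could produce."""
    first, last = first.lower(), last.lower()
    parts = set()
    for sep in SEPARATORS:
        parts.add(f"{first}{sep}{last}")
        parts.add(f"{last}{sep}{first}")
    return parts

def match_emails(people, emails):
    """Return (matched, ambiguous); emails that fit nobody are discarded."""
    # Index every candidate local part once, so each email is a dict lookup.
    index = defaultdict(set)
    for person in people:
        for local in candidate_local_parts(*person):
            index[local].add(person)

    matched, ambiguous = {}, {}
    for email in emails:
        local = email.split("@", 1)[0].lower()
        hits = index.get(local, set())
        if len(hits) == 1:
            matched[email] = next(iter(hits))
        elif len(hits) > 1:
            ambiguous[email] = sorted(hits)
    return matched, ambiguous
```

Building the index first means the 30000 emails are each checked with one lookup rather than being compared against all 5000 names.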

To make it a bit smarter, you could assume that any non-alphanumeric character is a name separator and use a regular expression, e.g.

^jan[^a-z0-9]smith@.*$

But that doesn't solve the multiple matches. I think it's inevitable that you'll get false positives if the email format is not constrained. Given the size of the database, I think you're stuck doing some of it by hand :(
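A minimal Python sketch of that idea, assuming names are stored as (first, last) tuples; `name_regex` and `classify` are hypothetical names. It accepts both name orders with any single non-alphanumeric separator, and puts every email that fits more than one person into a separate pile for checking by hand.

```python
import re

def name_regex(first, last):
    """Regex for first<sep>last@... or last<sep>first@..., where <sep> is one
    non-alphanumeric character. Expects the email to be lowercased first."""
    f, l = re.escape(first.lower()), re.escape(last.lower())
    return re.compile(rf"^(?:{f}[^a-z0-9]{l}|{l}[^a-z0-9]{f})@.*$")

def classify(people, emails):
    """Split emails into unique matches and ones needing manual review."""
    patterns = [(person, name_regex(*person)) for person in people]
    unique, review = {}, {}
    for email in emails:
        lowered = email.lower()
        hits = [person for person, rx in patterns if rx.match(lowered)]
        if len(hits) == 1:
            unique[email] = hits[0]
        elif hits:
            review[email] = hits  # more than one candidate: check by hand
    return unique, review
```

This tries every pattern against every email, which is a brute-force pass over 5000 x 30000 pairs. It is meant to show the matching rule rather than to be fast.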

