Regex: replace words with their equivalents from a list without losing formatting
I have an input folder, an output folder, and a folder with a list of equivalents.
Where can I start researching this? If a word in a document inside the input folder appears in the list of equivalents, I want to replace it with its equivalent and produce a txt output file. The documents use UTF-8.
If I have this list of equivalents:
bovine = cattle
cancrine = crab
canine = dog
cervine = deer
corvine = crow
equine = horse
elapine = snake
and an input document like this:
bovine cancrine canine cervine equine text1 text2 elapine.
I want this in the output file:
cattle crab dog deer horse [text1] [text2] snake.
text1 and text2 are in square brackets because they are not in the list of equivalents.
It should also be able to change a word when it is followed by a comma or other punctuation marks. For example, this input:
bovine! cancrine, ,canine# cervine% $equine text1, text2,,, elapine......
should return:
cattle! crab, ,dog# deer% $horse [text1], [text2],,, snake......
I'd like a Perl script, please. I am not a programmer. A friend of mine wrote a program for me years ago; it was a couple of lines and I understood it. I remember it used this regex to read the equivalents: ^[^=]+=[.*]+$
I am using ActivePerl, latest version, and I want to include in the equivalents characters such as °ŸÖ†ª or maybe other special characters. I can't contact my friend anymore. I am asking because I am doing this to translate English words into phonetics. Thanks.
The script is supposed to sit in the same folder that contains the 3 folders: input, output and list. If I double-click the script, the input text should be converted and the resulting file placed in the output folder.
Thanks for the help.
One way using Perl:
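#!/usr/bin/perl
use strict;
use warnings;
use autodie;

# file.txt holds the equivalents, list.txt is the text to convert.
open my $lookup, "<", "file.txt";
open my $list,   "<", "list.txt";
open my $output, ">", "output.txt";

# Build the lookup table from lines such as "bovine = cattle".
my %h;
while (<$lookup>) {
    chomp;
    my ($k, $v) = split /\s*=\s*/;
    $h{$k} = $v;
}

# Replace each word with its equivalent, or wrap it in brackets if it has none.
while (<$list>) {
    s/([a-zA-Z0-9]+)/$h{$1} || "[$1]"/eg;
    print $output $_;
}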
For fun, a one-liner (it reads the equivalents from the first file, then converts the second):
perl -lpe ' BEGIN{$x=pop;%h=map{$_->[0]=>$_->[2]}map[split],<>;@ARGV=$x} s/([a-zA-Z0-9]+)/$h{$1}||"[$1]"/eg' file.txt list.txt > output.txt
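The full script above uses fixed file names and does not set an encoding, while the question asks for input, output and list folders next to the script and for UTF-8 documents. Below is a rough sketch of how that could look, not a tested drop-in solution. The sub names (translate, load_equivalents, files_in, main) are made up for illustration, and it assumes that every file in the list folder has one "word = equivalent" pair per line, that words to replace consist only of ASCII letters and digits, and that each output file keeps the name of its input file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
use FindBin qw($Bin);                 # folder that contains this script
use File::Basename qw(basename);

# Replace each word with its equivalent, or bracket it if it has none.
# Punctuation around the word is left untouched.
sub translate {
    my ($line, $equiv) = @_;
    $line =~ s/([A-Za-z0-9]+)/exists $equiv->{$1} ? $equiv->{$1} : "[$1]"/eg;
    return $line;
}

# Return the plain files inside a folder, sorted by name.
sub files_in {
    my ($dir) = @_;
    opendir my $dh, $dir;
    my @names = readdir $dh;
    closedir $dh;
    my @files;
    for my $name (sort @names) {
        my $path = "$dir/$name";
        push @files, $path if -f $path;
    }
    return @files;
}

# Read "word = equivalent" lines (UTF-8) from every file in a folder.
sub load_equivalents {
    my ($dir) = @_;
    my %equiv;
    for my $file (files_in($dir)) {
        open my $in, '<:encoding(UTF-8)', $file;
        while (my $line = <$in>) {
            $line =~ s/^\x{FEFF}//;   # drop a byte order mark, if any
            next unless $line =~ /^\s*([^=]+?)\s*=\s*(.*?)\s*$/;
            $equiv{$1} = $2;
        }
        close $in;
    }
    return \%equiv;
}

# Convert every file in input/ and write the result to output/.
sub main {
    for my $dir (qw(list input output)) {
        unless (-d "$Bin/$dir") {
            warn "Missing folder: $Bin/$dir\n";
            return;
        }
    }
    my $equiv = load_equivalents("$Bin/list");
    for my $file (files_in("$Bin/input")) {
        open my $in,  '<:encoding(UTF-8)', $file;
        open my $out, '>:encoding(UTF-8)', "$Bin/output/" . basename($file);
        while (my $line = <$in>) {
            print {$out} translate($line, $equiv);
        }
        close $in;
        close $out;
    }
    return;
}

main();
```

Because the files are opened with an explicit UTF-8 layer, equivalents containing characters such as °ŸÖ†ª should be written out unchanged. Using exists instead of || also keeps an equivalent such as "0" from being treated as missing.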