I am trying to remove all special characters from some text. Here is my regex:

    pattern = re.compile(r'[\W_]+', re.UNICODE)
    word = str(pattern.sub(' ', word))

Super simple, but unfortunately it causes problems with apostrophes (single quotes). For example, if I have the word "doesn't", this code returns "doesn".

Edit: Here is what I'm doing:

    doesn't mean it -technically works?

should be:

    doesn't mean it technically works
Like this?
    >>> pattern = re.compile(r"[^\w']")

If the underscore should also be filtered:

    >>> re.compile(r"[^\w']|_").sub(" ", "Isn't _technically_, does it mean no? I am ...")
    "Isn't  technically   does it mean no  I am    "
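For what it's worth, the two patterns above can be wrapped in a small helper. This is only a sketch: the name `strip_special`, the `drop_underscores` flag, and the final whitespace-collapsing step are my own assumptions and not part of the original question or answer.

```python
import re

# Keep word characters and apostrophes; everything else becomes a space.
KEEP_APOSTROPHES = re.compile(r"[^\w']")
# Same as above, but underscores (which \w matches) are replaced too.
KEEP_APOSTROPHES_NO_UNDERSCORE = re.compile(r"[^\w']|_")


def strip_special(text, drop_underscores=True):
    """Hypothetical helper: replace special characters with spaces.

    Collapsing the leftover runs of whitespace is an extra step added
    here for readability; the patterns themselves do not do it.
    """
    pattern = KEEP_APOSTROPHES_NO_UNDERSCORE if drop_underscores else KEEP_APOSTROPHES
    replaced = pattern.sub(" ", text)
    return " ".join(replaced.split())


print(strip_special("doesn't mean it -technically works?"))
# doesn't mean it technically works
print(strip_special("snake_case isn't split", drop_underscores=False))
# snake_case isn't split
```

The apostrophe survives because it is listed inside the negated character class, so it is never matched for replacement. In Python 3, `str` patterns are Unicode-aware by default, so the `re.UNICODE` flag is not needed here.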