Python Regex - Remove special characters but preserve apostrophes


I am trying to remove all special characters from some text, here's my regex:

  import re
  pattern = re.compile(r'[\W_]+', re.UNICODE)
  word = str(pattern.sub('', word))

Super simple, but unfortunately it causes problems with apostrophes (single quotes). For example, if I have the word "doesn't", this code returns "doesn".

Edit: here is what I'm after:

  doesn't this mean it -technically- works?

should be:

  doesn't this mean it technically works

Like this?

  >>> pattern = re.compile(r"[^\w']")
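A quick check of that character class against the sentence from the question. This is only a sketch: the single-space replacement string and the variable names `text` and `cleaned` are assumptions, not something the question specifies.

```python
import re

# Match anything that is neither a word character nor an apostrophe.
pattern = re.compile(r"[^\w']")

text = "doesn't this mean it -technically- works?"

# Each matched character (spaces, hyphens, the question mark) becomes one space,
# so the apostrophe in "doesn't" is left alone.
cleaned = pattern.sub(" ", text)
print(repr(cleaned))
```

Replacing with a space rather than an empty string matters here, because the class also matches the spaces between words; substituting `''` would glue the words together.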

If the underscore should also be filtered:

  

  >>> re.compile(r"[^\w']|_").sub(" ", "doesn't this _technically_ mean it works? I am ...")
  "doesn't this  technically  mean it works  I am    "
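Substituting a space for every special character leaves runs of spaces behind. If single-spaced output is wanted, one option is to collapse the whitespace afterwards. A minimal sketch, assuming that is the goal; the function name `strip_special` is hypothetical and the collapsing step is an addition, not part of the answer itself.

```python
import re

# Anything that is not a word character or an apostrophe, plus the underscore
# (which \w would otherwise keep).
_SPECIAL = re.compile(r"[^\w']|_")


def strip_special(text):
    """Replace special characters with spaces, then squeeze whitespace runs."""
    spaced = _SPECIAL.sub(" ", text)
    # str.split() with no arguments splits on any run of whitespace and
    # drops empty strings, so joining with one space normalises the gaps.
    return " ".join(spaced.split())


print(strip_special("doesn't this mean it -technically- works?"))
```

Doing the cleanup in two steps keeps the regex simple; a single pattern such as `(?:[^\w']|_)+` would also merge adjacent matches, but leading and trailing spaces would still need trimming.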
