"First of all, we kill all lawyers." - William Shakespeare
Given the quotation above, I would like to extract "kill" and "lawyers" as the two main keywords that describe the overall meaning of the sentence. I have extracted the following noun/verb POS tags:
[["First", "NNP"], ["thing", "NN"], ["do", "VBP"], ["lets", "NNS"], ["kill", "VB"], ["lawyers", "NNS"]]
The more general problem I am trying to solve is to distill a sentence down to its "most important" words/tags in order to summarize the overall "meaning" of the sentence.
* Note the scare quotes. I acknowledge that this is a very hard problem and that there is no correct solution at this point in time. Nevertheless, I am interested in seeing attempts at solving both the specific problem (arriving at "kill" and "lawyers") and the general problem (summarizing the overall meaning of a sentence in keywords/tags).
A simple approach would be to keep stop word lists for NN, VB, etc. These would be high-frequency words that usually add little semantic content to a sentence.
The snippet below shows distinct lists for each type of word token, but you could just as well employ a single stop word list for both verbs and nouns.

    # Stop words per POS tag; these lists are only examples and would need extending.
    stop_words = dict(
        NNP=['first'],
        NN=['thing'],
        VBP=['do'],
        VB=[],
        NNS=['lets', 'things'],
    )

    def filter_stop_words(pos_list):
        # Keep only the tokens that are not in the stop list for their tag.
        return [[token, token_type]
                for token, token_type in pos_list
                if token.lower() not in stop_words[token_type]]
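For the single-list variant, a minimal sketch could look like the following. The names `STOP_WORDS` and `filter_with_single_list` are made up for illustration, the stop word entries are assumptions picked to fit this one sentence, and the tagged input is an assumed tagging of the Shakespeare line rather than the output of any particular tagger.

```python
# Hypothetical single stop word list shared by all noun and verb tags.
# The entries are illustrative assumptions, not a curated list.
STOP_WORDS = {"first", "thing", "things", "do", "lets"}


def filter_with_single_list(pos_list, stop_words=STOP_WORDS):
    """Keep [token, tag] pairs whose lowercased token is not a stop word."""
    return [
        [token, tag]
        for token, tag in pos_list
        if token.lower() not in stop_words
    ]


# Assumed POS tagging of the quotation (nouns and verbs only).
tagged = [
    ["First", "NNP"], ["thing", "NN"], ["do", "VBP"],
    ["lets", "NNS"], ["kill", "VB"], ["lawyers", "NNS"],
]

keywords = [token for token, _ in filter_with_single_list(tagged)]
print(keywords)  # ['kill', 'lawyers']
```

A single set is easier to maintain, but it cannot treat the same word differently depending on its tag, which the per-tag lists can.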