NLTK WordNet & Synset
NLTK WordNet & Synset: how do they work?
Let’s talk about WordNet and Synsets in NLTK.
WordNet is a comprehensive lexical database designed for NLP, containing approximately 155K words organized into 175K synsets. A synset is a group of synonymous words that share one meaning, and each synset includes a short definition and usage examples. In compressed form, the database is about 12 MB. It covers English and works as a combined dictionary and thesaurus built around these synonym groups.
History
WordNet was first created in 1985, in English only, in the Cognitive Science Laboratory of Princeton University under the direction of psychology professor George Armitage Miller. It was later directed by Christiane Fellbaum. The project was initially funded by the U.S. Office of Naval Research, and later also by other U.S. government agencies including DARPA, the National Science Foundation, the Disruptive Technology Office (formerly the Advanced Research and Development Activity), and REFLEX. George Miller and Christiane Fellbaum received the 2006 Antonio Zampolli Prize for their work with WordNet.
What is WordNet?
To understand WordNet and synsets, you should know what hypernyms and hyponyms are. A hypernym is the name of a broader category of things, and a hyponym is a more specific member of that category. Dog, for example, is a hypernym of dachshund, Chihuahua, and poodle. Pigeon, crow, eagle, and seagull are all hyponyms of bird (their hypernym), which makes them co-hyponyms of one another.