Wordnets

1. Introduction

A Wordnet is a lexical database that links words through semantic relations such as synonymy, hyponymy, and meronymy. Synonyms are grouped into synsets, each with a short definition and usage examples. A Wordnet can be seen as a combination and extension of a dictionary and a thesaurus. The first Wordnet was created for English (Princeton Wordnet), and software tools are available to access and use Wordnets. GermaNet is the most popular Wordnet for German, and Wordnets exist for several other languages as well.

2. Why Wordnets can be useful for WLO

2.1 More precise context-based tagging

A significant problem when tagging content with words is that a word can have multiple meanings depending on the context. Homographs and polysemes are examples of this. Homographs are words with the same spelling but different, unrelated meanings. Polysemes are words or phrases with different, but related, meanings.

Homograph example: bark (referring to the sound a dog makes) and bark (referring to the outer layer or skin of a tree)

Polyseme example: bank (referring to the financial institution) and bank (referring to the physical location where money can be drawn)

Wordnets enable us to pinpoint meaning more precisely. In the examples above, each sense has its own id in the English Wordnet, which refers to both the word and the context in which it is used:

bark (referring to the sound a dog makes) -> wordnet id: ewn-07391331-n

bark (referring to the outer layer or skin of a tree) -> wordnet id: ewn-13183195-n

bank (referring to the financial institution) -> wordnet id: ewn-08437235-n

bank (referring to the physical location where money can be drawn) -> wordnet id: ewn-02790795-n

The wordnet id is the unique identifier of a synset (a grouping of synonymous words that share the same contextual meaning). Tagging content with a wordnet id, instead of the word alone, allows us to categorise and retrieve information more accurately.
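The difference between tagging with a word and tagging with a wordnet id can be sketched as follows. This is a minimal illustration, not a real wordnet lookup: the two synset ids are the ones listed above, while the document titles, the `DOCUMENTS` list, and the `build_index` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical content items, each tagged with a synset id rather than a bare word.
# The ids are the two "bark" senses from the examples above; the titles are made up.
DOCUMENTS = [
    {"title": "Why dogs bark at night", "tags": ["ewn-07391331-n"]},
    {"title": "Identifying trees by their bark", "tags": ["ewn-13183195-n"]},
]


def build_index(documents):
    """Map each synset id to the titles of the documents tagged with it."""
    index = defaultdict(list)
    for doc in documents:
        for tag in doc["tags"]:
            index[tag].append(doc["title"])
    return index


index = build_index(DOCUMENTS)

# A plain keyword search for "bark" would match both titles;
# a lookup by synset id returns only the intended sense.
keyword_hits = [d["title"] for d in DOCUMENTS if "bark" in d["title"]]
tree_bark_titles = index["ewn-13183195-n"]
print(keyword_hits)
print(tree_bark_titles)
```

The keyword search cannot distinguish the two senses, whereas the id-based lookup returns only the document about tree bark.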

2.2 Mappings between different languages

There are wordnets in many different languages, and the wordnet id can be used to connect synsets in different languages to each other. Practically, this means that if we have the wordnet id of a concept such as bark (referring to the sound a dog makes), we can retrieve the corresponding concept in another language, such as German (Bellen). This is useful to us because somebody can search for a concept in one language, such as English, and still retrieve content that is written in German.
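A cross-language lookup along these lines might look like the sketch below. It assumes, for simplicity, that both languages share a single synset id. The `LEMMAS` table and the `find_synsets` and `translate` helpers are hypothetical; only the id and the word pair bark/Bellen come from the text above.

```python
# Hypothetical lemma tables keyed by language, then by a shared synset id.
LEMMAS = {
    "en": {"ewn-07391331-n": ["bark"]},
    "de": {"ewn-07391331-n": ["Bellen"]},
}


def find_synsets(word, lang):
    """Return the ids of all synsets in `lang` that list `word` as a lemma."""
    return [sid for sid, lemmas in LEMMAS.get(lang, {}).items() if word in lemmas]


def translate(synset_id, target_lang):
    """Return the lemmas for a synset in the target language, or [] if unmapped."""
    return LEMMAS.get(target_lang, {}).get(synset_id, [])


# An English query is resolved to synset ids, then to German lemmas,
# which can be used to search German content.
german_terms = [
    lemma
    for sid in find_synsets("bark", "en")
    for lemma in translate(sid, "de")
]
print(german_terms)
```

With a full wordnet, the English word would resolve to several synsets, so the sense would still need to be chosen before translating.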

2.3 Wordnets and relationships between words

Wordnets are not only useful for storing and retrieving words in a more precise context; they also contain a lot of metadata and other useful information. Firstly, a wordnet is like a graph-based dictionary that connects words to each other through specific relationships. Some of the most important relationships that words can have with each other in Wordnets are the following:

| Relationship | Description | Example |
| --- | --- | --- |
| Hypernym | a word with a broader meaning | dog -> mammal |
| Hyponym | a word with a narrower meaning | dog -> Labrador |
| Meronym | a word denoting a part of a larger whole | finger, palm -> hand |
| Holonym | a word denoting the whole that the parts belong to (the inverse of a meronym) | hand -> finger, palm |
| Antonym | opposites | boy <-> girl |
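Because these relationships form a graph, they can be traversed programmatically. The following is a minimal sketch over a toy graph built from the example words above; the dictionaries and helper functions are hypothetical and do not reflect the API of any particular wordnet library.

```python
# Toy relation graph using the example words above; not real wordnet data.
HYPERNYM = {"Labrador": "dog", "dog": "mammal"}  # narrower term -> broader term
MERONYMS = {"hand": ["finger", "palm"]}          # whole -> its parts


def hypernym_path(word):
    """Follow hypernym links upward and return the chain of broader terms."""
    path = []
    while word in HYPERNYM:
        word = HYPERNYM[word]
        path.append(word)
    return path


def hyponyms(word):
    """Return the direct narrower terms by inverting the hypernym links."""
    return sorted(child for child, parent in HYPERNYM.items() if parent == word)


def holonyms(part):
    """Return the wholes that list `part` among their meronyms."""
    return sorted(whole for whole, parts in MERONYMS.items() if part in parts)


print(hypernym_path("Labrador"))
print(hyponyms("dog"))
print(holonyms("finger"))
```

Hyponyms and holonyms are derived here by inverting the stored links, which mirrors the fact that hypernym/hyponym and meronym/holonym are inverse pairs.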

Another interesting relationship is entailment. Entailment applies to verbs and refers to an action that cannot occur unless another action is also taking place. For example, snore entails sleep: you cannot snore without sleeping, and the two actions happen at the same time. The relationship only holds in one direction, however. Sleep does not entail snore, since it is possible to sleep without snoring.

Derivation is also an important relationship. It refers to words that have a different part of speech but originate from the same root. For example, the words concept and conceptual have a derivational relationship: one is a noun and the other is an adjective, but both are derived from the same root idea.

Other relationship types in wordnet:

  • domain_region

  • domain_topic

  • has_domain_topic

  • also

  • similar

  • pertainym

  • exemplifies

  • has_domain_region

  • is_exemplified_by

  • participle

2.4 Wordnet metadata and additional information

Wordnets also link to other Linked Open Data (LOD) vocabularies. For example, the concepts in the English Wordnet link to dc:identifier and dc:subject in Dublin Core. The dc:subject might be of interest to us, as it groups words into categories. The following are examples of the available categories:

'adv.all', 'noun.person', 'verb.contact', 'noun.object', 'verb.competition', 'verb.communication', 'adj.pert', 'noun.time', 'verb.weather', 'verb.creation', 'noun.state', 'verb.motion', 'noun.cognition', 'noun.process', 'verb.perception', 'adj.ppl', 'verb.social', 'verb.consumption', 'noun.substance', 'noun.food', 'noun.shape', 'verb.possession', 'noun.feeling', 'verb.change', 'verb.cognition', 'noun.location', 'verb.body', 'verb.emotion', 'noun.communication', 'noun.act', 'noun.Tops', 'noun.plant', 'noun.quantity', 'noun.relation', 'adj.all', 'noun.artifact', 'noun.motive', 'noun.attribute', 'verb.stative', 'noun.body', 'noun.possession', 'noun.animal', 'noun.event', 'noun.phenomenon', 'noun.group'

2.5 Wordnets are extendable

Wordnets can be extended: specialist terminology can be added to a wordnet by linking the new entries to already existing synsets.
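As a rough illustration of what such an extension involves, the sketch below adds a specialist term as a narrower concept under an existing synset. Everything except the existing synset id for bank (the financial institution) is an assumption: the in-memory `wordnet` dictionary, the `add_synset` helper, the new id `wlo-0001-n`, and the term "credit union" are hypothetical.

```python
# Hypothetical in-memory wordnet: synset id -> lemmas and an optional hypernym id.
wordnet = {
    "ewn-08437235-n": {"lemmas": ["bank"], "hypernym": None},
}


def add_synset(wordnet, synset_id, lemmas, hypernym_id):
    """Add a new synset and link it to an existing synset as its hypernym."""
    if synset_id in wordnet:
        raise ValueError(f"synset {synset_id} already exists")
    if hypernym_id not in wordnet:
        raise KeyError(f"unknown hypernym synset {hypernym_id}")
    wordnet[synset_id] = {"lemmas": list(lemmas), "hypernym": hypernym_id}


# Add a specialist term under the existing "bank" (financial institution) synset.
# The id prefix "wlo-" is made up to keep custom entries apart from the originals.
add_synset(wordnet, "wlo-0001-n", ["credit union"], "ewn-08437235-n")
print(wordnet["wlo-0001-n"])
```

Using a separate id prefix for custom entries is one possible design choice; it makes it easy to tell added terminology from the original wordnet content.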

2.6 Visualisation of concepts is easy

Because Wordnets are stored in a graph-based structure, words and their relationships to other words can be visualised easily. Below is a visualisation example of the word automobile in the English Wordnet:
