load_wn18

ampligraph.datasets.load_wn18(check_md5hash=False, add_reciprocal_rels=False)

Load the WN18 dataset.

WN18 is a subset of WordNet. It was first presented by [BUGD+13].

Warning

The dataset includes a large number of inverse relations that leaked into the test set, and its use in experiments has been deprecated. Use WN18RR instead.

The WN18 dataset is loaded from file if it exists at the AMPLIGRAPH_DATA_HOME location. If AMPLIGRAPH_DATA_HOME is not set, the default ~/ampligraph_datasets is checked. If the dataset is not found at either location, it is downloaded and placed in AMPLIGRAPH_DATA_HOME or ~/ampligraph_datasets.
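The lookup order described above can be sketched with the standard library alone. This is an illustrative sketch, not AmpliGraph's actual implementation: the helper name `resolve_data_home` is hypothetical, and treating an empty variable as unset is an assumption.

```python
import os


def resolve_data_home(environ=None):
    """Return the directory in which datasets are looked up (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    # Prefer AMPLIGRAPH_DATA_HOME when it is set.
    # Assumption: an empty value is treated the same as an unset variable.
    data_home = environ.get("AMPLIGRAPH_DATA_HOME")
    if not data_home:
        # Fall back to the documented default location.
        data_home = os.path.join("~", "ampligraph_datasets")
    return os.path.expanduser(data_home)
```

Passing a plain dict instead of `os.environ` makes the lookup easy to check without touching the real environment, for example `resolve_data_home({"AMPLIGRAPH_DATA_HOME": "/tmp/kg"})`.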

The dataset is divided into three splits:

  • train: 141,442 triples

  • valid: 5,000 triples

  • test: 5,000 triples

Dataset   Train     Valid   Test    Entities   Relations
WN18      141,442   5,000   5,000   40,943     18

Parameters:
  • check_md5hash (bool) – If True, check the MD5 hash of the files (default: False).

  • add_reciprocal_rels (bool) – Whether to add reciprocal relations. For every <s, p, o> in the dataset, this creates a corresponding triple with the reciprocal relation, <o, p_reciprocal, s> (default: False).
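To make the effect of add_reciprocal_rels concrete, here is a minimal pure-Python sketch of the transformation. It is not the library's code: the helper `add_reciprocals` is hypothetical, and appending the suffix `_reciprocal` to the predicate is an assumption based on the `p_reciprocal` notation above.

```python
def add_reciprocals(triples, suffix="_reciprocal"):
    """Return the input triples followed by one reciprocal triple for each.

    Hypothetical helper; the suffix used for reciprocal predicates is an
    assumption, not necessarily what the library emits.
    """
    # For every <s, p, o>, build <o, p + suffix, s>.
    reciprocal = [(o, p + suffix, s) for s, p, o in triples]
    return list(triples) + reciprocal


triples = [("10217831", "_hyponym", "10682169")]
augmented = add_reciprocals(triples)
```

Under these assumptions, the number of triples doubles and the number of distinct relations doubles as well, since each relation gains a reciprocal counterpart.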

Returns:

splits – The dataset splits {'train': train, 'valid': valid, 'test': test}. Each split is an ndarray of shape (n, 3).

Return type:

dict

Example

>>> from ampligraph.datasets import load_wn18
>>> X = load_wn18()
>>> X['test'][:3]
array([['06845599', '_member_of_domain_usage', '03754979'],
       ['00789448', '_verb_group', '01062739'],
       ['10217831', '_hyponym', '10682169']], dtype=object)
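The returned dict can be summarized with a few lines of plain Python. The sketch below uses a hypothetical helper, `summarize`, and a tiny stand-in dict built from the three rows shown above, so it runs without downloading anything.

```python
def summarize(splits):
    """Count triples per split and distinct entities/relations overall.

    Hypothetical helper; works on any mapping from split name to an
    iterable of (subject, predicate, object) rows.
    """
    entities, relations = set(), set()
    for triples in splits.values():
        for s, p, o in triples:
            entities.update((s, o))
            relations.add(p)
    return {
        "sizes": {name: len(triples) for name, triples in splits.items()},
        "entities": len(entities),
        "relations": len(relations),
    }


# Toy stand-in for the dict returned by load_wn18(), one row per split.
toy_splits = {
    "train": [("06845599", "_member_of_domain_usage", "03754979")],
    "valid": [("00789448", "_verb_group", "01062739")],
    "test": [("10217831", "_hyponym", "10682169")],
}
summary = summarize(toy_splits)
```

Calling `summarize(load_wn18())` on the real dataset would be expected to report the split sizes listed earlier; whether the entity and relation counts match the table exactly depends on them being counted across all three splits, which is an assumption here.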