load_wn18

ampligraph.datasets.load_wn18(check_md5hash=False)

Load the WN18 dataset.

Warning

The dataset contains a large number of inverse relations that leak into the test set, and its use in experiments is deprecated. Use WN18RR instead.
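To make the warning concrete, here is a minimal sketch of how such leakage could be measured. The helper `inverse_leak_fraction` and the toy entity names are hypothetical and not part of AmpliGraph; only the `_hypernym` and `_hyponym` relation names come from WN18. A test triple counts as leaked when the training data contains any triple linking the same two entities in the opposite direction.

```python
def inverse_leak_fraction(train, test):
    """Fraction of test triples (s, p, o) for which some (o, p', s) is in train.

    Hypothetical helper for illustration; not an AmpliGraph function.
    Accepts any iterable of 3-item rows, e.g. lists of tuples.
    """
    # Every (subject, object) pair seen in training, ignoring the relation.
    train_pairs = {(s, o) for s, _, o in train}
    test = list(test)
    if not test:
        return 0.0
    # A test triple leaks if its reversed entity pair occurs in training.
    leaked = sum(1 for s, _, o in test if (o, s) in train_pairs)
    return leaked / len(test)


# Toy data (made-up entities, not real WN18 synset IDs).
toy_train = [
    ("dog", "_hypernym", "animal"),
    ("cat", "_hypernym", "animal"),
]
toy_test = [
    ("animal", "_hyponym", "dog"),   # inverse of a training triple
    ("animal", "_hyponym", "bird"),  # no counterpart in training
]

leak = inverse_leak_fraction(toy_train, toy_test)
```

On the toy data, one of the two test triples has a reversed counterpart in training. A model can answer such triples by memorising the inverse, which is the reason WN18RR is recommended instead.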

WN18 is a subset of WordNet. It was first presented by [BUGD+13].

The WN18 dataset is loaded from file if it exists at the AMPLIGRAPH_DATA_HOME location. If AMPLIGRAPH_DATA_HOME is not set, the default ~/ampligraph_datasets is checked.

If the dataset is not found at either location, it is downloaded and placed in AMPLIGRAPH_DATA_HOME or ~/ampligraph_datasets.
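The lookup order described above can be sketched with the standard library alone. The helper `resolve_data_home` is hypothetical and only mirrors the documented behaviour; it is not the library's internal implementation.

```python
import os


def resolve_data_home(environ=None):
    """Return the directory that would be searched for datasets.

    Hypothetical helper: uses AMPLIGRAPH_DATA_HOME when it is set,
    otherwise falls back to ~/ampligraph_datasets.
    """
    environ = os.environ if environ is None else environ
    default = os.path.join(os.path.expanduser("~"), "ampligraph_datasets")
    return environ.get("AMPLIGRAPH_DATA_HOME", default)


# With the variable set, that location wins.
custom = resolve_data_home({"AMPLIGRAPH_DATA_HOME": "/tmp/kg_data"})
# Without it, the default under the home directory is used.
fallback = resolve_data_home({})
```

To redirect downloads to a custom directory, set the environment variable before the first call to `load_wn18`, for example with `os.environ["AMPLIGRAPH_DATA_HOME"] = "/tmp/kg_data"`.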

The dataset is divided into three splits:

train: 141,442 triples
valid: 5,000 triples
test: 5,000 triples

Dataset	Train	Valid	Test	Entities	Relations
WN18	141,442	5,000	5,000	40,943	18

Parameters:	check_md5hash (bool) – If `True`, check the MD5 hash of the files. Defaults to `False`.
Returns:	splits – The dataset splits {'train': train, 'valid': valid, 'test': test}. Each split is an ndarray of shape [n, 3].
Return type:	dict
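For readers unfamiliar with what an MD5 check involves, the following is a generic sketch of hashing a file in chunks with `hashlib`. It illustrates the idea behind `check_md5hash` only; the helper `md5_of_file` is hypothetical, and how AmpliGraph performs the comparison internally is not shown here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    Hypothetical helper: reads the file in 1 MiB chunks so that
    large dataset files do not have to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A downloaded file would then be accepted only if its digest equals a known reference value. A mismatch usually indicates a truncated or corrupted download.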

Examples

>>> from ampligraph.datasets import load_wn18
>>> X = load_wn18()
>>> X['test'][:3]
array([['06845599', '_member_of_domain_usage', '03754979'],
       ['00789448', '_verb_group', '01062739'],
       ['10217831', '_hyponym', '10682169']], dtype=object)
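As a further sanity check, the split sizes and the number of distinct entities and relations can be recomputed from the returned dictionary. The sketch below uses a hypothetical helper, `summarize_splits`, and toy triples so that it runs without downloading anything. It treats each row as a (subject, predicate, object) triple, which matches the [n, 3] shape documented above.

```python
def summarize_splits(splits):
    """Return (sizes, n_entities, n_relations) for a dict of triple collections.

    Hypothetical helper for illustration; not an AmpliGraph function.
    Each value in `splits` is a sequence of 3-item rows.
    """
    sizes = {}
    entities, relations = set(), set()
    for name, triples in splits.items():
        sizes[name] = len(triples)
        for s, p, o in triples:
            entities.update((s, o))  # subjects and objects share one vocabulary
            relations.add(p)
    return sizes, len(entities), len(relations)


# Toy stand-in for the dict returned by load_wn18() (made-up entities).
toy_splits = {
    "train": [("a", "_hypernym", "b"), ("b", "_hyponym", "a")],
    "valid": [("c", "_hypernym", "b")],
    "test": [("a", "_verb_group", "c")],
}

sizes, n_entities, n_relations = summarize_splits(toy_splits)
```

Passing the output of `load_wn18()` instead of `toy_splits` should reproduce the figures listed in the table, assuming those counts were computed over all three splits.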