load_wn18
- ampligraph.datasets.datasets.load_wn18(check_md5hash=False, add_reciprocal_rels=False)
Load the WN18 dataset.
WN18 is a subset of WordNet. It was first presented by [BUGD+13].
Warning
The dataset includes a large number of inverse relations that spilled to the test set, and its use in experiments has been deprecated. Use WN18RR instead.
The WN18 dataset is loaded from file if it exists at the AMPLIGRAPH_DATA_HOME location. If AMPLIGRAPH_DATA_HOME is not set, the default ~/ampligraph_datasets is checked. If the dataset is not found at either location, it is downloaded and placed in AMPLIGRAPH_DATA_HOME or ~/ampligraph_datasets.

The dataset is divided into three splits:
train: 141,442 triples
valid: 5,000 triples
test: 5,000 triples
Dataset   Train     Valid   Test    Entities   Relations
WN18      141,442   5,000   5,000   40,943     18
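The lookup order for the dataset directory described above can be sketched as follows. This is a minimal illustration, not AmpliGraph's own code: the helper name `resolve_data_home` and its `env` parameter are hypothetical, and only the `AMPLIGRAPH_DATA_HOME` variable and the `~/ampligraph_datasets` default come from the description above.

```python
import os


def resolve_data_home(env=None):
    """Return the directory that would be searched for cached datasets.

    Hypothetical helper: mirrors the documented lookup order, but is not
    part of the AmpliGraph API.
    """
    # Allow a plain dict to stand in for os.environ so the logic is easy to test.
    env = os.environ if env is None else env

    # 1. Prefer AMPLIGRAPH_DATA_HOME when it is set and non-empty.
    data_home = env.get("AMPLIGRAPH_DATA_HOME")

    # 2. Otherwise fall back to the documented default location.
    if not data_home:
        data_home = os.path.join("~", "ampligraph_datasets")

    # Expand "~" and normalise to an absolute path.
    return os.path.abspath(os.path.expanduser(data_home))


# Variable set: the custom location wins.
print(resolve_data_home({"AMPLIGRAPH_DATA_HOME": "/tmp/my_datasets"}))
# Variable unset: the default under the home directory is used.
print(resolve_data_home({}))
```

Setting `AMPLIGRAPH_DATA_HOME` before calling `load_wn18()` is therefore the way to control where the files are read from and, on first use, downloaded to.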
- Parameters:
check_md5hash (bool) – If True, check the MD5 hash of the dataset files (default: False).
add_reciprocal_rels (bool) – Flag which specifies whether to add reciprocal relations. For every <s, p, o> in the dataset this creates a corresponding triple with reciprocal relation <o, p_reciprocal, s> (default: False).
- Returns:
splits – The dataset splits {'train': train, 'valid': valid, 'test': test}. Each split is an ndarray of shape (n, 3).
- Return type:
dict
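To make the effect of `add_reciprocal_rels` concrete, the sketch below builds reciprocal triples for a tiny hand-written list. It is an illustration under stated assumptions rather than the library's implementation: the helper `add_reciprocal_triples` is hypothetical, plain tuples are used instead of an ndarray, and the `_reciprocal` suffix is assumed from the `p_reciprocal` notation in the parameter description.

```python
def add_reciprocal_triples(triples, suffix="_reciprocal"):
    """Return the input triples followed by one reciprocal triple per input.

    Hypothetical helper: for every (s, p, o) it appends (o, p + suffix, s).
    The suffix is an assumption, not a confirmed AmpliGraph naming rule.
    """
    reciprocal = [(o, p + suffix, s) for s, p, o in triples]
    return list(triples) + reciprocal


# Two triples taken from the WN18 test split shown in the example.
triples = [
    ("06845599", "_member_of_domain_usage", "03754979"),
    ("00789448", "_verb_group", "01062739"),
]

augmented = add_reciprocal_triples(triples)
for triple in augmented:
    print(triple)
```

Under this scheme the number of triples doubles, and so does the number of distinct relation types, since each relation gains a reciprocal counterpart.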
Example
>>> from ampligraph.datasets import load_wn18
>>> X = load_wn18()
>>> X['test'][:3]
array([['06845599', '_member_of_domain_usage', '03754979'],
       ['00789448', '_verb_group', '01062739'],
       ['10217831', '_hyponym', '10682169']], dtype=object)
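The split sizes and the entity and relation counts reported for the dataset can be recomputed from a splits dictionary of this shape. The sketch below is a hypothetical helper (`split_statistics` is not an AmpliGraph function) run on toy data so that it works without downloading anything; it assumes only that each split is a sequence of (subject, predicate, object) rows.

```python
def split_statistics(splits):
    """Count triples per split plus distinct entities and relations overall.

    Hypothetical helper: works on any mapping from split name to a sequence
    of 3-element rows, such as lists of tuples or ndarrays of shape (n, 3).
    """
    entities, relations = set(), set()
    sizes = {}
    for name, triples in splits.items():
        sizes[name] = len(triples)
        for s, p, o in triples:
            entities.update((s, o))  # subjects and objects are both entities
            relations.add(p)
    return sizes, len(entities), len(relations)


# Toy stand-in for the dictionary returned by a dataset loader.
toy_splits = {
    "train": [("a", "r1", "b"), ("b", "r2", "c")],
    "valid": [("a", "r1", "c")],
    "test": [("c", "r2", "a")],
}

sizes, n_entities, n_relations = split_statistics(toy_splits)
print(sizes, n_entities, n_relations)
```

Passing the dictionary returned by `load_wn18()` in place of `toy_splits` should give figures comparable to the ones listed for WN18, assuming every entity and relation appears in at least one split.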