load_fb15k
- ampligraph.datasets.datasets.load_fb15k(check_md5hash=False, add_reciprocal_rels=False)
Load the FB15k dataset.
FB15k is a split of Freebase, first proposed by [BUGD+13].
Warning
The dataset includes a large number of inverse relations that leaked into the test set, so its use in experiments is deprecated. Use FB15k-237 instead.
The FB15k dataset is loaded from file if it exists at the AMPLIGRAPH_DATA_HOME location. If AMPLIGRAPH_DATA_HOME is not set, the default ~/ampligraph_datasets is checked. If the dataset is not found at either location, it is downloaded and placed in AMPLIGRAPH_DATA_HOME or ~/ampligraph_datasets.
The dataset is divided into three splits:
train: 483,142 triples
valid: 50,000 triples
test: 59,071 triples
Dataset   Train     Valid    Test     Entities   Relations
FB15K     483,142   50,000   59,071   14,951     1,345
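The lookup order described above can be sketched in a few lines. This is only an illustration of the documented behaviour, not AmpliGraph's own code: the helper name resolve_data_home is hypothetical, and the only names taken from the documentation are the AMPLIGRAPH_DATA_HOME environment variable and the ~/ampligraph_datasets default.

```python
import os


def resolve_data_home(env=None):
    """Hypothetical helper: return the directory that would be checked first.

    Mirrors the documented order: AMPLIGRAPH_DATA_HOME if it is set,
    otherwise the default ~/ampligraph_datasets.
    """
    env = os.environ if env is None else env
    configured = env.get("AMPLIGRAPH_DATA_HOME")
    if configured:
        return os.path.expanduser(configured)
    return os.path.expanduser("~/ampligraph_datasets")


# With the variable set, that location is used.
custom = resolve_data_home({"AMPLIGRAPH_DATA_HOME": "/tmp/kg_data"})

# Without it, the default under the home directory is used.
default = resolve_data_home({})

print(custom)
print(default)
```

In practice, setting AMPLIGRAPH_DATA_HOME in the environment before calling load_fb15k should be enough to control where the files are read from or downloaded to.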
- Parameters:
check_md5hash (bool) – If True, check the MD5 hash of the files (default: False).
add_reciprocal_rels (bool) – Flag which specifies whether to add reciprocal relations. For every <s, p, o> in the dataset this creates a corresponding triple with reciprocal relation <o, p_reciprocal, s> (default: False).
- Returns:
splits – The dataset splits: {'train': train, 'valid': valid, 'test': test}. Each split is an ndarray of shape (n, 3).
- Return type:
dict
Example
>>> from ampligraph.datasets import load_fb15k
>>> X = load_fb15k()
>>> X['test'][:3]
array([['/m/01qscs',
        '/award/award_nominee/award_nominations./award/award_nomination/award',
        '/m/02x8n1n'],
       ['/m/040db',
        '/base/activism/activist/area_of_activism',
        '/m/0148d'],
       ['/m/08966',
        '/travel/travel_destination/climate./travel/travel_destination_monthly_climate/month',
        '/m/05lf_']], dtype=object)
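To make the add_reciprocal_rels description concrete, here is a minimal pure-Python sketch of what adding reciprocal relations means for a list of <s, p, o> triples. It is an illustration under assumptions, not the library's implementation: the function add_reciprocal is hypothetical, the "_reciprocal" suffix is borrowed from the p_reciprocal notation in the parameter description, and appending the new triples after the originals is an arbitrary choice.

```python
def add_reciprocal(triples, suffix="_reciprocal"):
    """Hypothetical helper: for every (s, p, o), also emit (o, p + suffix, s)."""
    # The suffix is an assumed naming scheme, not necessarily the library's.
    reciprocal = [(o, p + suffix, s) for s, p, o in triples]
    # Originals first, reciprocal triples appended (ordering is an assumption).
    return list(triples) + reciprocal


triples = [
    ("/m/040db", "/base/activism/activist/area_of_activism", "/m/0148d"),
]

augmented = add_reciprocal(triples)
for triple in augmented:
    print(triple)
```

Under this sketch the number of triples doubles, and the set of relation names grows by one reciprocal name for each relation that appears in the input.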