load_yago3_10
ampligraph.datasets.load_yago3_10(check_md5hash=False, clean_unseen=True, add_reciprocal_rels=False)
Load the YAGO3-10 dataset.
The dataset is a split of YAGO3 [MBS13] and was first presented in [DMSR18].
The YAGO3-10 dataset is loaded from file if it exists at the AMPLIGRAPH_DATA_HOME location. If AMPLIGRAPH_DATA_HOME is not set, the default ~/ampligraph_datasets is checked.
If the dataset is not found at either location, it is downloaded and placed in AMPLIGRAPH_DATA_HOME or ~/ampligraph_datasets.
It is divided into three splits:
train
valid
test
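The lookup order described above (environment variable first, then the default directory) can be sketched in a few lines. This is only an illustration under stated assumptions: `resolve_data_home` is a hypothetical helper written for this page, not a function exported by AmpliGraph, and the library's internal logic may differ in detail.

```python
import os

# Default location named in the documentation above.
DEFAULT_DATA_HOME = "~/ampligraph_datasets"


def resolve_data_home(environ=None):
    """Hypothetical helper: return the directory that would be searched.

    Uses AMPLIGRAPH_DATA_HOME when it is set and non-empty, otherwise
    falls back to ~/ampligraph_datasets.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("AMPLIGRAPH_DATA_HOME") or DEFAULT_DATA_HOME
    # Expand "~" and normalise to an absolute path.
    return os.path.abspath(os.path.expanduser(path))


# With the variable set, that location is used.
custom_home = resolve_data_home({"AMPLIGRAPH_DATA_HOME": "/tmp/kg_data"})

# Without it, the default under the user's home directory is used.
default_home = resolve_data_home({})
```

In practice, setting `AMPLIGRAPH_DATA_HOME` in the environment before calling `load_yago3_10()` should be enough to control where the files are read from or downloaded to.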
Dataset     Train       Valid   Test    Entities   Relations
YAGO3-10    1,079,040   5,000   5,000   123,182    37

Parameters:
- check_md5hash (bool) – If True, check the MD5 hash of the files. Defaults to False.
- clean_unseen (bool) – If True, removes triples from the validation and test sets that include entities not present in the training set (default: True).
- add_reciprocal_rels (bool) – Flag that specifies whether to add reciprocal relations. For every <s, p, o> in the dataset, this creates a corresponding triple with the reciprocal relation <o, p_reciprocal, s> (default: False).
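The effect of clean_unseen and add_reciprocal_rels can be illustrated on a toy set of triples. The sketch below uses plain Python tuples rather than the ndarrays the loader returns, and both helper functions are hypothetical stand-ins for this page, not AmpliGraph's own implementation. The `_reciprocal` suffix is assumed from the `p_reciprocal` notation above.

```python
def filter_unseen(train, other):
    """Keep only triples whose subject and object both occur in `train`."""
    seen = {s for s, _, _ in train} | {o for _, _, o in train}
    return [t for t in other if t[0] in seen and t[2] in seen]


def add_reciprocals(triples, suffix="_reciprocal"):
    """Append <o, p_reciprocal, s> for every <s, p, o> (suffix is an assumption)."""
    return list(triples) + [(o, p + suffix, s) for s, p, o in triples]


# Toy data, not taken from YAGO3-10.
train = [("a", "likes", "b"), ("b", "knows", "c")]
valid = [("a", "knows", "c"), ("a", "likes", "d")]

# "d" never appears in train, so the second validation triple is dropped.
valid_clean = filter_unseen(train, valid)

# Each training triple gains a mirrored counterpart.
train_with_reciprocals = add_reciprocals(train)
```

With reciprocal relations enabled, the number of triples in a split doubles and so does the number of distinct relation types, which is worth keeping in mind when comparing against the statistics in the table above.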
Returns: splits – The dataset splits: {'train': train, 'valid': valid, 'test': test}. Each split is an ndarray of shape [n, 3].
Return type: dict
Examples
>>> from ampligraph.datasets import load_yago3_10
>>> X = load_yago3_10()
>>> X["valid"][0]
array(['Mikheil_Khutsishvili', 'playsFor', 'FC_Merani_Tbilisi'], dtype=object)