Datasets
Helper functions to load knowledge graphs from disk.
Note
It is recommended to set the AMPLIGRAPH_DATA_HOME environment variable:
export AMPLIGRAPH_DATA_HOME=/YOUR/PATH/TO/datasets
When loading a dataset, the module first checks whether AMPLIGRAPH_DATA_HOME is set.
If it is, the module searches that location for the required dataset.
If the dataset is not found, it is downloaded and placed in that directory.
If AMPLIGRAPH_DATA_HOME has not been set, the datasets are saved in the following directory:
~/ampligraph_datasets
Additionally, a specific directory can be passed to the dataset loader via the data_home parameter.
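The lookup order described in the note can be sketched as follows. This is a minimal illustration, not the library's implementation: `resolve_data_home` and `DEFAULT_DATA_HOME` are hypothetical names, and the assumption that an explicit `data_home` argument takes precedence over the environment variable is inferred from the note rather than stated in it.

```python
import os
from pathlib import Path
from typing import Optional

# Default directory named in the note above.
DEFAULT_DATA_HOME = "~/ampligraph_datasets"


def resolve_data_home(data_home: Optional[str] = None) -> Path:
    """Hypothetical helper: choose the dataset directory by precedence."""
    if data_home is not None:
        # Assumption: an explicit data_home argument wins over everything else.
        chosen = data_home
    else:
        # Otherwise use AMPLIGRAPH_DATA_HOME, falling back to the default
        # directory when the variable is unset or empty.
        chosen = os.environ.get("AMPLIGRAPH_DATA_HOME") or DEFAULT_DATA_HOME
    return Path(chosen).expanduser()
```

Calling `resolve_data_home()` with no argument and no environment variable yields the expanded form of `~/ampligraph_datasets`.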
Dataset-Specific Loaders
Use these helper functions to load datasets used in the graph representation learning literature.
The functions automatically download the datasets if they are not present in ~/ampligraph_datasets or at the location set in AMPLIGRAPH_DATA_HOME.
Function | Description |
---|---|
load_wn18([data_home]) | Load the WN18 dataset |
load_fb15k([data_home]) | Load the FB15k dataset |
load_fb15k_237([data_home]) | Load the FB15k-237 dataset |
load_yago3_10([data_home]) | Load the YAGO3-10 dataset |
load_wn18rr([data_home]) | Load the WN18RR dataset |
Dataset Summary
Dataset | Train | Valid | Test | Entities | Relations |
---|---|---|---|---|---|
FB15K-237 | 272,115 | 17,535 | 20,466 | 14,541 | 237 |
WN18RR | 86,835 | 3,034 | 3,134 | 40,943 | 11 |
FB15K | 483,142 | 50,000 | 59,071 | 14,951 | 1,345 |
WN18 | 141,442 | 5,000 | 5,000 | 40,943 | 18 |
YAGO3-10 | 1,079,040 | 5,000 | 5,000 | 123,182 | 37 |
These datasets originate from: FB15K-237, WN18RR, FB15K, WN18, YAGO3-10
Warning
FB15K-237 contains 8 unseen entities (entities that do not appear in the training set) inside 9 triples in the validation set and 29 inside 28 triples in the test set. WN18RR contains 198 unseen entities inside 210 triples in the validation set and 209 inside 210 triples in the test set.
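One way to find such triples is to compare the entities in a held-out split against those seen in training. The sketch below is illustrative only: `split_by_seen_entities` is a hypothetical helper, not part of the library, and the toy triples are made up for the example.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def split_by_seen_entities(
    train: List[Triple], held_out: List[Triple]
) -> Tuple[List[Triple], List[Triple], Set[str]]:
    """Return (kept, dropped, unseen): dropped triples mention an entity absent from train."""
    # Every subject and object that occurs in the training split.
    seen = {s for s, _, _ in train} | {o for _, _, o in train}
    kept: List[Triple] = []
    dropped: List[Triple] = []
    unseen: Set[str] = set()
    for s, p, o in held_out:
        missing = {e for e in (s, o) if e not in seen}
        if missing:
            dropped.append((s, p, o))
            unseen |= missing
        else:
            kept.append((s, p, o))
    return kept, dropped, unseen


# Toy data, invented for illustration.
train = [("a", "likes", "b"), ("b", "knows", "c")]
valid = [("a", "knows", "c"), ("a", "likes", "d"), ("e", "likes", "f")]
kept, dropped, unseen = split_by_seen_entities(train, valid)
```

In this toy split, two triples are dropped but three entities are unseen, because one triple contains two unseen entities. The same effect explains how a split can hold more unseen entities than affected triples.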
Generic Loaders
Functions to load custom knowledge graphs from disk.
Note
The environment variable AMPLIGRAPH_DATA_HOME must be set, and input graphs must be stored at the path indicated.
Function | Description |
---|---|
load_from_csv(directory_path, file_name[, …]) | Load a csv file |
load_from_ntriples(folder_name, file_name[, …]) | Load RDF ntriples as csv statements |
load_from_rdf(folder_name, file_name[, …]) | Load an RDF file |
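To illustrate the kind of input a generic loader consumes, the sketch below reads a file with one triple per row using only the standard library. It is a stand-in, not the library's code: `read_triples_csv` is a hypothetical name, the tab separator is an assumption, and the directory is passed in directly rather than resolved against AMPLIGRAPH_DATA_HOME.

```python
import csv
import os
import tempfile
from typing import List, Tuple


def read_triples_csv(
    directory_path: str, file_name: str, sep: str = "\t"
) -> List[Tuple[str, str, str]]:
    """Hypothetical stand-in for a generic loader: one (s, p, o) triple per row."""
    triples = []
    path = os.path.join(directory_path, file_name)
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle, delimiter=sep):
            if not row:
                continue  # skip blank lines
            if len(row) != 3:
                raise ValueError(f"expected 3 columns, got {len(row)}: {row!r}")
            triples.append((row[0], row[1], row[2]))
    return triples


# Usage: write a tiny tab-separated file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "toy.csv"), "w", encoding="utf-8") as f:
        f.write("a\tlikes\tb\nb\tknows\tc\n")
    triples = read_triples_csv(tmp, "toy.csv")
```

Rejecting rows that do not have exactly three columns surfaces a wrong separator early instead of producing malformed triples.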