load_fb15k

ampligraph.datasets.load_fb15k(check_md5hash=False)

Load the FB15k dataset

Warning

The dataset contains a large number of inverse relations that leaked into the test set, and its use in experiments has been deprecated. Use FB15k-237 instead.
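To make the warning concrete, the sketch below estimates how many test triples have their subject and object appearing in reversed order in the training set under any relation. This is only a crude illustrative proxy. The helper name inverse_leak_fraction and the toy triples are hypothetical, and this is not the procedure that was used to diagnose FB15k.

```python
def inverse_leak_fraction(train, test):
    """Fraction of test triples (s, p, o) whose reversed pair (o, s)
    occurs in train under any relation. Hypothetical helper."""
    # Collect every (subject, object) pair seen during training.
    train_pairs = {(s, o) for s, _, o in train}
    test = list(test)
    if not test:
        return 0.0
    # A test triple counts as "leaked" if train contains the swapped pair.
    leaked = sum((o, s) in train_pairs for s, _, o in test)
    return leaked / len(test)


# Toy triples (assumed data, not taken from FB15k).
train = [("paris", "capital_of", "france"), ("alice", "spouse", "bob")]
test = [("france", "has_capital", "paris"), ("bob", "born_in", "rome")]

fraction = inverse_leak_fraction(train, test)
```

The function iterates over rows of three items, so it should also accept the [n, 3] arrays returned by load_fb15k, for example inverse_leak_fraction(X['train'], X['test']).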

FB15k is a split of Freebase, first proposed by [BUGD+13].

The FB15k dataset is loaded from file if it exists at the AMPLIGRAPH_DATA_HOME location. If AMPLIGRAPH_DATA_HOME is not set, the default ~/ampligraph_datasets is checked.

If the dataset is not found at either location it is downloaded and placed in AMPLIGRAPH_DATA_HOME or ~/ampligraph_datasets.
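The lookup order described above can be sketched as follows. This is a minimal illustration of the documented behaviour, not the library's actual implementation. The helper name resolve_data_home and its env parameter are hypothetical.

```python
import os
from pathlib import Path


def resolve_data_home(env=None):
    """Return the directory that would be searched for datasets.

    Hypothetical helper: prefers AMPLIGRAPH_DATA_HOME when it is set,
    otherwise falls back to ~/ampligraph_datasets.
    """
    env = os.environ if env is None else env
    data_home = env.get("AMPLIGRAPH_DATA_HOME")
    if data_home:
        return Path(data_home).expanduser()
    return Path("~/ampligraph_datasets").expanduser()


# Example: inspect where the dataset would be looked up on this machine.
current_home = resolve_data_home()
```

Passing a plain dict as env makes the lookup easy to try without changing real environment variables.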

The dataset is divided into three splits:

  • train

  • valid

  • test

Dataset   Train     Valid    Test     Entities   Relations
FB15K     483,142   50,000   59,071   14,951     1,345

Parameters

check_md5hash (boolean) – If True check the md5hash of the files. Defaults to False.
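As an illustration of what an MD5 check involves, the sketch below hashes a file in chunks and compares the digest with an expected value. The helper names md5_of_file and matches_md5 are hypothetical, and how the library stores or compares its reference hashes is not shown here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected_hex):
    """Compare a file's MD5 digest with an expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected_hex.lower()
```

A check like this detects corrupted or partially downloaded files. It is not a security guarantee, since MD5 is not collision resistant.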

Returns

splits – The dataset splits: {'train': train, 'valid': valid, 'test': test}. Each split is an ndarray of shape [n, 3].

Return type

dict
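Since each split is an array of (subject, predicate, object) rows, summary statistics such as those in the table above can be derived directly from the returned dictionary. The sketch below uses a small toy dictionary in place of the real return value. The helper name summarize and the toy triples are hypothetical.

```python
import numpy as np

# Toy stand-in for the returned dict; the real arrays hold Freebase identifiers.
splits = {
    "train": np.array([["a", "likes", "b"], ["b", "knows", "c"]], dtype=object),
    "valid": np.array([["a", "knows", "c"]], dtype=object),
    "test": np.array([["c", "likes", "a"]], dtype=object),
}


def summarize(splits):
    """Count triples per split and distinct entities/relations overall."""
    all_triples = np.concatenate(
        [splits["train"], splits["valid"], splits["test"]]
    )
    # Column 0 holds subjects, column 1 predicates, column 2 objects.
    entities = set(all_triples[:, 0]) | set(all_triples[:, 2])
    relations = set(all_triples[:, 1])
    return {
        "sizes": {name: len(triples) for name, triples in splits.items()},
        "entities": len(entities),
        "relations": len(relations),
    }


summary = summarize(splits)
```

The same function should work on the dictionary returned by load_fb15k, assuming it contains exactly the three documented keys.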

Examples

>>> from ampligraph.datasets import load_fb15k
>>> X = load_fb15k()
>>> X['test'][:3]
array([['/m/01qscs',
        '/award/award_nominee/award_nominations./award/award_nomination/award',
        '/m/02x8n1n'],
       ['/m/040db', '/base/activism/activist/area_of_activism', '/m/0148d'],
       ['/m/08966',
        '/travel/travel_destination/climate./travel/travel_destination_monthly_climate/month',
        '/m/05lf_']], dtype=object)