load_from_ntriples

ampligraph.datasets.datasets.load_from_ntriples(folder_name, file_name, data_home=None, add_reciprocal_rels=False)

Load a dataset of RDF ntriples.

Loads an RDF knowledge graph serialized as ntriples, without building an RDF graph in memory. Prefer this function over load_from_rdf(): it does not load the graph into an rdflib model and is therefore orders of magnitude faster. It does, however, require an ntriples serialization, as in the example below:

_:alice <http://xmlns.com/foaf/0.1/knows> _:bob .
_:bob <http://xmlns.com/foaf/0.1/knows> _:alice .
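
Because each ntriples statement sits on its own line, the file can be read line by line without an RDF model. The sketch below illustrates that idea in plain Python; it is not the library's implementation, and parse_ntriples_lines is a hypothetical helper. It leaves each term exactly as written (angle brackets and blank-node prefixes included) and assumes one well-formed statement per line.

```python
def parse_ntriples_lines(lines):
    """Split ntriples statements into (subject, predicate, object) tuples."""
    triples = []
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Drop the '.' that terminates each statement.
        if line.endswith("."):
            line = line[:-1].rstrip()
        # Subject and predicate never contain whitespace; the object may
        # (for example a quoted literal), so split at most twice.
        subject, predicate, obj = line.split(None, 2)
        triples.append((subject, predicate, obj))
    return triples


sample = [
    "_:alice <http://xmlns.com/foaf/0.1/knows> _:bob .",
    "_:bob <http://xmlns.com/foaf/0.1/knows> _:alice .",
]
triples = parse_ntriples_lines(sample)
```

Each statement becomes one row of three strings, which matches the (n, 3) shape of the array the function returns.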

Hint

To split a generic knowledge graph into training, validation, and test sets, do not use the above function; use train_test_split_no_unseen() instead. It returns validation and test sets that contain no triples with entities absent from the training set.
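
To make the "no unseen entities" constraint concrete, here is a minimal sketch of one way to hold out triples while keeping every entity in the training set. It is an illustration only, not the algorithm used by train_test_split_no_unseen(); the name split_no_unseen, its parameters, and the toy graph are assumptions made for this example.

```python
import random
from collections import Counter


def split_no_unseen(triples, test_size, seed=0):
    """Hold out test_size triples whose entities all remain in the training set."""
    rng = random.Random(seed)
    order = list(range(len(triples)))
    rng.shuffle(order)

    # How many times each entity still appears among the training triples.
    counts = Counter()
    for s, _, o in triples:
        counts[s] += 1
        counts[o] += 1

    test_idx = set()
    for i in order:
        if len(test_idx) == test_size:
            break
        s, _, o = triples[i]
        # Occurrences of each entity that moving this triple would take away.
        removed = Counter([s, o])
        # Move the triple only if every entity keeps at least one occurrence.
        if all(counts[e] - n >= 1 for e, n in removed.items()):
            counts.subtract(removed)
            test_idx.add(i)

    if len(test_idx) < test_size:
        raise ValueError("cannot hold out that many triples without unseen entities")

    train = [t for i, t in enumerate(triples) if i not in test_idx]
    test = [t for i, t in enumerate(triples) if i in test_idx]
    return train, test


kg = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "knows", "a"),
    ("a", "likes", "c"),
    ("d", "knows", "a"),
]
train, test = split_no_unseen(kg, test_size=1)
```

In this toy graph the entity d occurs in a single triple, so that triple can never be moved to the test set; a purely random split offers no such guarantee.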

Parameters:
  • folder_name (str) – Base folder where the file is stored.

  • file_name (str) – File name.

  • data_home (str) – The path to the folder that contains the datasets.

  • add_reciprocal_rels (bool) – Flag which specifies whether to add reciprocal relations. For every <s, p, o> in the dataset this creates a corresponding triple with reciprocal relation <o, p_reciprocal, s> (default: False).

Returns:

triples – The actual triples of the file.

Return type:

ndarray, shape (n, 3)
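
The effect of add_reciprocal_rels described under Parameters can be sketched in plain Python as follows. This is a hedged illustration rather than the library's code: add_reciprocal is a hypothetical helper, the "_reciprocal" suffix simply follows the <o, p_reciprocal, s> notation used above, and appending the reciprocal triples after the originals is an assumption about ordering.

```python
def add_reciprocal(triples, suffix="_reciprocal"):
    """Return the triples plus one reversed triple per original triple."""
    augmented = list(triples)
    for s, p, o in triples:
        # Swap subject and object and rename the predicate.
        augmented.append((o, p + suffix, s))
    return augmented


triples = [("_:alice", "<http://xmlns.com/foaf/0.1/knows>", "_:bob")]
augmented = add_reciprocal(triples)
```

With the flag enabled, a dataset of n triples would therefore yield 2n rows, one original and one reciprocal per input triple.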