load_from_rdf

ampligraph.datasets.datasets.load_from_rdf(folder_name, file_name, rdf_format='nt', data_home=None, add_reciprocal_rels=False)

Load an RDF file.

Loads an RDF knowledge graph using rdflib APIs. Multiple RDF serialization formats are supported (nt, ttl, rdf/xml, etc.). The entire graph is loaded into memory and converted into an rdflib Graph object.

Warning

Large RDF graphs should be serialized to ntriples beforehand and loaded with load_from_ntriples() instead, which is faster by orders of magnitude.

Hint

To split a generic knowledge graph into training, validation, and test sets, do not use this function; use train_test_split_no_unseen() instead. It returns validation and test sets that contain no triples with entities absent from the training set.
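
The "no unseen entities" property can be checked by hand. The following is a minimal sketch, not part of the library: the helper name unseen_entities and the toy triples are hypothetical, and it only assumes that triples are rows of the form <s, p, o>.

```python
import numpy as np


def unseen_entities(train, test):
    """Return the entities in `test` that never occur in `train` (hypothetical helper)."""
    train = np.asarray(train, dtype=object).reshape(-1, 3)
    test = np.asarray(test, dtype=object).reshape(-1, 3)
    # Entities are the subject (column 0) and the object (column 2) of each triple.
    train_entities = set(train[:, 0]) | set(train[:, 2])
    test_entities = set(test[:, 0]) | set(test[:, 2])
    return test_entities - train_entities


# Toy data, for illustration only.
train = [["a", "likes", "b"], ["b", "knows", "c"]]
good_test = [["c", "likes", "a"]]  # every entity also appears in train
bad_test = [["a", "likes", "d"]]   # "d" never appears in train
```

An empty result means the split satisfies the property; a non-empty result lists the entities a model trained on the training set would have no embedding for.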

Parameters:
  • folder_name (str) – Base folder where the file is stored.

  • file_name (str) – File name.

  • rdf_format (str) – The RDF serialization format (nt, ttl, rdf/xml - see rdflib documentation) (default: 'nt').

  • data_home (str) – The path to the folder that contains the datasets (default: None).

  • add_reciprocal_rels (bool) – Flag which specifies whether to add reciprocal relations. For every <s, p, o> in the dataset this creates a corresponding triple with reciprocal relation <o, p_reciprocal, s> (default: False).
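
The effect of add_reciprocal_rels can be illustrated with a small sketch. This is not the library's implementation: the function name add_reciprocal_sketch and the "_reciprocal" suffix used to name the reciprocal relation are assumptions made for illustration.

```python
import numpy as np


def add_reciprocal_sketch(triples, suffix="_reciprocal"):
    """Append <o, p + suffix, s> for every <s, p, o> (illustrative only).

    The suffix is an assumption; the library may name reciprocal relations differently.
    """
    triples = np.asarray(triples, dtype=object).reshape(-1, 3)
    reciprocal = np.array(
        [[o, p + suffix, s] for s, p, o in triples], dtype=object
    ).reshape(-1, 3)
    # Original triples come first, followed by their reciprocal counterparts.
    return np.concatenate([triples, reciprocal], axis=0)


# Toy data, for illustration only.
X = add_reciprocal_sketch([["a", "likes", "b"], ["b", "knows", "c"]])
```

Under these assumptions the output has twice as many rows as the input, and each added row swaps subject and object of one original triple.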

Returns:

triples – The triples read from the file, one <s, p, o> triple per row.

Return type:

ndarray, shape (n, 3)
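
A possible way to call the function is sketched below. The helpers split_rdf_path and load_rdf_example are hypothetical, and the sketch assumes that the file is looked up at data_home/folder_name/file_name, which is inferred from the parameter descriptions rather than stated explicitly. The import is done lazily so that the helpers can be defined without AmpliGraph installed.

```python
import os


def split_rdf_path(path):
    """Split a file path into (data_home, folder_name, file_name) (hypothetical helper)."""
    folder_path, file_name = os.path.split(path)
    data_home, folder_name = os.path.split(folder_path)
    return data_home, folder_name, file_name


def load_rdf_example(path, rdf_format="nt", add_reciprocal_rels=False):
    """Load the RDF file at `path` with load_from_rdf (hypothetical wrapper)."""
    # Requires AmpliGraph; imported here so the sketch itself has no hard dependency.
    from ampligraph.datasets.datasets import load_from_rdf

    # Assumption: the file is resolved as data_home/folder_name/file_name.
    data_home, folder_name, file_name = split_rdf_path(path)
    return load_from_rdf(
        folder_name,
        file_name,
        rdf_format=rdf_format,
        data_home=data_home,
        add_reciprocal_rels=add_reciprocal_rels,
    )
```

For example, load_rdf_example("/data/kgs/my_graph/graph.ttl", rdf_format="ttl") would be expected to return an ndarray of shape (n, 3), provided the file exists and the path assumption holds.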