load_from_csv
ampligraph.datasets.load_from_csv(directory_path, file_name, sep='\t', header=None)
Load a knowledge graph from a CSV file.
Loads a knowledge graph serialized in a csv file as:
subj1 relationX obj1
subj1 relationY obj2
subj3 relationZ obj2
subj4 relationY obj2
...
Note
The function filters duplicated statements.
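As a rough illustration of the file layout and the duplicate filtering described above, the following standard-library sketch parses separator-delimited triples and keeps only the first occurrence of each statement. The helper name read_unique_triples is hypothetical and is not part of AmpliGraph; it returns a list of tuples, whereas load_from_csv returns an ndarray.

```python
import csv
import io


def read_unique_triples(text, sep="\t"):
    """Parse (subject, predicate, object) rows and drop repeated statements.

    Hypothetical helper for illustration only; not AmpliGraph's implementation.
    """
    seen = set()
    triples = []
    for row in csv.reader(io.StringIO(text), delimiter=sep):
        if not row:
            continue  # skip blank lines
        triple = tuple(row)
        if triple not in seen:  # keep the first occurrence only
            seen.add(triple)
            triples.append(triple)
    return triples


# Three rows, the third repeating the first.
sample = "subj1\trelationX\tobj1\nsubj1\trelationY\tobj2\nsubj1\trelationX\tobj1\n"
unique = read_unique_triples(sample)
print(unique)
```

The sketch preserves the order in which statements first appear; whether the library preserves order in the same way is not stated here.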
Note
It is recommended to use ampligraph.evaluation.train_test_split_no_unseen() to split custom knowledge graphs into train, validation, and test sets. Using this function ensures that the validation and test sets contain only triples whose entities also occur in the training set.
Parameters:
- directory_path (str) – folder where the input file is stored.
- file_name (str) – file name
- sep (str) – The subject-predicate-object separator (default '\t').
- header (int, None) – The row of the header of the csv file. Same as pandas.read_csv header param.
Returns: triples – the actual triples of the file.
Return type: ndarray, shape [n, 3]
Examples
>>> from ampligraph.datasets import load_from_csv
>>> X = load_from_csv('folder', 'dataset.csv', sep=',')
>>> X[:3]
array([['a', 'y', 'b'],
       ['b', 'y', 'a'],
       ['a', 'y', 'c']],
      dtype='<U1')
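The note above on splitting without unseen entities can be illustrated with a small standard-library sketch. It assumes a simple greedy strategy: a triple is moved to the test set only if its subject and object each still occur in at least one other training triple. The function name split_no_unseen and this strategy are assumptions made for illustration; they are not the implementation of ampligraph.evaluation.train_test_split_no_unseen().

```python
import random
from collections import Counter


def split_no_unseen(triples, test_size, seed=0):
    """Greedy split so that every test entity also occurs in the training set.

    Hypothetical helper for illustration only; not AmpliGraph's algorithm.
    """
    rng = random.Random(seed)
    candidates = list(triples)
    rng.shuffle(candidates)

    # Number of training triples that mention each entity.
    counts = Counter()
    for s, _, o in candidates:
        counts.update({s, o})

    train, test = [], []
    for s, p, o in candidates:
        entities = {s, o}
        # Move the triple only if each entity remains in another training triple.
        if len(test) < test_size and all(counts[e] > 1 for e in entities):
            test.append((s, p, o))
            for e in entities:
                counts[e] -= 1
        else:
            train.append((s, p, o))
    return train, test


triples = [
    ("a", "y", "b"),
    ("b", "y", "a"),
    ("a", "y", "c"),
    ("c", "y", "d"),
    ("b", "y", "c"),
]
train, test = split_no_unseen(triples, test_size=2)
print(len(train), len(test))
```

The test set may end up smaller than test_size when too few triples can be moved without leaving an entity unseen in training.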