query_topn
ampligraph.discovery.query_topn(model, top_n=10, head=None, relation=None, tail=None, ents_to_consider=None, rels_to_consider=None)
Queries the model with two elements of a triple and returns the top_n results of all possible completions ordered by score predicted by the model.
For example, given a <subject, predicate> pair in the arguments, the model will score all possible triples <subject, predicate, ?>, filling in the missing element with known entities, and return the top_n triples ordered by score. If given a <subject, object> pair, it will fill in the missing element with known relations. Here subject, predicate, and object correspond to the head, relation, and tail arguments.
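The completion procedure described above can be sketched without AmpliGraph. The snippet below is a minimal illustration only, not the library's implementation: `TOY_SCORES`, `toy_score`, and `complete_tail` are hypothetical names, and the fixed scores stand in for whatever a trained model would predict.

```python
import numpy as np

# Hypothetical scores standing in for a trained model's predictions.
TOY_SCORES = {
    ("a", "likes", "b"): 2.5,
    ("a", "likes", "c"): 0.3,
    ("a", "likes", "d"): 4.1,
}


def toy_score(triple):
    """Hypothetical scorer: fixed score for listed triples, 0.0 otherwise."""
    return TOY_SCORES.get(tuple(triple), 0.0)


def complete_tail(head, relation, entities, top_n=10):
    """Score every <head, relation, ?> candidate and keep the top_n by score."""
    # One candidate triple per entity that may fill the missing tail.
    candidates = [(head, relation, entity) for entity in entities]
    scores = np.array([toy_score(t) for t in candidates], dtype=np.float32)
    # Sort by descending score, then truncate to the requested number.
    order = np.argsort(-scores, kind="stable")[:top_n]
    return np.array(candidates)[order], scores[order]


# Restricting `entities` plays the role that ents_to_consider plays in the text.
triples, scores = complete_tail("a", "likes", ["a", "b", "c", "d"], top_n=2)
print(triples)
print(scores)
```

Completing a missing relation would follow the same pattern, with the candidate list built from relations instead of entities.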
Note
This function does not filter out true statements - triples returned can include those the model was trained on.
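Since known statements are not filtered out, one possible post-processing step is to drop returned triples that already appear in the training data. The following is a sketch under that assumption: `drop_known` is a hypothetical helper, not part of AmpliGraph, and the arrays are toy placeholders shaped like the function's documented return values.

```python
import numpy as np


def drop_known(triples, scores, known_triples):
    """Keep only the (triple, score) pairs whose triple is not in known_triples."""
    known = {tuple(map(str, t)) for t in known_triples}
    # Boolean mask: True where the returned triple is not a known statement.
    keep = np.array([tuple(map(str, t)) not in known for t in triples], dtype=bool)
    return triples[keep], scores[keep]


# Toy placeholders: three returned completions and a small "training set".
returned = np.array([["a", "likes", "b"],
                     ["a", "likes", "c"],
                     ["a", "likes", "d"]])
returned_scores = np.array([3.0, 2.0, 1.0], dtype=np.float32)
train = np.array([["a", "likes", "b"],
                  ["x", "likes", "y"]])

novel, novel_scores = drop_known(returned, returned_scores, train)
print(novel)
print(novel_scores)
```

Filtering after the query can leave fewer than top_n results, so a caller might request a larger top_n and truncate afterwards.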
Parameters:
- model (EmbeddingModel) – The trained model that will be used to score triple completions.
- top_n (int) – The number of completed triples to return.
- head (string) – An entity string to query.
- relation (string) – A relation string to query.
- tail (string) – An object string to query.
- ents_to_consider (array-like) – List of entities to use for triple completions. If None, will generate completions using all distinct entities. (Default: None.)
- rels_to_consider (array-like) – List of relations to use for triple completions. If None, will generate completions using all distinct relations. (Default: None.)
Returns:
- X (ndarray, shape [n, 3]) – An array of completed triples ordered by score, highest first.
- S (ndarray, shape [n]) – The score of each triple in X, in the same order.
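The two returned arrays are aligned row by row, so they can be zipped into (triple, score) pairs. The sketch below uses array literals copied from the first three rows of the output in the Examples section rather than a live model. The scores printed there carry a trailing singleton dimension, so the code flattens them defensively with `np.ravel`. The variable names are illustrative.

```python
import numpy as np

# First three rows copied from the Examples output; not recomputed here.
triples = np.array([['Catelyn Stark', 'ALLIED_WITH', 'House Tully of Riverrun'],
                    ['Catelyn Stark', 'ALLIED_WITH', 'House Stark of Winterfell'],
                    ['Catelyn Stark', 'ALLIED_WITH', 'House Wayn']])
scores = np.array([[10.261374], [8.84298], [2.78139]], dtype=np.float32)

# np.ravel handles both shape [n] and shape [n, 1] score arrays.
ranked = [(tuple(triple), float(score))
          for triple, score in zip(triples.tolist(), np.ravel(scores))]

for (head, relation, tail), score in ranked:
    print(f"{head} {relation} {tail}: {score:.3f}")
```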
Examples
>>> import requests
>>> from ampligraph.datasets import load_from_csv
>>> from ampligraph.latent_features import ComplEx
>>> from ampligraph.discovery import query_topn
>>>
>>> # Game of Thrones relations dataset
>>> url = 'https://ampligraph.s3-eu-west-1.amazonaws.com/datasets/GoT.csv'
>>> open('GoT.csv', 'wb').write(requests.get(url).content)
>>> X = load_from_csv('.', 'GoT.csv', sep=',')
>>>
>>> model = ComplEx(batches_count=10, seed=0, epochs=200, k=150, eta=5,
...                 optimizer='adam', optimizer_params={'lr':1e-3}, loss='multiclass_nll',
...                 regularizer='LP', regularizer_params={'p':3, 'lambda':1e-5},
...                 verbose=True)
>>> model.fit(X)
>>>
>>> # Complete <'Catelyn Stark', 'ALLIED_WITH', ?> with the five best-scoring tails
>>> query_topn(model, top_n=5,
...            head='Catelyn Stark', relation='ALLIED_WITH', tail=None,
...            ents_to_consider=None, rels_to_consider=None)
(array([['Catelyn Stark', 'ALLIED_WITH', 'House Tully of Riverrun'],
        ['Catelyn Stark', 'ALLIED_WITH', 'House Stark of Winterfell'],
        ['Catelyn Stark', 'ALLIED_WITH', 'House Wayn'],
        ['Catelyn Stark', 'ALLIED_WITH', 'House Mollen'],
        ['Catelyn Stark', 'ALLIED_WITH', 'Orton Merryweather']],
       dtype='<U44'),
 array([[10.261374 ],
        [ 8.84298  ],
        [ 2.78139  ],
        [ 1.9809164],
        [ 1.833096 ]], dtype=float32))