generate_corruptions_for_eval

ampligraph.evaluation.generate_corruptions_for_eval(X, entities_for_corruption, corrupt_side='s+o', table_entity_lookup_left=None, table_entity_lookup_right=None, table_reln_lookup=None)

Generate corruptions for evaluation.

Create corruptions (subject and object) for a given triple x, in compliance with the local closed world assumption (LCWA), as described in [NMTG16].

Note

To filter the corruptions, we adopt a hashing-based strategy to handle the set difference problem. The strategy works as follows:

  • We compute unique entities and relations in our dataset.
  • We assign unique prime numbers to entities (separately for subjects and objects) and to relations, and create three separate hash tables. These hash tables are inputs to this function.
  • For each triple in filter_triples, we look up the prime numbers associated with its subject, relation and object in their respective hash tables. We then compute the product of these three primes and store it.
  • Since the primes assigned to subjects, relations and objects are unique, the prime product of a triple is also unique; i.e. a triple \((a, b, c)\) has a different product from the triple \((c, b, a)\), because \(a\) and \(c\) are assigned different primes as subjects than as objects.
  • While generating corruptions for evaluation, we look up the primes associated with each corrupted triple’s entities and relation and compute the prime product for the corrupted triple.
  • If this product is present in the products stored for the filter set, we remove the corresponding corrupted triple, as it is a duplicate (i.e. the corrupted triple is present in filter_triples).
  • Using this approach we generate filtered corruptions for evaluation.
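
The steps above can be illustrated with a minimal sketch in plain Python. It is not AmpliGraph's TensorFlow implementation: the helper names (first_primes, build_prime_tables, triple_product, filter_corruptions) and the toy data are hypothetical, and Python dictionaries stand in for the tf.HashTable inputs.

```python
def first_primes(n):
    """Return the first n primes by trial division (adequate for a toy example)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def build_prime_tables(n_entities, n_relations):
    """Assign disjoint primes to subjects, objects and relations (assumed layout)."""
    primes = first_primes(2 * n_entities + n_relations)
    subj = {e: primes[e] for e in range(n_entities)}
    obj = {e: primes[n_entities + e] for e in range(n_entities)}
    rel = {r: primes[2 * n_entities + r] for r in range(n_relations)}
    return subj, rel, obj


def triple_product(triple, subj, rel, obj):
    """Prime product of a (subject, relation, object) triple."""
    s, p, o = triple
    return subj[s] * rel[p] * obj[o]


def filter_corruptions(corruptions, filter_triples, subj, rel, obj):
    """Drop every corruption whose prime product appears in the filter set."""
    seen = {triple_product(t, subj, rel, obj) for t in filter_triples}
    return [c for c in corruptions if triple_product(c, subj, rel, obj) not in seen]


subj, rel, obj = build_prime_tables(n_entities=3, n_relations=2)
filter_triples = [(0, 0, 1), (2, 0, 1), (0, 0, 2)]
x = (0, 0, 1)

# Subject-side corruptions followed by object-side corruptions of x.
corruptions = [(e, x[1], x[2]) for e in range(3)] + [(x[0], x[1], e) for e in range(3)]
filtered = filter_corruptions(corruptions, filter_triples, subj, rel, obj)
print(filtered)  # [(1, 0, 1), (0, 0, 0)]
```

Because the three tables use disjoint sets of primes, each product factors back into exactly one (subject, relation, object) combination, so a set-membership test on the product is equivalent to a test on the triple itself.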

Execution Time: This method takes ~20 minutes on FB15K using ComplEx (Intel Xeon Gold 6142, 64 GB Ubuntu 16.04 box, Tesla V100 16GB)

Parameters:
  • X (Tensor, shape [1, 3]) – Currently, a single positive triple that will be used to create corruptions.
  • entities_for_corruption (Tensor) – All the entity IDs to be used for generating corruptions.
  • corrupt_side (string) –

    Specifies which side of the triple to corrupt:

    • 's': corrupt only the subject.
    • 'o': corrupt only the object.
    • 's+o': corrupt both the subject and the object.
  • table_entity_lookup_left (tf.HashTable) – Hash table of subject entities mapped to unique prime numbers
  • table_entity_lookup_right (tf.HashTable) – Hash table of object entities mapped to unique prime numbers
  • table_reln_lookup (tf.HashTable) – Hash table of relations mapped to unique prime numbers
Returns:

  • out (Tensor, shape [n, 3]) – An array of corruptions for the triple X.
  • out_prime (Tensor, shape [n, 3]) – An array of the prime products associated with the corruption triples, or None, depending on whether the filtered or unfiltered version is used.
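
To make the effect of corrupt_side on the number of returned rows concrete, here is a rough, unfiltered sketch in plain Python. The function make_corruptions is hypothetical and is not part of AmpliGraph; the row ordering (subject corruptions first, then object corruptions) is an assumption of this sketch, not something the documentation above specifies.

```python
def make_corruptions(triple, entities, corrupt_side="s+o"):
    """Return unfiltered corruptions of one triple as a list of (s, p, o) tuples."""
    if corrupt_side not in ("s", "o", "s+o"):
        raise ValueError("corrupt_side must be 's', 'o' or 's+o'")
    s, p, o = triple
    out = []
    if corrupt_side in ("s", "s+o"):
        # Replace the subject with every candidate entity.
        out.extend((e, p, o) for e in entities)
    if corrupt_side in ("o", "s+o"):
        # Replace the object with every candidate entity.
        out.extend((s, p, e) for e in entities)
    return out


entities = [0, 1, 2]
subject_only = make_corruptions((0, 0, 1), entities, corrupt_side="s")
both_sides = make_corruptions((0, 0, 1), entities, corrupt_side="s+o")
print(len(subject_only), len(both_sides))  # 3 6
```

In this sketch, n equals the number of candidate entities for 's' or 'o' and twice that number for 's+o'; the original triple itself appears among the rows because no filtering is applied.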