HolE

class ampligraph.latent_features.HolE(k=100, eta=2, epochs=100, batches_count=100, seed=0, embedding_model_params={}, optimizer='adagrad', optimizer_params={'lr': 0.1}, loss='nll', loss_params={}, regularizer=None, regularizer_params={}, model_checkpoint_path='saved_model/', verbose=False, **kwargs)

Holographic Embeddings

The HolE model [NRP+16] as re-defined by [HS17].

Hayashi et al. [HS17] redefine the original HolE scoring function as:

\[f_{HolE} = \frac{2}{n} \, f_{ComplEx}\]
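
As a rough illustration of this relation, the sketch below computes a ComplEx-style score with NumPy and rescales it by 2/n. It assumes that n is the number of complex components in each embedding and that f_ComplEx is the usual real part of the trilinear product; neither detail is stated above. The function names are hypothetical and are not part of the AmpliGraph API.

```python
import numpy as np

def complex_score(e_s, r_p, e_o):
    """ComplEx-style score (assumed form): Re(sum(r_p * e_s * conj(e_o)))."""
    return float(np.real(np.sum(r_p * e_s * np.conj(e_o))))

def hole_score(e_s, r_p, e_o):
    """HolE score as a rescaled ComplEx score: 2 / n * f_ComplEx."""
    n = e_s.shape[0]  # assumption: n is the number of complex components
    return 2.0 / n * complex_score(e_s, r_p, e_o)

# Toy complex embeddings for a subject, a predicate and an object.
rng = np.random.default_rng(0)
e_s, r_p, e_o = (rng.normal(size=4) + 1j * rng.normal(size=4) for _ in range(3))
print(hole_score(e_s, r_p, e_o))
```

Because the factor 2/n is a positive constant, the two scoring functions rank any set of triples in the same order.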

Examples

>>> import numpy as np
>>> from ampligraph.latent_features import HolE
>>> model = HolE(batches_count=1, seed=555, epochs=20, k=10,
...              loss='pairwise', loss_params={'margin': 1},
...              regularizer='LP', regularizer_params={'lambda': 0.1})
>>>
>>> X = np.array([['a', 'y', 'b'],
...               ['b', 'y', 'a'],
...               ['a', 'y', 'c'],
...               ['c', 'y', 'a'],
...               ['a', 'y', 'd'],
...               ['c', 'y', 'd'],
...               ['b', 'y', 'c'],
...               ['f', 'y', 'e']])
>>> model.fit(X)
>>> model.predict(np.array([['f', 'y', 'e'], ['b', 'y', 'd']]))
[0.3046168, -0.0379385]
>>> model.get_embeddings(['f','e'], type='entity')
array([[-0.2704807 , -0.05434025,  0.13363852,  0.04879733,  0.00184516,
        -0.1149573 , -0.1177371 , -0.20798951,  0.01935115,  0.13033926,
        -0.81528974,  0.22864424,  0.2045117 ,  0.1145515 ,  0.248952  ,
         0.03513691, -0.08550065, -0.06037813,  0.23231442, -0.39326245],
       [ 0.204738  ,  0.10758886, -0.11931524,  0.14881928,  0.0929039 ,
         0.25577265,  0.05722341,  0.2549932 , -0.16462566,  0.43789816,
        -0.91011846,  0.3533137 ,  0.1144442 ,  0.00359709, -0.09599967,
        -0.03151475,  0.14198618,  0.16138661,  0.07511608, -0.2465882 ]],
      dtype=float32)

Methods

__init__([k, eta, epochs, batches_count, …]) Initialize an EmbeddingModel
fit(X[, early_stopping, early_stopping_params]) Train a HolE model.
get_embeddings(entities[, type]) Get the embeddings of entities or relations.
predict(X[, from_idx, get_ranks]) Predict the score of triples using a trained embedding model.
__init__(k=100, eta=2, epochs=100, batches_count=100, seed=0, embedding_model_params={}, optimizer='adagrad', optimizer_params={'lr': 0.1}, loss='nll', loss_params={}, regularizer=None, regularizer_params={}, model_checkpoint_path='saved_model/', verbose=False, **kwargs)

Initialize an EmbeddingModel

Also creates a new TensorFlow session for training.
Parameters:
  • k (int) – Embedding space dimensionality
  • eta (int) – The number of negatives that must be generated at runtime during training for each positive.
  • epochs (int) – The number of iterations of the training loop.
  • batches_count (int) – The number of batches into which the training set is split during the training loop.
  • seed (int) – The seed used by the internal random number generator.
  • embedding_model_params (dict) – HolE-specific hyperparameters. HolE currently does not require any.
  • optimizer (string) – The optimizer used to minimize the loss function. Choose from sgd, adagrad, adam, momentum.
  • optimizer_params (dict) –

    Parameter values specific to the optimizer. Currently supported:

    • lr - learning rate (used by all the optimizers)
    • momentum - learning momentum (used by momentum optimizer)
  • loss (string) –

    The type of loss function to use during training.

    • pairwise - the model will use the pairwise margin-based loss function.
    • nll - the model will use the negative log-likelihood loss function.
    • absolute_margin - the model will use the absolute margin-based loss function.
    • self_adversarial - the model will use the self-adversarial sampling loss function.
  • loss_params (dict) –

    Parameters dictionary specific to the loss.

    (Refer to the documentation of the specific loss functions for more details.)

  • regularizer (string) –

    The regularization strategy to use with the loss function.

    • LP - the model will use the L1, L2 or L3 norm, depending on the value of the parameter p.
    • None - the model will not use any regularizer.
  • regularizer_params (dict) –

    Parameters dictionary specific to the regularizer.

    (Refer to the documentation of the regularizer for more details.)

  • model_checkpoint_path (string) – Path to save the model.
  • verbose (bool) – Verbose mode
  • kwargs (dict) – Additional inputs, if any
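
To make the pairwise and nll loss options more concrete, here is a minimal NumPy sketch. The formulas are the common textbook forms of these losses and are an assumption here; the library's own implementations may differ, for example in how negatives are paired with positives or how the result is reduced. The function names are hypothetical.

```python
import numpy as np

def pairwise_margin_loss(pos_scores, neg_scores, margin=1.0):
    """Sum of max(0, margin + f_neg - f_pos) over positive/negative pairs."""
    pos_scores = np.asarray(pos_scores, dtype=np.float64)
    neg_scores = np.asarray(neg_scores, dtype=np.float64)
    return float(np.sum(np.maximum(0.0, margin + neg_scores - pos_scores)))

def nll_loss(pos_scores, neg_scores):
    """Sum of log(1 + exp(-y * f)), with y = +1 for positives and -1 for negatives."""
    pos_scores = np.asarray(pos_scores, dtype=np.float64)
    neg_scores = np.asarray(neg_scores, dtype=np.float64)
    # logaddexp(0, x) computes log(1 + exp(x)) in a numerically stable way.
    return float(np.sum(np.logaddexp(0.0, -pos_scores))
                 + np.sum(np.logaddexp(0.0, neg_scores)))

# One positive triple scored above its negative by less than the margin.
print(pairwise_margin_loss([0.5], [0.0], margin=1.0))
print(nll_loss([0.5], [0.0]))
```

Both losses decrease as positive triples score higher and negative triples score lower; the margin in the pairwise loss sets how large that gap must be before a pair stops contributing.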
fit(X, early_stopping=False, early_stopping_params={})

Train a HolE model.

The model is trained on a training set X using the training protocol described in [NRP+16].
Parameters:
  • X (ndarray, shape [n, 3]) – The training triples
  • early_stopping (bool) – Flag to enable early stopping (default: False).
  • early_stopping_params (dictionary) –

    Dictionary of parameters for early stopping. The following keys are supported:

    • x_valid: ndarray, shape [n, 3] : Validation set to be used for early stopping.
    • criteria: string : Criterion for early stopping: hits10, hits3, hits1 or mrr (default).
    • x_filter: ndarray, shape [n, 3] : Filter to be used (no filter by default).
    • burn_in: int : Number of epochs to pass before kicking in early stopping (default: 100).
    • check_interval: int : Early stopping interval after burn-in (default: 10).
    • stop_interval: int : Stop if the criterion performs worse over n consecutive checks (default: 3).
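
The keys above can be assembled into a dictionary as in the following sketch. The validation triples and the chosen values are purely illustrative assumptions; the call to fit is left as a comment because it requires AmpliGraph and a model and training set such as those in the Examples section.

```python
import numpy as np

# Hypothetical validation triples, reusing entities and the relation from training.
x_valid = np.array([['a', 'y', 'b'],
                    ['c', 'y', 'a']])

early_stopping_params = {
    'x_valid': x_valid,      # validation set used to evaluate the criterion
    'criteria': 'mrr',       # one of hits10, hits3, hits1 or mrr
    'burn_in': 10,           # epochs to wait before the first check
    'check_interval': 5,     # interval between checks after burn-in
    'stop_interval': 3,      # consecutive worsening checks before stopping
}

# With AmpliGraph installed, and model and X defined as in the Examples section:
# model.fit(X, early_stopping=True, early_stopping_params=early_stopping_params)
```

Note that burn_in defaults to 100, so with a small number of epochs a lower value is probably needed for early stopping to have any effect.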
get_embeddings(entities, type='entity')

Get the embeddings of entities or relations.

Parameters:
  • entities (array-like, dtype=int, shape=[n]) – The entities (or relations) of interest. Elements of the vector must be the original string literals, not internal IDs.
  • type (string) – If ‘entity’, the input is treated as KG entities. If ‘relation’, it is treated as KG predicates.
Returns:

embeddings – An array of k-dimensional embeddings.

Return type:

ndarray, shape [n, k]

predict(X, from_idx=False, get_ranks=False)

Predict the score of triples using a trained embedding model.

The function returns raw scores generated by the model. To obtain probability estimates, use a logistic sigmoid.
Parameters:
  • X (ndarray, shape [n, 3]) – The triples to score.
  • from_idx (bool) – If True, will skip conversion to internal IDs. (default: False).
  • get_ranks (bool) – Flag to compute ranks by scoring against corruptions (default: False).
Returns:

  • scores_predict (ndarray, shape [n]) – The predicted scores for input triples X.
  • rank (ndarray, shape [n]) – Rank of each triple, computed when get_ranks is True.
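
As noted above, raw scores can be mapped to probability estimates with a logistic sigmoid. A minimal NumPy sketch follows, using the two scores from the Examples section as input; the helper name sigmoid is hypothetical and is not part of the AmpliGraph API.

```python
import numpy as np

def sigmoid(scores):
    """Logistic sigmoid: maps raw scores to values in (0, 1)."""
    scores = np.asarray(scores, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-scores))

# Raw scores as returned by predict in the Examples section.
raw_scores = [0.3046168, -0.0379385]
probs = sigmoid(raw_scores)
print(probs)
```

A positive score maps to an estimate above 0.5 and a negative score to one below 0.5. If SciPy is available, scipy.special.expit computes the same function.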