Source: keras_text/models/sentence_model.py#L0


SentenceModelFactory


SentenceModelFactory.__init__

__init__(self, num_classes, token_index, max_sents, max_tokens, embedding_type="glove.6B.100d", \
    embedding_dims=100)

Creates a SentenceModelFactory instance for building various models that operate over (samples, max_sentences, max_tokens) input.

Args:

  • num_classes: The number of output classes.
  • token_index: A dictionary mapping each token to its integer index.
  • max_sents: The maximum number of sentences per document.
  • max_tokens: The maximum number of tokens per sentence.
  • embedding_type: The embedding type to use. Set to None to use random embeddings. (Default value: 'glove.6B.100d')
  • embedding_dims: The number of embedding dims to use for representing a word. This argument will be ignored when embedding_type is set. (Default value: 100)
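The (samples, max_sentences, max_tokens) input described above can be illustrated with a small, library-independent sketch. The helpers build_token_index and encode_docs are hypothetical names introduced here for illustration and are not part of keras_text. Reserving index 0 for padding and mapping unknown tokens to it are assumptions of this sketch, not documented behavior.

```python
def build_token_index(docs):
    """Map each distinct token to an integer index.

    Assumption: index 0 is reserved for padding, so real tokens start at 1.
    """
    token_index = {}
    for doc in docs:
        for sent in doc:
            for tok in sent:
                if tok not in token_index:
                    token_index[tok] = len(token_index) + 1
    return token_index


def encode_docs(docs, token_index, max_sents, max_tokens, pad=0):
    """Truncate or pad every document to max_sents x max_tokens integer ids."""
    encoded = []
    for doc in docs:
        rows = []
        for sent in doc[:max_sents]:
            # Unknown tokens fall back to the padding id (an assumption).
            ids = [token_index.get(tok, pad) for tok in sent[:max_tokens]]
            ids += [pad] * (max_tokens - len(ids))
            rows.append(ids)
        # Pad short documents with all-padding sentences.
        while len(rows) < max_sents:
            rows.append([pad] * max_tokens)
        encoded.append(rows)
    return encoded


# Two documents, each a list of tokenized sentences.
docs = [
    [["the", "cat", "sat"], ["it", "purred"]],
    [["dogs", "bark"]],
]
token_index = build_token_index(docs)
batch = encode_docs(docs, token_index, max_sents=2, max_tokens=3)
```

Here batch is a nested list with 2 samples, 2 sentences per sample, and 3 token ids per sentence, which is the layout the factory's models expect as input.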

SentenceModelFactory.build_model

build_model(self, token_encoder_model, sentence_encoder_model, trainable_embeddings=True, \
    output_activation="softmax")

Builds a model that first encodes the words within each sentence using token_encoder_model, then encodes the resulting sentence encodings using sentence_encoder_model.

Args:

  • token_encoder_model: An instance of SequenceEncoderBase for encoding tokens within sentences. This model will be applied across all sentences to create a sentence encoding.
  • sentence_encoder_model: An instance of SequenceEncoderBase that operates on the sentence encodings generated by token_encoder_model. Its output is then fed into a final Dense layer for classification.
  • trainable_embeddings: Whether or not to fine-tune the embeddings. (Default value: True)
  • output_activation: The output activation to use. (Default value: 'softmax') Use:
      • softmax for binary or multi-class.
      • sigmoid for multi-label classification.
      • linear for regression output.
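To make the choice of output_activation concrete, the following sketch applies the three activations to the same scores in plain Python. The function names and the sample logits are illustrative assumptions; in the actual model the activation is applied by the final Dense layer.

```python
import math


def softmax(logits):
    """Normalize scores into one probability distribution over classes."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Squash each score independently into (0, 1)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def linear(logits):
    """Identity activation: scores pass through unchanged."""
    return list(logits)


logits = [2.0, 0.5, -1.0]        # hypothetical pre-activation outputs
class_probs = softmax(logits)    # mutually exclusive classes, sums to 1
label_probs = sigmoid(logits)    # independent per-label probabilities
raw_values = linear(logits)      # unbounded regression outputs
```

Softmax makes the classes compete for a total probability of 1, so it suits single-label problems. Sigmoid scores each label on its own, so several labels can be likely at once. Linear leaves the values unbounded, as regression needs.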

Returns:

The model output tensor.
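The two-stage encoding that build_model composes can be pictured with a conceptual sketch. This is not the library's implementation: plain averaging stands in for both SequenceEncoderBase instances, token vectors are given directly rather than looked up from embeddings, and encode_document and average are hypothetical names used only here.

```python
def average(vectors):
    """Stand-in encoder: collapse a sequence of vectors into their mean."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def encode_document(doc, token_encoder, sentence_encoder):
    """Apply token_encoder to every sentence, then sentence_encoder to the results."""
    # Stage 1: one vector per sentence, built from that sentence's token vectors.
    sentence_vectors = [token_encoder(sentence) for sentence in doc]
    # Stage 2: one vector per document, built from the sentence vectors.
    return sentence_encoder(sentence_vectors)


# One document: two sentences of 2-dimensional token vectors.
doc = [
    [[1.0, 0.0], [3.0, 2.0]],  # sentence 1 has two tokens
    [[0.0, 4.0]],              # sentence 2 has one token
]
doc_vector = encode_document(doc, average, average)
```

In this sketch the sentence vectors are [2.0, 1.0] and [0.0, 4.0], and the document vector is their mean. In the real model, this document-level encoding is what the final Dense layer consumes.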