Shared Encoder

What is a shared encoder?

The core idea of this library is to serve multiple tasks with a single model. To achieve this, we place a transformer-based encoder at the centre: data for all tasks passes through this central encoder. The encoder is called 'shared' because it is responsible for the majority of the learning across all tasks. Task-specific headers are then built on top of the shared encoder.

Task specific headers

The encoder hidden states are consumed by task-specific layers that output logits in the format required by each task. The forward pass for a data batch belonging to, say, taskA goes through the shared encoder and the header for taskA. The computed loss (called the 'task loss') is back-propagated along the same path.
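The flow above can be sketched in PyTorch. This is an illustrative toy model, not the library's actual classes: a small feed-forward layer stands in for the transformer encoder, and all class and parameter names are made up for the example.

```python
# Illustrative toy model, not the library's actual classes: a small
# feed-forward layer stands in for the shared transformer encoder.
import torch
import torch.nn as nn

class SharedEncoderModel(nn.Module):
    def __init__(self, vocab_size, hidden, task_labels):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.Linear(hidden, hidden)  # stand-in for the shared encoder
        # one task-specific header per task, each producing its own logits
        self.headers = nn.ModuleDict(
            {task: nn.Linear(hidden, num_labels) for task, num_labels in task_labels.items()}
        )

    def forward(self, input_ids, task_name):
        hidden_states = torch.relu(self.encoder(self.embed(input_ids)))
        pooled = hidden_states.mean(dim=1)      # crude pooling over token positions
        return self.headers[task_name](pooled)  # logits from this task's header only

model = SharedEncoderModel(vocab_size=100, hidden=32, task_labels={"taskA": 3, "taskB": 5})
batch = torch.randint(0, 100, (4, 7))           # a batch of 4 sequences for taskA
logits = model(batch, "taskA")                  # forward: shared encoder -> taskA header
task_loss = nn.functional.cross_entropy(logits, torch.tensor([0, 1, 2, 0]))
task_loss.backward()                            # the task loss flows back along the same path
```

Note that only the shared encoder and the taskA header receive gradients here; the taskB header is untouched by this batch.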

Choice of shared encoder

We support multiple transformer-based encoder models. For ease of use, we've integrated the encoders from the transformers library. The available encoders and their config names are listed below.

Model type   Config names              Default config
DISTILBERT   distilbert-base-uncased   distilbert-base-uncased
             distilbert-base-cased
BERT         bert-base-uncased         bert-base-uncased
             bert-base-cased
             bert-large-uncased
             bert-large-cased
ROBERTA      roberta-base              roberta-base
             roberta-large
ALBERT       albert-base-v1            albert-base-v1
             albert-large-v1
             albert-xlarge-v1
             albert-xxlarge-v1
             albert-base-v2
             albert-large-v2
             albert-xlarge-v2
             albert-xxlarge-v2
XLNET        xlnet-base-cased          xlnet-base-cased
             xlnet-large-cased
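A task file would then select an encoder by its config name. The snippet below is only illustrative — the key names (model_type, config_name) are assumptions and should be checked against the library's actual task-file schema:

```yaml
taskA:
  model_type: BERT                # one of the model types from the table above
  config_name: bert-base-cased    # a config name listed for that model type;
                                  # if omitted, the default config is assumed to apply
```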

Losses

We support the following two loss functions.

class models.loss.CrossEntropyLoss(alpha=1.0, name='Cross Entropy Loss')[source]
forward(inp, target, attnMasks=None)[source]

This is the standard cross entropy loss as defined in PyTorch. This loss should be used for single-sentence or sentence-pair classification tasks.

To use this loss for training, set loss_type : CrossEntropyLoss in the task file
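The computation behind this loss can be shown with a small pure-Python version of standard cross entropy; the library itself uses the PyTorch implementation.

```python
import math

def cross_entropy(logits_batch, targets):
    """Mean negative log-likelihood of the target class under a softmax."""
    total = 0.0
    for logits, target in zip(logits_batch, targets):
        log_norm = math.log(sum(math.exp(z) for z in logits))  # log of softmax denominator
        total += log_norm - logits[target]                     # -log softmax(logits)[target]
    return total / len(targets)

# confident, correct predictions give a small loss
loss = cross_entropy([[5.0, 0.0], [0.0, 5.0]], [0, 1])
```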

class models.loss.NERLoss(alpha=1.0, name='Cross Entropy Loss')[source]
forward(inp, target, attnMasks=None)[source]

This loss is a modified version of cross entropy for NER/sequence labelling tasks. It ignores the extra 'O' values at padded positions by using the attention masks.

To use this loss for training, set loss_type : NERLoss in the task file
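The masking idea can be sketched in pure Python: positions whose attention mask is 0 are simply excluded from the average. This is an illustration of the idea, not the library's exact implementation.

```python
import math

def masked_cross_entropy(logits, targets, attn_masks):
    """Cross entropy over token positions, skipping masked-out (padded) ones."""
    total, count = 0.0, 0
    for position_logits, target, keep in zip(logits, targets, attn_masks):
        if keep == 0:          # padding position: contributes nothing to the loss
            continue
        log_norm = math.log(sum(math.exp(z) for z in position_logits))
        total += log_norm - position_logits[target]
        count += 1
    return total / count

# 3 token positions; the last is padding labelled 'O' (index 0) and is ignored
logits = [[2.0, 0.1], [0.1, 2.0], [9.0, 0.0]]
loss = masked_cross_entropy(logits, [0, 1, 0], attn_masks=[1, 1, 0])
```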

Metrics

For evaluating the performance on dev and test sets during training, we provide the following standard metrics.

File for creating metric functions

utils.eval_metrics.classification_accuracy(yTrue, yPred)[source]

Accuracy score for classification tasks, computed from the labels provided in the data file and the predictions from the multi-task model. It takes a batch of predictions and labels.

To use this metric, add classification_accuracy to the list of metrics in the task file.

Parameters:
  • yPred (list) – [0, 2, 1, 3]
  • yTrue (list) – [0, 1, 2, 3]
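With the example lists above, a minimal pure-Python version of this metric gives 0.5, since only positions 0 and 3 match:

```python
def classification_accuracy(y_true, y_pred):
    """Fraction of positions where the prediction matches the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

acc = classification_accuracy([0, 1, 2, 3], [0, 2, 1, 3])  # matches at positions 0 and 3
```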
utils.eval_metrics.classification_f1_score(yTrue, yPred)[source]

Standard F1 score from sklearn for classification tasks. It takes a batch of predictions and labels.

To use this metric, add classification_f1_score to the list of metrics in the task file.

Parameters:
  • yPred (list) – [0, 2, 1, 3]
  • yTrue (list) – [0, 1, 2, 3]
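sklearn's f1_score supports several averaging modes for multi-class labels; the hand-rolled sketch below computes a macro-averaged F1 to show the underlying computation (which averaging mode the library actually uses is not specified here):

```python
def f1_per_class(y_true, y_pred, cls):
    """F1 for one class, treating it as the positive label."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_per_class(y_true, y_pred, c) for c in classes) / len(classes)

# classes 0 and 3 are predicted perfectly; 1 and 2 are swapped
score = macro_f1([0, 1, 2, 3], [0, 2, 1, 3])
```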
utils.eval_metrics.seqeval_f1_score(yTrue, yPred)[source]

F1 score for NER/sequence labelling tasks, taken from the seqeval library.

To use this metric, add seqeval_f1_score to the list of metrics in the task file.

Parameters:
  • yTrue (list of list) – [[‘O’, ‘O’, ‘O’, ‘B-MISC’, ‘I-MISC’, ‘I-MISC’, ‘O’], [‘B-PER’, ‘I-PER’, ‘O’]]
  • yPred (list of list) – [[‘O’, ‘O’, ‘B-MISC’, ‘I-MISC’, ‘I-MISC’, ‘I-MISC’, ‘O’], [‘B-PER’, ‘I-PER’, ‘O’]]
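Unlike token-level scores, seqeval scores at the entity level: a predicted entity counts as correct only if both its span and its type match exactly. The simplified pure-Python sketch below handles plain BIO tags (the real seqeval supports more tagging schemes). On the example lists above, the MISC spans differ while the PER span matches, giving an F1 of 0.5.

```python
def extract_entities(tags):
    """Collect (type, start, end_exclusive) entity spans from BIO tags."""
    entities = []
    start = etype = None
    for i, tag in enumerate(tags + ["O"]):       # "O" sentinel flushes the last span
        inside = tag.startswith("I-") and tag[2:] == etype
        if etype is not None and not inside:     # current entity ends here
            entities.append((etype, start, i))
            etype = None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities

def entity_f1(y_true, y_pred):
    """Entity-level F1: an entity is correct only on an exact span+type match."""
    true_set, pred_set = set(), set()
    for i, (t, p) in enumerate(zip(y_true, y_pred)):
        true_set |= {(i,) + e for e in extract_entities(t)}
        pred_set |= {(i,) + e for e in extract_entities(p)}
    tp = len(true_set & pred_set)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_set)
    recall = tp / len(true_set)
    return 2 * precision * recall / (precision + recall)

y_true = [["O", "O", "O", "B-MISC", "I-MISC", "I-MISC", "O"], ["B-PER", "I-PER", "O"]]
y_pred = [["O", "O", "B-MISC", "I-MISC", "I-MISC", "I-MISC", "O"], ["B-PER", "I-PER", "O"]]
score = entity_f1(y_true, y_pred)
```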
utils.eval_metrics.seqeval_precision(yTrue, yPred)[source]

Precision score for NER/sequence labelling tasks, taken from the seqeval library.

To use this metric, add seqeval_precision to the list of metrics in the task file.

Parameters:
  • yTrue (list of list) – [[‘O’, ‘O’, ‘O’, ‘B-MISC’, ‘I-MISC’, ‘I-MISC’, ‘O’], [‘B-PER’, ‘I-PER’, ‘O’]]
  • yPred (list of list) – [[‘O’, ‘O’, ‘B-MISC’, ‘I-MISC’, ‘I-MISC’, ‘I-MISC’, ‘O’], [‘B-PER’, ‘I-PER’, ‘O’]]
utils.eval_metrics.seqeval_recall(yTrue, yPred)[source]

Recall score for NER/sequence labelling tasks, taken from the seqeval library.

To use this metric, add seqeval_recall to the list of metrics in the task file.

Parameters:
  • yTrue (list of list) – [[‘O’, ‘O’, ‘O’, ‘B-MISC’, ‘I-MISC’, ‘I-MISC’, ‘O’], [‘B-PER’, ‘I-PER’, ‘O’]]
  • yPred (list of list) – [[‘O’, ‘O’, ‘B-MISC’, ‘I-MISC’, ‘I-MISC’, ‘I-MISC’, ‘O’], [‘B-PER’, ‘I-PER’, ‘O’]]
utils.eval_metrics.snips_f1_score(yTrue, yPred)[source]

F1 score for the SNIPS NER/slot-filling task, taken from the MiuLab library.

To use this metric, add snips_f1_score to the list of metrics in the task file.

Parameters:
  • yTrue (list of list) – [[‘O’, ‘O’, ‘O’, ‘B-MISC’, ‘I-MISC’, ‘I-MISC’, ‘O’], [‘B-PER’, ‘I-PER’, ‘O’]]
  • yPred (list of list) – [[‘O’, ‘O’, ‘B-MISC’, ‘I-MISC’, ‘I-MISC’, ‘I-MISC’, ‘O’], [‘B-PER’, ‘I-PER’, ‘O’]]
utils.eval_metrics.snips_precision(yTrue, yPred)[source]

Precision score for the SNIPS NER/slot-filling task, taken from the MiuLab library.

To use this metric, add snips_precision to the list of metrics in the task file.

Parameters:
  • yTrue (list of list) – [[‘O’, ‘O’, ‘O’, ‘B-MISC’, ‘I-MISC’, ‘I-MISC’, ‘O’], [‘B-PER’, ‘I-PER’, ‘O’]]
  • yPred (list of list) – [[‘O’, ‘O’, ‘B-MISC’, ‘I-MISC’, ‘I-MISC’, ‘I-MISC’, ‘O’], [‘B-PER’, ‘I-PER’, ‘O’]]
utils.eval_metrics.snips_recall(yTrue, yPred)[source]

Recall score for the SNIPS NER/slot-filling task, taken from the MiuLab library.

To use this metric, add snips_recall to the list of metrics in the task file.

Parameters:
  • yTrue (list of list) – [[‘O’, ‘O’, ‘O’, ‘B-MISC’, ‘I-MISC’, ‘I-MISC’, ‘O’], [‘B-PER’, ‘I-PER’, ‘O’]]
  • yPred (list of list) – [[‘O’, ‘O’, ‘B-MISC’, ‘I-MISC’, ‘I-MISC’, ‘I-MISC’, ‘O’], [‘B-PER’, ‘I-PER’, ‘O’]]