active_learning module¶
Module containing the active learning pipeline.
- class active_learning.ActiveLearningPipeline(data_module, model, strategy, epochs, gpus, checkpoint_dir=None, active_learning_mode=False, initial_epochs=None, items_to_label=1, iterations=None, reset_weights=False, epochs_increase_per_query=0, heatmaps_per_iteration=0, logger=True, early_stopping=False, lr_scheduler=None, model_selection_criterion='loss', deterministic_mode=True, save_model_every_epoch=False, clear_wandb_cache=False, **kwargs)[source]¶
Bases: object
The pipeline or simulation environment to run active learning experiments.
- Parameters
data_module (ActiveLearningDataModule) – A data module object providing data.
model (PytorchModel) – A model object whose architecture can be fitted to the data.
strategy (QueryStrategy) – An active learning strategy to query for new labels.
epochs (int) – The number of epochs the model should be trained.
gpus (int) – Number of GPUs to use for model training.
checkpoint_dir (str, optional) – Directory where the model checkpoints are to be saved.
early_stopping (bool, optional) – Enable/disable early stopping when the model stops improving. Defaults to False.
logger – A logger object as defined by PyTorch Lightning.
lr_scheduler (str, optional) – Algorithm used for dynamically updating the learning rate during training, e.g. ‘reduceLROnPlateau’ or ‘cosineAnnealingLR’.
active_learning_mode (bool, optional) – Enable/disable the active learning pipeline. Defaults to False.
initial_epochs (int, optional) – Number of epochs the initial model should be trained. Defaults to epochs.
items_to_label (int, optional) – Number of items that should be selected for labeling in the active learning run. Defaults to 1.
iterations (int, optional) – Number of iterations the active learning pipeline should run. If None, the pipeline runs until the whole dataset is labeled. Defaults to None.
reset_weights (bool, optional) – Enable/disable resetting of the model weights after every active learning run. Defaults to False.
epochs_increase_per_query (int, optional) – Increase number of epochs for every query to compensate for the increased training dataset size. Defaults to 0.
heatmaps_per_iteration (int, optional) – Number of heatmaps that should be generated per iteration. Defaults to 0.
deterministic_mode (bool, optional) – Whether only deterministic CUDA operations should be used. Defaults to True.
save_model_every_epoch (bool, optional) – Whether the model files of all epochs are to be saved or only the model file of the best epoch. Defaults to False.
clear_wandb_cache (bool, optional) – Whether the whole Weights and Biases cache should be deleted when the run is finished. Should only be used when no other runs are running in parallel. Defaults to False.
**kwargs – Additional, strategy-specific parameters.
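Taken together, the iteration-related parameters above imply a query schedule: the first iteration trains for initial_epochs (defaulting to epochs), each subsequent iteration queries up to items_to_label items and trains for epochs plus the accumulated epochs_increase_per_query, and with iterations=None the loop continues until the pool is exhausted. The following is a minimal sketch of that bookkeeping; query_schedule is a hypothetical helper, not part of the module's API, and the exact epoch accounting in the real pipeline may differ.

```python
from typing import Iterator, Optional, Tuple


def query_schedule(
    unlabeled_items: int,
    items_to_label: int = 1,
    iterations: Optional[int] = None,
    epochs: int = 10,
    initial_epochs: Optional[int] = None,
    epochs_increase_per_query: int = 0,
) -> Iterator[Tuple[int, int, int]]:
    """Yield (iteration, epochs_this_iteration, items_queried) tuples.

    Hypothetical illustration of the documented parameter semantics:
    iteration 0 trains for ``initial_epochs`` (falling back to ``epochs``),
    later iterations train for ``epochs`` plus the accumulated
    ``epochs_increase_per_query``, and ``iterations=None`` means the loop
    runs until every item is labeled.
    """
    iteration = 0
    while unlabeled_items > 0 and (iterations is None or iteration < iterations):
        if iteration == 0 and initial_epochs is not None:
            epochs_now = initial_epochs
        else:
            epochs_now = epochs + iteration * epochs_increase_per_query
        # Never query more items than remain unlabeled.
        queried = min(items_to_label, unlabeled_items)
        yield iteration, epochs_now, queried
        unlabeled_items -= queried
        iteration += 1
```

For example, a pool of 5 items with items_to_label=2 yields three iterations, the last querying the single remaining item.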
- static remove_wandb_cache()[source]¶
Deletes the Weights and Biases cache directory. This is necessary because the Weights and Biases client does not currently implement proper cache cleanup itself; see the corresponding GitHub issue for more details.
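The cleanup amounts to removing the client's cache directory from disk. A minimal sketch is below; the default path ~/.cache/wandb is an assumption about where the client typically stores its cache, and the real method may resolve the location differently.

```python
import shutil
from pathlib import Path
from typing import Optional


def remove_wandb_cache(cache_dir: Optional[str] = None) -> bool:
    """Delete the Weights and Biases cache directory if it exists.

    Hypothetical sketch: ``cache_dir`` defaults to ``~/.cache/wandb``
    (an assumed location). Returns True if a directory was removed.
    """
    path = Path(cache_dir) if cache_dir else Path.home() / ".cache" / "wandb"
    if path.is_dir():
        # Recursively remove the cache tree, including downloaded artifacts.
        shutil.rmtree(path)
        return True
    return False
```

As the docstring in the module notes, this should only be run when no other runs share the cache in parallel.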
- setup_trainer(epochs, iteration=None)[source]¶
Initializes a new PyTorch Lightning trainer object.
- Parameters
epochs (int) – Number of training epochs.
iteration (Optional[int], optional) – Current active learning iteration. Defaults to None.
- Returns
A trainer object.
- Return type
pytorch_lightning.Trainer