active_learning module¶
Module containing the active learning pipeline.
- class active_learning.ActiveLearningPipeline(data_module, model, strategy, epochs, gpus, checkpoint_dir=None, active_learning_mode=False, initial_epochs=None, items_to_label=1, iterations=None, reset_weights=False, epochs_increase_per_query=0, heatmaps_per_iteration=0, logger=True, early_stopping=False, lr_scheduler=None, model_selection_criterion='loss', deterministic_mode=True, save_model_every_epoch=False, clear_wandb_cache=False, **kwargs)[source]¶
- Bases: object
- The pipeline or simulation environment to run active learning experiments.
- Parameters
- data_module (ActiveLearningDataModule) – A data module object providing data. 
- model (PytorchModel) – A model object with architecture able to be fitted with the data. 
- strategy (QueryStrategy) – An active learning strategy to query for new labels. 
- epochs (int) – The number of epochs the model should be trained. 
- gpus (int) – Number of GPUs to use for model training. 
- checkpoint_dir (str, optional) – Directory where the model checkpoints are to be saved. 
- early_stopping (bool, optional) – Enable/disable early stopping when the model is no longer learning. Defaults to False. 
- logger – A logger object as defined by PyTorch Lightning. 
- lr_scheduler (str, optional) – Algorithm used for dynamically updating the learning rate during training, e.g. ‘reduceLROnPlateau’ or ‘cosineAnnealingLR’. 
- active_learning_mode (bool, optional) – Enable/disable the active learning pipeline. Defaults to False. 
- initial_epochs (int, optional) – Number of epochs the initial model should be trained. Defaults to epochs. 
- items_to_label (int, optional) – Number of items that should be selected for labeling in the active learning run. Defaults to 1. 
- iterations (int, optional) – Number of times the active learning pipeline should be executed. If None, the pipeline runs until the whole dataset is labeled. Defaults to None. 
- reset_weights (bool, optional) – Enable/disable resetting of model weights after every active learning run. Defaults to False. 
- epochs_increase_per_query (int, optional) – Number of additional training epochs for every query, to compensate for the increased training dataset size. Defaults to 0. 
- heatmaps_per_iteration (int, optional) – Number of heatmaps that should be generated per iteration. Defaults to 0. 
- deterministic_mode (bool, optional) – Whether only deterministic CUDA operations should be used. Defaults to True. 
- save_model_every_epoch (bool, optional) – Whether the model files of all epochs are to be saved or only the model file of the best epoch. Defaults to False. 
- clear_wandb_cache (bool, optional) – Whether the whole Weights and Biases cache should be deleted when the run is finished. Should only be used when no other runs are running in parallel. Defaults to False. 
- **kwargs – Additional, strategy-specific parameters. 
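The query/label/retrain cycle these parameters control can be sketched in plain Python. This is a minimal sketch, not the module's actual internals: the function `run_active_learning`, the list-based unlabeled pool, and the `history` bookkeeping are illustrative assumptions; only the parameter semantics (initial_epochs defaulting to epochs, iterations=None meaning "until the pool is exhausted", epochs_increase_per_query growing the epoch budget) come from the documentation above.

```python
# Minimal sketch of the active learning loop described above.
# The pool is just a list of item ids; training is recorded rather
# than performed, so the epoch-budget logic is easy to follow.

def run_active_learning(pool, epochs, initial_epochs=None,
                        items_to_label=1, iterations=None,
                        epochs_increase_per_query=0):
    """Simulate the query/label/retrain cycle of the pipeline."""
    labeled = []
    # The initial model is trained for initial_epochs (defaults to epochs).
    history = [("train", initial_epochs if initial_epochs is not None else epochs)]
    iteration = 0
    # If iterations is None, run until the whole pool is labeled.
    while pool and (iterations is None or iteration < iterations):
        # The query strategy selects up to items_to_label items.
        queried, pool = pool[:items_to_label], pool[items_to_label:]
        labeled.extend(queried)
        iteration += 1
        # Each query may add epochs to compensate for the larger dataset.
        history.append(("train", epochs + iteration * epochs_increase_per_query))
    return labeled, history

labeled, history = run_active_learning(
    pool=list(range(10)), epochs=2, initial_epochs=5,
    items_to_label=4, iterations=None, epochs_increase_per_query=1,
)
# All 10 items are labeled in 3 iterations (4 + 4 + 2).
```

With reset_weights enabled, each `("train", …)` step in the real pipeline would start from freshly initialized weights instead of continuing from the previous iteration's model.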
 
 - static remove_wandb_cache()[source]¶
- Deletes the Weights and Biases cache directory. This is necessary since the Weights and Biases client currently does not implement proper cache cleanup itself. See this GitHub issue for more details. 
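A cleanup of this kind can be sketched with the standard library. This is a sketch under assumptions: the helper name `remove_cache_dir` is hypothetical, and the actual cache location used by the Weights and Biases client is left to the caller rather than hard-coded.

```python
import shutil
from pathlib import Path

def remove_cache_dir(cache_dir):
    """Delete a cache directory tree if it exists.

    cache_dir is passed in explicitly; the real Weights and Biases
    cache path is an assumption left to the caller.
    """
    path = Path(cache_dir)
    if path.is_dir():
        # ignore_errors avoids failing on files removed concurrently.
        shutil.rmtree(path, ignore_errors=True)
    # Report whether the directory is gone afterwards.
    return not path.exists()
```

As the docstring above notes, a cleanup like this should only run when no other runs share the cache in parallel, since it deletes the whole tree unconditionally.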
 - setup_trainer(epochs, iteration=None)[source]¶
- Initializes a new PyTorch Lightning trainer object.
- Parameters
- epochs (int) – Number of training epochs. 
- iteration (Optional[int], optional) – Current active learning iteration. Defaults to None. 
 
- Returns
- A trainer object. 
- Return type
- pytorch_lightning.Trainer
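How such a trainer might be configured can be sketched as a plain keyword-argument builder. This is a sketch under assumptions: the reference above does not show which `pytorch_lightning.Trainer` arguments the module actually passes, the helper `build_trainer_kwargs` is hypothetical, and the per-iteration checkpoint subdirectory naming is an illustrative choice.

```python
def build_trainer_kwargs(epochs, gpus, deterministic_mode=True,
                         checkpoint_dir=None, iteration=None):
    """Assemble keyword arguments for a pytorch_lightning.Trainer.

    The keys mirror common Trainer parameters; the per-iteration
    checkpoint subdirectory below is a hypothetical naming scheme.
    """
    kwargs = {
        "max_epochs": epochs,
        "gpus": gpus,
        "deterministic": deterministic_mode,
    }
    if checkpoint_dir is not None:
        # Keep each active learning iteration's checkpoints separate.
        suffix = "" if iteration is None else f"/iteration_{iteration}"
        kwargs["default_root_dir"] = checkpoint_dir + suffix
    return kwargs

# build_trainer_kwargs(10, 1, checkpoint_dir="ckpts", iteration=2)
# points checkpoints at "ckpts/iteration_2".
```

Separating per-iteration checkpoint directories like this keeps the best model of each active learning iteration recoverable, which matters when reset_weights discards the in-memory model between iterations.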