# Overview

## Challenges of Deep Learning Experimentation
Deep learning has revolutionized many fields, but the process of conducting deep learning research is still complex and time-consuming. Practitioners often face challenges such as:
- **Boilerplate Code**: Writing repetitive code for training loops, data loading, metric calculation, and experiment setup for each new project.
- **Experiment Management**: Managing a range of hyperparameters, datasets, and architectures across different experiments.
- **Reproducibility**: Ensuring that experiments are reproducible and that results can be reliably replicated.
- **Collaboration**: Sharing and collaborating on experiments with other researchers or team members.
## Lighter: Configuration-Driven Deep Learning
Lighter is a framework designed to address these challenges and streamline deep learning experimentation. It uses YAML configuration files for experiment definition and management.
## Core Philosophy
Lighter's core philosophy is to separate configuration from code and provide a high-level, declarative way to define deep learning experiments. This is achieved through the following key design principles:
- **Declarative Experiment Definition with YAML**: Lighter embraces a declarative approach, where experiments are defined entirely through YAML configuration files. This contrasts with an imperative approach, where the experiment logic is written directly in code. The YAML configuration acts as a blueprint, specifying *what* components to use and how they are connected, without dictating the implementation details of the training loop or data processing. This separation of concerns is fundamental to Lighter's design.
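As a sketch of what "declarative" means here, a minimal configuration reduces to two top-level sections. The values below are placeholders for illustration, not a complete working example:

```yaml
# Minimal illustrative skeleton; values are placeholders.
trainer:
  _target_: pytorch_lightning.Trainer
  max_epochs: 10

system:
  _target_: lighter.System
  # components (model, optimizer, criterion, dataloaders, ...) go here
```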
- **Component-Based Architecture**: Lighter structures deep learning experiments into distinct, well-defined components:

    - `model`: The neural network architecture.
    - `optimizer`: The optimization algorithm (e.g., Adam, SGD).
    - `scheduler`: The learning rate scheduler (optional).
    - `criterion`: The loss function.
    - `metrics`: Evaluation metrics (e.g., accuracy, F1-score).
    - `dataloaders`: Data loading and preprocessing pipelines.
    - `inferer`: (Optional) Handles inference logic, such as sliding window inference.
    - `adapters`: Customize data handling, argument passing, and transformations.

    Each component is defined in the `system` section of the YAML configuration. This `system` then interacts with PyTorch Lightning's `Trainer` to orchestrate the training process.
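Wired together, a `system` section might look like the following sketch. The concrete component classes (the `torchvision`/`torchmetrics` choices) and the optimizer's parameter-reference syntax are illustrative assumptions, not prescribed by Lighter:

```yaml
system:
  _target_: lighter.System
  model:
    _target_: torchvision.models.resnet18
    num_classes: 10
  criterion:
    _target_: torch.nn.CrossEntropyLoss
  optimizer:
    _target_: torch.optim.Adam
    params: "$@system#model.parameters()"   # reference syntax is illustrative
    lr: 0.001
  metrics:
    train:
      - _target_: torchmetrics.Accuracy
        task: multiclass
        num_classes: 10
  dataloaders:
    train:
      _target_: torch.utils.data.DataLoader
      batch_size: 32
      dataset:
        _target_: torchvision.datasets.CIFAR10
        root: ./data
        download: true
```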
- **Encapsulation and Orchestration with `System`**: The `lighter.System` class acts as a central orchestrator, encapsulating all of the experiment's components. It inherits from PyTorch Lightning's `LightningModule`, providing a familiar, well-defined interface for training, validation, testing, and prediction. Crucially, the `System` class is responsible for:

    - Instantiating the components defined in the YAML configuration.
    - Connecting these components and managing the flow of data between them: passing the model's parameters to the optimizer, feeding data through the model, calculating the loss, and updating metrics. The adapters play a vital role in this data-flow management.
    - Providing hooks for custom logic (e.g., `on_train_start`).
    - Defining the `forward` method, which is used for inference, delegating to a custom `inferer` if one is specified.
    - Implementing the `_step` method, shared by all stages (`training_step`, `validation_step`, `test_step`, `predict_step`), which handles batch preparation, the forward pass, loss calculation, and metric computation.

    The `System` class does not implement the training loop itself; instead, it defines how the components interact within each step of the loop.
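The shared `_step` flow can be sketched in a few lines of plain Python. The stand-in "model" and loss below are toys for illustration; this is a conceptual outline, not Lighter's actual implementation:

```python
# Conceptual sketch of the shared `_step` flow (batch preparation, forward
# pass, loss, metrics); plain-Python stand-ins, not Lighter's implementation.
def run_step(model, criterion, metrics, batch):
    x, target = batch                                   # 1. batch preparation
    pred = model(x)                                     # 2. forward pass
    loss = criterion(pred, target)                      # 3. loss calculation
    scores = {name: m(pred, target) for name, m in metrics.items()}  # 4. metrics
    return {"loss": loss, "pred": pred, "metrics": scores}

# Toy stand-ins: a "model" that doubles its input, and mean absolute error.
model = lambda xs: [2 * v for v in xs]
mae = lambda p, t: sum(abs(a - b) for a, b in zip(p, t)) / len(p)

out = run_step(model, mae, {"mae": mae}, ([1, 2], [2, 4]))
print(out["loss"])  # → 0.0
```

The same function serves every stage; only the dataloader and logging behavior differ between training, validation, testing, and prediction.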
- **Leveraging PyTorch Lightning's `Trainer`**: Lighter builds upon PyTorch Lightning's `Trainer`, inheriting its powerful features for distributed training, mixed precision, checkpointing, and, most importantly, the training loop logic. The `trainer` section in the YAML configuration lets users configure the `Trainer` object directly, providing access to all of PyTorch Lightning's capabilities. The `Trainer` executes the training loop, calling the `System`'s methods (e.g., `training_step`, `validation_step`) at each iteration. This separation of concerns keeps Lighter's core logic focused on experiment configuration and component management.
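Because the `trainer` section maps onto `Trainer` arguments, enabling distributed training or mixed precision is a matter of setting standard PyTorch Lightning options. The values below are illustrative and assume a recent (2.x) Lightning release:

```yaml
trainer:
  _target_: pytorch_lightning.Trainer
  max_epochs: 100
  accelerator: gpu        # standard Lightning Trainer arguments
  devices: 2              # distributed training across two GPUs
  precision: 16-mixed     # mixed-precision training
  log_every_n_steps: 50
```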
- **Project-Specific Custom Modules**: Lighter is designed to be highly extensible. Users can define custom modules (models, datasets, metrics, callbacks, etc.) in their own project directory and seamlessly integrate them into Lighter experiments. The `project` key in the YAML configuration specifies the path to the project's root directory; Lighter dynamically imports the custom modules found there, making them available in the configuration file via the `_target_` key. This eliminates the need to modify Lighter's core code to add custom functionality, promoting flexibility and code organization.
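For instance, a custom model living in the project directory might be referenced as follows. The directory layout, class name, and the `project.` import prefix are assumptions for illustration and may differ in practice:

```yaml
project: ./my_project                    # path to the project's root directory
system:
  model:
    _target_: project.models.MyModel     # hypothetical custom module
    num_classes: 10
```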
- **Adapters: The Key to Task-Agnosticism**: Adapters are the design element that makes Lighter task-agnostic. They customize the flow of data between components without modifying the components themselves, acting as intermediaries that handle variations in data formats, argument orders, and pre/post-processing requirements. Lighter provides several types of adapters:

    - `BatchAdapter`: Handles variations in batch structure. Different datasets or tasks might return batches in different formats (e.g., `(input, target)`, `input`, or `{"image": input, "label": target}`). The `BatchAdapter` lets users specify how to extract the `input`, `target`, and `identifier` (e.g., a filename) from the batch, regardless of its original structure, using accessor functions, integer indices, or string keys as appropriate.
    - `CriterionAdapter` and `MetricsAdapter`: Handle variations in the argument signatures of loss functions and metrics. Some loss functions expect arguments in the order `(pred, target)`, while others expect `(target, pred)` or even include the `input`. These adapters let users specify the correct argument mapping (positional or keyword) and apply transformations (e.g., `torch.sigmoid`) before the data is passed to the criterion or metric, avoiding the need to write a wrapper function for each one.
    - `LoggingAdapter`: Handles transformations applied to data before it is logged. This is useful for tasks like converting predictions to class labels (using `torch.argmax`) or converting tensors to NumPy arrays for visualization.

    The adapter system lets Lighter handle a wide range of tasks (classification, segmentation, self-supervised learning, etc.) without changes to the core framework, making it highly adaptable to different research problems.
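To make the adapter idea concrete, here is a framework-free sketch of the two core tricks: declaring how to pull fields out of an arbitrary batch, and remapping/transforming arguments before a criterion is called. The helpers are hypothetical stand-ins, not Lighter's actual API:

```python
# Hypothetical helpers illustrating the adapter idea; not Lighter's actual API.

def make_accessor(spec):
    """Turn an int index, str key, or callable into a batch accessor."""
    if callable(spec):
        return spec
    return lambda batch: batch[spec]

def extract(batch, input_spec, target_spec):
    """BatchAdapter idea: pull (input, target) out of any batch structure."""
    return make_accessor(input_spec)(batch), make_accessor(target_spec)(batch)

def adapt(fn, order=("pred", "target"), transforms=None):
    """CriterionAdapter idea: reorder/transform arguments before calling fn."""
    transforms = transforms or {}
    def wrapped(pred, target):
        values = {"pred": pred, "target": target}
        for name, transform in transforms.items():
            values[name] = transform(values[name])
        return fn(*(values[key] for key in order))
    return wrapped

# A dict-style batch, and a criterion that expects (target, pred) order:
batch = {"image": [0.9, 2.1], "label": [1, 2]}

def mae_target_first(target, pred):
    return sum(abs(t - p) for t, p in zip(target, pred)) / len(target)

x, y = extract(batch, input_spec="image", target_spec="label")
model = lambda xs: xs                      # identity stand-in for a model
pred = model(x)
criterion = adapt(mae_target_first, order=("target", "pred"),
                  transforms={"pred": lambda xs: [round(v) for v in xs]})
loss = criterion(pred, y)                  # preds rounded, arguments reordered
```

Neither the dataset nor the criterion had to change; the adapter layer absorbed both the batch format and the argument-order mismatch.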
## Recap

- **Declarative Configuration**: Experiments are defined through YAML files.
- **Modularity**: Experiments are broken down into well-defined components.
- **Encapsulation and Orchestration**: The `System` class manages component interactions and data flow.
- **Extensibility**: Users can easily incorporate custom modules.
- **Flexibility**: The adapter system adapts Lighter to diverse tasks and data formats.
- **Simplicity**: PyTorch Lightning supplies robust training features, letting users run an experiment with a single command.