Whittle#

Whittle is a Python library for compressing large language models (LLMs) by extracting sub-networks to balance performance and efficiency. It is based on LitGPT and supports compressing many state-of-the-art models.

Neural Architecture Search: Workflows for pre-training super-networks and multi-objective search to select sub-networks.
Structural Pruning: State-of-the-art approaches to pruning structural components of pre-trained LLMs.
Distillation: Workflow to distill a student model given a trained teacher model.
Evaluation: Easy extraction of sub-network checkpoints and evaluation using LM-Eval-Harness.
Efficiency: Different metrics to estimate efficiency of sub-networks, such as latency, FLOPs, or energy consumption.
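
To make the idea of multi-objective sub-network selection concrete, here is a minimal, library-independent sketch of picking the Pareto-optimal candidates when both a quality metric and an efficiency metric should be minimized. It does not use Whittle's API; the `Candidate` type, the helper functions, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical sub-network with two objectives (lower is better for both)."""
    name: str
    loss: float        # validation loss (illustrative value, not a measurement)
    latency_ms: float  # inference latency (illustrative value, not a measurement)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on both objectives and strictly better on one."""
    no_worse = a.loss <= b.loss and a.latency_ms <= b.latency_ms
    strictly_better = a.loss < b.loss or a.latency_ms < b.latency_ms
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up candidates; "wide" is both slower and less accurate than "medium".
candidates = [
    Candidate("small", loss=3.10, latency_ms=12.0),
    Candidate("medium", loss=2.80, latency_ms=20.0),
    Candidate("wide", loss=2.95, latency_ms=25.0),
    Candidate("large", loss=2.60, latency_ms=41.0),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

Each remaining candidate represents a different trade-off between the two objectives, so the final choice depends on the latency or accuracy budget of the deployment.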

Installation#

Whittle supports and is tested on Python 3.9 to 3.11.

You can install whittle with:

pip install whittle
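
After installing, one way to sanity-check the environment is a small script like the sketch below. It uses only the Python standard library and assumes that the installed distribution is named `whittle`, matching the `pip install` command above; the helper names `python_supported` and `installed_version` are made up for this example.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

# Version range stated in this documentation: Python 3.9 to 3.11.
SUPPORTED_MIN = (3, 9)
SUPPORTED_MAX = (3, 11)


def python_supported(v: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True if the (major, minor) version lies in the supported range."""
    return SUPPORTED_MIN <= tuple(v) <= SUPPORTED_MAX


def installed_version(dist: str = "whittle") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist)
    except PackageNotFoundError:
        return None


print("Python version supported:", python_supported())
print("Installed whittle version:", installed_version())
```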

Install from source#

Install whittle from source to get the most recent version:

git clone git@github.com:whittle-org/whittle.git
cd whittle
pip install -e .

Projects that use whittle#

How to get involved#

We are more than happy to receive code contributions. If you are interested in contributing to whittle, please read our contribution guide.