Lightning Thunder is a new source-to-source compiler that speeds up PyTorch programs, both on single accelerators and in distributed environments. It aims to be usable, understandable, and extensible. Thunder achieves its speedups through the compound effect of code optimizations and best-in-class executors such as nvFuser, torch.compile, cuDNN, and TransformerEngine FP8, and has demonstrated a 40% increase in training throughput on models such as Llama 2 7B.
The compiler supports distributed strategies such as DDP and FSDP, with more capabilities in development. Thunder is still in alpha, and contributions are welcome. It can be tried without any installation in a hosted tutorial Studio, or installed via pip: from PyPI, from the main branch, or as an editable local install, as sketched below.
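For reference, a sketch of those install paths, assuming the `lightning-thunder` package name on PyPI and the Lightning-AI/lightning-thunder repository (check the project README for the current commands):

```bash
# Install the latest release from PyPI
pip install lightning-thunder

# Or install the latest changes from the main branch
pip install git+https://github.com/Lightning-AI/lightning-thunder.git

# Or clone and install in editable mode for local development
git clone https://github.com/Lightning-AI/lightning-thunder.git
cd lightning-thunder
pip install -e .
```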
Thunder’s JIT compiler optimizes Python callables and PyTorch modules, fusing operations efficiently and distributing work optimally across machines. Because it is written entirely in Python, it offers a high degree of introspection and extensibility. Compiled modules remain fully compatible with standard PyTorch and its autograd system.
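As an illustration, here is a minimal sketch of the `thunder.jit` entry point, modeled on the examples in the project's README (exact APIs may evolve while the project is in alpha):

```python
import torch
import torch.nn as nn
import thunder

def foo(a, b):
    return a + b

# Compile a plain Python callable; the result is called like the original
jfoo = thunder.jit(foo)
a = torch.randn(2, 2)
b = torch.randn(2, 2)
c = jfoo(a, b)

# Modules work the same way, and standard autograd still applies
model = nn.Sequential(nn.Linear(64, 64), nn.GELU(), nn.Linear(64, 8))
jmodel = thunder.jit(model)
jmodel(torch.randn(4, 64)).sum().backward()

# Because Thunder is pure Python, the final execution trace can be inspected
print(thunder.last_traces(jfoo)[-1])
```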
For developers, Thunder ships a comprehensive test suite, and its documentation can be built locally. It is released under the Apache 2.0 license, and its source code is open for contribution and use.
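A sketch of the local test-and-docs workflow from a clone (directory and target names are assumptions based on common pytest and Sphinx layouts; consult the repository for the exact commands):

```bash
# Run the test suite (test directory name assumed)
pytest thunder/tests -v

# Build the documentation locally (standard Sphinx Makefile assumed)
cd docs
make html
```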
Read more at GitHub…