The key innovation of MFTCoder is its ability to address common challenges in multi-task learning, including data imbalance, varying task difficulties, and inconsistent convergence speeds. It addresses these through custom loss functions that balance optimization across tasks, so that no single task dominates training.
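To make the balancing idea concrete, here is a minimal sketch of a task-level loss, not the authors' exact implementation: per-token losses are averaged within each task before being averaged across tasks, so a task that contributes only a few tokens to a batch still carries equal weight. The function name and tensor layout are illustrative assumptions.

```python
import torch

def task_balanced_loss(token_losses: torch.Tensor,
                       task_ids: torch.Tensor,
                       num_tasks: int) -> torch.Tensor:
    """Average per-token losses within each task first, then across tasks,
    so smaller tasks in the batch are not drowned out by larger ones.

    token_losses: 1-D tensor of per-token cross-entropy losses
    task_ids:     1-D tensor assigning each token to a task index
    """
    per_task_means = []
    for t in range(num_tasks):
        mask = task_ids == t
        if mask.any():  # skip tasks with no tokens in this batch
            per_task_means.append(token_losses[mask].mean())
    return torch.stack(per_task_means).mean()
```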
Experiments demonstrate MFTCoder’s superiority over the traditional approaches of individual-task fine-tuning and mixed-task fine-tuning. When implemented with CodeLlama-34B-Python as the base model, MFTCoder achieved a 74.4% pass@1 score on the HumanEval benchmark, surpassing GPT-4’s 67% zero-shot performance (as reported in the paper).
The implications are significant: this multi-task fine-tuning methodology could enable more performant and generalizable Code LLMs with efficient training. The MFTCoder framework has been adapted for popular LLMs such as CodeLlama, Qwen, Baichuan, and more.
The researchers also highlight techniques such as instruction dataset construction using Self-Instruct and efficient tokenization modes. In addition, MFTCoder integrates with PEFT methods like LoRA and QLoRA for parameter-efficient fine-tuning (see the sketch below).
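As an illustration of the kind of PEFT setup this enables, below is a minimal QLoRA-style sketch using the Hugging Face transformers and peft libraries. The base model name, LoRA rank, and target modules are illustrative assumptions rather than MFTCoder's actual configuration.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "codellama/CodeLlama-34b-Python-hf"  # illustrative base model choice

# QLoRA-style 4-bit quantization of the frozen base weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters on the attention projections; rank and targets are illustrative
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
```

The quantized base model stays frozen; only the low-rank adapter weights are updated during fine-tuning, which is what keeps the memory footprint small enough to fine-tune large code models on modest hardware.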
Overall, this study presents an important advancement in leveraging multi-task learning to boost Code LLM capabilities. The proposed MFTCoder framework could have far-reaching impact, enabling rapid development of performant models for code intelligence tasks such as completion, translation, and test case generation. Its efficiency and generalizability across diverse tasks and models make MFTCoder particularly promising.
MFTCoder is open-sourced at https://github.com/codefuse-ai/MFTCoder