Post-Training Coding LLMs with Frontier Alignment Methods
At JetBrains AI, we train coding models from scratch and own the full model lifecycle — from pretraining and midtraining to post-training and evaluation. One of our key research directions is pushing the quality of our coding assistants through state-of-the-art post-training and alignment.
In this internship project, you will work on improving small, high-performance coding language models using modern alignment techniques such as supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning. The goal is to explore and implement frontier post-training approaches that make models more helpful, more reliable, and better at real-world software engineering tasks.
This is not a toy project. You will join a team with real training infrastructure, real models, and the freedom to experiment with cutting-edge ideas. We are currently training a new coding model, and we have the resources to test ambitious research ideas at meaningful scale. Depending on the project direction, your work may involve data curation, reward modeling, synthetic data generation, preference optimization, RL-based alignment, or training recipe design for coding tasks.
This project is ideal for someone excited by the question: How do you take a strong base coding model and turn it into an exceptional assistant for developers?
## What you will work on
- Participate in post-training and alignment of coding language models.
- Explore and implement modern methods such as SFT, DPO, and RL-based approaches.
- Design and run experiments on real models and real training infrastructure.
- Work with datasets for post-training, including curation and preprocessing.
- Evaluate the impact of different alignment recipes on coding quality and model behavior.
- Collaborate with researchers and engineers working on the full LLM pipeline.
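To give a flavor of the methods listed above, here is a minimal sketch of the DPO objective in PyTorch. The function name, argument names, and tensor shapes are our own illustrative choices, not part of any JetBrains codebase; in practice the per-response log-probabilities would come from the policy and a frozen reference model.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Illustrative DPO loss (Rafailov et al., 2023).

    Each argument is a 1-D tensor of summed log-probabilities of a
    chosen or rejected response under the policy or reference model.
    """
    # Log-ratio of chosen vs. rejected under each model
    pi_logratios = policy_chosen_logps - policy_rejected_logps
    ref_logratios = ref_chosen_logps - ref_rejected_logps
    # Implicit reward margin; push it positive via a logistic loss
    logits = pi_logratios - ref_logratios
    return -F.logsigmoid(beta * logits).mean()
```

When the policy matches the reference model exactly, the margin is zero and the loss sits at log 2; training decreases it by widening the chosen/rejected gap relative to the reference.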
## Why this project is exciting
- You will work on real production-oriented coding models, not toy prototypes.
- You will apply frontier alignment methods with room for genuine research.
- You will experiment at meaningful scale using serious compute resources.
- You will contribute directly to the quality of the next generation of JetBrains coding assistants.
## Requirements
We’ll be happy to have you on this project if you have:
- A solid background in machine learning, deep learning, or NLP.
- Good programming skills in Python.
- Familiarity with at least one modern deep learning framework, preferably PyTorch.
- Basic understanding of transformers and large language models.
- Interest in recent research on LLM post-training, alignment, and reasoning.
- Ability to read technical papers and turn ideas into working experiments with guidance from the team.
- Attention to detail, curiosity, and good communication skills.
## Nice to have
- Experience with LLM fine-tuning or alignment methods.
- Familiarity with Hugging Face, Weights & Biases, or distributed training tools.
- Experience with coding assistants, code generation, or program synthesis.
- Previous research, coursework, or open-source projects in ML/NLP.