“Attention Is All You Need” is a seminal 2017 paper by Vaswani et al., introducing the Transformer architecture. It replaces recurrence with self-attention, enabling parallelization and significantly improving training efficiency.
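As a quick illustration of the mechanism, here is a minimal NumPy sketch of the scaled dot-product attention the paper builds on; the single matrix product over all positions is what makes the computation parallelizable across the sequence (names and shapes here are illustrative, and multi-head projections are omitted).

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    Q, K: (seq_len, d_k); V: (seq_len, d_v).
    All positions are attended to in one matrix product, so there is no
    sequential recurrence over time steps.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # (seq_len, seq_len) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over key positions
    return weights @ V                                     # weighted sum of values

# Toy self-attention: 4 tokens with 8-dimensional representations, Q = K = V = x.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(x, x, x).shape)        # (4, 8)
```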
This paper evaluates several large language models trained on code, such as Codex and GPT-Neo, and compares their capabilities across various programming tasks, providing insights into the effectiveness of code-specific pretraining.
StarCoder is an open-weight model trained on permissively licensed code from GitHub. It supports code completion, infilling, and other generation tasks, and is a strong contender in open LLM research for software engineering.
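As an illustration only, the sketch below shows what a fill-in-the-middle prompt for such a model might look like through the Hugging Face transformers API; the checkpoint name and the FIM sentinel strings are assumptions about the public StarCoder release and may differ from the actual tokenizer vocabulary.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint id and FIM sentinel tokens; verify against the released tokenizer.
checkpoint = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prefix = "def fibonacci(n):\n    "
suffix = "\n    return a"
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48)
# Print only the newly generated middle span.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:]))
```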
This paper proposes an efficient Transformer decoding strategy that shares a single key/value projection (a single "write-head") across all query heads. The method enables faster incremental inference without compromising model quality, offering practical improvements for real-time applications.
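The following is a minimal NumPy sketch of that idea, assuming the single write-head means one key/value projection shared across all query heads (multi-query attention); it omits causal masking and the incremental key/value cache that delivers the actual decoding speedup.

```python
import numpy as np

def multi_query_attention(x, Wq, Wk, Wv):
    """Per-head query projections, but a single shared key/value projection,
    so the key/value tensors read at every decoding step stay small.

    x:  (seq_len, d_model)
    Wq: (num_heads, d_model, d_head)   one query projection per head
    Wk: (d_model, d_head)              shared key projection
    Wv: (d_model, d_head)              shared value projection
    """
    d_head = Wk.shape[-1]
    K = x @ Wk                                    # (seq_len, d_head), shared by all heads
    V = x @ Wv                                    # (seq_len, d_head), shared by all heads
    outputs = []
    for Wq_h in Wq:
        Q = x @ Wq_h                              # (seq_len, d_head)
        scores = Q @ K.T / np.sqrt(d_head)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        outputs.append(weights @ V)
    return np.concatenate(outputs, axis=-1)       # (seq_len, num_heads * d_head)

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 16))                      # 6 tokens, d_model = 16
Wq = rng.normal(size=(4, 16, 4))                  # 4 heads, d_head = 4
Wk = rng.normal(size=(16, 4))
Wv = rng.normal(size=(16, 4))
print(multi_query_attention(x, Wq, Wk, Wv).shape) # (6, 16)
```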
InCoder is a unified autoregressive model for code synthesis and infilling. Unlike traditional left-to-right models, InCoder supports flexible editing and is effective at filling in missing code spans with high-quality completions.
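To make the infilling setup concrete, here is a small sketch of a causal-masking style prompt format in the spirit of InCoder: the missing span is replaced by a sentinel and generated left-to-right at the end of the sequence. The sentinel strings below are placeholders, not the model's actual special tokens.

```python
# Placeholder sentinels for illustration only.
MASK = "<mask:0>"
END_OF_MASK = "<end-of-mask>"

def build_infilling_prompt(prefix: str, suffix: str) -> str:
    """The model sees prefix, a sentinel, the suffix, then the sentinel again,
    and generates the missing middle until it emits an end-of-mask marker."""
    return f"{prefix}{MASK}{suffix}{MASK}"

prefix = "def mean(xs):\n    total = "
suffix = "\n    return total / len(xs)"
print(build_infilling_prompt(prefix, suffix))
# A model trained with this objective would complete the prompt with something
# like "sum(xs)" followed by the end-of-mask marker.
```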