Resources

Attention Is All You Need

“Attention Is All You Need” is the seminal 2017 paper by Vaswani et al. that introduced the Transformer architecture. The Transformer dispenses with recurrence in favor of self-attention, which lets every position in a sequence be processed in parallel and significantly improves training efficiency.
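
To make the mechanism concrete, here is a minimal NumPy sketch of single-head scaled dot-product self-attention, the core operation of the paper; the shapes and random inputs are illustrative choices, not values from the paper.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x:             (seq_len, d_model) token embeddings
    w_q, w_k, w_v: (d_model, d_k) projection matrices
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v               # project tokens to queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])           # pairwise similarities, scaled by sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over key positions
    return weights @ v                                 # each output is a weighted mix of all values

# Illustrative example: 4 tokens, d_model = d_k = 8
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)  # shape (4, 8)
```

Because every token attends to every other token in a single matrix multiplication, the whole sequence can be processed in parallel, which is where the training-efficiency gain comes from.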

Read the Paper

Evaluating Large Language Models Trained on Code

This paper introduces Codex, a GPT language model fine-tuned on publicly available code, together with the HumanEval benchmark used to evaluate it. It compares Codex with general-purpose models such as GPT-Neo on Python programming tasks, providing insight into the effectiveness of code-specific pretraining.
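
The paper's headline metric is pass@k, reported with an unbiased estimator computed from n sampled completions per problem, of which c pass the unit tests. A small sketch of that estimator follows; the example numbers at the end are made up for illustration.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples drawn from the n generated samples passes the unit tests."""
    if n - c < k:
        return 1.0                      # every size-k subset contains a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative numbers: 200 samples per problem, 30 pass the tests
print(pass_at_k(200, 30, 1))   # 0.15
print(pass_at_k(200, 30, 10))  # roughly 0.81
```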

Read the Paper

StarCoder: may the source be with you!

StarCoder is an open-weight model trained on permissively licensed source code from GitHub. It supports code completion, fill-in-the-middle infilling, and other generation tasks, making it a strong open alternative for LLM research in software engineering.
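
A minimal generation sketch using the Hugging Face transformers library, assuming the checkpoint is published on the Hub under the bigcode/starcoder identifier (access may require accepting the model license, and loading the full model needs substantial memory):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Plain left-to-right completion of a code prompt
prompt = "def fibonacci(n):\n    "
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```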

Read the Paper

Fast Transformer Decoding: One Write-Head is All You Need

This paper introduces multi-query attention, a Transformer variant in which all attention heads share a single set of keys and values. Shrinking the key-value memory in this way reduces the memory bandwidth needed during incremental decoding, enabling much faster inference with little loss in quality and offering practical gains for real-time applications.
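
A NumPy-only sketch of that layout with illustrative shapes: queries keep one projection per head, while a single key and value projection is shared by all heads.

```python
import numpy as np

def multi_query_attention(x, w_q, w_k, w_v, n_heads):
    """Multi-query attention: every query head attends over one shared
    key/value head, so the key-value cache is n_heads times smaller.

    x:   (seq_len, d_model)
    w_q: (d_model, n_heads * d_head)  per-head query projections
    w_k: (d_model, d_head)            single shared key projection
    w_v: (d_model, d_head)            single shared value projection
    """
    seq_len = x.shape[0]
    d_head = w_k.shape[1]
    q = (x @ w_q).reshape(seq_len, n_heads, d_head)
    k = x @ w_k                                        # (seq_len, d_head), shared by all heads
    v = x @ w_v                                        # (seq_len, d_head), shared by all heads
    scores = np.einsum("qhd,kd->hqk", q, k) / np.sqrt(d_head)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over key positions
    out = np.einsum("hqk,kd->qhd", weights, v)
    return out.reshape(seq_len, n_heads * d_head)

# Illustrative example: 4 tokens, d_model 16, 4 heads of size 8
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))
w_q = rng.normal(size=(16, 32))
w_k = rng.normal(size=(16, 8))
w_v = rng.normal(size=(16, 8))
out = multi_query_attention(x, w_q, w_k, w_v, n_heads=4)  # shape (4, 32)
```

During step-by-step decoding, the cached keys and values shrink from (seq_len, n_heads, d_head) to (seq_len, d_head), which is where the memory-bandwidth savings and the speedup come from.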

Read the Paper

InCoder: A Generative Model for Code Infilling and Synthesis

InCoder is a unified autoregressive model for code synthesis and infilling. It is trained with a causal-masking objective in which spans of code are masked and moved to the end of the sequence, so unlike traditional left-to-right models it can both generate code and fill in missing spans with high-quality completions.
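
A toy sketch of the idea behind infilling with a left-to-right decoder: mark the missing span in place, then ask the model to generate it at the end of the sequence. The sentinel strings here are placeholders for illustration, not InCoder's actual special tokens.

```python
def make_infilling_prompt(prefix: str, suffix: str, mask: str = "<MASK>") -> str:
    """Rewrite a document with a missing middle as a left-to-right prompt:
    the model sees the prefix, a sentinel, then the suffix, and is asked to
    produce the masked span after a second sentinel at the end."""
    return f"{prefix}{mask}{suffix}{mask}"

prefix = "def add(a, b):\n    "
suffix = "\n\nprint(add(2, 3))\n"
prompt = make_infilling_prompt(prefix, suffix)
# A model trained on this kind of rearranged data would continue the prompt
# with the missing body (for example "return a + b") followed by an
# end-of-infill marker.
```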

Read the Paper