Inside GitHub: Working with the LLMs behind GitHub Copilot

General-purpose code generation was considered too difficult because existing models could not reliably produce useful completions, and an early chatbot prototype proved to be an inferior modality compared to in-IDE code completion.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Developer edits code in IDE

GitHub Copilot runs inside the IDE, a modality the source describes as interactive and useful in almost every situation.

Tools used

GitHub Copilot, GPT-3, Codex

Outcome

By adding neighboring-tab context retrieval and file path headers to prompts, and by fine-tuning the Codex model on users' codebases, GitHub Copilot achieved a large lift in acceptance rate and characters retained. The underlying model also improved to solve upwards of 90 percent of benchmark coding problems.
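To make the two prompt techniques concrete, here is a minimal sketch of how a file path header and neighboring-tab snippets could be assembled into a completion prompt. It is an illustration under stated assumptions, not GitHub's implementation: the function names (`build_prompt`, `best_snippet`), the `# Path:` and `# Compare this snippet from` comment wording, the 10-line window, and the use of Jaccard similarity over identifier tokens are all hypothetical choices made for this example.

```python
import re
from typing import Dict, List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Extract identifier-like tokens from a piece of code."""
    return set(re.findall(r"[A-Za-z_]\w*", text))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_snippet(source: str, query: Set[str], window: int = 10) -> Tuple[float, str]:
    """Slide a fixed-size line window over a file and return the
    best-scoring window together with its similarity score."""
    lines = source.splitlines()
    best: Tuple[float, str] = (0.0, "")
    for start in range(0, max(1, len(lines) - window + 1)):
        chunk = "\n".join(lines[start:start + window])
        score = jaccard(tokens(chunk), query)
        if score > best[0]:
            best = (score, chunk)
    return best


def build_prompt(path: str, prefix: str, open_tabs: Dict[str, str],
                 max_snippets: int = 2) -> str:
    """Assemble a prompt: path header, then commented-out snippets from
    other open tabs, then the code before the cursor.

    Assumption: '#' is a valid comment marker for the target language.
    """
    query = tokens(prefix)
    scored: List[Tuple[float, str, str]] = []
    for tab_path, text in open_tabs.items():
        score, chunk = best_snippet(text, query)
        if score > 0:  # skip tabs that share no tokens with the prefix
            scored.append((score, tab_path, chunk))
    scored.sort(key=lambda item: item[0], reverse=True)

    # The path header hints at the language and the file's role.
    parts = [f"# Path: {path}"]
    for _, tab_path, chunk in scored[:max_snippets]:
        parts.append(f"# Compare this snippet from {tab_path}:")
        parts.extend("# " + line for line in chunk.splitlines())
    parts.append(prefix)
    return "\n".join(parts)
```

In this sketch the path header alone gives the model a language hint through the file extension, which is one plausible way to address wrong-language suggestions, while the commented snippets supply context from files the developer already has open.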

What failed first

A static chatbot prototype for answering coding questions was abandoned in favour of in-IDE completion, and early model versions frequently suggested code in the wrong programming language.

Results

Volume: upwards of 90 percent of benchmark coding problems solved

Source

https://github.blog/ai-and-ml/github-copilot/inside-github-working-with-the-llms-behind-github-copilot/

Grounding & classification

Source type: technical build writeup

20 fields verified against source quotes.

code generation, personalization, rag, failure mode described, human review described, metric backed, named customer, source backed, tools described, workflow described, software, accuracy improvement, employee productivity, technical build writeup