Inside GitHub: Working with the LLMs behind GitHub Copilot
General-purpose code generation was considered too difficult because existing models could not reliably produce useful completions, and an early chatbot prototype proved to be an inferior modality compared to in-IDE code completion.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Developer edits code in IDE
GitHub Copilot runs inside the IDE, where in-editor completion is described as interactive and useful in almost every situation.
Tools used
GitHub Copilot, GPT-3, Codex
Outcome
By incorporating neighboring-tab context retrieval and file path headers into prompts, and by fine-tuning the Codex model on users' codebases, GitHub Copilot achieved a large lift in acceptance rate and characters retained. The underlying model also improved to solve upwards of 90 percent of benchmark coding problems.
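The two prompt techniques named above can be illustrated with a minimal sketch. This is not GitHub's implementation: the function names (`build_prompt`, `jaccard`), the `Path:` and `Compare this snippet` comment wording, and the use of token-overlap (Jaccard) similarity to rank open tabs are all assumptions made for illustration. The idea shown is only that the prompt starts with a file path header (which hints at the language and the file's role) and then includes the most relevant text from other open files before the code being completed.

```python
import re


def tokens(text):
    """Split text into a set of identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_]\w*", text))


def jaccard(a, b):
    """Token-overlap similarity between two pieces of text (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def build_prompt(path, prefix, open_tabs, max_snippets=2, comment="#"):
    """Assemble a completion prompt (hypothetical layout).

    path      -- path of the file being edited, emitted as a header comment
    prefix    -- code before the cursor in the current file
    open_tabs -- mapping of other open file paths to their text
    """
    # Rank neighboring tabs by similarity to the code being written.
    ranked = sorted(
        open_tabs.items(),
        key=lambda item: jaccard(prefix, item[1]),
        reverse=True,
    )

    parts = [f"{comment} Path: {path}"]
    for tab_path, text in ranked[:max_snippets]:
        if jaccard(prefix, text) == 0.0:
            continue  # skip tabs that share nothing with the current code
        parts.append(f"{comment} Compare this snippet from {tab_path}:")
        parts.extend(f"{comment} {line}" for line in text.splitlines())

    # The code before the cursor goes last so the model continues from it.
    parts.append(prefix)
    return "\n".join(parts)
```

A call such as `build_prompt("app/users.py", prefix, {"utils/db.py": db_source})` would return a single string with the path header first, the commented-out neighboring snippet next, and the current file's prefix last. A production system would also need to cap the total prompt length, which this sketch leaves out.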
What failed first
A static chatbot prototype for answering coding questions was abandoned in favour of in-IDE completion, and early model versions frequently suggested code in the wrong programming language.
Results
Volume: upwards of 90 percent
Grounding & classification
Source type: technical build writeup
20 fields verified against source quotes.
code generation, personalization, RAG, failure mode described, human review described, metric backed, named customer, source backed, tools described, workflow described, software, accuracy improvement, employee productivity, technical build writeup