Grammarly builds CoEdIT: an instruction-tuned LLM for text editing that outperforms GPT3-Edit while being up to 60 times smaller
General-purpose LLMs were trained for broad text-generation tasks and lacked instruction tuning for text editing, limiting their usability, quality, and performance on well-scoped editing tasks.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Identify LLM research gaps
Gaps in developing general-purpose text editing models using LLMs motivated the CoEdIT project.
Tools used
CoEdIT, GPT3-Edit, ChatGPT
Outcome
CoEdIT achieves state-of-the-art performance on multiple benchmark test sets while being up to 60 times smaller than comparable LLMs. Human evaluators preferred CoEdIT's output 64 percent of the time compared to 10 percent for GPT3-Edit.
What failed first
Prior text editing LLMs suffered from four identified gaps: no instruction tuning, undersized models, highly general (not task-specific) training datasets, and lack of public availability.
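To make the instruction-tuning and task-specific-data gaps concrete, the following is a minimal sketch of how a single editing example might be turned into an instruction-tuning pair. The task labels, instruction wording, and the `EditExample` and `to_instruction_pair` names are all hypothetical and chosen for illustration; they are not taken from the CoEdIT writeup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditExample:
    task: str    # hypothetical task label, e.g. "gec" for grammar correction
    source: str  # text before editing
    target: str  # text after editing


# Hypothetical instruction templates, one per editing task (assumed wording).
INSTRUCTIONS = {
    "gec": "Fix the grammar in this sentence",
    "simplify": "Make this sentence simpler",
}


def to_instruction_pair(example: EditExample) -> tuple[str, str]:
    """Return (model_input, expected_output) for one editing example."""
    instruction = INSTRUCTIONS[example.task]
    # The instruction is prepended to the source text; the target is unchanged.
    return f"{instruction}: {example.source}", example.target


pair = to_instruction_pair(
    EditExample("gec", "She go to school.", "She goes to school.")
)
print(pair[0])
print(pair[1])
```

Keeping the instruction in the input text, rather than in a separate field, means the same sequence-to-sequence model can be steered toward different editing tasks purely by changing the prompt.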
Results
Volume: 64 percent
Grounding & classification
Source type: technical build writeup
17 fields verified against source quotes, 2 dropped as unverifiable.
Tags: content generation, human review described, metric backed, source backed, tools described, software, accuracy improvement, technical build writeup