Grammarly builds CoEdIT: an instruction-tuned LLM for text editing that outperforms GPT3-Edit while being up to 60 times smaller

General-purpose LLMs were trained for broad text-generation tasks and lacked instruction tuning for text editing, limiting their usability, quality, and performance on well-scoped editing tasks.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Identify LLM research gaps

Gaps in developing general-purpose text editing models using LLMs motivated the CoEdIT project.

Tools used

CoEdIT, GPT3-Edit, ChatGPT

Outcome

CoEdIT achieves state-of-the-art performance on multiple benchmark test sets while being up to 60 times smaller than comparable LLMs. Human evaluators preferred CoEdIT's output 64 percent of the time compared to 10 percent for GPT3-Edit.

What failed first

Prior text editing LLMs suffered from four identified gaps: no instruction tuning, undersized models, highly general (not task-specific) training datasets, and lack of public availability.
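
To make the first and third gaps concrete, here is a minimal sketch of what a task-specific, instruction-formatted training record for text editing could look like. The instruction wording, the task keys, and the names `INSTRUCTIONS`, `EditExample`, and `build_example` are hypothetical and chosen for illustration; they are not taken from the CoEdIT dataset or code.

```python
from dataclasses import dataclass

# Hypothetical instruction templates, one per editing task (illustrative only).
INSTRUCTIONS = {
    "gec": "Fix grammatical errors in this sentence",
    "simplify": "Make this sentence simpler",
    "paraphrase": "Paraphrase this sentence",
}


@dataclass(frozen=True)
class EditExample:
    prompt: str  # instruction plus source text, given to the model as input
    target: str  # edited text the model is expected to produce


def build_example(task: str, source: str, target: str) -> EditExample:
    """Prefix the source text with the instruction for the given editing task."""
    if task not in INSTRUCTIONS:
        raise ValueError(f"unknown editing task: {task!r}")
    return EditExample(prompt=f"{INSTRUCTIONS[task]}: {source}", target=target)


example = build_example("gec", "She go to school.", "She goes to school.")
print(example.prompt)
```

Pairing each source sentence with an explicit instruction is what separates this kind of record from a generic text-generation corpus: the same sentence can appear under several tasks, each with a different expected edit.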

Results

Volume: 64 percent (share of human evaluations that preferred CoEdIT's output)

Source

https://www.grammarly.com/blog/engineering/coedit-text-editing/

Grounding & classification

Source type: technical build writeup

17 fields verified against source quotes, 2 dropped as unverifiable.

content generation, human review described, metric backed, source backed, tools described, software, accuracy improvement, technical build writeup