Wix runs 250 AI agent evals comparing docs vs skills for developer task completion

As AI agents became an increasingly important audience for developer documentation, Wix set out to test a widely held but unexamined assumption: that purpose-built skills are superior to docs for guiding agents. Independent teams were building skills without coordinating with the underlying documentation, producing a parallel layer at risk of drifting from the actual product.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases rather than tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Task assigned to sandboxed agent

Sandboxed AI agents are assigned developer tasks under different documentation access conditions.
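To make the comparison concrete, here is a minimal sketch of how runs under different documentation access conditions might be tallied. The source does not describe Wix's harness, so the `EvalRun` record, the `summarize` helper, the condition labels, and the sample numbers are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class EvalRun:
    condition: str   # hypothetical label, e.g. "docs" or "skills"
    completed: bool  # whether the agent finished the assigned task
    tokens: int      # tokens consumed during the run


def summarize(runs):
    """Group runs by condition; return completion rate and mean tokens for each."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run.condition].append(run)
    summary = {}
    for condition, group in grouped.items():
        summary[condition] = {
            "runs": len(group),
            "completion_rate": sum(r.completed for r in group) / len(group),
            "mean_tokens": sum(r.tokens for r in group) / len(group),
        }
    return summary


# Made-up runs for illustration only; these are not Wix's results.
sample_runs = [
    EvalRun("docs", True, 1200),
    EvalRun("docs", False, 1800),
    EvalRun("skills", True, 900),
    EvalRun("skills", True, 1100),
]
report = summarize(sample_runs)
```

Keeping the condition as a plain field on each run means the same task can be replayed under every condition and compared on the same two axes the writeup reports: completion and token usage.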

Tools used

Wix MCP

Outcome

Agent-optimized docs improved CLI task completion from 67% to 87% while cutting token usage by 35%. Well-aligned skills reduced tokens by 30-50% and time by 30%. The team adopted a framework treating optimized docs as the backbone and skills as a caching layer for common tasks.
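The completion figures above can be read either as an absolute gain in percentage points or as a relative gain over the baseline. The short sketch below works through that arithmetic. The `improvement` helper is a hypothetical name used only for illustration; the 67 and 87 inputs are the completion rates reported in the source.

```python
def improvement(before, after):
    """Return (absolute gain in percentage points, relative gain as a fraction)."""
    points = after - before
    relative = points / before
    return points, relative


# CLI task completion rates reported in the source: 67% before, 87% after.
points, relative = improvement(67, 87)
print(f"{points} percentage points, {relative:.1%} relative improvement")
```

The gain is 20 percentage points, which is roughly a 30% relative improvement over the 67% baseline.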

What failed first

Skills with small mistakes—misaligned project scaffolding, errors in code snippets, and best-practice bloat—eroded their advantage over docs and in some cases dramatically increased token usage.

Results

Time saved: 9%

Volume: 67%

Source

https://www.wix.engineering/post/we-ran-250-ai-agent-evals-to-find-out-if-skills-beat-docs-the-answer-is-more-complicated-than-we-ex

Grounding & classification

Source type: technical build writeup

31 fields verified against source quotes, 2 dropped as unverifiable.

Tags: agentic workflow, code generation, rag, knowledge base, failure mode described, metric backed, named customer, tools described, workflow described, software, accuracy improvement, automation rate, time saved, technical build writeup, agentic task execution