ClippingGPT: RAG-based AI tutor outperforms GPT-4 by 26% on Brazil's diplomatic career exam

LLMs like ChatGPT operate as language models rather than knowledge bases, making them prone to hallucinations, outdated content, and linguistic bias — rendering them unreliable for high-stakes educational settings where accuracy is essential.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Knowledge base indexing
Source documents are split into chunks and each chunk is transformed into an embedding.
Tools used
GPT-4, OpenAI's Embeddings API, OpenAI's Completion API, Redis
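The indexing stage described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `chunk_text`, `toy_embed`, and `build_index` are hypothetical names, the word-window sizes are arbitrary, and `toy_embed` is a hashed bag-of-words placeholder standing in for a real embedding model such as the one behind OpenAI's Embeddings API. A plain Python list stands in for the vector store (Redis in the source).

```python
import hashlib
import math


def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative assumptions)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window already reaches the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Placeholder embedding: hashed bag of words, L2-normalised.

    Assumption: a real pipeline would call an embedding model here instead.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_index(documents: dict[str, str], embed=toy_embed) -> list[dict]:
    """Chunk every document and store each chunk next to its embedding."""
    index = []
    for doc_id, text in documents.items():
        for i, chunk in enumerate(chunk_text(text)):
            index.append({"id": f"{doc_id}:{i}", "text": chunk, "vector": embed(chunk)})
    return index
```

Passing a different `embed` callable lets the same loop run against a real embedding service, and the resulting records could then be written to a vector store instead of kept in memory.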
Outcome

ClippingGPT, trained on a proprietary knowledge base via embeddings, achieved 23rd place with a score of 597.79, outperforming GPT-4 by 26%.

What failed first

GPT-4 alone scored 473.8, finishing at 177th place and failing to qualify in the diplomatic career entrance exam.
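The 26% figure can be checked from the two reported scores. The short calculation below assumes the improvement is measured relative to GPT-4's score; `relative_gain` is a hypothetical helper name.

```python
def relative_gain(new_score: float, baseline_score: float) -> float:
    """Improvement of new_score over baseline_score, as a fraction of the baseline."""
    return (new_score - baseline_score) / baseline_score


# Scores as reported: ClippingGPT 597.79, GPT-4 alone 473.8.
gain = relative_gain(597.79, 473.8)
print(f"{gain:.1%}")  # prints 26.2%
```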

Results
Volume: 26%
Running since: 2018
Source

https://medium.com/@rafael_pinheiro/building-with-gpt-for-education-how-we-built-an-ai-tutor-that-aced-the-most-complex-exam-in-latam-19fabf8b746b

Grounding & classification
Source type: technical build writeup
23 fields verified against source quotes.
conversational ai, knowledge search, rag, knowledge base, builder submitted, human review described, metric backed, named customer, production runtime claimed, tools described, workflow described, education, accuracy improvement, technical build writeup, rag answering