
ClippingGPT: RAG-based AI tutor outperforms GPT-4 by 26% on Brazil's diplomatic career exam

LLMs like ChatGPT operate as language models rather than knowledge bases, which makes them prone to hallucinations, outdated content, and linguistic bias. As a result, they are unreliable for high-stakes educational settings where accuracy is essential.

How it works

Common implementation structure

This is how this type of workflow is generally built, generalized across documented cases rather than tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Knowledge base indexing

Source documents are split into chunks and each chunk is transformed into an embedding.
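
The indexing step can be illustrated with a minimal sketch. It is not the ClippingGPT implementation: the helper names (`chunk_text`, `toy_embed`, `build_index`), the chunk size, and the overlap are all assumptions made for illustration, and `toy_embed` is a hashed bag-of-words stand-in so the example runs offline. A real pipeline would replace it with a call to an embeddings service and write the vectors to a store such as Redis.

```python
import hashlib
import math


def chunk_text(text, max_words=50, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once this window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


def toy_embed(text, dim=64):
    """Hypothetical stand-in for an embedding model: hashed bag of words,
    L2-normalized. It captures no semantics; it only shows the data shape."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_index(documents, max_words=50, overlap=10):
    """Return one record per chunk: source id, position, text, embedding."""
    index = []
    for doc_id, text in documents.items():
        for i, chunk in enumerate(chunk_text(text, max_words, overlap)):
            index.append({
                "doc_id": doc_id,
                "chunk_id": i,
                "text": chunk,
                "embedding": toy_embed(chunk),
            })
    return index
```

Overlapping windows are one common choice here: they reduce the chance that a sentence needed to answer a question is cut in half at a chunk boundary.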

Tools used

GPT-4, OpenAI's Embeddings API, OpenAI's Completion API, Redis

Outcome

ClippingGPT, trained on a proprietary knowledge base via embeddings, achieved 23rd place with a score of 597.79, outperforming GPT-4 by 26%.

What failed first

GPT-4 alone scored 473.8, finishing at 177th place and failing to qualify in the diplomatic career entrance exam.
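
The 26% figure can be checked against the two reported scores. The short sketch below assumes the figure is a relative improvement over the GPT-4 baseline; the function name is illustrative.

```python
def relative_improvement(new_score, baseline_score):
    """Percentage gain of new_score over baseline_score."""
    return (new_score - baseline_score) / baseline_score * 100.0


clipping_gpt_score = 597.79  # ClippingGPT, 23rd place
gpt4_alone_score = 473.8     # GPT-4 alone, 177th place

gain = relative_improvement(clipping_gpt_score, gpt4_alone_score)
print(f"{gain:.1f}%")  # prints 26.2%
```

Under that reading, the reported 26% is consistent with the scores: a gain of 123.99 points over a baseline of 473.8.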

Results

Volume: 26%

Running since: 2018

Source

https://medium.com/@rafael_pinheiro/building-with-gpt-for-education-how-we-built-an-ai-tutor-that-aced-the-most-complex-exam-in-latam-19fabf8b746b


Grounding & classification

Source type: technical build writeup

23 fields verified against source quotes.

Tags: conversational ai, knowledge search, rag, knowledge base, builder submitted, human review described, metric backed, named customer, production runtime claimed, tools described, workflow described, education, accuracy improvement, technical build writeup, rag answering