ClippingGPT: RAG-based AI tutor outperforms GPT-4 by 26% on Brazil's diplomatic career exam
LLMs like ChatGPT operate as language models rather than knowledge bases, which makes them prone to hallucinations, outdated content, and linguistic bias. As a result, they are unreliable for high-stakes educational settings where accuracy is essential.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Knowledge base indexing
Source documents are split into chunks and each chunk is transformed into an embedding.
Tools used
GPT-4, OpenAI's Embeddings API, OpenAI's Completion API, Redis
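The indexing stage described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ClippingGPT implementation: `chunk_text`, `toy_embed`, `build_index`, and `search` are hypothetical names, the hashed bag-of-words vector is only a stand-in for a real embedding model such as the one behind OpenAI's Embeddings API, and a plain in-memory list stands in for a vector store such as Redis. The chunk size and overlap are arbitrary example values.

```python
import hashlib
import math


def chunk_text(text, max_words=50, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text, dim=64):
    """Stand-in embedding: hashed bag-of-words, L2-normalized.

    ASSUMPTION: a real system would call an embedding model here instead.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(documents):
    """Chunk every document and store each chunk next to its embedding."""
    index = []
    for doc_id, text in documents.items():
        for chunk_id, chunk in enumerate(chunk_text(text)):
            index.append({
                "doc_id": doc_id,
                "chunk_id": chunk_id,
                "text": chunk,
                "embedding": toy_embed(chunk),
            })
    return index


def search(index, query, k=3):
    """Return the k chunks whose embeddings are most similar to the query."""
    q = toy_embed(query)

    def similarity(entry):
        # Vectors are unit length, so the dot product is the cosine similarity.
        return sum(a * b for a, b in zip(q, entry["embedding"]))

    return sorted(index, key=similarity, reverse=True)[:k]


if __name__ == "__main__":
    docs = {
        "history": "treaty of tordesillas divided lands",
        "biology": "photosynthesis converts light energy",
    }
    index = build_index(docs)
    for hit in search(index, "treaty of tordesillas divided lands", k=1):
        print(hit["doc_id"], hit["text"])
```

In a retrieval-augmented setup, the chunks returned by a search like this would be placed in the prompt sent to the completion model, so that answers are grounded in the indexed material rather than in the model's own recall.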
Outcome
ClippingGPT, trained on a proprietary knowledge base via embeddings, achieved 23rd place with a score of 597.79, outperforming GPT-4 by 26%.
What failed first
GPT-4 alone scored 473.8, finishing at 177th place and failing to qualify in the diplomatic career entrance exam.
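The 26% figure follows from the two reported scores. A short check, assuming the improvement is measured relative to GPT-4's score (the function name `relative_gain` is introduced here purely for illustration):

```python
def relative_gain(new_score, baseline_score):
    """Percentage improvement of new_score over baseline_score."""
    return (new_score - baseline_score) / baseline_score * 100.0


clipping_gpt = 597.79  # ClippingGPT score reported above
gpt4_alone = 473.8     # GPT-4 score reported above

gain = relative_gain(clipping_gpt, gpt4_alone)
print(f"{gain:.1f}%")  # prints 26.2%
```

The absolute gap is 123.99 points, which is about 26.2% of the GPT-4 baseline and is reported as 26% after rounding.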
Results
Volume: 26%
Running since: 2018
Grounding & classification
Source type: technical build writeup
23 fields verified against source quotes.
conversational ai, knowledge search, rag, knowledge base, builder submitted, human review described, metric backed, named customer, production runtime claimed, tools described, workflow described, education, accuracy improvement, technical build writeup, rag answering