
We Built a News Site Powered by LLMs and Public Data: Here's What We Learned

As the volume and pace of data grow, making sense of it all has become a challenge for data journalists, and there is no scalable way to cover constantly updating datasets such as economic indicators, political polls, and environmental data.

How it works
Common implementation structure
This structure shows how this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear under “Tools used” below.
Stage 1 · Poll data sources for updates
A pipeline continually asks data sources whether they have new data, diffing the known state against the latest published data.
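The polling-and-diffing idea in this stage can be illustrated with a minimal Python sketch. The source does not describe how Realtime implements it, so everything here is an assumption made for illustration: the names `fingerprint` and `poll_once` are hypothetical, each data source is modeled as a plain function that returns a list of records, and the "known state" is an in-memory dict of content hashes rather than a real datastore.

```python
import hashlib
import json
from typing import Callable, Dict, List


def fingerprint(records: List[dict]) -> str:
    """Hash a dataset in a stable way, so identical data gives an identical hash."""
    # sort_keys makes the serialization independent of dict key order.
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def poll_once(
    sources: Dict[str, Callable[[], List[dict]]],
    known_state: Dict[str, str],
) -> Dict[str, List[dict]]:
    """Ask every source for its latest data and return only the sources that changed.

    known_state maps a source name to the fingerprint of the data last seen.
    It is updated in place (hypothetical design; a real pipeline would persist it).
    """
    updates: Dict[str, List[dict]] = {}
    for name, fetch in sources.items():
        latest = fetch()
        digest = fingerprint(latest)
        if known_state.get(name) != digest:
            known_state[name] = digest
            updates[name] = latest
    return updates


if __name__ == "__main__":
    # Hypothetical in-memory source standing in for a published dataset.
    data = [{"month": "2024-01", "rate": 3.7}]
    sources = {"unemployment": lambda: data}
    state: Dict[str, str] = {}

    print(sorted(poll_once(sources, state)))  # first poll: everything is new
    print(sorted(poll_once(sources, state)))  # second poll: nothing changed
```

Comparing hashes rather than full datasets keeps the stored state small. A production pipeline would likely run `poll_once` on a schedule and hand each changed dataset to the next stage.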
Tools used
GPT-4 Turbo, OpenAI, Vega, Vega-Lite, DSPy
Outcome

Realtime automates the creation of data-driven story analyses and visualizations using LLMs, giving readers access to up-to-date information and allowing journalists to focus on in-depth reporting rather than rote data processing.

Results
Volume: significant performance improvements
Source

https://generative-ai-newsroom.com/we-built-a-news-site-powered-by-llms-and-public-data-heres-what-we-learned-aba6c52a7ee4


Grounding & classification
Source type: technical build writeup
18 fields verified against source quotes.
content generation, summarization, knowledge base, failure mode described, metric backed, named customer, production runtime claimed, tools described, workflow described, media, employee productivity, technical build writeup, agentic task execution