Gerdau reduces data processing time from 1.5 hours to 12 minutes and cuts data costs 40% with Databricks
Gerdau's homegrown and open-source data tools were complex, disconnected, and hard to manage, requiring Python and Spark proficiency that limited adoption across the business. The platform lacked real-time processing for digital twin projects, and siloed teams created duplicate databases, driving data inconsistencies and a rising total cost of ownership.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases — not tied to any one vendor's stack. Click any stage to read what happens there. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Delta Lake data foundation
Delta Lake sets the foundation for Gerdau's new unified underlying data infrastructure.
Databricks delivered a 40% cost reduction in data processing and 80% in new streaming solution developments, cut average data processing time from 1.5 hours to 12 minutes, reduced ML model creation effort by 30%, and onboarded over 300 new global data users.
What failed first
Gerdau's proprietary and open-source tool ecosystem was fragmented and overly complex, requiring specialist skills that blocked broad adoption, and its lack of real-time processing capabilities prevented the digital twins use case from being realised.