
LinkedIn builds agentic workflows to accelerate Liger Kernel GPU kernel engineering

Writing optimized GPU kernels requires deep, scarce expertise, and demand far outpaces the supply of engineers who can write them. Maintaining the Liger Kernel project (creating kernels, supporting new models, and optimizing performance) takes hours of expert time per task, which cannot keep up with the pace of model innovation.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear under “Tools used” below.
Stage 1 · Engineer provides input
The agent accepts input as a PyTorch file, GitHub URL, code snippet, paper reference, or natural language description.
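A minimal sketch of how the five input forms above might be told apart before the agent starts work. The source does not describe how the agent routes its input, so the `InputKind` enum, the `classify_input` function, and the regular-expression heuristics are all hypothetical and only illustrate the idea.

```python
import re
from enum import Enum


class InputKind(Enum):
    """Hypothetical labels for the five input forms listed above."""
    PYTORCH_FILE = "pytorch_file"
    GITHUB_URL = "github_url"
    PAPER_REFERENCE = "paper_reference"
    CODE_SNIPPET = "code_snippet"
    NATURAL_LANGUAGE = "natural_language"


def classify_input(text: str) -> InputKind:
    """Guess the input form with simple heuristics (an assumption, not the real router)."""
    s = text.strip()
    # A link into a GitHub repository.
    if re.match(r"^https?://(www\.)?github\.com/", s):
        return InputKind.GITHUB_URL
    # An arXiv link or an "arXiv:NNNN.NNNNN" style identifier.
    if re.match(r"^https?://(www\.)?arxiv\.org/", s) or re.match(
        r"^arxiv:\s*\d{4}\.\d{4,5}", s, re.IGNORECASE
    ):
        return InputKind.PAPER_REFERENCE
    # A single-line path ending in .py is treated as a PyTorch source file.
    if "\n" not in s and s.endswith(".py"):
        return InputKind.PYTORCH_FILE
    # Any line that starts with a Python keyword suggests pasted code.
    if re.search(r"^\s*(import|from|def|class)\s+\w+", s, re.MULTILINE):
        return InputKind.CODE_SNIPPET
    # Everything else is handled as a plain-language description.
    return InputKind.NATURAL_LANGUAGE


if __name__ == "__main__":
    for example in [
        "https://github.com/example/repo/blob/main/model.py",
        "arXiv:0000.00000",
        "models/relu_squared.py",
        "import torch\n\ndef f(x):\n    return torch.relu(x) ** 2",
        "Fuse the residual add into RMSNorm",
    ]:
        print(classify_input(example).value)
```

The heuristics are deliberately crude: a description that begins with a word such as "from" would be mislabeled as code, so a real router would need a stronger signal than keyword matching.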
Tools used
Liger Kernel, liger-kernel-dev, liger-autopatch, liger-kernel-perf, Triton, PyTorch, HuggingFace Transformers, TRL, LLaMa-Factory, Flash Attention, PyTorch FSDP, DeepSpeed, NVIDIA NCU, TorchDynamo, torch.compile, torch fx
Outcome

Agentic workflows automated kernel creation, model integration, and optimization. Reported results include a 1.9x forward and 3.2x backward speedup with a 37.5% memory reduction for the ReLU² kernel, a 3.35x backward speedup for the fused_add_rms_norm kernel, and a 10x encoder step-time improvement with 64.7% of GPU hours saved in LinkedIn's internal training infrastructure.
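The figures above mix two units, speedup factors ("x") and percentage savings. The sketch below shows the standard conversion between them so the numbers can be compared on one scale. It is plain arithmetic, assuming each figure describes the same workload before and after. The function names are illustrative, and nothing here reflects how the source measured its results.

```python
def fraction_saved(speedup: float) -> float:
    """Fraction of the original time removed by a given speedup factor."""
    return 1.0 - 1.0 / speedup


def equivalent_speedup(saved: float) -> float:
    """Speedup factor implied by saving a given fraction of time or GPU hours."""
    return 1.0 / (1.0 - saved)


if __name__ == "__main__":
    # Speedup factors quoted in the outcome, expressed as time saved.
    for label, factor in [
        ("ReLU^2 forward", 1.9),
        ("ReLU^2 backward", 3.2),
        ("fused_add_rms_norm backward", 3.35),
    ]:
        print(f"{label}: {factor}x -> {fraction_saved(factor):.1%} less time")

    # The GPU-hour saving quoted in the outcome, expressed as a factor.
    print(f"64.7% saved -> {equivalent_speedup(0.647):.2f}x")
```

Under that assumption, a 3.2x speedup removes 68.75% of the original time, and a 64.7% saving in GPU hours corresponds to roughly a 2.83x reduction.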

What failed first

Early iterations of the agentic workflows generated plausible-looking code that failed convergence tests in subtle ways, such as a wrong casting mode or an incorrect stride computation.
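To illustrate why a wrong casting mode can look plausible and still fail, the sketch below compares a reference RMS normalization against a hypothetical candidate that keeps its running sum in half precision. It uses only the Python standard library and simulates fp16 rounding with the `struct` module's `e` format. The function names, tolerances, and test shapes are assumptions for illustration and are not taken from Liger Kernel's test suite.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def rms_norm_reference(xs, eps=1e-6):
    """Reference: accumulate in full precision, round only the final output."""
    mean_sq = sum(x * x for x in xs) / len(xs)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [to_fp16(x * scale) for x in xs]


def rms_norm_fp16_accumulate(xs, eps=1e-6):
    """Hypothetical buggy candidate: the running sum is kept in fp16."""
    acc = 0.0
    for x in xs:
        # Each partial sum is rounded to fp16; once the sum is large,
        # adding a small term no longer changes it.
        acc = to_fp16(acc + to_fp16(x * x))
    mean_sq = to_fp16(acc / len(xs))
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [to_fp16(x * scale) for x in xs]


def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


def matches(candidate, reference, atol=1e-3, rtol=1e-3):
    """Elementwise tolerance check in the style of an allclose comparison."""
    if len(candidate) != len(reference):
        return False
    return all(
        abs(c - r) <= atol + rtol * abs(r) for c, r in zip(candidate, reference)
    )


if __name__ == "__main__":
    for size in (8, 8192):
        row = [1.0] * size
        ref = rms_norm_reference(row)
        cand = rms_norm_fp16_accumulate(row)
        print(size, "passes:", matches(cand, ref),
              "max abs diff:", max_abs_diff(cand, ref))
```

On the 8-element row both versions agree, so a test that only uses small shapes would pass. On the 8192-element row the half-precision sum stops growing partway through, and the candidate's output diverges from the reference. This is one reason to run such checks at realistic hidden sizes.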

Results
Time saved: 10x
Volume: 20%
Source

https://www.linkedin.com/blog/engineering/ai/ai-helping-build-better-ai-how-agents-accelerate-liger-kernel-engineering


Grounding & classification
Source type: technical build writeup
49 fields verified against source quotes.
Tags: agentic workflow, ai agent, code generation, code diff pr, failure mode described, human review described, metric backed, named customer, production runtime claimed, tools described, workflow described, software, cost reduction, cycle time reduction, throughput increase, time saved, technical build writeup, quality assurance, agentic task execution, ai draft human approval