LinkedIn builds a GenAI application tech stack: LangChain framework, prompt management, skill inversion, and conversational memory
LinkedIn lacked a shared, scalable foundation for GenAI development. Early products relied on fragmented Java (online) and Python (offline) stacks that took substantial ongoing effort to keep in sync, on manual string interpolation for prompts, which was error-prone and did not scale, and on per-product re-implementation of common skills.
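To illustrate why template-based prompt management is less fragile than ad hoc string interpolation, here is a minimal sketch. It uses Python's standard-library string.Template as a stand-in for a real template engine such as Jinja; the template text, the placeholder names, and the render_prompt helper are all hypothetical and are not taken from LinkedIn's framework.

```python
from string import Template

# Hypothetical prompt template; the placeholder names are illustrative only.
SUMMARY_PROMPT = Template(
    "You are a helpful assistant.\n"
    "Summarize the following $doc_type for $audience:\n"
    "$body"
)


def render_prompt(template: Template, **variables: str) -> str:
    """Render a prompt, raising KeyError if any placeholder is left unfilled."""
    # substitute() (unlike safe_substitute()) fails loudly on a missing
    # variable instead of silently emitting a malformed prompt.
    return template.substitute(**variables)


prompt = render_prompt(
    SUMMARY_PROMPT,
    doc_type="job posting",
    audience="a recent graduate",
    body="Senior engineer role on the infrastructure team.",
)
```

Keeping the template as a named, reusable object means a missing or misspelled variable surfaces as an error at render time, rather than as a subtly broken prompt sent to the model.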
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · User query enters framework
Each incoming member or customer query enters the GenAI application framework independently.
Tools used
LangChain, Jinja, Azure OpenAI service, PyTorch, DeepSpeed, vLLM, Llama, Couchbase, Espresso, Bing, OpenAI Chat Completions API
Outcome
LinkedIn now has a standardized GenAI application framework built as a thin LangChain wrapper with integrated prompt management, conversational memory on the LinkedIn messaging stack, and a centralized skill registry. Fine-tuned Llama models achieve quality comparable to commercial models at much lower cost and latency, and many existing applications have been migrated to the new stack.
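The idea of a centralized skill registry can be sketched as follows. This is a minimal illustration in plain Python, assuming that a skill is a named callable with a description and that skill owners register their skills once so applications can look them up instead of re-implementing them. The class names, the decorator, and the example skill are hypothetical and do not reflect LinkedIn's actual interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Skill:
    """A named capability that any GenAI application can look up and call."""
    name: str
    description: str
    handler: Callable[[str], str]


class SkillRegistry:
    """Hypothetical central registry: skills are defined once and shared."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, name: str, description: str):
        """Decorator that a skill owner uses to publish a skill."""
        def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
            if name in self._skills:
                raise ValueError(f"skill already registered: {name}")
            self._skills[name] = Skill(name, description, fn)
            return fn
        return decorator

    def list_skills(self) -> List[str]:
        """Return the registered skill names in sorted order."""
        return sorted(self._skills)

    def invoke(self, name: str, query: str) -> str:
        """Call a registered skill by name; raises KeyError if unknown."""
        return self._skills[name].handler(query)


registry = SkillRegistry()


@registry.register("echo_search", "Illustrative stand-in for a search skill.")
def echo_search(query: str) -> str:
    # A real skill would call a downstream service; this one just echoes.
    return f"results for: {query}"


answer = registry.invoke("echo_search", "open roles in Dublin")
```

Registering through a decorator keeps the definition with the team that owns the skill, while every application sees the same catalog through list_skills and invoke.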
What failed first
A single shared Java midtier for all GenAI products became an operational bottleneck and had to be split into multiple use-case-specific services, each requiring mirrored Python logic. Staying on Java for online serving proved a suboptimal long-term choice as the GenAI ecosystem evolved primarily in Python.