Patterns for Building LLM-based Systems & Products

Discussions on HackerNews, Twitter, and LinkedIn
“There is a large class of problems that are easy to imagine and build demos for, but extremely hard to make products out of. For example, self-driving: It’s easy to demo a car self-driving around a block, but making it into a product takes a decade.” - Karpathy
This write-up is about practical patterns for integrating large language models (LLMs) into systems & products. We’ll build on academic research, industry resources, and practitioner know-how, and distill them into key ideas and practices.
There are seven key patterns. They're organized along two axes: improving performance vs. reducing cost/risk, and closer to the data vs. closer to the user.
Evals: To measure performance
RAG: To add recent, external knowledge
Fine-tuning: To get better at specific tasks
Caching: To reduce latency & cost
Guardrails: To ensure output quality
Defensive UX: To anticipate & manage errors gracefully
Collect user feedback: To build our data flywheel
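To make the composition concrete, here's a minimal, hypothetical sketch (not from the write-up itself) of how a few of these patterns might slot together in a single request path: check a cache first, retrieve external context (RAG), call the model, then apply a guardrail before returning. Every function and name below (`retrieve_docs`, `call_llm`, `passes_guardrails`) is an illustrative stub, not a real API.

```python
# Hypothetical sketch: caching + RAG + guardrails composed in one request path.
# All functions are stubs; swap in your own retriever, LLM client, and checks.
from typing import Optional

cache: dict[str, str] = {}  # Caching: exact-match lookup to cut latency & cost


def retrieve_docs(query: str) -> list[str]:
    """RAG: fetch recent/external knowledge; a real system would query a vector store."""
    return [f"(context for: {query})"]


def call_llm(prompt: str) -> str:
    """Stub for the LLM call; replace with your provider's client."""
    return f"answer based on -> {prompt}"


def passes_guardrails(output: str) -> bool:
    """Guardrails: a toy check that rejects empty or overly long outputs."""
    return 0 < len(output) <= 2_000


def answer(query: str) -> Optional[str]:
    if query in cache:                             # Caching
        return cache[query]
    context = "\n".join(retrieve_docs(query))      # RAG
    output = call_llm(f"{context}\n\nQuestion: {query}")
    if not passes_guardrails(output):              # Guardrails
        return None                                # Defensive UX: caller handles the failure
    cache[query] = output
    return output


if __name__ == "__main__":
    print(answer("What are the seven patterns?"))
```

Evals and user feedback sit around this loop rather than inside it: evals measure whether changes to any step help, and collected feedback feeds the data flywheel for fine-tuning.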