How I use data to optimize AI apps
A video collaboration between Find AI and Velvet

At Find AI, we use OpenAI a lot. Last week, we made 19 million requests.
Understanding what's happening at that scale can be challenging. It's a classic OODA loop:
- Observe what our application is doing and which systems are triggering requests
- Orient around what's happening, such as which models are the most costly in aggregate
- Decide how to make the system more efficient, such as by testing a cheaper model or a shorter prompt
- Act by rolling out changes
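As a sketch of the "orient" step, here's what aggregating cost per model from request logs might look like. The field names and per-token prices below are illustrative assumptions, not Find AI's or Velvet's actual schema; real prices vary by model and change over time.

```python
from collections import defaultdict

# Illustrative per-1M-token prices (assumptions, not current OpenAI pricing).
PRICE_PER_1M_INPUT = {"gpt-4o": 2.50, "gpt-4o-mini": 0.15}
PRICE_PER_1M_OUTPUT = {"gpt-4o": 10.00, "gpt-4o-mini": 0.60}

def cost_by_model(logs):
    """Aggregate estimated spend per model from a list of request log entries."""
    totals = defaultdict(float)
    for entry in logs:
        model = entry["model"]
        totals[model] += (
            entry["input_tokens"] / 1e6 * PRICE_PER_1M_INPUT[model]
            + entry["output_tokens"] / 1e6 * PRICE_PER_1M_OUTPUT[model]
        )
    return dict(totals)

# Hypothetical log entries for illustration.
logs = [
    {"model": "gpt-4o", "input_tokens": 1_000_000, "output_tokens": 100_000},
    {"model": "gpt-4o-mini", "input_tokens": 5_000_000, "output_tokens": 500_000},
]
print(cost_by_model(logs))
```

At 19 million requests a week, a rollup like this is what turns raw logs into a decision about which model or prompt to optimize first.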
Velvet, an AI Gateway, is the tool in our development stack that enables this observability and optimization loop. I worked with them this week to produce a video about how we use data to optimize our AI-powered apps at Find AI.
The video covers observability tools in development, cost attribution, using the OpenAI Batch API, evaluating new models, and fine-tuning. I hope it's a useful resource for people running AI models in production.
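On the Batch API point: batch jobs take a JSONL input file where each line is a single request. A minimal sketch of building that file, following OpenAI's documented batch input schema (the prompts, custom IDs, and filename here are made up for illustration):

```python
import json

def build_batch_file(prompts, path, model="gpt-4o-mini"):
    """Write one JSONL line per request in the Batch API input format."""
    with open(path, "w") as f:
        for i, prompt in enumerate(prompts):
            line = {
                "custom_id": f"request-{i}",  # used to match results back to requests
                "method": "POST",
                "url": "/v1/chat/completions",
                "body": {
                    "model": model,
                    "messages": [{"role": "user", "content": prompt}],
                },
            }
            f.write(json.dumps(line) + "\n")

# Hypothetical prompts; the file would then be uploaded with purpose="batch"
# and submitted to the batches endpoint with a 24h completion window.
build_batch_file(["Summarize company A", "Summarize company B"], "batch_input.jsonl")
```

Batch jobs trade latency for cost, which makes them a good fit for large offline workloads like the bulk enrichment runs described in the video.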