Build a useful AI recommendation feature

Connecting to an LLM is just the first step. This article walks through iterative improvements using tactics like prompt engineering, RAG, and fine-tuning to build a useful recommendation feature.

The evolution of a useful AI feature

Let’s imagine you want to build a book recommendation feature that uses OpenAI to accept natural language queries and return results from your content library. Connecting to the LLM is relatively easy. Then you'll need to employ a variety of techniques to make the feature useful for your product and use case, including prompt engineering, document retrieval, function calls, fine-tuning, and general app development.

Connect to OpenAI

If you build an MVP that simply calls OpenAI and surfaces its responses, you’ll end up with a generalized chatbot (a “wrapper”). It may feel engaging the first time someone tries it, but there's no differentiated value. At this point, it's simply another interface for OpenAI.
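A minimal version of that wrapper might look like the sketch below. It assumes the openai Python SDK (v1.x) with an OPENAI_API_KEY in the environment; the model name is illustrative.

```python
def build_messages(user_message: str) -> list[dict]:
    # No system prompt yet: we forward the user's text as-is,
    # which is why the result feels like a generic chatbot.
    return [{"role": "user", "content": user_message}]

def recommend(user_message: str, model: str = "gpt-4o-mini") -> str:
    # Imported inside the function so the helper above stays dependency-free.
    from openai import OpenAI  # assumes the openai v1.x SDK is installed
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(user_message),
    )
    return response.choices[0].message.content
```

Every improvement in the rest of this article changes what goes into that `messages` list, or eventually the model behind it.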

Layer on prompt engineering

Next, you’ll layer on prompts to instruct the LLM how to act. While this is often called prompt engineering, it doesn’t require much coding expertise. You’re sharing additional notes with the LLM to narrow its focus and make it behave in a particular way. The prompt affects the specific response you get back from OpenAI, but it doesn’t train the model’s general knowledge.

For example: “You're a librarian specializing in fictional literature. People come to you when they don't know what to read next. You take into consideration what that person has read in the past to inform your recommendations and suggest four books at a time. You can ask additional questions to produce better recommendations.”

Your app starts focusing more on the task at hand: it talks like a librarian and returns something distinct from ChatGPT.
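In code, that prompt becomes a system message prepended to every request. This is a sketch, with the helper names and the reading-history formatting as illustrative assumptions:

```python
LIBRARIAN_PROMPT = (
    "You're a librarian specializing in fictional literature. People come to "
    "you when they don't know what to read next. You take into consideration "
    "what that person has read in the past to inform your recommendations "
    "and suggest four books at a time. You can ask additional questions to "
    "produce better recommendations."
)

def build_messages(reading_history: list[str], user_message: str) -> list[dict]:
    # The system messages steer behavior for this request only;
    # they do not train or otherwise modify the underlying model.
    history_note = "Books this reader has enjoyed: " + ", ".join(reading_history)
    return [
        {"role": "system", "content": LIBRARIAN_PROMPT},
        {"role": "system", "content": history_note},
        {"role": "user", "content": user_message},
    ]
```

The resulting list is passed as the `messages` parameter of the chat completion call, unchanged from before.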

Design a focused interface

While text-based chat is functional, you see a decline in engagement after someone tries the feature once. Plus, most users are asking some variation of the same question.

You try implementing a more focused book-buying experience that doesn't require chat.

Now your app starts feeling more familiar to users. It's not so obviously "AI", but it recommends relevant books quickly, and keeps the user focused on the task at hand.

Use RAG to focus responses

You realize your book recommendation feature is costly: it isn't converting effectively to book purchases, and each query to OpenAI is expensive. You need the LLM to focus on your specific content catalog instead of all books in the world. Your catalog also updates daily based on buying and selling. So you implement retrieval-augmented generation (RAG). Now the model narrows its scope to only your database of books.

This new focus improves the accuracy and relevance of your results and reduces cost. Note, applying RAG to the prompt affects the results of this specific response. It won't change the way the model reasons going forward.
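In outline, RAG means retrieving the most relevant entries from your own catalog and injecting them into the prompt. The sketch below uses cosine similarity over precomputed embeddings, with toy vectors standing in for real embedding-API output; function and field names are illustrative:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec: list[float],
             catalog: list[tuple[str, list[float]]],
             k: int = 4) -> list[str]:
    # catalog holds (title, embedding) pairs computed when a listing is added,
    # so daily buy/sell updates only re-embed the changed books.
    ranked = sorted(catalog, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [title for title, _ in ranked[:k]]

def build_rag_prompt(question: str, titles: list[str]) -> str:
    # Constrain the model to recommend only from the retrieved, in-stock titles.
    return (
        "Recommend books ONLY from this in-stock list: "
        + "; ".join(titles)
        + f"\nReader's request: {question}"
    )
```

In production you'd typically swap the linear scan for a vector database, but the shape of the prompt stays the same.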

At this point, your recommendation feature is working: you’ve found product-market fit with millions of end users.

Invest in fine-tuning a model

Prompt engineering and RAG techniques let you validate the feature at a reasonable cost, and now you want to optimize further. It may be time to invest in fine-tuning a model to further improve costs, latency, and accuracy. With fine-tuning, you’re changing the behavior of the model itself, not just an individual response.

Note, the fine-tuning process is expensive and can result in catastrophic failures, so it’s best to rely on other tactics until you identify a problem worth the investment.
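As a sketch of what that investment looks like with OpenAI-style fine-tuning: you start by serializing curated example conversations as JSONL chat transcripts (the field layout below follows OpenAI's chat fine-tuning format; file upload and job creation are omitted, and the helper names are illustrative):

```python
import json

def training_line(system_prompt: str, user_text: str, assistant_text: str) -> str:
    # One JSONL line = one example conversation the model should imitate.
    example = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
            {"role": "assistant", "content": assistant_text},
        ]
    }
    return json.dumps(example)

def write_training_file(path: str, rows: list[tuple[str, str, str]]) -> None:
    # Curated examples, e.g. drawn from your best logged request/response
    # pairs; quality matters more than raw volume.
    with open(path, "w") as f:
        for system_prompt, user_text, assistant_text in rows:
            f.write(training_line(system_prompt, user_text, assistant_text) + "\n")
```

The finished file is uploaded and a fine-tuning job is created against it; the resulting model is then referenced by name in place of the base model.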

This was a basic scenario of how you might apply various techniques to improve an AI feature as you collect feedback and data. Tactics will certainly evolve, but whatever you adopt, you should always have access to your LLM request logs.

Leverage your LLM logs

LLM requests, responses, and parameters can be used to analyze, optimize, and fine-tune your AI features. Storing them in a database gives you full control to query, forward, and use them however you want—both now and in the future.
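A minimal version of that store, sketched with Python's built-in sqlite3 (the schema and column names are illustrative assumptions):

```python
import json
import sqlite3
import time

def init_log_db(path: str = ":memory:") -> sqlite3.Connection:
    # One row per LLM call: timestamp, model, full request, response, latency.
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS llm_logs (
               id INTEGER PRIMARY KEY,
               ts REAL,
               model TEXT,
               request TEXT,
               response TEXT,
               latency_ms REAL
           )"""
    )
    return conn

def log_call(conn: sqlite3.Connection, model: str, messages: list[dict],
             response_text: str, latency_ms: float) -> None:
    # Store the full request payload as JSON so you can replay it later,
    # e.g. for cost/latency analysis or as fine-tuning candidates.
    conn.execute(
        "INSERT INTO llm_logs (ts, model, request, response, latency_ms)"
        " VALUES (?, ?, ?, ?, ?)",
        (time.time(), model, json.dumps(messages), response_text, latency_ms),
    )
    conn.commit()
```

Because the logs live in your own database, you can query them directly, forward them to a warehouse, or export slices as training data whenever a new tactic calls for it.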
