Mar 5, 2025

Amir Houieh

From Query to Context: Building Agentic RAG with Unbody’s JSON Endpoint

I recently spoke at AI Thinkers Abu Dhabi about building robust RAG (Retrieval-Augmented Generation) systems. A key takeaway was that a naive approach to RAG—just dumping a user’s query into a semantic search and feeding results into a prompt—often isn’t enough. Real user queries can be nuanced:

“I need an Italian dish for two people, no sausage, and it should be ready in under 30 minutes!”

A naive system might not correctly parse all these constraints. That’s where Agentic RAG comes in. By parsing the query into structured parameters (like “no sausage,” “under 30 minutes,” “Italian dish,” etc.), we can do more precise retrieval and generate a better final answer.

Let’s see how to do this with Unbody—step by step.

Prerequisites

  1. You have a Unbody project set up with your data (e.g., a collection of recipes) already ingested and indexed.

  2. You have your API key and Project ID from the Unbody dashboard.

⚠️ If you haven’t set this up yet, check out the Unbody docs for a quick start on creating a project and uploading your data.

Initializing the Unbody Client

import { Unbody } from 'unbody'

const u = new Unbody({
  apiKey: process.env.UNBODY_API_KEY!,   // from your env vars
  projectId: process.env.UNBODY_PROJECT_ID!,
})

That’s it! Now you can start calling Unbody’s APIs to search, filter, and generate.

Understanding Your Data

Before you write any code, you must know your data. In our example, we’re working with a collection of recipes. Knowing the schema tells us exactly which parameters we can extract from a user’s query. For instance, our data schema might look like this:

// RecipeCollection data schema
type RecipeCollection = {
  title: StringField,
  text: StringField,
  ingredients: StringArrayField,
  mealTypes: StringArrayField,
  courseTypes: StringArrayField,
  cuisineTypes: StringArrayField,
  seasons: StringArrayField,
  allergens: StringArrayField,
  prepTimeMinutes: NumberField,
  cookTimeMinutes: NumberField,
  totalTimeMinutes: NumberField,
  servings: NumberField,
}

This schema tells us what information is available for each recipe. Knowing these fields lets us decide which parts of the user’s query should be parsed and parameterized. For example, if a user says,

“I want an Italian dish with no sausage and ready in under 30 minutes”

we know to extract the concept (Italian dish), ingredients to exclude (sausage), and a time constraint (30 minutes).
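As a rough illustration of that mapping, the sketch below pairs each extracted parameter with the schema field it would later constrain. The `ParsedRecipeQuery` type and its property names are assumptions made for this example (they anticipate the Zod schema built later in this post) and are not part of Unbody's API.

```typescript
// Hypothetical shape for parameters extracted from a user's query.
// Each property lines up with a field in the RecipeCollection schema.
type ParsedRecipeQuery = {
  concepts: string[] | null            // -> semantic search terms
  excludeIngredients: string[] | null  // -> ingredients
  includeCuisineTypes: string[] | null // -> cuisineTypes
  totalTimeMinutesMax: number | null   // -> totalTimeMinutes
}

// What we would hope to extract from:
// "I want an Italian dish with no sausage and ready in under 30 minutes"
const exampleParsed: ParsedRecipeQuery = {
  concepts: ['Italian dish'],
  excludeIngredients: ['sausage'],
  includeCuisineTypes: ['italian'],
  totalTimeMinutesMax: 30,
}
```

Fields the schema does not have (say, a price) would have nowhere to go, which is why knowing the schema comes first.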

Naive RAG vs. Agentic RAG

RAG (Retrieval-Augmented Generation) has two main parts: retrieval and generation. Retrieval is the crucial one, because it gathers the context the model relies on to generate its answer. Let me walk you through two different approaches:

Naive RAG

A straightforward RAG approach might do this:

  1. Take the user’s entire query (e.g. “Italian dish, no sausage, 30 minutes max”).

  2. Pass it to .search.about(...).

  3. Feed the top documents into a text-generation prompt.

// `u` is the Unbody client initialized earlier
const { data } = await u.get
  .collection('RecipeCollection')
  .search.about(userQuery)
  .generate.fromMany(userQuery)
  .exec()

This works for simple queries but fails when the user has multiple constraints. The system doesn’t actually understand the user’s intention; it just tries to match the entire text in a semantic way, which can lead to incomplete or irrelevant results.

Agentic RAG

With an Agentic approach, we:

  1. Parse the user’s query into structured parameters.

  2. Apply filters to our knowledge base (e.g., “exclude sausage,” “max 30 minutes,” “Italian”).

  3. Generate the final response using only the truly relevant documents.
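To make step 2 concrete, here is a minimal in-memory sketch with no Unbody calls involved. The `Recipe` and `Constraints` types, the sample data, and `matchesConstraints` are all hypothetical; in the real pipeline the equivalent filtering happens on the server through `.where(...)`.

```typescript
type Recipe = {
  title: string
  ingredients: string[]
  cuisineTypes: string[]
  totalTimeMinutes: number
}

type Constraints = {
  excludeIngredients: string[]
  cuisineTypes: string[]
  totalTimeMinutesMax: number | null
}

// Small membership helper
const contains = (list: string[], item: string): boolean => list.indexOf(item) !== -1

// Returns true only if the recipe satisfies every structured constraint.
function matchesConstraints(recipe: Recipe, c: Constraints): boolean {
  const hasExcluded = recipe.ingredients.some(i => contains(c.excludeIngredients, i))
  const hasCuisines = c.cuisineTypes.every(t => contains(recipe.cuisineTypes, t))
  const withinTime =
    c.totalTimeMinutesMax === null || recipe.totalTimeMinutes <= c.totalTimeMinutesMax
  return !hasExcluded && hasCuisines && withinTime
}

const sampleRecipes: Recipe[] = [
  { title: 'Sausage rigatoni', ingredients: ['pasta', 'sausage'], cuisineTypes: ['italian'], totalTimeMinutes: 25 },
  { title: 'Cacio e pepe', ingredients: ['pasta', 'pecorino', 'pepper'], cuisineTypes: ['italian'], totalTimeMinutes: 20 },
  { title: 'Beef ragu', ingredients: ['pasta', 'beef'], cuisineTypes: ['italian'], totalTimeMinutes: 180 },
]

// Only recipes that pass every constraint survive.
const filteredTitles = sampleRecipes
  .filter(r => matchesConstraints(r, {
    excludeIngredients: ['sausage'],
    cuisineTypes: ['italian'],
    totalTimeMinutesMax: 30,
  }))
  .map(r => r.title)
```

All three sample recipes are plausible semantic matches for "Italian pasta", but only the one that satisfies every hard constraint is kept. That is the gap between the naive and the agentic approach.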

We’ll use Unbody’s chainable query syntax, which looks like this:

// `u` is the Unbody client initialized earlier
const { data } = await u.get
  .collection('RecipeCollection')
  .search.about(...)
  .where(...)
  .generate.fromMany(...)
  .exec()

Parsing the User Query into Parameters

Imagine the user types something like:

“I want an Italian pasta with no sausage.”

To translate this into a proper RAG query, we need to extract three pieces of information:

  1. The main keywords or concepts from the user's query, which go to semantic search via .search.about(...concepts)

  2. Any specific filters, which go to the where function via .where(...filters)

  3. The prompt or task description, which goes to .generate.fromMany(task)

To do this, we’ll use Unbody’s generative JSON endpoint, which uses the Zod library under the hood to define and validate the model’s output.

The general syntax is as follows. Unbody’s generative JSON endpoint takes two arguments: a list of messages, each with a role, and an options object that holds a Zod schema.

const { data } = await u.generate.json(
  [
    { role: 'system', content: "Analyze the user's query and extract parameters." },
    { role: 'user', content: query },
  ],
  {
    schema: z.object({...}),
  }
)

The Zod schema defines the shape the generated JSON should have. Let’s start with the concepts field. Here’s a small snippet that defines a schema for just that field:

import { z } from 'zod'

const schema = z.object({
  concepts: z.array(z.string())
    .nullable()
    .describe('Search terms that the user is interested in.')
})

  • We’re expecting an array of strings (or null if nothing is extracted).

  • The .describe() method adds a note so that anyone reading the code knows that this field represents the main search terms.

Next, we need to capture ingredients that the user wants to include. We do it the same way:

const schema = z.object({
  // concepts: ... (as defined above)
  includeIngredients: z.array(z.string())
    .nullable()
    .describe('Ingredients that the user wants to include in the search results.')
})

  • Again, an array of strings (or null).

  • The description clarifies that this field is for ingredients the user explicitly wants in the recipe.

Build the Complete Schema

Now, let’s expand our schema by adding all the fields we expect from the query. This might include:

  • concepts (main search terms)

  • includeIngredients (ingredients to include)

  • excludeIngredients (ingredients to exclude)

  • includeCuisineTypes (preferred cuisine types)

  • totalTimeMinutesMax (a maximum time constraint)

Here’s the full Zod schema with descriptions for every field:

const querySchema = z.object({
  concepts: z.array(z.string())
    .nullable()
    .describe('Search terms that the user is interested in.'),
  
  includeIngredients: z.array(z.string())
    .nullable()
    .describe('Ingredients that the user wants to include in the search results.'),
  
  excludeIngredients: z.array(z.string())
    .nullable()
    .describe('Ingredients that the user wants to exclude from the search results.'),
  
  includeCuisineTypes: z.array(z.string())
    .nullable()
    .describe("Cuisine types that the user wants to include in the search results."),
  
  totalTimeMinutesMax: z.number()
    .nullable()
    .describe('Maximum total time in minutes.')
  
})

Now that our schema is complete, let’s create a function called parseQuery and put the entire logic into it.

Here’s the complete parseQuery function:

import { z } from 'zod'

const querySchema = z.object({
  concepts: z.array(z.string())
    .nullable()
    .describe('Search terms that the user is interested in.'),
  
  includeIngredients: z.array(z.string())
    .nullable()
    .describe('Ingredients that the user wants to include in the search results.'),
  
  excludeIngredients: z.array(z.string())
    .nullable()
    .describe('Ingredients that the user wants to exclude from the search results.'),
  
  includeCuisineTypes: z.array(z.string())
    .nullable()
    .describe("Cuisine types that the user wants to include in the search results."),
  
  totalTimeMinutesMax: z.number()
    .nullable()
    .describe('Maximum total time in minutes.')
  
})

async function parseQuery(query: string) {
  const { data: { payload } } = await u.generate.json(
    [
      {
        role: 'system',
        content: "Analyze the user's query and extract relevant parameters.",
      },
      {
        role: 'user',
        content: query,
      },
    ],
    {
      schema: querySchema,
    }
  )
  // Return the original query along with the parsed data
  return { query, ...payload.content }
}

Summary:

  • We pass our complete querySchema to the generative endpoint.

  • The system message instructs the model to analyze the query.

  • The returned payload (after validation) gives us a well-structured object containing all the extracted parameters.

In this snippet, the model processes the user’s input and returns a structured JSON object that you can later use to build precise queries.

If the query is:

“Find me an Italian pasta with no sausage”

the returned JSON might look like this:

{
  "concepts": ["Italian pasta"],
  "includeIngredients": null,
  "excludeIngredients": ["sausage"],
  "includeCuisineTypes": ["italian"],
  "totalTimeMinutesMax": null
}
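Because the model may return `null`, an empty array, or inconsistent casing for any field, it can help to normalize the parsed output before building filters. The helper below is a sketch under that assumption. `ParsedQuery`, `cleanList`, and `normalizeParsedQuery` are hypothetical names, and lowercasing is only appropriate if the values in your index are lowercase too.

```typescript
type ParsedQuery = {
  query: string
  concepts: string[] | null
  includeIngredients: string[] | null
  excludeIngredients: string[] | null
  includeCuisineTypes: string[] | null
  totalTimeMinutesMax: number | null
}

// Trim and lowercase entries, drop blanks, and collapse empty arrays to null
// so downstream code only has to check for null.
function cleanList(list: string[] | null): string[] | null {
  if (list === null) return null
  const cleaned = list.map(s => s.trim().toLowerCase()).filter(s => s.length > 0)
  return cleaned.length > 0 ? cleaned : null
}

function normalizeParsedQuery(p: ParsedQuery): ParsedQuery {
  const max = p.totalTimeMinutesMax
  return {
    query: p.query,
    concepts: cleanList(p.concepts),
    includeIngredients: cleanList(p.includeIngredients),
    excludeIngredients: cleanList(p.excludeIngredients),
    includeCuisineTypes: cleanList(p.includeCuisineTypes),
    // Discard non-positive or non-finite time limits
    totalTimeMinutesMax: max !== null && isFinite(max) && max > 0 ? max : null,
  }
}
```

With this in place, an empty `includeIngredients` array is treated the same as a missing one, so it never turns into a filter that matches on nothing.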

From Parsed Parameters to a Complete RAG Query

Now that we’ve parsed the query into detailed parameters, it’s time to build the final RAG pipeline. Our pipeline consists of three parts:

  1. Semantic Search using .search.about(...)

  2. Filtering using .where(...)

  3. Generation using .generate.fromMany(...)

Let’s break each piece down.

Semantic Search with .search.about()

We want to search based on the main concepts extracted. If the model successfully extracts a non-empty array of concepts, we use that. Otherwise, we fall back to the full query.

.search.about(
  parsedQuery.concepts && parsedQuery.concepts.length > 0
    ? parsedQuery.concepts
    : parsedQuery.query // fallback if no concepts were extracted
)

This snippet ensures that our semantic search is as focused as possible by prioritizing specific keywords.
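If you prefer to keep this decision out of the query chain, the same fallback can live in a small helper that is easy to unit-test. `pickSearchTerms` is a hypothetical name; the sketch also skips blank concept strings, which is an extra assumption about what the model might return.

```typescript
// Hypothetical helper: choose what to pass to .search.about(...).
// Returns the extracted concepts when there are usable ones,
// otherwise the user's original query string.
function pickSearchTerms(parsed: { query: string; concepts: string[] | null }): string | string[] {
  const concepts = (parsed.concepts || []).filter(c => c.trim().length > 0)
  return concepts.length > 0 ? concepts : parsed.query
}
```

The call site then reduces to `.search.about(pickSearchTerms(parsedQuery))`.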

Building the Filters with .where()

Now comes the slightly more complex part: turning each extracted filter into a proper “where” clause.

If the query has an includeIngredients array, we want to add a filter like:

.where(({ ContainsAll }) => ({
  ingredients: ContainsAll(parsedQuery.includeIngredients)
}))

But—what if includeIngredients isn’t provided? We only want to add this filter when it exists. So we use a conditional:

.where(({ And, ContainsAll }) =>
  And(
    ...(parsedQuery.includeIngredients
      ? [{ ingredients: ContainsAll(parsedQuery.includeIngredients) }]
      : [])
  )
)

We wrap the filter in an array and conditionally spread it into the arguments of the And operator (more on And below). If includeIngredients exists, the filter is passed along; if not, we spread an empty array, which adds nothing.

Next, if the query specifies ingredients to exclude (for example, “no sausage”), we want to add a filter for each item:

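.where(({ And, NotEqual }) =>
  And(
    ...(parsedQuery.excludeIngredients
      ? parsedQuery.excludeIngredients.map(ing => ({
          ingredients: NotEqual(ing),
        }))
      : [])
  )
)

Each ingredient that should be excluded is mapped into its own NotEqual filter. Because several of these filters target the same ingredients field, they can’t be merged into a single object, so we spread them as separate arguments to And, and only when the list exists.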

For cuisine types:

...(parsedQuery.includeCuisineTypes
  ? [{ cuisineTypes: ContainsAll(parsedQuery.includeCuisineTypes) }]
  : []),

For the time constraint:

...(parsedQuery.totalTimeMinutesMax
  ? [{ totalTimeMinutes: LessThanEqual(parsedQuery.totalTimeMinutesMax + 0.0001) }]
  : [])

Combining All Filters Using the And Operator

Finally, we combine all these filters into a single query:

.where(({ And, ContainsAll, NotEqual, LessThanEqual }) =>
  And(
    ...(parsedQuery.includeIngredients
      ? [{ ingredients: ContainsAll(parsedQuery.includeIngredients) }]
      : []),
    ...(parsedQuery.excludeIngredients
      ? parsedQuery.excludeIngredients.map(ing => ({
          ingredients: NotEqual(ing),
        }))
      : []),
    ...(parsedQuery.includeCuisineTypes
      ? [{ cuisineTypes: ContainsAll(parsedQuery.includeCuisineTypes) }]
      : []),
    ...(parsedQuery.totalTimeMinutesMax
      ? [{ totalTimeMinutes: LessThanEqual(parsedQuery.totalTimeMinutesMax + 0.0001) }]
      : [])
  )
)

This final combined filter checks each parameter and only adds a condition if that parameter exists.
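An alternative to stacking conditional spreads is to build the list of conditions in a plain function and spread the result once. The sketch below assumes nothing about Unbody's types: the operators are injected through a hypothetical `Ops` interface so the logic can be tested with stand-ins, and `Filters` and `buildConditions` are names made up for this example. It checks `totalTimeMinutesMax` against `null` rather than relying on truthiness, skips empty arrays, and leaves out the small `+ 0.0001` offset for clarity.

```typescript
// Stand-ins for the operator helpers that .where(...) passes in.
// Their real return types are not assumed here, hence the generic C.
type Ops<C> = {
  ContainsAll: (values: string[]) => C
  NotEqual: (value: string) => C
  LessThanEqual: (value: number) => C
}

type Filters = {
  includeIngredients: string[] | null
  excludeIngredients: string[] | null
  includeCuisineTypes: string[] | null
  totalTimeMinutesMax: number | null
}

// Builds the list of conditions that would be spread into And(...).
function buildConditions<C>(f: Filters, ops: Ops<C>): Array<Record<string, C>> {
  const { includeIngredients, excludeIngredients, includeCuisineTypes, totalTimeMinutesMax } = f
  const conditions: Array<Record<string, C>> = []

  if (includeIngredients !== null && includeIngredients.length > 0) {
    conditions.push({ ingredients: ops.ContainsAll(includeIngredients) })
  }
  if (excludeIngredients !== null) {
    // One condition per excluded ingredient
    for (const ing of excludeIngredients) {
      conditions.push({ ingredients: ops.NotEqual(ing) })
    }
  }
  if (includeCuisineTypes !== null && includeCuisineTypes.length > 0) {
    conditions.push({ cuisineTypes: ops.ContainsAll(includeCuisineTypes) })
  }
  if (totalTimeMinutesMax !== null) {
    conditions.push({ totalTimeMinutes: ops.LessThanEqual(totalTimeMinutesMax) })
  }
  return conditions
}
```

Plugged into the query, this might look like `.where(({ And, ...ops }) => And(...buildConditions(parsedQuery, ops)))`, though that wiring is untested here, and it is worth checking how `And` behaves when no conditions are produced.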

Generating the Final Answer with .generate.fromMany()

After we’ve retrieved and filtered our documents, we use the generation step to produce a final, coherent answer:

.generate.fromMany({
  messages: [
    {
      role: 'system',
      content: "Generate a helpful response to the user's query. RESULTS: {$}",
    },
    {
      role: 'user',
      content: parsedQuery.query || '',
    },
  ],
})

The system message instructs the model to consider the retrieved results (indicated by {$}) when generating the response. The user message reiterates the original query for context.

Bringing It All Together: The Complete RAG Pipeline

Here’s what the full API handler might look like (for example, in a Next.js project):

import { z } from 'zod'
import { Unbody } from 'unbody'
import * as marked from 'marked'

const u = new Unbody({
  apiKey: process.env.UNBODY_API_KEY!,
  projectId: process.env.UNBODY_PROJECT_ID!,
})

// Complete query schema with descriptions
const querySchema = z.object({
  concepts: z.array(z.string())
    .nullable()
    .describe('Search terms that the user is interested in.'),
  includeIngredients: z.array(z.string())
    .nullable()
    .describe('Ingredients that the user wants to include in the search results.'),
  excludeIngredients: z.array(z.string())
    .nullable()
    .describe('Ingredients that the user wants to exclude from the search results.'),
  includeCuisineTypes: z.array(z.string())
    .nullable()
    .describe("Cuisine types that the user wants to include in the search results."),
  totalTimeMinutesMax: z.number()
    .nullable()
    .describe('Maximum total time in minutes.')
})

// Step 1: Parse the user's query into structured parameters
async function parseQuery(query: string) {
  const { data: { payload } } = await u.generate.json(
    [
      { role: 'system', content: "Analyze the user's query and extract relevant parameters." },
      { role: 'user', content: query },
    ],
    {
      schema: querySchema,
    }
  )
  return { query, ...payload.content }
}

// Step 2: Use the extracted parameters to build the retrieval query
async function getRecipes(parsedQuery) {
  const { data: { payload, generate } } = await u.get
    .collection('RecipeCollection')
    // Select fields needed for the final answer
    .select('title', 'text', 'ingredients', 'cuisineTypes', 'totalTimeMinutes')
    // a) Semantic search: use concepts if available; fallback to full query
    .search.about(
      parsedQuery.concepts && parsedQuery.concepts.length > 0
        ? parsedQuery.concepts
        : parsedQuery.query
    )
    // b) Limit to top 3 results
    .autocut(3)
    // c) Apply combined filters
    .where(({ And, ContainsAll, NotEqual, LessThanEqual }) =>
      And(
        ...(parsedQuery.includeIngredients
          ? [{ ingredients: ContainsAll(parsedQuery.includeIngredients) }]
          : []),
        ...(parsedQuery.excludeIngredients
          ? parsedQuery.excludeIngredients.map(ing => ({
              ingredients: NotEqual(ing),
            }))
          : []),
        ...(parsedQuery.includeCuisineTypes
          ? [{ cuisineTypes: ContainsAll(parsedQuery.includeCuisineTypes) }]
          : []),
        ...(parsedQuery.totalTimeMinutesMax
          ? [{ totalTimeMinutes: LessThanEqual(parsedQuery.totalTimeMinutesMax + 0.0001) }]
          : [])
      )
    )
    // d) Generate the final answer using the retrieved results
    .generate.fromMany({
      messages: [
        {
          role: 'system',
          content: "Generate a helpful response to the user's query. RESULTS: {$}",
        },
        {
          role: 'user',
          content: parsedQuery.query || '',
        },
      ],
    })
    .exec()

  return { payload, generate }
}

// Step 3: Full API handler (e.g., in Next.js)
export default async function handler(req, res) {
  const userQuery = req.body.query
  // 1. Parse the user’s query
  const parsed = await parseQuery(userQuery)
  // 2. Retrieve documents and generate the answer
  const { payload, generate } = await getRecipes(parsed)

  res.status(200).json({
    query: parsed,
    answer: marked.parse(generate.result || ''),
    recipes: payload.map(recipe => ({
      ...recipe,
      text: marked.parse(recipe.text || ''),
    })),
  })
}
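The handler above trusts that `req.body.query` is a usable string. A small guard like the following could run first; `extractQuery` is a hypothetical helper and not part of Next.js or Unbody.

```typescript
// Hypothetical guard for the request body.
// Returns the trimmed query, or null if the body has no usable query string.
function extractQuery(body: unknown): string | null {
  if (typeof body !== 'object' || body === null) return null
  const query = (body as { query?: unknown }).query
  if (typeof query !== 'string') return null
  const trimmed = query.trim()
  return trimmed.length > 0 ? trimmed : null
}
```

In the handler, a `null` result would be a reasonable point to respond with a 400 status before calling `parseQuery`, which avoids spending a model call on an empty request.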

Summary of the Flow:

  1. Parsing:

    The user’s free-form query is processed by parseQuery(), which returns a structured object with parameters like concepts, include/exclude ingredients, cuisine types, and time constraints.

  2. Retrieval:

    With these parameters, we build our search query:

    • Semantic Search: .search.about(...) focuses on the extracted concepts.

    • Filtering: .where(...) applies each filter conditionally using the And operator.

    • Limiting: .autocut(3) keeps our result set manageable.

  3. Generation:

    Finally, .generate.fromMany(...) produces a coherent answer that integrates the retrieved documents.

Watch the Demo

Check out the full demo of this Agentic RAG system in action on YouTube:

Get the Source Code

Interested in digging into the code? You can find the complete source code on GitHub:

https://github.com/unbody-io/example-next.js/tree/recipes

Final Thoughts

In wrapping up, remember that by breaking down a complex query into its essential parts, you empower your application to truly understand your users. With Unbody’s Agentic RAG approach, you’re not just matching keywords—you’re capturing intention, filtering precisely, and generating meaningful, context-aware responses.

Ready to build smarter, AI-native applications?

  • Dive In: Experiment with the sample code and start tailoring your own Agentic RAG pipeline.

  • Explore: Check out the Unbody documentation for further insights and advanced use cases.

  • Join the Conversation: We’d love to see what you create! Share your projects, ideas, or questions on our community forums or social channels.

  • Get Involved: If you’re passionate about AI-native development, consider joining our open-source community and help shape the future of intelligent applications.

Happy building, and let’s push the boundaries of what our applications can do together!