Keep it simple. Why make things harder than they need to be?
It’s the golden rule in Data Architecture — and GenAI Agent Architecture is no exception. Every additional component adds complexity, increases risk, and creates potential failure points.
Take Databricks, for example. Its biggest challenge? Underutilization. Too often, it’s seen as just another data platform. In reality, however, it’s a comprehensive data intelligence ecosystem — providing everything you need for GenAI Agent development. In this article, we break it down and explain why that’s the case.

Infrastructure

A solid infrastructure is the backbone of any architecture, providing both robustness and reusability. Databricks offers excellent Terraform support, letting you deploy almost every component through it. Databricks Asset Bundles (DAB) are evolving, but they currently focus primarily on ETL workloads, with GenAI functionality still out of scope, which is why they aren't covered further here. Another standout feature is the platform's REST API, backed by comprehensive documentation. Since both Terraform and DAB are built on this REST API, you can also develop your own advanced automation for the infrastructure, which is a possibility I find especially exciting.
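To make that concrete, here is a minimal sketch of the kind of automation the REST API enables, using plain Python and the documented serving-endpoints API. The environment variable names and the printed fields are assumptions for illustration; the official Databricks SDK or Terraform would get you the same result.

```python
# A minimal sketch of REST-based automation against a Databricks workspace.
# Host/token variable names are illustrative; authenticate however you prefer.
import os
import requests

DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]    # e.g. https://adb-123.azuredatabricks.net
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]  # personal access token

def list_serving_endpoints() -> list[dict]:
    """List model serving endpoints via the Databricks REST API."""
    response = requests.get(
        f"{DATABRICKS_HOST}/api/2.0/serving-endpoints",
        headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json().get("endpoints", [])

if __name__ == "__main__":
    for endpoint in list_serving_endpoints():
        print(endpoint["name"], endpoint.get("state", {}))
```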
Data & Code

Databricks initially emerged as a data platform, introducing the data lakehouse architecture to the mainstream. Instead of juggling countless integrations between disparate systems, it centralizes data in one location—enabling you to build data products, streamline governance, and cater to various data consumption needs. This approach dramatically simplified the architecture while significantly boosting robustness. As anyone working with data knows, everything starts with the data you have. The timeless adage "garbage in, garbage out" remains as relevant in the world of GenAI Agents as in any data-driven process. For this reason, implementations should remain as close to the data as possible to minimize transfer costs, reduce migration efforts, and lower the risk of data corruption. Databricks offers native access to business data, making its utilization almost too easy.
In addition, with just a few lines of code you can create vector search online tables from existing Delta tables, making them immediately available for RAG (retrieval augmented generation) applications. Embeddings can be generated through Model Serving with external models or by using Databricks' self-hosted solution. Setting this up takes only about 5 minutes plus a brief deployment wait, effectively eliminating the need for an army of consultants.
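As a rough sketch of what those few lines look like, assuming the databricks-vectorsearch Python SDK: the endpoint, table, index, and embedding-model names below are placeholders you would replace with your own.

```python
# A minimal sketch using the databricks-vectorsearch SDK; all names below
# (endpoint, catalog.schema objects, embedding model) are placeholders.
from databricks.vector_search.client import VectorSearchClient

client = VectorSearchClient()

# One-off: create a vector search endpoint to host the index.
client.create_endpoint(name="genai-vs-endpoint", endpoint_type="STANDARD")

# Create a Delta Sync index on top of an existing Delta table; embeddings are
# generated through a model serving endpoint.
index = client.create_delta_sync_index(
    endpoint_name="genai-vs-endpoint",
    index_name="main.genai.docs_index",
    source_table_name="main.genai.docs",
    pipeline_type="TRIGGERED",
    primary_key="id",
    embedding_source_column="text",
    embedding_model_endpoint_name="databricks-gte-large-en",
)

# Query the index, e.g. as the retrieval step of a RAG application.
hits = index.similarity_search(
    query_text="How do I return an item?",
    columns=["id", "text"],
    num_results=3,
)
```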
Creating and maintaining tools is seamless as well. Databricks provides Unity Catalog support for SQL/Python UDF tools, making them accessible through LangChain's community integrations. Personally, however, I prefer to develop my own tools and store their configurations in YAML files, with the Python code kept in a dedicated repository. There are several variations available depending on your requirements and maturity level, but the key point is that you have plenty of options.
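To illustrate the YAML-plus-repository pattern (my own convention, not a Databricks API), a stripped-down sketch could look like this; the file name, keys, and the example tool are hypothetical.

```python
# A hypothetical sketch of YAML-driven tools; tools.yaml might contain:
#   tools:
#     - name: lookup_order_status
#       description: Look up the delivery status of an order by its id.
#       implementation: lookup_order_status
import yaml
from langchain_core.tools import StructuredTool

def lookup_order_status(order_id: str) -> str:
    """Toy implementation; in practice this would query business data."""
    return f"Order {order_id}: shipped"

# Map implementation names from the YAML file to Python functions in the repo.
TOOL_IMPLEMENTATIONS = {"lookup_order_status": lookup_order_status}

def load_tools(config_path: str = "tools.yaml") -> list[StructuredTool]:
    """Build LangChain tools from a YAML configuration file."""
    with open(config_path) as f:
        config = yaml.safe_load(f)
    return [
        StructuredTool.from_function(
            func=TOOL_IMPLEMENTATIONS[entry["implementation"]],
            name=entry["name"],
            description=entry["description"],
        )
        for entry in config["tools"]
    ]
```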
LLM Models

Databricks acquired MosaicML in 2023 for roughly $1.3 billion, sparking questions about what the platform's future functionality would look like. Those features are now being revealed one by one under the Mosaic AI brand, most notably the seamless integration of LLM models into the ecosystem. With Mosaic AI, you can leverage external LLM models (such as Azure OpenAI, Anthropic, and others), use Databricks-hosted models, or host your own. Model training is also supported, emerging as a viable option for addressing specific needs with smaller models.
What truly stands out in day-to-day use, however, is ease of use and governance. Mosaic AI Gateway was launched to address exactly this, offering a simple and effective way to manage the essential operational aspects of LLM models, such as guardrails, token usage limits, and permissions. You can learn more in our earlier article: Simplifying GenAI Architecture with Databricks Mosaic AI Gateway.
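As an example of how uniform the calling side becomes, here is a hedged sketch using MLflow's deployments client; the endpoint name is an assumption, and the same code works whether the endpoint fronts an external model or a Databricks-hosted one.

```python
# Query a chat model behind a Databricks serving endpoint via MLflow.
# "my-chat-endpoint" is a placeholder endpoint name.
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")

response = client.predict(
    endpoint="my-chat-endpoint",
    inputs={
        "messages": [{"role": "user", "content": "Summarise yesterday's sales."}],
        "max_tokens": 256,
    },
)
print(response)
```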
Development

Now that you have all the essential components for creating an agent — access to data/RAG, Git integration, access to LLM models and the necessary tools — you can begin development. Some developers prefer writing code outside of Databricks using another IDE like Visual Studio Code, which is made possible with Databricks Connect. However, coding directly within the UI is often more straightforward and even offers AI assistants to streamline your work. When building agents, you have two primary options: develop your own framework or utilize pre-built libraries like LangChain or LlamaIndex. Each approach offers its own advantages and challenges, and it’s crucial to understand the underlying logic of how everything works. Keep in mind that relying on external libraries means depending on their developers for updates and support. With technology evolving rapidly — bringing both new opportunities and inherent limitations — it’s important to carefully assess your skillset and available resources to choose the best approach.
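For orientation, here is a minimal LangChain-based sketch that wires a Databricks-served model to the tools from earlier. The endpoint name and prompt are assumptions, and depending on your LangChain version, ChatDatabricks may live in langchain_community or in the databricks-langchain package.

```python
# A minimal tool-calling agent sketch; "my-chat-endpoint" is illustrative, and
# load_tools() refers to the YAML-driven tools sketched earlier.
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_community.chat_models import ChatDatabricks
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

llm = ChatDatabricks(endpoint="my-chat-endpoint", temperature=0.1)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant for internal business data."),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

tools = load_tools()
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

print(executor.invoke({"input": "What is the status of order 42?"})["output"])
```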
Databricks offers a tracing feature via MLflow that simplifies troubleshooting by displaying the entire process chain at once: MLflow Tracing for LLM Observability. This approach is highly recommended when starting out, but as your expertise grows, you may opt to develop your own logging solutions. That said, GenAI Agents are relatively straightforward to build — it's not like constructing a rocket. By leveraging the best practices of Databricks Mosaic AI, you can quickly gain momentum: Mosaic AI Agent Framework | Databricks
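In practice, enabling tracing is often a one-liner. The sketch below assumes a LangChain-based agent and a recent MLflow version with the tracing APIs; the custom span is purely illustrative.

```python
import mlflow

# Capture LangChain calls (LLM requests, tool invocations) as traces in MLflow.
mlflow.langchain.autolog()

# Custom code paths can be traced explicitly with the decorator; the function
# below is a placeholder for your own retrieval step.
@mlflow.trace(name="retrieve_context")
def retrieve_context(question: str) -> list[str]:
    return ["relevant chunk 1", "relevant chunk 2"]
```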
Monitoring

Now that you've built your first GenAI Agent, you naturally want to see how it performs. Databricks system tables offer high-level token usage monitoring, allowing you to track endpoint activity and manage costs effectively. Additionally, you can enable inference tables for the LLM model directly from the service endpoint, providing out-of-the-box logging of both queries and responses. This handy feature lets you monitor your agent's behavior and analyze its performance retrospectively.
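As a quick example, the logged payloads can be inspected with a few lines in a notebook; the table name below is whatever you configured on the endpoint (a placeholder here), and the column names follow the documented inference table schema at the time of writing.

```python
# Runs in a Databricks notebook, where spark and display are available.
# "main.genai.agent_endpoint_payload" is a placeholder inference table name.
logs = spark.sql("""
    SELECT timestamp_ms, status_code, request, response
    FROM main.genai.agent_endpoint_payload
    ORDER BY timestamp_ms DESC
    LIMIT 20
""")
display(logs)
```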
In addition, Databricks offers Agent Evaluation capabilities: What is Mosaic AI Agent Evaluation? - Azure Databricks | Microsoft Learn. Currently, these features primarily focus on chatbot functionalities, but they're rapidly evolving toward enhanced process monitoring. This provides a quick and efficient way to assess quality. Once all the data is available, it's advisable to build clear AI/BI dashboards for continuous tracking, and you can also enable GenAI to assist with the analysis.
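To give a hedged sketch of what an evaluation run can look like: the evaluation-set columns and the "databricks-agent" model type follow the documentation linked above, while the model URI and the questions are made up.

```python
import mlflow
import pandas as pd

# A tiny evaluation set; "request" / "expected_response" follow the Agent
# Evaluation schema, the content itself is illustrative.
eval_df = pd.DataFrame([
    {"request": "What is the status of order 42?",
     "expected_response": "Order 42 has been shipped."},
])

# The Unity Catalog model URI is a placeholder for your registered agent.
results = mlflow.evaluate(
    data=eval_df,
    model="models:/main.genai.order_agent/1",
    model_type="databricks-agent",
)
print(results.metrics)
```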
Usage (dev)

Before moving to production, it's essential to test your agent's functionality and reliability. The Playground is an excellent option for this purpose. I'll be writing a dedicated article on it soon, but in short, it's a built-in Databricks environment for trying out and comparing models. You can pick individual LLM models, ready-made agents, and even standalone tools from Unity Catalog. It's designed to be extremely user-friendly, so you can involve end users in testing the agents at this stage. Best of all, you can run multiple agents and LLM models side by side in a clean environment, benchmarking LLM models against each other and seeing how your own agent implementations compare.
In addition, traditional notebook testing works exceptionally well — especially when integrated with MLflow. This approach allows you to thoroughly explore the functionality of every component of your agent, with all metadata automatically logged. You can effortlessly switch experiment locations with just a single code snippet and add tags to streamline both analysis and monitoring.
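For example, switching the experiment location and tagging runs takes only a couple of lines; the experiment path, the tags, and the executor object (from the agent sketch earlier) are assumptions.

```python
import mlflow

# Point runs at a shared experiment location (the path is a placeholder).
mlflow.set_experiment("/Shared/genai-agent-experiments")

with mlflow.start_run(run_name="agent-v2-smoke-test"):
    mlflow.set_tag("agent_version", "v2")
    mlflow.set_tag("environment", "dev")
    answer = executor.invoke({"input": "What is the status of order 42?"})["output"]
    mlflow.log_param("question", "What is the status of order 42?")
    mlflow.log_text(answer, "answer.txt")
```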
Usage (prod)

Once testing is complete and everything looks ready, it’s time to move to production. There are several options available, and here we’ll cover the most common ones. Databricks offers a straightforward way to deploy your agent as its own model endpoint via agent deployment. This setup lets you interact with your agent just like any other REST API. While this method is highly efficient and effective, one drawback is that the agent runs continuously. If you don't need it active 24/7, you'll end up incurring unnecessary costs.
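For reference, the deployment itself is a short call with the databricks-agents package; this sketch assumes the agent has already been logged with MLflow and registered to Unity Catalog, and the model name and version are placeholders.

```python
# Deploy a Unity Catalog-registered agent as its own serving endpoint.
# "main.genai.order_agent" and version 1 are placeholders.
from databricks import agents

deployment = agents.deploy("main.genai.order_agent", 1)
print(deployment)  # deployment details, including the endpoint used to call the agent
```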
For one-off runs, I recommend using a traditional workflow approach. With a dedicated workflow for your task or process, you can parameterize the agent or import it interactively from a notebook. While it's possible to run it locally on your own machine, this method isn't ideal from a monitoring and management perspective. For interactive use, Databricks Apps is highly recommended — especially when end users are business-oriented. This approach enables you to build a sleek user interface that reflects your company's branding, creating the feel of a full-fledged application. Learn more about Databricks Apps in this article: Databricks Apps - Revolutionizing User-Friendly Internal Tools.
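As a small example of the workflow route, a notebook task can take the question as a job parameter and hand the agent's answer back to the job; the widget name and the executor object are assumptions, and dbutils is only available inside Databricks notebooks.

```python
# Runs inside a Databricks notebook task; "question" is passed as a job parameter.
dbutils.widgets.text("question", "")
question = dbutils.widgets.get("question")

# executor is the agent built earlier in the notebook.
result = executor.invoke({"input": question})["output"]

# Return the answer to the calling job/task for downstream steps.
dbutils.notebook.exit(result)
```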
Databricks GenAI Agent Architecture: it just works
While we've only scratched the surface of each component, the aim was to show how seamlessly the entire system works together. I've seen far too many GenAI agent solutions built as makeshift proofs-of-concept, only for their builders to wonder later why quality control and business integration fail. Naturally, these never make it to production and end up wasting resources. That's why it's essential to choose the right ecosystem: one where every part of the process supports a cohesive, efficient whole and successful outcomes. Agents can offer remarkable automation capabilities, but they aren't a cure-all. A rapid iteration cycle, from concept to testing, helps validate promising automation targets quickly and justifies the development investment. GenAI agent architecture in Databricks: why make things harder than they need to be?
Ikidata is a pioneer in GenAI Agent automation, providing deep insights into this emerging technology from technical, architectural, and business perspectives. We make it simple to bring your ideas to life.

Written by Aarni Sillanpää
Simplicity is the secret sauce