GenAI is here to stay, and everyone wants to benefit from the advantages it brings. That's why it's important to give all users convenient tools and start moving towards the Citizen GenAI model. Unsurprisingly, Databricks is once again a step ahead of the rest here.
AI Playground is a new feature, currently in Public Preview. It lets you use LLMs and GenAI applications without writing a single line of code, which makes it an extremely useful tool: it serves multiple use cases for different user groups at once. Best of all, you can run several applications at the same time and compare speed, quality, and response relevance. To support further development, you can export the active agent to your own notebook and continue from there; Databricks provides an excellent default foundation for agents, leveraging LangChain in the background.
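To make that concrete before the walkthrough, here is a minimal sketch of picking up an exported agent in a notebook. It assumes the databricks-langchain package; the endpoint name is the default pay-per-token Llama endpoint, and the prompts are invented for illustration.

```python
# Minimal sketch: continuing from a Playground-exported agent in a notebook.
# Assumes `pip install databricks-langchain`; the endpoint name below is the
# default pay-per-token Llama endpoint and may differ in your workspace.
from databricks_langchain import ChatDatabricks

llm = ChatDatabricks(
    endpoint="databricks-meta-llama-3-3-70b-instruct",
    temperature=0.1,
    max_tokens=512,
)

# The system prompt you defined in the Playground travels along as a message.
messages = [
    ("system", "You are a concise data engineering assistant."),
    ("human", "Explain Unity Catalog in two sentences."),
]
print(llm.invoke(messages).content)
```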
Start using Databricks AI Playground
Let's walk through it step by step. The Playground lives in your Databricks workspace, in the left sidebar under Machine Learning. The first step is to choose the right endpoint; the available options are Databricks-hosted models, external models, and custom agents. If you haven't created anything before, don't worry: Databricks provides the Llama 3.3 70B Instruct model by default, and its pay-per-token pricing is quite reasonable at 14.286 DBUs per 1M input tokens and 42.857 DBUs per 1M output tokens. As a quick sanity check, a request with 10,000 input tokens and 2,000 output tokens costs about 0.14 + 0.09 ≈ 0.23 DBUs.

The next step is to add a system prompt, which should be familiar to everyone at this stage. You can also add tools, but only those registered in Unity Catalog. Creating new tools is straightforward, but keep in mind that they are defined in SQL: if you want a Python tool, you wrap the Python body inside a SQL function definition and save it in Unity Catalog, as in the first sketch below. And let's not forget evaluation: Databricks offers a built-in AI judge evaluation, which is quite handy (second sketch below).
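For example, a hypothetical Fahrenheit-to-Celsius tool could be registered like this. This is a sketch run from a Databricks notebook (where `spark` is predefined); the catalog and schema names `main.default` and the function itself are illustrative placeholders.

```python
# Sketch: wrapping a Python body inside a SQL function definition so it can be
# registered in Unity Catalog and attached as a Playground tool.
spark.sql("""
CREATE OR REPLACE FUNCTION main.default.fahrenheit_to_celsius(f DOUBLE)
RETURNS DOUBLE
LANGUAGE PYTHON
COMMENT 'Converts Fahrenheit to Celsius'
AS $$
return (f - 32) * 5.0 / 9.0
$$
""")
```

Once created, the function appears in Unity Catalog and can be ticked in the Playground's tool picker.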
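The AI judge can also be driven from code. Here is a rough sketch, assuming the databricks-agents package and MLflow's Agent Evaluation integration; the column names follow that API, and the example rows are made up.

```python
# Rough sketch: scoring canned request/response pairs with the built-in
# AI judges via MLflow Agent Evaluation (assumes `pip install databricks-agents`).
import mlflow
import pandas as pd

eval_data = pd.DataFrame({
    "request": ["What is Unity Catalog?"],
    "response": ["Unity Catalog is Databricks' unified governance layer."],
    "expected_response": ["A governance solution for data and AI on Databricks."],
})

with mlflow.start_run():
    results = mlflow.evaluate(
        data=eval_data,
        model_type="databricks-agent",  # turns on the LLM-judge metrics
    )
    print(results.metrics)
```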
Multitasking with applications
Now that you've seen how things work in AI Playground, it's time to spice things up a bit. Typically, when you have an agent or a RAG solution, you want to evaluate it against other applications as well: does RAG actually improve the results, and which LLM performs best for the given problem? To make this easy, you can select multiple models, run them in parallel, and compare the results. This is truly amazing: you can choose your own agent, a plain LLM, and perhaps another agent, run them simultaneously, and start analyzing the results. The plain LLM acts as a benchmark, letting you assess the added value of the agent over traditional prompt queries against a pure LLM. At the same time, you can experiment with multiple models to find the fastest and highest-quality one for the given problem. For example, you can clone your agent into five versions, each backed by a different LLM, and compare the results side by side, as in the sketch below.
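Outside the UI, the same side-by-side idea is easy to reproduce in a notebook. A sketch using MLflow's deployments client follows; the endpoint names are examples, so substitute whatever you see under Serving in your workspace.

```python
# Sketch: sending one prompt to several endpoints in parallel and comparing
# latency and answers. Endpoint names are examples; replace with your own.
import time
from concurrent.futures import ThreadPoolExecutor
import mlflow.deployments

client = mlflow.deployments.get_deploy_client("databricks")
prompt = {
    "messages": [{"role": "user", "content": "Summarize RAG in one sentence."}],
    "max_tokens": 128,
}

endpoints = [
    "databricks-meta-llama-3-3-70b-instruct",
    "databricks-meta-llama-3-1-8b-instruct",
]

def ask(endpoint):
    start = time.perf_counter()
    response = client.predict(endpoint=endpoint, inputs=prompt)
    elapsed = time.perf_counter() - start
    return endpoint, elapsed, response["choices"][0]["message"]["content"]

with ThreadPoolExecutor() as pool:
    for name, seconds, answer in pool.map(ask, endpoints):
        print(f"{name} ({seconds:.1f}s): {answer}")
```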
Simple, yet so effective
Databricks AI Playground is a very easy-to-use and powerful tool. It’s so user-friendly that I can hardly think of anything else to add. It allows you to quickly run tests, empower business users to dive into the world of GenAI, and leverage agents in practice. And best of all, once again, all of this happens within the same ecosystem that houses your data and governance.
Ikidata is a pioneer in GenAI Agent automation, providing deep insights into this emerging technology from technical, architectural, and business perspectives. We make it simple to bring your ideas to life.

-𝐾𝑟𝑎𝑡𝑡𝑖