As experienced Databricks users, we have prepared a curated list of resources to help you navigate different aspects of MLOps, with a particular focus on Databricks. The resources are categorized by topic. Whether you’re just getting started with Databricks or looking for advanced features for your MLOps implementation, this guide should come in handy.
1. MLOps principles and components
Understanding the core principles and components of MLOps is a must for successfully deploying and maintaining machine learning projects. Below are some essential resources to get you started.
Resources by Marvelous MLOps:
MLOps maturity assessment — a checklist for ML models before they go to production
The minimum set of must-haves for MLOps — understand the minimal MLOps setup needed to deploy ML models to production
The Ultimate Must-Haves and Nice-to-Haves for MLOps & LLMOps — extended version of MLOps Toolbelt with additional LLM components
MLOps Roadmap — A comprehensive guide that takes you from beginner to expert in MLOps, covering everything from foundational machine learning principles and programming skills to advanced operational practices.
Other recommended resources:
MLOps Org by INNOQ — A collection of a wide range of articles. Perfect for staying updated on the latest in MLOps.
5 Levels of MLOps maturity — Breaks the journey of MLOps maturity into five levels, helping you understand where you stand and what you need to advance.
Our MLOps Maturity Assessment is created with inspiration from the approaches developed by both Google and Microsoft.
2. Developing on Databricks
Resources by Marvelous MLOps:
Handy Databricks Features for Development — start developing your ML code like a pro already on day one with these cool features from Databricks.
Bridging the Gap: Converting Data Science Notebooks into Production ML Code — Notebooks are not designed for production deployment and can be difficult to maintain. This article shows you how to convert your notebook into production-ready code.
How to debug ML deployments 20x faster — deploying MLflow Model serving endpoints locally
Other recommended resources:
Advancing Spark — Local Development with Databricks — A nice demo for learning Databricks Connect (also mentioned in our article).
Bridging the Production Gap: Develop and Deploy Code Easily — A nice demo from Databricks to increase productivity by integrating IDEs with Databricks, utilizing tools like code linters, AI assistants, and CI/CD integrations.
3. Databricks asset bundles
Databricks Asset Bundles enable the adoption of software engineering best practices like source control, code review, and CI/CD for data and AI projects. They describe Databricks resources as source files, which streamlines project structure, testing, and deployment and makes collaboration easier. They are also a good fit for ML model deployments.
Resources by Marvelous MLOps:
Getting started with Databricks Asset Bundles — Easy way to deploy Databricks workflows and manage dependencies
Dealing with private packages — Different ways to manage private packages using asset bundles
Other recommended resources:
4. MLflow experiment tracking & registering models in Unity Catalog
MLflow is one of the most popular tools for model registry and experiment tracking. As an open-source platform, it integrates easily with different tools and platforms. We highly recommend learning and practicing with MLflow to gain hands-on experience in model tracking.
Resources by Marvelous MLOps:
Find your way to MLflow without confusion: MLflow has extensive support and a lot of options. This article is beginner-friendly and focuses on the fundamentals to help you get started.
Lessons learned from migrating models to Unity Catalog: Changes introduced by Unity Catalog, plus some tips.
MLflow Cheatsheets: Handy reference sheets for quickly finding key information on MLflow. Great for quick consultations during development.
Other recommended resources:
What Exactly Is a Model in MLflow: Clarifies the concept of a model within the MLflow ecosystem, helping you understand its various components.
Dive into Databricks Model Deployment -1: A nice article explaining a cyclical six-step process that guides the development, deployment, and ongoing maintenance of models to ensure their effectiveness.
5. Model serving architectures
Resources by Marvelous MLOps:
Model Serving Architectures on Databricks: An overview of various model serving architectures available in Databricks. A good starting point for understanding your options and choosing what’s best for your use case.
Getting Started with Databricks Feature Store: Focuses on the Databricks Feature Store, explaining how to use it for model serving.
Going Serverless with Databricks — Part 1, Going Serverless with Databricks — Part 2: An introduction to the serverless model endpoints in Databricks, highlighting the benefits and how to get started. Includes examples of custom model deployment. Perfect for those exploring serverless architectures.
Other resources:
6. Inference tables and lakehouse monitoring
Drifting Away: Testing ML Models in Production | Databricks Lakehouse Monitoring
Learn How to Reliably Monitor Your Data and Model Quality: Shows how to monitor and manage ML models to maintain their performance.
Resources covering end-to-end MLOps on Databricks
There are not many resources covering end-to-end MLOps on Databricks. Here are some we recommend:
Big Book of MLOps by Databricks: A comprehensive guide that covers everything from the basics of MLOps to advanced concepts, tailored for the Databricks environment.
Course announcement
Courses on the topic (except private training) barely exist. Materials provided by Databricks Academy only cover the basics and are notebook-heavy. This is what inspired us to start writing about Databricks in the first place.
Now we are proud to announce that we have created our End-to-end MLOps with Databricks course, packed with condensed knowledge that comes from many years of experience with MLOps and Databricks. Use code MARVELOUS for a 100-euro discount.
Want to learn more about the course? Check out an article with the learnings from one of our previous students.