Use scikit-learn on Azure Databricks
This page provides examples of how you can use the scikit-learn package to train machine learning models in Azure Databricks. scikit-learn is one of the most popular Python libraries for single-node machine learning and is included in Databricks Runtime and Databricks Runtime ML. See Databricks Runtime release notes for the scikit-learn library version included with your cluster’s runtime.
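Besides the release notes, one quick way to confirm which scikit-learn version a cluster is running is to print it from a notebook cell. The following is a minimal sketch that assumes only that scikit-learn is importable in the attached environment:

```python
import sklearn

# Print the scikit-learn version installed in the current Python environment.
print(sklearn.__version__)
```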
You can import these notebooks and run them in your Azure Databricks workspace.
For additional example notebooks to get started quickly on Azure Databricks, see Tutorials: Get started with ML.
Basic example using scikit-learn
This notebook provides a quick overview of machine learning model training on Azure Databricks. It uses the scikit-learn package to train a simple classification model. It also illustrates the use of MLflow to track the model development process, and Hyperopt to automate hyperparameter tuning.
If your workspace is enabled for Unity Catalog, use this version of the notebook:
scikit-learn classification notebook (Unity Catalog)
If your workspace is not enabled for Unity Catalog, use this version of the notebook:
scikit-learn classification notebook
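To give a sense of the training step before you open the notebook, the following is a minimal sketch of a scikit-learn classification workflow. It is not the notebook's code: the wine dataset, the random forest model, and the split settings are assumptions chosen for illustration, and the MLflow tracking and Hyperopt tuning steps are left out so that the example runs with scikit-learn alone.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small built-in dataset (an assumption; the notebook uses its own data).
X, y = load_wine(return_X_y=True)

# Hold out 25% of the rows for evaluation, keeping class proportions balanced.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Train a simple classifier (model choice is illustrative).
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate on the held-out rows.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```

In a Databricks notebook, one possible way to record a run like this is to call mlflow.autolog() before fitting the model. The notebooks listed above show the tracking and tuning steps in full.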
End-to-end example using scikit-learn on Azure Databricks
This notebook uses scikit-learn to illustrate a complete end-to-end example of loading data, model training, distributed hyperparameter tuning, and model inference. It also illustrates model lifecycle management using MLflow Model Registry to log and register your model.
If your workspace is enabled for Unity Catalog, use this version of the notebook:
Use scikit-learn with MLflow integration on Databricks (Unity Catalog)
If your workspace is not enabled for Unity Catalog, use this version of the notebook:
Use scikit-learn with MLflow integration on Databricks
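The stages this notebook covers (loading data, training, hyperparameter tuning, and inference) can be sketched on a single node as follows. This is an illustrative outline under stated assumptions, not the notebook's implementation: the breast cancer dataset, the logistic regression pipeline, and the parameter grid are hypothetical choices, and scikit-learn's GridSearchCV is swapped in as a single-node stand-in for the distributed tuning the notebook demonstrates. Logging and registering the model with MLflow is omitted.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# 1. Load data (a built-in dataset is assumed here for illustration).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# 2. Define the model as a pipeline so scaling is fit only on training folds.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# 3. Tune hyperparameters with cross-validated grid search
#    (a single-node stand-in for distributed tuning).
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# 4. Run inference with the best model found by the search.
best_model = search.best_estimator_
predictions = best_model.predict(X_test)
test_accuracy = best_model.score(X_test, y_test)

print("Best parameters:", search.best_params_)
print(f"Held-out accuracy: {test_accuracy:.3f}")
```

Wrapping the scaler and the classifier in one Pipeline means the fitted object can be used for inference as a single unit, which is also a convenient shape for a model that you later log and register.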