Databricks data engineering
Databricks data engineering features provide a robust environment for collaboration among data scientists, data engineers, and data analysts. Data engineering tasks are also the backbone of Databricks machine learning solutions.
Note
If you are a data analyst who works primarily with SQL queries and BI tools, you might prefer Databricks SQL.
Name | Use this when you want to… |
---|---|
Delta Live Tables | Learn how to build data pipelines for ingestion and transformation with Databricks Delta Live Tables. |
Structured Streaming | Learn about streaming, incremental, and real-time workloads powered by Structured Streaming on Databricks. |
Apache Spark | Learn how Apache Spark works on the Databricks platform. |
Notebooks | Learn what a Databricks notebook is, and how to use and manage notebooks to process, analyze, and visualize your data. |
Workflows | Learn how to orchestrate data processing, machine learning, and data analysis workflows on the Databricks platform. |
Libraries | Learn how to make third-party or custom code available in Databricks using libraries. Learn about the different modes for installing libraries on Databricks. |
Git folders | Learn how to use Git to version control your notebooks and other files for development in Databricks. |
DBFS | Learn about Databricks File System (DBFS), a distributed file system mounted into a Databricks workspace and available on Databricks clusters. |
Files | Learn about options for working with files on Databricks. |
Migration | Learn how to migrate data applications such as ETL jobs, enterprise data warehouses, ML, data science, and analytics to Databricks. |
Optimization & performance | Learn about optimizations and performance recommendations on Databricks. |