Migrating from Azure Synapse Spark to Fabric
Before you begin your migration, you should verify that Fabric Data Engineering is the best solution for your workload. Fabric Data Engineering supports lakehouse, notebook, environment, Spark job definition (SJD) and data pipeline items, including different runtime and Spark capabilities support.
Key considerations
The initial step in crafting a migration strategy is to assess suitability. It's worth noting that certain Fabric features related to Spark are currently in development or planning. For more details and updates, visit the Fabric roadmap.
For Spark, see a detailed comparison differences between Azure Synapse Spark and Fabric.
Migration scenarios
If you determine that Fabric Data Engineering is the right choice for migrating your existing Spark workloads, the migration process can involve multiple scenarios and phases:
- Items: Items migration involves the transfer of one or various items from your existing Azure Synapse workspace to Fabric. Learn more about migrating Spark pools, Spark configurations, Spark libraries, notebooks, and Spark job definition.
- Data and pipelines: Using OneLake shortcuts, you can make ADLS Gen2 data (linked to an Azure Synapse workspace) available in Fabric lakehouse. Pipeline migration involves moving existing data pipelines to Fabric, including notebook and Spark job definition pipeline activities. Learn more about data and pipelines migration.
- Metadata: Metadata migration involves moving Spark catalog metadata (databases, tables, and partitions) from an existing Hive MetaStore (HMS) in Azure Synapse to Fabric lakehouse. Learn more about HMS metadata migration.
- Workspace: Users can migrate an existing Azure Synapse workspace by creating a new workspace in Microsoft Fabric, including metadata. Workspace migration isn't covered in this guidance, assumption is that users need to create a new workspace or have an existing Fabric workspace. Learn more about workspace roles in Fabric.
Transitioning from Azure Synapse Spark to Fabric Spark requires a deep understanding of your current architecture and the differences between Azure Synapse Spark and Fabric. The first crucial step is an assessment, followed by the creation of a detailed migration plan. This plan can be customized to match your system's unique traits, phase dependencies, and workload complexities.
Related content
- Fabric vs. Azure Synapse Spark
- Learn more about migration options for Spark pools, configurations, libraries, notebooks and Spark job definition
- Migrate data and pipelines
- Migrate Hive Metastore metadata
Feedback
https://aka.ms/ContentUserFeedback.
Coming soon: Throughout 2024 we will be phasing out GitHub Issues as the feedback mechanism for content and replacing it with a new feedback system. For more information see:Submit and view feedback for