Quickstart: Create a new lake database leveraging database templates
This quick start gives you a complete sample scenario on how you can apply database templates to create a lake database, align data to your new model, and use the integrated experience to analyze the data.
Prerequisites
- At least Synapse User role permissions are required for exploring a lake database template from Gallery.
- Synapse Administrator, or Synapse Contributor permissions are required on the Azure Synapse workspace for creating a lake database.
- Storage Blob Data Contributor permissions are required on data lake when using create table From data lake option.
Create a lake database from database templates
Use the new database templates functionality to create a lake database that you can use to configure your data model for the database.
For our scenario we will use the Retail
database template and select the following entities:
- RetailProduct - A product is anything that can be offered to a market that might satisfy a need by potential customers. That product is the sum of all physical, psychological, symbolic, and service attributes associated with it.
- Transaction - The lowest level of executable work or customer activity. A transaction consists of one or more discrete events.
- TransactionLineItem - The components of a Transaction broken down by Product and Quantity, one per line item.
- Party - A party is an individual, organization, legal entity, social organization, or business unit of interest to the business.
- Customer - A customer is an individual or legal entity that has or has purchased a product or service.
- Channel - A channel is a means by which products or services are sold and/or distributed.
The easiest way to find entities is by using the search box above the different business areas that contain the tables.
Configure lake database
After you have created the database, make sure the storage account and the filepath is set to a location where you wish to store the data. The path will default to the primary storage account within Azure Synapse Analytics but can be changed to your needs.
To save your layout and make it available within Azure Synapse, Publish all changes. This step completes the setup of the lake database and makes it available to all components within Azure Synapse Analytics and outside.
Ingest data to lake database
To ingest data to the lake database, you can execute pipelines with code free data flow mappings, which have a Workspace DB connector to load data directly to the database table. You can also use the interactive Spark notebooks to ingest data to the lake database tables:
%%sql
INSERT INTO `retail_mil`.`customer` VALUES (1,date('2021-02-18'),1022,557,101,'Tailspin Toys (Head Office)','Waldemar Fisar',90410,466);
Query the data
After the lake database is created, there are different ways to query the data. Currently, SQL databases in serverless SQL pools are supported and automatically understand the newly created lake database format.
SELECT TOP (100) [ProductId]
,[ProductName]
,[ProductDescription]
,[ProductInternalName]
,[ItemSku]
,[PrimaryBrandId]
FROM [Retail_mil].[dbo].[RetailProduct]
The other way to access the data within Azure Synapse is to open a new Spark notebook and use the integrated experience there:
df = spark.sql("SELECT * FROM `Retail_mil`.`RetailProduct`")
df.show(10)
Train machine learning models
You can use lake database to train your machine learning models and score the data. For more details Train machine learning models
Next steps
Continue to explore the capabilities of the database designer using the links below.
Feedback
https://aka.ms/ContentUserFeedback.
Coming soon: Throughout 2024 we will be phasing out GitHub Issues as the feedback mechanism for content and replacing it with a new feedback system. For more information see:Submit and view feedback for