Resource prerequisites
This article details the resources required for getting started with HDInsight on AKS. It covers the necessary and the optional resources and how to create them.
Necessary resources
The following table depicts the necessary resources that are required for cluster creation based on the cluster types.
Workload | Managed Service Identity (MSI) | Storage | SQL Server - SQL Database | Key Vault |
---|---|---|---|---|
Trino | ✅ | |||
Flink | ✅ | ✅ | ||
Spark | ✅ | ✅ | ||
Trino, Flink, or Spark with Hive Metastore (HMS) | ✅ | ✅ | ✅ | ✅ |
Note
MSI is used as a security standard for authentication and authorization across resources, except SQL Database. The role assignment occurs prior to deployment to authorize MSI to storage and the secrets are stored in Key vault for SQL Database. Storage support is with ADLS Gen2, and is used as data store for the compute engines, and SQL Database is used for table management on Hive Metastore.
Optional resources
- Virtual Network (VNet) and Subnet: Create virtual network
- Log Analytics Workspace: Create Log Analytics workspace
Note
- VNet requires subnet without any existing route table associated with it.
- HDInsight on AKS allows you to bring your own VNet and Subnet, enabling you to customize your network requirements to suit the needs of your enterprise.
- Log Analytics workspace is optional and needs to be created ahead in case you would like to use Azure Monitor capabilities like Azure Log Analytics.
You can create the necessary resources in two ways:
Using ARM templates
The following ARM templates allow you to create the specified necessary resources, in one click using a resource prefix and more details as required.
For example, if you provide resource prefix as “demo” then, following resources are created in your resource group depending on the template you select -
- MSI is created with name as
demoMSI
. - Storage is created with name as
demostore
along with a container asdemocontainer
. - Key vault is created with name as
demoKeyVault
along with the secret provided as parameter in the template. - Azure SQL database is created with name as
demoSqlDB
along with SQL server with name asdemoSqlServer
.
Note
Using these ARM templates require a user to have permission to create new resources and assign roles to the resources in the subscription.
Using Azure portal
Create user-assigned managed identity (MSI)
A managed identity is an identity registered in Microsoft Entra ID (Microsoft Entra ID) whose credentials managed by Azure. With managed identities, you need not to register service principals in Microsoft Entra ID to maintain credentials such as certificates.
HDInsight on AKS relies on user-assigned MSI for communication among different components.
Create storage account – ADLS Gen 2
The storage account is used as the default location for cluster logs and other outputs. Enable hierarchical namespace during the storage account creation to use as ADLS Gen2 storage.
Assign a role: Assign “Storage Blob Data Owner” role to the user-assigned MSI created to this storage account.
Create a container: After creating the storage account, create a container in the storage account.
Note
Option to create a container during cluster creation is also available.
Create Azure SQL Database
Create an Azure SQL Database to be used as an external metastore during cluster creation or you can use an existing SQL Database. However, ensure the following properties are set.
Necessary properties to be enabled for SQL Server and SQL Database-
Note
- Currently, we support only Azure SQL Database as inbuilt metastore.
- Due to Hive limitation, "-" (hyphen) character in metastore database name is not supported.
- Azure SQL Database should be in the same region as your cluster.
- Option to create a SQL Database during cluster creation is also available. However, you need to refresh the cluster creation page to get the newly created database appear in the dropdown list.
Create Azure Key Vault
Key Vault allows you to store the SQL Server admin password set during SQL Database creation. HDInsight on AKS platform doesn’t deal with the credential directly. Hence, it's necessary to store your important credentials in the Key Vault.
Assign a role: Assign “Key Vault Secrets User” role to the user-assigned MSI created as part of necessary resources to this Key Vault.
Create a secret: This step allows you to keep your SQL Server admin password as a secret in Azure Key Vault. Add your password in the “Value” field while creating a secret.
Note
- Make sure to note the secret name, as this is required during cluster creation.
- You need to have a “Key Vault Administrator” role assigned to your identity or account to add a secret in the Key Vault using Azure portal. Navigate to the Key Vault and follow the steps on how to assign the role.
Feedback
https://aka.ms/ContentUserFeedback.
Coming soon: Throughout 2024 we will be phasing out GitHub Issues as the feedback mechanism for content and replacing it with a new feedback system. For more information see:Submit and view feedback for