Fixing ADF Databricks Linked Service

Azure Data Factory (ADF) is effectively a programming model built upon Azure Resource Manager (ARM) templates. One way to automate ADF deployments is to use the ARM templates that ADF exports automatically whenever you click the Publish button.

You can use linked services to interface ADF with other services, such as Azure Databricks. Here is what the configuration for an Azure Databricks linked service looks like:

In the image above, several settings are likely to differ between the environments this linked service is deployed to. Moreover, Azure Databricks recently announced a move from generic regional URLs to per-workspace URLs, so the workspace URL is now one of them.

Unfortunately, by default, the “domain” (Databricks Workspace URL in the image above) attribute of an Azure Databricks linked service is not exposed in the ARM template parameters file. Therefore, it cannot be overridden easily at deployment time.

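Luckily, the ADF team opted to have a configuration file (arm-template-parameters-definition.json) determine which components of the ARM template are exposed to a parameters file, and which ones are not. You can read more about it in the ADF documentation on custom parameters for Resource Manager templates.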

Taking a look at the default configuration, one notices that the “domain” keyword is missing from the parameters list.

    "Microsoft.DataFactory/factories/linkedServices": {
        "*": {
            "properties": {
                "typeProperties": {
                    "accountName": "=",
                    "username": "=",
                    "userName": "=",
                    "accessKeyId": "=",
                    "servicePrincipalId": "=",
                    "userId": "=",
                    "clientId": "=",
                    "clusterUserName": "=",
                    "clusterSshUserName": "=",
                    "hostSubscriptionId": "=",
                    "clusterResourceGroup": "=",
                    "subscriptionId": "=",
                    "resourceGroupName": "=",
                    "tenant": "=",
                    "dataLakeStoreUri": "=",
                    "baseUrl": "=",
                    "database": "=",
                    "serviceEndpoint": "=",
                    "batchUri": "=",
                    "databaseName": "=",
                    "systemNumber": "=",
                    "server": "=",
                    "url": "=",
                    "aadResourceId": "=",
                    "connectionString": "|:-connectionString:secureString",
                    "existingClusterId": "=",
                    "host": "=",
                    "secretName": "="
                }
            }
        },

You can simply add "domain": "-" under typeProperties above to expose that property. For reference, the - means that the parameter is given no default value, which forces you to supply one at deployment time. This is likely what you want, since you are hopefully not reusing workspaces between environments.
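
As a rough illustration of that change, the linked services section of the configuration file might end up looking like the abridged sketch below. Only three of the existing entries are repeated here to keep it short; the rest of the default list is assumed to stay exactly as it was, with "domain" added as the final entry.

    {
        "Microsoft.DataFactory/factories/linkedServices": {
            "*": {
                "properties": {
                    "typeProperties": {
                        "existingClusterId": "=",
                        "host": "=",
                        "secretName": "=",
                        "domain": "-"
                    }
                }
            }
        }
    }

After the next publish, the generated parameters file should gain an entry for the linked service's domain, typically named along the lines of <linkedServiceName>_properties_typeProperties_domain, which a release pipeline can then set to the per-workspace URL for each environment.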

What is really nice is that the new Azure Data Factory Management Hub provides an editor to help you make these changes.

Hope that helps!
