How to migrate Azure SQL tables to Databricks dynamically?
During an Azure SQL to Databricks migration, I needed a way to migrate multiple tables across multiple databases, as defined (TableName, DBName) in a configuration table.
Configuration table structure and sample data:

DBName  SchemaName  SqlTableName  SqlServerName  DatabricksDBName
DB1     dbo         t1            serverUrl1     Cp_DB1
DB1     dbo         t2            serverUrl2     Cp_DB1
DB2     dbo         t1            serverUrl3     Cp_DB2
DB3     dbo         t2            serverUrl1     Cp_DB2
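For reference, the configuration table can be defined in Azure SQL along these lines; the table name dbo.TableMigrationConfig and the column sizes are assumptions, not taken from the original setup.

-- Hypothetical DDL for the configuration table
create table dbo.TableMigrationConfig (
    DBName           varchar(128) not null,  -- source Azure SQL database
    SchemaName       varchar(128) not null,  -- source schema, e.g. dbo
    SqlTableName     varchar(128) not null,  -- source table
    SqlServerName    varchar(256) not null,  -- source SQL server URL
    DatabricksDBName varchar(128) not null   -- target Databricks (bronze) database
);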
So I needed to migrate these tables by reading data from the corresponding database and server, and load them into the Databricks bronze layer under the expected Databricks database name.
Solution
I created an ADF pipeline that loops through this table with a Lookup activity + ForEach activity and passes the server name, table name, database name, user, and password (from Key Vault) as parameters to a Databricks notebook inside the pipeline.
ADF pipeline -> Lookup activity -> ForEach activity -> scale up corresponding DB -> Notebook activity -> scale down corresponding DB.
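The Lookup activity simply reads all rows from the configuration table; a sketch of its source query is below (dbo.TableMigrationConfig is the hypothetical table name used above). Each ForEach iteration then maps fields such as @item().SqlServerName, @item().DBName, @item().SqlTableName and @item().DatabricksDBName to the notebook's base parameters, together with the user and password retrieved from Key Vault.

-- Lookup activity source query (sketch; table name is an assumption)
select DBName, SchemaName, SqlTableName, SqlServerName, DatabricksDBName
from dbo.TableMigrationConfig;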
Generic Databricks notebook:

%python
# Parameters passed in from the ADF Notebook activity (defaults are only for interactive testing)
dbutils.widgets.text(name = "DbName", defaultValue = "DB1")
dbutils.widgets.text(name = "dbTable", defaultValue = "dbo.t1")
dbutils.widgets.text(name = "Server", defaultValue = "serverUrl1")
dbutils.widgets.text(name = "DatabricksDbName", defaultValue = "DB1")
dbutils.widgets.text(name = "User", defaultValue = "user1")
dbutils.widgets.text(name = "Password", defaultValue = "xxx")

# Derive the target table name and the JDBC URL, and expose them as widgets
# so the SQL cells below can reference them via ${...} substitution.
DatabricksTableName = dbutils.widgets.get("dbTable").split(".")[1]
url = "jdbc:sqlserver://" + dbutils.widgets.get("Server") + ";DatabaseName=" + dbutils.widgets.get("DbName") + ";"
dbutils.widgets.text(name = "DatabricksTableName", defaultValue = DatabricksTableName)
dbutils.widgets.text(name = "Url", defaultValue = url)
-- Ensure the target Databricks database exists
create database if not exists ${DatabricksDbName};

-- (Re)create a temporary JDBC-backed table pointing at the source Azure SQL table
drop table if exists ${DatabricksDbName}.jdbc_InputTable;
create table ${DatabricksDbName}.jdbc_InputTable
using org.apache.spark.sql.jdbc
options (url '${Url}', dbtable '${dbTable}', user '${User}', password '${Password}');

-- Materialize the data as a Delta table in the bronze layer
drop table if exists ${DatabricksDbName}.${DatabricksTableName};
create table ${DatabricksDbName}.${DatabricksTableName}
using delta
as select * from ${DatabricksDbName}.jdbc_InputTable;

-- Clean up the temporary JDBC table
drop table if exists ${DatabricksDbName}.jdbc_InputTable;
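For example, for the first configuration row (DB1, dbo.t1, serverUrl1, Cp_DB1), the widget substitution would resolve the SQL to roughly the following; the server URL, user, and password are placeholders from the sample data, not real values.

-- Resolved statements for the first row (sketch with placeholder credentials)
create database if not exists Cp_DB1;
drop table if exists Cp_DB1.jdbc_InputTable;
create table Cp_DB1.jdbc_InputTable
using org.apache.spark.sql.jdbc
options (url 'jdbc:sqlserver://serverUrl1;DatabaseName=DB1;', dbtable 'dbo.t1', user 'user1', password 'xxx');
drop table if exists Cp_DB1.t1;
create table Cp_DB1.t1
using delta
as select * from Cp_DB1.jdbc_InputTable;
drop table if exists Cp_DB1.jdbc_InputTable;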
#Databricks #Azure #AzureSQL #AzureKeyVault