Power Platform Tidbits 11: Load large tables/views from SQL Server to Power Platform

There are several ways of loading (or migrating) data into Power Platform, and I have blogged about them at length before. Today I’m going to show how you can easily and reliably load large amounts of data from Microsoft SQL Server into Power Platform.

We are going to use the most famous PowerShell module among SQL Server folks, dbatools, together with the glorious PSDataverse. We will use one command from dbatools to extract data from a table (or several tables or views) in SQL Server and then load it directly into a table in Dataverse.
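If you don’t already have the two modules, both are published to the PowerShell Gallery, so installing them is a one-liner each:

Install-Module -Name dbatools
Install-Module -Name PSDataverse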

# Authenticate to Dataverse using the device-code flow
Connect-Dataverse "authority=https://login.microsoftonline.com/<your-tenant-id>/oauth2/authorize;clientid=1950a258-227b-4e31-a9cf-717495945fc2;device=true" -InformationAction Continue

# Run the SELECT on SQL Server, keep only the Operation column, and send each row to Dataverse
Invoke-DbaQuery -SqlInstance $dbinstance -Database $db -Query $select |
    ForEach-Object { $_.Operation } |
    Send-DataverseOperation -InformationAction Continue -ErrorVariable ErrOperation

In the above script block, the first line connects to Power Platform using the device-code authentication flow. There are several ways to connect to Power Platform; a client certificate, a client secret, and an application user are some of them. To learn more about connecting to Power Platform, you can read more here.
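For unattended runs you can skip the interactive prompt. Here is a minimal sketch of a certificate-based connection, assuming you have an app registration with a certificate installed locally (the tenant id, client id, thumbprint and environment URL are placeholders):

Connect-Dataverse "authority=https://login.microsoftonline.com/<your-tenant-id>/oauth2/authorize;clientid=<your-client-id>;thumbprint=<certificate-thumbprint>;resource=https://<your-environment>.crm4.dynamics.com/"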

The second line uses the Invoke-DbaQuery command to connect to a SQL Server database and run a SELECT statement that returns a table with Id and Operation columns (more about it later). The Operation column contains the JSON object that PSDataverse understands.
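Before wiring everything together, it can be worth sanity-checking the SQL side with a trivial query, for example:

Invoke-DbaQuery -SqlInstance $dbinstance -Database $db -Query 'SELECT TOP (5) [Id] FROM [Account]'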

The output of Invoke-DbaQuery is piped to ForEach-Object, which goes over every row and extracts only the Operation column, then pipes it to Send-DataverseOperation, which sends each operation to Dataverse and receives a response indicating success or failure. You may have noticed that the last command collects any errors in the $ErrOperation variable rather than letting them stop the run. This allows the data flow to continue even when individual operations fail. Use this strategy only if you have already tested your script and you are not expecting the whole flow to fail.
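After the run, you can inspect what (if anything) failed:

if ($ErrOperation) {
    Write-Warning "$($ErrOperation.Count) operation(s) failed."
    $ErrOperation | ForEach-Object { Write-Warning $_.ToString() }
}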

At this point, you might be asking what is in that Operation column that can be sent directly to Dataverse. Send-DataverseOperation can accept several kinds of input. One possibility is to send a JSON string in the following format.

{
  "ContentId": "any string, used for traceability",
  "Method": "PATCH", // could be any method supported by OData in Dataverse
  "Uri": "accounts(alternatekey='value')", // could be the primary key as well
  "Value": {
    "logicalcolumnname": "value",
    ...
  }
}
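To get a feel for the format, you can hand-craft a single operation and pipe it in before generating thousands from SQL. A hypothetical example (the my_migrationid alternate key and the values are made up):

'{"ContentId":"test-1","Method":"PATCH","Uri":"accounts(my_migrationid=''42'')","Value":{"name":"Contoso"}}' | Send-DataverseOperation -InformationAction Continue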

This is a sample SELECT statement that generates a JSON operation to create accounts in Dataverse, but you can use it as a template to load data into any table.

SELECT [Id], [Level], (
  SELECT [Id] AS [ContentId]
    ,'PATCH' AS [Method]
    -- Address each row by the my_migrationid alternate key
    ,'accounts(my_migrationid=''' + [Id] + ''')' AS [Uri]
    -- JSON_QUERY keeps the nested object from being escaped as a string
    ,JSON_QUERY((
      SELECT [Id] AS [my_migrationid]
        ,[Name] AS [name]
        ,[Phone] AS [telephone1]
        ,[Fax] AS [fax]
        ,[Website] AS [websiteurl]
        ,[AnnualRevenue] AS [revenue]
        ,[NumberOfEmployees] AS [numberofemployees]
        ,[Description] AS [description]
        ,[CreatedBy] AS [createdonbehalfby@odata.bind]
      FROM [Account] AS [Value]
      WHERE [Value].[Id] = [Operation].[Id]
      FOR JSON AUTO, WITHOUT_ARRAY_WRAPPER)) AS [Value]
  FROM [Account] AS [Operation]
  WHERE [Operation].[Id] = [Tbl].[Id]
  FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
) AS Operation
FROM [Account] AS [Tbl]
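The pipeline at the top of the post referred to $dbinstance, $db and $select without defining them. A minimal sketch of that setup (the instance and database names are placeholders, and the SELECT above is assumed to be saved to a file):

$dbinstance = 'sql01'                                  # your SQL Server instance
$db         = 'Staging'                                # the source database
$select     = Get-Content .\export-accounts.sql -Raw   # the SELECT statement shown above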

SELECT statements are very powerful and enable many kinds of data transformations: aggregation, transposition, mappings and more. In the rare case that a transformation is not easy to express in T-SQL, you can always fall back on PowerShell. That way you have an object-oriented language on top, and there is virtually no limit to what you can do.
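As a hypothetical illustration, you could reshape each operation in flight: parse the JSON, normalize the account name (string munging that is clumsy in T-SQL), and re-serialize it before sending:

Invoke-DbaQuery -SqlInstance $dbinstance -Database $db -Query $select |
    ForEach-Object {
        $op = $_.Operation | ConvertFrom-Json
        # Collapse repeated whitespace in the account name
        $op.Value.name = ($op.Value.name -replace '\s+', ' ').Trim()
        $op | ConvertTo-Json -Depth 10 -Compress
    } |
    Send-DataverseOperation -InformationAction Continue -ErrorVariable ErrOperation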

In real-world scenarios you will need tracing, measurement and more, which can be added with a few extra commands. Since you are using SQL Server, you have a full RDBMS at your disposal, so you might as well use a table to log the results and be able to stop, resume and analyze your flows. These features are especially necessary when you need to load millions of rows and have to think about locking and the amount of load you are putting on the data source.
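As a rough sketch, the errors captured earlier could be written back to the source database with dbatools (dbo.MigrationLog is an assumed table name; -AutoCreateTable creates it if it does not exist):

$ErrOperation | ForEach-Object {
    [pscustomobject]@{
        LoggedAt = Get-Date
        Message  = $_.ToString()
    }
} | Write-DbaDbTableData -SqlInstance $dbinstance -Database $db -Table 'dbo.MigrationLog' -AutoCreateTable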

I hope I can find some time to write about a real-world example soon. You will be surprised how easily this and more can be achieved with just a bit of simple PowerShell.

