Introducing PSDataverse, PowerShell’s new super power

PowerShell is the solution for task automation. It has everything you need to iterate fast. A command-line shell with auto-completion, a scripting language that is object oriented and it runs on Windows, Linux, and macOS. PowerShell’s command-line shell is the best in class with tab completion, and even command prediction, aliases, in-console help (Unix man pages anyone) and my favorite feature Pipelines that lets you chaim commands together. You can even debug directly in the shell! If you are new to PowerShell start here.

Dataverse on the other hand is a managed relational database and a lot more. It hosts Dynamics 365’s data, provides rich metadata, and can even host logic and validation with a range of its own productivity tools. PowerShell can be handy to automate tasks, and orchestrate one-off solutions in Power Platform. From another perspective Dataverse can be usefull as a managed database. You don’t need a licese if the user is not a human, by the way.

This picture depicts an equation. The first figure is showing PowerShell Hero, then a plus sign, then the logo of Microsoft Dataverse, then an equal sign and the result of the equation is PowerShell hero with logo of Dataverse hovering over her palm.
PowerShell + Dataverse = PSDataverse

While working with Power Platform, often time you migh need to quicky try something to see if it works, or to just check some data real quick. Other times you might need to automate some tasks or integrate processes that are not exposed through the UI. Take data migration as an example, or simply loading a a couple of huge CSV files with some data transformation. Are you going to build a fully custom .NET solution just for that? or are you going to use a full-fledged SSIS solution with its notoriously troublesome Script components and non-Git friendly projects? you other options are learning 3rd-party one size fits all cloud-based services with their edge cases. Well, now you have a 3rd option. PSDataverse, a PowerShell module that brings Dataverse to PowerShell.

Show me an example

Let’s check how easy it is to load a CSV file called Account.CSV to Microsoft Dynamics 365.

Connect-Dataverse "authority=<tenant-id>/oauth2/authorize;clientid=<client-id>;thumbprint=<client-cert-thumbprint>;resource=https://<environment>"
Import-Csv "Account.csv" | ForEach-Object {
    @{ Uri = "account", Method= "PATCH", Value = @{
        name = $_.AccountName,
        overridencreatedon = $_.CreationDate,
        ...}}} | ConvertTo-Json | Send-DataverseOperation

You know, that technically is two lines of code in PowerShell. Sure, I’m cheating here. I have piped four commands together, but that makes it even easier to read. Let’s go over what I’m doing here.

  • The first line is obviously using Connect-Dataverse command to connect to Dataverse using a client certificate.
  • The second line (which I have broken into multiple lines for clarity):
    • Imports a CSV file called Account.csv.
    • Transforms each line into a Dataverse operation using a ForEach-Object.
    • Converts operations to JSON using ConvertTo-Json command.
    • Sends the operations to Dataverse

A Power Platform professional might say sending several operations one-by-one will not be ideal in terms of performance. I do agree with that statement. Send-DataverseOperation has two parameters for just that.

Improving performance

Dataverse provides two features to improve performance over its API for when needed. The first one is parallelism. Each user can send data in parallel up to a configurable (by Microsoft) number of lanes. By default the number is 20, but you can ask Microsoft to increase it for when needed (e.g. data migration). The other one is called batching. In order to use it, you’ll need to put several operations in a single HTTP request with a special syntax and structure and send it to the API, where it will be taken apart and executed one-by-one. All the operations will run in a single transaction however. In other words they succeed or fail (and rolled back) together. To use these two features, Send-DataverseOperation provides two parameters.

  • -BatchCapacity a number between 0 (default) and 1000 that defines the maximum number of operations that can be put in a single batch.
  • -MaxDop a number between 1 (default) and 1000 that indicates how many data lanes will be used to send data in parallel.

As you see, the default values indicate that batching will be off and only one lane will be used by default. This makes experimentation easier and safer by default. For example, running the following command:

@{Uri="WhoAmI"} | Send-DataverseOperation

Will give you BusinessUnitId, UserId, and OrganizationId of the current user.

@odata.context :$metadata#Microsoft.Dynamics.CRM.WhoA
BusinessUnitId : 6f202e6c-e471-ec11-8941-000d3adf0002
UserId         : 88057198-a9b1-ec11-9840-00567ab5c181
OrganizationId : e34c95a5-f34c-430c-a05e-a23437e5b9fa

What’s more

There is a lot more to PSDataverse that what has been covered above. The best way to know it is to download PSDataverse from its GitHub repository and try it. There are several features in its core like automaticall transient fault detection and retry. Automatically adopting the pace based on hints provided by Dataverse and more.

Keep in mind that PSDataverse is a free, open-source PowerShell module and it can benefit from your suggestions, bug reports and contributions. Currently yhe module is in alpha, but I have already used it in a data migration project for a major bank in Europe and I try to use it for different tasks as frequent as I can to enhance it to fit more and more use cases and I put developer productivity above all.

Convert JSON files to Word documents

A while back I built the first version of ndocx – a utility that you can call in your terminal to populate standard Word templates. It can used to generate documents based on simple JSON files. You can think of it as a Word document visualizer of your JSON files. The ndocx utility doesn’t need Word or any other application to be installed on the machine, which makes it a good fit for services and you don’t need any license to run it. It’s real open source alternative that you can use as you wish.

ndocx is in fact the command line interface of Novo Docx, a library based on the latest .NET Framework. You can also find an Azure Functions App that exposes the same functionality as a REST service. This is a good alternative if you don’t want to deal with document manipulation yourself.

If you have a JSON file and a word template, using ndocx is as easy as the following command.

ndocx populate mytemplate.docx -params params.json -output output.docx

Take the following word template as an example.

A very simple word template

It has just one placeholder called a. In order to populate that place holder, you would need a JSON file like the following.

  "a": "simple test"

In other words for each placeholder in your word template, you would need an attribute in your JSON file with the exact same name.

If you have repeating section in your word template, you will need an array for your repeating section and inside the array each placeholder (that you want to populate) needs a matching attribute. For example the following JSON, has a simple attribute called “a” and an array called “repeating1”.

  "a": "simpla test",
  "repeat1": [
    {"a": "Col 1 - row 1", "b": "col 2 - row 1"},
    {"a": "Col 1 - row 2", "b": "col 2 - row 2"}]

The given JSON file can match the following word document, with a simple placeholder called “a” and a repeating section called “repeat1”.

A simple template with a repeating section

Just keep in mind that currently only “Plain text content control” and “Repeating section content control” are supported which should be enough for majority of use cases.

You can download the latest release of ndocx from its Github repository, Novo Docx.

Power Platform Tidbits #7 – Optimize for server-side execution (folding) when adding columns

Power query can take some of the relational transformations and ask the data source to take care of them. This means that the transformations – like grouping, filtering, adding columns and more – can run at the data source, where they are more efficient. This also avoids downloading huge amounts of data and accelerate the execution. In Power BI terms, this is called folding.

When writing a Power query you need to make sure every step that can be passed to the data source is passed to the data source. If you pay attention to the Applied steps, when a green vertical line is displayed besides a step, it mean that the step is passed to the data source. For the following picture shows that the first steps, until “Kept top rows” are passed to the data source.

Please note that once you add a step that cannot be passed to the data source, any step that come after will not be passed to the data source. This means as a general rule of thumb, you should try to apply all foldable steps in the beginning when you can.

You should also try to pay attention to when you are mixing multiple functions in a step. You might notice that when you separate them, a part can fold to the data source. Sometimes it can be tricky to spot them, specially when the UI is mixing them behind the scene. Let’s take an example to make it clear.

For example when you add a new column and define its data type through the UI, Power BI will mix two functions Table.AddColumn and Table.TransformColumnTypes, even though it’s just one column and Table.AddColumn is capable of defining the column type. See for yourself.

Adding new column through UI, will mix Table.AddColumn with Table.TransformColumnTypes.

Now the issue is even though Table.AddColumn is a foldable function, the Table.TransformColumnTypes isn’t. To fix this issue, you either need to separate them is different steps or just rely on Table.AddColumn to both add the column and define the type. In the above example we must change:

Table.TransformColumnTypes(Table.AddColumn(#"Selected columns", "Custom", each 1000), {{"Custom", Int64.Type}})


Table.AddColumn(#"Selected columns", "Custom", each 1000, Int64.Type)

And voila:

|Adding column and simultaneously defining the type in Table.AddColumn makes the step foldable.

NovoDocx – An open source service that generates documents using standard Word templates

Lately, as part of solution I have been building for a client, we had to deal with several document generation flows. I’m talking about really complicated document templates. For example there is one that is composed of several tables and sections that compare different credits / mortgages that a client can take and their financial impact, benefits and downsides and more. This is nothing like the simple mortgage documents that your local bank shares with you. In fact, I am happy for their clients that will receive such well-thought documents, but as for the poor developers, this was not an easy task.

There are two out-of-the-box options when you need to populate Word documents.

  • Word templates feature that existed for ages in Dynamics CRM and now inherited by Power Platform. This is normally the easiest option, but it has many limitation and only works in harmony with rows in a table and its direct related tables.
  • Word Online Connector which is a premium connectors and more capable without dependency on any table. The main issue with this one is when you have flows in your solution that will be deployed in different environments or any time the location of your document needs to change. In this case instead of the name of these placeholders you would need to use random numbers because this connector cannot figure out the schema of the document or lock it after your build one.

For the second options there are some workaround, that I will be writing about soon in future, but what if the logic is so complex that using Flows or a Power App does not make sense or not enough.

In search for an open source library

At this point I started looking for an open source .NET library (preferably) to let us get the job done. It should exist in this day and age, right? The answer is no, it doesn’t. There are only some paid libraries and services that are quite expensive for what they do and working with them is not as easy as one might think. After digging deeper into GitHub, I was able to find some libraries that either rely on Open Office or a headless browser or the brain of the library is in a closed-source black box. I have to say honestly the fact that there is no open-source library and not in .NET was annoying enough, so I decided to start one and I decided to start simple and make it as easy as possible for the developer to use with no knowledge of the underlying format (😎 ehem which I know pretty well, but not to Eric White level well ☺).

Novo Docx

NovoDocx’s first readme file push in Git

Today I share with you my first iteration of Novo Docx, my take on a simple-to-use and simple-to-host library and service that you can use in you own projects. The current features are enough to handle a complex documents and quite fast. I have tried to capture many edge cases too. I will be focusing on enriching the functionality and providing more hosting options. The source code is hosted in GitHub. It includes a simple library and an Azure Functions App that you can use however you like.

Power Platform Tidbits #6 – Plugins in single-project vs multi-project

Plugins has been and are an integral extension point in Dynamics and now Dataverse. They allow you to put logic close to the data. Logic that can be invoked at different stages of execution of an action (aka message).

When the number of plugins grows, you might be willing to separate them into different projects and even share common parts of logic through a project that referenced by other projects. This is the most common way of organizing code in almost any kind of application development. However when it comes to Plugin development for Dataverse, this is not the recommended way of development and for good reasons.

Benefits of single-project plugins

In fact, Microsoft recommends that you consolidate all your plugins into one assembly until the size of the resulting assembly grows larger than 16MB. So far it seems that single-project is the only way to go, but there is a caveat here!

Benefits of multi-project plugins

When you are in a big project with multiple developers assigned to plugin development at the same time, the complexity of deployment and test will increase with the number of developers. Imagine each developer working on a new plugin (presumably each in a different git branch) and the test environment is shared. Every time a new version of the assembly is deployed, it might contain only one of the new plugins.

One solution is to give every developer an isolated environment, but it is not always possible due to the number of integrations required and many other complications that comes with provisioning a new environment.

Another solution is to put in place a better discipline around your branching and deployment strategy. For example, you can ask developers to communicate with each other before deployment and cherry-pick other plugins into their own branch to make sure all the latest plugins are included in the deployment. This would work, but it is prone to human error (what isn’t 😏). So let’s summarize. Whenever you need to deploy a new plugin for testing:

  • Try to manage requirement in such way to reduce the number of plugins in development at any given time.
  • Put each plugin in a dedicated branch.
  • Communicate before each deployment.
  • Cherry-pick from other branches, anything that needs to be included in deployment.
  • Implement unit-testing in your plugins to reduce the number of deployment required.

This last one (i.e. unit testing) can substantially reduce the number of deployment, increase quality and save precious time. The first one is important too.

Is there a better way?

There isn’t, or is there? 😑 Ok I will tell you. If the above does not suit for some reason. There is another solution that might help you. In other words, if you tried the above and you are still facing issues, or if you are avoiding single-project approach due to the issues you have faced, there might be a solution for you.

Visual Studio has a feature called linked files. This feature is very easy to use and allows you add links to files in you project without physically copying the file to your projects. I have always used this approach in SSIS development to target multiple versions of SSIS without duplicating the project. But how can we benefit from it in our plugins?

Imagine you are developing several plugins and each developer is working on a plugin (or more). You still put all your plugins in a single project and follow the same branching strategy, but when a developer needs to deploy the new plugin he/she can still create a new project and add all the files required as links into that project, deploy and test as needed. You can cleanup and remove that project in future before merging back to the main branch. The following demonstrates a situation when a developer is working on a LeadQualification plugin in a fancy project.

- PluginBase.cs <──────────────╮
- LeadQualificationPlugin.cs <─│─╮
- NamingPlugin.cs              │ │
- ConnectionPlugin.cs          │ │
- ...                          │ │
                               │ │
Fancy.Lead:                    │ │
+ PluginBase.cs ─────────────────╯
+ LeadQualification.cs ────────╯

As you have notices Fancy.Lead project is just an empty project with two links to the main project, but when you compile, it will give you Fancy.Lead.dll in addition to Fancy.Plugins.dll that means you can deploy and test this smaller assembly TEMPORARILY until you are ready to merge it to the main branch.

To create those links you will need to simply hold down Ctrl + Shift while you drag and drop the file from the Fancy.Plugins project to the Fancy.Lead project. In fact if you have ever made shortcuts to files in Windows, this is pretty similar.