Tuesday, 22 April 2014

SSIS integration with Dynamics CRM using ExecuteMultipleRequest for bulk operations

There are several tutorials on the Web explaining how to integrate SSIS with Dynamics CRM using the script component. All of them, however, show only the basic setup, where records from a data source are processed one by one when executing CRM commands (e.g. creating CRM records). In this post I would like to show you how to leverage the ExecuteMultipleRequest class from the CRM SDK to perform bulk operations on records from an SSIS data source.

Tutorial scenario

  1. First, we will create a simple database with one table that stores user names
  2. Then we will create an SSIS project
  3. Next, we will add our db table as a data source, so SSIS can read information about users
  4. Then, we will add a script component that creates contacts in CRM for each user from the table
  5. Finally, we will modify the script to import CRM contacts in batches
  6. At the end we will compare the execution time of both scripts

Basic setup

Database
Let's create a basic db table with only 2 columns:
CREATE TABLE Users (
 FirstName VARCHAR(100) NOT NULL,
 LastName VARCHAR(100) NOT NULL
 )
Now populate your table with some dummy data. In my case I've added 1000 records.

SSIS project
  1. Open "SQL Server Data Tools" (based on Visual Studio 2010)
  2. Go to File -> New -> Project...
  3. Select "Integration Services Project", provide a project name and click OK
  4. When the project is created, add a Data Flow task to your main package
Data Source
  1. Double click your Data Flow task to open it
  2. Double click "Source Assistant" in the toolbox
  3. On the first screen of the wizard select "SQL Server" as the source type and select "New..."
  4. On the second screen provide your SQL Server name and authentication details and select your database
  5. A new block will be added to your Data Flow, representing your DB table. It shows an error icon because we haven't selected the table yet. You will also see a new connection manager representing your DB connection
  6. Double click the new block, select the Users table we created from the dropdown and hit OK. The error icon should disappear
Script component
  1. Drag and drop the Script Component from the toolbox to your Data Flow area
  2. Create a connection (arrow) from your data source to your script
  3. Double click your script component to open it
  4. Go to "Input Columns" tab and select all columns
  5. Go to "Inputs and Outputs" tab and rename "Input 0" to "ContactInput"

1-by-1 import

Now that we have the basic components set up, let's write some code! In this step we will write basic code for importing contacts into CRM. I'm assuming you have basic knowledge of the CRM SDK, so the CRM-specific code will not be explained in detail.

Open the script component created in the previous steps and click "Edit Script...". A new instance of Visual Studio will open with a new, auto-generated script project. By default the main.cs file will be opened; this is the only file you need to modify. However, before modifying the code you need to add references to the following libraries:

  • Microsoft.Crm.Sdk.Proxy
  • Microsoft.Xrm.Client
  • Microsoft.Xrm.Sdk
  • System.Runtime.Serialization
Now we are ready to write the code. Let's start by creating a connection to your CRM organization. This will be created in the existing PreExecute() method like this:
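// Add these using directives at the top of main.cs
using Microsoft.Xrm.Client;           // CrmConnection
using Microsoft.Xrm.Client.Services;  // OrganizationService
using Microsoft.Xrm.Sdk;              // Entity

OrganizationService _service;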

public override void PreExecute()
{
    base.PreExecute();
        
    var crmConnection = CrmConnection.Parse(@"Url=https://******.crm4.dynamics.com; Username=******.onmicrosoft.com; Password=*********;");
    _service = new OrganizationService(crmConnection);
}
Now that we have the connection, let's write the code that actually imports our contacts into CRM. This can be done by modifying the existing ContactInput_ProcessInputRow method:
public override void ContactInput_ProcessInputRow(ContactInputBuffer Row)
{
    var contact = new Entity("contact");
    contact["firstname"] = Row.FirstName;
    contact["lastname"] = Row.LastName;
    _service.Create(contact);
}
Obviously the code above needs some null checks, error handling, etc., but in general that's all you need to do to import your contacts into CRM. If you close the VS instance with the script project, it will be automatically saved and built.
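
A minimal sketch of what such null checks and error handling might look like is shown here. It is not a definitive implementation: it assumes the generated buffer class exposes the usual FirstName_IsNull and LastName_IsNull properties, that a reference to System.ServiceModel has been added, and that CRM failures surface as FaultException<OrganizationServiceFault>. The "ContactImport" label is a made-up name used only for the SSIS log, and the using lines belong at the top of main.cs.

```csharp
using System.ServiceModel;   // FaultException<T> (assumes a System.ServiceModel reference)
using Microsoft.Xrm.Sdk;     // Entity, OrganizationServiceFault

public override void ContactInput_ProcessInputRow(ContactInputBuffer Row)
{
    // Skip rows with missing names rather than sending incomplete data to CRM
    if (Row.FirstName_IsNull || Row.LastName_IsNull)
    {
        return;
    }

    var contact = new Entity("contact");
    contact["firstname"] = Row.FirstName;
    contact["lastname"] = Row.LastName;

    try
    {
        _service.Create(contact);
    }
    catch (FaultException<OrganizationServiceFault> ex)
    {
        // Raise an SSIS error event carrying the CRM fault message.
        // "ContactImport" is a hypothetical label, not part of the original script.
        bool cancel;
        ComponentMetaData.FireError(0, "ContactImport", ex.Detail.Message, string.Empty, 0, out cancel);
    }
}
```

Whether a failed row should raise an error, a warning, or be redirected to a separate output depends on how strict your migration needs to be.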

You can now hit F5 in the original VS window to perform the actual migration.

Bulk import

In the basic setup described above there is one CRM call for each record passed to the script component. Calling web services over the network can be very time-consuming. The CRM team is aware of that, which is why they introduced the ExecuteMultipleRequest class. It allows you to build a set of CRM requests on the client side and send them all at once in a single web service call. In response you receive an instance of the ExecuteMultipleResponse class, which lets you process the response for each individual request.
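
To put rough numbers on the difference, the following standalone sketch counts the web service calls needed for a given number of rows. The BatchMath class and RoundTrips method are hypothetical names made up for this illustration, and the batch size of 500 is just an example value below CRM's limit of 1000 requests per call.

```csharp
using System;

public static class BatchMath
{
    // Number of web service calls needed to send `rows` requests
    // when each call carries at most `batchSize` requests (ceiling division).
    public static int RoundTrips(int rows, int batchSize)
    {
        if (rows < 0) throw new ArgumentOutOfRangeException("rows");
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");
        return (rows + batchSize - 1) / batchSize;
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrips(1000, 1));   // 1-by-1: 1000 calls
        Console.WriteLine(RoundTrips(1000, 500)); // batches of 500: 2 calls
    }
}
```

Fewer round trips does not mean proportionally less time, since CRM still has to process every request, but the per-call network overhead is paid far less often.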

Let's modify the script code to leverage the power of the ExecuteMultipleRequest class. To do that, override the ContactInput_ProcessInput method. The default implementation can be found in the ComponentWrapper.cs file and is as simple as this:

 public virtual void ContactInput_ProcessInput(ContactInputBuffer Buffer)
{
     while (Buffer.NextRow())
     {
        ContactInput_ProcessInputRow(Buffer);
     }
}
As you can see, by default it calls the ContactInput_ProcessInputRow method that we implemented in the previous step for each record from the source. We need to modify it so that it builds a batch of CRM requests and then sends it to CRM in one go:
// Requires: using System.Collections.Generic; using Microsoft.Xrm.Sdk.Messages;
List<Entity> _contacts = new List<Entity>();

// Batch size of 500. CRM allows up to 1000 requests per ExecuteMultipleRequest
const int BatchSize = 500;

public override void ContactInput_ProcessInput(ContactInputBuffer Buffer)
{
    while (Buffer.NextRow())
    {
        _contacts.Add(GetContactFromBuffer(Buffer));

        // Send the batch as soon as it is full
        if (_contacts.Count >= BatchSize)
        {
            ImportBatch();
        }
    }

    // Send whatever is left over from this buffer
    ImportBatch();
}

private void ImportBatch()
{
    if (_contacts.Count > 0)
    {
        // Create and configure multiple requests operation
        var multipleRequest = new ExecuteMultipleRequest()
        {
            Settings = new ExecuteMultipleSettings()
            {
                ContinueOnError = true, // Continue, if processing of a single request fails
                ReturnResponses = true // Return responses so you can get processing results
            },
            Requests = new OrganizationRequestCollection()
        };

        // Build a CreateRequest for each record
        foreach (var contact in _contacts)
        {
            CreateRequest reqCreate = new CreateRequest();
            reqCreate.Target = contact;
            reqCreate.Parameters.Add("SuppressDuplicateDetection", false); // Enable duplicate detection 
            multipleRequest.Requests.Add(reqCreate);
        }

        ExecuteMultipleResponse multipleResponses = (ExecuteMultipleResponse)_service.Execute(multipleRequest);            

        // TODO: process responses for each record if required e.g. to save record id

        _contacts.Clear();
    }
}

private Entity GetContactFromBuffer(ContactInputBuffer Row)
{
    Entity contact = new Entity("contact");
    contact["firstname"] = Row.FirstName;
    contact["lastname"] = Row.LastName;
    return contact;
}
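
The TODO in ImportBatch() could be filled in along the following lines. This is a sketch rather than the definitive approach: BatchResultReader and Report are hypothetical names, and Console output stands in for whatever logging or row output you actually use in the package. It relies on ReturnResponses being set to true, so that a response item comes back for every request.

```csharp
using System;
using Microsoft.Xrm.Sdk;            // ExecuteMultipleResponseItem, OrganizationServiceFault
using Microsoft.Xrm.Sdk.Messages;   // ExecuteMultipleResponse, CreateResponse

public static class BatchResultReader
{
    // Hypothetical helper: call it from ImportBatch() in place of the TODO
    public static void Report(ExecuteMultipleResponse multipleResponses)
    {
        foreach (ExecuteMultipleResponseItem item in multipleResponses.Responses)
        {
            if (item.Fault != null)
            {
                // RequestIndex is the position of the failed request within the batch
                Console.WriteLine("Request {0} failed: {1}", item.RequestIndex, item.Fault.Message);
            }
            else if (item.Response is CreateResponse)
            {
                // CreateResponse.id holds the GUID of the newly created contact
                Guid id = ((CreateResponse)item.Response).id;
                Console.WriteLine("Request {0} created contact {1}", item.RequestIndex, id);
            }
        }
    }
}
```

Because RequestIndex matches the order in which requests were added, it can be used to look up the corresponding entry in _contacts before the list is cleared.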

Execution time comparison

As you can see, the code for sending requests in batches is a bit longer (but still quite simple, I believe), so you may be tempted to go with the simpler version. If you don't care much about performance (little data, no time limits), that might be the way to go. However, it's always better to know your options and make a conscious decision. SSIS packages usually process large amounts of data, which often takes a lot of time. If you add a step that performs CRM operations via the CRM SDK (i.e. via CRM web services), you can be sure it will significantly affect the execution time.

I've measured the execution time for both methods. Importing 1000 contacts into CRM took:

  • 1-by-1: 2 min 22 s
  • Bulk import: 44 s
In my simple scenario bulk import was roughly 3x faster than 1-by-1 (142 s vs. 44 s). The more data you send to CRM, the bigger the difference may be.

7 comments:

VP said...

Hi Filips, I have requirement where in I pull say for example 1 million Revenue record from upstream system and push the raw data in to staging and from staging push 75000 revenue records in to CRM 2013 online. I have various filter criteria in CRM like Area,fiscal period. Once I apply that filters in CRM and click retrieve. it should just fetch the revenue nos and not the actual records. Method you have suggested will that support this option please? ofcourse performance is the main part since we retrieving huge no of data. Please suggest for CRM online

Ramzi GHRIBI said...

hi, i have this error:

Could not load file or assembly 'Microsoft.Xrm.Client, Version=6.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified.

any idea please?

rox said...

Hi,
Very informative. THanks for sharing.
Can we do updateRequest in the same GetContactsBuffer method?
Do you have updateRequest sample for reference?

Filip Czaja said...

Hi rox
You can put all sorts of request types into your batch. Updaterequest should look very similar to create.

Roxanna Appleby said...

So do I have to create a new package for update same like you created for create in batches?
Thanks

chapelain36 said...

Hi, didn't you lose the first row using "while(Row.NextRow())" in this way?

Unknown said...

Hi,plz tell me how to avoid duplicate record when execute package second time,Please help me