Your Automations crash due to longer than expected values coming from the external source? Make your life easier with SQL.
Working with Marketing Cloud (or any other Marketing Automation Platform) is always the dance between optimisation and flexibility. It is easily visible when working with SFMC Data Extensions.
On the one hand, limiting the maximum field length is a must-have best practice. Salesforce recommends keeping the total sum of all fields length under 4000 characters. For many use cases, a tiny number. But if you want to have lean Data Extensions to query them quickly, it is the way. And in the Marketing Cloud, nearly everything is a query - even if you don't write a line of SQL.
On the other hand, it might be tough to limit your Data Extension so much in real-life scenarios. Especially when you are bringing data from other systems. Example? Salesforce CRM integration through Marketing Cloud Connect. The picklist fields from CRM come to Synchronized Data Extensions as 255 character text fields - even if they have only one-digit values associated with it. Select a few such picklists to be available in Marketing Cloud, and you hit the Data Extension character length limit recommended by Salesforce.
What to do with that?
Limit the maximum length of those picklists in the Marketing Cloud Data Extension you use for campaigns. You are sure that your Gender field will always have one character value (
U)? Make the maximum length equal to 1. It is an excellent way to trim unneeded length, but sometimes, you might be surprised, especially with less obvious picklists, like a job title. Similar issues might also come from standard text fields.
Once you build your standard Data Extension with those optimisations in mind, it is time to pull the data from Synchronized Data Extensions.
You Should Know
I'm mentioning the Marketing Cloud Connect Synchronized Data Extensions, but you can also leverage it for FTP-based data uploads. The solution will require additional intermediate Data Extension with generous field lengths in place of Marketing Cloud Connect Synchronized Data Extension. Rest will be the same.
All good, until one of the values, exceed the maximum length you set for its field. The Automation crashes with a cryptic error that tells you nothing and can push you for a long crusade to find the culprit. What now?
You can make quick manual check with a simple SQL query. Open your Query Studio and copy-paste the below code:
SELECT MAX(LEN(c.Id)) AS SubscriberKey , MAX(LEN(c.FirstName)) AS FirstName , MAX(LEN(c.LastName)) AS LastName , MAX(LEN(c.Email)) AS EmailAddress , MAX(LEN(c.JobTitle__c)) AS CurrentRole , MAX(LEN(c.Industry__c)) AS IndustryFROM Contact_Salesforce AS c
You Should Know
If you are running those queries from the child Business Unit, add
Ent. prefix to Synchronized Data Extension names, as those are stored on the parent level.
This query will output one row of data with the current maximum length of the values that you have in your Synchronized Data Extension for the selected fields. It uses
MAX function on the
LEN outcome to find the single maximum length available across all records. It has two great uses:
You can use it even before configuring the Data Extension and Profile Attributes to analyse your real-life data.
For example, if out of your whole database, the longest First Name is 33 characters, you will probably be just fine with a maximum length of 40 on this field. If a picklist field returns you 5 with this query - you won't need the default 255 characters.
Of course, as mentioned above, the values might get longer in the future. But as Marketing Cloud allows you to change maximum field length up but not down, it might be a good idea to try as low as reasonable and leave space for growth.
The second use of the above snippet is quick debugging when an error occurs. Execute it in Query Studio and compare against the column lengths you set up in your Data Extension. If you see in Query Studio any value longer than the maximum you set up in Data Extension - you found a culprit.
Now you can look in the data source and check whether this longer-than-expected value is correct. If yes, it is time to update your maximum length in the Data Extension and Profile Attribute configuration.
You Should Know
In most cases, the field length of 255 should be more than enough. But if you need it, Marketing Cloud allows you to go up to 4000 characters per field.
4000 characters length is an instant way of getting over the recommended limit, but it might be useful for logging Data Extensions that you won't query in the future. For example, if you want to save
_Bounce Data View to standard Data Extension,
SMTPBounceReason might use such a long field.
If you need more, there is a trick reported by Markus Slabina. Use Email Studio to add a new field and make the length empty. Save. It will create a
nvarchar(max) field that allows absurdly long data to be stored.
- It works only in Email Studio - Contact Builder won't work
- It is a workaround and might disappear at any moment
- You shouldn't use it unless there is no other option
The snippet above is nice but suitable only for short term manual checks. And you don't want to perform manual checks for the long term. If you wonder whether there is a way to automate it nicely, you are here for a treat.
Yes, you can automate such validation by using two additional Activities.
Go to the Automation that takes care of moving the data from Synchronized Data Extension do Standard Data Extension and Profile Attributes. Before the Activities that are transfering the data, add a SQL Query Activity. In it, paste the below code:
SELECT c.Id AS ContactID , LEN(c.Id) AS SubscriberKey , LEN(c.FirstName) AS FirstName , LEN(c.LastName) AS LastName , LEN(c.Email) AS EmailAddress , LEN(c.JobTitle__c) AS JobTitle , LEN(c.Industry__c) AS IndustryFROM Contact_Salesforce AS cWHERE 18 - LEN(c.Id) < 0 OR 40 - LEN(c.FirstName) < 0 OR 80 - LEN(c.LastName) < 0 OR 254 - LEN(c.Email) < 0 OR 80 - LEN(c.JobTitle__c) < 0 OR 40 - LEN(c.Industry__c) < 0
Of course, you will want to adapt it to your needs. Change the fields along with their names and lengths. Select the right Synchronized Data Extension. How? Let's dive in what this code does so that you can make it your own.
The first field we
SELECT is the Unique Identifier (Contact ID / Subscriber Key or other based on the specific object you are validating). It is useful to quickly know which record(s) should you checked when some values lengths exceed the expectations.
Finally, there is a
WHERE statement. In it, we are checking whether at least one field on each record has a value longer than expected. To adapt it - change the number on the left to the length of the field you set up in the target Data Extension / Profile Attribute.
This SQL Query Activity should output the rows to a technical Value Length Data Extension with all the same columns as the target Data Extension / Profile Attributes list. However, it can have much shorter max lengths for the columns, as all fields but the Unique Identifier one will output only the field's length (up to 4 digits). Configure the SQL Activity to Overwrite the technical Data Extension and always have the latest state data.
Right after this SQL Query Activity add Verification Activity and configure it to check the technical Data Extension mentioned above. If there is any row in it - it should stop the whole Automation and send an alert email to the Marketing Cloud administrator.
To sum up, sample Automation could look like this:
- SQL Query Activity with the Debugging Value Length query
- Verification Activity that checks Value Length Data Extension
- SQL Query Activity that moves data from Synchronized Data Extension to target Data Extension
- Export Activity
- Transfer Activity
- Import Activity
The last three Activities above are the classic ETL group for updating the All Subscribers and Profile Attributes.
Such setup won't silently crash your Automation whenever there is a single value longer than expected. It will stop the Automation before the crash, inform the administrator that there is an issue and provide Unique Identifiers of records leading to that error for an easy check. Neat, right? Well, it can be even better.
The basic version above is already lovely, but we can make a few improvements to make the administrator's life even easier.
Previously shown query will show the problematic record, but it will display all field lengths for it. The administrator will still have to compare the lengths to find which one is the issue's source.
However, we already are coding the expected field length in the
FROM part of the query. Why not use it to make life easier?
Let's change the
LEN(c.Id) AS [Subscriber Key], in the
SELECT statement part to something a bit more interesting:
CASE WHEN 18 - LEN(c.Id) < 0 THEN LEN(c.Id)END AS SubscriberKey
As you can see, we changed a simple line to a more complex
CASE statement. There is a good reason for it. Thanks to it, the query will evaluate each field against the expected length and in the Value Length Data Extension display information only for the problematic values.
The administrator will no longer have to compare each field with configured maximum - he will see it directly in the Data Extension, which saves time and limits the errors.
We can go even further with this approach and add a bit more data to the output for those problematic fields by making few calculations in
THEN part of the
CASE WHEN 18 - LEN(c.Id) < 0 THEN CONCAT(LEN(c.Id), '(', LEN(c.Id) - 18, ' over limit)')END AS SubscriberKey
Here, apart from the problematic field's length, we are also showing how much it exceeds the current configuration.
For example, if we expect the First Name to be under 40 characters and one record comes with 46 characters, the previous solution would output
46. This extended one will be a bit more verbose by showing
46 (6 over limit).
Remember that if you implement this option, it will impact the output length in your technical Data Extension. 4 characters will be no longer enough. 22, however, will work like a charm.
After applying both of the above options to the Basic SQL Snippet, it looks like this:
SELECT c.Id AS ContactID , CASE WHEN 18 - LEN(c.Id) < 0 THEN CONCAT(LEN(c.Id), '(', LEN(c.Id) - 18, ' over limit)') END AS SubscriberKey , CASE WHEN 40 - LEN(c.FirstName) < 0 THEN CONCAT(LEN(c.FirstName), '(', LEN(c.FirstName) - 40, ' over limit)') END AS FirstName , CASE WHEN 80 - LEN(c.LastName) < 0 THEN CONCAT(LEN(c.LastName), '(', LEN(c.LastName) - 80, ' over limit)') END AS LastName , CASE WHEN 254 - LEN(c.Email) < 0 THEN CONCAT(LEN(c.Email), '(', LEN(c.Email) - 254, ' over limit)') END AS EmailAddress , CASE WHEN 80 - LEN(c.JobTitle__c) < 0 THEN CONCAT(LEN(c.JobTitle__c), '(', LEN(c.JobTitle__c) - 80, ' over limit)') END AS JobTitle , CASE WHEN 40 - LEN(c.Industry__c) < 0 THEN CONCAT(LEN(c.Industry__c), '(', LEN(c.Industry__c) - 40, ' over limit)') END AS IndustryFROM Contact_Salesforce AS cWHERE 18 - LEN(c.Id) < 0 OR 40 - LEN(c.FirstName) < 0 OR 80 - LEN(c.LastName) < 0 OR 254 - LEN(c.Email) < 0 OR 80 - LEN(c.JobTitle__c) < 0 OR 40 - LEN(c.Industry__c) < 0