MCP Catalog Architecture
Marketing Cloud Personalization (Interaction Studio) Catalog Objects architecture, data ingestion, limitations, tricks and tips. Build right from the start.
Catalog Basics & Limitations
The purpose of the Marketing Cloud Personalization (Interaction Studio) Catalog is to store your asset data along with interconnecting relationships for personalisation and machine learning purposes.
There are five out-of-the-box objects: Products, Articles, Blog Posts, Categories and Promotions. You can also create up to 25 custom Catalog Objects.
Each Object contains some built-in attributes (like Name, URL, Description, Promotable) and can be extended with custom attributes up to a total of 35 attributes on an Object.
All of the above can be stitched together using prebuilt and custom relationships between Catalog Objects (for example, the built-in relationship between a Product and a Category) - allowing for connections of up to 15 Categories per Item and up to 50 related Catalog Object values per Item.
With up to 2 000 000 Items per Catalog and up to 10 000 000 Items in total across all Catalogs, MC Personalization provides a lot of flexibility to architect a Catalog of your dreams.
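Before loading anything, it can help to check your planned Catalog against the limits listed above. The sketch below is a minimal, hypothetical pre-flight check: the `Item` shape and the function names are my own and not part of any MCP API, and it assumes that the 50-value cap applies per related Catalog Object type.

```python
from dataclasses import dataclass, field

# Limits as quoted in this article; confirm against current Salesforce docs.
MAX_ATTRIBUTES_PER_OBJECT = 35
MAX_CATEGORIES_PER_ITEM = 15
MAX_RELATED_VALUES_PER_ITEM = 50
MAX_ITEMS_PER_CATALOG = 2_000_000
MAX_ITEMS_TOTAL = 10_000_000


@dataclass
class Item:
    """Hypothetical in-memory Item shape for this sketch, not an MCP type."""
    item_id: str
    categories: list = field(default_factory=list)
    # Assumption: the 50-value cap applies per related Catalog Object type.
    related: dict = field(default_factory=dict)


def check_schema(object_name, attribute_names):
    """Flag an Object whose built-in plus custom attributes exceed the cap."""
    count = len(attribute_names)
    if count > MAX_ATTRIBUTES_PER_OBJECT:
        return [f"{object_name}: {count} attributes (max {MAX_ATTRIBUTES_PER_OBJECT})"]
    return []


def check_item(item):
    """Flag an Item with too many Categories or related values."""
    problems = []
    if len(item.categories) > MAX_CATEGORIES_PER_ITEM:
        problems.append(f"{item.item_id}: {len(item.categories)} Categories")
    for relation, values in item.related.items():
        if len(values) > MAX_RELATED_VALUES_PER_ITEM:
            problems.append(f"{item.item_id}: {len(values)} values in {relation}")
    return problems


def check_sizes(items_per_catalog):
    """Flag Catalogs over the per-Catalog cap and the overall total."""
    problems = [
        f"{name}: {count} Items (max {MAX_ITEMS_PER_CATALOG})"
        for name, count in items_per_catalog.items()
        if count > MAX_ITEMS_PER_CATALOG
    ]
    total = sum(items_per_catalog.values())
    if total > MAX_ITEMS_TOTAL:
        problems.append(f"total: {total} Items (max {MAX_ITEMS_TOTAL})")
    return problems
```

Running checks like these against an export of your source system is a cheap way to spot an Item with too many Categories before it ever reaches MCP.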
However, some quirks and features require more consideration to make the most out of MCP capabilities, especially as some wrong moves can be tough to reverse. Let's dive in.
Catalog Data Sources
There are three key sources of data for the Catalog:
- Manual via UI - Good for checking data and performing minor fixes, but awful for data ingestion.
- Web/Mobile SDK (the Sitemap, on the web) - Real-time, but performance-heavy and dependent on user behaviour.
- ETL Feed - Best performance and can cover the whole Catalog regardless of user behaviour, but can be updated at most every 15 minutes.
So which one should you use - Sitemap or ETL?
Why not both?
Mixing both sources is very tempting as it sounds like the best of both worlds. Unfortunately, it is not a recommended approach due to MCP's backend limitations. Sending the same Items through both channels impacts performance, creates concurrent updates that compete with each other, and can leave your Catalog with incorrect values that are not easy to fix.
For example, if you control the Exclusion and Eligibility of the Products using Sitemap and the promotable attribute, the ETL won't be able to overwrite it (despite officially having higher priority as a data source). It can quickly lead to considerable discrepancies in Product availability for recommendations. Due to that, I recommend enabling the Strict Catalog Security setting to protect your Catalog integrity if you are using ETL.
Another issue with mixing is the functional differences between the sources. For example, Sitemap and Category ETL use different mechanisms for building hierarchy that are not compatible with each other.
To sum up - do not mix Sitemap and ETL. And if you have to - do not mix Sitemap and ETL on the same Object. And if you have to - do not mix Sitemap and ETL for the promotable and archived attributes. But really, do not mix them.
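The escalating rule above can be turned into a small lint for your ingestion plan. This is only a sketch under my own assumptions: the plan format (Object name, then source name, then the set of attributes that source writes) and the `Risk` levels are hypothetical and exist purely to illustrate the three tiers.

```python
from enum import IntEnum


class Risk(IntEnum):
    OK = 0                        # a single source feeds the whole Catalog
    MIXED_OBJECTS = 1             # both sources used, on different Objects
    MIXED_SAME_OBJECT = 2         # both sources write to the same Object
    MIXED_CRITICAL_ATTRIBUTE = 3  # both write promotable or archived

CRITICAL_ATTRIBUTES = {"promotable", "archived"}


def assess_mixing(plan):
    """Return the worst mixing level found in a hypothetical ingestion plan.

    plan maps Object name -> {"sitemap": {...}, "etl": {...}}, where each
    set holds the attribute names that source writes.
    """
    uses_sitemap = any(obj.get("sitemap") for obj in plan.values())
    uses_etl = any(obj.get("etl") for obj in plan.values())
    if not (uses_sitemap and uses_etl):
        return Risk.OK

    worst = Risk.MIXED_OBJECTS
    for sources in plan.values():
        sitemap_attrs = sources.get("sitemap", set())
        etl_attrs = sources.get("etl", set())
        if sitemap_attrs and etl_attrs:
            worst = max(worst, Risk.MIXED_SAME_OBJECT)
            if sitemap_attrs & etl_attrs & CRITICAL_ATTRIBUTES:
                return Risk.MIXED_CRITICAL_ATTRIBUTE
    return worst
```

For example, a plan where ETL writes `name` and `promotable` on Product while the Sitemap also sets `promotable` comes back as `Risk.MIXED_CRITICAL_ATTRIBUTE`, which is exactly the setup described in the Exclusion and Eligibility example.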
Sitemap vs ETL
Assuming you want to keep things clean, you have two options:
1. MC Personalization with Sitemap
Pros:
- You need Sitemap either way.
- The updates will happen in real-time once the User views the Item.
- You can build a drill-down Category hierarchy in the UI.
Cons:
- Sitemap will be more complex - depending on where the Catalog details are available on the website (dataLayer, JSON-LD, HTML), extracting them can get convoluted and impact the performance of your data capture and Campaigns.
- Changes to the website can break your data capture (for example, updates to breadcrumb attributes or dataLayer structure).
- Catalog gets updated only when a user views an Item, which creates a risk of incorrect recommendations for less-visited Items.
- Can trigger massive amounts of concurrent updates for high-traffic Items.
- Cannot overwrite Multistring Object attributes and relationships (like Category) - Sitemap can only append.
- Catalog can be manipulated from the front end by malicious actors.
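To see why the append-only behaviour matters, here is a toy model of the two update styles. It is not MCP code, and the function names are hypothetical: it only illustrates the difference between appending to a multi-value field (as the Sitemap does, per the point above) and replacing it outright (which is how I assume a feed-driven update behaves).

```python
def append_only_update(stored, observed):
    """Toy model of an append-only source: existing values are never removed."""
    merged = list(stored)
    for value in observed:
        if value not in merged:
            merged.append(value)
    return merged


def overwrite_update(stored, provided):
    """Toy model of an overwriting source: provided values replace the old ones."""
    return list(provided)


# A Product leaves the "Sale" Category and joins "Sneakers".
stored = ["Sale", "Shoes"]
now_on_site = ["Shoes", "Sneakers"]

after_append = append_only_update(stored, now_on_site)
after_overwrite = overwrite_update(stored, now_on_site)
```

With the append-only model the stale "Sale" Category lingers on the Item, so any recommendation filtered by that Category would keep showing a Product that is no longer on sale.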
2. MC Personalization with ETL
Pros:
- Full Catalog upsert possible every 15 minutes (delta files recommended).
- Full control over final Catalog values - regardless of Item page visits.
- Better control over history of value changes and easier debugging.
- Much better performance, especially for bigger Catalogs.
- More secure with the Strict Catalog Security option.
  WARNING: A bug currently stops Add To Cart and Purchase actions from being associated with Categories with this setting enabled.
- Much leaner and more performant Sitemap.
Cons:
- Requires you to export data as a specifically formatted .csv file to the MCP SFTP.
- Not real-time (but every 15 minutes is pretty damn close, come on!).
- Doesn't support drill-down Category hierarchy in the UI*.
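The delta files recommended in the ETL pros can be produced by comparing the current export of your source system with the previous one and sending only the rows that changed. Below is a minimal sketch of that idea. The column names (`id`, `name`, `price`) are placeholders I made up for illustration, not the real MCP feed headers, so check the ETL feed specification for the exact format. Items that disappeared from the source are deliberately not handled here.

```python
import csv
import io


def build_delta(previous, current):
    """Return rows for Items that are new or changed since the last export.

    Both arguments map an Item id to a dict of column -> value.
    """
    delta = []
    for item_id, row in current.items():
        if previous.get(item_id) != row:
            delta.append({"id": item_id, **row})
    return delta


def to_csv(rows, columns):
    """Serialise rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical snapshots: p1 changed price, p2 is unchanged, p3 is new.
previous = {
    "p1": {"name": "Shoe", "price": "10"},
    "p2": {"name": "Hat", "price": "5"},
}
current = {
    "p1": {"name": "Shoe", "price": "12"},
    "p2": {"name": "Hat", "price": "5"},
    "p3": {"name": "Sock", "price": "2"},
}

delta_csv = to_csv(build_delta(previous, current), ["id", "name", "price"])
```

Only p1 and p3 end up in `delta_csv`, which keeps each upload small enough to run comfortably on a tight schedule.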