Frequently when importing / scrubbing data within CRM, identifying duplicates to remove or ‘flag’ may be needed. It’s helpful to have a query that will return all the duplicates in the database –but exclude a single ‘surviving’ record. (i.e. Return all-but-one of the duplicated records.)
This query is one of my go-to query patterns – I rarely, if ever, use it as-is, but will modify it to fit the particular entity / scenario I need to investigate.
The returned record set might used as the source for a Scribe Insight job to either ‘flag’ or inactivate or delete the duplicates depending on the situation. – Of course you’ll need to take into account the record’s relationship to other entities etc.
This example query assumes the duplicated contacts were entered more than 3 milliseconds apart in SQL. If your data was migrated from another system or entered as a result of a large batch and you need finer control, consider using the 'versionnumber' field as a unique timestamp field. (The VERSIONNUMBER field is a unique value that gets incremented as records are updated – it can be very useful.)
Another variations might be that you want to change MAX to MIN – and adjust the JOIN in the query to return all but the most recently created or updated record.
The key is that this is a pattern you can build on and adapt to your particular situation.
( Also, by blogging it here, I no longer have to search my laptop for copies of it when I need a example for someone. )
Title: Find Duplicate Contact to Remove (Example)
Author: Scott Sewell, Customer Effective
Description: Find the most recently created contacts that duplicates an existing contact based on fullname and emailaddress1. The Subquery finds the Name/Email and CreatedOn date of the most recently created duplicates. e.g. if there are multiple contact records with the same name and emailaddress1, return the date of all but the oldest one. */
SELECT FC.contactid ,FC.fullname ,FC.emailaddress1 FROM dbo.FilteredContact AS FC JOIN ( SELECT FC2.fullname ,FC2.emailaddress1 ,MIN(FC2.createdon) MIN_CREATED_ON FROM dbo.FilteredContact AS FC2 WHERE FC2.emailaddress1 IS NOT NULL AND FC2.statecode = 0 GROUP BY FC2.fullname ,FC2.emailaddress1 HAVING COUNT(FC2.contactid) >= 2 ) AS FC_DUPE ON FC_DUPE.MIN_CREATED_ON < FC.createdon AND FC.statecode = 0 AND FC.fullname = FC_DUPE.fullname AND FC.emailaddress1 = FC_DUPE.emailaddress1
Privacy & Cookies Policy
Necessary cookies are absolutely essential for the website to function properly. This category only includes cookies that ensures basic functionalities and security features of the website. These cookies do not store any personal information.
Any cookies that may not be particularly necessary for the website to function and is used specifically to collect user personal data via analytics, ads, other embedded contents are termed as non-necessary cookies. It is mandatory to procure user consent prior to running these cookies on your website.