Frequently when importing / scrubbing data within CRM, identifying duplicates to remove or ‘flag’ may be needed. It’s helpful to have a query that will return all the duplicates in the database –but exclude a single ‘surviving’ record. (i.e. Return all-but-one of the duplicated records.)
This query is one of my go-to query patterns – I rarely, if ever, use it as-is, but will modify it to fit the particular entity / scenario I need to investigate.
The returned record set might used as the source for a Scribe Insight job to either ‘flag’ or inactivate or delete the duplicates depending on the situation. – Of course you’ll need to take into account the record’s relationship to other entities etc.
This example query assumes the duplicated contacts were entered more than 3 milliseconds apart in SQL. If your data was migrated from another system or entered as a result of a large batch and you need finer control, consider using the 'versionnumber' field as a unique timestamp field. (The VERSIONNUMBER field is a unique value that gets incremented as records are updated – it can be very useful.)
Another variations might be that you want to change MAX to MIN – and adjust the JOIN in the query to return all but the most recently created or updated record.
The key is that this is a pattern you can build on and adapt to your particular situation.
( Also, by blogging it here, I no longer have to search my laptop for copies of it when I need a example for someone. )
/*
Title: Find Duplicate Contact to Remove (Example)
Author: Scott Sewell, Customer Effective
Description: Find the most recently created contacts that duplicates an existing contact based on fullname and emailaddress1. The Subquery finds the Name/Email and CreatedOn date of the most recently created duplicates. e.g. if there are multiple contact records with the same name and emailaddress1, return the date of all but the oldest one. */
SELECT FC.contactid ,FC.fullname ,FC.emailaddress1 FROM dbo.FilteredContact AS FC JOIN ( SELECT FC2.fullname ,FC2.emailaddress1 ,MIN(FC2.createdon) MIN_CREATED_ON FROM dbo.FilteredContact AS FC2 WHERE FC2.emailaddress1 IS NOT NULL AND FC2.statecode = 0 GROUP BY FC2.fullname ,FC2.emailaddress1 HAVING COUNT(FC2.contactid) >= 2 ) AS FC_DUPE ON FC_DUPE.MIN_CREATED_ON < FC.createdon AND FC.statecode = 0 AND FC.fullname = FC_DUPE.fullname AND FC.emailaddress1 = FC_DUPE.emailaddress1
This website uses cookies to improve your experience. We'll assume you're ok with this, but you can opt-out if you wish. Cookie settingsACCEPT
Privacy & Cookies Policy
Privacy Overview
This website uses cookies to improve your experience while you navigate through the website. Out of these cookies, the cookies that are categorized as necessary are stored on your browser as they are essential for the working of basic functionalities of the website. We also use third-party cookies that help us analyze and understand how you use this website. These cookies will be stored in your browser only with your consent. You also have the option to opt-out of these cookies. But opting out of some of these cookies may have an effect on your browsing experience.
Necessary cookies are absolutely essential for the website to function properly. This category only includes cookies that ensures basic functionalities and security features of the website. These cookies do not store any personal information.
Any cookies that may not be particularly necessary for the website to function and is used specifically to collect user personal data via analytics, ads, other embedded contents are termed as non-necessary cookies. It is mandatory to procure user consent prior to running these cookies on your website.
Analytical cookies are used to understand how visitors interact with the website. These cookies help provide information on metrics the number of visitors, bounce rate, traffic source, etc.
Performance cookies are used to understand and analyze the key performance indexes of the website which helps in delivering a better user experience for the visitors.
Advertisement cookies are used to provide visitors with relevant ads and marketing campaigns. These cookies track visitors across websites and collect information to provide customized ads.