Logo
banner

Blogs

How to quickly match data of two recordsets in SQL Server?

, August 10, 2022 159 Views

While working with T-SQL, many times we create complex queries, and then we might need to optimize them at some later point of time to reduce the execution time. This activity always demands an approach to compare data for the older and newer version of the T-SQL. And here comes the question – how do we validate the data of both queries in a way which is reliable, quick & simple? 

There are many ways to do this and here, I will be explaining a way which I find myself using a lot of times.

Before diving deeper into the approach, let’s first understand the below two functions of SQL which I will be using in my approach:

  • CHECKSUM()
  • CHECKSUM_AGG()

CHECKSUM()calculates a hash for one or more column values in a row and returns an integer. This turn value is deterministic in nature; meaning- will always return the same output for the same input values.

CHECKSUM_AGG()is an aggregate function that takes a single integer value from multiple rows and calculates an aggregated checksum for each group.

So, to match the data and check if there is any discrepancy, we need to follow the below steps:

  • Check the CHECKSUM_AGG() value for both queries.
  • If the value of CHECKSUM_AGG() is not matching, then using the CHECKSUM() value for each row, we can easily find the non-matching rows.

Let’s see how to implement the above steps:

Here, we have created a view named “vw_innerwithwhere” using the below query:

CREATE VIEW vw_innerwithwhere 
AS 

SELECT  
 * 
FROM  
 [dbo].[tblEpisode] E 
INNER JOIN  
 [dbo].[tblEpisodeEnemy] EE 
 ON E.EpisodeId = EE.EpisodeId 
WHERE  
 EE.EpisodeId IN (1,2,3,4,6,8)

Now, we re-write this query by removing WHERE condition and applying the condition in INNER JOIN and we create a new view named “vw_innerwithcondition” just to test whether both the query results the same output or not.

CREATE VIEW vw_innerwithcondition 
AS 

SELECT  
 * 
FROM  
 [dbo].[tblEpisode] E 
INNER JOIN  
 [dbo].[tblEpisodeEnemy] EE 
ON 
 E.EpisodeId = EE.EpisodeId 
 AND EE.EpisodeId IN (1,2,3,4,6,8)

Now, let’s test whether both the views return the same output or not by using CHECKSUM() and CHECKSUM_AGG().

Please note that for best precision and validation, make sure we include the Primary Key column, as the value of the primary key will always be unique and will always generate a unique CHECKSUM(), i.e., with more accuracy.

Let’s do it practically and see how to validate whether the data is the same or not in 2 different views.

Example-1:

  • First, we’ll check CHECKSUM_AGG() function for both the views.

    If both the values are same, we can confirm that both the views return the same data and if they are different, then there is some discrepancy in the data.

Example-2: 

Now, we have modified the view “vw_innerwithcondition” by manipulating the condition as below: 

ALTER VIEW vw_innerwithcondition 
AS 

SELECT  
 * 
FROM  
 [dbo].[tblEpisode] E 
INNER JOIN  
 [dbo].[tblEpisodeEnemy] EE 
 ON E.EpisodeId = EE.EpisodeId  
 AND EE.EpisodeId IN (1,2,3,4,6,9)

Now we follow the following process to validate the data:

  • First, we will check CHECKSUM_AGG() value for both the views.
    We can see here that both the values are different which means there is some discrepancy in the data. So now, we move to Step 2.
  • We will check CHECKSUM() value of each row as shown below:

    As we can clearly see from the output that it shows a discrepancy in the data with the missing EpisodeIds from both the views. And their checksum value is different because both are completely different records.

In this way, we can test if there is any discrepancy between two queries in SQL with less manual work and in less time.

mm

Inkey

INKEY is your solution partner.
Our focus is to deliver you in-time intelligent innovative solutions ("key") for the problems in hand. Maintaining a quality standard right from the inception of a project is our top most priority.

Our team of talented professionals will execute your projects with dedication and excellence. We take ownership and accountability for the effort that goes into meeting our client’s needs.

Years of experience and proven success of delivering innovative custom solutions.

More posts by

Leave a Reply

Your email address will not be published.

The maximum upload file size: 2 MB. You can upload: image, audio, video, document, spreadsheet, interactive, text, archive, code, other. Drop file here