Using OPENQUERY()

When compiling data from many different sources you may need to reach through to linked servers to add to your datasets. This type of task can take a heavy toll on your query execution times and on your overall query performance. Instead of the query optimizer staying within a single instance of SQL Server it now has to process the best route available to reach across to another instance to grab your data. For very large datasets or datasets with heavy aggregations this can pose a problem and will take minutes to run.

Luckily for us Microsoft included an open rowset function to allow a specified pass-through query to execute on a specified linked server. What this means is that you are telling the query to process a query that you provide to run as if it were natively on the specified instance and return the results to the client instance. Let’s look at an example:

I am logged into an instance of SQL Server that I am pulling data out of, let’s call this SQL1. From this instance I am given the task to pull data that lives locally but I also need to grab information that exists in another instance/server all together. These 2 instances of SQL Server are set up as linked servers, giving them the ability to communicate with one another. You have 2 ways to go about this; 1) write your query to manually jump from your current instance and pull your related data in, or 2) grab the dataset that you wish to filter using OPENQUERY and then relate your data together. Which one is faster and which one will work better overall? Let’s find out.

I run the below query, which ultimately takes 9 seconds to run and returns 5401 rows after chewing through millions of rows to filter down to a certain date.

OPENQUERY1

When looking at the execution plan for this distributed query you receive the following:

OPENQUERY2

Using remote queries is expensive, period. The above execution plan shows that all of the I/O is being allocated to running a “remote query” which means that it’s creating a connection through the linked server at run time to pull the information through little by little. The OPENQUERY results will be a little bit different.

OPENQUERY3

OPENQUERY query execution plan:

 

OPENQUERY4

The first thing I noticed when using OPENQUERY was how fast it was to run. Instead of taking 9 seconds like the previous query did it took less than one and also generated a different execution plan. Now you’ll see that the query is using a “remote scan” to scan the table index locally after creating a remote connection and running the query locally through the connection.

Using OPENQUERY()

Leave a comment