Stored Procedure optimization is an art mixed with some techniques and needs lots of patience with an eye for detail at every aspect. In our DNF forums, we have observed that this question has been asked multiple times e.g. here and here. In this series of articles, we will provide some of the measurements that will ensure for a better performant Store Procedure. While providing the tips, we will also provide the references that will act as a detail reference point as why the measurement is consider as a better technique. This is the first in the series.
- Never name the user defined store procedures with "sp_". They are system defined and mostly resides under the master db. So if we write a user defined stored procedure by the name "sp_" the query engine will first search the sp inside the master db and if not found then it will search in the current session db. This brings unnecessary round trip. Better to use some other naming convention as "usp_". For a more detail explanation, please read our article on the same.
Avoid * in SELECT statement since SQL Server converts the * to columns name before query execution. One more thing, instead of querying all columns by using * in select statement, give the name of columns which you required. For a more detail explanation, please read our article on the same.
Use EXISTS instead of IN clause since EXISTS is faster than IN Clause. For a more detail explanation, please read our article on the same.
Avoid using user defined functions. For a more detail explanation, please read our article on the same.
Try to maintain as small clustered index as much as possible since the fields used in clustered index may also used in non clustered index.Data in the database is also stored in the order of clustered index.So, maintaining a huge clustered index on a table with a large number of rows increases the size drastically. Interested readers may find this article Effective Clustered Indexes as a useful reference for the same.
Avoid cursor since they are very slow in performance. Instead try to write SET based TSQL queries and instead of a Row by Row approach, the Set based approach submits the batch at a time to the query engine there by boosts the query performance.For a more detail explanation, please read our article on the same.
Avoid using Loop(While).In the case of While loops the rows are rows are processed one after another and hence if we want to submit any job(s) to the query engine, it process in a sequential manner.Where as if we can re-write the same query by using in a SET Based manner, we can submit a complete batch of job(s) to the query engine which on the other hand will be processed much faster way.For a more detail explanation, please read our article on the same.
Try to use less number of Temp tables.Instead use Common table expressions.Though it depends.
Avoid writing recursive Stored procedure. Instead use Recursive Common Table Expression.For a more detail explanation, please read our article on the same.
Try to avoid writing co-related sub query since for every outer row, the inner row gets evaluated.
Use TRY-CATCH for handling errors in T-SQL statements.Or if we are using SQL Server Version 2012 onward, we can use the Throw statement, which is even better.For a more detail explanation, please read our article on the same.
If possible perform a filter by using a where clause and then perform JOIN and not the reverse.This ensures less read operation.
Instead of writing
Select
col1,col2
From TableName
write
Select
col1,col2
From TableName With NOLOCK
as the later solves Share lock issue. However, please don't use it for DML statements like (INSERT/UPDATE/DELETE/MERGE).
Use UNION ALL in place of UNION. UNION sort the result set for distinguished values but not UNION ALL hence it is faster.
Use schema naming convention (e.g. servername.databasename.ownername.storedprocedurename) as it helps the SQL Server for finding that object in a specific schema.
Choose appropriate FILL factors.
If we need to perform a huge join, avoid it. Create views and apply the Projection on that.
Use clustered index while designing the DB design as that won't yield HEAP tables.
Try to avoid NCHAR/NVARCHAR and use CHAR/VARCHAR as the former will unnecessary consumes 2 bytes more as compared to the later since the former is meant for unicode characters storage.
Avoid using VARCHAR(MAX) because we cannot apply indexing on that.
So we have started our journey with Stored Procedure Optimization. In the second part we will look into some more optimization tips. So stay tuned for the next one. Thanks for reading.