How to Avoid exceptions in SQL script and making DB robust

Posted by Nakul Lande in Sql Server category on 8/16/2012 for Intermediate level | Points: 250 | Views : 7382

Rating: 5 out of 5

1 vote(s)

Post Article |

Search Articles |

Articles Home

Hi, In my earlier post I gave some tips to improve sql performace and some tips to improve database quality.

This post is a next step in which I gave some excellent tips which will help you to avoid or atleast minimize the database surprises which we called as 'Exceptions'.

Introduction

Exceptions are unavoidable and no one is so perfect who can write defect free script. But by adopting few scripting habits, we can not only minimize the chances of exceptions but also make logic smart enough to handle exception situations. Here are some useful points which can help you to minimize the exceptions in SQL script and boost up performance

Using the Tricks

Please go through each point mentioned below :

1. Normalize your database and tables.

Normalizing a logical database design involves using formal methods to separate the data into multiple, related tables. Several narrow tables with fewer columns is characteristic of a normalized database. Reasonable normalization frequently improves performance. When useful indexes are available, the SQL Server query optimizer is efficient at selecting rapid, efficient joins between tables.

Some of the benefits of normalization include the following:

(a) Faster sorting and index creation.
(b) A larger number of clustered indexes.
(c) Narrower and more compact indexes.
(d) Fewer indexes per table. This improves the performance of the INSERT, UPDATE, and DELETE statements.
Many people ignore normalization during initial database design but sooner or later they have to pay for it. Always give a first thought to table normalization before you design your database.

2. Implement Indexes on Columns.

An index is an on-disk structure associated with a table or views that speed retrieval of rows from the table or view. An index contains keys built from one or more columns in the table or view. These keys are maintained through the B-tree structure that helps SQL Server to find the row or rows associated with the key values quickly and efficiently. Indexes are vital to efficient data access; however, sometime there is a cost associated with creating and maintaining an index structure. For every insert, update and Delete, each index must be updated; But SQL DBMS take care of that.

The Index helps in :

(a) Rapid access of information.

(b) Efficient access of information.

(c) Enforcement of uniqueness constraints.

So, No matter what kind of Index you choose (Clustered or Non-Clustered) you will always gain an improvement in performance and less hanging of (complex) queries.

3. Say No to 'Cursors' or at least avoid them.

The complete set of rows returned by the statement is known as the result set and each row in this result set satisfied the condition mentioned in WHERE clause. In real life scenario many applications need a mechanism to work with one row or a small block of rows at a time and this is the place where Cursors comes in to picture.If you came across such situation then use cursors wisely. Theoretically Cursors DBA's friend but they are fundamentally evil. Cursors force the database engine to repeatedly fetch rows, manage locks, and transmit results. Fetching a row from the cursor may result in a network round trip each time. This uses much more network bandwidth than would ordinarily be needed for the execution of a single SQL.

Cursors allocate resources on the server, for instance locks, packages, processes, temporary storage and combined all force the SQL in resource deadlock if script not written wisely.If you are helpless and had to go with Cursors only then use the appropriate cursors in wise manner for the job get done.

Take care of following while using cursor:

(a) Avoid scrollable cursors unless required.
(b) Use read-only cursors if you don’t intend to update. 90% of the situations need read-only cursors.
(c) Try to use Forward Only cursor when using cursors.
(d) Always (without fail) close and deallocate the cursors used.
(e) minimize the number of columns and records fetched in a cursor.

4. Increase execution Timeouts period.

Based on my past experience and lesson learned, I understood 1 thing that 90% of complex queries failed to return result just because of small timeout period.The default timeout is often low, like 30 seconds. When we execute the complex query or any report query through C# or .Net application then overall network traffic, bandwidth get added in execution cycle.If the timeout is set low then .net application throws 'Timed out' exception. Always wisely set the timeout period for queries which are processing complex or lengthy database (like reporting queries).This will help you greatly to reduced unwanted, unexpected exceptions in query output.

5. Use Transactions blocks to main data integrity.

In database world Transactions are defined as sequence of operations performed as a single logical unit of work. Transaction is also a great life saving term in DBA's world.Transactions works on a terminology called 'ACID' which is a acronym for Atomic, Consistent, Isolated, Durability.

(a) Atomic: Atomic means that the whole block of work in the transaction is treated as a single unit so that either the whole block will performed or whole block will fail. If transaction fails then all the changes can be rolled back. This help to maintain to maintain database consistency.

(b) Consistent: Consistent means that a completed transaction leaves the database in a consistent internal state.

(c) Isolation: Isolations means that the transaction sees the database in a consistent state. This transaction operates on a consistent view of the data. If two transactions try to update the same table, one will go first and then the other will follow.

(d) Durability: Durability means that the results of the transaction are permanently stored in the system.

Use transaction whenever database consistency, integrity is on priority. This will save you when things get wrong. Through transactions you can easily avoid exception and unexpected results in output.

6. Minimize GOTO Usage.

Use the GOTO statement wisely and think twice before using it in script. Excessive use of the GOTO statement can make it difficult to understand the logic of a Transact-SQL batch This is also a infamous time-proven means of adding disorder to program flow. GOTO is best used for breaking out of deeply nested control-of-flow statements but on the other hand, Unintelligent use of GOTO is a quick ticket to unreadable code.

7. Analyze Deadlocks for resource sharing.

Deadlocks are real nightmares for any DBA and avoiding them in complex and multiuser environment (social web sites) are always a challenge. SQL Server automatically detects and breaks the deadlock. The terminated transaction will be automatically rolled back and an error code 1205 will be issued. But still there are many areas which cause and participate in deadlock situations and with wise thinking we can avoid those deadlocks.

(a) Locks: Waiting to acquire locks on resources, such as objects, pages, rows, metadata, and applications can cause deadlock.

(b) Worker Process threads: A queued task waiting for an available worker thread can cause deadlock. If the queued task owns resources that are blocking all worker threads, a deadlock will result.

For example, session S1 starts a transaction and acquires a shared (S) lock on row r1 and then goes to sleep. Active sessions running on all available worker threads are trying to acquire exclusive (X) locks on row r1. Because session S1 cannot acquire a worker thread, it cannot commit the transaction and release the lock on row r1. This results in a deadlock.

(c) Memory: When concurrent requests are waiting for memory grants that cannot be satisfied with the available memory, a deadlock can occur. For example, two concurrent queries, Q1 and Q2, execute as user-defined functions that acquire 10MB and 20MB of memory respectively. If each query needs 30MB and the total available memory is 20MB, then Q1 and Q2 must wait for each other to release memory, and this result in a deadlock.

(d) Parallel query execution-related resources: Coordinator, producer, or consumer threads associated with an exchange port may block each other causing a deadlock usually when including at least one other process that is not a part of the parallel query. Also, when a parallel query starts execution, SQL Server determines the degree of parallelism, or the number of worker threads, based upon the current workload. If the system workload unexpectedly changes, for example, where new queries start running on the server or the system runs out of worker threads, then a deadlock could occur.

8. Avoid NULLable columns or give them special treatment.

If possible, always normalize your table to separate nullable and not nullable columns. Split the table into 'Required Data' table and 'Optional Data' table.The required data table contain basic information that will need all the time whereas optional data table contains extra or null information which don't need as often.NULL is always a bit of an oddball in databases and consume an extra byte on each NULLable column in each row and have more overhead associated when querying data.

If the nullable columns are unavoidable then encapsulate such nullable columns with the condition which can handle null value. For example if .Net code is reading SQL result set with nullable columns then read nullable columns in special nullable variables or use some if-else condition.

The Try-catch block in .Net code can also handle null value exception.In general 'Null' value or 'Nullable' columns are not bad guys and did not even affect any performance but they are the most frequent reasons for exceptions while reading result set.

Always apply thumb rule of exception handling mechanism whenever your result set containing nullable columns.

9. Avoid TEXT data type.

Don’t use 'TEXT' data type in table unless it is really unavoidable. The TEXT data type is not friendly for flexible query and for performance. Even with the log you can easily find out how TEXT column degrade the query performance.TEXT data type is slow and wastes a lot of space if used incorrectly. Better use VARCHAR(MAX) instead of TEXT data type.

Few more drawbacks of TEXT data type are:

(a) We can't use DISTINCT keyword for TEXT data type.
(b) We can't GROUP BY any TEXT data type column.
(c) We can't use TEXT data type in UNION clause.
(d) We can't compare TEXT data type using ' = '.
(e) We can't use TEXT data type in any aggregate function, like COUNT().

All these points lead to unexpected exceptions in your script logic so better stay away from TEXT data type.

10. Multiple User Scenarios - Data Integrity.

In multiuser scenario the updating by last writer always wins and some of the earlier updates will be lost. If the business logic is driven based on the data value then wrong value in database can turn into exception in business logic.This situation can be easily avoided by adding a timestamp column in table and always check the last value in this column before inserting or updating records.Make it a habit to write business logic by considering timestamp value in logic. This will help you to avoid many future exceptions.

11. Use SCOPE_IDENTITY to get new identity record value.

Using "SELECT max(ID) from Table" sounds like the easiest solution to fetch the most recent record entered in table. And many new comers and less experienced users use to fetch recent record with the same logic.This simple logic is the most weird logic in multiuser scenario where multiple users simultaneously making database entry. Using this logic, user will never get the correct record ID which clash in business logic.To avoid this common mistake, always use one of SCOPE_IDENTITY or IDENT_CURRENT constants in script. SCOPE_IDENTITY is more precise as this would give you the identity value of the record created from the current context/session.

12. Check Query Execution Plans before freezing your script.

SQL Server query analyzer is very powerful and handy tool and you'll learn a lot of how it works and how the query and index design can affect performance through it. It help you to understand the execution plan that the execution plan window shows for potential bottlenecks and you can take corrective actions to avoid future exceptions.

13. Keep Procedures/UDF Small and sweet.

Many experienced people loves to write log, bulky and complex SPs. Many of these people are affected with the influence of showing their superiority of writing complex tasks.

But long and bulky SPs have other long term drawbacks.

(a) It became hard to understand the business logic and flow if user referring the SP after long period or if the user is new. This is a major reason to exceptions.
(b) Complicated business flow leads to unhandled exceptions which can occur for certain conditions which we cannot judge from current complicated script.
(c) Two users invoking the same complicated stored procedure simultaneously will cause the procedure to create two query plans in cache.

Try to keep SPs or UDF small in size and scope. It is much more efficient to have one stored procedure call other SP then to have one large complicated procedure.

14. For Bulk INSERT avoid insert scripts.

Use DTS or the BCP utility for bulk insert of large data. Both these tools are flexible and fast in bulk record insertion operation.Avoiding use of large or N number of insert statement in SP for bulk uploading feature, they are not efficient and are not designed for the same and gives lots of exceptions plus failures.

15. Catch Exceptions to avoid surprises.

All modern PL/SQL provides a feature to handle the Exceptions which occur in a PL/SQL Block known as exception Handling. Using Exception Handling we can test the code and avoid it from exiting abruptly instead of getting surprise exception.When an exception occurs a message which explains its cause is received and user can write suitable piece of code to take corrective actions.

There are 3 types of Exceptions.

(a) Named System Exceptions.
(b) Unnamed System Exceptions.
(c) User-defined Exceptions.

Named System Exceptions: System exceptions are automatically raised by DBMS, when a program violates a RDBMS rule. There are some system exceptions (NO_DATA_FOUND or ZERO_DIVIDE) which are raised frequently, so they are pre-defined and given a name in DBMS which are known as Named System Exceptions.

Unnamed System Exceptions: Those system exceptions for which DBNS does not provide a name is known as unnamed system exception. These exceptions do not occur frequently. These Exceptions have a code and an associated message.

User-defined Exceptions: Apart from system exceptions we can explicitly define exceptions based on business rules. These are known as user-defined exceptions.

There are few steps to be followed to use user-defined exceptions:

-They should be explicitly declared in the declaration section.
-They should be explicitly raised in the Execution Section.
-They should be handled by referencing the user-defined exception name in the exception section.

Make it a habit of using exception handling mechanism while writing SQL scripts.

Some of the exceptions handling techniques are:

(a) TRY .. CATCH block in SQL 2008 - Try catch construct catches all execution errors that have a severity higher than 10 that do not close the database connection.

(b) @@ERROR global variable - Make sure that the @@ERROR global variable is checked after every statement which causes an update to the database (INSERT, UPDATE, DELETE).

16. Date Assumption.

Prevent issues with the interpretation of centuries in dates; do not specify years using two digits. Assuming dates formats is the first place to break an application. Hence avoid making this assumption.

17. Keep your database with latest Security / Service Packs

Although this point is not applicable to each and everyone and totally under the decision of network and database administrators, but it is always good to stay date with latest security patches on the software's platforms.The latest s/w patches comes with solutions for the defect in earlier version and minimized the occurrence of any unprecedented surprise exception.

So stay alert.. stay frosty.

18. Handle SQL Injection

As per Wikipedia, SQL injection is a technique often used to attack databases through a website. This is done by including portions of SQL statements in a web form entry field in an attempt to get the website to pass a newly formed rogue SQL command to the database. The vulnerability happens when user input is either incorrectly filtered for string literal escape characters embedded in SQL statements or user input is not strongly typed and unexpectedly executed. SQL commands are thus injected from the web form into the database of an application (like queries) to change the database content or dump the database information like credit card or passwords to the attacker. SQL injection is mostly known as an attack vector for websites but can be used to attack any type of SQL database.

You can same yourself from SQL ejection by adopting following steps:

(a) Constrain and sanitize input data. - Check for known good data by validating for type, length, format, and range.

(b) Use type-safe SQL parameters for data access. - You can use these parameters with stored procedures or dynamically constructed SQL command strings. Parameter collections such as SqlParameterCollection provide type checking and length validation. If you use a parameters collection, input is treated as a literal value, and SQL Server does not treat it as executable code. An additional benefit of using a parameters collection is that you can enforce type and length checks. Values outside of the range trigger an exception. This is a good example of defense in depth.

(c) Limits database permission - Use an account that has restricted permissions in the database. Ideally, you should only grant execute permissions to selected stored procedures in the database and provide no direct table access.

(d) Avoid direct system error - Avoid disclosing database error information. In the event of database errors, make sure you do not disclose detailed error messages to the user.

Security is the most important thing in DBMS as it is the soul of any application and a prime concern for everyone. Always validate all the incoming parameters at all levels of the application. Limit the scope of possible damage by permitting only minimally privileged accounts to send user input to the server.

Adding to it, run SQL Server itself with the least necessary privileges.

19. Avoid Un-necessary Indexes.

Avoid creating un-necessary indexes on table thinking they would improve your performance. For example creating an index on 'Gender' column of table will never improve performace because there are 50% chance for each value (Male/Female) and sql has to traverse entire table to sort data.Understand that creating Indexes and maintaining them are overheads that you incur. And these surely do reduce the throughput for the whole application.

Conclusion

I am sure that this article will definitely help you to take corrective actions and right steps in your work and will encourage you to take some smarts steps going forward.

Cheers !!!!

Reference

en.wikipedia.org/wiki/

About the Author

Full Name: Nakul Lande
Member Level: Bronze
Member Status: Member
Member Since: 7/24/2012 5:13:29 PM
Country: United States
Nakul Lande
http://www.dotnetfunda.com

Bookmark It

Login to vote for this post.

Latest Articles

Comments or Responses

Comment using

(Author doesn't get notification)