1. Normalizing:
Normalize your table’s schema in such a way that all tables are reduced in columns and are related to other table with some reference.
That will improve performance while you are fetching data from tables that will also reduce the fetching of redundant data.
2. Define Primary and Foreign keys:
Make relation of your table in such a way that you can access any combination of data from various tables just by referencing keys.
3. Choose the most appropriate data types.
This is the main issue when actual data will be stored on local disk, because if you have given some inappropriate data type then it will consume more space than actually needed. That will degrade the disk IO performance.
4. Make Index:
Try to make Index on such columns which are frequently used in searching operations, so it will improve our query performance.
Do not make more indexes on one table that it will degrade the performance, because when table is to be updated by updating or inserting new data, indexes also will be updated, so try to make fewer indexes on table.
5. Return Values:
Return only those columns and rows which are actually needed for requirement. Do not try to fetch all rows and columns which are not needed, because it will slow down the fetching as well as make the high IO operations.
6. Avoid Expensive operations:
Try to avoid expensive operations such as “LIKE”, because we are normally using “LIKE %abc%” wildcard entries…and it will require table scan and it will make very slow response of query result.
7. Avoid Explicit or Implicit functions in WHERE Clause:
The optimizer cannot always select an index by using columns in a WHERE clause that are inside functions. Columns in a WHERE clause are seen as an expression rather than a column. Therefore, the columns are not used in the execution plan optimization.
EX: do not use where clause like this,
SELECT OrderID FROM NorthWind.dbo.Orders WHERE DATEADD(day, 15, OrderDate) = '07/23/1996'
Instead, we can use like,
SELECT OrderID FROM NorthWind.dbo.Orders WHERE OrderDate = DATEADD(day, -15, '07/23/1996')
8. Use stored Procedures and Parameterized Queries:
Advantages of using stored procedures are,
Logical separation of business logic: we can reduce the amount of code which is written in application code, so we can separate out some logic from application and embed into stored procedure. This has its own advantage like we have to change only single change in stored procedure and it will be reflected into all places.
Reduced deployment time: When we have embedded sql command in application code then if we want to change that business logic then after changes we have to deploy whole application again, but is we have used stored procedures then it requires only change in stored procedures.
Reduced Network bandwidth: if we supply whole sql command to the server then it will require more bandwidth, but if we have used stored procedures then we just have to supply only stored procedure name and its parameters.
SQL injections: stored procedure will protect against sql injections which are produced by direct user input which is used in sql command.
9. Minimize cursor use:
Cursors force database engine to repeatedly fetch rows, negotiate blocking, manage locks, and transmit results. Use forward only and read only cursors unless you wan tot update the tables. In cursors more locks are needed.
Forward only and read only cursors are fastest and least resource intensive to get data from the server.
10. Use Temporary Tables and Table Variables Appropriately
If your application frequently creates temporary tables, consider using the table variable or a permanent table. You can use the table data type to store a row set in memory. Table variables are cleaned up automatically at the end of the function, stored procedure, or batch that they are defined in. Many requests to create temporary tables may cause contention in both the tempdb database and in the system tables. Very large temporary tables are also problematic. If you find that you are creating many large temporary tables, you may want to consider a permanent table that can be truncated between uses.
Table variables use the tempdb database in a manner that is similar to how table variables use temporary tables, so avoid large table variables. Also, table variables are not considered by the optimizer when the optimizer generates execution plans and parallel queries. Therefore, table variables may cause decreased performance. Finally, table variables cannot be indexed as flexibly as temporary tables.
You have to test temporary table and table variable usage for performance. Test with many users for scalability to determine the approach that is best for each situation. Also, be aware that there may be concurrency issues when there are many temporary tables and variables that are requesting resources in the tempdb database.
11. Avoid LEFT JOINs and NULLs:
LEFT JOIN can be used to retrieve all of the rows from a first table and all matching rows from a second table, plus all rows from the second table that do not match the first one. For example, if you wanted to return every Customer and their orders, a LEFT JOIN would show the Customers who did and did not have orders.
LEFT JOINs are costly since they involve matching data against NULL (nonexistent) data. In some cases this is unavoidable, but the cost can be high. A LEFT JOIN is more costly than an INNER JOIN, so if you could rewrite a query so it doesn't use a LEFT JOIN, it could pay huge dividends.
One technique to speed up a query that uses a LEFT JOIN involves creating a TABLE datatype and inserting all of the rows from the first table (the one on the left-hand side of the LEFT JOIN), then updating the TABLE datatype with the values from the second table. This technique is a two-step process, but could save a lot of time compared to a standard LEFT JOIN. A good rule is to try out different techniques and time each of them until you get the best performing query for your application.
12. Rewriting query with OR conditions as a UNION
You can speedup your query,
SELECT * FROM table WHERE (Field1 = 'Value1') OR (Field2 = 'Value2')
By creating indexes on each field in the above conditions and by using a UNION operator instead
of using OR:
SELECT ... WHERE Field1 = 'Value1'
UNION
SELECT ... WHERE Field2 = 'Value2'
13. SP Performance Improvement without changing T-SQL
There are two ways, which can be used to improve the performance of Stored Procedure (SP) without making T-SQL changes in SP.
1. Do not prefix your Stored Procedure with sp_.
In SQL Server, all system SPs are prefixed with sp_. When any SP is called which begins sp_ it is looked into masters database first before it is looked into the database it is called in.
2. Call your Stored Procedure prefixed with dbo.SPName - fully qualified name.
When SP is called prefixed with dbo. Or database.dbo. It will prevent SQL Server from placing a COMPILE lock on the procedure. While SP executes it determines if all objects referenced in the code have the same owners as the objects in the current cached procedure plan.
Some more, little but very important tips…..
1. Know your data and business application well.
Familiarize yourself with these sources; you must be aware of the data volume and distribution in your database.
2. Test your queries with realistic data.
A SQL statement tested with unrealistic data may behave differently when used in production. To ensure rigorous testing, the data distribution in the test environment must also closely resemble that in the production environment.
3. Write identical SQL statements in your applications.
Take full advantage of stored procedures, and functions wherever possible. The benefits are performance gain as they are precompiled.
4. Use indexes on the tables carefully.
Be sure to create all the necessary indexes on the tables. However, too many of them can degrade performance.
5. Make an indexed path available.
To take advantage of indexes, write your SQL in such a manner that an indexed path is available to it. Using SQL hints is one of the ways to ensure the index is used.
6. Understand the Optimizer.
Understand the optimizer how it uses indexes, where clause, order by clause, having clause, etc.
7. Think globally when acting locally.
Any changes you make in the database to tune one SQL statement may affect the performance of other statements used by applications and users.
8. The WHERE clause is crucial.
The following WHERE clauses would not use the index access path even if an index is available.
e.g. Table1Col1 (Comparision Operator like >, <, >=, <=,) Table1Col2, Table1Col1 IS (NOT) NULL, Table1Col1 NOT IN (value1, value2), Table1Col1 != expression, Table1Col1 LIKE ‘%pattern%’, NOT Exists sub query.
9. Use WHERE instead of HAVING for record filtering.
Avoid using the HAVING clause along with GROUP BY on an indexed column.
10. Specify the leading index columns in WHERE clauses.
For a composite index, the query would use the index as long as the leading column of the index is specified in the WHERE clause.
11. Evaluate index scan vs. full table scan. (Index Only Searches Vs Large Table Scan, Minimize Table Passes)
If selecting more than 15 percent of the rows from a table, full table scan is usually faster than an index access path. An index is also not used if SQL Server has to perform implicit data conversion. When the percentage of table rows accessed is 15 percent or less, an index scan will work better because it results in multiple logical reads per row accessed, whereas a full table scan can read all the rows in a block in one logical read.
12. Use ORDER BY for index scan.
SQL Server optimizer will use an index scan if the ORDER BY clause is on an indexed column. The following query illustrates this point.
13. Minimize table passes.
Usually, reducing the number of table passes in a SQL query results in better performance. Queries with fewer table passes mean faster queries.
14. Join tables in the proper order.
Always perform the most restrictive search first to filter out the maximum number of rows in the early phases of a multiple table join. This way, the optimizer will have to work with fewer rows in the subsequent phases of join, improving performance.
15. Redundancy is good in where condition.
Provide as much information as possible in the WHERE clause. It will help optimizer to clearly infer conditions.
16. Keep it simple, stupid.
Very complex SQL statements can overwhelm the optimizer; sometimes writing multiple, simpler SQL will yield better performance than a single complex SQL statement.
17. You can reach the same destination in different ways.
Each SQL may use a different access path and may perform differently.
18. Reduce network traffic and increase throughput.
Using T-SQL blocks over Multiple SQL statements can achieve better performance as well as reduce network traffic. Stored Procedures are better over T-SQL blocks as they are stored in SQL Server and they are pre-compiled.
19. Better Hardware.
Better hard ware always helps performance. SCACI drives, Raid 10 Array, Multi processors CPU, 64-bit operating system improves the performance by great amount.
20. Avoid Cursors.
Using SQL Server cursors can result in some performance degradation in comparison with select statements. Try to use correlated sub query or derived tables if you need to perform row-by-row operations.
Thursday, December 20, 2007
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment