Posts Tagged ‘SQL Server Index’

Here, I am writing some indexing related the best practices during index creation with SQL Server DB’s tables.

  • Periodically, run the Index Wizard or Database Engine Tuning Advisor against current Profiler traces to identify potentially missing indexes.
  • Check and remove unused indexes.
  • Its not a thumb rule to get optimized performance by creating Indexes, First check and create index instead of creating redundant indexes.
  • As a rule of thumb, every table should have at least a clustered index. Generally, but not always, the clustered index should be on a column that monotonically increases — such as an identity column, or some other column where the value is increasing — and is unique. In many cases, the primary key is the ideal column for a clustered index.
  • As we can have only one clustered index per table, take extra time to carefully consider how it will be used. Consider the type of queries that will be used against the table, and make an efficient judgement to which query is the most critical, and if this query will benefit from having a clustered index.
  • If a column in a table is not at least 95% unique, then most likely the query optimizer will not use a non-clustered index based on that column. Because of this, generally don’t want to add non-clustered indexes to columns that aren’t at least 95% unique.
  • Keep the width of indexes as narrow as possible. This reduces the size of the index and reduces the number of disk I/O reads required to read the index, boosting performance.
  • If possible, avoid adding a clustered index to a GUID column. GUIDs take up 16-bytes of storage, more than an Identify column, which makes the index larger, which increases I/O reads, which can hurt performance.
  • Try to avoid index on Timestamp column, its extra burden for I/O.
  • Indexes should be considered on all columns that are frequently accessed by the JOIN, WHERE, ORDER BY, GROUP BY, TOP, and DISTINCT clauses.
  • Don’t automatically add indexes on a table because it seems like the right thing to do. Only add indexes if you know that they will be used by the queries run against the table.
  • When creating indexes, try to make them unique indexes if at all possible. SQL Server can often search through a unique index faster than a non-unique index because in a unique index, each Row is unique, and once the needed record is found, SQL Server doesn’t have to look any further.
  • If we are regularly using joins between two or more tables in your queries, performance will be optimized if each of the joined columns has appropriate indexes.
  • Don’t automatically accept the default value of 100 for the fill factor for your indexes. It may or may not best meet your needs. A high fill factor is good for seldom changed data, but highly Modified data needs a lower fill factor to reduce page splitting.
  • Don’t over index OLTP tables, as every index will increases the time it takes to perform INSERTS, UPDATES, and DELETES. There is a fine line between having the ideal number of Indexes (for SELECTs) and the ideal number to minimize the overhead that occurs with indexes during data modifications.
  • If it’s known, an application is performing the same query over and over on the same table, consider creating a non-clustered covering index on the table. A covering index, which is a form of a composite index, includes all of the columns referenced in SELECT, JOIN, and WHERE clauses of a query.

    Happy Reading: Please Comment !

Advertisements

I hope, all professional like DBA/Dev/DB Architect feels, designing an appropriate set of indexes can be one of the more troubling aspects of developing efficient relational database applications. Sometime, it may be, the most important thing we can do to assure optimal application performance when accessing data in a relational/SQL database is to create the correct indexes for tables based on the queries which applications use. But we can start with some basics. For example, consider this SQL statement:


SELECT LASTNAME, SALARY FROM EMPLOYEE WHERE EMPID =‘00110’AND DEPTID =‘D001’;

What index or indexes would make sense for this simple query? First, think about all the possible indexes that may probably create. The list may looks something like this:

  • Index1 on EMPID
  • Index2 on DEPTID
  • Index3 on EMPID and DEPTID

Initially, it is a better way, and Index3 is probably the best of among these. It enables the DBMS/RDBMS or Any DB system to use the index to immediately search the row or rows that satisfy the two simple criteria in the WHERE clause, but as a practice we have to first analyze the impact of these indexes or creating yet another indexes on table of production system. We should be very cautious as appropriate index creation always very complicated, below are the few steps to be keep in mind at the time of index creation..

* INDEX BY WORKLOAD, NOT BY TABLE

Many people make the mistake of just guessing at some indexes to create when they are creating database tables. Without an idea of how the tables are going to be accessed, though, these guesses are often wrong – at least some of them.

Indexes should be built to optimize the access of SQL queries. To properly create an optimal set of indexes requires a list of the SQL to be used, an estimate of the frequency that each SQL statement will be executed, and the importance of each query. Only then can the delicate balancing act of creating the right indexes to optimize the right queries most of the time be made. Below is the query occasionally is used to check missing index as,

SELECT statement AS [database.scheme.table],
column_id , column_name, column_usage,
migs.user_seeks, migs.user_scans,
migs.last_user_seek, migs.avg_total_user_cost,
migs.avg_user_impact
FROM sys.dm_db_missing_index_details AS mid
CROSS APPLY sys.dm_db_missing_index_columns (mid.index_handle)
INNER JOIN sys.dm_db_missing_index_groups AS mig ON mig.index_handle = mid.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats AS migs ON mig.index_group_handle=migs.group_handle
ORDER BY statement, mig.index_group_handle, mig.index_handle, column_id
GO

 

* BUILD INDEXES BASED ON PREDICATES

We can create an expression-based index to improve the performance of queries that use column-expression predicates. For example a query that contain multiple outer joins so that a predicate that can be satisfied by an index on expression is in a different query block than the outer joins.
The Cost Based Optimizer (CBO) is a rather complex piece of code that has to deal with countless different possible scenarios when trying to determine what the most optimal execution plan might be. It’s also a vitally important piece of code because not only do the decisions need to be reasonably accurate so that it doesn’t generate inefficient execution plans but it needs to make these decisions in a reasonably efficient manner else it wastes resources and most importantly wastes time while it performs its calculations.

 

* INDEX MOST-HEAVILY USED QUERIES

Numbers 2 and 3 can be thought of as consequences to Number 1… that is, these are the aspects of application workload that need to be examined to produce appropriate and effective indexes.

* INDEX IMPORTANT QUERIES

The more important the query, the more we might want to tune by index creation. If any management guy like CIO/CTO run a query/report every day, First make sure it should be up to optimal performance. So building indexes for that particular query is important. On the other hand, a query for a clerk might not necessarily be weighted as high, so that query might have to make do with the indexes that already exist. Of course, the decision depends on the application’s importance to the business-not just on the user’s importance. By the way, we have to indexed data properly and always maintain updated statistics and fragmentation.

* INDEX TO AVOID SORTING (GROUP BY, ORDER BY)

In addition to building indexes to optimize data access, indexes can be used to avoid sorting. The GROUP BY and ORDER BY clauses tend to invoke sorts, which can cause performance slowdowns. By indexing on the columns specified in these clauses the relational optimizer can use an index to avoid a sort, and thereby potentially improve performance.

* CREATE INDEXES FOR UNIQUENESS – Primary Key/unique index

Some indexes are required in order to make the database schema valid. Most database systems require that unique indexes be created when unique constraints and primary key constraints exist.

* CREATE INDEXES FOR FOREIGN KEYS

Creating indexes for each foreign key can optimize the performance when accessing and enforcing referential constraints (RI – referential integrity). Most database systems do not require such indexes, but they can improve performance.

* CONSIDER ADDING COLUMNS FOR INDEX ONLY ACCESS

 Sometimes it can be advantageous to include additional columns in an index to increase the chances of index-only access. With index-only access all of the data needed to satisfy the query can be found in the index alone — without having to read data from the table space.

For example, suppose that there is an index on the DEPTID column of the DEPT table. The following query may use this index:


SELECT DEPTNAME FROM DEPT WHERE DEPID =‘D001’;

The index could be used to access only those columns with a DEPTID greater than D000, but then the DBMS would need to access the data in the table space to return the DEPTNAME. If you added DEPTNAME to the index, that is, create the index on (DEPTID, DEPTNAME) then all of the data needed for this query exists in the index and additional I/O to the table space would not need be needed. This technique is sometimes referred to as index overloading.

 

* DON’T ARBITRARILY LIMIT NUMBER OF INDEXES

There should be no arbitrary limit on the number of indexes that you can create for any database table. Relational optimizers rely on indexes to build fast access paths to data. Without indexes data must be scanned – and that can be a long, inefficient means by which to retrieve your data.

Sometimes organizations develop database standards with rules that inhibit the number of indexes that can be created. When a standard such as this exists, it usually is stated as something like “Each table can have at most five indexes created for it” — or — “Do not create more than three indexes for any single table in the database.” These are bad standards. If you already have three indexes, or five indexes, or even 32 indexes, and another index will improve performance why would you arbitrarily want to avoid creating that index?

Anyway, a good indexing standard, if you choose to have one, should read something like this: “Create indexes as necessary to support your database queries. Limitations on creating new indexes should only be entertained when they begin significantly to impede the efficiency of data modification.”

* BE AWARE OF DATA MODIFICATION IMPLICATIONS

The DBMS must automatically maintain every index we create. This means every INSERT and every DELETE to an indexed table will insert and delete not just from the table, but also from its indexes.

Additionally, when we UPDATE the value of a column that has been defined in an index, the DBMS must also update the index. So, indexes speed the process of retrieval but slow down modification.

Source : Internet,Database information today,DBTA