Archive for the ‘Database Administrator’ Category

As a Developer/DBA, deleting unwanted objects, most commonly tables, from a SQL Server instance is a frequent day-to-day routine task. Different people have their own criteria, depending on their system/environment, for identifying unwanted objects, but here I am discussing empty tables, i.e. tables with 0 (zero) records/rows. To list those tables we can use either of the queries below.

SELECT OBJECT_NAME(object_id) AS TableName, SUM(row_count) AS [No. of Rows]
FROM sys.dm_db_partition_stats
WHERE index_id IN (0, 1)
GROUP BY object_id
HAVING SUM(row_count) = 0
ORDER BY TableName

———————— OR ————————

SELECT Obj.name AS TableName, ps.row_count AS [No. of Rows]
FROM sys.indexes AS sIdx
INNER JOIN sys.objects AS Obj ON sIdx.object_id = Obj.object_id
INNER JOIN sys.dm_db_partition_stats AS ps ON sIdx.object_id = ps.object_id
WHERE sIdx.index_id < 2 AND ps.row_count = 0 AND sIdx.index_id = ps.index_id
ORDER BY TableName
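
Once the empty tables are listed, a cautious next step is to generate the DROP statements for review rather than dropping anything blindly. A minimal sketch (review the output before executing any of it):

SELECT 'DROP TABLE ' + QUOTENAME(SCHEMA_NAME(o.schema_id)) + '.' + QUOTENAME(o.name) AS DropCommand
FROM sys.objects AS o
INNER JOIN (SELECT object_id
            FROM sys.dm_db_partition_stats
            WHERE index_id IN (0, 1)
            GROUP BY object_id
            HAVING SUM(row_count) = 0) AS empty
        ON empty.object_id = o.object_id
WHERE o.type = 'U'   -- user tables only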

 

 

It's a vast topic, and people are always very interested to know who deleted a database from their working environment. Here I am using an undocumented function to get this information from the box very quickly, and I hope there will not be any incident in using this command, as it is only a SELECT statement.

SELECT Operation, SUSER_SNAME([Transaction SID]) AS UserName,
       [Transaction Name], [Begin Time], [SPID], [Description]
FROM fn_dblog(NULL, NULL)
WHERE [Transaction Name] = 'dbdestroy'

This will return the drop operation, its begin time, and the username responsible, for further analysis.

I have seen many Developers/DBAs use a user-defined function or stored procedure to split a comma-separated column (or one separated by any delimiter, such as |, ;, -, ., &, or any other character/special character) into rows, using looping logic such as WHILE/FOR loops or cursors.

Let's see an example. Below is some sample data,

and the requirement is to split every SkillSet value into its own row per EmpID and EmpName, like this:

To get the desired result above without using any function, stored procedure, or loop, a simple flat SQL SELECT statement looks like this:

SELECT EmpID, EmpName, Split.Data.value('.', 'VARCHAR(100)') AS String
FROM (SELECT EmpID, EmpName,
             CAST('<M>' + REPLACE(SkillSet, ',', '</M><M>') + '</M>' AS XML) AS String
      FROM VirendraTest) AS Data
CROSS APPLY String.nodes('/M') AS Split(Data)

 

If you want to do some R&D, below is the sample code for the same.

--Step 1) Create a sample table named VirendraTest

CREATE TABLE VirendraTest (EmpID CHAR(5), EmpName VARCHAR(30), SkillSet NVARCHAR(100))
GO

--Step 2) Insert a few sample rows into the table created above, i.e. VirendraTest
INSERT INTO VirendraTest
VALUES ('E001', 'Virendra Yaduvanshi', 'MS-SQL Server,C,C++,C#,.Net'),
       ('E002', 'Manish Raj', 'Account,Cash Management,Admin'),
       ('E003', 'Sanjay Singh', '.Net,VB6,Oracle'),
       ('E004', 'Shajia Khan', 'Coldfusion,Delphi'),
       ('E005', 'Vikash Rai', 'Php,Mongo,Cloud'),
       ('E006', 'Sandeep Arora', 'Network,C,C++'),
       ('E007', 'Manpreet Kaur', 'Java,Android,Mobile Development')
--Step 3) Check the table contents

SELECT * FROM VirendraTest

--Step 4) Split the comma-separated column SkillSet into rows using the query below

SELECT EmpID, EmpName, Split.Data.value('.', 'VARCHAR(100)') AS String
FROM (SELECT EmpID, EmpName,
             CAST('<M>' + REPLACE(SkillSet, ',', '</M><M>') + '</M>' AS XML) AS String
      FROM VirendraTest) AS Data
CROSS APPLY String.nodes('/M') AS Split(Data)
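
As a side note, on SQL Server 2016 and later the built-in STRING_SPLIT function returns the same rows without the XML trick:

SELECT EmpID, EmpName, value AS String
FROM VirendraTest
CROSS APPLY STRING_SPLIT(SkillSet, ',')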

 

 

It's very common for a Developer/DBA to want to see a sample of a few records; commonly they SELECT the TOP N records from a table ordered by a column, with a query that looks like this:

SELECT TOP N [COLUMNNAME] FROM TABLENAME ORDER BY [COLUMNNAME]

The caveat with the statement above is that if the table has multiple records with the same value as the last row in the selected list, not all of those records will be selected; only N records are returned. If we want to select all the rows that share the value of the last selected row, we have to include the WITH TIES option, so the query becomes:

SELECT TOP N WITH TIES [COLUMNNAME] FROM TABLENAME ORDER BY [COLUMNNAME]

For demonstration purposes, below is a step-by-step example.

Step 1: Create a sample table named ProductList

CREATE TABLE ProductList (ID INT IDENTITY(1,1), PName VARCHAR(30), Price DECIMAL(10,2))

Step 2: Insert some sample data

INSERT INTO ProductList (PName, Price) VALUES
('Bajaj CFL', 210.00),
('TL', 135.00),
('Table Fan', 450.00),
('Iron', 450.00),
('Cable', 250.00),
('USB Disk', 450.00),
('Floppy', 120.00),
('CD-R', 280.00),
('CD-W', 450.00),
('USB Cable', 180.00)

 

For testing, SELECT TOP 3 * FROM ProductList ORDER BY Price DESC will return only 3 records, as shown:


But SELECT TOP 3 WITH TIES * FROM ProductList ORDER BY Price DESC will return more than 3 records, as shown:


If we examine performance, the initial query without TIES does better than the WITH TIES option. Let's see an example: suppose we have to find the top 1 record having the maximum price, and look at the execution plans.


Here we can see the WITH TIES option performing very poorly compared to the initial SELECT TOP query. This happens because of the ORDER BY clause; to resolve it, we have to create a proper index on the ordering column to get optimal performance.
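
A minimal sketch of such an index, assuming the ProductList table above (the index name is illustrative):

CREATE NONCLUSTERED INDEX IX_ProductList_Price ON ProductList (Price DESC)
-- With the ORDER BY column indexed, both the TOP and TOP ... WITH TIES queries
-- can read the index in order instead of sorting the whole table.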

It's very rare, but it may happen to us: even when trying to connect to a SQL instance using a sysadmin login, SSMS throws an error saying Database 'msdb' cannot be opened. It has been marked SUSPECT by recovery.


Error 926 commonly points to inconsistencies in the SQL database that affect its file format. The main factor is corruption of the SQL MDF database file, caused for example by hardware malfunction or insufficient disk space on the file system. The underlying reason can be found by digging through the SQL Server error log.

We can use sp_resetstatus 'msdb' at the very initial stage; if the problem does not get resolved, find more information using DBCC CHECKDB.
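
A minimal sketch of that initial triage, run from a sysadmin connection:

EXEC sp_resetstatus 'msdb'      -- clear the SUSPECT flag
-- then restart the instance service and check consistency:
DBCC CHECKDB ('msdb') WITH NO_INFOMSGS, ALL_ERRORMSGS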

Anyway, msdb holds information on backups, jobs, DTS/SSIS packages, etc., so it does not change very frequently, and restoring last night's backup usually works.
The other simplest way, if you are ready to bear/handle the loss of backup, job, and DTS/SSIS information: find msdb's MDF and LDF files (named MSDBData.mdf and MSDBLog.ldf by default), stop the SQL Server instance service, and rename both files. Then take a copy of the msdb MDF and LDF files from another running SQL Server instance and copy them to the problematic instance's location where the msdb files were just renamed. (Note: to copy the MDF and LDF files from the running instance, we first have to stop that instance.) After copying the MDF and LDF files, just restart the SQL Server instance service and the problem should be resolved.

 

2014 in review

Posted: December 30, 2014 by Virendra Yaduvanshi in Database Administrator

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 26,000 times in 2014. If it were a concert at Sydney Opera House, it would take about 10 sold-out performances for that many people to see it.


As a SQL Server DBA, here are the day-to-day activities I believe we have to maintain:

  1. Check OS event logs, SQL Server logs, and security logs for unusual events.
  2. Verify that all scheduled jobs have run successfully (a sample check query follows this list).
  3. Confirm that backups have been made and successfully saved to a secure location; also check whether those backups are restorable or not.
  4. Monitor disk space to ensure your SQL Servers won't run out of it. For best performance, all disks should have 25% or more free space.
  5. Throughout the day, periodically monitor server performance. Use System Monitor, Profiler, DMVs, or the SQL Server Performance Data Collector.
  6. Use Management Studio or Profiler to help monitor and identify blocking issues.
  7. Keep a log of any changes made on servers, including documentation of any performance issues identified and corrected.
  8. Create SQL Server alerts to notify about potential problems, and have them e-mailed to the team or to individuals. Always discuss necessary topics with the DEV/UAT/QC teams and take action as needed.
  9. Take some time to learn something new as a DBA for further knowledge and professional development.
  10. Whenever you get time, try to explain new things to the Dev team.
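
For item 2, a hedged sample query against the msdb job-history tables to list job executions that did not succeed:

SELECT j.name AS JobName, h.run_date, h.run_time, h.message
FROM msdb.dbo.sysjobs AS j
INNER JOIN msdb.dbo.sysjobhistory AS h ON h.job_id = j.job_id
WHERE h.step_id = 0        -- job outcome rows
  AND h.run_status <> 1    -- 1 = succeeded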

All my dear DBA friends are requested to please share the checklist documents they follow on a daily basis; let's discuss them here.

I hope all professionals, whether DBA, Dev, or DB Architect, feel that designing an appropriate set of indexes can be one of the more troubling aspects of developing efficient relational database applications. Sometimes the most important thing we can do to assure optimal application performance when accessing data in a relational/SQL database is to create the correct indexes for tables, based on the queries the applications use. But we can start with some basics. For example, consider this SQL statement:


SELECT LASTNAME, SALARY FROM EMPLOYEE WHERE EMPID = '00110' AND DEPTID = 'D001';

What index or indexes would make sense for this simple query? First, think about all the possible indexes we could create. The list might look something like this:

  • Index1 on EMPID
  • Index2 on DEPTID
  • Index3 on EMPID and DEPTID

Index3 is probably the best among these: it enables the DBMS/RDBMS (or any DB system) to use the index to immediately locate the row or rows that satisfy the two simple criteria in the WHERE clause. But as a practice we have to first analyze the impact of these indexes, or of creating yet more indexes, on the tables of a production system. We should be very cautious, as appropriate index creation is always complicated; below are a few points to keep in mind at index-creation time.
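
For reference, a minimal sketch of Index3 for the query above (the index name is illustrative):

CREATE NONCLUSTERED INDEX IX_EMPLOYEE_EMPID_DEPTID ON EMPLOYEE (EMPID, DEPTID)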

* INDEX BY WORKLOAD, NOT BY TABLE

Many people make the mistake of just guessing at some indexes to create when they are creating database tables. Without an idea of how the tables are going to be accessed, though, these guesses are often wrong – at least some of them.

Indexes should be built to optimize the access paths of SQL queries. Properly creating an optimal set of indexes requires a list of the SQL to be used, an estimate of the frequency with which each SQL statement will be executed, and the importance of each query. Only then can the delicate balancing act of creating the right indexes, to optimize the right queries most of the time, be performed. Below is a query occasionally used to check for missing indexes:

SELECT mid.statement AS [database.schema.table],
       mic.column_id, mic.column_name, mic.column_usage,
       migs.user_seeks, migs.user_scans,
       migs.last_user_seek, migs.avg_total_user_cost,
       migs.avg_user_impact
FROM sys.dm_db_missing_index_details AS mid
CROSS APPLY sys.dm_db_missing_index_columns (mid.index_handle) AS mic
INNER JOIN sys.dm_db_missing_index_groups AS mig ON mig.index_handle = mid.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats AS migs ON mig.index_group_handle = migs.group_handle
ORDER BY mid.statement, mig.index_group_handle, mig.index_handle, mic.column_id
GO

 

* BUILD INDEXES BASED ON PREDICATES

We can create an expression-based index to improve the performance of queries that use column-expression predicates, for example a predicate that applies a function or expression to a column, which a plain index on the bare column cannot satisfy efficiently.
The cost-based optimizer (CBO) is a rather complex piece of code that has to deal with countless possible scenarios when trying to determine the most optimal execution plan. It is also a vitally important piece of code: not only must its decisions be reasonably accurate so that it doesn't generate inefficient execution plans, it must also make those decisions in a reasonably efficient manner, or else it wastes resources and, most importantly, time while it performs its calculations.
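
SQL Server has no direct expression index; an indexed computed column is the usual equivalent. A hedged sketch, reusing the EMPLOYEE table from the example above (column and index names illustrative):

ALTER TABLE EMPLOYEE ADD LastNameUpper AS UPPER(LASTNAME)
CREATE INDEX IX_EMPLOYEE_LastNameUpper ON EMPLOYEE (LastNameUpper)
-- A predicate such as WHERE UPPER(LASTNAME) = 'SMITH' can now seek this index.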

 

* INDEX MOST-HEAVILY USED QUERIES

This guideline and the next can be thought of as consequences of the first one (index by workload): they are the aspects of the application workload that need to be examined to produce appropriate and effective indexes.

* INDEX IMPORTANT QUERIES

The more important the query, the more we might want to tune it by index creation. If a management person like the CIO/CTO runs a query/report every day, first make sure it is at optimal performance; building indexes for that particular query is important. On the other hand, a query for a clerk might not necessarily be weighted as high, so that query might have to make do with the indexes that already exist. Of course, the decision depends on the application's importance to the business, not just on the user's importance. By the way, we have to index data properly and always maintain updated statistics and control fragmentation.

* INDEX TO AVOID SORTING (GROUP BY, ORDER BY)

In addition to building indexes to optimize data access, indexes can be used to avoid sorting. The GROUP BY and ORDER BY clauses tend to invoke sorts, which can cause performance slowdowns. By indexing on the columns specified in these clauses the relational optimizer can use an index to avoid a sort, and thereby potentially improve performance.
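
A minimal sketch, reusing the ProductList sample table from earlier:

CREATE INDEX IX_ProductList_PName ON ProductList (PName)

SELECT PName, COUNT(*) AS Items
FROM ProductList
GROUP BY PName    -- can be satisfied by an ordered scan of the index
ORDER BY PName    -- no separate sort operator needed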

* CREATE INDEXES FOR UNIQUENESS – Primary Key/unique index

Some indexes are required in order to make the database schema valid. Most database systems require that unique indexes be created when unique constraints and primary key constraints exist.

* CREATE INDEXES FOR FOREIGN KEYS

Creating indexes for each foreign key can optimize the performance when accessing and enforcing referential constraints (RI – referential integrity). Most database systems do not require such indexes, but they can improve performance.
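
A hedged sketch (constraint and index names illustrative): SQL Server, for one, does not index a foreign key automatically, so add the index explicitly.

ALTER TABLE EMPLOYEE
    ADD CONSTRAINT FK_EMPLOYEE_DEPT FOREIGN KEY (DEPTID) REFERENCES DEPT (DEPTID)
CREATE INDEX IX_EMPLOYEE_DEPTID ON EMPLOYEE (DEPTID)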

* CONSIDER ADDING COLUMNS FOR INDEX ONLY ACCESS

 Sometimes it can be advantageous to include additional columns in an index to increase the chances of index-only access. With index-only access all of the data needed to satisfy the query can be found in the index alone — without having to read data from the table space.

For example, suppose that there is an index on the DEPTID column of the DEPT table. The following query may use this index:


SELECT DEPTNAME FROM DEPT WHERE DEPTID = 'D001';

The index could be used to locate the qualifying DEPTID values, but then the DBMS would need to access the data in the table space to return the DEPTNAME. If you added DEPTNAME to the index, that is, created the index on (DEPTID, DEPTNAME), then all of the data needed for this query would exist in the index and additional I/O against the table space would not be needed. This technique is sometimes referred to as index overloading.
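
Two hedged ways to overload the index as described (index names illustrative):

CREATE INDEX IX_DEPT_DEPTID_DEPTNAME ON DEPT (DEPTID, DEPTNAME)
-- or, on SQL Server, keep DEPTNAME out of the key with INCLUDE:
CREATE INDEX IX_DEPT_DEPTID_Covering ON DEPT (DEPTID) INCLUDE (DEPTNAME)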

 

* DON’T ARBITRARILY LIMIT NUMBER OF INDEXES

There should be no arbitrary limit on the number of indexes that you can create for any database table. Relational optimizers rely on indexes to build fast access paths to data. Without indexes data must be scanned – and that can be a long, inefficient means by which to retrieve your data.

Sometimes organizations develop database standards with rules that inhibit the number of indexes that can be created. When a standard such as this exists, it usually is stated as something like “Each table can have at most five indexes created for it” — or — “Do not create more than three indexes for any single table in the database.” These are bad standards. If you already have three indexes, or five indexes, or even 32 indexes, and another index will improve performance why would you arbitrarily want to avoid creating that index?

Anyway, a good indexing standard, if you choose to have one, should read something like this: “Create indexes as necessary to support your database queries. Limitations on creating new indexes should only be entertained when they begin significantly to impede the efficiency of data modification.”

* BE AWARE OF DATA MODIFICATION IMPLICATIONS

The DBMS must automatically maintain every index we create. This means every INSERT and every DELETE to an indexed table will insert and delete not just from the table, but also from its indexes.

Additionally, when we UPDATE the value of a column that has been defined in an index, the DBMS must also update the index. So, indexes speed the process of retrieval but slow down modification.

Source: Internet, Database Information Today, DBTA

Lock contention issues can be very frustrating to investigate and debug. They happen due to concurrency problems. Before blaming the database system, we first have to ask ourselves some questions:

  • Has the application run in the past without locking problems?
  • Have the lock timeouts or deadlocks started recently?
  • What version and level of the DBMS are running?
  • Does the problem only occur at certain times?
  • What has changed on the system (e.g., number of users, number of applications, amount of data in the tables, database maintenance/fix packs, changes to any other relevant software, etc.)?
  • What, if anything, has changed in the application (e.g., isolation level, concurrent executions, volume of data, etc.)?

A developer who has written applications to access database data probably has had to deal with concurrency problems at some point in their career. When one application program tries to read data that’s in the process of being changed by another, the DBMS must control access until the modification is complete to ensure data integrity. Typically, DBMS products use a locking mechanism to control access and modifications while ensuring data integrity.

When one task is updating data on a page (or block), another task can’t access data (read or update) on that same page (or block) until the data modification is complete and committed. When multiple users can access and update the same data at the same time, a locking mechanism is required. This mechanism must be capable of differentiating between stable data and uncertain data. Stable data has been successfully committed and isn’t involved in an update in a current unit of work. Uncertain data is currently involved in an operation that could modify its contents.

Most modern DBMS products allow us to control the level of locking (table, page/block, row), as well as to adjust other locking criteria (for example, locks per user, time to wait for locks, etc.). Lock timeouts are one of the most perplexing issues encountered by database professionals. The longer a lock is held, the greater the potential impact to other applications. When an application requests a lock that's already held by another process, and the lock can't be shared, that application is suspended. A suspended process temporarily stops running until the lock can be acquired. When an application has been suspended for a pre-determined period of time, it will be terminated. When a process is terminated because it exceeds this period of time, it's said to time out. In other words, a timeout is caused by the unavailability of a given resource.

To minimize lock timeouts, be sure to design application programs with locking in mind from the start. Limit the number of rows accessed by coding predicates to filter unwanted rows. Doing so reduces the number of locks on pages containing rows that are accessed but not required, thereby reducing timeouts and deadlocks. Also, we should design update programs so the update is issued as close to the COMMIT point as possible. Doing so reduces the time that locks are held during a unit of work, which also reduces timeouts (and deadlocks).

Deadlocks also cause concurrency problems. A deadlock occurs when two separate processes compete for resources held by one another. For example, a deadlock transpires when one program holds a lock on PAGE1 and wants to lock PAGE2 while, at the same time, another program holds a lock on PAGE2 and wants a lock on PAGE1. One of the programs must be terminated to allow processing to continue. One technique to minimize deadlocks is to code your programs so that tables are accessed in the same order. By designing all application programs to access tables in the same order, you reduce the likelihood of deadlocks.
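
A minimal sketch of consistent access order (table names hypothetical): if every program updates Accounts before Orders, two concurrent transactions cannot end up waiting on each other in reverse order.

BEGIN TRANSACTION
    UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1   -- always first
    UPDATE Orders   SET Status  = 'PAID'        WHERE OrderID  = 42   -- always second
COMMIT TRANSACTION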

It is important to design all programs with a COMMIT strategy. A COMMIT externalizes the modifications that occurred in the program since the beginning of the program or the last COMMIT. A COMMIT ensures that all modifications have been physically applied to the database, thereby ensuring data integrity and recoverability. Failing to code COMMITs in a data modification program can cause lock timeouts for other concurrent tasks.

You can also control the isolation level, or serialization, of the data requests in your programs. Programs using the repeatable read locking strategy hold their locks until a COMMIT is issued. If no COMMITs are issued during the program, locks aren't released until the program completes, thereby negatively affecting concurrency. This can cause lock timeouts and lock escalation.
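
A minimal sketch of setting the isolation level explicitly per session, so locks are not held longer than the program actually needs:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED   -- shared locks released after each read
-- versus REPEATABLE READ, which holds shared locks until COMMIT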

Beyond these, a DBA has techniques to minimize lock timeouts. When an object is being accessed concurrently by multiple programs or users, consider increasing free space, causing fewer rows to be stored on a single page, at least until data is added. The fewer rows per page, the less intrusive page locking will be, because fewer rows will be impacted by a page lock. Locking is a complex issue and can be at the root of many performance problems.

Happy Reading …. Please suggest your views and experience on this topic.

It's very difficult to be confident about the security of a database environment. Because databases may contain sensitive or regulated information, critical applications, or stored functions, ensuring database security is undoubtedly a number-one priority. And with a number of users viewing and accessing the data, how about all those “who-what-when-where” details that might be hidden from your radar?

The increasing pressure of compliance regulations and security policies makes the deployment of high-level database protection a must-have for any organization. However, it is generally observed that in almost 90% of cases, unnoticed changes to database configurations result in outages and security breaches.

For those looking for ways to advance database security, here are 5 SQL Server best practices to maintain database security and streamline compliance.

Tip 1: Minimize SQL server exposure and do not leave any “open doors”

We can take the first step to minimize security risks for SQL Server even before your installation is complete and fully configured. Install only required components. In the first place, when configuring your installation, remember the principle of least privilege. Running SQL Server services under an account with local Windows administrative privileges is not a good idea. In case a violator gains possession of such an account with extended privileges, the probability of unwanted outcomes increases. The risk of overall exposure can be minimized if you use a domain account with minimum required privileges instead.

It stands to reason to avoid using the default settings. Rename or disable the default system account for server administration after installation. The same is applicable to naming SQL Server instances instead of using the default instances. Changing the SQL Server port number, which is 1433 by default, will also help you minimize service and data exposure, and so will hiding SQL Server instances and/or disabling the SQL Server Browser service.

Also, do not leave anything unattended. Disable and remove everything you do not use: any unnecessary services or databases on production servers, for example, and any sample or test data you may have used to verify successful installation.
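
A hedged sketch of turning off surface-area features that are not in use:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'xp_cmdshell', 0                 -- keep OS shell access off
EXEC sp_configure 'Ole Automation Procedures', 0
RECONFIGURE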

Tip 2: Control who can access SQL server and how

When thinking about user and service account authentication, be mindful of establishing user accountability and avoiding misuse of privileged accounts. When you can choose between integrated (Windows) authentication and built-in SQL Server authentication, choose the first option whenever possible. Integrated authentication encrypts messages to validate users, while built-in authentication passes SQL Server logins and passwords across the network and leaves them unprotected. If you have to use built-in SQL Server authentication for application compatibility, make sure you have a strong password policy in place.
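
When SQL Server authentication is unavoidable, a hedged sketch (login name and password illustrative) of a login that inherits the Windows password policy:

CREATE LOGIN AppReportingLogin
    WITH PASSWORD = 'Use-A-Str0ng-Passphrase!',
         CHECK_POLICY = ON,
         CHECK_EXPIRATION = ON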

Again, never use shared user accounts for administrators. SQL Server administrators should have dedicated accounts with no administrative privileges in other systems. Also, make sure that each admin uses a personal user account. The same recommendation goes for applications: creating separate service accounts with descriptive names for each application that works with SQL Server is among the security best practices.

Tip 3: Plan database ownership and data security in advance

Start by identifying the needed level of protection and encryption for each database. This is an important issue when you have to deal with securing sensitive data, such as credit card numbers or patient health information, which is also a staple requirement to meet PCI or HIPAA compliance regulations. Having ensured complete visibility into what is happening across your databases, you strengthen security and streamline compliance by reducing the risk of missing suspicious activities.

When creating a database, make sure that you get all the necessary information about data confidentiality. Do not forget to assign distinct database owners, meaning that the same login should not be applied across all databases. In order to mitigate future risks, establish the same process for new database requests and approvals as well as for database retention.

Protecting database files on disk from unauthorized access and copying in real time is highly recommended and can be done by leveraging database-level encryption with the Transparent Data Encryption (TDE) feature. In case you need to keep data encrypted in memory (until it is actively decrypted), and/or if you need to give specific users granular access to certain column or cell values, it is recommended to use cell-level encryption.
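
A hedged sketch of enabling TDE (database, certificate, and password names illustrative; back up the certificate and keys before relying on this):

USE master
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<StrongMasterKeyPassword>'
CREATE CERTIFICATE TDECert WITH SUBJECT = 'TDE certificate'
GO
USE SalesDB    -- hypothetical target database
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TDECert
ALTER DATABASE SalesDB SET ENCRYPTION ON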

Tip 4: Regularly patch your SQL servers

The list of security best practices would not be complete without mentioning the need for proper patch management. Because attackers are actively looking for new security flaws in IT systems, and new malware and viruses appear every day, establishing proper patch management of your SQL servers should be among mandatory security practices.

A timely deployment of current SQL Server service packs, cumulative updates, and critical security hotfixes will advance the stability of database performance. It is also necessary to pay attention to regularly updating the underlying Windows Server operating system and any supporting applications, such as antivirus software.

Tip 5: Keep track of what’s going on

Finally, establishing accountability in many respects means staying up-to-date with configuration changes and user activity. This is an ongoing process of maintaining the actual state of security policies to make sure that all changes are authorized and documented.

Note: Always keep in mind that security is not a state – it is a process. Monitoring, alerting and reporting on changes must become a part of the entire data lifecycle.

Native audit logs allow us, to some extent, to check recent activities and changes affecting security, but obtaining a view of changes made long ago can be a challenge. Much excessive information is saved, and as a result the logs very often do not contain the required data. Change auditing, on the contrary, can help detect unauthorized and malicious changes at early stages or show you the historical data, all of which helps prevent data breaches and system downtime.

Security requires a thoughtful policy: try implementing continuous auditing to protect the database environment against internal and external threats by ensuring complete visibility across databases.

Happy Reading !

Source: dbta.com