As a DBA I spend a lot of time monitoring and babysitting database servers. This can become tedious, not to mention boring, for e. I would rather spend my time doing something more constructive, so I have come up with a simple SQL Server task that will monitor database activities for me and notify me when there is an issue. This is both pro-active and re-active, in my opinion, because it gathers the information and gets it to me automatically, so I can get to the root cause fast.
This process consists of the following:
- 1 database to house the information (I have a central server for my DBA activities)
- 1 table to house the server information
- 1 job to collect data and kick off reports
- At least 1 alert to notify and kick off the job (I use CPU usage greater than 90% and insufficient resources, for example, but you decide when you want notification)
I have created a database that I use for my DBA activities that I call SQL_Monitoring, however, you can call yours whatever you like. Just remember to change it in the scripts below. Please make sure that you have the following before you begin:
- DBMail enabled on your server(s) - this is absolutely key to this process
- Create yourself as an operator so you can receive notifications if your job fails as well as used on alerts. Personally, I like to add a notification on every job I create in case it fails. This is optional.
I'm not going to go into details on how to set these up as they are readily available from other sources, however, just consider them as prerequisites.
Step One: Create the Database (optional if you already have a DBA database as described above - go to step 2). This will be used as the central place to store data that will be used in the reports.
Copy/Paste/Run the script below to create the database. Make sure to change this to suit your environment (paths, server names, operator names, alert names).
--Create the database USE [master] GO CREATE DATABASE [SQL_Monitoring] ON PRIMARY ( NAME = N'SQL_Monitoring', FILENAME = N'<ADD YOUR PATH HERE>\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\SQL_Monitoring.mdf' , SIZE = 95040KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB ) LOG ON ( NAME = N'SQL_Monitoring_log', FILENAME = N'<ADD YOUR PATH HERE>\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\SQL_Monitoring_log.ldf' , SIZE = 69760KB , MAXSIZE = 2048GB , FILEGROWTH = 10%) GO ALTER DATABASE [SQL_Monitoring] SET COMPATIBILITY_LEVEL = 100 GO IF (1 = FULLTEXTSERVICEPROPERTY('IsFullTextInstalled')) begin EXEC [SQL_Monitoring].[dbo].[sp_fulltext_database] @action = 'enable' end GO ALTER DATABASE [SQL_Monitoring] SET ANSI_NULL_DEFAULT OFF GO ALTER DATABASE [SQL_Monitoring] SET ANSI_NULLS OFF GO ALTER DATABASE [SQL_Monitoring] SET ANSI_PADDING OFF GO ALTER DATABASE [SQL_Monitoring] SET ANSI_WARNINGS OFF GO ALTER DATABASE [SQL_Monitoring] SET ARITHABORT OFF GO ALTER DATABASE [SQL_Monitoring] SET AUTO_CLOSE OFF GO ALTER DATABASE [SQL_Monitoring] SET AUTO_CREATE_STATISTICS ON GO ALTER DATABASE [SQL_Monitoring] SET AUTO_SHRINK OFF GO ALTER DATABASE [SQL_Monitoring] SET AUTO_UPDATE_STATISTICS ON GO ALTER DATABASE [SQL_Monitoring] SET CURSOR_CLOSE_ON_COMMIT OFF GO ALTER DATABASE [SQL_Monitoring] SET CURSOR_DEFAULT GLOBAL GO ALTER DATABASE [SQL_Monitoring] SET CONCAT_NULL_YIELDS_NULL OFF GO ALTER DATABASE [SQL_Monitoring] SET NUMERIC_ROUNDABORT OFF GO ALTER DATABASE [SQL_Monitoring] SET QUOTED_IDENTIFIER OFF GO ALTER DATABASE [SQL_Monitoring] SET RECURSIVE_TRIGGERS OFF GO ALTER DATABASE [SQL_Monitoring] SET DISABLE_BROKER GO ALTER DATABASE [SQL_Monitoring] SET AUTO_UPDATE_STATISTICS_ASYNC OFF GO ALTER DATABASE [SQL_Monitoring] SET DATE_CORRELATION_OPTIMIZATION OFF GO ALTER DATABASE [SQL_Monitoring] SET TRUSTWORTHY OFF GO ALTER DATABASE [SQL_Monitoring] SET ALLOW_SNAPSHOT_ISOLATION OFF GO ALTER DATABASE [SQL_Monitoring] SET PARAMETERIZATION SIMPLE GO ALTER DATABASE [SQL_Monitoring] SET READ_COMMITTED_SNAPSHOT OFF GO ALTER DATABASE [SQL_Monitoring] SET HONOR_BROKER_PRIORITY OFF GO ALTER DATABASE [SQL_Monitoring] SET READ_WRITE GO ALTER DATABASE [SQL_Monitoring] SET RECOVERY SIMPLE GO ALTER DATABASE [SQL_Monitoring] SET MULTI_USER GO ALTER DATABASE [SQL_Monitoring] SET PAGE_VERIFY CHECKSUM GO ALTER DATABASE [SQL_Monitoring] SET DB_CHAINING OFF GO
Step Two: Create the SP_Who_Results Table. This is where the results from running sp_who will be collected and stored. This table is truncated and reloaded with every run. I am writing this to a table so I can easily query the results and only use the columns in which I am interested. Plus, I can add a “where clause” to narrow down the results in my report in step three.
Copy/Paste/Run the script below. Make sure to change this to suit your environment.
--Create The Table USE [SQL_Monitoring] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO SET ANSI_PADDING ON GO CREATE TABLE [dbo].[SP_Who_Results]( [SPID] [nchar](10) NULL, [ecid] [nchar](10) NULL, [status] [varchar](200) NULL, [loginame] [varchar](50) NULL, [hostname] [varchar](100) NULL, [blk] [nchar](10) NULL, [dbname] [varchar](max) NULL, [cmd] [varchar](max) NULL, [request_id] [nchar](10) NULL ) ON [PRIMARY] GO
Step Three: Create the Job “Report What's Running Now”
This job calls queries to return the following information in a user friendly HTML email format:
First determine what is currently running in SQL Server. This is created from the query:
SELECT program_name, count(*) FROM Master..sysprocesses WHERE ecid=0 and program_name <> '' '' GROUP BY program_name ORDER BY count(*) desc
Determine what processes are using a lot of CPU from SQL Server with this query:
SELECT spid, program_name, datediff(second,login_time, getdate()) FROM master..sysprocesses WHERE spid > 50 and PROGRAM_NAME not in (''Microsoft SQL Server Management Studio '')
Get the top 10 questionable SQL Server processes:
SELECT top 10 spid, blocked, convert(varchar(50),db_name(dbid)), cpu, datediff(second,login_time, getdate()), convert(varchar(16), hostname), convert(varchar(50), program_name), convert(varchar(20), loginame) FROM master..sysprocesses WHERE datediff(second,login_time, getdate()) > 0 and spid > 50 and cpu >=1024 ORDER BY 6 desc
Find SQL Server resource hogs:
SELECT convert(varchar(50), program_name), count(*), sum(cpu), sum(datediff(second, login_time, getdate())) FROM master..sysprocesses WHERE spid > 50 GROUP BY convert(varchar(50), program_name) ORDER BY 7 DESC
Find who is connected: this is created by running sp_who and dumping results in the table we created in step two.
Copy/Paste/Run the script below. Make sure to change this to suit your environment.
--Create the job USE [msdb] GO BEGIN TRANSACTION DECLARE @ReturnCode INT SELECT @ReturnCode = 0 IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'Data Collector' AND category_class=1) BEGIN EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'Data Collector' IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback END DECLARE @jobId BINARY(16) EXEC @ReturnCode = msdb.dbo.sp_add_job @job_name=”Whats Running Now', @enabled=1, @notify_level_eventlog=0, @notify_level_email=2, @notify_level_netsend=0, @notify_level_page=0, @delete_level=0, @description=N'Collects process data for: What is Currently Running in SQL Server Processes using a lot of CPU from SQL Server Top 10 Questionable SQL Server Processes SQL Server Resource Hogs Who is connected', @category_name=N'Data Collector', @owner_login_name=N'sa', @notify_email_operator_name=N'<Add Your Operator Name Here>', @job_id = @jobId OUTPUT IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Run the Report', @step_id=1, @cmdexec_success_code=0, @on_success_action=1, @on_success_step_id=0, @on_fail_action=2, @on_fail_step_id=0, @retry_attempts=0, @retry_interval=0, @os_run_priority=0, @subsystem=N'TSQL', @command=N'SET nocount ON DECLARE @Subject VARCHAR (100) SET @Subject=''SQL Server Whats Causing the High CPU & Performance Issues on '' + @@ServerName --clean up sp_who Truncate table [SP_Who_Results] --run SP_Who Insert into [SP_Who_Results] execute sp_who --Reports DECLARE @tableHTML NVARCHAR(MAX) ; SET @tableHTML = N''<strong><font color="red">What is Currently Running in SQL Server</font></strong> <br> <table border="1">'' + N''<tr>'' + N''<th>Program Name</th>'' + N''<th>Running Processes</th>'' + N''</tr>'' + CAST ( (SELECT td=program_name,'''' ,td= count(*),'''' FROM Master..sysprocesses WHERE ecid=0 and program_name <> '' '' GROUP BY program_name ORDER BY count(*) desc FOR XML PATH(''tr''), TYPE ) AS NVARCHAR(MAX) ) + N''</table> <br><br> <strong><font color="red">Processes using a lot of CPU from SQL Server</font></strong> <br>'' DECLARE @tableHTML4 NVARCHAR(MAX) ; SET @tableHTML4 = N''<table border="1">'' + N''<tr>'' + N''<th>SPID</th>'' + N''<th>Program Name</th>'' + N''<th>Connected Seconds</th>'' + N''</tr>'' + CAST ((SELECT td=spid,'''' ,td=program_name ,'''' ,td=datediff(second,login_time, getdate()),'''' FROM master..sysprocesses WHERE spid > 50 and PROGRAM_NAME not in (''Microsoft SQL Server Management Studio '') FOR XML PATH(''tr''), TYPE ) AS NVARCHAR(MAX) ) + N''</table> <br><br> <strong><font color="red">Top 10 Questionable SQL Server Processes</font></strong> <br>'' DECLARE @tableHTML5 NVARCHAR(MAX) ; SET @tableHTML5 = N''<table border="1">'' + N''<tr>'' + N''<th>SPID</th>'' + N''<th>Blocked</th>'' + N''<th>DB Name</th>'' + N''<th>CPU</th>'' + N''<th>Seconds</th>'' + N''<th>Host Name</th>'' + N''<th>Program</th>'' + N''<th>Login Name</th>'' + N''</tr>'' + CAST ((SELECT top 10 td=spid,'''' ,td=blocked,'''' ,td=convert(varchar(50),db_name(dbid)),'''' ,td=cpu,'''' ,td=datediff(second,login_time, getdate()),'''' ,td=convert(varchar(16), hostname),'''' ,td=convert(varchar(50), program_name),'''' ,td=convert(varchar(20), loginame),'''' FROM master..sysprocesses WHERE datediff(second,login_time, getdate()) > 0 and spid > 50 and cpu >=1024 ORDER BY 6 desc FOR XML PATH(''tr''), TYPE ) AS NVARCHAR(MAX) ) + N''</table> <br><br> <strong><font color="red">SQL Server Resource Hogs</font></strong> <br>'' DECLARE @tableHTML6 NVARCHAR(MAX) ; SET @tableHTML6 = N''<table border="1">'' + N''<tr>'' + N''<th>Program</th>'' + N''<th>Client Count</th>'' + N''<th>CPU Sum</th>'' + N''<th>Seconds Sum</th>'' + N''</tr>'' + CAST ((SELECT td=convert(varchar(50), program_name),'''' ,td=count(*),'''' ,td=sum(cpu),'''' ,td=sum(datediff(second, login_time, getdate())),'''' FROM master..sysprocesses WHERE spid > 50 GROUP BY convert(varchar(50), program_name) ORDER BY 7 DESC FOR XML PATH(''tr''), TYPE ) AS NVARCHAR(MAX) ) + N''</table> <br><br> <strong><font color="red">Who is connected</font></strong> <br>'' DECLARE @tableHTML7 NVARCHAR(MAX) ; SET @tableHTML7 = N''<table border="1">'' + N''<tr>'' + N''<th>SPID</th>'' + N''<th>Status</th>'' + N''<th>Login Name</th>'' + N''<th>Hostname</th>'' + N''<th>DB Name</th>'' + N''<th>Cmd</th>'' + N''</tr>'' + CAST ( (SELECT td=[spid],'''' ,td= [status],'''' ,td=[loginame],'''' ,td=[hostname],'''' ,td=[dbname],'''' ,td=[cmd],'''' FROM [SP_Who_Results] where dbname not IN (''master'', ''msdb'') ORDER BY 4,5 desc FOR XML PATH(''tr''), TYPE ) AS NVARCHAR(MAX) ) + N''</table>'' declare @body2 varchar(max) set @body2 = @tableHTML + '' '' + @tableHTML4 + '' '' + @tableHTML5 + '' '' + @tableHTML6 + '' '' +@tableHTML7 EXEC msdb.dbo.sp_send_dbmail @profile_name = ''<Add Mail Profile Here>'', @recipients = ''<Add DBA Email Here>'', @subject = @Subject, @body = @body2, @body_format = ''HTML'' ; ', @database_name=N'master', @output_file_name=N'<Add Path Here>\WhatsRunningErrors.txt', @flags=0 IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1 IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)' IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback COMMIT TRANSACTION GOTO EndSave QuitWithRollback: IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION EndSave: GO
Step Four: Create an alert. In order to tie all this together, you will need to call the job somehow. To be proactive I call it from an alert. You can call it from any alert you want however, in this case, I’m using CPU Utilization is High (>90%).
USE [msdb] GO EXEC msdb.dbo.sp_add_alert @name=N'CPU Utilization is High', @message_id=0, @severity=1, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @notification_message=N'CPU has been over 90% for the last 60 seconds on <ServerName>', @event_description_keyword=N'CPU Utilization', @category_name=N'[Uncategorized]', @job_id=N'' GO
Once the alert is created, open it and attach the job to it as the response as shown here:
Now when ever the CPU utilization reaches 90%, my report will run providing me with a report that looks like this in an email. I also attach this report to other alerts, feel free to choose your own – obviously, I’ve changed the names to protect the innocent:
What is Currently Running in SQL Server
Program Name
|
Running Processes
|
Microsoft SQL Server Management Studio - Query
|
6
|
Microsoft SQL Server Management Studio
|
5
|
DatabaseMail - DatabaseMail - Id<1792>
|
2
|
DatabaseMail - SQLAGENT - Id<932>
|
1
|
Report Server
|
1
|
SQLAgent - Alert Engine
|
1
|
SQLAgent - Email Logger
|
1
|
SQLAgent - Generic Refresher
|
1
|
SQLAgent - Job invocation engine
|
1
|
Processes using a lot of CPU from SQL Server
SPID
|
Program Name
|
Connected Seconds
|
51
|
DatabaseMail - SQLAGENT - Id<932>
|
79
|
52
|
SQLAgent - Email Logger
|
2607227
|
53
|
SQLAgent - Generic Refresher
|
2607227
|
54
|
Microsoft SQL Server Management Studio - Query
|
10972
|
55
|
SQLAgent - Alert Engine
|
2607207
|
57
|
DatabaseMail - DatabaseMail - Id<1792>
|
195
|
58
|
SQLAgent - Job invocation engine
|
2607142
|
59
|
Microsoft SQL Server Management Studio - Query
|
2850
|
60
|
DatabaseMail - DatabaseMail - Id<1792>
|
40
|
61
|
Report Server
|
2
|
64
|
User Db Name
|
38
|
65
|
Microsoft SQL Server Management Studio - Query
|
2115
|
69
|
Microsoft SQL Server Management Studio - Query
|
11776
|
70
|
Microsoft SQL Server Management Studio - Query
|
2487
|
73
|
Microsoft SQL Server Management Studio - Query
|
15962
|
Top 10 Questionable SQL Server Processes
SPID
|
Blocked
|
DB Name
|
CPU
|
Seconds
|
Host Name
|
Program
|
Login Name
|
55
|
0
|
msdb
|
1172017
|
2607207
|
host name
|
SQLAgent - Alert Engine
|
Login name
|
56
|
0
|
master
|
3880
|
8109
|
host name
|
Microsoft SQL Server Management Studio
|
Login name
|
58
|
0
|
msdb
|
4195
|
2607142
|
host name
|
SQLAgent - Job invocation engine
|
Login name
|
68
|
0
|
User DB Name
|
3494
|
11787
|
host name
|
Microsoft SQL Server Management Studio
|
Login name
|
SQL Server Resource Hogs
Program
|
Client Count
|
CPU Sum
|
Seconds Sum
|
SQLAgent - Email Logger
|
1
|
218
|
2607227
|
SQLAgent - Generic Refresher
|
1
|
389
|
2607227
|
SQLAgent - Alert Engine
|
1
|
1172017
|
2607207
|
SQLAgent - Job invocation engine
|
1
|
4195
|
2607142
|
Microsoft SQL Server Management Studio
|
5
|
9059
|
200228
|
Microsoft SQL Server Management Studio - Query
|
6
|
733
|
46162
|
DatabaseMail - DatabaseMail - Id<1792>
|
2
|
30
|
235
|
DatabaseMail - SQLAGENT - Id<932>
|
1
|
16
|
79
|
Some user db
|
1
|
0
|
38
|
Report Server
|
1
|
0
|
2
|
Who is connected
SPID
|
Status
|
Login Name
|
Hostname
|
DB Name
|
Cmd
|
68
|
sleeping
|
User login
|
host
|
User db
|
AWAITING COMMAND
|
69
|
sleeping
|
User login
|
host
|
User db
|
AWAITING COMMAND
|
61
|
sleeping
|
User login
|
host
|
ReportServer
|
AWAITING COMMAND
|
59
|
runnable
|
User login
|
host
|
Some database
|
SELECT
|
70
|
sleeping
|
User login
|
host
|
Some database
|
AWAITING COMMAND
|
64
|
sleeping
|
User login
|
host
|
User db
|
AWAITING COMMAND
|
Conclusion
Now you have a super simple way to proactivly monitor sql server without an expensive 3rd party application.
Included in this package, the following will be created:
Database: SQL_Monitoring
Table: SP_Who_Results
Job: Report Whats Running Now
No comments:
Post a Comment