Monday, 16 February 2015

What’s going on, on my server, right now?


As a DBA I spend a lot of time monitoring and babysitting database servers.  This can become tedious, not to mention boring, for e.  I would rather spend my time doing something more constructive, so I have come up with a simple SQL Server task that will monitor database activities for me and notify me when there is an issue.  This is both pro-active and re-active, in my opinion, because it gathers the information and gets it to me automatically, so I can get to the root cause fast.
This process consists of the following:
  • 1 database to house the information (I have a central server for my DBA activities)
  • 1 table to house the server information
  • 1 job to collect data and kick off reports
  • At least 1 alert to notify and kick off the job (I use CPU usage greater than 90% and insufficient resources, for example, but you decide when you want notification)
I have created a database that I use for my DBA activities that I call SQL_Monitoring, however, you can call yours whatever you like. Just remember to change it in the scripts below. Please make sure that you have the following before you begin:
  • DBMail enabled on your server(s) - this is absolutely key to this process
  • Create yourself as an operator so you can receive notifications if your job fails as well as used on alerts. Personally, I like to add a notification on every job I create in case it fails. This is optional.
I'm not going to go into details on how to set these up as they are readily available from other sources, however, just consider them as prerequisites.
Step One:  Create the Database (optional if you already have a DBA database as described above - go to step 2).  This will be used as the central place to store data that will be used in the reports.
Copy/Paste/Run the script below to create the database.  Make sure to change this to suit your environment (paths, server names, operator names, alert names).
--Create the database
USE [master]
GO
CREATE DATABASE [SQL_Monitoring] ON  PRIMARY
( NAME = N'SQL_Monitoring', FILENAME = N'<ADD YOUR PATH HERE>\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\SQL_Monitoring.mdf' , SIZE = 95040KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
 LOG ON
( NAME = N'SQL_Monitoring_log', FILENAME = N'<ADD YOUR PATH HERE>\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\SQL_Monitoring_log.ldf' , SIZE = 69760KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO

ALTER DATABASE [SQL_Monitoring] SET COMPATIBILITY_LEVEL = 100
GO

IF (1 = FULLTEXTSERVICEPROPERTY('IsFullTextInstalled'))
 begin
  EXEC [SQL_Monitoring].[dbo].[sp_fulltext_database] @action = 'enable'
 end
GO
ALTER DATABASE [SQL_Monitoring] SET ANSI_NULL_DEFAULT OFF
GO
ALTER DATABASE [SQL_Monitoring] SET ANSI_NULLS OFF
GO
ALTER DATABASE [SQL_Monitoring] SET ANSI_PADDING OFF
GO
ALTER DATABASE [SQL_Monitoring] SET ANSI_WARNINGS OFF
GO
ALTER DATABASE [SQL_Monitoring] SET ARITHABORT OFF
GO
ALTER DATABASE [SQL_Monitoring] SET AUTO_CLOSE OFF
GO
ALTER DATABASE [SQL_Monitoring] SET AUTO_CREATE_STATISTICS ON
GO
ALTER DATABASE [SQL_Monitoring] SET AUTO_SHRINK OFF
GO
ALTER DATABASE [SQL_Monitoring] SET AUTO_UPDATE_STATISTICS ON
GO
ALTER DATABASE [SQL_Monitoring] SET CURSOR_CLOSE_ON_COMMIT OFF
GO
ALTER DATABASE [SQL_Monitoring] SET CURSOR_DEFAULT  GLOBAL
GO
ALTER DATABASE [SQL_Monitoring] SET CONCAT_NULL_YIELDS_NULL OFF
GO
ALTER DATABASE [SQL_Monitoring] SET NUMERIC_ROUNDABORT OFF
GO
ALTER DATABASE [SQL_Monitoring] SET QUOTED_IDENTIFIER OFF
GO
ALTER DATABASE [SQL_Monitoring] SET RECURSIVE_TRIGGERS OFF
GO
ALTER DATABASE [SQL_Monitoring] SET  DISABLE_BROKER
GO
ALTER DATABASE [SQL_Monitoring] SET AUTO_UPDATE_STATISTICS_ASYNC OFF
GO
ALTER DATABASE [SQL_Monitoring] SET DATE_CORRELATION_OPTIMIZATION OFF
GO
ALTER DATABASE [SQL_Monitoring] SET TRUSTWORTHY OFF
GO
ALTER DATABASE [SQL_Monitoring] SET ALLOW_SNAPSHOT_ISOLATION OFF
GO
ALTER DATABASE [SQL_Monitoring] SET PARAMETERIZATION SIMPLE
GO
ALTER DATABASE [SQL_Monitoring] SET READ_COMMITTED_SNAPSHOT OFF
GO
ALTER DATABASE [SQL_Monitoring] SET HONOR_BROKER_PRIORITY OFF
GO
ALTER DATABASE [SQL_Monitoring] SET  READ_WRITE
GO
ALTER DATABASE [SQL_Monitoring] SET RECOVERY SIMPLE
GO
ALTER DATABASE [SQL_Monitoring] SET  MULTI_USER
GO
ALTER DATABASE [SQL_Monitoring] SET PAGE_VERIFY CHECKSUM
GO
ALTER DATABASE [SQL_Monitoring] SET DB_CHAINING OFF
GO
Step Two:  Create the SP_Who_Results Table.  This is where the results from running sp_who will be collected and stored.  This table is truncated and reloaded with every run.  I am writing this to a table so I can easily query the results and only use the columns in which I am interested. Plus, I can add a “where clause” to narrow down the results in my report in step three.
Copy/Paste/Run the script below. Make sure to change this to suit your environment.
--Create The Table
USE [SQL_Monitoring]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[SP_Who_Results](
            [SPID] [nchar](10) NULL,
            [ecid] [nchar](10) NULL,
            [status] [varchar](200) NULL,
            [loginame] [varchar](50) NULL,
            [hostname] [varchar](100) NULL,
            [blk] [nchar](10) NULL,
            [dbname] [varchar](max) NULL,
            [cmd] [varchar](max) NULL,
            [request_id] [nchar](10) NULL
) ON [PRIMARY]
GO
Step Three:  Create the Job “Report What's Running Now” 
This job calls queries to return the following information in a user friendly HTML email format:
First determine what is currently running in SQL Server. This is created from the query:
SELECT
    program_name,
    count(*)
 FROM Master..sysprocesses
 WHERE ecid=0
 and program_name <> '' '' 
 GROUP BY program_name 
 ORDER BY count(*) desc
Determine what processes are using a lot of CPU from SQL Server with this query:
SELECT 
    spid,
    program_name,
    datediff(second,login_time, getdate()) 
 FROM master..sysprocesses
 WHERE spid > 50 
 and PROGRAM_NAME not in (''Microsoft SQL Server Management Studio '')
Get the top 10 questionable SQL Server processes:
SELECT top 10
    spid,
    blocked,
    convert(varchar(50),db_name(dbid)), cpu, datediff(second,login_time, getdate()),
    convert(varchar(16), hostname),
    convert(varchar(50), program_name),
    convert(varchar(20), loginame)
 FROM master..sysprocesses
 WHERE datediff(second,login_time, getdate()) > 0
 and spid > 50
 and cpu >=1024
 ORDER BY 6 desc
Find SQL Server resource hogs:
SELECT
    convert(varchar(50), program_name),
    count(*),
    sum(cpu),
    sum(datediff(second, login_time, getdate()))
 FROM master..sysprocesses
 WHERE spid > 50
 GROUP BY convert(varchar(50), program_name)
 ORDER BY 7 DESC
Find who is connected: this is created by running sp_who and dumping results in the table we created in step two.
Copy/Paste/Run the script below. Make sure to change this to suit your environment.
--Create the job
USE [msdb]
GO
BEGIN TRANSACTION

DECLARE @ReturnCode INT

SELECT @ReturnCode = 0

IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'Data Collector' AND category_class=1)
 BEGIN
  EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'Data Collector'
  IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
 END

DECLARE @jobId BINARY(16)
EXEC @ReturnCode =  msdb.dbo.sp_add_job
                        @job_name=”Whats Running Now',
                        @enabled=1,
                        @notify_level_eventlog=0,
                        @notify_level_email=2,
                        @notify_level_netsend=0,
                        @notify_level_page=0,
                        @delete_level=0,
                        @description=N'Collects process data for:
What is Currently Running in SQL Server
Processes using a lot of CPU from SQL Server
Top 10 Questionable SQL Server Processes
SQL Server Resource Hogs
Who is connected',
                        @category_name=N'Data Collector',
                        @owner_login_name=N'sa',
                        @notify_email_operator_name=N'<Add Your Operator Name Here>', @job_id = @jobId OUTPUT

IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
  EXEC @ReturnCode = msdb.dbo.sp_add_jobstep
                        @job_id=@jobId,
                        @step_name=N'Run the Report',
                        @step_id=1,
                        @cmdexec_success_code=0,
                        @on_success_action=1,
                        @on_success_step_id=0,
                        @on_fail_action=2,
                        @on_fail_step_id=0,
                        @retry_attempts=0,
                        @retry_interval=0,
                        @os_run_priority=0, @subsystem=N'TSQL',
                        @command=N'SET nocount ON

DECLARE @Subject VARCHAR (100)

SET @Subject=''SQL Server Whats Causing the High CPU & Performance Issues on '' + @@ServerName

--clean up sp_who
Truncate table [SP_Who_Results]

--run SP_Who
Insert into [SP_Who_Results]
 execute sp_who

--Reports
DECLARE @tableHTML NVARCHAR(MAX) ;
SET @tableHTML =
N''<strong><font color="red">What is Currently Running in SQL Server</font></strong> <br>
<table border="1">'' +
N''<tr>'' +
N''<th>Program Name</th>'' +
N''<th>Running Processes</th>'' +
N''</tr>'' +
CAST ( (SELECT td=program_name,''''
,td= count(*),''''
FROM Master..sysprocesses
WHERE ecid=0 and program_name <> '' ''
GROUP BY program_name
ORDER BY count(*) desc
FOR XML PATH(''tr''), TYPE
) AS NVARCHAR(MAX) ) +
N''</table>
<br><br>
<strong><font color="red">Processes using a lot of CPU from SQL Server</font></strong> <br>''

DECLARE @tableHTML4 NVARCHAR(MAX) ;

SET @tableHTML4 =
N''<table border="1">'' +
N''<tr>'' +
N''<th>SPID</th>'' +
N''<th>Program Name</th>'' +
N''<th>Connected Seconds</th>'' +
N''</tr>'' +
CAST ((SELECT  td=spid,''''
,td=program_name ,''''
,td=datediff(second,login_time, getdate()),''''
FROM master..sysprocesses WHERE spid > 50 and PROGRAM_NAME not in (''Microsoft SQL Server Management Studio '')
FOR XML PATH(''tr''), TYPE
) AS NVARCHAR(MAX) ) +
N''</table>
<br><br>
<strong><font color="red">Top 10 Questionable SQL Server Processes</font></strong> <br>''

DECLARE @tableHTML5 NVARCHAR(MAX) ;

SET @tableHTML5 =
N''<table border="1">'' +
N''<tr>'' +
N''<th>SPID</th>'' +
N''<th>Blocked</th>'' +
N''<th>DB Name</th>'' +
N''<th>CPU</th>'' +
N''<th>Seconds</th>'' +
N''<th>Host Name</th>'' +
N''<th>Program</th>'' +
N''<th>Login Name</th>'' +
N''</tr>'' +
CAST ((SELECT  top 10 td=spid,''''
,td=blocked,''''
,td=convert(varchar(50),db_name(dbid)),''''
,td=cpu,''''
,td=datediff(second,login_time, getdate()),''''
,td=convert(varchar(16), hostname),''''
,td=convert(varchar(50), program_name),''''
,td=convert(varchar(20), loginame),''''
FROM master..sysprocesses
WHERE datediff(second,login_time, getdate()) > 0 and spid > 50 and cpu >=1024
ORDER BY 6 desc
FOR XML PATH(''tr''), TYPE
) AS NVARCHAR(MAX) ) +
N''</table>
<br><br>
<strong><font color="red">SQL Server Resource Hogs</font></strong> <br>''

DECLARE @tableHTML6 NVARCHAR(MAX) ;
SET @tableHTML6 =
N''<table border="1">'' +
N''<tr>'' +
N''<th>Program</th>'' +
N''<th>Client Count</th>'' +
N''<th>CPU Sum</th>'' +
N''<th>Seconds Sum</th>'' +
N''</tr>'' +
CAST ((SELECT td=convert(varchar(50), program_name),''''
,td=count(*),''''
,td=sum(cpu),''''
,td=sum(datediff(second, login_time, getdate())),''''
FROM master..sysprocesses
WHERE spid > 50
GROUP BY convert(varchar(50), program_name)
ORDER BY 7 DESC
FOR XML PATH(''tr''), TYPE
) AS NVARCHAR(MAX) ) +
N''</table>
<br><br>
<strong><font color="red">Who is connected</font></strong> <br>''

DECLARE @tableHTML7 NVARCHAR(MAX) ;
SET @tableHTML7 =
N''<table border="1">'' +
N''<tr>'' +
N''<th>SPID</th>'' +
N''<th>Status</th>'' +
N''<th>Login Name</th>'' +
N''<th>Hostname</th>'' +
N''<th>DB Name</th>'' +
N''<th>Cmd</th>'' +
N''</tr>'' +
CAST ( (SELECT td=[spid],''''
,td= [status],''''
,td=[loginame],''''
,td=[hostname],''''
,td=[dbname],''''
,td=[cmd],''''
FROM [SP_Who_Results]
where dbname not IN (''master'', ''msdb'') 
ORDER BY 4,5 desc
FOR XML PATH(''tr''), TYPE
) AS NVARCHAR(MAX) ) +
N''</table>''

declare @body2 varchar(max)
set @body2 = @tableHTML + '' '' + @tableHTML4  + '' '' + @tableHTML5  + '' '' + @tableHTML6 + '' '' +@tableHTML7

EXEC msdb.dbo.sp_send_dbmail
@profile_name = ''<Add Mail Profile Here>'',
@recipients = ''<Add DBA Email Here>'',
@subject = @Subject,
@body = @body2,
@body_format = ''HTML'' ;
',
                        @database_name=N'master',
                        @output_file_name=N'<Add Path Here>\WhatsRunningErrors.txt',
                        @flags=0

IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
 EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1

IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
 EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'

IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
 COMMIT TRANSACTION

GOTO EndSave

QuitWithRollback:

    IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION

EndSave:
GO
Step Four:  Create an alert. In order to tie all this together, you will need to call the job somehow.  To be proactive I call it from an alert.  You can call it from any alert you want however, in this case, I’m using CPU Utilization is High (>90%).
USE [msdb]
GO
EXEC msdb.dbo.sp_add_alert @name=N'CPU Utilization is High',
                        @message_id=0,
                        @severity=1,
                        @enabled=1,
                        @delay_between_responses=60,
                        @include_event_description_in=1,
                        @notification_message=N'CPU has been over 90% for the last 60 seconds on <ServerName>',
                        @event_description_keyword=N'CPU Utilization',
                        @category_name=N'[Uncategorized]',
                        @job_id=N''
GO
Once the alert is created, open it and attach the job to it as the response as shown here:
Now when ever the CPU utilization reaches 90%, my report will run providing me with a report that looks like this in an email.  I also attach this report to other alerts, feel free to choose your own – obviously, I’ve changed the names to protect the innocent:

What is Currently Running in SQL Server

Program Name
Running Processes
Microsoft SQL Server Management Studio - Query
6
Microsoft SQL Server Management Studio
5
DatabaseMail - DatabaseMail - Id<1792>
2
DatabaseMail - SQLAGENT - Id<932>
1
Report Server
1
SQLAgent - Alert Engine
1
SQLAgent - Email Logger
1
SQLAgent - Generic Refresher
1
SQLAgent - Job invocation engine
1

Processes using a lot of CPU from SQL Server
SPID
Program Name
Connected Seconds
51
DatabaseMail - SQLAGENT - Id<932>
79
52
SQLAgent - Email Logger
2607227
53
SQLAgent - Generic Refresher
2607227
54
Microsoft SQL Server Management Studio - Query
10972
55
SQLAgent - Alert Engine
2607207
57
DatabaseMail - DatabaseMail - Id<1792>
195
58
SQLAgent - Job invocation engine
2607142
59
Microsoft SQL Server Management Studio - Query
2850
60
DatabaseMail - DatabaseMail - Id<1792>
40
61
Report Server
2
64
User Db Name
38
65
Microsoft SQL Server Management Studio - Query
2115
69
Microsoft SQL Server Management Studio - Query
11776
70
Microsoft SQL Server Management Studio - Query
2487
73
Microsoft SQL Server Management Studio - Query
15962


Top 10 Questionable SQL Server Processes
SPID
Blocked
DB Name
CPU
Seconds
Host Name
Program
Login Name
55
0
msdb
1172017
2607207
host name
SQLAgent - Alert Engine
Login name
56
0
master
3880
8109
host name
Microsoft SQL Server Management Studio
Login name
58
0
msdb
4195
2607142
host name
SQLAgent - Job invocation engine
Login name
68
0
User DB Name
3494
11787
host name
Microsoft SQL Server Management Studio
Login name

SQL Server Resource Hogs
Program
Client Count
CPU Sum
Seconds Sum
SQLAgent - Email Logger
1
218
2607227
SQLAgent - Generic Refresher
1
389
2607227
SQLAgent - Alert Engine
1
1172017
2607207
SQLAgent - Job invocation engine
1
4195
2607142
Microsoft SQL Server Management Studio
5
9059
200228
Microsoft SQL Server Management Studio - Query
6
733
46162
DatabaseMail - DatabaseMail - Id<1792>
2
30
235
DatabaseMail - SQLAGENT - Id<932>
1
16
79
Some user db
1
0
38
Report Server
1
0
2

Who is connected
SPID
Status
Login Name
Hostname
DB Name
Cmd
68
sleeping
User login
host
User db
AWAITING COMMAND
69
sleeping
User login
host
User db
AWAITING COMMAND
61
sleeping
User login
host
ReportServer
AWAITING COMMAND
59
runnable
User login
host
Some database
SELECT
70
sleeping
User login
host
Some database
AWAITING COMMAND
64
sleeping
User login
host
User db
AWAITING COMMAND

Conclusion

Now you have a super simple way to proactivly monitor sql server without an expensive 3rd party application. 
Included in this package, the following will be created:
Database: SQL_Monitoring
Table: SP_Who_Results
Job: Report Whats Running Now

No comments:

Post a Comment