How To Troubleshoot Microsoft Exchange Server Latency or Connection Issues

I found this great article on TechNet that lays out best practices for troubleshooting Microsoft Exchange Server latency or connection issues.

This is definitely a must-have for your IT toolbox. Enjoy 🙂

Premier Field Engineers

Written by Samuel Drey, Premier Field Engineer.


This article is intended as a guide to help Microsoft Exchange Server IT Operations teams understand, troubleshoot and remedy situations where users have trouble connecting to the Exchange messaging service via Outlook or OWA. I’ve included information relating to Exchange Server 2003, 2007 and 2010. The process below helps rule out server latencies and determine whether a less-than-optimal messaging experience stems from a client-side configuration issue, a client-side performance issue or a server-side issue.

Step 1: Check the Application Log and System Log

The first thing to look at is the Application Log, and then the System Log, for possible errors. Poor messaging experiences caused by server issues usually surface as obvious, recurring warnings or errors about memory or disk. For example: Error 9582, stating that “the virtual memory necessary to run your Exchange server is fragmented in such a way that performance may be affected”, or Event ID 51 for the disk component, stating that “an error was detected on device \Device\Harddisk3\DR3”.
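As a rough illustration of this first pass, the sketch below filters a list of event records for the two event IDs mentioned above and keeps only those that recur. The record layout (dictionaries with `log`, `event_id` and `level` keys), the function name `recurring_events` and the sample data are all assumptions made for the example; in practice the records would come from an export of the Application and System logs.

```python
from collections import Counter

# Event IDs named in the text: 9582 (fragmented virtual memory) and 51 (disk).
WATCHED_EVENT_IDS = {9582, 51}

def recurring_events(records, watched=WATCHED_EVENT_IDS, min_count=2):
    """Return {(log, event_id): count} for watched warnings/errors seen at least min_count times."""
    counts = Counter(
        (r["log"], r["event_id"])
        for r in records
        if r["event_id"] in watched and r["level"] in ("Warning", "Error")
    )
    return {key: n for key, n in counts.items() if n >= min_count}

# Hypothetical sample records standing in for an exported event log.
sample = [
    {"log": "Application", "event_id": 9582, "level": "Error"},
    {"log": "System", "event_id": 51, "level": "Warning"},
    {"log": "System", "event_id": 51, "level": "Warning"},
    {"log": "System", "event_id": 7036, "level": "Information"},
]

print(recurring_events(sample))  # {('System', 51): 2}
```

Lowering `min_count` to 1 would also report the single 9582 entry, which may be preferable when the log covers only a short period.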

Step 2: Check for Issues Using Key Performance Counters

The second, less obvious thing to look at is the performance counters, to check whether there are any latencies. The first counters that will indicate performance issues are the RPC latency counters, since every action a user performs results in RPC requests being sent to the Exchange server.

Here are the steps to follow:

  • Check for RPC latencies
  • Check for CPU performance issues
  • Check for Memory load issues
  • Check for Disk bound issues
  • Check for Network issues
  • Check for Active Directory related issues
  • Check for Virus scanning issues

If an issue is not visible in the Application or System Log, performance log analysis will point out the cause(s) most of the time, provided you follow the methodology outlined above.
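One way to keep that order explicit is sketched below: given the outcome of each check, it returns the flagged components in the order the list above says to examine them. The component labels, the `triage` function and the sample findings are hypothetical; how each check decides that a component is unhealthy is outside this sketch.

```python
# Order of the checks as listed in the text above.
CHECK_ORDER = [
    "RPC latencies",
    "CPU",
    "Memory",
    "Disk",
    "Network",
    "Active Directory",
    "Virus scanning",
]

def triage(findings):
    """Return the components flagged as unhealthy, in the order they should be examined.

    `findings` maps a component name to True when its key counters were
    outside their expected values.
    """
    unknown = set(findings) - set(CHECK_ORDER)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    return [name for name in CHECK_ORDER if findings.get(name, False)]

# Hypothetical outcome of a performance log review.
print(triage({"Disk": True, "RPC latencies": True, "CPU": False}))
# ['RPC latencies', 'Disk']
```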

Conduct Performance Analysis to Fine-Tune Exchange Components and Help Identify Issues

  • If users can still connect to the Exchange server but experience severe latency, performance analysis will tell you where the issue is.
  • Because an Exchange server exposes hundreds of counters, it is essential to begin the performance analysis with a small subset of them.

Once the component causing the Exchange issue (e.g. Disk, Memory, Network) has been identified, we can dig further into that component by using more of its counters.

  • For example, with the Memory component, we first check the “Available MB” and “Pages/Sec” counters. If either shows an issue, we then add more Memory counters (the Memory component has 35 counters in total), which is why we start with only these two. The principle is the same for all other components: take 2 to 4 significant counters, then dig further.
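A minimal sketch of that "two counters first" check for the Memory component follows. The default limits are the expected values from the Exchange 2007/2010 table later in this article (Available MB Min >=100, Pages/sec Max <1,000); the function name `memory_findings` and the sample series are made up for illustration.

```python
def memory_findings(available_mb, pages_per_sec,
                    min_available_mb=100, max_pages_per_sec=1000):
    """Return the starter Memory counters that are outside their expected range.

    An empty list means there is no reason yet to add more Memory counters.
    """
    findings = []
    if min(available_mb) < min_available_mb:
        findings.append("Available MB")
    if max(pages_per_sec) >= max_pages_per_sec:
        findings.append("Pages/Sec")
    return findings

# Hypothetical samples taken from a performance log.
print(memory_findings(available_mb=[850, 420, 96], pages_per_sec=[120, 300, 80]))
# ['Available MB']
```

A non-empty result is the signal to add further Memory counters to the log; the same pattern could be repeated with 2 to 4 counters for each of the other components.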

Key Exchange Performance Counters for Monitoring and Troubleshooting

Below are two tables that I created for early versions of Microsoft’s Premier Exchange Risk Assessment Programs and have since updated to follow the evolution of best practices:

  • Exchange 2007/2010 counters table
  • Exchange 2003 counters table

We strongly encourage administrators to focus on these specific counters to monitor their Exchange infrastructure effectively and to identify potential performance issues proactively. These counters are usually a subset of the System Center Operations Manager (SCOM) Exchange Management Pack rules, so use the tables below to tune SCOM alerts and focus on the most important ones. If you are using another monitoring application, integrate these counters into your monitoring solution.

Exchange Server 2007/2010 Key Performance Counters

Here is the selection of the key Exchange 2007/2010 counters that will help point out where the issue is (you can copy/paste the relevant counter names):

For additional information, check out Monitoring Without System Center Operations Manager.
SERVER ROLE | COUNTER | CHECK | EXPECTED

Database and Database ==> Instances
MAILBOX AND HUB | MSExchange Database(Information Store)\Database Page Fault Stalls/sec | Avg | <10
MAILBOX AND HUB | MSExchange Database(Information Store)\Database Page Fault Stalls/sec | Max | <100
MAILBOX | MSExchange Database ==> Instances(*)\Log Generation Checkpoint Depth | Max | <=500
MAILBOX | MSExchange Database(Information Store)\Version buckets allocated | Max | <=12000
HUB | MSExchange Database ==> Instances(edgetransport/Transport Mail Database)\Log Generation Checkpoint Depth | Max | <=1000
HUB | MSExchange Database ==> Instances(edgetransport/Transport Mail Database)\Version buckets allocated | Max | <=200

LogicalDisk (or substitute PhysicalDisk if Logical is unavailable)
LogicalDisk – Temp/Page File Disks
MAILBOX | LogicalDisk\Average Disk sec/Read | Avg | <10ms
MAILBOX | LogicalDisk\Average Disk sec/Read | Max | <=50ms
MAILBOX | LogicalDisk\Average Disk sec/Write | Avg | <10ms
MAILBOX | LogicalDisk\Average Disk sec/Write | Max | <=50ms
LogicalDisk – SMTP
HUB | LogicalDisk\Average Disk sec/Read | Avg | <20ms
HUB | LogicalDisk\Average Disk sec/Read | Max | <=50ms
HUB | LogicalDisk\Average Disk sec/Write | Avg | <20ms
HUB | LogicalDisk\Average Disk sec/Write | Max | <=50ms
LogicalDisk – Databases
MAILBOX | LogicalDisk\Average Disk sec/Read | Avg | <=20ms
MAILBOX | LogicalDisk\Average Disk sec/Write | Avg | <=100ms
LogicalDisk – Transaction Logs
MAILBOX | LogicalDisk\Average Disk sec/Read | Avg | <=20ms
MAILBOX | LogicalDisk\Average Disk sec/Write | Avg | <=10ms
LogicalDisk – All disks
CAS | LogicalDisk(_Total)\Disk Reads/sec | Max | <=50
CAS | LogicalDisk(_Total)\Disk Writes/sec | Max | <=50

Memory
COMMON | Memory\Available Mbytes (MB) | Min | >=100MB
COMMON | Memory\Pages/sec | Max | <1,000

MSExchangeDSAccess
COMMON | MSExchange ADAccess Domain Controllers(*)\LDAP Read Time | Avg | <=50ms
COMMON | MSExchange ADAccess Domain Controllers(*)\LDAP Read Time | Max | <=100ms
COMMON | MSExchange ADAccess Domain Controllers(*)\LDAP Search Time | Avg | <=50ms
COMMON | MSExchange ADAccess Domain Controllers(*)\LDAP Search Time | Max | <=100ms
COMMON | MSExchange ADAccess Domain Controllers(*)\LDAP Searches timed out per minute | Max | <=10
COMMON | MSExchange ADAccess Domain Controllers(*)\Long running LDAP operations/Min | Max | <=50

MSExchangeIS
MAILBOX | MSExchangeIS Public(_Total)\Replication Receive Queue Size | Max | <=100
MAILBOX | MSExchangeIS\RPC Averaged Latency | Avg | <=25ms
MAILBOX | MSExchangeIS\RPC Num. of Slow Packets | Avg | <=1
MAILBOX | MSExchangeIS\RPC Num. of Slow Packets | Max | <=3
MAILBOX | MSExchangeIS\RPC Operations/sec | Avg, Min, Max | info only
MAILBOX | MSExchangeIS\RPC Packets/sec | Avg, Min, Max | info only
MAILBOX | MSExchangeIS\RPC Requests | Max | <70
MAILBOX | MSExchangeIS\Virus Scan Queue Length | Max | <=10
MAILBOX | MSExchangeIS\VM Largest Block Size | Min | info
MAILBOX | MSExchangeIS\VM Total 16MB Free Blocks | Min | info
MAILBOX | MSExchangeIS\VM Total Free Blocks | Min | info
MAILBOX | MSExchangeIS\VM Total Large Free Block Bytes | Min | info

Network Interface
COMMON | Network Interface\Bytes Total/sec | Max | <=7MBps or <=70MBps
COMMON | Network Interface\Current Bandwidth | - | special
COMMON | Network Interface\Packets Outbound Errors | Max | =0

Process, Processor, and System
COMMON | Processor(_Total)\% Processor Time | Avg | <=75%
COMMON | Processor(_Total)\% User Time | Avg | <=75%
COMMON | Processor(_Total)\% Privileged Time | Avg | <=75%
COMMON | Process(*)\% Processor Time | - | special
COMMON | System\Processor Queue Length (all instances) | Avg | <=5 per proc

SMTP Server
HUB | \MSExchangeTransport Queues(_total)\Aggregate Delivery Queue Length (All Queues) | Avg | <=3000
HUB | \MSExchangeTransport Queues(_total)\Aggregate Delivery Queue Length (All Queues) | Max | <=5000
HUB | \MSExchangeTransport Queues(_total)\Active Remote Delivery Queue Length | Max | <=250
HUB | \MSExchangeTransport Queues(_total)\Active Mailbox Delivery Queue Length | Max | <=250
HUB | \MSExchangeTransport Queues(_total)\Submission Queue Length | Max | <=100
HUB | \MSExchangeTransport Queues(_total)\Active Non-Smtp Delivery Queue Length | Max | <=250
HUB | \MSExchangeTransport Queues(_total)\Retry Mailbox Delivery Queue Length | Max | <=100
HUB | \MSExchangeTransport Queues(_total)\Retry Non-Smtp Delivery Queue Length | Max | <=100
HUB | \MSExchangeTransport Queues(_total)\Retry Remote Delivery Queue Length | Max | <=100
HUB | \MSExchangeTransport Queues(_total)\Unreachable Queue Length | Max | <=100

CAS Server
Outlook Web Access Counters
CAS | MSExchange OWA\Average Response Time | Max | <=100ms
CAS | MSExchange OWA\Average Search Time | Max | <=31000ms
CAS to MBX connection
CAS | RPC/HTTP Proxy\Number of Failed Back-End Connection attempts per Second | Max | =0
Client Access Server OAB Download Counters
CAS | MSExchangeFDS:OAB(*)\Download Task Queued | Max | =0
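To show how the Check and Expected columns might be applied to logged samples, here is a small sketch that aggregates each counter's samples (Avg, Max or Min) and compares the result with its expected value. Only four rows of the table above are encoded, and the `RULES` layout, the `evaluate` function and the sample values are assumptions for illustration, not part of any Exchange or SCOM tooling.

```python
import operator
from statistics import mean

AGGREGATES = {"Avg": mean, "Max": max, "Min": min}
COMPARATORS = {"<": operator.lt, "<=": operator.le, ">=": operator.ge, "=": operator.eq}

# A few rows from the table above: (counter, check, comparator, expected value).
RULES = [
    (r"MSExchangeIS\RPC Averaged Latency", "Avg", "<=", 25),    # ms
    (r"MSExchangeIS\RPC Requests", "Max", "<", 70),
    (r"Processor(_Total)\% Processor Time", "Avg", "<=", 75),   # percent
    (r"Memory\Available Mbytes", "Min", ">=", 100),             # MB
]

def evaluate(samples, rules=RULES):
    """Return (counter, check, observed, ok) for every rule that has samples."""
    results = []
    for counter, check, comparator, expected in rules:
        values = samples.get(counter)
        if not values:
            continue  # counter was not logged; nothing to evaluate
        observed = AGGREGATES[check](values)
        results.append((counter, check, observed, COMPARATORS[comparator](observed, expected)))
    return results

# Hypothetical samples, keyed by counter name.
samples = {
    r"MSExchangeIS\RPC Averaged Latency": [12, 18, 60],
    r"MSExchangeIS\RPC Requests": [5, 22, 41],
    r"Memory\Available Mbytes": [2048, 1900, 1750],
}

for counter, check, observed, ok in evaluate(samples):
    print(f"{counter} {check}={observed} {'OK' if ok else 'CHECK'}")
```

With these sample values, only the RPC Averaged Latency row is flagged, because its average of 30 ms exceeds the expected 25 ms. Counters whose expected value is "info only" or "special" are deliberately left out of `RULES`, since the table gives no threshold for them.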

Exchange Server 2003 Key Counters

Here is the selection of the key Exchange 2003 counters that will help point out where the issue is (you can copy/paste the relevant counter names):

COUNTER | CHECK | EXPECTED

Database and Database ==> Instances
Database ==> Instances(*)\Log Record Stalls/sec | Avg | <10
Database ==> Instances(*)\Log Record Stalls/sec | Max | <100

LogicalDisk (or substitute PhysicalDisk if Logical is unavailable)
LogicalDisk – Temp/Page File Disks
LogicalDisk\Average Disk sec/Read | Avg | <10ms
LogicalDisk\Average Disk sec/Read | Max | <=50ms
LogicalDisk\Average Disk sec/Write | Avg | <10ms
LogicalDisk\Average Disk sec/Write | Max | <=50ms
Paging File\% Usage | Avg | <50%
LogicalDisk – SMTP
LogicalDisk\Average Disk sec/Read | Avg | <10ms
LogicalDisk\Average Disk sec/Read | Max | <=50ms
LogicalDisk\Average Disk sec/Write | Avg | <10ms
LogicalDisk\Average Disk sec/Write | Max | <=50ms
LogicalDisk – Database
LogicalDisk\Average Disk sec/Read | Avg | <20ms
LogicalDisk\Average Disk sec/Read | Max | <=50ms
LogicalDisk\Average Disk sec/Write | Avg | <20ms
LogicalDisk\Average Disk sec/Write | Max | <=50ms
LogicalDisk – Database (additional disk 1)
LogicalDisk\Average Disk sec/Read | Avg | <20ms
LogicalDisk\Average Disk sec/Read | Max | <=50ms
LogicalDisk\Average Disk sec/Write | Avg | <20ms
LogicalDisk\Average Disk sec/Write | Max | <=50ms
LogicalDisk – Transaction Logs
LogicalDisk\Average Disk sec/Read | Avg | <5ms
LogicalDisk\Average Disk sec/Read | Max | <=50ms
LogicalDisk\Average Disk sec/Write | Avg | <10ms
LogicalDisk\Average Disk sec/Write | Max | <=50ms
LogicalDisk – Transaction Logs (additional disk 1)
LogicalDisk\Average Disk sec/Read | Avg | <5ms
LogicalDisk\Average Disk sec/Read | Max | <=50ms
LogicalDisk\Average Disk sec/Write | Avg | <10ms
LogicalDisk\Average Disk sec/Write | Max | <=50ms

Memory
Memory\Available Mbytes (MB) | Min | >=50
Memory\Free System Page Table Entries | Min | >=5000
Memory\Pages/sec | Max | <1,000

MSExchangeDSAccess
MSExchangeDSAccess Process\LDAP Read Time (for all processes) | Avg | <50ms
MSExchangeDSAccess Process\LDAP Read Time (for all processes) | Max | <=100ms
MSExchangeDSAccess Process\LDAP Search Time (for all processes) | Avg | <50ms
MSExchangeDSAccess Process\LDAP Search Time (for all processes) | Max | <=100ms

MSExchangeIS
MSExchangeIS Public\Replication Receive Queue Size | Max | <=1000
MSExchangeIS\RPC Averaged Latency | Max, Avg | <=50ms or 100ms
MSExchangeIS\RPC Operations/sec | Avg, Min, Max | info only
MSExchangeIS\RPC Packets/sec | Avg, Min, Max | info only
MSExchangeIS\RPC Requests | Max | <30
MSExchangeIS\Virus Scan Queue Length | Max | <=10
MSExchangeIS\VM Largest Block Size | Min | >32MB
MSExchangeIS\VM Total 16MB Free Blocks | Min | >=1
MSExchangeIS\VM Total Free Blocks | Min | >=1
MSExchangeIS\VM Total Large Free Block Bytes | Min | >50MB

Network Interface
Network Interface\Bytes Total/sec | Max | <7MBps or <70MBps
Network Interface\Current Bandwidth | - | special
Network Interface\Packets Outbound Errors | Max | =0

Process, Processor, and System
Processor\% Processor Time (_Total) | Avg | <80%
Processor\% Privileged Time (_Total) | Avg | special
Process(*)\% Processor Time (_Total) | - | special
Process(*)\% Privileged Time (_Total) | - | special
Process(*)\Virtual Bytes (store) | Max | <2.8GB
System\Processor Queue Length | Avg | <2

SMTP Server
SMTP Server\Categorizer Queue Length | Max, Avg | <10
SMTP Server\Local Queue Length | Max | <1000
SMTP Server\Remote Queue Length | Max | <1000
SMTP Server\Remote Queue Length | Avg | info
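The Exchange 2003 table is the only one with expected values for the MSExchangeIS virtual memory (VM) counters, which relate to the fragmented-virtual-memory condition reported by Error 9582 in Step 1. The sketch below compares the minimum of each logged series with those values. It assumes the size counters are logged in bytes; that unit, the function name `vm_fragmentation_findings` and the sample data are assumptions to verify against your own performance logs.

```python
MB = 1024 * 1024  # assumption: the VM size counters are logged in bytes

def vm_fragmentation_findings(largest_block_bytes, free_16mb_blocks, large_free_block_bytes):
    """Compare the minimum of each sample series with the Exchange 2003 table values."""
    findings = []
    if min(largest_block_bytes) <= 32 * MB:
        findings.append("VM Largest Block Size: Min should be >32MB")
    if min(free_16mb_blocks) < 1:
        findings.append("VM Total 16MB Free Blocks: Min should be >=1")
    if min(large_free_block_bytes) <= 50 * MB:
        findings.append("VM Total Large Free Block Bytes: Min should be >50MB")
    return findings

# Hypothetical samples from a performance log.
print(vm_fragmentation_findings(
    largest_block_bytes=[120 * MB, 64 * MB, 28 * MB],
    free_16mb_blocks=[6, 3, 1],
    large_free_block_bytes=[300 * MB, 140 * MB, 90 * MB],
))
# ['VM Largest Block Size: Min should be >32MB']
```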

Hope you found this helpful.  At a later date, I will provide the equivalent procedure to help you troubleshoot client-side latencies.


