Troubleshooting Common RPC issues in Outlook

Common Root Causes For Receiving the RPC Dialog Box

High network latency
Loss of network connectivity on the client side
Loss of a network path within a network
Exchange server outages and crashes
Active Directory/Domain Controller outages and crashes
High database and/or log disk latencies
High server CPU and/or context switching
Long running MAPI operations

A few thoughts to keep in mind when debugging latency issues:

  • High disk latencies usually affect multiple users, not just a single user.
  • If maximum server latencies are high and cannot be explained by high disk latencies, also check for Jet log stalls and high server context switching or CPU usage.
  • Disconnects and reconnects always start with calls to Logon, so seeing many Logon calls from a particular user is a sign of connection problems (in addition to the RPC failures).
  • Outlook/COM add-ins, VBA code, and MAPI code running on the user’s workstation can cause problems that are intermixed with Outlook requests.
  • While Outlook makes expensive calls from time to time, third-party applications are common culprits.

RPC Response Times

Autodiscover requests time out.

OWA is extremely slow and ECP is also not responding; sometimes, after multiple attempts, the request times out.

Microsoft Outlook takes much longer than usual to respond, approximately 10 to 15 minutes.

Check CAS server Health.

Test-ServiceHealth – This should be run first in any troubleshooting scenario. It lists each Exchange role, the services that role requires to function properly, and which of those services are not running.

Test-MapiConnectivity – Run this next to verify that mailbox access is working.

Test-OwaConnectivity – Verifies that OWA is running and can be used to test all virtual directories or individual ones.

Test-OutlookConnectivity – Tests end-to-end Outlook connectivity to the CAS. HTTP (Outlook Anywhere) or TCP connectivity can be specified.

Test-OutlookWebServices – Verifies the service information returned by Autodiscover for the Availability Service, Outlook Anywhere, the Offline Address Book (OAB), and Unified Messaging (UM).

Test-WebServicesConnectivity – Tests EWS functionality.

Test-EcpConnectivity – Verifies connectivity to the Exchange Control Panel.

Test-ActiveSyncConnectivity – Performs a full mailbox synchronization to verify the health of ActiveSync.

Test-PowerShellConnectivity – Tests whether PowerShell remoting on a target CAS server is healthy.
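
As a rough sketch, the first few checks might be run from the Exchange Management Shell as follows. The mailbox name cjones is a placeholder borrowed from the example later in this article, and the -Protocol parameter assumes the Exchange 2010 version of Test-OutlookConnectivity; adapt both to your environment.

```powershell
# Run from the Exchange Management Shell on the server being checked.

# 1. Show only the roles that have a required service stopped.
Test-ServiceHealth |
    Where-Object { -not $_.RequiredServicesRunning } |
    Format-List Role, ServicesNotRunning

# 2. Confirm that a mailbox can be opened over MAPI.
#    "cjones" is a placeholder mailbox identity.
Test-MapiConnectivity -Identity "cjones"

# 3. Test Outlook Anywhere (HTTP) connectivity. Assumes Exchange 2010,
#    where -Protocol accepts HTTP or TCP. Unless credentials are supplied,
#    the cmdlet relies on the test mailbox created by
#    New-TestCasConnectivityUser.ps1.
Test-OutlookConnectivity -Protocol HTTP
```

If the first command returns nothing, every required service is running and the later tests are worth the time.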

 

Run the ExMon tool (Exchange Server User Monitor) to identify the culprit account or profile, and to verify that no single user session is placing significant load on the server, for example because of a virus or a bad Outlook profile.

Directory Services Performance

To troubleshoot performance issues related to the Global Catalog, two performance counters are relevant:

MSExchange ADAccess

  • LDAP Read Time
  • LDAP Search Time

Neither counter should ever exceed 100 ms, and both should remain under 50 ms during normal activity.
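
One way to sample these counters is with Get-Counter. This is a minimal sketch: the full counter paths below are an assumption (the per-domain-controller counter set) and should be confirmed on your server before relying on them.

```powershell
# Counter paths are an assumption; confirm them on your server with:
#   Get-Counter -ListSet "MSExchange ADAccess*"
$counters = @(
    '\MSExchange ADAccess Domain Controllers(*)\LDAP Read Time',
    '\MSExchange ADAccess Domain Controllers(*)\LDAP Search Time'
)

# Take 12 samples, 5 seconds apart, and keep only readings above 50 ms.
Get-Counter -Counter $counters -SampleInterval 5 -MaxSamples 12 |
    ForEach-Object { $_.CounterSamples } |
    Where-Object { $_.CookedValue -gt 50 } |
    Select-Object Timestamp, Path, CookedValue
```

An empty result means every sample stayed under the 50 ms guideline during the sampling window.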

In the Application log, review Event ID 2080 to make sure that all domain controllers are responding with the correct Boolean values. If any responses are not accurate, the DCs should be repaired or excluded.
For testing, a DC can be excluded by using the Set-ExchangeServer -StaticExcludedDomainControllers parameter. However, Global Catalog access should still be troubleshot as soon as your testing is completed. Statically excluding a GC takes effect immediately and is visible in the next Event ID 2080 entry, where the excluded server shows all zero values.
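
The review and the temporary exclusion could look roughly like this. The server name EX01 and the domain controller dc01.contoso.com are hypothetical placeholders, not names from this environment.

```powershell
# Show the most recent topology discovery event (Event ID 2080).
Get-WinEvent -FilterHashtable @{ LogName = 'Application'; Id = 2080 } -MaxEvents 1 |
    Format-List TimeCreated, Message

# Temporarily exclude a domain controller for testing.
# "EX01" and "dc01.contoso.com" are placeholder names.
Set-ExchangeServer -Identity "EX01" -StaticExcludedDomainControllers "dc01.contoso.com"

# Remove the exclusion once testing is complete.
Set-ExchangeServer -Identity "EX01" -StaticExcludedDomainControllers $null
```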

Running Backups

Backup applications running during the day could be causing significant server load.

Load balancer configuration

Please note that for 3rd party load balancer configuration, you should always refer to product documentation / guidance. The following are some general best practices and things we see misconfigured:

1. Verify that the client TCP idle timeout on the load balancer is slightly larger than the Keep Alive setting on CAS.

In this example, we are using the 30-minute Keep Alive on CAS and we have both a firewall and a load balancer between the clients and CAS. Here is the connection path.

Clients > Firewall > Load Balancer > CAS

If you have a firewall in the path from client to CAS, the value that matters is the firewall “idle” timeout, not the persistence timeout. The firewall timeout should be greater than the load balancer timeout, and the load balancer timeout should be greater than the CAS Keep Alive. Note that it is not recommended to go below 15 minutes for Keep Alive on CAS or for the TCP idle timeout on the load balancer.

Firewall time out = 40 minutes
LB TCP Idle time out = 35 minutes
CAS Keep Alive = 30 minutes
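
The ordering above can be sanity-checked with a few lines of PowerShell. This sketch assumes that the Keep Alive on CAS is the Windows TCP KeepAliveTime registry value (stored in milliseconds); the load balancer and firewall figures are typed in by hand, using the example values above.

```powershell
# Assumption: "Keep Alive on CAS" is the Windows TCP KeepAliveTime value (milliseconds).
$path = 'HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters'
$prop = Get-ItemProperty -Path $path -Name KeepAliveTime -ErrorAction SilentlyContinue

# Windows falls back to 2 hours (7,200,000 ms) when the value is not set.
$casKeepAliveMin = if ($prop) { $prop.KeepAliveTime / 60000 } else { 120 }

# Values read manually from the load balancer and firewall (example figures).
$lbIdleMin       = 35
$firewallIdleMin = 40

# Firewall > load balancer > CAS, with nothing below 15 minutes.
if ($firewallIdleMin -gt $lbIdleMin -and
    $lbIdleMin -gt $casKeepAliveMin -and
    $casKeepAliveMin -ge 15) {
    "Timeout chain OK: firewall $firewallIdleMin > LB $lbIdleMin > CAS $casKeepAliveMin minutes"
} else {
    "Timeout chain needs review: firewall $firewallIdleMin, LB $lbIdleMin, CAS $casKeepAliveMin minutes"
}
```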

2. If the load balancer supports it, the preferred option is to configure it to use “Least Connections” with “Slow start” during typical operation.

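With the “least connections” method, be mindful that a CAS can become overloaded and unresponsive during a CAS outage or during patching/maintenance, because a server that rejoins the pool with no connections receives most of the new ones at once. In the context of Exchange performance, authentication is an expensive operation, which is why “Slow start” is paired with this method.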

Check Database Status.
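
A minimal sketch of this check from the Exchange Management Shell follows. EX01 is a hypothetical mailbox server name, and the second command only applies where database copies exist (a DAG).

```powershell
# List every mailbox database with its mount state.
Get-MailboxDatabase -Status |
    Format-Table Name, Server, Mounted -AutoSize

# In a DAG, also check copy health and queue lengths.
# "EX01" is a placeholder mailbox server name.
Get-MailboxDatabaseCopyStatus -Server "EX01" |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize
```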

IPv6 Disabled

IPv6 has been known to cause performance issues with Microsoft Exchange. For testing, it is therefore advisable to turn off IPv6 by disabling it on the network interface adapter on the Exchange 2010 server and by setting the DisabledComponents registry DWORD value as described in Microsoft KB929852.
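
The registry part of this step might be scripted as below. This is a sketch only: the value 0xFF is an assumption and should be confirmed against KB929852 for your scenario, and the server must be restarted for the change to take effect.

```powershell
$path = 'HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip6\Parameters'

# Read the current value; no output means the value is not set.
Get-ItemProperty -Path $path -Name DisabledComponents -ErrorAction SilentlyContinue |
    Select-Object DisabledComponents

# Set the value for testing. 0xFF is an assumed value; confirm the correct
# one in KB929852. A restart is required afterwards.
New-ItemProperty -Path $path -Name DisabledComponents -PropertyType DWord -Value 0xFF -Force
```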

 

Check Connection Count on CAS server
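
One way to read the connection count is through the RPC Client Access performance counters. The counter names here are an assumption based on the Exchange 2010 counter set, so list what your server actually exposes before relying on them.

```powershell
# Counter names are an assumption based on the Exchange 2010 RPC Client Access
# counter set; list what is available with:
#   Get-Counter -ListSet "MSExchange RpcClientAccess"
Get-Counter -Counter @(
    '\MSExchange RpcClientAccess\Connection Count',
    '\MSExchange RpcClientAccess\User Count',
    '\MSExchange RpcClientAccess\RPC Averaged Latency'
) | ForEach-Object { $_.CounterSamples } |
    Select-Object Path, CookedValue
```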

Check problematic user CAS connection

Get-LogonStatistics -Identity cjones | where {$_.ApplicationId -eq "Client=MSExchangeRPC"} | fl ClientName,ApplicationId

Finally, try restarting the IIS service and, after that, the CAS server.

At the command prompt, type iisreset /stop. IIS will attempt to stop all services and will return confirmation once all services have been stopped.


 

At the command prompt, type iisreset /start.


Done.

Thanks

 

 

 
