ms-client-diagnostics: 25; reason=”A federated call failed to establish due to a media connectivity failure where both endpoints are internal

Huh, you say... that’s what I said too.

The Problem

Audio and video were failing between federated partners. This only happened when the B-Party user was internal. I also found that when the B-Party called the A-Party, the media worked.

The Monitoring server reports the following:

ms-client-diagnostics: 25; reason="A federated call failed to establish due to a media connectivity failure where both endpoints are internal"

Diagnostic ID 25 states: “A federated call failed to connect because a media path could not be established between the two internal endpoints. This failure can be caused by network performance issues affecting the endpoints or an endpoint being unable to use the A/V Edge Server. Correlation of such failures with individual networks, endpoints or A/V Edge Server can indicate potential deployment issues.”

The Diagnostic Header:

ms-client-diagnostics: 25; reason="A federated call failed to establish due to a media connectivity failure where both endpoints are internal"; UserType="Callee"; MediaType="audio"; ICEWarn="0x40003a0"; LocalSite="192.168.99.33:20008"; LocalMR="60.234.48.202:54594"; RemoteSite="10.57.210.11:20003"; RemoteMR="122.56.25.147:54376"; PortRange="20000:20039"; LocalMRTCPPort="55223"; RemoteMRTCPPort="54376"; LocalLocation="2"; RemoteLocation="2"; FederationType="1"; NetworkName="ucsorted.com"; Interfaces="0x6"; BaseInterface="0x2"; BaseAddress="192.168.99.33:20018"; MrDnsU="lyncedge01.ucsorted.com"; MrResU="0"
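A header this dense is easier to read once it is split into fields. The following is only a minimal sketch in Python, assuming the header value has been copied out of the Monitoring report as text; the function name `parse_diagnostics` and the shortened sample string are my own illustration, not part of any Lync\Skype for Business tooling.

```python
import re

def parse_diagnostics(header: str) -> dict:
    """Split an ms-client-diagnostics value into its ID and Key="value" fields."""
    # Normalise typographic quotes that creep in when a header is pasted
    # through a blog or word processor.
    for ch in "\u201c\u201d\u2033":
        header = header.replace(ch, '"')
    diag_id, _, rest = header.partition(";")
    fields = {"id": int(diag_id.strip())}
    # Each remaining field has the form Key="value"; values never contain '"'.
    for key, value in re.findall(r'(\w+)="([^"]*)"', rest):
        fields[key] = value
    return fields

# Shortened copy of the header above (value only, without the header name).
sample = ('25; reason="A federated call failed to establish due to a media '
          'connectivity failure where both endpoints are internal"; '
          'UserType="Callee"; MediaType="audio"; ICEWarn="0x40003a0"; '
          'LocalSite="192.168.99.33:20008"; LocalMR="60.234.48.202:54594"; '
          'RemoteSite="10.57.210.11:20003"; RemoteMR="122.56.25.147:54376"')

diag = parse_diagnostics(sample)
for name in ("id", "UserType", "LocalSite", "LocalMR", "RemoteSite", "RemoteMR"):
    print(f"{name:>10}: {diag[name]}")
```

Laid out this way, the LocalSite/RemoteSite and LocalMR/RemoteMR addresses can be lined up against the network diagram far more quickly.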

Finding the underlying issue

The federated environment is as follows:

[Figure: Skype Edge Architecture, logical view]

Following the call setup using Snooper, I can see the initial INVITE from the A-Party with 8 possible candidates.

[Figure: SIP INVITE, Caller Candidates]

Next we see the response from the B-Party in a 183 Session Progress message. The B-Party offers 5 possible candidates.

[Figure: 183 Session Progress, Caller Candidates]

We should now see a message containing “a=remote-candidates” to identify the selected and tested candidate pairs.

In my trace, this is missing! Instead I find a BYE message with the following ms-client-diagnostics:

The Diagnostic Header:

ms-client-diagnostics: 25; reason="A federated call failed to establish due to a media connectivity failure where both endpoints are internal"
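The check I did by eye in Snooper (count the offered candidates, then look for “a=remote-candidates”) can also be scripted. This is a rough sketch in Python, assuming the SDP body of a message has been saved as text; `summarize_ice` is a hypothetical helper name and the SDP fragment is invented for illustration using documentation address ranges, not taken from my trace.

```python
from collections import Counter

def summarize_ice(sdp: str) -> dict:
    """Count ICE candidate lines by type and report whether a pair was nominated."""
    lines = [ln.strip() for ln in sdp.splitlines()]
    candidates = [ln for ln in lines if ln.startswith("a=candidate:")]
    types = Counter()
    for ln in candidates:
        tokens = ln.split()
        if "typ" in tokens[:-1]:
            # The token after "typ" is the candidate type (host, srflx, relay...).
            types[tokens[tokens.index("typ") + 1]] += 1
    nominated = any(ln.startswith("a=remote-candidates:") for ln in lines)
    return {"candidate_lines": len(candidates), "types": dict(types), "nominated": nominated}

# Invented SDP fragment: two host lines (RTP and RTCP) plus one relay line.
offer = """v=0
m=audio 20008 RTP/AVP 0
a=candidate:1 1 UDP 2130706431 192.0.2.10 20008 typ host
a=candidate:1 2 UDP 2130705918 192.0.2.10 20009 typ host
a=candidate:2 1 UDP 16648703 198.51.100.5 54594 typ relay raddr 192.0.2.10 rport 20008
"""

summary = summarize_ice(offer)
print(summary)
if not summary["nominated"]:
    print("No a=remote-candidates line: no candidate pair was selected in this message")
```

Running this over each INVITE, 183 and 200 OK body in a call shows quickly whether the negotiation ever reached the point of nominating a pair.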

I did notice the reference to LocalLocation="2"; RemoteLocation="2". Having compared this to a successful federated audio call, I can’t say that this has any relevance, since the same location IDs were present.

Generally, when candidate negotiation fails we are facing a path issue, be that blocked ports or routing.

I have come across this issue before; see http://ucsorted.com/2014/01/12/media-connectivity-failure-when-both-endpoints-are-internal/

This time, however, the underlying cause of failed candidate negotiation was something I hadn’t encountered (yet).

My Edge server was configured with external-facing IPs in the DMZ, each with its own NATted public IP. Checking the NAT using whatsmyIP, I discovered that the result was not what I expected. Cycling through all 3 IPs on the Edge server, I found that the routing was not symmetric!

Arrghh! Such a simple issue!

Asymmetric versus symmetric simply refers to the paths that data takes on a round trip.
Symmetric routing – traffic is sent and received via the same public IP.
Asymmetric routing – traffic is sent via one public IP BUT return traffic arrives via a different IP (this is no good for audio and video in Lync\Skype for Business).
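To make the distinction concrete, here is a small sketch in Python of the comparison I was doing by hand: for each DMZ address, is the public IP the outside world sees the same one inbound traffic is NATted to? Everything here is an assumption for illustration: the addresses come from documentation ranges, `nat_report` and `observed_public_ip` are hypothetical helpers, and `echo_host` stands for any plain-text “what is my IP” service you trust.

```python
import http.client

def observed_public_ip(source_ip: str, echo_host: str) -> str:
    """Ask a plain-text IP echo service which public IP it sees for this source."""
    # source_address binds the outbound connection to one specific local IP,
    # so each DMZ address can be tested without removing the others.
    conn = http.client.HTTPConnection(echo_host, 80, timeout=10,
                                      source_address=(source_ip, 0))
    try:
        conn.request("GET", "/")
        return conn.getresponse().read().decode().strip()
    finally:
        conn.close()

def nat_report(expected: dict, observed: dict) -> dict:
    """Label each DMZ IP symmetric if traffic leaves via its own public IP."""
    report = {}
    for dmz_ip, public_ip in expected.items():
        seen = observed.get(dmz_ip)
        report[dmz_ip] = "symmetric" if seen == public_ip else "asymmetric"
    return report

# Invented NAT table (DMZ IP -> public IP that inbound traffic is NATted to).
expected = {"10.0.0.11": "203.0.113.11",
            "10.0.0.12": "203.0.113.12",
            "10.0.0.13": "203.0.113.13"}
# Invented observations; on a live Edge server these would come from
# observed_public_ip(dmz_ip, echo_host) for each DMZ address.
observed = {"10.0.0.11": "203.0.113.11",
            "10.0.0.12": "203.0.113.11",  # leaves via the wrong public IP
            "10.0.0.13": "203.0.113.13"}

report = nat_report(expected, observed)
for dmz_ip, verdict in report.items():
    print(dmz_ip, verdict)
```

Any address flagged as asymmetric is one where the outbound NAT rule does not match the inbound one, which is the condition that broke the candidate negotiation here.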

Solution

Once the routing was corrected to symmetric the audio was successful.

This time the nominated candidate pair for audio and video was present. Let’s look at the final INVITE from the Callee (B-Party) to the Caller (A-Party).

[Figure: Candidate Negotiation, Caller Candidate Selection]

The Callee is stating that it will use candidate pair 3 to get to the Caller and that it plans to do so via the remote-candidate.

In response (the OK message), the Caller states its intention to use the following candidate pair:

[Figure: Candidate Selection, Callee Candidate Selection]

Now the Caller states that it will use candidate pair 6 to get to the Callee and that this will be done via its remote-candidate.

Summary

It seems to me that when the candidate negotiation fails the default assumption is that the endpoints must be internal. I can understand that reasoning since you would only expect to negotiate candidates so that you are able to traverse the Edge infrastructure.

However, it can lead you down the wrong path.

Phew!... and now that’s sorted.

5 responses to “ms-client-diagnostics: 25; reason=”A federated call failed to establish due to a media connectivity failure where both endpoints are internal”

  1. GW

    Hi there – Great article. I’ve been wrestling with something similar.

    Just wondering how you were able to check the NAT of the external-facing IPs in the DMZ one at a time? How did you get the traffic sourced from each of the IPs one at a time? My server is 2008 R2 and there are no options to run ping/tracerts from a specific source address.

    Thanks

    GW

    1. Paul B

      In my case I was using a single interface with the 3 external-facing IPs assigned to it. So I simply removed the IPs, leaving just the one I needed to test. Next I made sure my internet traffic was using this connection. Then I simply went to http://www.whatsmyip.org/ to confirm the transmit path IP.
      The receive path is easily tested using Telnet from an external host to the public IPs.

      PB

  2. rahul

    We have a similar issue where we are trying to connect to a federated partner’s meeting and get the same error you have described. I feel confused while performing those steps, but this issue is random. Sometimes we are unable to join meetings and sometimes we are unable to join their video calls.

    I don’t understand why meetings with other partners are working fine. Could we have a similar issue?

    1. Paul B

      I would say that you most likely have a network issue. My first thought is related to the subnets the users are connecting from. You need to collect the candidate information from the setup logs and map that out against your network architecture. That way you have a logical view of the routing involved.
