Monday, June 9, 2014

Lync Centralized Logging Service AlwaysOn filling up hard drive

Issue

Customer was running the AlwaysOn scenario with Centralized Logging Service on Lync Server 2013. The drive was filling up with .etl files even though we had set the "CacheFileLocalMaxDiskUsage".


When I observed how this was operating on a working system, I noticed that as soon as the .etl file rolled over to a new one (at 20MB), it would be converted to a .cache and .hdr file and the .etl file would be deleted.

So clearly the issue was that the .etl files were not being converted and then deleted. This allowed a huge amount of disk space to be chewed up and disk alerts to be sent by the customer's monitoring application.

Note: if you are trying to track down the .etl files, it does no good to type %temp%/Tracing in the Run command. That will go to the Tracing folder under the currently logged-in user's temp directory. The Centralized Logging Service runs under "NetworkService", so its files can be found in:
C:\Windows\ServiceProfiles\NetworkService\AppData\Local\Temp\Tracing
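If you want a quick way to confirm whether .etl files are piling up instead of being converted, something like the following sketch may help. It simply totals the files in a folder by extension. The function name `summarize_tracing_dir` is my own for illustration, and the default path is the one quoted above, so adjust it if your environment differs.

```python
import os
from collections import defaultdict

# Path quoted in this post; an assumption for any other environment.
TRACING_DIR = r"C:\Windows\ServiceProfiles\NetworkService\AppData\Local\Temp\Tracing"


def summarize_tracing_dir(path=TRACING_DIR):
    """Return {extension: (file_count, total_bytes)} for files directly in path."""
    totals = defaultdict(lambda: [0, 0])
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file():
                # Group by lower-cased extension, e.g. ".etl", ".cache", ".hdr"
                ext = os.path.splitext(entry.name)[1].lower()
                totals[ext][0] += 1
                totals[ext][1] += entry.stat().st_size
    return {ext: (count, size) for ext, (count, size) in totals.items()}
```

On a healthy server I would expect only a small number of .etl files at any moment, with most of the space in .cache files. A large and growing .etl total is the symptom described here. Note that you will likely need to run this from an elevated prompt to read the NetworkService profile.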

Resolution

This will be an obvious one, but it may still catch people, which is why I'm putting this blog post together. The culprit was Symantec Antivirus. Once it was disabled, the .etl files were converted to .cache and .hdr files as expected. The Centralized Logging Service should have been excluded per the TechNet article about excluding executables and directories for Lync Server 2013 (http://technet.microsoft.com/en-us/library/dn440138.aspx).

AT&T Hosted Firewall preventing Desktop/Application Sharing through Lync Edge

Issue

Ran into this problem a while ago and am just now getting around to writing about it... but it is one of those head-scratcher problems, until you compare a working system with the non-working system. In this case, I spotted the cause of the Desktop/Application Sharing failure across the Lync Edge server in Wireshark, at the point where the client went to STUN the server on port 443.

I didn't realize this, but there is actually a momentary TLS negotiation on 443 STUN, and that negotiation is where the failure from the AT&T Hosted Firewall showed up.



From the Lync Edge server perspective everything was successful. But the AT&T Hosted Firewall in the middle of the TLS negotiation was sending back this "Level: Fatal, Description: Access Denied" error instead of what the Lync Edge Server responded with. This was both an issue for Lync Server 2010 and 2013.
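For anyone who wants to see what Wireshark is decoding there, a TLS alert is a tiny record: content type 21, a two-byte version, a two-byte length, then one byte for the level and one for the description. Below is a minimal sketch of decoding an unencrypted alert record. The function name and the sample bytes in the usage note are made up for illustration and are not taken from the customer capture. The lookup tables cover only a handful of the codes defined in the TLS specification.

```python
import struct

TLS_CONTENT_TYPE_ALERT = 21

ALERT_LEVELS = {1: "warning", 2: "fatal"}

# Small subset of the alert descriptions defined by the TLS specification.
ALERT_DESCRIPTIONS = {
    0: "close_notify",
    10: "unexpected_message",
    40: "handshake_failure",
    42: "bad_certificate",
    48: "unknown_ca",
    49: "access_denied",
}


def parse_tls_alert(record: bytes):
    """Decode an unencrypted TLS alert record into (level, description)."""
    if len(record) < 7:
        raise ValueError("record too short to be a TLS alert")
    # Record header: content type (1), version major/minor (1+1), length (2)
    content_type, _major, _minor, length = struct.unpack("!BBBH", record[:5])
    if content_type != TLS_CONTENT_TYPE_ALERT:
        raise ValueError("not a TLS alert record")
    if length != 2:
        raise ValueError("unexpected alert length (possibly encrypted)")
    level, description = record[5], record[6]
    return (
        ALERT_LEVELS.get(level, f"unknown({level})"),
        ALERT_DESCRIPTIONS.get(description, f"unknown({description})"),
    )
```

As an example, the hypothetical bytes `15 03 01 00 02 02 31` decode to `("fatal", "access_denied")`, which matches the "Level: Fatal, Description: Access Denied" text shown by Wireshark.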

Resolution

The gist of the resolution was to ask AT&T to do a Policy Bypass for the IP addresses associated with the Lync Edge server. The problem was with STUN, but if you have multiple IP addresses on your Lync Edge servers, I would probably ask that they bypass those as well.

Just in case that doesn't get you far enough... I have included below, verbatim, what was sent back, so that you can coax the Level 1 support technician into finding someone who really knows what they are doing (my customer went through several support people before he found someone who could fix this).
“What I did to correct the issue is to remove the Protocol Option filter from rule 4 and rule 17 in the policy that was being used for the Lync traffic.  Protocol options tell the firewall to check further into known traffic types such as http, https, smtp, ftp, etc. for expected settings.  On the HTTPS side one of those checks is ‘Allow invalid SSL certificates’ which is not enabled by default.  Since the filter is used by most of your rules in your policy I didn’t want to enable this and have all of those rules using it so instead I removed the filter from these two rules.  If you would like it re-applied but with the setting enabled a separate filter can be configured with that setting enabled and just apply that filter to the two rules.”