Notes from the lab: VMware Horizon and Microsoft MFA NPS Extension

In my own lab environment I have a mixture of EUC components with two-factor authentication configured accordingly, but more and more I see customers simply using Microsoft’s MFA solution for their environments. Why not? It’s included with your license, right?

So, back to the techie part: I’ve configured my own NPS setup on Windows Server 2019 and configured RADIUS. I installed the MFA NPS extension and already had a pre-existing configuration for my Citrix ADC appliance.

I’ve configured my Horizon Connection Server as a RADIUS client and enabled the connection request and network policies for it as well, with the condition type NAS IPv4 Address set to the IP address of the server.
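To make the NAS IPv4 Address condition a bit more tangible, here is a minimal Python sketch of a RADIUS Access-Request as laid out in RFC 2865, including the NAS-IP-Address attribute that such a policy condition matches on. The user name, password, shared secret and IP address are made-up placeholders, and it is an assumption that your RADIUS client fills this attribute with the address you expect; a packet capture on the NPS server is the way to confirm that.

```python
import hashlib
import os
import socket
import struct

ACCESS_REQUEST = 1        # RADIUS packet code (RFC 2865)
ATTR_USER_NAME = 1
ATTR_USER_PASSWORD = 2
ATTR_NAS_IP_ADDRESS = 4   # the value a "NAS IPv4 Address" condition is compared with


def attribute(attr_type: int, value: bytes) -> bytes:
    """Encode one attribute as Type (1 byte), Length (1 byte), Value."""
    return struct.pack("!BB", attr_type, len(value) + 2) + value


def hide_password(password: bytes, secret: bytes, authenticator: bytes) -> bytes:
    """Hide the User-Password with MD5 chaining (RFC 2865, section 5.2)."""
    length = max(16, (len(password) + 15) // 16 * 16)
    padded = password.ljust(length, b"\x00")
    hidden, previous = b"", authenticator
    for i in range(0, len(padded), 16):
        digest = hashlib.md5(secret + previous).digest()
        block = bytes(p ^ d for p, d in zip(padded[i:i + 16], digest))
        hidden += block
        previous = block
    return hidden


def build_access_request(user: str, password: str, secret: bytes,
                         nas_ip: str, identifier: int = 0) -> bytes:
    """Build an Access-Request: 20-byte header followed by the attributes."""
    authenticator = os.urandom(16)
    attrs = (
        attribute(ATTR_USER_NAME, user.encode())
        + attribute(ATTR_USER_PASSWORD,
                    hide_password(password.encode(), secret, authenticator))
        + attribute(ATTR_NAS_IP_ADDRESS, socket.inet_aton(nas_ip))
    )
    header = struct.pack("!BBH", ACCESS_REQUEST, identifier, 20 + len(attrs))
    return header + authenticator + attrs


if __name__ == "__main__":
    # Placeholder values, not taken from the lab setup.
    packet = build_access_request("alice", "pw", b"shared-secret", "192.0.2.10")
    print(len(packet), packet.hex())
```

If the address in that last attribute does not match the condition in the policy, NPS will not select the policy for the request.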

Afterwards, complete the configuration in Horizon itself by pointing RADIUS authentication to the NPS server, filling in all the necessary fields and any additions you want.

Basically, everything should work instantly when logging on through the Horizon URL or client.

I did, however, have some issues where logins would time out, and the event entries would say that the wrong second-factor request was given. This ultimately came from the fact that I didn’t have a primary authentication method set in MFA; I had only checked that I could use my YubiKey, SMS or push authentication. The resolution was to select push as the primary method in the Authenticator app, and then it worked instantly.

Reference articles:
https://docs.microsoft.com/en-us/azure/active-directory/authentication/howto-mfa-nps-extension
https://christiaanbrinkhoff.com/2017/02/17/how-to-configure-azure-mfa-for-citrix-netscaler-gateway-radius-by-using-the-new-nps-extension/

(Off topic: there is an issue with NPS on Windows Server 2019 that I encountered when configuring RADIUS Wi-Fi authentication; see the resolution here: https://community.ui.com/questions/FYI-Windows-Server-2019-NPS-for-RADIUS-broken-w-fix/364c7c17-b3d3-4973-8dd2-e4e701309300)

Notes from the field: The unexplained Outlook pop-up

Quite recently I had an interesting troubleshooting session at a customer. The initial problem was that, in the newly built Exchange 2019 environment, Outlook clients would open and ask for credentials in a domain-joined environment, so the SSO part of Windows Integrated Authentication (WIA) wasn’t working, and it only “seemed” to work after you entered credentials.

Long story short: a support case was opened at Microsoft, and after some weeks of log troubleshooting there were no results and the customer’s migration project was at a standstill.

Drilling down into the issue, it seemed that the Exchange solution wasn’t working correctly, so it was decided to uninstall the software, clean up the entire setup and do a fresh redeploy. The engineer had already repaired some ADDS inconsistencies that could perhaps have been a cause, but that was not the case.

At this point I got involved because the uninstall wasn’t working correctly on the last server: it kept being blocked by some arbitration mailboxes that were still present and by an old hardcoded OAB reference that couldn’t be removed. I’ve seen this kind of behavior with arbitration mailboxes on an Exchange 2016 setup as well, where a reboot resolved the blocking issue: the entities get populated again and afterwards you can remove them accordingly. That was the solution here as well, except for the OAB entry, which we had to delete through ADSI Edit.

OK, all was well again: Exchange 2019 was completely removed and a redeploy was the next item. Exchange got redeployed, basic with no fuss, and the behavior of the client seemed to improve. Well, no: after some time the Outlook prompts and disconnects came back.

The next step was to strip things down even further. We considered scenarios such as a complete redeploy of ADDS, which would mean a lot of work. So we tried to reproduce the issue in separate lab environments, and there we found no issues at all: SSO worked out of the box, and things started to point to customization of some sort or other external factors.

We landed on the idea that perhaps the forest trust was the issue; that was our last item to troubleshoot. There was a full forest trust in place for the migration/transition to the new network, which was being used for resources etc. This was also something we had implemented in both lab environments without ever having issues. Well, here comes the kicker… we disabled the trust and the issues were instantly gone!

OK, we had found our little friend and focused on it. It turned out that the enabled name suffix routing of the domain was causing the issues, which was very strange because the suffixes did not overlap in any way. It all “seemed” OK from a configuration perspective. We zoomed in on everything that could be an issue and ultimately found that the NetBIOS name and the FQDN of the ADDS domain were the same, with a dot in it, e.g. domain.local for both. This isn’t something you can set up on anything from Windows 2000 through Server 2019, and it is basically not supported or advised; it was possible in Windows NT4, but from then on it is a no-go.
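As an illustration of the check we ended up doing by hand, here is a small Python sketch that flags the naming situation described above. The function name and the example values are made up for this post; the 15-character limit on NetBIOS names is the only rule added beyond what we found at the customer.

```python
def netbios_name_problems(netbios_name: str, dns_name: str) -> list:
    """Return a list of findings for a domain's NetBIOS name / DNS name pair."""
    problems = []
    if "." in netbios_name:
        problems.append("NetBIOS name contains a dot")
    if netbios_name.lower() == dns_name.lower():
        problems.append("NetBIOS name is identical to the DNS name")
    if len(netbios_name) > 15:
        problems.append("NetBIOS name is longer than 15 characters")
    return problems


if __name__ == "__main__":
    # The situation from this case versus a conventional setup.
    print(netbios_name_problems("domain.local", "domain.local"))
    print(netbios_name_problems("DOMAIN", "domain.local"))
```

An empty list only means that these particular checks passed; it says nothing about the rest of the trust configuration.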

To recap the whole situation: we have our workaround for the customer, but the root cause, a NetBIOS name with a dot in it, isn’t going away. The migration is focused on moving everything as soon as possible and then killing the trust and any connections to the old forest/domain that was causing all this grief.

I had a splendid troubleshooting session with the engineer, who’s a wizard in ADDS; be sure to connect with or follow him on LinkedIn.

Christiaan Ploeger

Notes from the field: Configuring AFAS Online with Azure

I have a quick win for those who are also in the process of migrating an ADFS-configured AFAS Online setup to Azure Active Directory. I’ve already had a support call with AFAS, and apart from the fact that they don’t support troubleshooting IdP setups, they did their best, which in turn got me to share this.

So, down to the point: the following article describes the SSO part needed for AFAS Online: https://help.afas.nl/help/EN/SE/plv2_Config_SSO.htm

The part that needs to be adjusted is the endpoint: the article refers to the federation metadata document, which is not the one you need. It needs to be the OpenID Connect metadata document listed under the endpoints. Microsoft now defaults to the /v2.0/ version. (On a side note, there might be situations where you want to use the v1 document, which is no longer listed as an endpoint to copy; to use it, just delete the /v2.0/ part from the URL and the old version will be used.)
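For reference, this is roughly what the two variants of the OpenID Connect metadata URL look like, sketched in Python. The tenant name is a placeholder (use your own tenant ID or domain), and the helper names are made up for this post.

```python
AUTHORITY = "https://login.microsoftonline.com"


def metadata_url(tenant: str, v2: bool = True) -> str:
    """Build the OpenID Connect metadata document URL for a tenant."""
    version = "/v2.0" if v2 else ""
    return f"{AUTHORITY}/{tenant}{version}/.well-known/openid-configuration"


def to_v1(url: str) -> str:
    """Turn a copied v2.0 metadata URL into the v1 variant by dropping /v2.0/."""
    return url.replace("/v2.0/", "/", 1)


if __name__ == "__main__":
    tenant = "contoso.onmicrosoft.com"  # placeholder tenant
    v2_url = metadata_url(tenant)
    print(v2_url)
    print(to_v1(v2_url))
```

Opening either URL in a browser should return a JSON document; that is a quick way to check that you copied the right endpoint before pasting it into AFAS Online.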

The final part is the configuration in AFAS Online. When you fill in all the values there, the documentation states that “Scopes” is an optional field, which in practice it isn’t. I only got it to work with this field filled with email, and the same value in the claim field.

If you don’t fill out the scopes field, it will error out with a missing claim “upn” or “email”, whichever one you chose.

Hope it helps!

Notes from the lab: Configuring vCenter 7 with ADFS

With the release of vCenter 7 you can now integrate it with Microsoft Active Directory Federation Services (ADFS).

See the following blog article for an overview:
https://blogs.vmware.com/vsphere/2020/03/vsphere-7-identity-federation.html

See the following configuration articles for a setup overview:
https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.authentication.doc/GUID-C5E998B2-1148-46DC-990E-A5DB71F93351.html
https://kb.vmware.com/s/article/78029

With this information I’ve configured my lab environment to a working ADFS-based federated login, with a few minor issues along the way.

I had my ADFS setup load balanced through a content switching setup for external access. This works great for my simple Office 365 integration, but not so much if you’re trying to do more.

As stated in the following article, don’t terminate the SSL connection:
https://docs.microsoft.com/en-us/windows-server/identity/ad-fs/overview/ad-fs-faq

The issue I came across was that vCenter was failing to validate the certificate. At first I thought missing root/intermediate certificates were the cause, but this was not the case. Even after uploading my internal root/intermediate and the external certificate’s root/intermediate, the SSL validation check would fail, although the chain was valid in every case.

The resolution was to point my internal DNS entry at an SSL-BRIDGE setup in front of my ADFS server; afterwards I could finish the configuration without issues.
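One way to see whether something in between is terminating SSL is to compare the certificate presented on the load-balanced name with the one the ADFS server presents directly. Below is a minimal Python sketch of that comparison using only the standard library; the host names in the usage part are placeholders, and this is a diagnostic idea rather than the check vCenter itself performs.

```python
import hashlib
import ssl


def presented_certificate(host: str, port: int = 443) -> str:
    """Fetch the certificate a server presents, as PEM text (no validation)."""
    return ssl.get_server_certificate((host, port))


def fingerprint(pem_cert: str) -> str:
    """SHA-256 fingerprint of a PEM-encoded certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    return hashlib.sha256(der).hexdigest()


def same_certificate(pem_a: str, pem_b: str) -> bool:
    """True when both PEM texts contain the same certificate."""
    return fingerprint(pem_a) == fingerprint(pem_b)


if __name__ == "__main__":
    # Placeholder names: the load-balanced ADFS name and the ADFS server itself.
    via_lb = presented_certificate("adfs.example.com")
    direct = presented_certificate("adfs01.example.local")
    print("same certificate:", same_certificate(via_lb, direct))
```

If the fingerprints differ, the client is talking TLS to the load balancer and not end to end to ADFS, which is the situation the Microsoft FAQ warns about.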

Now, when presented with the vCenter logon page, if you enter an account from the federated domain it will redirect you to the ADFS logon page.

Hope it helps!

Notes from the lab: Migrating Windows vCenter to VCSA 7

In my lab environment I was running Windows vCenter 6.7 and with the release of vCenter 7 a migration is needed because there is no Windows vCenter anymore.

The following articles give you enough information on how the process works, especially the how-to from Vladan Seget:
https://docs.vmware.com/en/VMware-vSphere/6.5/com.vmware.vsphere.upgrade.doc/GUID-9A117817-B78D-4BBE-A957-982C734F7C5F.html
https://www.starwindsoftware.com/blog/how-to-migrate-vmware-vcenter-from-windows-to-vcsa-6-7-update-1

Basically the process is the same for vCenter 7, with one issue in my case.

On the first try my migration failed at the stage where the migration assistant shuts down the Windows vCenter and the VCSA 7 is brought up with the original IP, resulting in an IP conflict and an upgrade left in a broken state. I followed the process and saw that Windows Server 2016 in particular has the annoying habit of delaying the shutdown for minutes; this is a known issue that happens from time to time and, to my knowledge, is related to installed updates (even after multiple reboots).

So the migration assistant does its job, completes the migration and shuts down the VM, except in my case. The resolution is that, when the script has completed and Windows says it’s still shutting down, you do a hard power-off; after that the migration moves on as expected.

In summary:
1. Make sure Windows is up to date with all the updates and there are none outstanding
2. Make sure vCenter is up to date, in my case it was the latest update of 6.7
3. Make a snapshot before you start 😊 saves a lot of hassle later on
4. Force kill the Windows vCenter when it’s shutting down after the migration assistant completes

After this the migration in my case completed flawlessly.
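If you want to keep an eye on step 4 instead of staring at the console, a small watchdog like the Python sketch below can tell you when the old Windows vCenter has outlived its grace period. The host name, port and timings are assumptions for illustration; the hard power-off itself is still something you do by hand in the vSphere client.

```python
import socket
import time


def is_reachable(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """True when a TCP connection to host:port can still be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_down(probe, grace_seconds: float = 300.0, interval: float = 5.0,
                    sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll probe() until it reports the host down.

    Returns True when the host went down within the grace period and
    False when the grace period passed while it was still answering.
    """
    deadline = clock() + grace_seconds
    while clock() < deadline:
        if not probe():
            return True
        sleep(interval)
    return False


if __name__ == "__main__":
    # Placeholder name for the old Windows vCenter.
    went_down = wait_until_down(lambda: is_reachable("vcenter-old.example.local"))
    print("shut down cleanly" if went_down else "still up: time for a hard power-off")
```

Keep in mind that a hanging Windows shutdown may stop answering on the network while the VM is still powered on, so treat the result as a hint and check the power state as well.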

Hope it helps!

Notes from the field: Windows 2019 Storage Replica lock-up on VMware

On one of my latest projects, a new Windows Server 2019 setup on VMware using Storage Replica in a server-to-server configuration to replicate home drives and profiles, I came across a random lock-up of the VM and, as a result, inaccessible shares.

The setup was all working until the failover part. It seems there is a delay of some sort: the failover isn’t instant, or takes a while to become active, with the server being unresponsive and dropping any form of management connection to the VM in question (VMware Tools stop responding as well, and console login does not work during this failover time). When I tried the Storage Replica failover again, I got a BSOD on the VM stating: HAL INITIALIZATION FAILED

I had tried all of this in a separate test setup and had it working without any problems on Server 2016 and Server 2019. Only this time it gave me this strange behavior. The differences: my own setup is on HW level 14 and the new one on HW level 15, and the hosts here run 6.7 build 13981272 while my own setup runs 6.7 build 14320388 (older builds have also worked fine for me).

After some troubleshooting and providing the BSOD dump findings to VMware GSS support, it became clear that version 10341 of VMware Tools was the troublemaker. The solution was to upgrade to the latest VMware Tools, version 10346. The vmss2core tool provided me with the means of creating a dump file from the VM in question.
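If you have an export of VM names and their VMware Tools version numbers, a few lines of Python are enough to list the VMs that still run the troublesome version. The inventory below is made up; only the two version numbers come from this case, and other versions are deliberately not judged.

```python
KNOWN_BAD_TOOLS_VERSIONS = {10341}   # the version GSS pointed at in this case
FIXED_TOOLS_VERSION = 10346          # the version that resolved it here


def vms_needing_upgrade(inventory: dict) -> list:
    """Return the names of VMs whose tools version is a known bad one."""
    return sorted(name for name, version in inventory.items()
                  if version in KNOWN_BAD_TOOLS_VERSIONS)


if __name__ == "__main__":
    # Hypothetical inventory: VM name -> VMware Tools version number.
    inventory = {"fs01": 10341, "fs02": 10346, "app01": 10341}
    for name in vms_needing_upgrade(inventory):
        print(f"{name}: upgrade VMware Tools to {FIXED_TOOLS_VERSION}")
```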

Notes from the field: Hyper-V to VMware migrated VM’s cannot install VMware Tools

In one of my last projects I needed to convert Hyper-V VMs to VMware. This all went fine using the offline capability of vCenter Converter, and the migration succeeded. Only when trying to install VMware Tools, the installation would hang on starting the VGAuth service and several other dependencies. For reference, the VMs in question were a mixture of 2008 R2 and 2012 R2. After some troubleshooting and searching the knowledge base I stumbled across this article: https://kb.vmware.com/s/article/55798

For this project I didn’t have approval to patch the servers, as that was out of scope, so the mitigation was to install an older version of VMware Tools (10.2.5 to be exact); after that the tools installed fine.

On a side note, when finalizing the converted VM don’t forget to delete the hidden old Hyper-V network adapter; it can still cause conflicts if not removed.

Notes from the lab: Citrix ADC Native OTP and AdminSDHolder

While doing some lab work I came across an issue where Domain Admin accounts could not register on the manageotp site while Domain Users could. This got me digging.

For the use of Native OTP on the ADC we need a bind account for Active Directory that has the appropriate write permissions on the userParameters attribute of the users.

When we delegate control of exactly that write permission on userParameters, everything is fine for normal users, but administrator accounts won’t work. When we use a service account with full-blown domain administrator permissions as the bind account, it does work.

After some researching I came across this old article which explained the behavior:
https://support.microsoft.com/nl-nl/help/817433/delegated-permissions-are-not-available-and-inheritance-is-automatical

Long story short: if a user is also a member of a highly privileged group, the AdminSDHolder protection will prevent the delegated permissions from applying. There is a way to enable inheritance, but this is generally not recommended as it opens up a whole lot of extra security risks.
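To predict which accounts will run into this, you can check their group membership against the groups that AdminSDHolder protects by default. The Python sketch below does that on plain data; it assumes you have already retrieved the (nested) group names and the adminCount attribute for the user, and the group list reflects the default protected groups, which can be tuned per forest.

```python
# Default groups whose members are protected by AdminSDHolder.
PROTECTED_GROUPS = {
    "account operators", "administrators", "backup operators",
    "domain admins", "domain controllers", "enterprise admins",
    "print operators", "read-only domain controllers", "replicator",
    "schema admins", "server operators",
}


def is_adminsdholder_protected(member_of, admin_count=None) -> bool:
    """True when delegated permissions on this user are likely to be reset.

    member_of   : iterable of group names the user belongs to (nested included)
    admin_count : value of the user's adminCount attribute, if known
    """
    if admin_count == 1:
        return True
    return any(group.lower() in PROTECTED_GROUPS for group in member_of)


if __name__ == "__main__":
    print(is_adminsdholder_protected(["Domain Users"]))                   # normal user
    print(is_adminsdholder_protected(["Domain Users", "Domain Admins"]))  # admin account
```

Accounts for which this returns True are the ones where a delegated write permission on userParameters will not stick.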

If it isn’t needed for administrator accounts, just delegate control of the needed permissions; otherwise use a bind account with domain admin permissions.

For some in-depth knowledge of AdminSDHolder and its workings, see the following article:
https://www.petri.com/active-directory-security-understanding-adminsdholder-object

Notes from the field: uberAgent to the rescue!!

We all know it: the once-in-a-while “it’s slow logging on..” that then gets dropped at the escalation desk for a resolution. So I got the call to troubleshoot this issue. Since I knew from previous experience that uberAgent is the troubleshooting tool you want for this, I contacted them and requested the consulting license at https://uberagent.com/ (thanks to Helge Klein), installed Splunk and uberAgent, and got myself a monitoring baseline to work with. A little background on the setup:

  • vSphere 6.0
  • XenDesktop 7.15 / MCS – Windows 8.1 & Windows Server 2012 R2
  • RES WorkspaceManager 10.1.300.1

The problem was that at times users would have a profile initialization of 90 seconds (!), and at times the user shell would hang.

After a period of two weeks I had my baseline with uberAgent and found that this happened at random moments, either very early at the start of the day or just after break time. There was no funny business whatsoever in the environment and no lack of resources, e.g. IOPS or CPU/memory exhaustion. Drilling down into some user trending with uberAgent, I arrived at a somewhat recurring user base that experienced the issue. OK, that helps! After that I could reproduce it with the user accounts in question, displaying the following screen:

[screenshot]

I dropped this in the resrockstars.slack.com group and got a quick reply from Dennis van Dam regarding traceviewer, and came to the following:

[screenshot]

This in turn pointed me to the following support article:

Problem resolved and a happy customer! Hope this helps you out as well.

Reference article:

HOWTO: Create a trace file

Notes from the lab: Exchange Server 2016 CU6 broken by default??

I came across the most peculiar issue I’ve seen so far with Exchange 2016.
Installed a greenfield setup and the ECP/OWA page was broken by default with the following entry in event viewer:
——————————————————————————————————————————————————–
Event code: 3005
Event message: An unhandled exception has occurred.
Event time: 9-9-2017 22:26:57
Event time (UTC): 9-9-2017 20:26:57
Event ID: 53b3f1166cb147408cb97bc79483c3f5
Event sequence: 2
Event occurrence: 1
Event detail code: 0

Application information:
Application domain: /LM/W3SVC/2/ROOT/owa-4-131494624100042355
Trust level: Full
Application Virtual Path: /owa
Application Path: C:\Program Files\Microsoft\Exchange Server\V15\ClientAccess\owa\
Machine name: EX01

Process information:
Process ID: 7756
Process name: w3wp.exe
Account name: NT AUTHORITY\SYSTEM

Exception information:
Exception type: TargetInvocationException
Exception message: Exception has been thrown by the target of an invocation.
at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
at System.Reflection.RuntimeMethodInfo.UnsafeInvokeInternal(Object obj, Object[] parameters, Object[] arguments)
at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
at Owin.Loader.DefaultLoader.<>c__DisplayClass12.<MakeDelegate>b__b(IAppBuilder builder)
at Owin.Loader.DefaultLoader.<>c__DisplayClass1.<LoadImplementation>b__0(IAppBuilder builder)
at Microsoft.Owin.Host.SystemWeb.OwinAppContext.Initialize(Action`1 startup)
at Microsoft.Owin.Host.SystemWeb.OwinBuilder.Build(Action`1 startup)
at Microsoft.Owin.Host.SystemWeb.OwinHttpModule.InitializeBlueprint()
at System.Threading.LazyInitializer.EnsureInitializedCore[T](T& target, Boolean& initialized, Object& syncLock, Func`1 valueFactory)
at Microsoft.Owin.Host.SystemWeb.OwinHttpModule.Init(HttpApplication context)
at System.Web.HttpApplication.RegisterEventSubscriptionsWithIIS(IntPtr appContext, HttpContext context, MethodInfo[] handlers)
at System.Web.HttpApplication.InitSpecial(HttpApplicationState state, MethodInfo[] handlers, IntPtr appContext, HttpContext context)
at System.Web.HttpApplicationFactory.GetSpecialApplicationInstance(IntPtr appContext, HttpContext context)
at System.Web.Hosting.PipelineRuntime.InitializeApplication(IntPtr appContext)

Encryption certificate is absent
at Microsoft.Exchange.Security.Authentication.Utility.GetCertificates()
at Microsoft.Exchange.Clients.Owa2.Server.Core.notifications.SignalR.SignalRStartup.Configuration(IAppBuilder app)

Request information:
Request URL: https://localhost:444/owa/exhealth.check
Request path: /owa/exhealth.check
User host address: 127.0.0.1
User:
Is authenticated: False
Authentication Type:
Thread account name: NT AUTHORITY\SYSTEM

Thread information:
Thread ID: 25
Thread account name: NT AUTHORITY\SYSTEM
Is impersonating: False
Stack trace: at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
at System.Reflection.RuntimeMethodInfo.UnsafeInvokeInternal(Object obj, Object[] parameters, Object[] arguments)
at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
at Owin.Loader.DefaultLoader.<>c__DisplayClass12.<MakeDelegate>b__b(IAppBuilder builder)
at Owin.Loader.DefaultLoader.<>c__DisplayClass1.<LoadImplementation>b__0(IAppBuilder builder)
at Microsoft.Owin.Host.SystemWeb.OwinAppContext.Initialize(Action`1 startup)
at Microsoft.Owin.Host.SystemWeb.OwinBuilder.Build(Action`1 startup)
at Microsoft.Owin.Host.SystemWeb.OwinHttpModule.InitializeBlueprint()
at System.Threading.LazyInitializer.EnsureInitializedCore[T](T& target, Boolean& initialized, Object& syncLock, Func`1 valueFactory)
at Microsoft.Owin.Host.SystemWeb.OwinHttpModule.Init(HttpApplication context)
at System.Web.HttpApplication.RegisterEventSubscriptionsWithIIS(IntPtr appContext, HttpContext context, MethodInfo[] handlers)
at System.Web.HttpApplication.InitSpecial(HttpApplicationState state, MethodInfo[] handlers, IntPtr appContext, HttpContext context)
at System.Web.HttpApplicationFactory.GetSpecialApplicationInstance(IntPtr appContext, HttpContext context)
at System.Web.Hosting.PipelineRuntime.InitializeApplication(IntPtr appContext)

Custom event details:
———————————————————————————————————————————————————
After some digging I came across these blogs: https://justaucguy.wordpress.com/2014/12/01/exchange-2013-cu6-owa-something-went-wrong/ and https://blogs.technet.microsoft.com/rmilne/2017/06/27/exchange-2016-cu6-released/
The first one mentions replacing the SharedWebConfig, which wasn’t my error, but I tried it anyway without any change. The other one got me thinking about certificates… okay, I checked them via the Exchange Management Shell and found no issues there either.
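The event above buries the actual cause between the stack frames, so it is easy to read past it. As a small aid, this Python sketch pulls the non-frame lines out of the “Exception information” section of such an event text; the function name is made up and the parsing assumes the layout shown in the event above.

```python
def exception_messages(event_text: str) -> list:
    """Return the non-frame lines from the 'Exception information' section."""
    messages, in_section = [], False
    for raw in event_text.splitlines():
        line = raw.strip()
        if line.startswith("Exception information:"):
            in_section = True
            continue
        if in_section and line.endswith("information:"):
            break  # the next section, e.g. "Request information:", starts here
        if in_section and line and not line.startswith("at "):
            messages.append(line)
    return messages


if __name__ == "__main__":
    # Shortened excerpt of the event shown above.
    sample = """Exception information:
Exception type: TargetInvocationException
Exception message: Exception has been thrown by the target of an invocation.
at System.RuntimeMethodHandle.InvokeMethod(Object target)

Encryption certificate is absent
at Microsoft.Exchange.Security.Authentication.Utility.GetCertificates()

Request information:
Request URL: https://localhost:444/owa/exhealth.check
"""
    for message in exception_messages(sample):
        print(message)
```

The last line it returns for this event is “Encryption certificate is absent”, which is the hint that mattered here.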

Finally I found the bugger in IIS: it appears that a wrong certificate got bound at installation (yes, two clean servers and even some re-runs in other lab setups gave me the same result). The solution was to unbind the certificate it had, bind the Microsoft Exchange Server Auth Certificate instead and do an iisreset.

In my case the problem was solved instantly. (The second blog above mentions that in an upgrade scenario the Microsoft Exchange Server Auth Certificate could get deleted, so beware!)

See the following reference regarding the binding in IIS:

Hope this helps!