The purpose of this blog is to highlight how different platforms respond to ARP requests and to explore some strange default operations on Juniper IVE VPN platforms. This quirk was found during a datacentre migration, during which the top-of-rack/first-hop device changed from a Cisco IOS 6500 environment to a Nexus Switching environment. The general setup looks like this and follows an example customer with a Shared IVS setup:
In order to understand this scenario, it’s important to know what the Juniper IVE platform is and how it provides its VPN services. To that end, I’ll give a brief overview of the platform before looking at the quirk.
The Juniper 6500 IVE (Instant Virtual Extranet) platform is a physical appliance that offers customers a VPN solution linking to their MPLS network. Once connected, a home worker is on their corporate MPLS network just as if they were at a Branch Office.
(In order to avoid confusion between the Juniper 6500 IVE and the Cisco 6500 L3 switch – which also plays an important role in this setup but is a very different kind of device – I will just use the term IVE to refer to the Juniper platform.)
As you can see from the diagram above, an IVE appliance has an external port and an internal port.
The external port, as its name implies, is typically assigned a public IP address. It also has virtual ports, which are analogous to sub-interfaces, each with their own IPs. Each of these virtual ports links to an individual customer’s VPN platform, or to a shared VPN platform that holds multiple customer solutions. A common design involves placing a firewall between the external interface and the internet. This allows the virtual interfaces to share the same subnet as the main external interface. Customer public IPs are destination NAT’d inbound (or MIP’d if you’re using a Juniper firewall) to their corresponding virtual IPs.
The internal port similarly services multiple customers. This port can be thought of as a trunk port, whereby each VLAN links to an individual customer’s VRF, typically with an SVI as the gateway – sometimes used with HSRP or another FHRP.
Customers can have either a Shared or Dedicated VPN solution. These solutions are called IVS’s (or Instant Virtual Systems). You can have multiple IVS’s on a single IVE appliance.
Shared IVS Solutions represent a single multi-tenant IVS. Basically, multiple customers connect to the same IVS and are segmented by allocating them different sign-in pages and connection policies. Options are more limited than having a Dedicated IVS but can be more cost effective.
Dedicated IVS solutions give customers more flexibility. They can have more connected users and added customisation such as 2FA and multiple realms.
When an IVS is created it needs to link to the internal port. To do this, one or more VLANs can be assigned. If the platform is Dedicated, only a single VLAN needs to be assigned – namely that of the customer. This VLAN will link to an SVI in the customer’s VRF. If the platform is Shared, multiple VLANs are assigned – one per customer. In this case, however, a default VLAN also needs to be assigned for when the IVS has to communicate on a network that is independent of any of its individual customers. Typically the Shared Authentication VLAN is used for this.
But what is the Shared Authentication VLAN? This leads to the next part of the setup… how users authenticate.
When a VPN user logs in from home, the credentials they enter on the sign-in page need to be… well… authenticated. Much like the IVS solutions themselves, there are both Shared and Dedicated options.
Customers can have their own LDAP or RADIUS servers within their MPLS networks. In this case the IVE will make a request to these servers when a user connects. This is called Dedicated Authentication.
Alternatively, the Service Provider can offer a Shared Authentication solution. This relieves the customer of having to build and maintain their own LDAP servers by utilising a multi-tenant platform managed by the Provider. The customer supplies the user details, and the Service Provider handles the rest.
Shared Authentication is typically used for Shared IVS’s. In order to connect to the Shared Authentication Server, a Shared IVS will allocate a VLAN – alongside all of its customer VLANs – on the internal trunk port. This links to the Provider’s network (for example an internal VRF or VLAN) where the Shared Authentication servers reside. It is this VLAN that is assigned as the default VLAN for the Shared IVS.
The below screenshot is taken from the Web UI of the IVE platform. It shows some of the configuration for a Shared IVS (namely IVS123). It uses a default VLAN called Shared_Auth_Network as noted by the asterisk in the bottom right table:
We’re nearly ready to look at the quirk. There is just one last thing to note regarding how a Shared IVS Platform, like IVS123, communicates with one of its customers’ Authentication Servers.
Here is the key sentence to remember: When a Shared IVS platform communicates with any authentication server (shared or dedicated), it will use its Shared Auth VLAN IP as the source address in the IP packet.
This behaviour seems very counterintuitive and I’m not sure why the IVS wouldn’t use the source IP of the VLAN for that customer IVS.
Whatever the reason for this behaviour, the result is that when a Shared IVS Platform talks to one of its customers’ Dedicated authentication servers, its packets carry a source IP from the Shared Auth VLAN. But such a customer isn’t using Shared Auth. Their network doesn’t know or care about the Shared Auth environment. So when their Dedicated LDAP server receives an authentication request from the IVE, it sees a source IP address from the Shared Auth VLAN, a subnet that its own network has no route back to.
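To make the key sentence concrete, here is a minimal Python sketch of the source address selection as I observed it. It is a toy model of the behaviour, not Juniper's implementation: the class and field names are hypothetical, and so is the IVS address 172.16.20.34 on the customer VLAN. 10.10.10.10 is the Shared Auth VLAN IP used in the example later in this post.

```python
from dataclasses import dataclass, field


@dataclass
class SharedIVS:
    """Toy model of a Shared IVS: one default VLAN plus one VLAN per customer."""
    name: str
    default_vlan_ip: str  # IVS address on the Shared Auth (default) VLAN
    customer_vlan_ips: dict = field(default_factory=dict)  # customer -> IVS IP

    def auth_source_ip(self, customer: str) -> str:
        """Source IP for an authentication request made on behalf of `customer`.

        As observed, the default (Shared Auth) VLAN IP is always used,
        even when the target is the customer's own Dedicated auth server.
        """
        if customer not in self.customer_vlan_ips:
            raise KeyError(f"{customer} is not configured on {self.name}")
        return self.default_vlan_ip


ivs = SharedIVS(
    name="IVS123",
    default_vlan_ip="10.10.10.10",
    customer_vlan_ips={"CUST_A": "172.16.20.34"},  # hypothetical address
)
print(ivs.auth_source_ip("CUST_A"))  # 10.10.10.10, not 172.16.20.34
```

The point of the model is that the customer's own VLAN address never appears as the source, which is what makes the return path a problem.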
The solution, however, is easy enough (barring any IP overlaps)… A redistributed static route is placed into the customer’s VRF, pointing any traffic for the Shared Auth subnet back to the internal port of the IVE.
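The effect of that static route can be sketched as a longest-prefix-match lookup in the customer VRF. This is only an illustration under assumptions: the `lookup` helper, the next-hop labels, the IVE address 172.16.20.34 and the use of a /32 host route are all hypothetical, chosen to match the example addresses used elsewhere in this post.

```python
import ipaddress


def lookup(routes, destination):
    """Return the next hop of the longest matching prefix, or None if nothing matches."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


IVE_INTERNAL = "172.16.20.34"  # hypothetical IVE address on the customer VLAN

# Customer VRF before the fix: no knowledge of the Shared Auth range.
vrf_before = [
    (ipaddress.ip_network("172.16.20.32/29"), "connected"),
    (ipaddress.ip_network("192.168.10.0/24"), "mpls-core"),
]

# After the fix: a static route (modelled here as a /32) points back at the IVE.
vrf_after = vrf_before + [
    (ipaddress.ip_network("10.10.10.10/32"), IVE_INTERNAL),
]

print(lookup(vrf_before, "10.10.10.10"))  # None: the reply has nowhere to go
print(lookup(vrf_after, "10.10.10.10"))   # 172.16.20.34
```

Without the extra route, the LDAP server's reply to 10.10.10.10 has no path back; with it, the reply is steered to the IVE's internal port.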
To understand this better, let’s take a look at a diagram of the setup as a user attempts to connect:
Now we are equipped to investigate the quirk, which looks at a customer on a Shared IVS platform, but with Dedicated LDAP Authentication Servers.
As mentioned earlier, this quirk follows a migration of an IVE platform from an environment using Cisco IOS 6500 switches to an environment using Cisco Nexus switches.
In both environments, trunk ports connect to the internal IVE ports with SVIs acting as gateways. The difference comes in the control and data planes that were used. The original IOS environment was a standard MPLS L3VPN network. The Nexus environment was part of a hierarchical VxLAN DC Fabric. Leaf switches connected directly to the IVEs and implemented anycast gateway on the SVIs. Prefix and MAC information was communicated over the EVPN BGP address family and ASR9k DCIs acted as border-leaves terminating the VTEPs, which were then stitched into the MPLS core.
The key difference, however, isn’t in the overlays or dataplane protocols being used. The key is how each ToR device responds to ARP…
Once the move was completed and the IVE was connected to the Nexus switches, everything seemed fine at first glance. Users with Dedicated IVS’s worked. Users on Shared IVS’s who utilised the Shared Auth server could also log in and authenticate correctly. However, a problem was found when checking any customer who had a VPN solution configured on a Shared IVS platform with Dedicated Authentication. Despite the customer login page showing up (implying that the public-facing external side was working), authentication requests to their Dedicated Auth Servers were failing.
The Web UI output below shows a test connection to our example customer’s LDAP server at 192.168.10.10.
As we searched for a solution to this problem, we had to keep in mind how a Shared IVS Platform makes Auth Server requests…
Focusing on just one of the customers on the Shared platform, we first checked how far a trace would get from the IVE to the Dedicated Auth Server. We found pretty quickly that the trace would not even reach the first hop – that is, the anycast gateway IP that was on the SVI of the Nexus leaf switch.
However when checking from the Nexus, both routing and tracing, we saw we could reach the Dedicated Auth Server fine – as long as we sourced from the right VRF.
nexus1# sh ip route vrf CUST_A | b 192.168.10.10 | head lines 5
192.168.10.0/24, ubest/mbest: 2/0
    *via 172.16.24.34 %default, [20/0], 7w2d, bgp-65000, external, tag 500 (evpn) segid: 12345 tunnelid: 0xc39dfe04 encap: VXLAN
    *via 172.16.24.33 %default, [20/0], 7w2d, bgp-65000, external, tag 500 (evpn) segid: 12345 tunnelid: 0xc39dfe05 encap: VXLAN

nexus1# traceroute 192.168.10.10 vrf CUST_A
traceroute to 192.168.10.10 (192.168.10.10), 30 hops max, 40 byte packets
 1  172.16.24.33 (172.16.24.33)  1.455 ms  1.129 ms  1.022 ms
 2  172.16.20.54 (172.16.20.54)  6.967 ms  6.928 ms  6.64 ms
 3  10.11.2.3 (10.11.2.3)  8.002 ms  7.437 ms  7.92 ms
 4  10.24.4.1 (10.24.4.1)  6.789 ms  6.683 ms  6.764 ms
 5  * * *
 6  192.168.10.10 (192.168.10.10)  12.374 ms  0.704 ms  0.62 ms
This led us to check the Layer 2 between the switch and the IVE. We did this by checking the ARP table entries on the IVE. We immediately found that there were no ARP entries to be found for the ToR SVI for any customer on a Shared Platform with a Dedicated Authentication setup.
The output below shows the ARP table as seen from the console of the IVE. Note the incomplete ARP entry for 172.16.20.33, the SVI on the Nexus for our example customer.
(As a quick aside, you may notice that the HWAddress of the Nexus is showing as 11:11:22:22:33:33. This is due to the fabric forwarding anycast-gateway-mac 1111.2222.3333 command being configured.)
Please choose from among the following options:
   1. View/Set IP/Netmask/Gateway/DNS/WINS Settings
   2. Print Routing Table
   3. Print ARP Cache
   4. Clear ARP Cache
   5. Ping to a Server
   6. Trace route to a Server
   7. Remove Routes
   8. Add ARP entry
   9. View cluster status
  10. Configure Management port (Enabled)
Choice: 3
Address          HWtype  HWaddress          Flags Mask  Iface
172.16.31.1      ether   11:11:22:22:33:33  C           int0.2387
10.101.23.4      ether   11:11:22:22:33:33  C           int0.1298
192.168.77.1     ether   11:11:22:22:33:33  C           int0.2347
172.16.20.33             (incomplete)                   int0.
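If you need to check many customer VLANs at once, a few lines of Python can pick the unresolved gateways out of a pasted ARP cache listing. This is a rough sketch that assumes the column layout shown in the console output above; the `incomplete_entries` function name is my own.

```python
# ARP cache rows copied from the IVE console listing above.
ARP_OUTPUT = """\
Address HWtype HWaddress Flags Mask Iface
172.16.31.1 ether 11:11:22:22:33:33 C int0.2387
10.101.23.4 ether 11:11:22:22:33:33 C int0.1298
192.168.77.1 ether 11:11:22:22:33:33 C int0.2347
172.16.20.33 (incomplete) int0.
"""


def incomplete_entries(arp_text):
    """Return the IP addresses whose ARP entry is marked (incomplete)."""
    unresolved = []
    for line in arp_text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if not fields or fields[0] == "Address":
            continue
        if "(incomplete)" in fields:
            unresolved.append(fields[0])
    return unresolved


print(incomplete_entries(ARP_OUTPUT))  # ['172.16.20.33']
```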
So there is no ARP entry. But logically this appears to be more or less the same layer 2 segment as when the IVE was connected to the 6500. So what gives?
It turns out that 6500s and Nexus switches respond to ARP requests in different ways. The process on the 6500 is fairly standard and works as follows:
But a Nexus will not respond to an ARP request if the source IP is from a subnet that it doesn’t recognise:
In our example case, the ARP request from the IVS carries a source IP of 10.10.10.10, its Shared Auth VLAN address. The Nexus switch does not recognise 10.10.10.10 as a valid source IP for the receiving interface (which has IP 172.16.20.33). It sees it as off-net. We could also see the ARP check failing by using debug ip arp packet on the switch.
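The difference between the two platforms can be modelled in a few lines of Python. This is a sketch of the behaviour as described in this post, not of either vendor's actual code: the function names are hypothetical, and the 6500 model assumes it answers any request for an address it owns without inspecting the sender.

```python
import ipaddress


def cisco6500_replies(target_ip, sender_ip, interface_addrs):
    """6500-style model: reply whenever the requested IP is configured on the interface."""
    target = ipaddress.ip_address(target_ip)
    return any(ipaddress.ip_interface(a).ip == target for a in interface_addrs)


def nexus_replies(target_ip, sender_ip, interface_addrs):
    """Nexus-style model: the target must be ours AND the sender IP must sit
    inside a subnet configured on the receiving interface."""
    ifaces = [ipaddress.ip_interface(a) for a in interface_addrs]
    target = ipaddress.ip_address(target_ip)
    sender = ipaddress.ip_address(sender_ip)
    owns_target = any(i.ip == target for i in ifaces)
    sender_on_net = any(sender in i.network for i in ifaces)
    return owns_target and sender_on_net


svi = ["172.16.20.33/29"]  # the customer SVI from the example

print(cisco6500_replies("172.16.20.33", "10.10.10.10", svi))  # True
print(nexus_replies("172.16.20.33", "10.10.10.10", svi))      # False
```

The same ARP request gets an answer from one model and silence from the other, which matches the incomplete entry seen on the IVE after the migration.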
So what’s the solution? There are a couple of ways to tackle this. We could add a static ARP entry on the IVE, but this could be cumbersome if we needed to add it for each Shared IVS. Alternatively, we could add a secondary IP to the SVI that covers the Shared Auth source address…
Adding a secondary IP is fairly straightforward. The config would be as follows:
nexus1# sh run interface vlan 2301
!
interface Vlan2301
  description Customer_A
  no shutdown
  bandwidth 2000
  vrf member CUST_A
  no ip redirects
  ip address 172.16.20.33/29
  ip address 10.10.10.11/31 secondary
  fabric forwarding mode anycast-gateway
A /31 works well in this case, encompassing only the IPs that are needed (namely 10.10.10.10 and 10.10.10.11). This allows the ARP request to pass the aforementioned check that the Nexus performs. From here the MAC entries began to show up and connectivity to the Dedicated Auth Server began to work.
Please choose from among the following options:
   1. View/Set IP/Netmask/Gateway/DNS/WINS Settings
   2. Print Routing Table
   3. Print ARP Cache
   4. Clear ARP Cache
   5. Ping to a Server
   6. Trace route to a Server
   7. Remove Routes
   8. Add ARP entry
   9. View cluster status
  10. Configure Management port (Enabled)
Choice: 3
Address          HWtype  HWaddress          Flags Mask  Iface
172.16.31.1      ether   11:11:22:22:33:33  C           int0.2387
10.101.23.4      ether   11:11:22:22:33:33  C           int0.1298
192.168.77.1     ether   11:11:22:22:33:33  C           int0.2347
172.16.20.33     ether   11:11:22:22:33:33  C           int0.2301
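As a quick sanity check on the /31, Python's standard ipaddress module confirms that the secondary address covers exactly the two IPs in question, and that the IVS source address is on-net for the secondary but not for the primary. The variable names are just for illustration.

```python
import ipaddress

primary = ipaddress.ip_interface("172.16.20.33/29")
secondary = ipaddress.ip_interface("10.10.10.11/31")
sender = ipaddress.ip_address("10.10.10.10")  # Shared Auth source IP of the IVS

# The /31 contains exactly two addresses.
covered = [str(addr) for addr in secondary.network]
print(secondary.network)  # 10.10.10.10/31
print(covered)            # ['10.10.10.10', '10.10.10.11']

# Off-net for the primary subnet, on-net once the secondary is added.
print(sender in primary.network)    # False
print(sender in secondary.network)  # True
```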
So this raises the question of whether or not this behaviour is desired. Should a device responding to an ARP request check the source IP? I tend to lean in favour of this type of behaviour. It adds extra security and besides, it’s actually the behaviour of the IVE that is strange in this case. One would think that the IVS would use the source IP of the connecting customer’s subnet, instead of that of the Shared Auth VLAN. The behaviour certainly is unorthodox, but finding a solution to this problem highlights some of the interesting scenarios that can arise when working with different vendors and operating systems.
I hope you’ve enjoyed the read. I’m always open to alternate ideas or general discussion so if you have any thoughts, let me know.