IVE ARP’d on for too long

The purpose of this blog is to highlight how different platforms respond to ARP requests and to explore some strange default behaviour on the Juniper IVE VPN platform. The quirk was found during a datacentre migration, in which the top-of-rack/first-hop device changed from a Cisco Catalyst 6500 (IOS) environment to a Nexus switching environment. The general setup looks like this, following an example customer with a Shared IVS setup:

blog10_diagram1_setup

In order to understand this scenario, it’s important to know what the Juniper IVE platform is and how it provides its VPN services.  To that end, I’ll give a brief overview of the platform before looking at the quirk.

IVE Platform

The Juniper 6500 IVE (Instant Virtual Extranet) platform is a physical appliance that offers customers an SSL VPN solution linking into their MPLS network. Once connected, a home worker sits on their corporate MPLS network just as if they were at a branch office.

(To avoid confusion between the Juniper 6500 IVE and the Cisco 6500 L3 switch, which also plays an important role in this setup but is a very different kind of device, I will simply use the term IVE to refer to the Juniper platform.)

IVE Ports

As you can see from the diagram above, an IVE appliance has an external port and an internal port.

The external port, as its name implies, is typically assigned a public IP address. It also has virtual ports, analogous to sub-interfaces, each with their own IP. Each of these virtual ports links either to an individual customer's VPN platform or to a shared VPN platform that hosts multiple customer solutions. A common design places a firewall between the external interface and the internet, which allows the virtual interfaces to share the same subnet as the main external interface. Customer public IPs are destination NAT'd inbound (or MIP'd if you're using a Juniper firewall) to their corresponding virtual IPs.
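As a rough illustration (not taken from the actual environment), an IOS-style border device doing that inbound static NAT might look like the sketch below, with 203.0.113.50 as a hypothetical customer public IP and 172.31.0.50 as the corresponding virtual port IP on the IVE:

! Hypothetical addresses throughout - illustration only
interface GigabitEthernet0/0
 description Internet-facing
 ip nat outside
!
interface GigabitEthernet0/1
 description Towards the IVE external port
 ip nat inside
!
! Static NAT is bidirectional: inbound traffic to the customer public
! IP (203.0.113.50) is translated to their virtual port IP on the IVE
ip nat inside source static 172.31.0.50 203.0.113.50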

The internal port similarly services multiple customers. It can be thought of as a trunk port, where each VLAN links to an individual customer's VRF, typically with an SVI as the gateway, sometimes paired with HSRP or another FHRP.
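For a flavour of the original environment, the 6500 side of this might have looked something like the sketch below. The VLAN numbers are borrowed from the ARP tables later in this post; the trunk interface, SVI addressing and HSRP details are purely illustrative:

interface GigabitEthernet1/1
 description Trunk to IVE internal port
 switchport
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport trunk allowed vlan 1298,2301,2347,2387
!
interface Vlan2301
 description Customer_A
 ip vrf forwarding CUST_A
 ip address 172.16.20.35 255.255.255.248
 standby 1 ip 172.16.20.33
 standby 1 priority 110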

Shared or Dedicated

Customers can have either a Shared or a Dedicated VPN solution. These solutions are called IVSs (Instant Virtual Systems), and you can have multiple IVSs on a single IVE appliance.

A Shared IVS is a single multi-tenant IVS: multiple customers connect to the same IVS and are segmented by being allocated different sign-in pages and connection policies. The options are more limited than with a Dedicated IVS, but it can be more cost-effective.

Dedicated IVS solutions give customers more flexibility. They can have more connected users and added customisation such as 2FA and multiple realms.

When an IVS is created it needs to link to the internal port. To do this, one or more VLANs are assigned. If the platform is Dedicated, only a single VLAN needs to be assigned, namely that of the customer; this VLAN links to an SVI in the customer's VRF. If the platform is Shared, multiple VLANs are assigned, one per customer. In this case, however, a default VLAN also needs to be assigned for when the IVS needs to communicate on a network that is independent of any of its individual customers. Typically the Shared Authentication VLAN is used for this.

But what is the Shared Authentication VLAN? This leads to the next part of the setup… how users authenticate.

Authentication

When a VPN user logs in from home, the credentials they enter on the sign-in page need to be… well… authenticated. Much like the IVS solutions themselves, there are both Shared and Dedicated options.

Customers can have their own LDAP or RADIUS servers within their MPLS networks, in which case the IVE makes a request to that server when a user connects. This is called Dedicated Authentication.

Alternatively, the Service Provider can offer a Shared Authentication solution. This saves the customer from having to build and maintain their own LDAP servers by utilising a multi-tenant platform managed by the Provider. The customer supplies the user details, and the Service Provider handles the rest.

Shared Authentication is typically used for Shared IVSs. In order to connect to the Shared Authentication Server, a Shared IVS allocates a VLAN, alongside all of its customer VLANs, on the internal trunk port. This links to the Provider's network (for example an internal VRF or VLAN) where the Shared Authentication servers reside. It is this VLAN that is assigned as the default VLAN for the Shared IVS.
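On the switch side there is nothing special about this default VLAN: it is just one more tagged VLAN on the internal trunk, terminating in a provider-side VRF. A minimal IOS-style sketch, with the VLAN number and gateway address purely hypothetical (only the 10.10.10.x subnet reappears later in this post):

interface Vlan2399
 description Shared_Auth_Network
 ip vrf forwarding SHARED_AUTH
 ip address 10.10.10.1 255.255.255.0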

The screenshot below is taken from the Web UI of the IVE platform. It shows some of the configuration for a Shared IVS (namely IVS123). It uses a default VLAN called Shared_Auth_Network, as denoted by the asterisk in the bottom-right table:

blog10_image1_default_vlan

We're nearly ready to look at the quirk. There is just one last thing to note: how a Shared IVS platform, like IVS123, communicates with one of its customers' Authentication Servers.

Here is the key sentence to remember: When a Shared IVS platform communicates with any authentication server (shared or dedicated), it will use its Shared Auth VLAN IP as the source address in the IP packet.

This behaviour seems counterintuitive, and I'm not sure why the IVS wouldn't instead use the IP of the VLAN belonging to that customer.

Whatever the reason, the result is that a Shared IVS platform communicating with one of its customers' Dedicated authentication servers sends packets with a source IP on the Shared Auth VLAN. But such a customer isn't using Shared Auth; their network doesn't know or care about the Shared Auth environment. So when their Dedicated LDAP server receives an authentication request from the IVE, it sees a source IP from this Shared Auth VLAN, and the customer's VRF has no route back to it.

The solution, however, is easy enough (barring any IP overlaps): the customer simply places a redistributed static route in their VRF, pointing traffic destined for the Shared Auth subnet back at the IVE's internal port.
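In IOS terms that might look like the sketch below. Here 10.10.10.10 is the IVE's Shared Auth VLAN IP (seen later in this post), 172.16.20.34 is an assumed address for the IVE's internal port on the customer VLAN, and the BGP AS number is taken from the route output further down:

! Return route for IVE packets sourced from the Shared Auth VLAN
ip route vrf CUST_A 10.10.10.10 255.255.255.255 172.16.20.34
!
router bgp 65000
 address-family ipv4 vrf CUST_A
  redistribute static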

To understand this better, let’s take a look at a diagram of the setup as a user attempts to connect:

blog10_diagram2_authentication

Now we are equipped to investigate the quirk, which involves a customer on a Shared IVS platform with Dedicated LDAP Authentication Servers.

The quirk

As mentioned earlier, this quirk follows a migration of an IVE platform from an environment using Cisco Catalyst 6500 (IOS) switches to one using Cisco Nexus switches.

In both environments, trunk ports connect to the internal IVE ports, with SVIs acting as gateways. The difference lies in the control and data planes. The original IOS environment was a standard MPLS L3VPN network. The Nexus environment was part of a hierarchical VXLAN DC fabric: leaf switches connected directly to the IVEs and implemented anycast gateway on the SVIs, prefix and MAC information was exchanged over the BGP EVPN address family, and ASR9k DCIs acted as border leaves terminating the VTEPs, which were then stitched into the MPLS core.
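For context, a heavily trimmed NX-OS sketch of the leaf-side building blocks is shown below. The VNI (12345) and AS number (65000) appear in the route output later in this post; everything else, including the loopback numbering, is illustrative:

nv overlay evpn
fabric forwarding anycast-gateway-mac 1111.2222.3333
!
vrf context CUST_A
  vni 12345
  rd auto
  address-family ipv4 unicast
    route-target both auto evpn
!
interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback1
  member vni 12345 associate-vrf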

The key difference, however, isn't in the overlays or data-plane protocols being used. The key is how each ToR device responds to ARP…

Once the move was completed and the IVE was connected to the Nexus switches, everything seemed fine at first glance. Users on Dedicated IVSs worked. Users on Shared IVSs who utilised the Shared Auth server could also log in and authenticate correctly. However, a problem appeared for any customer whose VPN solution sat on a Shared IVS platform with Dedicated Authentication. Despite the customer login page showing up (implying that the public-facing external side was working), authentication requests to their Dedicated Auth Servers were failing.

Below is the Web UI output of a test connection to our example customer's LDAP server at 192.168.10.10.

blog10_image2_ldap_failure

As we searched for a solution to this problem, we had to keep in mind how a Shared IVS Platform makes Auth Server requests…

The search

Focusing on just one of the customers on the Shared platform, we first checked how far a trace would get from the IVE to the Dedicated Auth Server. We found pretty quickly that the trace would not even reach the first hop, that is, the anycast gateway IP on the SVI of the Nexus leaf switch.

blog10_image3_trace_fail

However, when checking from the Nexus, both the routing table and a traceroute showed we could reach the Dedicated Auth Server fine, as long as we sourced from the right VRF.

nexus1# sh ip route vrf CUST_A | b 192.168.10.0 | head lines 5
192.168.10.0/24, ubest/mbest: 2/0
    *via 172.16.24.34 %default, [20/0], 7w2d, bgp-65000, external, tag 500
        (evpn) segid: 12345 tunnelid: 0xc39dfe04 encap: VXLAN
    *via 172.16.24.33 %default, [20/0], 7w2d, bgp-65000, external, tag 500
        (evpn) segid: 12345 tunnelid: 0xc39dfe05 encap: VXLAN

nexus1# traceroute 192.168.10.10 vrf CUST_A
traceroute to 192.168.10.10 (192.168.10.10), 30 hops max, 40 byte packets
1 172.16.24.33 (172.16.24.33) 1.455 ms 1.129 ms 1.022 ms
2 172.16.20.54 (172.16.20.54) 6.967 ms 6.928 ms 6.64 ms
3 10.11.2.3 (10.11.2.3) 8.002 ms 7.437 ms 7.92 ms
4 10.24.4.1 (10.24.4.1) 6.789 ms 6.683 ms 6.764 ms
5 * * *
6 192.168.10.10 (192.168.10.10) 12.374 ms 0.704 ms 0.62 ms

This led us to check the layer 2 between the switch and the IVE, starting with the ARP table entries on the IVE. We immediately found that there was no ARP entry for the ToR SVI for any customer on a Shared platform with a Dedicated Authentication setup.

The output below shows the ARP table as seen from the console of the IVE. Note the incomplete ARP entry for 172.16.20.33, the SVI on the Nexus for our example customer.

(As a quick aside, you may notice that the HWAddress of the Nexus is showing as 11:11:22:22:33:33. This is due to the fabric forwarding anycast-gateway-mac 1111.2222.3333 command being configured.)

Please choose from among the following options:
1. View/Set IP/Netmask/Gateway/DNS/WINS Settings
2. Print Routing Table
3. Print ARP Cache
4. Clear ARP Cache
5. Ping to a Server
6. Trace route to a Server
7. Remove Routes
8. Add ARP entry
9. View cluster status
10. Configure Management port (Enabled)

Choice: 3
Address       HWtype  HWaddress          Flags Mask  Iface
172.16.31.1   ether   11:11:22:22:33:33   C          int0.2387
10.101.23.4   ether   11:11:22:22:33:33   C          int0.1298
192.168.77.1  ether   11:11:22:22:33:33   C          int0.2347
172.16.20.33          (incomplete)                   int0.2301

So there is no ARP entry. But logically this appears to be more or less the same layer 2 segment as when it was connected to the 6500. So what gives?

It turns out that 6500s and Nexus switches respond to ARP requests in different ways. The process on the 6500 is fairly standard and works as follows:

blog10_diagram3_6500_arp

But a Nexus will not respond to an ARP request if the source IP is from a subnet that it doesn’t recognise:

blog10_diagram4_nexus_arp

In our example case, the Nexus switch does not recognise 10.10.10.10 (the IVE's Shared Auth VLAN IP) as a valid source IP for the receiving interface, which has IP 172.16.20.33. It sees it as off-net. We could also see the ARP check failing by using debug ip arp packet on the switch.
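Spelled out with our example addresses, the failing exchange looks roughly like this:

IVE, VLAN 2301 -> broadcast:
  ARP who-has 172.16.20.33?  sender IP = 10.10.10.10 (Shared Auth VLAN IP)

Nexus SVI Vlan2301 (172.16.20.33/29):
  sender 10.10.10.10 is not within 172.16.20.32/29 -> request ignored, no reply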

So what's the solution? There are a couple of ways to tackle this. We could add a static ARP entry on the IVE, but this could become cumbersome if we needed to add one for each Shared IVS. Alternatively, we could add a secondary IP, from the offending subnet, to the SVI…

The Work

Adding the secondary IP is fairly straightforward. The config is as follows:

nexus1# sh run interface vlan 2301
!
interface Vlan2301
  description Customer_A
  no shutdown
  bandwidth 2000
  vrf member CUST_A
  no ip redirects
  ip address 172.16.20.33/29
  ip address 10.10.10.11/31 secondary
  fabric forwarding mode anycast-gateway

A /31 works well here, encompassing only the two IPs that are needed (namely 10.10.10.10 and 10.10.10.11). This allows the IVE's ARP requests to pass the aforementioned check that the Nexus performs. From here the ARP entries began to show up and connectivity to the Dedicated Auth Server began to work.
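At this point the Nexus itself should also be able to resolve the IVE's MAC on that VLAN; a quick sanity check from the switch would be something like the following (again assuming 172.16.20.34 as the IVE's address on the customer VLAN):

nexus1# show ip arp 172.16.20.34 vrf CUST_A

And back on the IVE console, the previously incomplete entry now resolves: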

Please choose from among the following options:
1. View/Set IP/Netmask/Gateway/DNS/WINS Settings
2. Print Routing Table
3. Print ARP Cache
4. Clear ARP Cache
5. Ping to a Server
6. Trace route to a Server
7. Remove Routes
8. Add ARP entry
9. View cluster status
10. Configure Management port (Enabled)

Choice: 3
Address       HWtype  HWaddress          Flags Mask   Iface
172.16.31.1   ether   11:11:22:22:33:33   C           int0.2387
10.101.23.4   ether   11:11:22:22:33:33   C           int0.1298
192.168.77.1  ether   11:11:22:22:33:33   C           int0.2347
172.16.20.33  ether   11:11:22:22:33:33   C           int0.2301

blog10_image4_ldap_success

So this raises the question: is this behaviour desirable? Should a device check the source IP before responding to an ARP request? I'd tend to lean in favour of this type of behaviour. It adds extra security, and besides, it's actually the behaviour of the IVE that is strange in this case; one would think the IVS would use a source IP from the connecting customer's subnet rather than that of the Shared Auth VLAN. The behaviour certainly is unorthodox, but finding a solution to this problem highlights some of the interesting scenarios that can arise when working with different vendors and operating systems.

I hope you’ve enjoyed the read. I’m always open to alternate ideas or general discussion so if you have any thoughts, let me know.
