All stitched up

Segment Routing is undoubtedly one of the most powerful tools in modern Service Provider networking. It introduces a source-based routing paradigm that allows ingress routers to stack instructions or “segments” onto packets. Using this you can steer traffic through a network without the need for the signalling and state management that comes with traditional MPLS Traffic Engineering.

This blog explores a scenario where traffic is steered into two sequential SR policies – essentially stitching them together. It assumes a solid understanding of the basic functionality of MPLS-based Segment Routing.

Here is the topology we will be working with:

It’s a basic MPLS network, running ISIS + LDP in the core. VPNv4 routes are exchanged via the route reflector. CE1 and CE2 are customer devices connected via BGP to the Service Provider – placed inside VRF ACME.

I’ve built this lab in EVE-NG. If you have your own setup, you can download the lab and/or config files here to follow along:

The EVE-NG lab consists of:
11 x Cisco XRv 9k 7.9.2s and 2 x Cisco XE 17.03.02s
Login creds are user1/user123

The goal

As mentioned above, our goal here is to connect two SR policies together. The first policy will direct traffic from PE1 to PE5 using an explicit path. The second policy will direct traffic from PE5 to PE4 by dynamically avoiding red colored links.

Here it is in diagram form:

Whilst this is only a lab environment, this kind of traffic engineering could be used in larger environments to do tasks such as steering traffic towards DDoS scrubbers, avoiding maintenance links or crossing administrative boundaries.

If we were using MPLS RSVP to signal the separate paths, all the LSRs would need to reserve and maintain the required state. This would be done using RSVP Path and Resv messages (more details here). Segment Routing can accomplish this much more efficiently, because the path is encoded in the label stack at the ingress router and the transit routers hold no per-path state.

We’ll begin by looking at the base state of the network, then walk through the steps to enable SR, before finally creating the policies.

The setup

Let’s start by checking out the ISIS and LDP config. Here is a sample from PE1:
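
For readers following along without the lab files, here is a minimal sketch of what a base ISIS + LDP config on an IOS XR PE could look like. The ISIS process name CORE, the NET and the interface shown are my assumptions for illustration; only the 10.1.1.1 loopback follows the addressing used in this post.

```
router isis CORE
 is-type level-2-only
 ! NET is a made-up example value
 net 49.0001.0000.0000.0001.00
 address-family ipv4 unicast
  metric-style wide
 !
 interface Loopback0
  passive
  address-family ipv4 unicast
  !
 !
 ! Repeat for each core-facing interface
 interface GigabitEthernet0/0/0/0
  point-to-point
  address-family ipv4 unicast
  !
 !
!
mpls ldp
 router-id 10.1.1.1
 interface GigabitEthernet0/0/0/0
 !
!
```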

PE1 has a VRF called ACME configured with a BGP session to CE1:
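
As a rough sketch, a PE-CE eBGP session inside a VRF on IOS XR tends to look like the following. The AS numbers, route distinguisher, neighbor address and the PASS-ALL policy name are all hypothetical placeholders, not values from the lab. The VRF itself (with its route-targets) is assumed to be defined already.

```
! Permit-everything policy; eBGP neighbors on IOS XR need an explicit policy
route-policy PASS-ALL
  pass
end-policy
!
router bgp 65000
 vrf ACME
  rd 65000:1
  address-family ipv4 unicast
  !
  ! Hypothetical CE-facing neighbor address and AS
  neighbor 172.16.1.2
   remote-as 65001
   address-family ipv4 unicast
    route-policy PASS-ALL in
    route-policy PASS-ALL out
   !
  !
 !
!
```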

The same type of session is configured between PE4 and CE2. This is all fairly stock standard for a Service Provider environment. We can demonstrate the basic MPLS network by running a traceroute from CE1 to CE2 (sourcing with CE1 Loopback 0 to emulate a LAN).

(NB. For the sake of this lab, the label ranges for each of the LSRs have been set to 24x00 – 24x99, where x is the identifier for that router – PE1 has identifier 1 etc.)

You can see the traffic is taking a standard ECMP path through the network. Now let’s look at getting SR working.

Setting up SR

Enable Segment Routing

The first step is to enable SR and assign prefix SIDs (don’t forget to enable mpls traffic engineering router-id as well!).

We’ll set the SRGB base to be 16000 across all devices and give sequential IPv4 indexes to each router (PE1 will be SID index 1 etc.). The IPv6 indexes will be the same but +600.

I’ve enabled the router ID using the router-id lo0 command here, which works under ipv4 and ipv6. An alternative is to use mpls traffic-eng router-id lo0. This might already be in place if you are migrating from traditional MPLS-TE, but it’s only applicable to the ipv4 address-family.
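
Putting those pieces together, a sketch of the SR enablement on PE1 might look like this. The ISIS process name CORE is an assumption, and I've spelled out the SRGB explicitly even though 16000-23999 is the usual IOS XR default. The sr-prefer keyword is what makes SR labels win over LDP in the forwarding table. The IPv6 address-family is assumed to be already enabled in ISIS.

```
segment-routing
 global-block 16000 23999
!
router isis CORE
 address-family ipv4 unicast
  metric-style wide
  router-id Loopback0
  segment-routing mpls sr-prefer
 !
 address-family ipv6 unicast
  router-id Loopback0
  segment-routing mpls sr-prefer
 !
 interface Loopback0
  address-family ipv4 unicast
   ! PE1: label = 16000 + 1 = 16001
   prefix-sid index 1
  !
  address-family ipv6 unicast
   ! Same index + 600
   prefix-sid index 601
  !
 !
!
```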

Once we repeat this for all the Service Provider routers, we can see that the MPLS forwarding table now prefers Segment Routing:

And indeed, if we repeat our traceroute we can see that SR labels are now used:

ECMP is still in effect, but since the transport label stays as 16004 the whole way, we’d need to look at the IP addresses to determine the exact path.

Populate SRTE Database

From here we need to populate the SRTE database so that any dynamic policies (in our case the one that avoids RED links) can calculate their best path. This is done using the distribute link-state command under ISIS:
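
A minimal sketch, assuming the ISIS process is called CORE (a name I've picked for illustration):

```
router isis CORE
 distribute link-state
!
```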

Once this is done across all the devices, we can see the topology by running the following command:
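
On IOS XR the SR-TE topology database can typically be inspected with the command below. I've left the output out rather than guess at it.

```
show segment-routing traffic-eng ipv4 topology
```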

All devices within the same domain should show the same output. We are now ready to start setting up the policies.

Configure explicit SR Policy

The first thing to do is set up an explicit segment list that details the path we want the traffic to follow. In our case the path looks like this:

Here is the CLI:
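
A sketch of such a segment list, using the prefix SID labels for PE6, P1 and PE5 that appear in the label stack later in this post. The list name PE1-PE5-EXPLICIT and the index numbers are my own choices.

```
segment-routing
 traffic-eng
  segment-list PE1-PE5-EXPLICIT
   ! PE6, then P1, then PE5
   index 10 mpls label 16006
   index 20 mpls label 16007
   index 30 mpls label 16005
  !
 !
!
```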

With this done, we can create an SR policy to reference the explicit path:
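
Roughly, this is the shape of an explicit SR policy on IOS XR. The color (10) and end-point (10.1.1.5) are the values used in this post; the policy name, the preference value and the segment-list name PE1-PE5-EXPLICIT are assumptions.

```
segment-routing
 traffic-eng
  policy PE1-PE5
   color 10 end-point ipv4 10.1.1.5
   candidate-paths
    preference 100
     explicit segment-list PE1-PE5-EXPLICIT
     !
    !
   !
  !
 !
!
```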

So far so good. Let’s verify that it has come up okay:

We can see from the policy that it is up, but how do we steer traffic into it?

This is done by attaching a color community to the BGP route that matches the color of the policy. In our case, we’ll tag 192.168.2.0/24 inbound from CE2 with color 10.
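
Here is a sketch of how the tagging could be done on PE4 with a color extended community and an inbound route-policy. The set and policy names, the BGP AS number and the CE2 neighbor address are hypothetical; the neighbor is assumed to exist already.

```
extcommunity-set opaque COLOR-10
  10
end-set
!
route-policy CE2-IN
  ! Only color the one prefix; everything else passes untouched
  if destination in (192.168.2.0/24) then
    set extcommunity color COLOR-10
  endif
  pass
end-policy
!
router bgp 65000
 vrf ACME
  ! Hypothetical CE2 neighbor address
  neighbor 172.16.2.2
   address-family ipv4 unicast
    route-policy CE2-IN in
   !
  !
 !
!
```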

Before we commit, here is what the prefix currently looks like in the BGP table:

The addition of the color can be seen once we commit:

Note that this has only been applied to 192.168.2.0/24, not 192.168.3.0/24:

PE1 sees the color as well:

The idea here is for traffic to 192.168.2.0/24 to be directed into our color 10 tunnel. If this is working, we should see the CEF table on the ACME VRF recursing to the Binding SID for our SR Policy (if you look above, the Binding SID is 24123)…

But here, it looks like it is still just imposing 16004 (the SID for PE4) and then 24407 (the VPNv4 label for 192.168.2.0/24). It’s then ECMP’ing it out of Gi0/0/0/0, Gi0/0/0/3 and Gi0/0/0/2.

So what gives? Why is it not using our SR Policy?

Well, we have to remember that the allocation of a prefix to a policy is based on the combination of the end-point and the color. Looking at the BGP route, the color is correct, but the end point (or next-hop in BGP talk) is still 10.1.1.4 – PE4. Our policy is defined as having an end-point of 10.1.1.5!

So let’s fix that:

Now we see that it is correctly steering down the SR policy:

The Binding SID has changed to 24125 since we refreshed the endpoint, but CEF is looking good.

However, whilst this is steering the traffic into the policy, it still won’t get us all the way. If we trace from CE1 we can see that we just get stars:

The reason for this is fairly simple. This is the stack we are putting on the packet to CE2:

16006 (PE6 Prefix SID)
16007 (P1 Prefix SID)
16005 (PE5 Prefix SID)
24407 (VPN label)

As each segment is completed, the top label is popped. We can very quickly see that when the packet reaches PE5 the VPN label is exposed. But PE5 has no idea what to do with it! This VPN label was allocated by PE4 not PE5!

For our solution, this is okay at this point in the setup. Remember we will be wanting to push this traffic into a second policy that avoids all RED links and does end up at PE4.

For now, and just for the sake of getting our traceroute working, let’s add the PE4 label to the bottom of our explicit stack, so that PE5 can forward traffic on to PE4. We’ll remove this later:
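
As a sketch, this is one extra entry at the end of the segment list on PE1. The list name PE1-PE5-EXPLICIT and index 40 are illustrative assumptions; 16004 is PE4's prefix SID label.

```
segment-routing
 traffic-eng
  segment-list PE1-PE5-EXPLICIT
   ! Temporary: lets PE5 forward on to PE4
   index 40 mpls label 16004
  !
 !
!
```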

Now a traceroute works correctly:

For the explicit path to PE5 to work, we need to make sure that a label that PE5 is going to understand is exposed. To get that, we need to configure the second policy, from PE5 to PE4…

Configure dynamic SR Policy

This policy isn’t going to be explicitly defined. Rather, we’re going to define the conditions of the policy (namely to avoid red links) and let the head end router figure it out. The first step in creating a dynamic policy that avoids red links is to, well, configure some RED links!

As a reminder, these are the links we want to color red:

Before going any further, let’s get some clarity on the term color and the different ways it is used within the context of this lab.

Color

This scenario uses the term color to refer to multiple different things and it can get confusing if you don’t know what you’re looking at. The two ways we’re concerned with color are as follows:

Policy Coloring

The first is the color that we have already seen when defining an SR policy. This is an identifier for the policy. If a prefix is tagged with that color (in the case of BGP, it will be an attribute) and its next-hop matches the policy endpoint, traffic to that prefix will be steered into the policy. This is exactly how we’ve steered traffic into our explicit tunnel at PE1.

Link Coloring

The second way in which color is used is with regards to link coloring. Coloring a connection between two devices works by using something called link affinities (also called admin groups from the MPLS TE RSVP days). When we entered the distribute link-state command above, the link-state details that ISIS carries in its TLVs were fed into the SR-TE database. This includes details about the links themselves – like metric, delay and link affinities. The link affinity is basically a string made up of ones and zeros that we can set and use how we wish. In this case, we’re using the affinity to “color” a link. I put the word “color” in air quotes, because from a CLI perspective, the word color isn’t actually referenced. We engineers use the term color because it’s easy to visualise a link that way.

This section looks at the latter of the two color definitions. Within the CLI the link-affinity is referenced as an integer number. For our scenario, let’s make 7 represent RED. Here’s how it would look:
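
A sketch of that on PE4, mapping the name RED to bit position 7 and applying it to the two interfaces colored in this lab (Gi0/0/0/0 and Gi0/0/0/3). The name RED is just a local label of my choosing.

```
segment-routing
 traffic-eng
  interface GigabitEthernet0/0/0/0
   affinity
    name RED
   !
  !
  interface GigabitEthernet0/0/0/3
   affinity
    name RED
   !
  !
  ! RED is only a local name for affinity bit 7
  affinity-map
   name RED bit-position 7
  !
 !
!
```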

NB. As a side note, if you are configuring colors for different Flex-Algos, that config would go under the IGP (ISIS) and not under segment-routing. I won’t detail this here, but the principle of link coloring, with or without Flex-Algo, is the same.

Now that we’ve colored the link we can verify it:

The output is a little messy, so I’ve piped the command and omitted the full result, but you can clearly see that the affinity bits (Admin groups) have been set for PE4’s links to PE5 and P2.

The next step is to configure the policy on PE5 to avoid RED links. This looks similar to the explicit policy we used before but it instead uses the (surprise, surprise) dynamic keyword. We’ll give the policy the color number 20 (in the SR policy sense, not the link color sense), just to differentiate it from the one on PE1 – although colors are locally significant:
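
A sketch of a dynamic policy with an affinity constraint on PE5. Color 20 and end-point 10.1.1.4 come from this post; the policy name, preference and IGP metric type are assumptions. Note that PE5 needs its own affinity-map entry so that the name RED resolves to the same bit.

```
segment-routing
 traffic-eng
  affinity-map
   name RED bit-position 7
  !
  policy PE5-PE4-AVOID-RED
   color 20 end-point ipv4 10.1.1.4
   candidate-paths
    preference 100
     dynamic
      metric
       type igp
      !
     !
     ! Exclude any link advertising the RED affinity
     constraints
      affinity
       exclude-any
        name RED
       !
      !
     !
    !
   !
  !
 !
!
```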

Great. Now if we check the policy, we can see that it is up:

This might look to be working, but if you look at the SID list, it only appears to be adding 16004 onto the packet. This means it will simply send the traffic straight to PE4 without avoiding the RED links. To prove this, we can look at the LFIB forwarding behaviour for 16004. It just sends it out of the Gi0/0/0/1 interface (direct to PE4!).

The reason for this is simple: The end that the link is colored on matters. 

We’ve colored Gi0/0/0/0 and Gi0/0/0/3 on PE4. But nothing else. 

Gi0/0/0/1 on PE5 isn’t colored RED. This might seem like a limitation, having to color both ends, but it allows traffic in different directions to take different paths, which could be handy depending on the circumstance. To help visualise this, it might be easier to think of the color as being applied to the outbound interface rather than the link as a whole. To put this in diagram form, this is what we’ve done:

So in our case, traffic from PE5 to PE4 will not be considered to be crossing a red link (but traffic from PE4 to PE5 would). We don’t need any multi-directional differences in our lab, so to make this consistent, let’s color the interfaces on PE5 and P2 facing PE4.

With this corrected, here is what our policy on PE5 looks like:

This is looking much better! The policy is going via PE3 (10.1.1.3), through P2 (10.1.1.8). It is using the Node SID of P2 first, then the node SID of PE3.

Stitching two policies together

Now that we’ve got both policies working, we need to stitch them together at PE5. To recap:

  • The first policy will use an explicit path from PE1 to PE5
  • The second policy will use a dynamic path from PE5 to PE4, avoiding red links.

We steered traffic into the first policy by tagging 192.168.2.0/24 with a color attribute 10, so that it matches the color (and endpoint) on PE1’s explicit policy.

But how do we steer traffic into our second policy? To understand this, we need to consider the different ways that traffic is directed into SR policies:

Directing traffic into an SR Policy

An SR policy is all well and good, but it doesn’t mean much if you can’t actually steer traffic into it. We’ve already seen one way – namely by tagging a BGP route with the right color attribute. But there are other ways to accomplish this.

If the incoming packet is unlabelled, you could use a static route or some form of policy-based routing. Pseudowires can also be configured to prefer a given SR policy.

But what we’re interested in here is how incoming labelled traffic enters an SR policy. After all, traffic coming from PE1 to PE5 on the explicit path will arrive with labels.

The way to steer labelled traffic into an SR policy is to use what is called the Binding SID of a given policy. The Binding SID is a locally significant label that instructs the router to steer any arriving traffic with that label into the SR policy. An incoming packet with the Binding SID on top will have the Binding SID removed and then the labels associated with that policy imposed on it.

We’ve already seen a form of this earlier when looking at the CEF entry for our first policy. The CEF entry showed the local Binding SID as being imposed. This will in turn apply the explicit segment list we specified.

So with this in mind, we need to make sure that traffic arriving at PE5 has the Binding SID for the SR policy that avoids red links. Re-checking the policy on PE5 shows that it has a Binding SID of 24529:

The Binding SID is automatically generated and comes from a random pool – typically the same pool that LDP labels are pulled from. If PE5 were to reload, this number could change, meaning we’d have to change our policy on PE1. To avoid this, we can statically set the Binding SID as follows: 

Whoops. This doesn’t seem to have worked. It’s unhappy with 24500, stating that there is a conflict. If we check our MPLS configuration we can see why:
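
Going by the range quoted in this post, the relevant line in PE5's running config would look something like this (a sketch, not captured output):

```
mpls label range table 0 24500 24599
```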

Our dynamic label range has been set to 24500 24599. This is from when we had LDP configured. We can’t set an explicit Binding SID from within a dynamic range. The Explicit binding SID should come from the SRLB which defaults to 15,000 – 15,999.

We’ll allocate 15005 as the Binding SID:
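
A sketch of the static Binding SID on PE5. The policy name PE5-PE4-AVOID-RED is a hypothetical one; 15005 sits inside the default SRLB.

```
segment-routing
 traffic-eng
  policy PE5-PE4-AVOID-RED
   ! Explicit Binding SID taken from the SRLB
   binding-sid mpls 15005
  !
 !
!
```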

Brilliant. Now that we’ve got the Binding SID set, the final step is to change the policy from PE1 to ensure that when traffic arrives at PE5, it has SID 15005 on top.

Remember we previously added 16004 to PE1’s policy. This was just so that PE5 had something it recognised once traffic reached it and our test traceroute could work. We’ll remove that first and replace it with 15005. 
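
Assuming the temporary 16004 entry sat at index 40 of a segment list called PE1-PE5-EXPLICIT (both names are illustrative), the change on PE1 could be as small as overwriting that index:

```
segment-routing
 traffic-eng
  segment-list PE1-PE5-EXPLICIT
   ! Was 16004 (PE4 prefix SID); now PE5's Binding SID
   index 40 mpls label 15005
  !
 !
!
```

The resulting stack leaving PE1 would then be 16006, 16007, 16005, 15005, with the VPN label underneath.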

Looking good. Let’s try our traceroute from CE1 to CE2:

It works! We can see the traffic following the PE6 → P1 → PE5 explicit path, before entering the PE5 → PE4 dynamic path that avoids red links using the 15005 label. The 24407 is the VPNv4 label advertised by PE4 for 192.168.2.0/24.

As proof of concept we can see that tracing to 192.168.3.50 (loopback1) on CE2, whose BGP route does not have a color 10 attribute, is traversing the normal ECMP path we saw at the beginning:

Here is the final diagram to visualise the result:

So that’s it! There are a lot of different options that SR allows us to use in order to steer traffic intelligently and smoothly across a network. This lab has shown us but one of the methods at our disposal. Further steps might be to implement PCEP for increased scalability or introduce more dynamic routing options like performance-measurement – but I’ll leave this variation for a possible later blog.

Thanks so much for reading. Let me know what you think or if you have any comments. Until next time.
