Chapter 20. FlexRoute

Switches started out being strictly Layer 2 (L2) devices until the advent of the Layer 3 (L3) switch, which, if you are as old as me, you might remember were called brouters at their inception. Brouter, being a portmanteau of the words bridge and router, was not the kind of word that people enjoyed saying and so was mostly lost to the realm of forgotten terms. “Hey boss, we need another brouter!” <shudder>

Although switches have been able to “do Layer 3 stuff” for a very long time now, it was mostly a convenience thing that allowed us to build networks without having to resort to another terrifically named network design called router on a stick. What can I tell you—it was the 90s. L3 switches really changed things in the networking world; although they allowed us to route VLANs between one another, they weren’t really routers. This lesson was learned by me the hard way when I decided to have a major provider deliver an OC3 link with an Ethernet handoff so that I didn’t need to spend big money on a router. Routers had WAN interfaces and WAN interfaces were expensive, so I outsmarted the whole system. Hah!

Well, as I soon learned, back then routers supported all sorts of things like QoS and traffic-shaping that we needed, whereas Layer 3 switches did not. Although my L3 switch could route, it wasn’t a real router. Remember, too, that routers often did some heavy lifting at the internet edge where Border Gateway Protocol (BGP) was needed. Supporting full internet routing tables with hundreds of thousands of prefixes was a complete nonstarter for an L3 switch back in the halcyon days of Cisco 3600 routers and 3550 switches. In fact, it was a complete nonstarter for everyone until Arista came along.

Fast-forward to 2016 or so, and Ethernet handoff has become the norm. No one wants to pay for an OC192 10 Gbps SONET setup when 10 Gbps Ethernet has been standard and getting progressively more inexpensive for years. With many vendors now switching via off-the-shelf Application-Specific Integrated Circuits (ASICs), the ability to forward that kind of traffic in an L3 switch was simple. The problem, though, was still that huge routing table. Not only do most switches not have the memory needed to store that many routes, but actually programming the hardware forwarding tables in the ASIC just wasn’t possible. That’s not what they were designed for! Arista, however, found a way.

How FlexRoute Works

So how does FlexRoute work? I can’t tell you that. Seriously, I asked, and I can’t tell you that. What I can tell you is that Arista has figured out a way to make merchant silicon do things that it wasn’t designed to do, and that is seriously freaking cool. What I can tell you is that this is absolutely a hardware thing, so you can’t build this in vEOS (Chapter 32). It is also limited to Arista devices with the “R” suffix, such as the 7280R and the 7500R switches. The switches I’m using for my examples are all model DCS-7280SR-48C6-M-F, so they’ve got the FlexRoute capability along with the extra memory needed to hold all those routes.

Simulating 800,000 Routes

To show FlexRoute in action, I’ve built a deceptively simple lab, as shown in Figure 20-1.

Figure 20-1. Simple FlexRoute lab

Why do I say that it’s deceptively simple? Because to make it work, I had to somehow get almost 400,000 routes onto my Arista switches without fancy tools. I didn’t have an Ixia, and I like to make my examples repeatable by anyone.

To generate almost 400,000 routes, I wrote a simple eAPI script. Well, two of them, actually: one to create roughly 388,000 /32 routes and another to create around 393,000 /24 routes. Why the strange numbers? I went to CIDR.org and determined how big the internet routing table was in terms of number of prefixes. As of today (January 30, 2019) there are 763,024 prefixes in the global internet table, so my goal was to get roughly that many prefixes into one of my 7280Rs via BGP.

Without any sort of sophisticated route injection tool, I decided to resort to Python and eAPI (Chapter 30) to generate a bunch of routes programmatically. For these to work, eAPI must be enabled to run via localhost. Also, these scripts must be run on the switch itself as written, though that’s not a strict requirement if you wanted to change that. Following are the eAPI commands needed on the switch to allow localhost API. Technically the username isn’t needed, because I’m connecting via localhost, but I like to boilerplate my configurations, and so it’s there regardless of that fact:

username Script secret Arista
management api http-commands
   protocol http localhost
   no shutdown
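
For readers curious about what travels over that localhost connection, here is a minimal sketch of the JSON-RPC request body that an eAPI runCmds call carries. It is written in Python 3 rather than the Python 2 style of the scripts in this chapter. The function name build_run_cmds and the request id are made up for illustration, and nothing is actually sent to a switch:

```python
import json

def build_run_cmds(cmds, request_id="sketch-1"):
    """Return the JSON-RPC body for an eAPI runCmds call as a string.

    build_run_cmds and request_id are illustrative names, not part of eAPI.
    """
    payload = {
        "jsonrpc": "2.0",
        "method": "runCmds",
        "params": {"version": 1, "cmds": list(cmds), "format": "json"},
        "id": request_id,
    }
    return json.dumps(payload)

# Localhost endpoint that matches the configuration above (HTTP, port 8080).
URL = "http://Script:Arista@127.0.0.1:8080/command-api"

# One static /32 route, wrapped in the same enable/configure prefix
# that the route-generation scripts use.
body = build_run_cmds(
    ["enable", "configure", "ip route 11.1.0.0 255.255.255.255 Null0"]
)
```

The jsonrpclib library used in the scripts that follow builds an equivalent request for you, which is why the scripts never touch JSON directly.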

Here is the script that I used to generate the /32 routes:

#!/usr/bin/python

import jsonrpclib, sys

IP         = "127.0.0.1"
ScriptUser = "Script"
ScriptPass = "Arista"
target     = ("http://" + ScriptUser + ":" + ScriptPass + "@" + IP +
              ":8080/command-api")
switch     = jsonrpclib.Server( target )
allRoutes  = ["enable", "configure"]

print("\n   '.' = 256 routes added.\n   '*' = 65k routes added.\n")

with open("routes.txt", "w") as routeFile:
   routeCounter = 0
   oct2         = 1
   #while oct2 <= 10:
   while oct2 <= 6:
       oct3 = 0
       while oct3 <= 255:
           oct4 = 0
           while oct4 <= 254:
               newRoute = ("ip route 11."  +
                           str(oct2) + "." +
                           str(oct3) + "." +
                           str(oct4) + " 255.255.255.255 Null0")
               allRoutes.append(newRoute)
               routeCounter += 1
               oct4 += 1
           sys.stdout.write(".")
           # Comment out next line for testing
           response = switch.runCmds( 1, allRoutes )
           # Start the next batch with a fresh command list
           del allRoutes[:]
           allRoutes = ["enable", "configure"]
           oct3 += 1
       sys.stdout.write("*")
       oct2 += 1

print("\n   " + str(routeCounter) + " routes created.\n")

And here is the script that I used to generate the /24 routes. This script was run on Arista-R-B:

#!/usr/bin/python

import jsonrpclib, sys

IP         = "127.0.0.1"
ScriptUser = "Script"
ScriptPass = "Arista"
target     = ("http://" + ScriptUser + ":" + ScriptPass + "@" + IP +
              ":8080/command-api")
switch     = jsonrpclib.Server( target )
allRoutes  = ["enable", "configure"]

print("\n   '.' = 256 routes added.\n   '*' = 65k routes added.\n")

with open("routes.txt", "w") as routeFile:
   # Makes 21.0.0.0/24 - 26.255.255.0/24
   oct1  = 21
   routeCounter = 0
   while oct1 <= 26:
      oct2  = 0
      while oct2 <= 255:
          oct3 = 0
          while oct3 <= 255:
             newRoute = ("ip route " +
                         str(oct1)  + "." +
                         str(oct2)  + "." +
                         str(oct3)  + ".0 255.255.255.0 Null0")
             allRoutes.append(newRoute)
             routeCounter += 1
             oct3 += 1
          sys.stdout.write(".")
          # Comment out next line for testing
          response = switch.runCmds( 1, allRoutes )
          # Start the next batch with a fresh command list
          del allRoutes[:]
          allRoutes = ["enable", "configure"]
          oct2 += 1
      sys.stdout.write("*")
      oct3 = 0
      oct1 += 1

print("\n   " + str(routeCounter) + " routes created.\n")

You might notice some weirdness, such as starting at 1 instead of 0 for octets. This was a way for me to tweak the number of routes being generated in a simple way.
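
To see how those loop bounds turn into route counts, here is a small sketch (Python 3, with a made-up helper name, count_routes) that multiplies out the inclusive octet ranges used by each script. Note that because the last octet in the /32 script runs from 0 to 254, each dot in that script's progress output stands for 255 routes rather than 256:

```python
def count_routes(*octet_ranges):
    """Multiply the sizes of inclusive (first, last) octet ranges.

    count_routes is an illustrative helper, not part of the scripts above.
    """
    total = 1
    for first, last in octet_ranges:
        total *= last - first + 1
    return total

# /32 script: 11.<1-6>.<0-255>.<0-254>
slash32 = count_routes((1, 6), (0, 255), (0, 254))

# /24 script: <21-26>.<0-255>.<0-255>.0
slash24 = count_routes((21, 26), (0, 255), (0, 255))

print(slash32, slash24, slash32 + slash24)
```

Changing a single bound, such as the upper limit on the second octet, moves the total in steps of roughly 65,000 routes, which is what makes the counts easy to tune.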

I should point out that if you’re going to try this, each of these scripts takes approximately two hours to run. That’s a lot of routes we’re putting into the running-config, and it just takes time. Here’s what the output of one of the scripts looks like:

[admin@Arista-R-A ~]$ ./HellaRoutes-32.py

   '.' = 256 routes added.
   '*' = 65k routes added.

.................................................................................
.................................................................................
.................................................................................
.............*...................................................................
.................................................................................
.................................................................................
...........................*.....................................................
.................................................................................
.................................................................................
.........................................*.......................................
.................................................................................
.................................................................................
.......................................................*.........................
.................................................................................
.................................................................................
.....................................................................*...........
.................................................................................
.................................................................................
.................................................................................
..*
   391680 routes created.

Just to reiterate, this takes a long time and will consume a fair bit of the switch’s CPU capacity while running. Here’s a snippet of show proc top from while the script was running:

top - 20:48:40 up 3 days, 42 min,  2 users,  load average: 1.50, 1.49, 1.24
Tasks: 293 total,   2 running, 291 sleeping,   0 stopped,   0 zombie
%Cpu(s): 29.7 us,  1.7 sy,  0.0 ni, 68.2 id,  0.0 wa,  0.2 hi,  0.1 si,  0.0 st
KiB Mem:  32458980 total,  4711688 used, 27747292 free,   295228 buffers
KiB Swap:        0 total,        0 used,        0 free,  2727296 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
 2595 root      20   0  844m 468m 325m S 100.4  1.5  41:24.17 ConfigAgent
 3381 root      20   0  599m 403m 286m S  10.3  1.3  10:04.68 Rib
 2543 root      20   0  687m 366m 238m R   3.6  1.2  19:00.71 Sysdb
 3686 root      20   0  978m 327m 215m S   2.7  1.0 179:36.10 SandFap

With my Arista-R-A and Arista-R-B devices fully loaded, I then went to Arista-R-Z to see what BGP had learned. First, here is the memory utilization on Arista-R-Z:

Arista-R-Z#sho proc top once | grep Mem
KiB Mem:  32458980 total,  8352096 used, 24106884 free,   275272 buffers

About 8 GB of RAM (8,352,096 KiB) is being used on this switch. I’m glad I got the switch with the enhanced memory! How about BGP? Let’s take a peek at what it sees:

Arista-R-Z#sho ip bgp summ
BGP summary information for VRF default
Router identifier 192.168.100.1, local AS number 65100
Neighbor Status Codes: m - Under maintenance
  Neighbor   V  AS      MsgRcvd   MsgSent  InQ OutQ  Up/Down State  PfxRcd PfxAcc
  10.10.1.2  4  65001      1711        30    0    0 20:26:20 Estab  388621 388621
  10.10.2.2  4  65002      1619        95    0    0 20:26:20 Estab  393216 393216

Looks good! So what’s the problem? The problem is that those routes are in memory, but that doesn’t necessarily mean that they’ve been programmed into the ASIC’s forwarding tables. How can you tell? The easy way is with the show ip route command:

Arista-R-Z(config)#sho ip route

VRF: default
======================================================
WARNING: Some of the routes are not programmed in    
hardware, and they are marked with '*'.              
======================================================
[-- route key removed --]

Gateway of last resort:
 S      0.0.0.0/0 is directly connected, Null0

 C      10.0.0.0/24 is directly connected, Management1
 C      10.10.1.0/30 is directly connected, Ethernet1
 C      10.10.2.0/30 is directly connected, Ethernet2
 C      10.10.3.0/30 is directly connected, Ethernet3
 C      10.10.5.0/30 is directly connected, Ethernet5
 C      10.10.7.0/30 is directly connected, Ethernet7
 C      10.10.8.0/30 is directly connected, Ethernet8
 B E    11.1.1.1/32 [200/0] via 10.10.1.2, Ethernet1
*B E    11.1.1.2/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.3/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.4/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.5/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.6/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.7/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.8/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.9/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.10/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.11/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.12/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.13/32 [200/0] via 10.10.1.2, Ethernet1
*B E    11.1.1.14/32 [200/0] via 10.10.1.2, Ethernet1
*B E    11.1.1.15/32 [200/0] via 10.10.1.2, Ethernet1
*B E    11.1.1.16/32 [200/0] via 10.10.1.2, Ethernet1
*B E    11.1.1.17/32 [200/0] via 10.10.1.2, Ethernet1
*B E    11.1.1.18/32 [200/0] via 10.10.1.2, Ethernet1
*B E    11.1.1.19/32 [200/0] via 10.10.1.2, Ethernet1
 --More--

See that warning at the top of the output? That’s a problem, and although there aren’t that many problems on that screen output, let’s use some Linux skills to see how many routes have asterisks next to them. I’m going to issue the show ip route command-line interface (CLI) command, pipe that to grep ^*, which will search for asterisks as the first character on the line, and then pipe that to wc -l, which will report on the number of lines in that output:

Arista-R-Z#sho ip route | grep ^* | wc -l
523780

Uh-oh! Even though I have almost 800,000 routes in memory, more than 500,000 of them are not programmed into the forwarding tables on the ASIC. That’s more than half, and that seems bad. FlexRoute to the rescue.
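
If you have captured the show ip route output as text, the same count can be done off the switch. The following is a minimal sketch in Python 3 that mirrors the grep and wc pipeline. The function name count_unprogrammed is made up, and the sample text is a few lines copied from the output shown earlier:

```python
def count_unprogrammed(show_ip_route_text):
    """Count lines whose first character is '*', like grep ^* | wc -l."""
    return sum(
        1 for line in show_ip_route_text.splitlines() if line.startswith("*")
    )

# A few lines taken from the show ip route output in this chapter.
sample = """\
 B E    11.1.1.1/32 [200/0] via 10.10.1.2, Ethernet1
*B E    11.1.1.2/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.3/32 [200/0] via 10.10.1.2, Ethernet1
*B E    11.1.1.14/32 [200/0] via 10.10.1.2, Ethernet1
"""

unprogrammed = count_unprogrammed(sample)
print(unprogrammed)
```

As with the grep version, the warning banner itself is not counted, because none of its lines begin with an asterisk.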

Configuring and Using FlexRoute

With the problem clearly identified, let’s fix it using only three commands on this 7280R switch. But first, a warning.

Warning

As you are about to see, issuing these commands can have pretty serious consequences, so you should take these steps before you begin receiving routes on the switch. I’m only doing so after we’ve received almost 800,000 routes to demonstrate the problem and so that you can see the differences with and without FlexRoute involved.

With that out of the way, let’s get this switch using FlexRoute. First, we need to tell EOS to deal with routing protocols a little differently than it does by default:

Arista-R-Z#conf
Arista-R-Z(config)#service routing protocols model multi-agent

Warning

This command may require a reboot on certain code revisions.

This can cause your switch session to become nonresponsive for a few seconds when there are already hundreds of thousands of routes, so be prepared. Also, all of the routes are removed from the IP Routing table, so that’s probably worth a warning, too. That includes the connected routes.

Next, we need to configure the ASIC to optimize its tables (more or less) for what we’re trying to accomplish. Normally, on an internet-attached device that’s getting full tables from a provider, you would use the following command:

Arista-R-Z(config)#ip hardware fib optimize prefixes profile internet

We’re not connected to an internet peer, and our tables don’t look quite like the real thing because they’re all /32s and /24s, so we’re going to tweak this a little bit and optimize for only those prefixes:

Arista-R-Z(config)#ip hardware fib optimize prefix-length 32 24
! Please restart layer 3 forwarding agent to ensure IPv4 routes are optimized

That seems serious, and it is. Restarting the L3 forwarding agent essentially causes L3 to stop working while the ASIC resets itself to work with FlexRoute. What’s more, there is no indication of what command you should actually use to reset that agent. On the 7280Rs, that command is as follows:

Arista-R-Z(config)#agent sandl3Unicast terminate
SandL3Unicast was terminated

This doesn’t seem to have an immediate effect, but it does. Over the next minute or so, routes are being shuffled around by [REDACTED—I said I couldn’t tell you that!]. Cool, right? I think so. Kudos to the people who discovered such a cool method for solving this problem. I look forward to reading the patents.
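
To recap the three commands in one place, here is a small sketch (Python 3) that assembles them into a list in the same enable/configure style as the eAPI scripts earlier in the chapter. The function name flexroute_commands is made up for illustration, the list is only built and not sent anywhere, and the agent name is the one shown above for the 7280R, so treat it as an assumption on any other platform:

```python
def flexroute_commands(prefix_lengths=None):
    """Build the command list used in this section.

    With no prefix lengths, the internet profile is used; otherwise the
    table is optimized for the given lengths, as done in this lab.
    """
    cmds = [
        "enable",
        "configure",
        "service routing protocols model multi-agent",
    ]
    if prefix_lengths:
        lengths = " ".join(str(n) for n in prefix_lengths)
        cmds.append("ip hardware fib optimize prefix-length " + lengths)
    else:
        cmds.append("ip hardware fib optimize prefixes profile internet")
    # Agent name as shown for the 7280R in this chapter.
    cmds.append("agent sandl3Unicast terminate")
    return cmds

lab_cmds = flexroute_commands([32, 24])
internet_cmds = flexroute_commands()
```

Given the warnings in this section, a list like this is better suited to a planned maintenance window than to a casual script run.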

After a couple of minutes, here’s what the IP routing table looks like:

Arista-R-Z(config)#sho ip route

VRF: default
[-- route key removed --]

Gateway of last resort:
 S      0.0.0.0/0 is directly connected, Null0

 C      10.0.0.0/24 is directly connected, Management1
 C      10.10.1.0/30 is directly connected, Ethernet1
 C      10.10.2.0/30 is directly connected, Ethernet2
 C      10.10.3.0/30 is directly connected, Ethernet3
 C      10.10.5.0/30 is directly connected, Ethernet5
 C      10.10.7.0/30 is directly connected, Ethernet7
 C      10.10.8.0/30 is directly connected, Ethernet8
 B E    11.1.1.1/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.2/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.3/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.4/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.5/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.6/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.7/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.8/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.9/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.10/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.11/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.12/32 [200/0] via 10.10.1.2, Ethernet1
 B E    11.1.1.13/32 [200/0] via 10.10.1.2, Ethernet1
--More--

Did you notice what’s missing? That warning that said, Some of the routes are not programmed in hardware, and they are marked with ‘*’, is gone, as are all of those pesky asterisk-marked routes! Being the kind of guy who doesn’t believe what he reads in headers, I want to see if there are any routes with asterisks before them:

Arista-R-Z(config)#sho ip route | grep ^* | wc -l
0

Nice! Before FlexRoute we had 523,780 routes that had not been programmed into hardware, but now they all are! How many routes? The show ip route summary command to the rescue:

Arista-R-Z(config)#sho ip route summary

VRF: default
   Route Source                                Number Of Routes
------------------------------------- -------------------------
   connected                                                  7
   static (persistent)                                        0
   static (non-persistent)                                    0
   VXLAN Control Service                                      0
   static nexthop-group                                       0
   ospf                                                       0
     Intra-area: 0 Inter-area: 0 External-1: 0 External-2: 0    
     NSSA External-1: 0 NSSA External-2: 0                      
   ospfv3                                                     0
   bgp                                                   781837
     External: 781837 Internal: 0                              
   isis                                                       0
     Level-1: 0 Level-2: 0                                      
   rip                                                        0
   internal                                                  25
   attached                                                   2
   aggregate                                                  0
   dynamic policy                                             0
                                                                
   Total Routes                                          781871

Number of routes per mask-length:
   /0: 1         /8: 3         /24: 393217   /30: 6        /32: 388644

There are 781,871 routes in total, of which 393,217 are /24s, and 388,644 are /32s. And every single one of them is now programmed into the ASIC’s forwarding tables. I don’t know about you, but I think that’s cool as hell.
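
As a quick sanity check on those numbers, this sketch (Python 3, with a made-up function name, mask_length_counts) parses the per-mask-length line from the output above and confirms that it adds up to the reported total:

```python
import re

def mask_length_counts(line):
    """Parse '/<len>: <count>' pairs into a {length: count} dictionary."""
    return {
        int(m.group(1)): int(m.group(2))
        for m in re.finditer(r"/(\d+):\s*(\d+)", line)
    }

# The mask-length line copied from the show ip route summary output.
line = "/0: 1         /8: 3         /24: 393217   /30: 6        /32: 388644"

counts = mask_length_counts(line)
total = sum(counts.values())

# Route sources from the same output: connected, bgp, internal, attached.
by_source = 7 + 781837 + 25 + 2

print(total, by_source)
```

Both sums come to 781,871, so the per-source and per-mask-length views of the table agree.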

Conclusion

FlexRoute is a nifty feature that’s available on Arista switches that have an “R” suffix in the model name, such as the 7280Rs used in this chapter. This is a feature with a fairly specialized use case, but it allows us to use our favorite L3 switches and switch operating system for that use. Although that’s a bit like marketing, consider that an Arista 7280R can be a whole lot less expensive than a full-blown internet router from another vendor.
