Tracing a Layer 2 Path on Nexus Switches

Ever been stuck trying to figure out the exact switching path that packets take through your network? Me too. Here’s how I solved the problem without fancy Layer 2 traceroute tools.

I’ve recently been working in a data center environment with Nexus 7K and 5K switches in the core. The core is almost completely Layer 2, with most routing pushed to the distribution layer. During the first week, we ran into a problem forwarding jumbo frames. Some VLANs that used jumbo frames worked fine, but one VLAN simply wouldn’t work. The network team went to some effort to prove our innocence, and there was the usual veiled finger-pointing by everyone else: “I’m not saying it’s a network problem, but…” Yeah, yeah, we know: the network is assumed guilty until proven innocent.

In my experience, troubleshooting jumbo frames begins simply. Either every device in the forwarding path allows jumbo packets, or they don’t. If just one interface in the path doesn’t allow jumbo frames, the conversation breaks. So the crucial first question is “what is the path?”
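
To make that all-or-nothing rule concrete, here’s a minimal Python sketch. The switch names, interfaces, and MTU values are hypothetical (they’re not from the real network), and the 9000-byte frame size is just an assumed jumbo size.

```python
def first_mtu_bottleneck(path, frame_size=9000):
    """Return the first (switch, interface, mtu) hop too small for frame_size, else None."""
    for switch, interface, mtu in path:
        if mtu < frame_size:
            return (switch, interface, mtu)
    return None


# Hypothetical path: one interface left at the default 1500-byte MTU.
path = [
    ("SWITCH-A", "Eth1/2", 9216),
    ("SWITCH-B", "Eth1/5", 1500),
    ("SWITCH-C", "Eth1/9", 9216),
]

print(first_mtu_bottleneck(path))
```

The check is trivial once you have the list of hops. Building that list is the hard part.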

In a routed environment, this would be a no-brainer: traceroute would have your answer. But this is a Layer 2 environment. What I needed was a Layer 2 traceroute tool. Turns out Cisco does offer a Layer 2 traceroute utility for IOS on both the Cisco 7600 series routers and Catalyst 3560 series switches. It’s been around since 12.2(18) and you can use either MAC address or IP address to run the trace.

However, it didn’t work in NX-OS. And I did try. Several times. Just to be sure.

So, what was left was a manual Layer 2 trace, which means manually searching through the MAC address tables. Kind of cumbersome, but still doable. It was going to be tricky though, since the core switches were using both port-channels and virtual port-channels. The show mac address-table command alone was not going to cut it, as sometimes the switch would see the MAC address over a port-channel, and I’d need to know which interface in the port-channel had forwarded the packet.

However, before jumping in, I needed the source and destination MAC addresses, as well as the source and destination IP addresses (more on this later). Once the server team provided all these, I began by finding the exact source switch and interface:

sh mac address-table | inc AAAA.AAAA.AAAA

SWITCH-A# sh mac address-table | inc AAAA.AAAA.AAAA
   VLAN     MAC Address      Type      age  Secure NTFY   Ports
* 200      AAAA.AAAA.AAAA    dynamic   10      F    F     Eth101/1/2

Once the originating switch and port were identified, I could begin looking for the path to the destination. On the source switch, I ran:

sh mac address-table | inc BBBB.BBBB.BBBB

SWITCH-A# sh mac address-table | inc BBBB.BBBB.BBBB
   VLAN     MAC Address      Type      age  Secure NTFY   Ports
* 200      BBBB.BBBB.BBBB    dynamic   10      F    F     Po1
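
If you end up doing this lookup on many switches, pulling the VLAN and port out of the output is easy to script. Here’s a rough Python sketch that assumes the column layout shown above (VLAN immediately before the MAC, port in the last column). The function name is mine, and other NX-OS releases may format the table differently.

```python
def find_mac_port(output, mac):
    """Return (vlan, port) for the first line that lists mac, or None if absent."""
    mac = mac.lower()
    for line in output.splitlines():
        fields = line.split()
        lowered = [f.lower() for f in fields]
        if mac not in lowered:
            continue
        idx = lowered.index(mac)
        # Assumed layout: the VLAN is the field just before the MAC,
        # and the port is the last field on the line.
        if idx == 0 or not fields[idx - 1].isdigit():
            continue
        return int(fields[idx - 1]), fields[-1]
    return None


sample = """   VLAN     MAC Address      Type      age  Secure NTFY   Ports
* 200      BBBB.BBBB.BBBB    dynamic   10      F    F     Po1"""

print(find_mac_port(sample, "bbbb.bbbb.bbbb"))
```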

Guess what? The MAC was found on a port-channel. So, to find the physical interfaces included in that port-channel, I ran:

show port-channel summary

SWITCH-A# sh port-channel sum
Flags:  D - Down        P - Up in port-channel (members)
        I - Individual  H - Hot-standby (LACP only)
        s - Suspended   r - Module-removed
        S - Switched    R - Routed
        U - Up (port-channel)
        M - Not in use. Min-links not met
Group Port-Channel  Type     Protocol  Member Ports
1     Po1(SU)       Eth      LACP       Eth1/1(P)    Eth1/2(P)
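
The member list can be scraped the same way. This Python sketch assumes the single-line layout shown above, with members starting in the fifth column. On switches with many members the list can wrap onto following lines, which this sketch does not handle. The function name is hypothetical.

```python
import re

# Matches tokens such as "Eth1/1(P)": an interface name plus a flag in parentheses.
MEMBER = re.compile(r"(\S+)\(([A-Za-z]+)\)")


def port_channel_members(output, name):
    """Return [(interface, flag), ...] for the named port-channel, or [] if not found."""
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].startswith(name + "("):
            matches = (MEMBER.fullmatch(f) for f in fields[4:])
            return [m.groups() for m in matches if m]
    return []


sample = """Group Port-Channel  Type     Protocol  Member Ports
1     Po1(SU)       Eth      LACP       Eth1/1(P)    Eth1/2(P)"""

print(port_channel_members(sample, "Po1"))
```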

This showed which physical interfaces each port-channel contains. With this, I looked in the CDP neighbor table to see which neighbor these interfaces connect to.

show cdp neighbor

SWITCH-A# sh cdp ne
Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
 S - Switch, H - Host, I - IGMP, r - Repeater,
 V - VoIP-Phone, D - Remotely-Managed-Device,
 s - Supports-STP-Dispute
Device-ID Local Intrfce Hldtme Capability Platform     Port ID
          Eth1/1        125    S I s      N5K-C5548    Eth1/1
          Eth1/2        128    S I s      N5K-C5548    Eth1/2

Turns out the two physical interfaces connect to two different neighbors. Why? Virtual Port-Channel. This is where things get a little tricky.

I needed to figure out which physical interface was actually forwarding the packets, since they led to different switches. I had no clue which command would accomplish this. Fortunately, Cisco TAC did know.

sh port-channel load-balance forwarding-path int port-channel 1 vlan 101 src-ip dst-ip

This command is full of options, and if you question-mark your way through it, you can tweak it in a variety of ways. Remember the source and destination IPs? This is where you’ll use them. The output shows which physical interface the packets are taking, as well as which load-balancing algorithm the port-channel is using. In this case, it was using source and destination IP only.

SWITCH-A# sh port-channel load-balance forwarding int port-channel 1 vlan 140 src-ip dst-ip
Missing params will be substituted by 0's.
Load-balance Algorithm on switch: source-dest-ip
crc8_hash: 11 Outgoing port id: Ethernet1/2
Param(s) used to calculate load-balance:
dst-mac: 0000.0000.0000
src-mac: 0000.0000.0000
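
One small gotcha when correlating: this output spells the interface “Ethernet1/2”, while the CDP table shows “Eth1/2”. Here’s a short Python sketch that pulls out the algorithm and outgoing port and shortens the name. It assumes the exact wording of the output above, and the helper names are mine.

```python
import re


def parse_forwarding_path(output):
    """Return (algorithm, outgoing_port); either item is None if its line is missing."""
    algo = re.search(r"Load-balance Algorithm on switch:\s*(\S+)", output)
    port = re.search(r"Outgoing port id:\s*(\S+)", output)
    return (algo.group(1) if algo else None, port.group(1) if port else None)


def short_name(interface):
    """Shorten 'Ethernet1/2' to 'Eth1/2' so it matches the CDP table's naming."""
    return re.sub(r"^Ethernet", "Eth", interface)


sample = """Missing params will be substituted by 0's.
Load-balance Algorithm on switch: source-dest-ip
crc8_hash: 11 Outgoing port id: Ethernet1/2
Param(s) used to calculate load-balance:
dst-mac: 0000.0000.0000
src-mac: 0000.0000.0000"""

algorithm, port = parse_forwarding_path(sample)
print(algorithm, short_name(port))
```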

With the physical interface info, I could correlate with the CDP neighbors table, and find which neighbor to check next.

I moved to the next switch, repeated the whole process, moved to the third, ran the procedure again, moved on yet again … sigh. Eventually, the MAC address-table entry didn’t point to a CDP neighbor, but instead pointed to a single physical interface with only one MAC address in the MAC address-table.

At last. I’d found the full, one-way Layer 2 path.
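
Stripped of the CLI work, the procedure is a simple loop: look up the MAC, resolve a port-channel to the member the hash picked, hop to the neighbor on that member, and stop when the port has no switch behind it. Here’s a Python sketch over hand-built tables. The topology, the SWITCH-B/C/D names, and the dictionary layout are all made up for illustration, and nothing here talks to a real switch.

```python
SRC = "aaaa.aaaa.aaaa"
DST = "bbbb.bbbb.bbbb"

# Hypothetical tables, as if collected by hand from each switch:
#   mac_table   - show mac address-table
#   hash_choice - member chosen per port-channel (the load-balance command)
#   neighbors   - show cdp neighbor (local interface -> neighbor switch)
switches = {
    "SWITCH-A": {
        "mac_table": {DST: "Po1", SRC: "Eth101/1/2"},
        "hash_choice": {"Po1": "Eth1/2"},
        "neighbors": {"Eth1/1": "SWITCH-B", "Eth1/2": "SWITCH-C"},
    },
    "SWITCH-B": {
        "mac_table": {DST: "Eth1/10", SRC: "Eth1/1"},
        "neighbors": {"Eth1/10": "SWITCH-D", "Eth1/1": "SWITCH-A"},
    },
    "SWITCH-C": {
        "mac_table": {DST: "Eth1/10", SRC: "Eth1/2"},
        "neighbors": {"Eth1/10": "SWITCH-D", "Eth1/2": "SWITCH-A"},
    },
    "SWITCH-D": {
        "mac_table": {DST: "Eth1/20", SRC: "Po5"},
        "hash_choice": {"Po5": "Eth1/1"},
        "neighbors": {"Eth1/1": "SWITCH-B", "Eth1/2": "SWITCH-C"},
    },
}


def trace_l2_path(switches, start, mac, max_hops=16):
    """Follow mac hop by hop from start; return [(switch, egress interface), ...]."""
    path = []
    current = start
    for _ in range(max_hops):
        sw = switches[current]
        port = sw["mac_table"][mac]
        # A port-channel resolves to the member the hash selected;
        # a plain physical port resolves to itself.
        physical = sw.get("hash_choice", {}).get(port, port)
        path.append((current, physical))
        nxt = sw.get("neighbors", {}).get(physical)
        if nxt is None:
            return path  # no switch behind this port: the host is attached here
        current = nxt
    raise RuntimeError("hop limit reached; the tables may describe a loop")


forward = trace_l2_path(switches, "SWITCH-A", DST)
reverse = trace_l2_path(switches, "SWITCH-D", SRC)
print(forward)
print(reverse)
```

In this made-up topology the forward trace goes through SWITCH-C and the reverse trace goes through SWITCH-B, which is why tracing one direction isn’t enough.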

At this point, it would be easy to assume that the return path is symmetrical, and call it a day. But in this case, given how much everyone else had already worked on it (with no success), my hunch said the traffic followed an asymmetrical return path. So, once again into the CLI, I repeated everything until I returned to the source. Sure enough, one device on the different return path was not configured for jumbo frames. Problem found. One maintenance window later, problem solved.

All told, this procedure took about an hour. But with a Layer 2 traceroute tool, it would have taken about 5 minutes. This is a great opportunity for Cisco to expand the Layer 2 traceroute to NX-OS, especially since the Nexus line goes into the core of many large networks. Maybe even some enterprising startup with mad programming skills could develop an app with a Cisco API that would spider through all these tables and display the path. No doubt the big monitoring packages like WhatsUp Gold or HP OpenView or Cisco Prime already do it, but how about a scaled-down version for the rest of us?

