The test below was my attempt to observe and document EVPN’s all-active
load-balancing behavior under normal operating conditions. By examining the BGP
routes and Wireshark traces, I wanted to understand in detail how MPLS labels
are exchanged and used in an EVPN network to achieve that load balancing.
Test Plan
- Simulate traffic flow from CE_R27 to CE_R28
- Observe interface statistics to determine traffic routing
- Observe BGP routing and label exchange
- Perform targeted packet captures on PE links
- Examine packet captures for label usage
Test Traffic
Continuous ping from host CE_R27 to CE_R28.
| 
CE_R27#ping 172.16.50.2 rep 2147483647 
Type escape sequence to abort. 
Sending 2147483647, 100-byte ICMP Echos to 172.16.50.2, timeout is 2 seconds: 
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
.. snip .. | 
Traffic Observation
CE_R27’s Traffic Distribution
CE_R27’s G1 and G2 Ether-channeled interface (Po50). The input rate was double the output rate.
| 
CE_R27#sh int po50 | in rate 
  Queueing strategy: fifo 
  5 minute input rate 50000 bits/sec, 55 packets/sec 
  5 minute output rate 25000 bits/sec, 28 packets/sec | 
CE_R27’s G1 Ethernet interface to PE_MXR01. The traffic flow selected G1 as its
egress interface and sent 28 packets per second (pps). The input rate matched
the output rate, which suggests that the traffic received here was the
legitimate replies.
| 
CE_R27#sh int g1 | in rate 
  Queueing strategy: fifo 
  5 minute input rate 25000 bits/sec, 28 packets/sec 
  5 minute output rate 25000 bits/sec, 28 packets/sec | 
CE_R27’s G2 Ethernet interface to PE_MXR03. This interface was receiving 28 pps
of extra traffic while sending nothing.
| 
CE_R27#sh int g2 | in rate 
  Queueing strategy: fifo 
  5 minute input rate 25000 bits/sec, 28 packets/sec 
  5 minute output rate 0 bits/sec, 0 packets/sec | 
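The CE-side numbers above can be cross-checked with a few lines of Python. This is only a sketch: the regular expression and the helper name `parse_rates` are my own, and the sample text is copied from the three outputs above.

```python
import re

# Matches the IOS "5 minute input/output rate" lines shown above.
RATE_RE = re.compile(
    r"5 minute (input|output) rate (\d+) bits/sec, (\d+) packets/sec"
)

def parse_rates(text):
    """Return {'input': pps, 'output': pps} for one interface."""
    return {m.group(1): int(m.group(3)) for m in RATE_RE.finditer(text)}

po50 = parse_rates("""
  5 minute input rate 50000 bits/sec, 55 packets/sec
  5 minute output rate 25000 bits/sec, 28 packets/sec
""")
g1 = parse_rates("""
  5 minute input rate 25000 bits/sec, 28 packets/sec
  5 minute output rate 25000 bits/sec, 28 packets/sec
""")
g2 = parse_rates("""
  5 minute input rate 25000 bits/sec, 28 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
""")

# The member links together should account for the port-channel input.
members_in = g1["input"] + g2["input"]
# Anything received beyond what was sent is unexplained reply traffic.
extra_in = members_in - po50["output"]

print(f"Po50 in/out: {po50['input']}/{po50['output']} pps")
print(f"G1+G2 input: {members_in} pps, unexplained input: {extra_in} pps")
```

The small gap between the Po50 input (55 pps) and the member sum (56 pps) is consistent with rounding in the 5-minute averages; the point is that roughly one extra reply arrived for every request sent.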
PE_MXR01’s Traffic Distribution
PE_MXR01’s AC interface to CE_R27 (Gig1). This AC interface’s input/output rates
were in line with the CE’s.
| 
admin@PE_MXR01> show interfaces ge-0/0/2.500 detail | grep pps 
     Input  packets:              2613229                   28 pps 
     Output packets:               696154                   28 pps | 
PE_MXR01’s MPLS interface to P_R03. The majority of the traffic went via P_R03
in the MPLS core.
| 
admin@PE_MXR01> show interfaces ge-0/0/1.44 detail | grep pps 
     Input  packets:               698531                   28 pps 
     Output packets:              2615863                   28 pps | 
PE_MXR01’s MPLS interface to P_R01. No traffic was sent to the P_R01 core router.
| 
admin@PE_MXR01> show interfaces ge-0/0/1.45 detail | grep pps 
     Input  packets:                    2                    0 pps 
     Output packets:                    0                    0 pps | 
PE_MXR02’s Traffic Distribution
PE_MXR02’s AC interface to CE_R28. The destination AC interface saw 28 pps in
both directions. This indicates that the end host received, and replied to,
only 28 pps of traffic and nothing more.
| 
admin@PE_MXR02> show interfaces ge-0/0/2.500 detail |grep pps 
     Input  packets:              2265249                   28 pps 
     Output packets:              2485439                   28 pps | 
PE_MXR02’s MPLS interface to P_R03. At this point in the network, the output
rate doubled. Since the AC interface facing CE_R28 carried only 28 pps in each
direction, this suggests that this PE duplicated the traffic.
| 
admin@PE_MXR02> show interfaces ge-0/0/1.46 detail |grep pps 
     Input  packets:              2513443                   28 pps 
     Output packets:              3069218                   56 pps | 
PE_MXR02’s MPLS interface to P_R04. No traffic was sent to P_R04.
| 
admin@PE_MXR02> show interfaces ge-0/0/1.47 detail |grep pps 
     Input  packets:                23965                    0 pps 
     Output packets:                    0                    0 pps | 
PE_MXR03’s Traffic Distribution
PE_MXR03’s AC interface to CE_R27 (Gig2). The redundant AC interface to CE_R27
saw only outbound traffic. This was the 28 pps of extra traffic sent by
PE_MXR02.
| 
admin@PE_MXR03> show interfaces ge-0/0/3.500 detail |grep pps 
     Input  packets:                    0                    0 pps 
     Output packets:               170575                   28 pps | 
PE_MXR03’s MPLS interface to P_R02. Again, this was the 28 pps of extra traffic
received from PE_MXR02.
| 
admin@PE_MXR03> show interfaces ge-0/0/1.48 detail |grep pps 
     Input  packets:              2412297                   28 pps 
     Output packets:                  493                    0 pps | 
PE_MXR03’s MPLS interface to P_R01. No traffic was sent to P_R01.
| 
admin@PE_MXR03> show interfaces ge-0/0/1.49 detail |grep pps 
     Input  packets:                    3                    0 pps 
     Output packets:                    0                    0 pps | 
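To make the per-PE counters easier to compare, here is a minimal Python sketch that looks for the device sending more traffic into the core than it received from its CE. The `PeRates` class and `core_surplus` helper are hypothetical names of mine, the rates are the pps values from the outputs above (core links summed per PE), and the check assumes known unicast, where each frame arriving on the AC should enter the core exactly once.

```python
from dataclasses import dataclass

@dataclass
class PeRates:
    ac_in: int     # pps received from the CE on the AC interface
    ac_out: int    # pps sent to the CE on the AC interface
    core_in: int   # pps received from the MPLS core (all core links)
    core_out: int  # pps sent into the MPLS core (all core links)

# Values taken from the "show interfaces ... | grep pps" outputs above.
RATES = {
    "PE_MXR01": PeRates(ac_in=28, ac_out=28, core_in=28, core_out=28),
    "PE_MXR02": PeRates(ac_in=28, ac_out=28, core_in=28, core_out=56),
    "PE_MXR03": PeRates(ac_in=0, ac_out=28, core_in=28, core_out=0),
}

def core_surplus(r):
    """pps sent into the core beyond what arrived on the AC."""
    return r.core_out - r.ac_in

# Any PE with a positive surplus is sending copies it never received.
suspects = [name for name, r in RATES.items() if core_surplus(r) > 0]

for name, r in RATES.items():
    print(f"{name}: AC in {r.ac_in} pps, core out {r.core_out} pps, "
          f"surplus {core_surplus(r)} pps")
print("Possible replication point:", suspects)
```

With these numbers only PE_MXR02 shows a surplus, which matches the observation that its core-facing output rate was double its AC input rate.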
Wireshark Traffic Analysis
Capture 1 – Link between PE_MXR01 and P_R03
Capture 1, Frame 2 was the ICMP request from CE_R27 to CE_R28 as seen between PE_MXR01 and P_R03. The request was sent to CE_R28’s MAC address, 00:0c:29:49:aa:8c.
The Type 2 route lookup for this destination MAC returned an all-zero ESI, which indicates that the destination was single-homed and that a Type 1 aliasing route lookup was therefore unnecessary. As the capture shows, the label stack (bottom label 299776, top label 328) was consistent with the route lookup.
| 
admin@PE_MXR01> show route table bgp.evpn.0 extensive 
bgp.evpn.0: 8 destinations, 8 routes (8 active, 0 holddown, 0 hidden) 
.. snip.. 
2:112.112.112.112:50::500::00:0c:29:49:aa:8c/304 MAC/IP (1 entry, 0 announced) 
        *BGP    Preference: 170/-101 
                Route Distinguisher: 112.112.112.112:50 
                Next hop type: Indirect, Next hop index: 0 
                Address: 0xb7a31f0 
                Next-hop reference count: 6 
                Source: 112.112.112.112 
                Protocol next hop: 112.112.112.112 
                Indirect next hop: 0x2 no-forward INH Session ID: 0x0 
                State: <Active Int Ext> 
                Local AS:  2345 Peer AS:  2345 
                Age: 4:59       Metric2: 1 
                Validation State: unverified 
                Task: BGP_2345.112.112.112.112 
                AS path: I 
                Communities: target:2345:50 
                Import Accepted 
                Route Label: 299776 
                ESI: 00:00:00:00:00:00:00:00:00:00 
                Localpref: 100 
                Router ID: 112.112.112.112 
                Secondary Tables: EVPN_CUSTOMER_G_ELAN_500.evpn.0 
                Indirect next hops: 1 
                        Protocol next hop: 112.112.112.112 Metric: 1 
                        Indirect next hop: 0x2 no-forward INH Session ID: 0x0 
                        Indirect path forwarding next hops: 1 
                                Next hop type: Router 
                                Next hop: 10.1.1.81 via ge-0/0/1.44 
                                Session Id: 0x0 
                        112.112.112.112/32 Originating RIB: inet.3 
                          Metric: 1                       Node path count: 1 
                          Forwarding nexthops: 1 
                                Nexthop: 10.1.1.81 via ge-0/0/1.44 | 
| 
admin@PE_MXR01> show route table bgp.evpn.0 
bgp.evpn.0: 8 destinations, 8 routes (8 active, 0 holddown, 0 hidden) 
+ = Active Route, - = Last Active, * = Both 
.. snip .. 
2:112.112.112.112:50::500::00:0c:29:49:aa:8c/304 MAC/IP 
                   *[BGP/170] 00:17:37, localpref 100, from 112.112.112.112 
                      AS path: I, validation-state: unverified 
                    > to 10.1.1.81 via ge-0/0/1.44, Push 328 | 
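The lookup logic for this single-homed case can be written down as a short Python sketch. The dictionary layout and the function name `expected_stack` are purely illustrative; the label values, MAC, and ESI come from the two route outputs above, and the two-label stack (transport label on top, EVPN service label at the bottom) is what the capture showed.

```python
ZERO_ESI = "00:00:00:00:00:00:00:00:00:00"

# Fields copied from PE_MXR01's Type 2 (MAC/IP) route for CE_R28's MAC.
mac_route = {
    "mac": "00:0c:29:49:aa:8c",
    "esi": ZERO_ESI,
    "route_label": 299776,          # "Route Label" in the extensive output
    "protocol_next_hop": "112.112.112.112",
}

# Transport label pushed towards the protocol next hop ("Push 328").
transport_label = 328

def expected_stack(route, top_label):
    """Return [top, bottom] for a known-unicast frame to a single-homed MAC."""
    if route["esi"] != ZERO_ESI:
        # A non-zero ESI means the MAC sits behind a multi-homed segment,
        # so the Type 1 aliasing routes would have to be consulted instead.
        raise ValueError("multi-homed destination: aliasing lookup needed")
    return [top_label, route["route_label"]]

stack = expected_stack(mac_route, transport_label)
observed = [328, 299776]  # Capture 1, Frame 2: top label, bottom label
print("expected:", stack, "observed:", observed, "match:", stack == observed)
```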
Capture 2 – Link between P_R03 and PE_MXR02
Capture 2, Frame 1 was the ICMP request from CE_R27 to CE_R28 as seen from P_R03 to PE_MXR02. The request was sent to CE_R28’s MAC address, 00:0c:29:49:aa:8c.
As P_R03 forwarded the packet to PE_MXR02, it popped the top label of 328. PE_MXR02 looked up the VPN label of 299776, associated it with the EVPN instance, and delivered the packet to CE_R28. At this point, everything looked normal.
| 
admin@PE_MXR02> show route table mpls.0 label 299776 
mpls.0: 52 destinations, 52 routes (52 active, 0 holddown, 0 hidden) 
+ = Active Route, - = Last Active, * = Both 
299776             *[EVPN/7] 00:18:26, routing-instance EVPN_CUSTOMER_G_ELAN_500, route-type Ingress-MAC, vlan-id 500 
                      to table EVPN_CUSTOMER_G_ELAN_500.evpn-mac.0 | 
Capture 2, Frame 2 was the ICMP reply from CE_R28 to CE_R27 as seen from PE_MXR02 to P_R03. According to Wireshark’s analysis (arrows), this was the reply corresponding to Frame 1.
The initial Type 2 route lookup indicated that the destination host was multi-homed (non-zero ESI). As a result, the labels from the Type 1 aliasing routes should have been used to load-balance the return traffic (label 300640 to PE_MXR01 and label 300512 to PE_MXR03).
| 
admin@PE_MXR02> show route table bgp.evpn.0 extensive 
bgp.evpn.0: 9 destinations, 9 routes (9 active, 0 holddown, 0 hidden) 
.. snip .. 
1:111.111.111.111:50::112233445566778899::0/192 AD/EVI (1 entry, 0 announced) 
        *BGP    Preference: 170/-101 
                Route Distinguisher: 111.111.111.111:50 
                Next hop type: Indirect, Next hop index: 0 
                Address: 0xb7a5950 
                Next-hop reference count: 10 
                Source: 111.111.111.111 
                Protocol next hop: 111.111.111.111 
                Indirect next hop: 0x2 no-forward INH Session ID: 0x0 
                State: <Active Int Ext> 
                Local AS:  2345 Peer AS:  2345 
                Age: 20:54:43   Metric2: 1 
                Validation State: unverified 
                Task: BGP_2345.111.111.111.111 
                AS path: I 
                Communities: target:2345:50 
                Import Accepted 
                Route Label: 300640 
                Localpref: 100 
                Router ID: 111.111.111.111 
                Secondary Tables: EVPN_CUSTOMER_G_ELAN_500.evpn.0 
                Indirect next hops: 1 
                        Protocol next hop: 111.111.111.111 Metric: 1 
                        Indirect next hop: 0x2 no-forward INH Session ID: 0x0 
                        Indirect path forwarding next hops: 1 
                                Next hop type: Router 
                                Next hop: 10.1.1.89 via ge-0/0/1.46 
                                Session Id: 0x0 
                        111.111.111.111/32 Originating RIB: inet.3 
                          Metric: 1                       Node path count: 1 
                          Forwarding nexthops: 1 
                                Nexthop: 10.1.1.89 via ge-0/0/1.46 
1:113.113.113.113:50::112233445566778899::0/192 AD/EVI (1 entry, 0 announced) 
        *BGP    Preference: 170/-101 
                Route Distinguisher: 113.113.113.113:50 
                Next hop type: Indirect, Next hop index: 0 
                Address: 0xb7a7630 
                Next-hop reference count: 8 
                Source: 113.113.113.113 
                Protocol next hop: 113.113.113.113 
                Indirect next hop: 0x2 no-forward INH Session ID: 0x0 
                State: <Active Int Ext> 
                Local AS:  2345 Peer AS:  2345 
                Age: 21:12:15   Metric2: 1 
                Validation State: unverified 
                Task: BGP_2345.113.113.113.113 
                AS path: I 
                Communities: target:2345:50 
                Import Accepted 
                Route Label: 300512 
                Localpref: 100 
                Router ID: 113.113.113.113 
                Secondary Tables: EVPN_CUSTOMER_G_ELAN_500.evpn.0 
                Indirect next hops: 1 
                        Protocol next hop: 113.113.113.113 Metric: 1 
                        Indirect next hop: 0x2 no-forward INH Session ID: 0x0 
                        Indirect path forwarding next hops: 1 
                                Next hop type: Router 
                                Next hop: 10.1.1.89 via ge-0/0/1.46 
                                Session Id: 0x0 
                        113.113.113.113/32 Originating RIB: inet.3 
                          Metric: 1                       Node path count: 1 
                          Forwarding nexthops: 1 
                                Nexthop: 10.1.1.89 via ge-0/0/1.46 
2:111.111.111.111:50::500::00:1e:e5:c8:0f:f1/304 MAC/IP (1 entry, 0 announced) 
        *BGP    Preference: 170/-101 
                Route Distinguisher: 111.111.111.111:50 
                Next hop type: Indirect, Next hop index: 0 
                Address: 0xb7a5950 
                Next-hop reference count: 10 
                Source: 111.111.111.111 
                Protocol next hop: 111.111.111.111 
                Indirect next hop: 0x2 no-forward INH Session ID: 0x0 
                State: <Active Int Ext> 
                Local AS:  2345 Peer AS:  2345 
                Age: 3  Metric2: 1 
                Validation State: unverified 
                Task: BGP_2345.111.111.111.111 
                AS path: I 
                Communities: target:2345:50 
                Import Accepted 
                Route Label: 300640 
                ESI: 00:11:22:33:44:55:66:77:88:99 
                Localpref: 100 
                Router ID: 111.111.111.111 
                Secondary Tables: EVPN_CUSTOMER_G_ELAN_500.evpn.0 
                Indirect next hops: 1 
                        Protocol next hop: 111.111.111.111 Metric: 1 
                        Indirect next hop: 0x2 no-forward INH Session ID: 0x0 
                        Indirect path forwarding next hops: 1 
                                Next hop type: Router 
                                Next hop: 10.1.1.89 via ge-0/0/1.46 
                                Session Id: 0x0 
                        111.111.111.111/32 Originating RIB: inet.3 
                          Metric: 1                       Node path count: 1 
                          Forwarding nexthops: 1 
                                Nexthop: 10.1.1.89 via ge-0/0/1.46 | 
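For comparison, this is roughly what I expected aliasing to do with the reply traffic, written as a small Python sketch. It is not how Junos implements it: the CRC32 hash, the flow-key format, and the name `pick_remote_pe` are all assumptions made for illustration. Only the two aliasing labels and the PE addresses come from the Type 1 routes above.

```python
import zlib

# Aliasing labels from the two Type 1 (AD per EVI) routes for
# ESI 00:11:22:33:44:55:66:77:88:99, keyed by the advertising PE.
ALIASING_LABELS = {
    "111.111.111.111": 300640,  # PE_MXR01
    "113.113.113.113": 300512,  # PE_MXR03
}

def pick_remote_pe(flow_key):
    """Pick exactly one remote PE (and its aliasing label) per flow.

    The hash here is a stand-in; real platforms use their own
    load-balancing hash over header fields.
    """
    pes = sorted(ALIASING_LABELS)
    index = zlib.crc32(flow_key.encode()) % len(pes)
    pe = pes[index]
    return pe, ALIASING_LABELS[pe]

# Hypothetical flow key: source MAC and destination MAC of the ICMP reply.
flow = "00:0c:29:49:aa:8c>00:1e:e5:c8:0f:f1"
pe, label = pick_remote_pe(flow)
print(f"flow {flow} -> PE {pe}, service label {label}")
```

The important property is that each frame is sent once, towards one of the two PEs, carrying that PE's aliasing label as the service label. Different flows may land on different PEs, but no frame is sent to both.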
However, the Wireshark capture showed PE_MXR02’s frame using a label stack
based on PE_MXR03’s Type 3 Ingress-IM route label (bottom 300528, top 330).
Neither the Type 2 label nor the Type 1 aliasing label was used. This was not
in line with the expected behavior for EVPN aliasing.
| 
admin@PE_MXR02> show route table bgp.evpn.0 extensive 
bgp.evpn.0: 9 destinations, 9 routes (9 active, 0 holddown, 0 hidden) 
.. snip .. 
3:113.113.113.113:50::500::113.113.113.113/248 IM (1 entry, 0 announced) 
        *BGP    Preference: 170/-101 
                Route Distinguisher: 113.113.113.113:50 
                PMSI: Flags 0x0: Label 300528: Type INGRESS-REPLICATION 113.113.113.113 
                Next hop type: Indirect, Next hop index: 0 
                Address: 0xb7a6af0 
                Next-hop reference count: 8 
                Source: 113.113.113.113 
                Protocol next hop: 113.113.113.113 
                Indirect next hop: 0x2 no-forward INH Session ID: 0x0 
                State: <Active Int Ext> 
                Local AS:  2345 Peer AS:  2345 
                Age: 34:37      Metric2: 1 
                Validation State: unverified 
                Task: BGP_2345.113.113.113.113 
                AS path: I 
                Communities: target:2345:50 
                Import Accepted 
                Localpref: 100 
                Router ID: 113.113.113.113 
                Secondary Tables: EVPN_CUSTOMER_G_ELAN_500.evpn.0 
                Indirect next hops: 1 
                        Protocol next hop: 113.113.113.113 Metric: 1 
                        Indirect next hop: 0x2 no-forward INH Session ID: 0x0 
                        Indirect path forwarding next hops: 1 
                                Next hop type: Router 
                                Next hop: 10.1.1.89 via ge-0/0/1.46 
                                Session Id: 0x0 
                        113.113.113.113/32 Originating RIB: inet.3 
                          Metric: 1                       Node path count: 1 
                          Forwarding nexthops: 1 
                                Nexthop: 10.1.1.89 via ge-0/0/1.46 | 
| 
admin@PE_MXR02> show route table bgp.evpn.0 
bgp.evpn.0: 9 destinations, 9 routes (9 active, 0 holddown, 0 hidden) 
+ = Active Route, - = Last Active, * = Both 
.. snip .. 
3:113.113.113.113:50::500::113.113.113.113/248 IM 
                   *[BGP/170] 00:32:50, localpref 100, from 113.113.113.113 
                      AS path: I, validation-state: unverified 
                    > to 10.1.1.89 via ge-0/0/1.46, Push 330 | 
Furthermore, the EVPN instance output from PE_MXR02 confirmed that both the
Type 1 and Type 2 labels had been received from their respective PEs. Why this
PE ignored these labels when forwarding traffic is unknown.
| 
admin@PE_MXR02> show evpn instance extensive 
Instance: EVPN_CUSTOMER_G_ELAN_500 
  Route Distinguisher: 112.112.112.112:50 
  Per-instance MAC route label: 299776 
  MAC database status                     Local  Remote 
    MAC advertisements:                       1       1 
    MAC+IP advertisements:                    0       0 
    Default gateway MAC advertisements:       0       0 
  Number of local interfaces: 1 (1 up) 
    Interface name  ESI                            Mode             Status     AC-Role 
    ge-0/0/2.500    00:00:00:00:00:00:00:00:00:00  single-homed     Up         Root 
  Number of IRB interfaces: 0 (0 up) 
  Number of bridge domains: 2 
    VLAN  Domain ID   Intfs / up    IRB intf   Mode             MAC sync  IM route label  SG sync  IM core nexthop 
    500                  1    1                Extended         Enabled   299840          Disabled 
    501                  1    1                Extended         Enabled   299856          Disabled 
  Number of neighbors: 2 
    Address               MAC    MAC+IP        AD        IM        ES Leaf-label 
    111.111.111.111         1         0         2         2         0 
    113.113.113.113         0         0         2         2         0 
  Number of ethernet segments: 1 
    ESI: 00:11:22:33:44:55:66:77:88:99 
      Status: Unresolved 
      Number of remote PEs connected: 2 
        Remote PE        MAC label  Aliasing label  Mode 
        113.113.113.113  300512     300512          all-active 
        111.111.111.111  300640     300640          all-active 
Instance: __default_evpn__ 
  Route Distinguisher: 112.112.112.112:0 
  Number of bridge domains: 0 
  Number of neighbors: 0 | 
Capture 2, Frame 3 was not part of the original ICMP exchange. This reply appeared to have been replicated by PE_MXR02.
The difference between this frame and the legitimate reply was that its label stack again used Ingress-IM labels (bottom 303216, top 329), this time to forward towards PE_MXR01.
Although PE_MXR02 appeared to “load-balance” the reply traffic (i.e., Frame 2 going towards PE_MXR03 and Frame 3 towards PE_MXR01), it clearly did not do so using the aliasing method. It simply replicated the traffic and sent the copy via another path. The strange thing was that the PE ignored the valid labels it had received (from the Type 2 MAC and Type 1 EAD per-EVI routes), even though the ICMP test traffic was known unicast.
Perhaps PE_MXR02 misclassified this frame (as well as Frame 2) as BUM traffic
and therefore used the IM labels for ingress replication? I’m unsure why this
would happen. If the frame had been classified as BUM, I would expect this PE
to have also replicated it to the other host connected off PE_MXR03 (i.e.,
CE_R29). However, this was not seen in any of the captures.
| 
admin@PE_MXR02> show route table bgp.evpn.0 extensive 
bgp.evpn.0: 9 destinations, 9 routes (9 active, 0 holddown, 0 hidden) 
.. snip .. 
3:111.111.111.111:50::500::111.111.111.111/248 IM (1 entry, 0 announced) 
        *BGP    Preference: 170/-101 
                Route Distinguisher: 111.111.111.111:50 
                PMSI: Flags 0x0: Label 303216: Type INGRESS-REPLICATION 111.111.111.111 
                Next hop type: Indirect, Next hop index: 0 
                Address: 0xb7a6790 
                Next-hop reference count: 10 
                Source: 111.111.111.111 
                Protocol next hop: 111.111.111.111 
                Indirect next hop: 0x2 no-forward INH Session ID: 0x0 
                State: <Active Int Ext> 
                Local AS:  2345 Peer AS:  2345 
                Age: 34:34      Metric2: 1 
                Validation State: unverified 
                Task: BGP_2345.111.111.111.111 
                AS path: I 
                Communities: target:2345:50 
                Import Accepted 
                Localpref: 100 
                Router ID: 111.111.111.111 
                Secondary Tables: EVPN_CUSTOMER_G_ELAN_500.evpn.0 
                Indirect next hops: 1 
                        Protocol next hop: 111.111.111.111 Metric: 1 
                        Indirect next hop: 0x2 no-forward INH Session ID: 0x0 
                        Indirect path forwarding next hops: 1 
                                Next hop type: Router 
                                Next hop: 10.1.1.89 via ge-0/0/1.46 
                                Session Id: 0x0 
                        111.111.111.111/32 Originating RIB: inet.3 
                          Metric: 1                       Node path count: 1 
                          Forwarding nexthops: 1 
                                Nexthop: 10.1.1.89 via ge-0/0/1.46 | 
| 
admin@PE_MXR02> show route table bgp.evpn.0 
bgp.evpn.0: 9 destinations, 9 routes (9 active, 0 holddown, 0 hidden) 
+ = Active Route, - = Last Active, * = Both 
.. snip .. 
3:111.111.111.111:50::500::111.111.111.111/248 IM 
                   *[BGP/170] 00:32:47, localpref 100, from 111.111.111.111 
                      AS path: I, validation-state: unverified 
                    > to 10.1.1.89 via ge-0/0/1.46, Push 329 | 
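To summarize the label comparison, the following Python sketch maps each service (bottom) label seen in Capture 2 back to the route that advertised it. The table layout and the `classify` helper are hypothetical; the label values are the ones from the PE_MXR02 outputs and the captures described above.

```python
# Service labels PE_MXR02 learned from the remote PEs, by route type.
ADVERTISED = {
    300640: ("111.111.111.111", "Type 1 aliasing / Type 2 MAC"),
    300512: ("113.113.113.113", "Type 1 aliasing / MAC label"),
    303216: ("111.111.111.111", "Type 3 IM (ingress replication)"),
    300528: ("113.113.113.113", "Type 3 IM (ingress replication)"),
}

# Bottom label of each ICMP reply frame sent by PE_MXR02.
OBSERVED = {
    "Capture 2, Frame 2": 300528,
    "Capture 2, Frame 3": 303216,
}

def classify(label):
    """Return (advertising PE, route type) for a service label."""
    return ADVERTISED.get(label, (None, "unknown"))

for frame, label in OBSERVED.items():
    pe, kind = classify(label)
    print(f"{frame}: label {label} -> {kind} from {pe}")

# True if at least one reply used a unicast (aliasing or MAC) label.
used_unicast_label = any(
    classify(label)[1].startswith(("Type 1", "Type 2"))
    for label in OBSERVED.values()
)
print("Any reply sent with a unicast label:", used_unicast_label)
```

Both reply frames resolve to Type 3 IM labels, one per remote PE, which is the pattern I would expect for ingress-replicated BUM traffic rather than for aliased known unicast.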
Conclusion
From a 40,000-foot view, the traffic appeared to be load balanced as expected.
However, once I examined the traffic more closely, this turned out not to be
the case. At this point I’m unsure why this occurred. From my understanding of
all-active operation, this was not normal behavior. It could well be a simple
misconfiguration on my part, but I couldn’t find much information on the
Internet about this particular issue. I will keep searching for the answer,
though…









