Ethernet Routing for Large Scale Distributed Data Center Fabrics


Slide 0

Ethernet Routing for Large Scale Distributed Data Center Fabrics
Dave Allan, Janos Farkas, Panagiotis Saltsidis, Jeff Tantsura
Ericsson


Slide 1

Introduction
- This is a concept and architecture for a distributed cloud
  - One purpose is to illustrate the capabilities and the scalability of “state of the art” Ethernet
- The components of the proposed architecture are specified in standards that are either complete or in progress
- The architecture is built on:
  - IEEE Shortest Path Bridging MAC mode (SPBM), as standardized in IEEE 802.1aq-2012
  - IETF Ethernet Virtual Private Network (EVPN) as extended for SPBM interworking, which is being standardized in draft-ietf-l2vpn-spbm-evp


Slide 2

A Bit of History
Key antecedents to SPB:
- Provider Backbone Bridges (PBB) [802.1ah]
  - Full MAC-in-MAC encapsulation
  - I-SID, which is a 24-bit L2 Virtual Network ID
- PBB Traffic Engineering (PBB-TE) [802.1Qay]
  - Enabled external control of bridge forwarding with complete route freedom, i.e. Software Defined Networking (SDN) with geographical separation


Slide 3

What is Shortest Path Bridging (802.1aq SPB)?
- SPB is a routed Ethernet solution that has been specified by the IEEE: link state for bridges
  - IS-IS aspects are documented in IETF RFC 6329
- All control functionality has been collapsed into a single protocol (IS-IS)
  - Unicast and multicast tree construction, VLAN registration etc.
- Two SPB modes are defined:
  - SPBV (SPB VID): VID based; applicable to all types of VLANs; flooding and learning; plug & play
  - SPBM (SPB MAC): MAC based; designed to leverage the scalability provided by PBB MAC-in-MAC; no flooding and learning; managed environments


Slide 4

What is important to understand about SPBM?
- It is compute based: computation instead of signaling
- It uses multiple shortest path trees instead of shared spanning trees
- Unicast and multicast frames follow the same path between any two points in a given VLAN
  - So there is no frame misordering, and you get meaningful OAM support
- It uses loop mitigation AND loop prevention
- It uses edge based load spreading
- It is backwards compatible and consistent with the full body of Ethernet standardization (IEEE 802.1)
  - CFM, EVB, lossless Ethernet etc.
- It implements the full MEF 12.1 set of service constructs
  - E-LINE, E-LAN, E-TREE


Slide 5

Problems Already Solved
- Ability to utilize more richly connected topologies
  - SPBM supports up to 16 way multi-pathing and is extensible to go further
  - Each multipath instance is a full mesh of the network
- Large scale virtualization
  - PBB data plane scales to billions of virtual networks (24-bit I-SID over 12-bit B-VID: 2^24 * 2^12)
- Operational simplicity
  - All information is contained in a single control protocol: IS-IS
  - Single touch adds, moves and changes
- Computed multicast
- Reduced CP messaging combined with computation driven convergence of unicast & multicast is a virtuous circle
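The identifier arithmetic on this slide can be checked directly. The sketch below only multiplies the raw sizes of the two identifier spaces named on the slide; it does not subtract reserved I-SID or VID values, so the figures are upper bounds rather than usable counts.

```python
# Raw identifier-space sizes for the PBB data plane (no reserved values removed).
I_SID_BITS = 24  # width of the I-SID, as stated on the slide
B_VID_BITS = 12  # width of the B-VID, as stated on the slide

i_sid_space = 2 ** I_SID_BITS
b_vid_space = 2 ** B_VID_BITS
combined_space = i_sid_space * b_vid_space  # 2^24 * 2^12 = 2^36

print(f"I-SID values:  {i_sid_space:,}")
print(f"B-VID values:  {b_vid_space:,}")
print(f"I-SID x B-VID: {combined_space:,}")
```

The I-SID alone gives roughly 16.7 million service instances; it is the product with the B-VID space that reaches the tens of billions.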


Slide 6

Solution Objectives
- Ubiquity and reach
  - Interconnect different flavors of “Ethernet” across the dominant WAN technology (MPLS)
- Preserve operational simplicity
  - Preserve “single touch” add/move/delete automation
  - Minimal configuration
  - Alignment of BGP and IS-IS control plane paradigms
- Break the scaling barriers of a single routing domain
  - Combined SPBM-EVPN allows much larger topologies
- Domain isolation to “divide and conquer” state
  - Operate each SPBM domain on a “need to know” basis
  - Non-relevant information is excluded from routing advertisement
  - Minimize Filtering Database (FDB) state


Slide 7

Solution Overview
There are a number of aspects to the solution:
- Topology hiding and abstraction
- “Need to know” filtering
- Independence of local multi-pathing
- Multicast summarization
[Figure: DCN1 and DCN2, each an SPBM network, interconnected by EVPN over MPLS. Labels: B-VID1, B-VID2, I-SID1, LSP]


Slide 8

SPBM and EVPN
- Shortest Path Trees (SPTs) are the basic connectivity construct for SPBM
  - They are edge rooted shortest path trees, and much finer grained than shared spanning trees, but they are still TREEs
  - This constrains the set of network interconnect mechanisms
- The set of fine grained MAC based trees is aggregated into Backbone VLANs (B-VLANs), where each B-VLAN delineates full mesh connectivity
- EVPN is IP/MPLS based, and uses BGP to sort out mirroring of attached Ethernet networks
  - But once in EVPN we can map SPBM connectivity to any paradigm
- The trick is interconnecting them


Slide 9

Mapping between SPBM & EVPN
- Trees have ROOTs
  - This means interworking needs to pin waypoints, which then permit the required design strategies to work
- For SPBM-EVPN interworking, we make the interworking function on the EVPN-PE into a “pinned waypoint”
  - This has the desirable effect of keeping “churn” in subtending SPBM networks out of BGP
- An EVPN-PE that is a “pinned waypoint” for a set of VLANs is known as a “designated forwarder”


Slide 10

Designated Forwarder
- The set of EVPN-PEs attached to an SPBM network self-elect which subset of VLANs each will act as Designated Forwarder (DF) for
  - This is based on local B-VID
- The DF is then responsible for relaying all required state associated with the subset of VLANs it owns between the two control planes, and for interworking data plane traffic between the SPBM and EVPN networks
  - The state is simply a list of I-SID/B-MAC tuples
  - No topology information is leaked: the DF condenses all topology behind it into a single node representation in the peer network
- The DF also “re-roots” all (S,G) multicast trees that transit it by “blindly” rewriting “S” (Source)
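The slide says only that DF self-election is based on the local B-VID; it does not give the rule. As a sketch under that caveat, the code below assumes a simple modulo rule over the sorted list of attached PEs, so that every PE reaches the same answer without signaling. The function names and the modulo rule are illustrative assumptions, not the procedure from the draft.

```python
def elect_df(b_vid: int, attached_pes: list[str]) -> str:
    """Pick the DF for one B-VID.

    Assumed rule (not from the slides): sort the PE identifiers so every PE
    sees the same order, then index by B-VID modulo the number of PEs.
    """
    ordered = sorted(attached_pes)
    return ordered[b_vid % len(ordered)]


def b_vids_owned(pe: str, b_vids: list[int], attached_pes: list[str]) -> list[int]:
    """Return the B-VIDs for which `pe` elects itself DF."""
    return [v for v in b_vids if elect_df(v, attached_pes) == pe]


# Two hypothetical PEs attached to the same SPBM network.
pes = ["PE2", "PE1"]
for vid in (10, 11, 12):
    print(vid, elect_df(vid, pes))
```

Because each PE runs the same deterministic function on the same inputs, the split of B-VIDs between PEs needs no extra protocol exchange, which is the property the slide relies on.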


Slide 11

DF Control Plane Interworking
- The DF has a Control Plane Interworking function
  - It proxies B-MAC/I-SID announcements from ISIS-SPB into BGP for the set of I-SIDs it is DF for
  - It will only proxy B-MAC/I-SID announcements from EVPN into ISIS-SPB if there is already locally registered interest in the I-SID
- BGP has the whole picture, IS-IS is “need to know”
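The two proxy rules on this slide amount to a pair of filters. The sketch below models them with hypothetical types (`Announcement`, `DesignatedForwarder`); it ignores the actual BGP and IS-IS encodings. It also assumes that the EVPN to IS-IS direction is limited to I-SIDs the PE is DF for, which the slide implies but does not state.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Announcement:
    """A B-MAC/I-SID tuple as exchanged between the two control planes."""
    b_mac: str
    i_sid: int


@dataclass
class DesignatedForwarder:
    df_i_sids: set[int]  # I-SIDs this PE is DF for
    local_interest: set[int] = field(default_factory=set)  # I-SIDs registered in the local SPBM domain

    def isis_to_bgp(self, anns: list[Announcement]) -> list[Announcement]:
        # Proxy into BGP everything for the I-SIDs this PE is DF for.
        return [a for a in anns if a.i_sid in self.df_i_sids]

    def bgp_to_isis(self, anns: list[Announcement]) -> list[Announcement]:
        # Proxy into IS-IS only where local interest is already registered
        # (the DF check here is an assumption of this sketch).
        return [a for a in anns
                if a.i_sid in self.df_i_sids and a.i_sid in self.local_interest]


df = DesignatedForwarder(df_i_sids={100, 200}, local_interest={100})
local = [Announcement("00:aa", 100), Announcement("00:bb", 300)]
remote = [Announcement("00:cc", 100), Announcement("00:dd", 200)]
print(df.isis_to_bgp(local))
print(df.bgp_to_isis(remote))
```

The asymmetry is the point: BGP receives everything the DF owns, while the SPBM domain only learns remote state for services it already participates in.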


Slide 12

EVPN-SPBM data plane
[Figure: VM1 and VM2 attached to DCN1 and DCN2, each an SPBM network, interconnected by EVPN over MPLS. Labels: B-VID1, B-VID2, I-SID1, LSP]


Slide 13

DF Data Plane Procedures
- Islands are decoupled by keeping B-Tags out of the EVPN core
  - What the core sees is MPLS encapsulated B-MACs and I-SIDs
  - B-Tags are stripped by PEs on ingress to EVPN
  - B-Tags are locally added by PEs on egress from EVPN
- So the core is independent of however multi-pathing is implemented in each subtending island, or whether a PBBN exists at all (e.g. PBB-PEs)
- Multicast MACs are aggregated at SPBM ingress
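The strip-on-ingress, add-on-egress handling of the B-Tag can be illustrated with a toy frame model. `PbbFrame`, the two functions, and the per-PE I-SID to B-VID map are all hypothetical names for this sketch; MPLS encapsulation and the multicast source rewrite are not modelled.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PbbFrame:
    b_da: str             # backbone destination MAC
    b_sa: str             # backbone source MAC
    i_sid: int            # service instance identifier
    b_vid: Optional[int]  # B-Tag VID; None while the frame crosses the EVPN core


def ingress_to_evpn(frame: PbbFrame) -> PbbFrame:
    """PE on the sending island: strip the B-Tag before the core."""
    return replace(frame, b_vid=None)


def egress_from_evpn(frame: PbbFrame, local_b_vid_for_i_sid: dict[int, int]) -> PbbFrame:
    """PE on the receiving island: add a B-Tag chosen from local configuration."""
    return replace(frame, b_vid=local_b_vid_for_i_sid[frame.i_sid])


sent = PbbFrame(b_da="00:22", b_sa="00:11", i_sid=5001, b_vid=1)
in_core = ingress_to_evpn(sent)
received = egress_from_evpn(in_core, {5001: 2})
print(in_core)
print(received)
```

The B-MACs and I-SID pass through unchanged while the B-VID differs on each side, so each island is free to choose its own multi-pathing assignment.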


Slide 14

Add Multicast in the MPLS Core
- The objective is to get away from the inefficiencies of edge based replication in the PEs while minimizing the multicast state impact in the core
- VLAN emulation can use lots of Multicast Distribution Trees (MDTs)
  - These can be aggregated into shared MDTs between larger sites
- Shared MDTs can substantially reduce the amount of multicast state needed in the MPLS core to service large sites
- Smaller sites are more likely to benefit from service specific MDTs
- So we will support both


Slide 15

Shared Multicast Distribution Trees
- The issue is how to resolve VLANs to shared trees without getting into resolution servers or provisioning
- One way to do this is to algorithmically “name” the tree (*,G) or (S,G), where G is a sorted list of leaf node IDs
  - Via BGP, every PE has sufficient information to construct the names of the MDTs
- mLDP permits arbitrary opaque identifiers for MDTs to be used as a multicast FEC, so the algorithmically constructed names can be used directly in signaling
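A minimal sketch of the algorithmic naming idea: every PE derives the same tree name from the same leaf set, regardless of the order in which it learned the leaves. The string layout and the `mdt_name` function are assumptions made for illustration; the slides do not specify the encoding of the opaque identifier carried in mLDP.

```python
from typing import Iterable, Optional


def mdt_name(leaves: Iterable[str], source: Optional[str] = None) -> bytes:
    """Build a deterministic name for a shared MDT.

    G is the sorted, de-duplicated list of leaf node IDs. With no source the
    name is of the (*,G) form, otherwise of the (S,G) form. The byte layout
    is hypothetical; any canonical ordering shared by all PEs would do.
    """
    group = "+".join(sorted(set(leaves)))
    root = "*" if source is None else source
    return f"({root},{group})".encode("ascii")


# Three PEs compute the name independently, each listing the leaves in its own order.
print(mdt_name(["PE3", "PE2", "PE5"]))
print(mdt_name(["PE2", "PE5", "PE3"]))
print(mdt_name(["PE5", "PE3", "PE2"], source="PE3"))
```

Since the name is a pure function of the membership set, no resolution server or provisioning step is needed to agree on which tree a set of VLANs should share.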


Slide 16

Example
[Figure: three 802.1aq SPBM networks (IS-IS) and one 802.1ad PBN (RSTP), with attached CEs, connected through PE1 to PE7 to an EVPN + mLDP core running BGP; a PBB label also appears in the diagram. PE2, PE3 and PE5 are marked DF]
PE2, PE3 and PE5 are DFs for a common set of VLANs


Slide 17

Example
[Figure: the example topology (three 802.1aq SPBM networks, one 802.1ad PBN, PE1 to PE7, EVPN + mLDP core) with PE2, PE3 and PE5 marked DF and mLDP shown in the core]


Slide 18

Example
[Figure: the example topology (three 802.1aq SPBM networks, one 802.1ad PBN, PE1 to PE7, EVPN + mLDP core) with PE2, PE3 and PE5 marked DF and mLDP shown in the core]
PE3: “I am PE3, and I have 10 VLANs that need (*,G) multicast to myself and PEs 2 and 5, so the FEC is PE2+PE3+PE5”


Slide 19

Example
[Figure: the example topology (three 802.1aq SPBM networks, one 802.1ad PBN, PE1 to PE7, EVPN + mLDP core) with PE2, PE3 and PE5 marked DF and mLDP shown in the core]
PE2: “I am PE2, and I have 10 VLANs that need (*,G) multicast to myself and PEs 3 and 5, so the FEC is PE2+PE3+PE5”


Slide 20

Example
[Figure: the example topology (three 802.1aq SPBM networks, one 802.1ad PBN, PE1 to PE7, EVPN + mLDP core) with PE2, PE3 and PE5 marked DF]
PE5: “I am PE5, and I have 10 VLANs that need (*,G) multicast to myself and PEs 2 and 3, so the FEC is PE2+PE3+PE5”


Slide 21

Example
[Figure: the example topology (three 802.1aq SPBM networks, one 802.1ad PBN, PE1 to PE7, EVPN + mLDP core) showing the resulting MDT between the DFs PE2, PE3 and PE5]
Resulting MDT


Slide 22

What does this get me?
- mLDP, like PIM, is rather chatty, and based on transactional convergence
- If I had 10000 VLANs spread across the 3 sites in the example, I WOULD have 10000 (*,G) or 30000 (S,G) trees
- For 3 dual homed sites, there are ONLY 8 possible (*,G) and 24 possible (S,G) shared trees
- It becomes practical to simply “nail them up” and modify the membership set of each tree at the ingress
- The result is both scalable and stable
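The tree counts on this slide can be reproduced by enumeration. The sketch below reflects one reading of the numbers: every VLAN is present at all three sites, each site is dual homed to two PEs, and exactly one PE per site acts as DF for a given VLAN. The PE names are hypothetical and deliberately do not reuse the numbering from the example figure.

```python
from itertools import product

# Hypothetical PE names: three sites, each dual homed to two PEs.
SITES = {
    "site_a": ("A1", "A2"),
    "site_b": ("B1", "B2"),
    "site_c": ("C1", "C2"),
}


def shared_tree_counts(sites: dict[str, tuple[str, ...]]) -> tuple[int, int]:
    """Count the distinct shared (*,G) and (S,G) trees that can ever exist."""
    # One DF per site: each choice of one PE per site is a possible leaf set G.
    leaf_sets = {tuple(sorted(choice)) for choice in product(*sites.values())}
    # An (S,G) tree is a leaf set plus one of its members acting as source.
    sg_trees = {(src, leaves) for leaves in leaf_sets for src in leaves}
    return len(leaf_sets), len(sg_trees)


def per_vlan_tree_counts(vlans: int, site_count: int) -> tuple[int, int]:
    """Trees needed without sharing: one (*,G) per VLAN, or one (S,G) per VLAN per site."""
    return vlans, vlans * site_count


star_g, s_g = shared_tree_counts(SITES)
print("shared trees:", star_g, s_g)
print("per-VLAN trees:", per_vlan_tree_counts(10_000, len(SITES)))
```

Under these assumptions the shared-tree count depends only on the number of sites and PEs, not on the number of VLANs, which is why pre-building every tree becomes practical.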


Slide 23

Key Insights & Next steps
- Assumption of rich mesh hidden from SPBM in the first place
  - Exposing a large, highly regular Clos topology in link state simply burdens the control plane
  - Some topological summarization is required in the first place to usefully scale individual sites to 100,000+ servers with existing technology
- There is a lot that can be done to engineer an SPBM network, both with the vanilla standard and with techniques currently under research
- Deterministic aggregated trees lend themselves to “demand engineering” with automation
- Work needs to be done to seamlessly extend this into the EVPN realm


Slide 24

Summary
- The totality, completeness and self-consistency of IEEE data center networking solutions is impressive
  - From OAM to Edge Virtual Bridging
- SPB permits this to scale to orders of magnitude beyond what Ethernet was previously capable of
- Adding EVPN as a form of “multi-area” solution adds orders of magnitude beyond what SPB alone can do

