B4: Experience with a Globally-Deployed Software Defined WAN

Sushant Jain, Alok Kumar, Subhasree Mandal, Joon Ong, Leon Poutievski, Arjun Singh, Subbaiah Venkata, Jim Wanderer, Junlan Zhou, Min Zhu, Jonathan Zolla, Urs Hölzle, Stephen Stuart and Amin Vahdat
Google, Inc.

ABSTRACT

We present the design, implementation, and evaluation of B4, a private WAN connecting Google's data centers across the planet. B4 has a number of unique characteristics: i) massive bandwidth requirements deployed to a modest number of sites, ii) elastic traffic demand that seeks to maximize average bandwidth, and iii) full control over the edge servers and network, which enables rate limiting and demand measurement at the edge. These characteristics led to a Software Defined Networking architecture using OpenFlow to control relatively simple switches built from merchant silicon. B4's centralized traffic engineering service drives links to near 100% utilization, while splitting application flows among multiple paths to balance capacity against application priority/demands. We describe experience with three years of B4 production deployment, lessons learned, and areas for future work.

Such overprovisioning delivers admirable reliability at the very real costs of 2-3x bandwidth over-provisioning and high-end routing gear. We were faced with these overheads for building a WAN connecting multiple data centers with substantial bandwidth requirements. However, Google's data center WAN exhibits a number of unique characteristics. First, we control the applications, servers, and the LANs all the way to the edge of the network. Second, our most bandwidth-intensive applications perform large-scale data copies from one site to another. These applications benefit most from high levels of average bandwidth and can adapt their transmission rate based on available capacity. They could similarly defer to higher priority interactive applications during periods of failure or resource constraint. Third, we anticipated no more than a few dozen data center deployments, making central control of bandwidth feasible.

We exploited these properties to adopt a software defined networking (SDN) architecture for our data center WAN interconnect. We were most motivated by deploying routing and traffic engineering protocols customized to our unique requirements. Our design centers around: i) accepting failures
as inevitable and common events, whose effects should be exposed to end applications, and ii) switch hardware that exports a simple interface to program forwarding table entries under central control. Network protocols could then run on servers housing a variety of standard and custom protocols. Our hope was that deploying novel routing, scheduling, monitoring, and management functionality and protocols would be both simpler and result in a more efficient network.

We present our experience deploying Google's WAN, B4, using Software Defined Networking (SDN) principles and OpenFlow [?] to manage individual switches. In particular, we discuss how we simultaneously support standard routing protocols and centralized Traffic Engineering (TE) as our first SDN application. With TE, we: i) leverage control at our network edge to adjudicate among competing demands during resource constraint, ii) use multipath forwarding/tunneling to leverage available network capacity according to application priority, and iii) dynamically reallocate bandwidth in the face of link/switch failures or shifting application demands. These features allow many B4 links to run at near 100% utilization and all links to average 70% utilization over long time periods, corresponding to 2-3x efficiency improvements relative to standard practice.

B4 has been in deployment for three years, now carries more traffic than Google's public facing WAN, and has a higher growth rate. It is among the first and largest SDN/OpenFlow deployments.
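As a toy illustration of adjudicating competing demands by priority on a single constrained edge — our own simplification with made-up application names and weights, not B4's actual TE algorithm, which is developed later in the paper — a weighted max-min style fill might look like:

```python
# Toy sketch of priority-weighted fair sharing on one constrained edge.
# Purely illustrative; B4's real TE uses per-application bandwidth
# functions and multipath tunnel groups, not this simplified loop.

def allocate(capacity, demands, weights):
    """Split `capacity` among apps via weighted max-min fairness.

    demands: {app: requested bandwidth}; weights: {app: priority weight}.
    Returns {app: allocated bandwidth}, never exceeding demand or capacity.
    """
    alloc = {app: 0.0 for app in demands}
    active = {app for app, d in demands.items() if d > 0}
    remaining = capacity
    while active and remaining > 1e-9:
        total_w = sum(weights[a] for a in active)
        # Tentatively hand out capacity in proportion to priority weight.
        share = {a: remaining * weights[a] / total_w for a in active}
        satisfied = {a for a in active if alloc[a] + share[a] >= demands[a]}
        if not satisfied:
            for a in active:
                alloc[a] += share[a]
            break
        # Cap satisfied apps at their demand; redistribute the leftover.
        for a in satisfied:
            remaining -= demands[a] - alloc[a]
            alloc[a] = demands[a]
        active -= satisfied
    return alloc

# Example: high-priority interactive traffic is satisfied first; elastic
# bulk copies absorb whatever capacity is left over.
print(allocate(10.0, {"user-data": 2.0, "bulk-copy": 20.0},
               {"user-data": 10.0, "bulk-copy": 1.0}))
```

The elastic bulk class here mirrors the paper's observation that the largest bandwidth consumers can adapt to whatever capacity remains after higher-priority demands are met.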
B4 scales to meet application bandwidth demands more efficiently than would otherwise be possible, supports rapid deployment and iteration of novel control functionality such as TE, and enables tight integration with end applications for adaptive behavior in response to failures or changing communication patterns. SDN is of course

Categories and Subject Descriptors
C.2.2 [Network Protocols]: Routing Protocols

Keywords
Centralized Traffic Engineering; Wide-Area Networks; Software-Defined Networking; Routing; OpenFlow

1. INTRODUCTION

Modern wide area networks (WANs) are critical to Internet performance and reliability, delivering terabits/sec of aggregate bandwidth across thousands of individual links. Because individual WAN links are expensive and because WAN packet loss is typically thought unacceptable, WAN routers consist of high-end, specialized equipment that place a premium on high availability. Finally, WANs typically treat all bits the same. While this has many benefits, when the inevitable failure does take place, all applications are typically treated equally, despite their highly variable sensitivity to available capacity.

Given these considerations, WAN links are typically provisioned to 30-40% average utilization. This allows the network service provider to mask virtually all link or router failures from clients.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
SIGCOMM'13, August 12-16, 2013, Hong Kong, China.
Copyright 2013 ACM 978-1-4503-2056-6/13/08 ... $15.00.

Elastic bandwidth demands: The majority of our data center traffic involves synchronizing large data sets across sites. These applications benefit from as much bandwidth as they can get but can tolerate periodic failures with temporary bandwidth reductions.

Moderate number of sites: While B4 must scale among multiple dimensions, targeting our data center deployments meant that the total number
of WAN sites would be a few dozen.

End application control: We control both the applications and the site networks connected to B4. Hence, we can enforce relative application priorities and control bursts at the network edge, rather than through overprovisioning or complex functionality in B4.

Cost sensitivity: B4's capacity targets and growth rate led to unsustainable cost projections. The traditional approach of provisioning WAN links at 30-40% utilization (or 2-3x the cost of a fully-utilized WAN) to protect against failures and packet loss, combined with prevailing per-port router cost, would make our network prohibitively expensive.

Figure 1: B4 worldwide deployment (2011).

not a panacea; we summarize our experience with a large-scale B4 outage, pointing to challenges in both SDN and large-scale network management. While our approach does not generalize to all WANs or SDNs, we hope that our experience will inform future design in both domains.

2. BACKGROUND

Before describing the architecture of our software-defined WAN, we provide an overview of our deployment environment and target applications. Google's WAN is among the largest in the Internet, delivering a range of search, video, cloud computing, and enterprise applications to users across the planet. These services run across a combination of data centers spread across the world, and edge deployments for cacheable content.

Architecturally, we operate two distinct WANs. Our user-facing network peers with and exchanges traffic with other Internet
domains. End user requests and responses are delivered to our data centers and edge caches across this network. The second network, B4, provides connectivity among data centers (see Fig. 1), e.g., for asynchronous data copies, index pushes for interactive serving systems, and end user data replication for availability. Well over 90% of internal application traffic runs across this network.

We maintain two separate networks because they have different requirements. For example, our user-facing networking connects with a range of gear and providers, and hence must support a wide range of protocols. Further, its physical topology will necessarily be more dense than a network connecting a modest number of data centers. Finally, in delivering content to end users, it must support the highest levels of availability.

Thousands of individual applications run across B4; here, we categorize them into three classes: i) user data copies (e.g., email, documents, audio/video files) to remote data centers for availability/durability, ii) remote storage access for computation over inherently distributed data sources, and iii) large-scale data push synchronizing state across multiple data centers. These three traffic classes are ordered in increasing volume, decreasing latency sensitivity, and decreasing overall priority. For example, user-data represents the lowest volume on B4, is the most latency sensitive, and is of the highest priority.

The scale of our network deployment strains both the capacity of commodity network hardware and the scalability, fault tolerance, and granularity of control available from network software. Internet bandwidth as a whole continues to grow rapidly [?]. However, our own WAN traffic has been growing at an even faster rate.

Our decision to build B4 around Software Defined Networking and OpenFlow [?] was driven by the observation that we could not achieve the level of scale, fault tolerance, cost efficiency, and control required for our network using traditional WAN architectures. A number of B4's characteristics led to our design approach:

These considerations led to particular design decisions for B4, which we summarize in Table 1. In particular, SDN gives us a dedicated, software-based control plane running on commodity servers, and the opportunity to reason about global state, yielding vastly simplified coordination and orchestration for both planned and unplanned network changes. SDN also allows us to leverage the raw speed of commodity servers; latest-generation servers are much faster than the embedded-class processor in most switches, and we can upgrade servers independently from the switch hardware. OpenFlow gives us an early investment in an SDN ecosystem that can leverage a variety of switch/data plane elements. Critically, SDN/OpenFlow decouples software and hardware evolution: control plane software becomes simpler and evolves more quickly; data plane hardware evolves based on programmability and performance.

We had several additional motivations for our software defined architecture, including: i) rapid iteration on novel protocols, ii) simplified testing environments (e.g., we emulate our entire software stack running across the WAN in a local cluster), iii) improved capacity planning available from simulating a deterministic central TE server rather than trying to capture the asynchronous routing behavior of distributed protocols, and iv) simplified management through a fabric-centric rather than router-centric WAN view. However, we leave a description of these aspects to separate work.

3. DESIGN

In this section, we describe the details of our Software Defined WAN architecture.

3.1 Overview

Our SDN architecture can be logically viewed in three layers, depicted in Fig. 2. B4 serves multiple WAN sites, each with a number of server clusters. Within each B4 site, the switch hardware layer primarily forwards traffic and does
not run complex control software, and the site controller layer consists of Network Control Servers (NCS) hosting both OpenFlow controllers (OFC) and Network Control Applications (NCAs).

These servers enable distributed routing and central traffic engineering as a routing overlay. OFCs maintain network state based on NCA directives and switch events and instruct switches to set forwarding table entries based on this changing network state. For fault tolerance of individual servers and control processes, a per-site in-

Table 1: Summary of design decisions in B4.

This separation delivers a number of benefits. It allowed us to focus initial work on building SDN infrastructure, e.g., the OFC and agent, routing, etc. Moreover, since we initially deployed our network with no new externally visible functionality such as TE, it gave time to develop and debug the SDN architecture before trying to implement new features such as TE.

Perhaps most importantly, we layered traffic engineering on top of baseline routing protocols using prioritized switch forwarding table entries (§?). This isolation gave our network a "big red button"; faced with any critical issues in traffic engineering, we could disable the service and fall back to shortest path forwarding. This fault recovery mechanism has proven invaluable (§?).

Each B4 site consists of multiple switches with potentially hundreds of individual ports linking to remote sites. To scale, the TE abstracts each site into a single node with a single edge of given capacity to each remote site. To achieve this topology abstraction, all traffic crossing a site-to-site edge must be evenly distributed across all its constituent links. B4 routers employ a custom variant of ECMP hashing [?] to achieve the necessary load balancing.

In the rest of this section, we describe
how we integrate existing routing protocols running on separate control servers with OpenFlow-enabled hardware switches. §? then describes how we layer TE on top of this baseline routing implementation.

3.2 Switch Design

Conventional wisdom dictates that wide area routing equipment must have deep buffers, very large forwarding tables, and hardware support for high availability. All of this functionality adds to hardware cost and complexity. We posited that with careful endpoint management, we could adjust transmission rates to avoid the need for deep buffers while avoiding expensive packet drops. Further, our switches run across a relatively small set of data centers, so we did not require large forwarding tables. Finally, we found that switch failures typically result from software rather than hardware issues. By moving most software functionality off the switch hardware, we can manage software fault tolerance through known techniques widely available for existing distributed systems.

Even so, the main reason we chose to build our own hardware was that no existing platform could support an SDN deployment, i.e., one that could export low-level control over switch forwarding behavior.
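As a rough illustration of what low-level control over forwarding behavior buys: with prioritized flow entries, an overlay service such as TE can shadow baseline shortest-path routes and be disabled wholesale to fall back to them — the "big red button" described in §3.1. The sketch below is our own toy model (hypothetical names, prefix matching simplified to string comparison), not B4's OFC or switch pipeline:

```python
# Toy model of priority-layered forwarding: TE entries shadow baseline
# shortest-path entries; deleting them restores baseline forwarding.
# Illustrative only -- not the actual B4 OFC or switch implementation.

BASELINE_PRIORITY = 100   # entries installed from BGP/ISIS routes
TE_PRIORITY = 200         # overlay entries installed by central TE

class ForwardingTable:
    def __init__(self):
        self.entries = []  # (priority, dst_prefix, next_hop)

    def install(self, priority, prefix, next_hop):
        self.entries.append((priority, prefix, next_hop))

    def lookup(self, dst):
        # Highest-priority matching entry wins (prefix match simplified
        # to startswith on a dotted string for brevity).
        matches = [e for e in self.entries if dst.startswith(e[1])]
        return max(matches)[2] if matches else None

    def big_red_button(self):
        # Disable TE: drop overlay entries, falling back to shortest path.
        self.entries = [e for e in self.entries if e[0] != TE_PRIORITY]

t = ForwardingTable()
t.install(BASELINE_PRIORITY, "10.1.", "shortest-path-port")
t.install(TE_PRIORITY, "10.1.", "te-tunnel-port")
assert t.lookup("10.1.2.3") == "te-tunnel-port"   # TE shadows baseline
t.big_red_button()
assert t.lookup("10.1.2.3") == "shortest-path-port"
```

The design point is that the fallback path requires no coordination: removing the higher-priority entries is sufficient, because the baseline entries were installed and maintained independently all along.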
Any extra costs from using custom switch hardware are more than repaid by the efficiency gains available from supporting novel services such as centralized TE. Given the bandwidth required

Figure 2: B4 architecture overview.

stance of Paxos [?] elects one of multiple available software replicas (placed on different physical servers) as the primary instance.

The global layer consists of logically centralized applications (e.g. an SDN Gateway and a central TE server) that enable the central control of the entire network via the site-level NCAs. The SDN Gateway abstracts details of OpenFlow and switch hardware from the central TE server. We replicate global layer applications across multiple WAN sites with separate leader election to set the primary.

Each server cluster in our network is a logical "Autonomous System" (AS) with a set of IP prefixes. Each cluster contains a set of BGP routers (not shown in Fig. 2) that peer with B4 switches at each WAN site. Even before introducing SDN, we ran B4 as a single AS providing transit among clusters running traditional BGP/ISIS network protocols. We chose BGP because of its isolation properties between domains and operator familiarity with the protocol. The SDN-based B4 then had to support existing distributed routing protocols, both for interoperability with our non-SDN WAN implementation, and to enable a gradual rollout.

We considered a number of options for integrating existing routing protocols with centralized traffic engineering. In an aggressive approach, we would have built one integrated, centralized service combining routing (e.g., ISIS functionality) and traffic engineering. We instead chose to deploy routing and traffic engineering as independent services, with the standard routing service deployed initially and central TE subseq
uently deployed as an overlay.

Design decision: B4 routers built from merchant switch silicon.
  Rationale/benefits: B4 apps are willing to trade more average bandwidth for fault tolerance. Edge application control limits need for large buffers. Limited number of B4 sites means large forwarding tables are not required. Relatively low router cost allows us to scale network capacity.
  Challenges: Sacrifice hardware fault tolerance, deep buffering, and support for large routing tables.

Design decision: Drive links to 100% utilization.
  Rationale/benefits: Allows efficient use of expensive long haul transport. Many applications willing to trade higher average bandwidth for predictability. Largest bandwidth consumers adapt dynamically to available bandwidth.
  Challenges: Packet loss becomes inevitable with substantial capacity loss during link/switch failure.

Design decision: Centralized traffic engineering.
  Rationale/benefits: Use multipath forwarding to balance application demands across available capacity in response to failures and changing application demands. Leverage application classification and priority for scheduling in cooperation with edge rate limiting. Faster, determinist
  Challenges: Traffic engineering with traditional distributed routing protocols (e.g. link-state) is known to be sub-optimal [?, ?] except in special cases [?].
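The even site-to-site traffic splitting that the topology abstraction relies on can be approximated in software by hashing each flow's 5-tuple onto the edge's member links. This is a toy sketch with made-up names; B4's switches use a custom hardware ECMP variant, not this code:

```python
# Illustrative hash-based ECMP next-hop selection: flows crossing the
# same site-to-site edge are spread across its constituent physical
# links. A toy sketch only -- B4 uses a custom hardware ECMP variant.

import hashlib

def pick_link(flow_5tuple, links):
    """Deterministically map a flow to one of the edge's member links.

    Hashing the 5-tuple keeps all packets of one flow on one link
    (avoiding reordering) while spreading many flows roughly evenly
    across all links of the edge.
    """
    digest = hashlib.sha256(repr(flow_5tuple).encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(links)
    return links[index]

links = ["link-%d" % i for i in range(4)]
flow = ("10.0.0.1", "10.1.0.9", 6, 34567, 443)  # src, dst, proto, sport, dport
assert pick_link(flow, links) == pick_link(flow, links)  # stable per flow
```

Per-flow stability plus a well-mixing hash is what lets the TE layer treat the bundle of parallel links as one edge of aggregate capacity.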