Border Gateway Protocol
Encyclopedia
The Border Gateway Protocol (BGP) is the protocol backing the core routing decisions on the Internet
Internet
The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite to serve billions of users worldwide...

. It maintains a table of IP networks or 'prefixes' which designate network reachability among autonomous systems
Autonomous system (Internet)
Within the Internet, an Autonomous System is a collection of connected Internet Protocol routing prefixes under the control of one or more network operators that presents a common, clearly defined routing policy to the Internet....

 (AS). It is described as a path vector protocol
Path vector protocol
A path vector protocol is a computer network routing protocol which maintains the path information that gets updated dynamically. Updates which have looped through the network and returned to the same node are easily detected and discarded...

. BGP does not use traditional Interior Gateway Protocol
Interior gateway protocol
An interior gateway protocol is a routing protocol that is used to exchange routing information within an autonomous system ....

 (IGP) metrics, but makes routing decisions based on path, network policies and/or rulesets. For this reason, it is more appropriately termed a reachability protocol rather than routing protocol
Routing protocol
A routing protocol is a protocol that specifies how routers communicate with each other, disseminating information that enables them to select routes between any two nodes on a computer network, the choice of the route being done by routing algorithms. Each router has a priori knowledge only of...

.

BGP was created to replace the Exterior Gateway Protocol
Exterior Gateway Protocol
The Exterior Gateway Protocol is a now obsolete routing protocol for the Internet originally specified in 1982 by Eric C. Rosen of Bolt, Beranek and Newman, and David L. Mills. It was first described in RFC 827 and formally specified in RFC 904...

 (EGP) protocol to allow fully decentralized routing in order to transition from the core ARPAnet model to a decentralized system that included the NSFNET backbone and its associated regional networks. This allowed the Internet to become a truly decentralized system. Since 1994, version four of the BGP has been in use on the Internet. All previous versions are now obsolete. The major enhancement in version 4 was support of Classless Inter-Domain Routing
Classless Inter-Domain Routing
Classless Inter-Domain Routing is a method for allocating IP addresses and routing Internet Protocol packets. The Internet Engineering Task Force introduced CIDR in 1993 to replace the previous addressing architecture of classful network design in the Internet...

 and use of route aggregation
Supernet
A supernet is an Internet Protocol network that is formed from the combination of two or more networks with a common Classless Inter-Domain Routing prefix. The new routing prefix for the combined network aggregates the prefixes of the constituent networks. It must not contain other prefixes of...

 to decrease the size of routing table
Routing table
In computer networking a routing table, or Routing Information Base , is a data table stored in a router or a networked computer that lists the routes to particular network destinations, and in some cases, metrics associated with those routes. The routing table contains information about the...

s. Since January 2006, version 4 is codified in RFC 4271, which went through more than 20 drafts based on the earlier RFC 1771 version 4. RFC 4271 version corrected a number of errors, clarified ambiguities and brought the RFC much closer to industry practices.

Most Internet service provider
Internet service provider
An Internet service provider is a company that provides access to the Internet. Access ISPs directly connect customers to the Internet using copper wires, wireless or fiber-optic connections. Hosting ISPs lease server space for smaller businesses and host other people servers...

s must use BGP to establish routing between one another (especially if they are multihomed
Multihoming
Multihoming is a technique used to increase the reliability of the Internet connection for an IP network. As an adjective, it is typically used to describe a customer, rather than an Internet service provider network...

). Therefore, even though most Internet users do not use it directly, BGP is one of the most important protocols of the Internet. Compare this with Signaling System 7 (SS7), which is the inter-provider core call setup protocol on the PSTN. Very large private IP
Internet Protocol
The Internet Protocol is the principal communications protocol used for relaying datagrams across an internetwork using the Internet Protocol Suite...

 networks use BGP internally. An example would be the joining of a number of large OSPF
Open Shortest Path First
Open Shortest Path First is an adaptive routing protocol for Internet Protocol networks. It uses a link state routing algorithm and falls into the group of interior routing protocols, operating within a single autonomous system . It is defined as OSPF Version 2 in RFC 2328 for IPv4...

 (Open Shortest Path First) networks where OSPF by itself would not scale to size. Another reason to use BGP is multihoming
Multihoming
Multihoming is a technique used to increase the reliability of the Internet connection for an IP network. As an adjective, it is typically used to describe a customer, rather than an Internet service provider network...

 a network for better redundancy either to multiple access points of a single ISP (RFC 1998) or to multiple ISPs.

Operation

BGP neighbors, peers, are established by manual configuration between routers to create a TCP
Transmission Control Protocol
The Transmission Control Protocol is one of the core protocols of the Internet Protocol Suite. TCP is one of the two original components of the suite, complementing the Internet Protocol , and therefore the entire suite is commonly referred to as TCP/IP...

 session on port
TCP and UDP port
In computer networking, a port is an application-specific or process-specific software construct serving as a communications endpoint in a computer's host operating system. A port is associated with an IP address of the host, as well as the type of protocol used for communication...

 179. A BGP speaker will periodically send 19-byte keep-alive messages to maintain the connection (every 60 seconds by default). Among routing protocols, BGP is unique in using TCP as its transport protocol.

When BGP runs between two peers in the same autonomous system
Autonomous system (Internet)
Within the Internet, an Autonomous System is a collection of connected Internet Protocol routing prefixes under the control of one or more network operators that presents a common, clearly defined routing policy to the Internet....

 (AS), it is referred to as Internal BGP (IBGP or Interior Border Gateway Protocol). When it runs between autonomous systems, it is called External BGP (EBGP or Exterior Border Gateway Protocol). Routers on the boundary of one AS exchanging information with another AS are called border or edge routers. In the Cisco operating system, IBGP routes have an administrative distance
Administrative distance
Administrative distance is the measure used by Cisco routers to select the best path when there are two or more different routes to the same destination from two different routing protocols. Administrative distance defines the reliability of a routing protocol. Each routing protocol is prioritized...

 of 200, which is less preferred than either external BGP or any interior routing protocol. Other router implementations also prefer EBGP to IGP
Interior gateway protocol
An interior gateway protocol is a routing protocol that is used to exchange routing information within an autonomous system ....

s, and IGPs to IBGP.

Extensions negotiation

During the OPEN, BGP speakers can negotiate optional capabilities of the session, including multiprotocol extensions and various recovery modes. If the multiprotocol extensions to BGP are negotiated at the time of creation, the BGP speaker can prefix the Network Layer Reachability Information (NLRI) it advertises with an address family prefix. These families include the IPv4 (default), IPv6, IPv4/IPv6 Virtual Private Networks and multicast BGP. Increasingly, BGP is used as a generalized signaling protocol to carry information about routes that may not be part of the global Internet, such as VPNs.

Finite-state machine

In order to make decisions in its operations with other BGP peers, a BGP peer uses a simple finite state machine
Finite state machine
A finite-state machine or finite-state automaton , or simply a state machine, is a mathematical model used to design computer programs and digital logic circuits. It is conceived as an abstract machine that can be in one of a finite number of states...

 (FSM) that consists of six states: Idle; Connect; Active; OpenSent; OpenConfirm; and Established. For each peer-to-peer session, a BGP implementation maintains a state variable that tracks which of these six states the session is in. The BGP protocol defines the messages that each peer should exchange in order to change the session from one state to another. The first state is the “Idle” state. In the “Idle” state, BGP initializes all resources, refuses all inbound BGP connection attempts and initiates a TCP connection to the peer. The second state is “Connect”. In the “Connect” state, the router waits for the TCP connection to complete and transitions to the "OpenSent" state if successful. If unsuccessful, it starts the ConnectRetry timer and transitions to the "Active" state upon expiration. In the "Active" state, the router resets the ConnectRetry timer to zero and returns to the "Connect" state. In the "OpenSent" state, the router sends an Open message and waits for one in return. Keepalive messages are exchanged and, upon successful receipt, the router is placed into the “Established” state. In the “Established” state, the router can send/receive: Keepalive; Update; and Notification messages to/from its peer.
  • Idle State:
    • Refuse all incoming BGP connections
    • Start event triggers the initialization of
    • Initiates a TCP connection with its configured BGP peer.
    • Listens for a TCP connection from its peer.
    • Changes its state to Connect.
    • If an error occurs at any state of the FSM process, the BGP session is terminated immediately and returned to the Idle state. Some of the reasons why a router does not progress from the Idle state are:
      • TCP port 179 is not open.
      • A random TCP port over 1023 is not open.
      • Peer address configured incorrectly on either router.
      • AS number configured incorrectly on either router .
  • Connect State:
    • Waits for successful TCP negotiation with peer.
    • BGP does not spend much time in this state if the TCP session has been successfully established.
    • Sends Open message to peer and changes state to OpenSent.
    • If an error occurs, BGP moves to the Active state. Some reasons for the error are:
      • TCP port 179 is not open.
      • A random TCP port over 1023 is not open.
      • Peer address configured incorrectly on either router.
      • AS number configured incorrectly on either router.
  • Active State:
    • If the router was unable to establish a successful TCP session, then it ends up in the Active state.
    • BGP FSM will try to restart another TCP session with the peer and, if successful, then it will send an Open message to the peer.
    • If it is unsuccessful again, the FSM is reset to the Idle state.
    • Repeated failures may result in a router cycling between the Idle and Active states. Some of the reasons for this include:
      • TCP port 179 is not open.
      • A random TCP port over 1023 is not open.
      • BGP configuration error.
      • Network congestion.
      • Flapping network interface.
  • OpenSent State:
    • BGP FSM listens for an Open message from its peer.
    • Once the message has been received, the router checks the validity of the Open message.
    • If there is an error it is because one of the fields in the Open message doesn’t match between the peers, e.g. BGP version mismatch, MD5 password mismatch, the peering router expects a different My AS. The router will then send a Notification message to the peer indicating why the error occurred.
    • If there is no error, a Keepalive message is sent, various timers are set and the state is changed to OpenConfirm.
  • OpenConfirm State:
    • The peer is listening for a Keepalive message from its peer.
    • If a Keepalive message is received and no timer has expired before reception of the Keepalive, BGP transitions to the Established state.
    • If a timer expires before a Keepalive message is received, or if an error condition occurs, the router transitions back to the Idle state.
  • Established State:
    • In this state, the peers send Update messages to exchange information about each route being advertised to the BGP peer.
    • If there is any error in the Update message then a Notification message is sent to the peer, and BGP transitions back to the Idle state.
    • If a timer expires before a Keepalive message is received, or if an error condition occurs, the router transitions back to the Idle state.

BGP router connectivity and learning routes

In the simplest arrangement all routers within a single AS and participating in BGP routing must be configured in a full mesh: each router must be configured as peer to every other router. This causes scaling problems, since the number of required connections grows quadratically
Quadratic growth
In mathematics, a function or sequence is said to exhibit quadratic growth when its values are proportional to the square of the function argument or sequence position, in the limit as the argument or sequence position goes to infinity...

 with the number of routers involved. To alleviate the problem, BGP implements two options: route reflectors (RFC 4456) and confederations (RFC 5065). The following discussion of basic UPDATE processing assumes a full IBGP mesh.

Basic update processing

A given BGP router may accept NLRI in UPDATEs from multiple neighbors and advertise NLRI to the same, or a different set, of neighbors. Conceptually, BGP maintains its own "master" routing table, called the Loc-RIB (Local Routing Information Base), separate from the main routing table
Routing table
In computer networking a routing table, or Routing Information Base , is a data table stored in a router or a networked computer that lists the routes to particular network destinations, and in some cases, metrics associated with those routes. The routing table contains information about the...

 of the router. For each neighbor, the BGP process maintains a conceptual Adj-RIB-In (Adjacent Routing Information Base, Incoming) containing the NLRI received from the neighbor, and a conceptual Adj-RIB-Out (Outgoing) for NLRI to be sent to the neighbor.

Conceptual, in the preceding paragraph, means that the physical storage and structure of these various tables are decided by the implementer of the BGP code. Their structure is not visible to other BGP routers, although they usually can be interrogated with management commands on the local router. It is quite common, for example, to store the two Adj-RIBs and the Loc-RIB together in the same data structure, with additional information attached to the RIB entries. The additional information tells the BGP process such things as whether individual entries belong in the Adj-RIBs for specific neighbors, whether the per-neighbor route selection process made received policies eligible for the Loc-RIB, and whether Loc-RIB entries are eligible to be submitted to the local router's routing table
Routing table
In computer networking a routing table, or Routing Information Base , is a data table stored in a router or a networked computer that lists the routes to particular network destinations, and in some cases, metrics associated with those routes. The routing table contains information about the...

 management process.

By eligible to be submitted, BGP will submit the routes that it considers best to the main routing table process. Depending on the implementation of that process, the BGP route is not necessarily selected. For example, a directly connected prefix, learned from the router's own hardware, is usually most preferred. As long as that directly connected route's interface is active, the BGP route to the destination will not be put into the routing table. Once the interface goes down, and there are no more preferred routes, the Loc-RIB route would be installed in the main routing table.
Until recently, it was a common mistake to say BGP carries policies. BGP actually carried the information with which rules inside BGP-speaking routers could make policy decisions. Some of the information carried that is explicitly intended to be used in policy decisions are communities and multi-exit discriminators (MED).

Route selection

The BGP standard specifies a number of decision factors, more than are used by any other common routing process, for selecting NLRI (Network Layer Reachability Information) to go into the Loc-RIB (Routing Information Base). The first decision point for evaluating NLRI is that its next-hop attribute must be reachable (or resolvable). Another way of saying the next-hop must be reachable is that there must be an active route, already in the main routing table
Routing table
In computer networking a routing table, or Routing Information Base , is a data table stored in a router or a networked computer that lists the routes to particular network destinations, and in some cases, metrics associated with those routes. The routing table contains information about the...

 of the router, to the prefix in which the next-hop address is located.

Next, for each neighbor, the BGP process applies various standard and implementation-dependent criteria to decide which routes conceptually should go into the Adj-RIB-In. The neighbor could send several possible routes to a destination, but the first level of preference is at the neighbor level. Only one route to each destination will be installed in the conceptual Adj-RIB-In. This process will also delete, from the Adj-RIB-In, any routes that are withdrawn by the neighbor.

Whenever a conceptual Adj-RIB-In changes, the main BGP process decides if any of the neighbor's new routes are preferred to routes already in the Loc-RIB. If so, it replaces them. If a given route is withdrawn by a neighbor, and there is no other route to that destination, the route is removed from the Loc-RIB, and no longer sent, by BGP, to the main routing table
Routing table
In computer networking a routing table, or Routing Information Base , is a data table stored in a router or a networked computer that lists the routes to particular network destinations, and in some cases, metrics associated with those routes. The routing table contains information about the...

 manager. If the router does not have a route to that destination from any non-BGP source, the withdrawn route will be removed from the main routing table.

Per-neighbor decisions

After verifying that the next hop is reachable, if the route comes from an internal (i.e. IBGP) peer, the first rule to apply according to the standard is to examine the LOCAL_PREF attribute. If there are several IBGP routes from the neighbor, the one with the highest LOCAL_PREF is selected unless there are several routes with the same LOCAL_PREF. In the latter case the route selection process moves to the next tie breaker. While LOCAL_PREF is the first rule in the standard, once reachability of the NEXT_HOP is verified, Cisco and several other vendors first consider a decision factor called WEIGHT which is local to the router (i.e. not transmitted by BGP). The route with the highest WEIGHT is preferred.

LOCAL_PREF, WEIGHT, and other criteria can be manipulated by local configuration and software capabilities. Such manipulation is outside the scope of the standard but is commonly used. For example the COMMUNITY attribute (see below) is not directly used by the BGP selection process. The BGP neighbor process however can have a rule to set LOCAL_PREFERENCE or another factor based on a manually programmed rule to set the attribute if the COMMUNITY value matches some pattern matching criterion. If the route was learned from an external peer the per-neighbor BGP process computes a LOCAL_PREFERENCE value from local policy rules and then compares the LOCAL_PREFERENCE of all routes from the neighbor.

At the per-neighbor level - ignoring implementation-specific policy modifiers - the order of tie breaking rules is:
  1. Prefer the route with the shortest AS_PATH. An AS_PATH is the set of AS numbers that must be traversed to reach the advertised destination. AS1-AS2-AS3 is shorter than AS4-AS5-AS6-AS7.
  2. Prefer routes with the lowest value of their ORIGIN attribute.
  3. Prefer routes with the lowest MULTI_EXIT_DISC (multi-exit discriminator or MED) value.

Before the most recent edition of the BGP standard, if an UPDATE had no MULTI_EXIT_DISC value, several implementations created a MED with the least possible value. The current standard however specifies that missing MEDs are to be treated as the highest possible value. Since the current rule may cause different behavior than the vendor interpretations, BGP implementations that used the nonstandard default value have a configuration feature that allows the old or standard rule to be selected.

Decision factors at the Loc-RIB level

Once candidate routes are received from neighbors, the Loc-RIB software applies additional tie-breakers to routes to the same destination.
  1. If at least one route was learned from an external neighbor (i.e., the route was learned from EBGP), drop all routes learned from IBGP.
  2. Prefer the route with the lowest interior cost to the NEXT_HOP, according to the main Routing Table
    Routing table
    In computer networking a routing table, or Routing Information Base , is a data table stored in a router or a networked computer that lists the routes to particular network destinations, and in some cases, metrics associated with those routes. The routing table contains information about the...

    . If two neighbors advertised the same route, but one neighbor is reachable via a low-bitrate link and the other by a high-bitrate link, and the interior routing protocol calculates lowest cost based on highest bitrate, the route through the high-bitrate link would be preferred and other routes dropped.

If there is more than one route still tied at this point, several BGP implementations offer a configurable option to load-share among the routes, accepting all (or all up to some number).
  1. Prefer the route learned from the BGP speaker with the numerically lowest BGP identifier
  2. Prefer the route learned from the BGP speaker with the lowest peer IP address

Communities

BGP communities are attribute tags that can be applied to incoming or outgoing prefixes to achieve some common goal (RFC 1997). While it is common to say that BGP allows an administrator to set policies on how prefixes are handled by ISPs, this is generally not possible, strictly speaking. For instance, BGP natively has no concept to allow one AS to tell another AS to restrict advertisement of a prefix to only North American peering customers. Instead, an ISP generally publishes a list of well-known or proprietary communities with a description for each one, which essentially becomes an agreement of how prefixes are to be treated. Examples of common communities include local preference adjustments, geographic or peer type restrictions, DoS avoidance (black holing), and AS prepending options. An ISP might state that any routes received from customers with community XXX:500 will be advertised to all peers (default) while community XXX:501 will restrict advertisement to North America. The customer simply adjusts their configuration to include the correct community(ies) for each route, and the ISP is responsible for controlling who the prefix is advertised to. The end user has no technical ability to enforce correct actions being taken by the ISP, though problems in this area are generally rare and accidental.

It is a common tactic for end customers to use BGP communities (usually ASN:70,80,90,100) to control the local preference the ISP assigns to advertised routes instead of using MED (the effect is similar). It should also be noted that the community attribute is transitive, but communities applied by the customer very rarely become propagated outside the next-hop AS.

Extended communities

The BGP Extended Community Attribute was added in 2006 in order to extend the range of such attributes and to provide a community attribute structuring by means of a type field. The extended format consists of one or two octets for the type field followed by seven or six octets for the respective community attribute content. The definition of this Extended Community Attribute is documented in RFC 4360. The IANA administers the registry for BGP Extended Communities Types.
The Extended Communities Attribute itself is a transitive optional BGP attribute. However, a bit in the type field within the attribute decides whether the encoded extended community is of a transitive or non-transitive nature. The IANA registry therefore provides different number ranges for the attribute types.
Due to the extended attribute range, its usage can be manifold. RFC 4360 exemplarly defines the "Two-Octet AS Specific Extended Community", the "IPv4 Address Specific Extended Community", the "Opaque Extended Community", the "Route Target Community" and the "Route Origin Community". A number of BGP QoS drafts also use this Extended Community Attribute structure for inter-domain QoS signalling.

Uses of multi-exit discriminators

MEDs, defined in the main BGP standard, were originally intended to show to another neighbor AS the advertising AS's preference as to which of several links are preferred for inbound traffic. Another application of MEDs is to advertise the value, typically based on delay, of multiple AS that have presence at an IXP, that they impose to send traffic to some destination.

Message header format

The following is the BGP version 4 message header format:
bit offset 0–15 16–23 24–31
0 Marker
32
64
96
128 Length Type

  • Marker: Included for compatibility, must be set to all ones.
  • Length: Total length of the message in octets
    Octet (computing)
    An octet is a unit of digital information in computing and telecommunications that consists of eight bits. The term is often used when the term byte might be ambiguous, as there is no standard for the size of the byte.-Overview:...

    , including the header.
  • Type: Type of BGP message. The following values are defined:
    • Open (1)
    • Update (2)
    • Notification (3)
    • KeepAlive (4)

Internal BGP scalability

An autonomous system with internal BGP (IBGP) must have all of its IBGP peers connect to each other in a full mesh
Complete graph
In the mathematical field of graph theory, a complete graph is a simple undirected graph in which every pair of distinct vertices is connected by a unique edge.-Properties:...

 (where everyone speaks to everyone directly). This full-mesh configuration requires that each router maintain a session to every other router. In large networks, this number of sessions may degrade performance of routers, due either to a lack of memory, or too much CPU process requirements.

Route reflectors and confederations both reduce the number of IBGP peers to each router and thus reduce processing overhead. Route reflectors are a pure performance-enhancing technique, while confederations also can be used to implement more fine-grained policy.

Route reflector
Route reflector
A route reflector is a network routing component. It offers an alternative to the logical full-mesh requirement of internal border gateway protocol . An RR acts as a focal point for IBGP sessions. The purpose of the RR is concentration...

s reduce the number of connections required in an AS. A single router (or two for redundancy) can be made a route reflector: other routers in the AS need only be configured as peers to them.

Confederations
BGP confederation
In network routing, BGP confederation is a method to use Border Gateway Protocol to subdivide a single autonomous system into multiple internal sub-AS's, yet still advertise as a single AS to external peers. The intent is to reduce IBGP mesh size....

 are sets of autonomous systems. In common practice, only one of the confederation AS numbers is seen by the Internet as a whole. Confederations are used in very large networks where a large AS can be configured to encompass smaller more manageable internal ASs.

Confederations can be used in conjunction with route reflectors. Both confederations and route reflectors can be subject to persistent oscillation unless specific design rules, affecting both BGP and the interior routing protocol, are followed.

However, these alternatives can introduce problems of their own, including the following:
  • route oscillation
  • sub-optimal routing
  • increase of BGP convergence time


Additionally, route reflectors and BGP confederations were not designed to ease BGP router configuration. Nevertheless, these are common tools for experienced BGP network architects. These tools may be combined, for example, as a hierarchy of route reflectors.

Instability

The routing tables managed by a BGP implementation are adjusted continually to reflect actual changes in the network, such as links breaking and being restored or routers going down and coming back up. In the network as a whole it is normal for these changes to happen almost continuously, but for any particular router or link changes are supposed to be relatively infrequent. If a router is misconfigured or mismanaged then it may get into a rapid cycle between down and up states. This pattern of repeated withdrawal and re-announcement known as route flapping
Route flapping
In computer networking and telecommunications, route flapping occurs when a router alternately advertises a destination network via one route then another in quick sequence...

 can cause excessive activity in all the other routers that know about the broken link, as the same route is continuously injected and withdrawn from the routing tables. The BGP design is such that delivery of traffic may not function while routes are being updated. On the Internet, a BGP routing change may cause outages for several minutes.

A feature known as route flap damping (RFC 2439) is built into many BGP implementations in an attempt to mitigate the effects of route flapping. Without damping the excessive activity can cause a heavy processing load on routers, which may in turn delay updates on other routes, and so affect overall routing stability. With damping, a route's flapping is exponentially decayed. At the first instance when a route becomes unavailable and quickly reappears, damping does not take effect, so as to maintain the normal fail-over times of BGP. At the second occurrence, BGP shuns that prefix for a certain length of time; subsequent occurrences are timed out exponentially. After the abnormalities have ceased and a suitable length of time has passed for the offending route, prefixes can be reinstated and its slate wiped clean. Damping can also mitigate denial of service attacks; damping timings are highly customizable.

It is also suggested in RFC 2439 (under "Design Choices -> Stability Sensitive Suppression of Route Advertisement") that route flap dampening is a feature more desirable if implemented to Exterior Border Gateway Protocol Sessions (EBGP sessions or simply called exterior peers) and not on Interior Border Gateway Protocol Sessions (IBGP sessions or simply called internal peers); With this approach when a route flaps inside an autonomous system, it is not propagated to the external ASs - flapping a route to an EBGP will have a chain of flapping for the particular route throughout the backbone. This method also successfully avoids the overhead of route flap dampening for IBGP sessions.

However, subsequent research has shown that flap damping can actually lengthen convergence times in some cases, and can cause interruptions in connectivity even when links are not flapping. Moreover, as backbone links and router processors have become faster, some network architects have suggested that flap damping may not be as important as it used to be, since changes to the routing table can be absorbed much faster by routers. This has led the RIPE Route Working Group to write that "with the current implementations of BGP flap damping, the application of flap damping in ISP networks is NOT recommended. ... If flap damping is implemented, the ISP operating that network will cause side-effects to their customers and the Internet users of their customers' content and services ... . These side-effects would quite likely be worse than the impact caused by simply not running flap damping at all."http://www.ripe.net/docs/routeflap-damping.html Improving stability without the problems of flap damping is the subject of current research.http://tools.ietf.org/internet-drafts/draft-li-bgp-stability-01.txt

Routing table growth

One of the largest problems faced by BGP, and indeed the Internet infrastructure as a whole, is the growth of the Internet routing table. If the global routing table grows to the point where some older, less capable, routers cannot cope with the memory requirements or the CPU load of maintaining the table, these routers will cease to be effective gateways between the parts of the Internet they connect. In addition, and perhaps even more importantly, larger routing tables take longer to stabilize (see above) after a major connectivity change, leaving network service unreliable, or even unavailable, in the interim.

Until late 2001, the global routing table was growing exponentially
Exponential growth
Exponential growth occurs when the growth rate of a mathematical function is proportional to the function's current value...

, threatening an eventual widespread breakdown of connectivity. In an attempt to prevent this, ISPs cooperated in keeping the global routing table as small as possible, by using Classless Inter-Domain Routing
Classless Inter-Domain Routing
Classless Inter-Domain Routing is a method for allocating IP addresses and routing Internet Protocol packets. The Internet Engineering Task Force introduced CIDR in 1993 to replace the previous addressing architecture of classful network design in the Internet...

 (CIDR) and route aggregation. While this slowed the growth of the routing table to a linear process for several years, with the expanded demand for multihoming by end user networks the growth was once again superlinear by the middle of 2004. As of April 2010, the routing table has in excess of 310,000 entries.

Route summarization is often used to improve aggregation of the BGP global routing table, thereby reducing the necessary table size in routers of an AS
Autonomous system (Internet)
Within the Internet, an Autonomous System is a collection of connected Internet Protocol routing prefixes under the control of one or more network operators that presents a common, clearly defined routing policy to the Internet....

. Consider AS1 has been allocated the big address space of 172.16.0.0/16, this would be counted as one route in the table, but due to customer requirement or traffic engineering purposes, AS1 wants to announce smaller, more specific routes of 172.16.0.0/18, 172.16.64.0/18 and 172.16.128.0/18. The prefix 172.16.192.0/18 does not have any hosts so AS1 does not announce a specific route 172.16.192.0/18. This all counts as AS1 announcing four routes.

AS2 will see the 4 routes from AS1 (172.16.0.0/16, 172.16.0.0/18, 172.16.64.0/18 and 172.16.128.0/18) and it is up to the routing policy of AS2 to decide whether or not to take a copy of the four routes or, as 172.16.0.0/16 overlaps all the other specific routes, to just store the summary, 172.16.0.0/16.

If AS2 wants to send data to prefix 172.16.192.0/18, it will be sent to the routers of AS1 on route 172.16.0.0/16. At AS1's router, it will either be dropped or a destination unreachable ICMP
Internet Control Message Protocol
The Internet Control Message Protocol is one of the core protocols of the Internet Protocol Suite. It is chiefly used by the operating systems of networked computers to send error messages indicating, for example, that a requested service is not available or that a host or router could not be...

 message will be sent back, depending on the configuration of AS1's routers.

If AS1 later decides to drop the route 172.16.0.0/16, leaving 172.16.0.0/18, 172.16.64.0/18 and 172.16.128.0/18, AS1 will drop the number of routes it announces to three. AS2 will see the three routes, and depending on the routing policy of AS2, it will store a copy of the three routes, or aggregate the prefix's 172.16.0.0/18 and 172.16.64.0/18 to 172.16.0.0/17, thereby reducing the number of routes AS2 stores to only two: 172.16.0.0/17 and 172.16.128.0/18.

If AS2 wants to send data to prefix 172.16.192.0/18, it will be dropped or a destination unreachable ICMP message will be sent back at the routers of AS2 (not AS1 as before), because 172.16.192.0/18 would not be in the routing table.

Load-balancing problem

Another factor causing this growth of the routing table is the need for load balancing of multi-homed networks. It is not a trivial task to balance the inbound traffic to a multi-homed network across its multiple inbound paths, due to limitation of the BGP route selection process. For a multi-homed network, if it announces the same network blocks across all of its BGP peers, the result may be that one or several of its inbound links become congested while the other links remain under-utilized, because external networks all picked that set of congested paths as optimal. Like most other routing protocols, the BGP protocol does not detect congestion.

To work around this problem, BGP administrators of that multihomed network may divide a large continuous IP address block into smaller blocks, and tweak the route announcement to make different blocks look optimal on different paths, so that external networks will choose a different path to reach different blocks of that multi-homed network. Such cases will increase the number of routes as seen on the global BGP table.

IP Hijacking

IP hijacking
IP hijacking
IP hijacking is the illegitimate take over of groups of IP addresses by corrupting Internet routing tables....

 (sometimes referred to as BGP hijacking, prefix hijacking or route hijacking) is the illegitimate take over of groups of IP addresses by corrupting Internet routing tables. This can be caused both by programming errors, attacks on a network or attempts at censorship. This will cause internet sites to go out of service, as routers will no longer find them.

Requirements of a router for use of BGP for Internet and backbone-of-backbones purposes

Routers, especially small ones intended for Small Office/Home Office
Small office/home office
Small office/home office, or SOHO, refers to the category of business or cottage industry which involves from 1 to 10 workers. SOHO can also stand for single office/home office....

 (SOHO) use, may not include BGP software. Some SOHO routers simply are not capable of running BGP using BGP routing tables of any size. Other commercial routers may need a specific software executable image that contains BGP, or a license that enables it. Open source packages that run BGP include GNU Zebra
GNU Zebra
Zebra is a routing software package that provides TCP/IP based routing services with routing protocols support such as RIP, OSPF and BGP. Zebra also supports special BGP Route Reflector and Route Server behavior. In addition to traditional IPv4 routing protocols, Zebra also supports IPv6 routing...

, Quagga
Quagga (Software)
Quagga is a network routing software suite providing implementations of Open Shortest Path First , Routing Information Protocol , Border Gateway Protocol and IS-IS for Unix-like platforms, particularly Linux, Solaris, FreeBSD and NetBSD....

, OpenBGPD
OpenBGPD
OpenBGPD allows general purpose computers to be used as routers. It is a Unix system daemon that provides a free, open-source implementation of the Border Gateway Protocol version 4. This allows a machine to exchange routes with other systems that speak BGP....

, BIRD
Bird Internet routing daemon
BIRD is an open source implementation of a TCP/IP routing daemon for Unix like systems. Developed as a school project at the Faculty of Mathematics and Physics, Charles University, Prague, with major contributions from developers Martin Mares, Pavel Machek and Ondrej Filip...

, XORP
XORP
XORP is an open source Internet Protocol routing software suite originally designed at the International Computer Science Institute in Berkeley, California...

 and Vyatta
Vyatta
Vyatta manufactures an open source router/firewall/VPN product for Internet Protocol networks . A free download of Vyatta has been available since March 2006. The system is a specialized Debian-based Linux distribution with networking applications such as Quagga, OpenVPN, and many others...

. Devices marketed as Layer 3
Network layer
The network layer is layer 3 of the seven-layer OSI model of computer networking.The network layer is responsible for packet forwarding including routing through intermediate routers, whereas the data link layer is responsible for media access control, flow control and error checking.The network...

 switches are less likely to support BGP than devices marketed as routers, but high-end Layer 3 Switches usually can run BGP.

Products marketed as switches may or may not have a size limitation on BGP tables, such as 20,000 routes, far smaller than a full Internet table plus internal routes. These devices, however, may be perfectly reasonable and useful when used for BGP routing of some smaller part of the network, such as a confederation-AS representing one of several smaller enterprises that are linked, by a BGP backbone of backbones, or a small enterprise that announces routes to an ISP but only accepts a default route
Default route
A default route, also known as the gateway of last resort, is the network route used by a router when no other known route exists for a given IP packet's destination address. All the packets for destinations not known by the router's routing table are sent to the default route...

 and perhaps a small number of aggregated routes.

A BGP router used only for a network with a single point of entry to the Internet may have a much smaller routing table size (and hence RAM and CPU requirement) than a multihomed network. Even simple multihoming can have modest routing table size. See RFC 4098 for vendor-independent performance parameters for single BGP router convergence in the control plane. The actual amount of memory required in a BGP router depends on the amount of BGP information exchanged with other BGP speakers, and the way in which the particular router stores BGP information. The router may have to keep more than one copy of a route, so it can manage different policies for route advertising and acceptance to a specific neighboring AS. The term view is often used for these different policy relationships on a running router.

If one router implementation takes more memory per route than another implementation, this may be a legitimate design choice, trading processing speed against memory. A full BGP table as of November 2011 is in excess of 370,000 prefixes. Large ISPs may add another 50% for internal and customer routes. Again depending on implementation, separate tables may be kept for each view of a different peer AS.

Free and open source implementations

  • Bird Internet routing daemon
    Bird Internet routing daemon
    BIRD is an open source implementation of a TCP/IP routing daemon for Unix like systems. Developed as a school project at the Faculty of Mathematics and Physics, Charles University, Prague, with major contributions from developers Martin Mares, Pavel Machek and Ondrej Filip...

    , a GPL routing package for Unix-like systems.
  • GNU Zebra
    GNU Zebra
    Zebra is a routing software package that provides TCP/IP based routing services with routing protocols support such as RIP, OSPF and BGP. Zebra also supports special BGP Route Reflector and Route Server behavior. In addition to traditional IPv4 routing protocols, Zebra also supports IPv6 routing...

    , a GPL
    GNU General Public License
    The GNU General Public License is the most widely used free software license, originally written by Richard Stallman for the GNU Project....

     routing suite supporting BGP4.
  • OpenBGPD
    OpenBGPD
    OpenBGPD allows general purpose computers to be used as routers. It is a Unix system daemon that provides a free, open-source implementation of the Border Gateway Protocol version 4. This allows a machine to exchange routes with other systems that speak BGP....

    , a BSD licensed implementation by the OpenBSD
    OpenBSD
    OpenBSD is a Unix-like computer operating system descended from Berkeley Software Distribution , a Unix derivative developed at the University of California, Berkeley. It was forked from NetBSD by project leader Theo de Raadt in late 1995...

     team.
  • Quagga
    Quagga (Software)
    Quagga is a network routing software suite providing implementations of Open Shortest Path First , Routing Information Protocol , Border Gateway Protocol and IS-IS for Unix-like platforms, particularly Linux, Solaris, FreeBSD and NetBSD....

    , a fork of GNU Zebra for Unix-like
    Unix-like
    A Unix-like operating system is one that behaves in a manner similar to a Unix system, while not necessarily conforming to or being certified to any version of the Single UNIX Specification....

     systems.
  • XORP
    XORP
    XORP is an open source Internet Protocol routing software suite originally designed at the International Computer Science Institute in Berkeley, California...

    , the eXtensible Open Router Platform, a BSD licensed suite of routing protocols.
  • VNE, a C# software library implementing BGP


Simulators

  • BGPviz, a Flash application that presents a graphical visualization of BGP routes and updates for any real AS on the Internet
  • SSFnet, SSFnet network simulator includes a BGP implementation developed by BJ Premore
  • C-BGP, a BGP simulator able to perform large-scale simulation trying to model the ASes of the Internet or modelling ASes as large as Tier-1.
  • BGP++, a patch integrating GNU Zebra
    GNU Zebra
    Zebra is a routing software package that provides TCP/IP based routing services with routing protocols support such as RIP, OSPF and BGP. Zebra also supports special BGP Route Reflector and Route Server behavior. In addition to traditional IPv4 routing protocols, Zebra also supports IPv6 routing...

     software on ns-2 and GTNetS network simulators
  • ns-BGP, a BGP extension for ns-2 simulator based on the SSFnet implementation
  • NetViews, a Java application that monitors and visualizes BGP activity in real time.

Test equipment

Systems for testing BGP conformance, load or stress performance come from vendors such as:
  • Ixia
  • Spirent Communications
  • Agilent Technologies
    Agilent Technologies
    Agilent Technologies , or Agilent, is a company that designs and manufactures electronic and bio-analytical measurement instruments and equipment for measurement and evaluation...


See also

  • Autonomous system (Internet)
    Autonomous system (Internet)
    Within the Internet, an Autonomous System is a collection of connected Internet Protocol routing prefixes under the control of one or more network operators that presents a common, clearly defined routing policy to the Internet....

  • Internet Assigned Numbers Authority
    Internet Assigned Numbers Authority
    The Internet Assigned Numbers Authority is the entity that oversees global IP address allocation, autonomous system number allocation, root zone management in the Domain Name System , media types, and other Internet Protocol-related symbols and numbers...

  • Private IP
    Private IP
    Not to be confused with Private IP address.PIP in telecommunications and datacommunications stands for Private Internet Protocol or Private IP. PIP refers to connectivity into a private extranet network which by its design emulates the functioning of the Internet...

  • Regional Internet Registry
    Regional Internet Registry
    A regional Internet registry is an organization that manages the allocation and registration of Internet number resources within a particular region of the world...

  • Routing
    Routing
    Routing is the process of selecting paths in a network along which to send network traffic. Routing is performed for many kinds of networks, including the telephone network , electronic data networks , and transportation networks...

  • Route filtering
    Route filtering
    In the context of network routing, route filtering is the process by which certain routes are not considered for inclusion in the local route database, or not advertised to one's neighbours...

  • Routing Assets Database
    Routing Assets Database
    Routing Assets Database , run by Merit Network, is a lookup database designed to make fundamental information about networks available. The RADb is a public registry of routing information for networks in the Internet...

     (RADB)
  • QPPB
    QPPB
    QoS Policy Propagation via BGP , is a mechanism that allows propagation of quality of service policy and classification by the sending party based on access lists, community lists and autonomous system paths in the Border Gateway Protocol , thus helping to classify based on destination instead of...

  • AS 7007 incident
    AS 7007 incident
    The AS 7007 incident was a major disruption of the Internet on April 25, 1997 that started with a router operated by autonomous system 7007 accidentally leaking a substantial part of its entire route table to the Internet, creating a routing black hole.Probably because of a bug in the...

  • Route analytics
    Route analytics
    Route analytics is an emerging network monitoring technology specifically developed to analyze the routing protocols and structures in meshed IP Networks...



Further reading

  • Chapter "Border Gateway Protocol (BGP)" in the Cisco
    Cisco Systems
    Cisco Systems, Inc. is an American multinational corporation headquartered in San Jose, California, United States, that designs and sells consumer electronics, networking, voice, and communications technology and services. Cisco has more than 70,000 employees and annual revenue of US$...

     "IOS Technology Handbook"

External links

  • Cyclops A BGP network audit tool (prefix hijack, route leakage) by UCLA
  • Codenomicon Defensics for BGP, a test automation framework for BGP Fuzzing and robustness testing
  • LinkRank A tool for BGP routing visualization by University of California, Los Angeles
  • A look at fuzzing BGP for security testing
  • BGP Routing Resources (includes a dedicated section on BGP & ISP Core Security)
  • BGP table statistics
  • ASNumber Firefox Extension showing the AS number and additional information of the website currently open
  • RIPE Routing Information Service collecting over 550 IPv4 and IPv6 BGP feeds at 14 sites around the world
  • RIS Looking Glass into the Default Free Routing zone of the Internet
  • RISwhois providing IPv4/IPv6 Address to BGP AS Origin Mapping
  • RIS BGPlay BGP routing visualization tool by Università degli Studi Roma Tre
  • Linux Magazine: Demystifying BGP (Good, Detailed BGP explanation; requires registration)
  • BGP config generator Free web-based tool to generate a Cisco/Quagga BGP configuration
  • Some important BGP RFCs
    • RFC 4271, A Border Gateway Protocol 4 (BGP-4)
    • RFC 4456, BGP Route Reflection - An Alternative to Full Mesh Internal BGP (IBGP)
    • RFC 4278, Standards Maturity Variance Regarding the TCP MD5 Signature Option (RFC 2385) and the BGP-4 Specification
    • RFC 4277, Experience with the BGP-4 Protocol
    • RFC 4276, BGP-4 Implementation Report
    • RFC 4275, BGP-4 MIB Implementation Survey
    • RFC 4274, BGP-4 Protocol Analysis
    • RFC 4273, Definitions of Managed Objects for BGP-4
    • RFC 4272, BGP Security Vulnerabilities Analysis
    • RFC 3392, Capabilities Advertisement with BGP-4
    • RFC 5065, Autonomous System Confederations for BGP
    • RFC 2918, Route Refresh Capability for BGP-4
    • RFC 1772, Application of the Border Gateway Protocol in the Internet Protocol (BGP-4) using SMIv2
    • RFC 4893, BGP Support for Four-octet AS Number Space
    • RFC 2439, BGP Route Flap Damping
    • RFC 4760, Multiprotocol Extensions for BGP-4

  • Obsolete RFCs
    • RFC 2796, Obsolete - BGP Route Reflection - An Alternative to Full Mesh IBGP
    • RFC 3065, Obsolete - Autonomous System Confederations for BGP
    • RFC 1965, Obsolete - Autonomous System Confederations for BGP
    • RFC 1771, Obsolete - A Border Gateway Protocol 4 (BGP-4)
    • RFC 1657, Obsolete - Definitions of Managed Objects for the Fourth Version of the Border Gateway
    • RFC 1655, Obsolete - Application of the Border Gateway Protocol in the Internet
    • RFC 1654, Obsolete - A Border Gateway Protocol 4 (BGP-4)
    • RFC 1105, Obsolete - Border Gateway Protocol (BGP)
    • RFC 2858, Obsolete - Multiprotocol Extensions for BGP-4
  • BGP Interactions at Router Startup Described as a Sequence Diagram (PDF)
  • http://www.mudynamics.com/products/protocol.htmlBGP Protocol Service Level Traffic Variations Mu Dynamics
    Mu Dynamics
    Mu Dynamics is an American private venture-funded company that makes hardware and software to test network services, to quantify their product's or service's reliability, availability and security.-Products and services:...

    ]
  • BGP Route Reflection Troubleshooting
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK