Lloyd

Woyd's Bwog

Sunday, February 12, 2006

[Techie] How Does iBGP's Next Hop Work?

[Note: This post is not about BGP's "next-hop-self" command.]

iBGP neighbors are usually many hops away from each other on the edges of a network. They use interior routing protocols (also called interior gateway protocols, or IGPs) such as OSPF, RIP, or EIGRP to find each other and to communicate routing information to the rest of the internal network. Usually they do this by redistributing some or all of the learned BGP routing information into the IGP. So if two iBGP neighbors decide that one of them is the preferred route to an external network, how is that decision relayed into the IGP so that the internal non-BGP routers are aware of it?

The short answer: The iBGP neighbor that is not the preferred route stops redistributing the route into the IGP altogether. In other words, the router defers to the IGP for the route in question.

For the long answer, here's a diagram:



In the diagram, Routers A and B are learning BGP routes from two separate Internet service providers. An internal web browser is routing through the non-BGP (OSPF) internal routers to the edge and out to a web server somewhere out on the Internet through the BGP edge routers. In order for the OSPF routers to learn the route to the external server, the BGP edge routers must redistribute that route (or the default route) into OSPF. Assuming you are not doing anything crazy with route-maps on the redistribution, the redistributed route shows up in the internal routing table with whatever default metric you defined on the edge routers. So, all else being equal, the OSPF routers will see pretty much equal-cost routing through the two BGP routers. So far, so good, right?
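To make the equal-cost point concrete, here is a minimal Python sketch of how an internal OSPF router might choose between two redistributed (external type 2) routes. It assumes the usual E2 comparison: lowest external metric first, then lowest forward metric to the advertising edge router. The function name `preferred_exits` and the metric values are made up for illustration, and real OSPF has further rules that this ignores.

```python
def preferred_exits(external_routes):
    """Return the edge routers tied for the best external route.

    external_routes maps an edge router name to a tuple of
    (external_metric, forward_metric). Tuples compare left to right,
    so the external metric is checked first and the forward metric
    only breaks ties.
    """
    best = min(external_routes.values())
    return sorted(name for name, cost in external_routes.items() if cost == best)


# Both edge routers redistribute with the same default metric.
print(preferred_exits({"RouterA": (100, 2), "RouterB": (100, 2)}))
# → ['RouterA', 'RouterB']

# A higher redistribution metric on Router B leaves only Router A.
print(preferred_exits({"RouterA": (100, 2), "RouterB": (200, 2)}))
# → ['RouterA']
```

With identical metrics both exits are equally good, which is the equal-cost situation described above.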

So what happens if your company's routing policy tells you that you must use Router A to get to the Internet, and Router B is a cold standby? Well, you could set a higher default OSPF metric on Router B's redistribution, but let's say that instead, you decided to use the Local Preference (LP) attribute of BGP. Here's what would happen:

  1. Routers A and B would still learn the destination route from their respective ISPs over the eBGP connections.
  2. After learning the route, Routers A and B would use a route-map to identify the destination route in question, and set the LP attribute accordingly. The commands to do that would look something like this (using Router A as an example):

    ! Match the destination prefix in question
    ip prefix-list ExamplePrefixList seq 5 permit a.b.c.d/xy
    !
    ! Set the local preference for routes matching that prefix
    route-map ExampleRouteMap permit 10
      match ip address prefix-list ExamplePrefixList
      set local-preference 100
    !
    ! Permit all other routes unchanged
    route-map ExampleRouteMap permit 20
    !
    ! Apply the route-map to routes received from the neighbor
    router bgp 2
      neighbor [neighbor address] route-map ExampleRouteMap in

    The only difference on Router B would be that instead of "set local-preference 100", it would have a lower setting, say "set local-preference 50".
  3. Routers A and B would communicate the destination route information to each other over their iBGP connection, along with the LP attribute. Through this iBGP communication, they would learn that Router A is the one with the better route to the destination network.
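The local-preference comparison in the steps above can be sketched in a few lines of Python. This is only an illustration of the "highest LP wins" step, not the full BGP best-path algorithm; the `BgpPath` class and `best_path` function are hypothetical names, and the LP values are the 100 and 50 from the example config.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BgpPath:
    learned_via: str   # "eBGP" or "iBGP"
    exit_router: str   # edge router whose ISP link this path uses
    local_pref: int    # value set by the inbound route-map


def best_path(paths):
    """Pick the path with the highest local preference.

    Real BGP has many later tie-breakers (AS-path length, origin,
    MED, and so on); this sketch models only the LP step.
    """
    return max(paths, key=lambda p: p.local_pref)


# Router B's view once it has heard Router A's path over iBGP.
router_b_paths = [
    BgpPath("eBGP", "RouterB", 50),
    BgpPath("iBGP", "RouterA", 100),
]

# Router A's view while it still holds Router B's advertisement.
router_a_paths = [
    BgpPath("eBGP", "RouterA", 100),
    BgpPath("iBGP", "RouterB", 50),
]

print(best_path(router_b_paths).exit_router)  # → RouterA
print(best_path(router_a_paths).exit_router)  # → RouterA
```

Both routers arrive at the same answer: Router B prefers the path it learned over iBGP, and Router A prefers its own eBGP path.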

Now, what happens if a packet whose destination is the web server lands on Router B? Remember, Router B has an eBGP route to the destination, and eBGP has an administrative distance of 20, which is only behind Connected, Static, and EIGRP Summary routes in preference. But since Router B has learned through iBGP that the packet should go through Router A, Router B dumps the packet back into the internal network toward Router A. So, basically, Router B is dumping the packet back into the IGP, which may have delivered the packet to Router B in the first place. In other words, how does the IGP (OSPF in this case) know to send the packet to Router A instead of sending it right back to Router B, causing a routing loop?

This question had been bugging me for a while. Anyone who knows BGP knows that this works automatically, but how? So I set it up in a lab and tested it, and here's where the magic comes in: When Router B gets the iBGP information about the destination, it basically acts like it never learned the route through eBGP. It stops redistributing the BGP route into the IGP and defers to the IGP for the route to that destination. (The likely mechanism is administrative distance: once the iBGP-learned path is BGP's best path, it competes for the routing table with a default distance of 200 and loses to the OSPF route at 110.) So now the only route the IGP knows is the one it is receiving from Router A's redistribution from BGP. In the scenario above, if you do a "show ip route" for the destination route on Router B, the only thing you will see is whatever the IGP is advertising. It's like the router totally forgot it had learned the route through BGP. It looks like this, for example:

RouterB#show ip route a.b.c.d/xy
Routing entry for a.b.c.d/xy
  Known via "ospf 100", distance 110, metric 100
  Tag 1, type extern 2, forward metric 2
  Last update from 10.1.5.3 on Vlan5, 00:01:49 ago
  Routing Descriptor Blocks:
    10.1.5.3, from 10.1.4.1, 00:01:49 ago, via Vlan5
      Route metric is 100, traffic share count is 1
      Route tag 1
  * 10.1.5.2, from 10.1.4.1, 00:01:49 ago, via Vlan5
      Route metric is 100, traffic share count is 1
      Route tag 1

This behavior works automagically whenever you weight the route using a BGP attribute, such as LP or Multi-Exit Discriminator (MED).
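One way to reason about why Router B ends up with an OSPF route is administrative distance. The Python sketch below is a toy model under two assumptions that are mine rather than lab-verified: that only BGP's best path is offered to the routing table, and that a BGP route is redistributed into the IGP only when it is the installed route. The distances are the Cisco IOS defaults (eBGP 20, OSPF 110, iBGP 200); the function names are hypothetical.

```python
# Cisco IOS default administrative distances (lower wins).
ADMIN_DISTANCE = {"eBGP": 20, "OSPF": 110, "iBGP": 200}


def rib_winner(bgp_best_source, igp_sources):
    """Return the protocol whose route is installed for the prefix.

    bgp_best_source is "eBGP" or "iBGP", whichever BGP chose as its
    best path. igp_sources lists other protocols offering the prefix.
    """
    candidates = [bgp_best_source] + list(igp_sources)
    return min(candidates, key=lambda proto: ADMIN_DISTANCE[proto])


def redistributes_bgp(bgp_best_source, igp_sources):
    """Assumed rule: BGP is redistributed only if it owns the route."""
    return rib_winner(bgp_best_source, igp_sources) in ("eBGP", "iBGP")


# Router A: best path is its own eBGP route, which beats OSPF.
print(rib_winner("eBGP", ["OSPF"]), redistributes_bgp("eBGP", ["OSPF"]))
# → eBGP True

# Router B: best path is the iBGP route via Router A, which loses to
# the OSPF route that Router A is redistributing.
print(rib_winner("iBGP", ["OSPF"]), redistributes_bgp("iBGP", ["OSPF"]))
# → OSPF False
```

This matches the "show ip route" output above, where Router B knows the prefix via "ospf 100" with distance 110.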

So now you know. Here's some further reading:
