Friday, August 31, 2012

ARP and ping in POX - Building a POX-based OpenFlow router

What are we doing?

Today, we're going to look at how to handle ARP requests and ICMP echo (ping) messages in the OpenFlow controller POX. The results aren't amazing - replies take between 5 and 50 milliseconds (running POX under PyPy makes no difference) - but answering them is an important feature for any layer 3 device.

Why is it important?

If we want to make native router modules in OpenFlow, we need to be able to assign IP addresses to interfaces on our device. This means the router can talk IP to other devices on the network, a vital step towards building an OpenFlow router.

What about RouteFlow?

RouteFlow is a fully functional OpenFlow router that you can use today. It maps the physical ports on your OpenFlow device to interfaces on a virtual machine. This virtual machine runs a software router daemon like Quagga or BIRD, meaning you can leverage a mature software router instead of making your own.

RouteFlow represents an important step in OpenFlow routing, but I think we can do better. RouteFlow polls the RIB on a virtual machine and translates that to OpenFlow, which means the router daemons don't know that they're talking to the controller.

If we build a clean interface, we can write POX modules for OSPF, IS-IS, BGP and the like, and let them talk directly to the controller.

How to make packets in POX

I love the packet library in POX: it's clean and easy to use. To make your own packets, just do what your network stack normally does - create the payload, wrap that in the layer below, then the layer below that, and once you're at Ethernet you're finished.
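As a rough illustration of that wrapping order, here is a toy sketch in plain Python. It does not use POX at all: the Layer class and its bracketed "headers" are made up for the example and only mimic the set_payload()/pack() pattern used later in this post.

```python
class Layer(object):
    """Toy stand-in for a packet class: a fixed header plus a payload."""

    def __init__(self, name, header):
        self.name = name
        self.header = header      # made-up bytes standing in for a real header
        self.next = b""           # payload: raw bytes or another Layer

    def set_payload(self, payload):
        self.next = payload

    def pack(self):
        # Pack the inner layer first, then prepend our own header
        if isinstance(self.next, Layer):
            inner = self.next.pack()
        else:
            inner = self.next
        return self.header + inner


# Build from the inside out, as a network stack would
icmp_layer = Layer("icmp", b"[icmp]")
icmp_layer.set_payload(b"hello")
ip_layer = Layer("ipv4", b"[ip]")
ip_layer.set_payload(icmp_layer)
eth_layer = Layer("ethernet", b"[eth]")
eth_layer.set_payload(ip_layer)

frame = eth_layer.pack()
```

Packing the outermost layer walks down the chain, so the Ethernet header ends up first on the wire and the original payload last.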

Step 1: ARP replies

I've started with the forwarding.l2_learning module from POX, and added some code to the _handle_PacketIn function, just under self.macToPort[packet.src] = event.port (so that MAC addresses are still stored for each new port).

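# Imports assumed at the top of the module (l2_learning doesn't have them):
from pox.lib.addresses import EthAddr, IPAddr
from pox.lib.packet.arp import arp

# In _handle_PacketIn. For ARP packets, nw_proto holds the ARP opcode.
match = of.ofp_match.from_packet(packet)
if (match.dl_type == packet.ARP_TYPE and
    match.nw_proto == arp.REQUEST and
    match.nw_dst == IPAddr("10.1.1.253")):
  self.RespondToARP(packet, match, event)
  return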

This checks for ARP requests for our hardcoded IP 10.1.1.253, and responds. The code to respond is as follows:

  def RespondToARP(self, packet, match, event):
    # reply to ARP request
    r = arp()
    r.opcode = arp.REPLY
    r.hwdst = match.dl_src
    r.protosrc = IPAddr("10.1.1.253")
    r.protodst = match.nw_src
    r.hwsrc = EthAddr("00:12:34:56:78:90")
    e = ethernet(type=packet.ARP_TYPE, src=r.hwsrc, dst=r.hwdst)
    e.set_payload(r)
    log.debug("%i %i answering ARP for %s" %
     ( event.dpid, event.port,
       str(r.protosrc)))
    msg = of.ofp_packet_out()
    msg.data = e.pack()
    msg.actions.append(of.ofp_action_output(port =
                                          of.OFPP_IN_PORT))
    msg.in_port = event.port
    event.connection.send(msg)

We build the ARP packet by instantiating the arp class from pox.lib.packet, whose constructor initialises the packet as follows:

    def __init__(self, raw=None, prev=None, **kw):
        packet_base.__init__(self)

        self.prev = prev

        self.hwtype     = arp.HW_TYPE_ETHERNET
        self.prototype  = arp.PROTO_TYPE_IP
        self.hwsrc      = ETHER_ANY
        self.hwdst      = ETHER_ANY
        self.hwlen      = 6
        self.opcode     = 0
        self.protolen   = 4
        self.protosrc   = IP_ANY 
        self.protodst   = IP_ANY
        self.next       = b''

        if raw is not None:
            self.parse(raw)

        self._init(kw)

We just need to set the opcode, hwsrc, hwdst, protosrc and protodst fields on this object. I've done this line by line in the body of the function, but we can simplify it by passing them as keyword arguments, which the constructor applies through self._init(kw):

r = arp( opcode=arp.REPLY, 
         hwsrc=EthAddr("00:12:34:56:78:90"),
         hwdst=match.dl_src,
         protosrc = IPAddr("10.1.1.253"),
         protodst = match.nw_src)
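I haven't reproduced POX's own helper here. The following is a stand-alone sketch of the general pattern, using a made-up ToyArp class with string addresses, and assuming each keyword is simply copied onto an existing attribute of the same name:

```python
class ToyArp(object):
    """Hypothetical class showing defaults overridden by keyword arguments."""

    REQUEST = 1
    REPLY = 2

    def __init__(self, **kw):
        # Defaults first, mirroring the shape of the constructor quoted above
        self.opcode = 0
        self.hwsrc = "00:00:00:00:00:00"
        self.hwdst = "00:00:00:00:00:00"
        self.protosrc = "0.0.0.0"
        self.protodst = "0.0.0.0"
        self._init(kw)

    def _init(self, kw):
        # Copy each keyword onto the matching attribute; reject unknown names
        for key, value in kw.items():
            if not hasattr(self, key):
                raise TypeError("unexpected keyword argument %r" % key)
            setattr(self, key, value)


r = ToyArp(opcode=ToyArp.REPLY, protosrc="10.1.1.253")
```

Anything not named in the call keeps its default, which is why the one-call form and the field-by-field form end up with the same packet.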

Once we've created the ARP packet, we need to create an Ethernet packet to put this into. This isn't perfect (we should check for VLAN tags and add them, or steal the body of the original packet and modify that), but it works if we're just dealing with a straight Ethernet network.

e = ethernet(type=packet.ARP_TYPE, src=r.hwsrc, dst=r.hwdst)
e.set_payload(r)

Then the controller sends this to the switch in a packet-out message, and the switch sends it back out the port the request arrived on (OFPP_IN_PORT). Now that we have an IP address that other hosts can find, let's make it respond to something.
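To see roughly what ends up on the wire, here is a sketch that builds an equivalent ARP reply by hand with Python's struct module instead of POX. The helper names (mac_bytes, build_arp_reply) and the requester's addresses are invented for the example; the field layout is the standard untagged Ethernet plus ARP one.

```python
import socket
import struct


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes(bytearray(int(part, 16) for part in mac.split(":")))


def build_arp_reply(our_mac, our_ip, their_mac, their_ip):
    """Return an untagged Ethernet frame carrying an ARP reply."""
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP)
    eth_header = mac_bytes(their_mac) + mac_bytes(our_mac)
    eth_header += struct.pack("!H", 0x0806)
    # ARP header: hwtype=1 (Ethernet), prototype=0x0800 (IPv4),
    # hwlen=6, protolen=4, opcode=2 (reply)
    arp_body = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    arp_body += mac_bytes(our_mac) + socket.inet_aton(our_ip)
    arp_body += mac_bytes(their_mac) + socket.inet_aton(their_ip)
    return eth_header + arp_body


# The requester's MAC and IP below are placeholders
frame = build_arp_reply("00:12:34:56:78:90", "10.1.1.253",
                        "00:00:00:00:00:01", "10.1.1.1")
```

The sketch does no padding to the Ethernet minimum frame size and, like the POX version, ignores VLAN tags.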

Step 2: Ping replies

If we can reply to ARP requests, we can reply to pings. This has a few more layers, but that just makes the code a little longer, not any more complicated.

The ARP reply is easy - we make an ARP packet, then put that in an Ethernet packet. For ping reply, this is what we do:
  1. Get the payload from the echo request (ping)
  2. Create an echo reply packet, insert the old payload
  3. Create an ICMP packet, insert the echo reply
  4. Create an IPv4 packet, insert the ICMP
  5. Create an Ethernet packet, insert the IPv4
Here's what the code looks like:

  def RespondToPing(self, ping, match, event):
    p = ping
    # we know this should be an ICMP Echo packet, so walk down the
    # layers until we find it, and give up quietly if we never do
    while p is not None and not isinstance(p, echo):
      p = getattr(p, "next", None)
    if p is None:
      return
    
    r = echo(id=p.id, seq=p.seq)
    r.set_payload(p.next)
    i = icmp(type=0, code=0)
    i.set_payload(r)
    ip = ipv4(protocol=ipv4.ICMP_PROTOCOL,
              srcip=IPAddr("10.1.1.253"),
              dstip=match.nw_src)
    ip.set_payload(i)
    e = ethernet(type=ping.IP_TYPE,
                 src=match.dl_dst,
                 dst=match.dl_src)
    e.set_payload(ip)
    log.debug("%i %i answering PING for %s" % (
              event.dpid, event.port,
              str(match.nw_src)))
    msg = of.ofp_packet_out()
    msg.data = e.pack()
    msg.actions.append(of.ofp_action_output(port =
                                          of.OFPP_IN_PORT))
    msg.in_port = event.port
    event.connection.send(msg)

Simple, just slightly longer than the ARP code.
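One detail the packet library takes care of for us is the ICMP checksum. As a stand-alone sketch (plain Python, not POX; internet_checksum and build_echo_reply are names made up here), this is how the ICMP portion of an echo reply could be built by hand:

```python
import struct


def internet_checksum(data):
    """RFC 1071 checksum: one's complement of the one's-complement sum."""
    if len(data) % 2:
        data += b"\x00"                  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_reply(ident, seq, payload):
    """Return ICMP echo reply bytes (type 0, code 0) echoing the payload."""
    # Checksum is computed over the header with a zeroed checksum field
    header = struct.pack("!BBHHH", 0, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 0, 0, csum, ident, seq) + payload


# Identifier, sequence number and payload would come from the echo request
reply = build_echo_reply(ident=0x1234, seq=1, payload=b"abcdefgh")
```

A handy sanity check is that running the checksum over the finished message, checksum field included, gives zero.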

Pictures

Here's a look at the controller output, and the view from Wireshark.

The OpenFlow dissector for Wireshark is part of the OpenFlow reference switch. It's a few years old, and uses an obsolete API call - I can put up a patch if anyone gets stuck.

Next steps

  • ARP tables - if we're going to route traffic, we need to find the MAC addresses of destination IPs so that we can send traffic to them
  • Routing protocols - RIP and OSPF will be fairly easy, BGP will be a bit harder because it relies on TCP. These can all be added to POX as modules
  • TUN/TAP support - we can create TUN/TAP interfaces and let the Linux TCP stack do the hard work for us. This means a BGP module would create the TUN/TAP interface and handle OpenFlow encapsulation/decapsulation, but could offload the TCP to the Linux stack.
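
For the ARP table item, here is a minimal sketch of what such a table might look like, independent of POX. The ArpTable class, its method names and the 300-second timeout are all assumptions for illustration; a real module would also need to send ARP requests on a miss and queue packets while a lookup is outstanding.

```python
import time


class ArpTable(object):
    """Hypothetical IP -> (MAC, port) cache with simple expiry."""

    def __init__(self, timeout=300.0, clock=time.time):
        self.timeout = timeout    # seconds before an entry is considered stale
        self.clock = clock        # injectable so the table is easy to test
        self.entries = {}         # ip -> (mac, port, time learned)

    def learn(self, ip, mac, port):
        """Record or refresh the mapping seen in an ARP packet."""
        self.entries[ip] = (mac, port, self.clock())

    def lookup(self, ip):
        """Return (mac, port), or None if the IP is unknown or expired."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, port, learned = entry
        if self.clock() - learned > self.timeout:
            del self.entries[ip]
            return None
        return mac, port
```

The table could be fed from the same place macToPort is updated, calling learn() whenever an ARP request or reply passes through the controller.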

3 comments:

  1. Just a note that forwarding.l3_learning and samples.pong have some examples of ARP and ICMP as well.

    Also, I have the start to a RIP class for the packet library somewhere if you want it. :)

    Great posts!

  2. Thanks Sam,
    When I try to use RespondToPing, I am not able to resolve the ping (passed in as an argument). How can I resolve it?

  3. Nice post. Thanks.

    I have a question for you. If I want to proactively learn the IP addresses of all the hosts connected to the switches, how should I do it?
