Friday, September 7, 2012

Tunneling traffic through your OpenFlow controller - Building a POX-based OpenFlow router

Why would you do this?

If we want to make an OpenFlow router, we need to be able to communicate with other non-OpenFlow routers. Normally, you would assign an IP address to your router, turn on BGP/OSPF, and then configure these protocols to talk to other routers using this IP address. With OpenFlow, the controller has the brains, but no obvious way to talk to other network devices. If only we could pretend that the controller was in the router somehow...

Can't we just look at the OpenFlow messages?

Sure, and we looked at this last week, but it's clumsy and means we need to reinvent the wheel to make software routers talk to POX. RouteFlow abstracts this by loading software routers in virtual machines, and last week's demonstration hardcoded everything into the controller. Tunnelling gives us a middle-of-the-road solution: no virtual machines needed, but we can still bind software to a network interface on the controller and let the Linux network stack handle already-solved problems like TCP.

Building a tunnel

Linux has a fantastic tool called TUN/TAP, which lets you create virtual network interfaces. One end talks to the Linux network stack and lets any application use it, and the other end talks to our program. In the spirit of keeping things modular, and minimising opportunities for me to write bad code, I've used the PyTap library to set this up. PyTap has a PIP package, which means we can easily add it to a virtualenv and continue to keep everything self-contained.

Protip: TUN interfaces take IP packets, TAP interfaces take Ethernet packets

If you haven't used virtualenvs, here's the basic idea:

virtualenv tundemo
cd tundemo
source bin/activate
pip install pytap
git clone

This will set you up with a virtualenv that has POX and PyTap ready to go. Despite being in a virtualenv, PyTap still needs root privileges, so you'll need to be root before source'ing into your virtualenv to make this work. If anyone can show me how to make this work without root privileges, I'd be happy to hear it (presumably some trickery with the /dev/net/tun device).

As with my other modules, I've hacked code into a copy of forwarding.l2_learning - this time I've renamed it to tundemo, and changed the name of the class all through the source.

Here are all my imports; add these at the top:

from pytun import TunTapDevice, IFF_TAP
from pox.lib.addresses import *
from pox.lib.packet import *
from threading import Thread
import subprocess
import struct

In the __init__() function, I've put the following code to make the TAP device:

    # Our table
    self.macToPort = {}
    # TAP device
    self.tap = TunTapDevice(flags=IFF_TAP)
    self.tap.addr = ''
    self.tap.netmask = ''
    print "hwaddr for " + + ": " + str(EthAddr(self.tap.hwaddr))
    # Bring tap interface up
    subprocess.check_call("ifconfig " + + " up", shell=True)

PyTap chooses a random MAC address when it creates the interface, so printing it out lets us debug things a bit easier.
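For the curious, a locally-administered random MAC (the kind of address a TAP driver typically assigns) can be generated in a few lines of plain Python. This is just a sketch of the idea, not PyTap's actual implementation:

```python
import random

def random_local_mac():
    # Pick six random octets, then fix up the first one:
    # set the locally-administered bit (0x02) and clear the
    # multicast bit (0x01), as TAP drivers generally do
    octets = [random.randint(0, 255) for _ in range(6)]
    octets[0] = (octets[0] & 0xfc) | 0x02
    return ':'.join('%02x' % o for o in octets)

print(random_local_mac())  # e.g. 9a:02:5e:11:d2:41
```

The locally-administered bit is what stops these random addresses from colliding with vendor-assigned MACs.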

Tunneling from TAP to switch

Once we have our TAP interface up, we need to handle packets that we receive on it. Let's set up a thread to handle this

    # Create thread to read from tap and send to switch
    self.tap_thread = Thread(target=handle_tap_in, args=(self,))
    self.tap_thread.daemon = True
    self.tap_thread.start()

    # Set max packet size to 1400 bytes
    self.connection.send(of.ofp_set_config(miss_send_len=1400))

Our handler function is fairly straightforward

def handle_tap_in(switch):
  while True:
    packettap =
    print "Packet read from tap"
    # skip the 4-byte TUN/TAP header and parse the Ethernet frame
    e = ethernet(raw=packettap[4:])
    port = of.OFPP_ALL
    if e.dst in switch.macToPort:
      port = switch.macToPort[e.dst]
    msg = of.ofp_packet_out() = packettap[4:]
    msg.actions.append(of.ofp_action_output(port = port))
    switch.connection.send(msg)

This will send all packets that come up on the tap0 interface to the switch, and either floods them or sends them on the right port, depending on what MAC addresses we've already learned.
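The flood-or-forward decision above is just the classic MAC-learning table. Stripped of the OpenFlow plumbing, it reduces to a dictionary lookup (the `FLOOD` sentinel and port numbers here are illustrative, standing in for `of.OFPP_ALL` and real switch ports):

```python
FLOOD = object()  # stand-in for of.OFPP_ALL

class MacTable:
    def __init__(self):
        self.mac_to_port = {}

    def learn(self, src_mac, in_port):
        # Remember which port this source address was seen on
        self.mac_to_port[src_mac] = in_port

    def lookup(self, dst_mac):
        # Unicast if we've seen the destination before, else flood
        return self.mac_to_port.get(dst_mac, FLOOD)

table = MacTable()
table.learn("00:11:22:33:44:55", 3)
print(table.lookup("00:11:22:33:44:55"))           # 3
print(table.lookup("ff:ff:ff:ff:ff:ff") is FLOOD)  # True
```

The same `macToPort` dictionary that l2_learning already maintains is what lets our TAP traffic avoid flooding once a destination has been learned.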

Tunneling from switch to TAP

We already get sent packets from the switch by default, and these go to the _handle_PacketIn() function. We just need to get the raw data out and send this to the TAP interface

My switch always sends VLAN-tagged packets, so if yours doesn't then you'll want to change this a bit. Here is the SendToTap() function:

def SendToTap(self, packet):
      # remove vlan header and rebuild
      print "Forwarding packet"
      v = packet.payload   # the vlan header
      i = v.payload        # the packet inside the vlan tag
      eth = ethernet(src=packet.src, dst=packet.dst, type=v.eth_type)
      eth.payload = i
      print type(i)
      # first 4 bytes are 00 00 08 00 (null short, then IPv4 ethertype)
      totap = struct.pack('!bbbb', 0, 0, 8, 0) + eth.pack()
      #print totap.encode('hex')

And we call this when a packet comes to us with a multicast MAC or our MAC:

if packet.dst == EthAddr(self.tap.hwaddr):
      print "Packet for us!"
      self.SendToTap(packet)

if packet.dst.isMulticast():
      self.SendToTap(packet)
      flood() # 3a

Now the tunnel is all good to go. Just make sure any devices plugged into the switch have an MTU of 1300, and you can talk to the controller and transfer files off it with SCP (it took 30 minutes to copy an Ubuntu ISO at around 4Mb/s).

A couple of hiccups

Packet sizes

My switch doesn't seem to handle having the packet-size value changed. POX by default tells the switch to send the first 128 bytes of packets, and while we can send messages to increase this, they're ignored. The work-around is to change DEFAULT_MISS_SEND_LEN to 1400 in pox/openflow/
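The symptom is easy to reproduce in miniature: if the switch only hands the controller the first 128 bytes of each packet, everything we re-inject into the TAP device arrives truncated. A toy illustration (not POX code, just the arithmetic):

```python
MISS_SEND_LEN_DEFAULT = 128

def packet_in(frame, miss_send_len):
    # The switch only sends the first miss_send_len bytes northbound
    return frame[:miss_send_len]

frame = b'\x00' * 14 + b'A' * 1000       # 14-byte Ethernet header + payload
seen = packet_in(frame, MISS_SEND_LEN_DEFAULT)
print(len(seen))                          # 128 - the rest never reaches us
print(len(packet_in(frame, 1400)))        # 1014 - intact with the larger limit
```

That's why the tunnel silently breaks for anything bigger than a ping until the miss-send length is raised.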

Latency

Latency varies from 1ms to 50ms, and TCP really, really doesn't like this. UDP-based routing protocols like OSPF shouldn't notice, and even TCP-based routing protocols like BGP should be fine - but bulk TCP gets really confused, so you shouldn't expect any large data flows to work well over this tunnel.

MTU sizes

This stuff confuses me. I'm a network engineer, and I'm supposed to know this stuff, but I don't. When we read from the TAP device, we read the MTU + 24 bytes. There's 14 bytes for the Ethernet header, 4 bytes for the TAP header, and another 6 bytes in there for no obvious reason. 24 bytes just seems to work, and I have no idea why.
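For what it's worth, the bytes we can account for add up like this (the `EXTRA` constant below is just the observed fudge factor, not something I can explain):

```python
MTU = 1500            # interface MTU (the IP packet)
ETH_HEADER = 14       # dst MAC (6) + src MAC (6) + ethertype (2)
TAP_HEADER = 4        # TUN/TAP packet-information header: flags (2) + proto (2)
EXTRA = 6             # observed on the wire, origin unknown

read_size = MTU + ETH_HEADER + TAP_HEADER + EXTRA
print(read_size)  # 1524, i.e. MTU + 24
```

So 18 of the 24 bytes are the Ethernet and TAP headers; the remaining 6 are the mystery.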

TAP device

Two things bug me about this: there doesn't seem to be a nice way to bring the interface up (apart from shelling out to ifconfig), and you need root to create it in the first place. I'd want to fix both of these for a nicer solution.

Next steps

  • TAP devices could be created for each physical port on an OpenFlow device, or as routed interfaces for each VLAN - limitless opportunities here
  • BIRD or Quagga could bind to a TAP device, and the controller could turn routes into flows. BIRD has a python interface, but since both use standard routing protocols, you could easily sniff the traffic and build routing tables out of these. Sniffing BGP updates is still way easier than trying to build a Python TCP stack
  • VRFs? Traffic injection? Just another example of how easy it is to grab POX and do novel things with inexpensive hardware

Friday, August 31, 2012

ARP and ping in POX - Building a POX-based OpenFlow router

What are we doing?

Today, we're going to look at how to handle ARP and ICMP ping messages in the OpenFlow controller POX. The results aren't amazing - latency is between 5 and 50 milliseconds (using pypy makes no difference) - but it's an important feature for any layer 3 device.

Why is it important?

If we want to make native router modules in OpenFlow, we need to be able to assign IP addresses to interfaces on our device. This means the router can talk IP to other devices on the network, a vital step towards building an OpenFlow router.

What about RouteFlow?

RouteFlow is a fully-functional OpenFlow router that you can use today, that translates the physical ports on your OpenFlow device to interfaces on a virtual machine. This virtual machine runs a software router daemon like Quagga or BIRD, meaning you can leverage a mature software router instead of making your own.

RouteFlow represents an important step in OpenFlow routing, but I think we can do better. RouteFlow polls the RIB on a virtual machine and translates that to OpenFlow, which means the router daemons don't know that they're talking to the controller.

If we build a clean interface, we can write POX modules for OSPF, IS-IS, BGP and the like, and let them talk directly to the controller.

How to make packets in POX

I love the packet library in POX, it's clean and easy to use. To make your own packets, just do what your network stack normally does - create the payload, wrap that in the layer below, then the layer below that, and once you're at Ethernet you're finished.
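The inside-out layering idea is independent of POX. Here's the same construction done with raw `struct` packing, building a minimal ARP request inside an Ethernet frame; the addresses are made up for the example and this is a sketch, not how POX packs things internally:

```python
import struct

def pack_arp_request(src_mac, src_ip, dst_ip):
    # Innermost layer first: the 28-byte ARP payload
    arp = struct.pack('!HHBBH', 1, 0x0800, 6, 4, 1)  # ethernet/IPv4, opcode 1 = request
    arp += src_mac + src_ip
    arp += b'\x00' * 6 + dst_ip                      # target MAC is what we're asking for
    # Then wrap it in the layer below: the 14-byte Ethernet header
    eth = struct.pack('!6s6sH', b'\xff' * 6, src_mac, 0x0806)  # broadcast, ethertype ARP
    return eth + arp

frame = pack_arp_request(b'\x00\x12\x34\x56\x78\x90',
                         bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
print(len(frame))  # 42 bytes: 14 Ethernet + 28 ARP
```

POX's packet library does exactly this packing for you when you call `e.pack()`, which is why building replies is so painless.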

Step 1: ARP replies

I've started with the forwarding.l2_learning module from POX, and added some code to the _handle_PacketIn function, just under self.macToPort[packet.src] = event.port (so that MAC addresses are still stored for each new port).

match = of.ofp_match.from_packet(packet)
if ( match.dl_type == packet.ARP_TYPE and
match.nw_proto == arp.REQUEST and
match.nw_dst == IPAddr("")):
  self.RespondToARP(packet, match, event)

This checks for ARP requests for our hardcoded IP, and responds. The code to respond is as follows:

  def RespondToARP(self, packet, match, event):
    # reply to ARP request
    r = arp()
    r.opcode = arp.REPLY
    r.hwdst = match.dl_src
    r.protosrc = IPAddr("")
    r.protodst = match.nw_src
    r.hwsrc = EthAddr("00:12:34:56:78:90")
    e = ethernet(type=packet.ARP_TYPE, src=r.hwsrc, dst=r.hwdst)
    e.payload = r
    log.debug("%i %i answering ARP for %s" %
     (event.dpid, event.port, str(r.protodst)))
    msg = of.ofp_packet_out() = e.pack()
    msg.actions.append(of.ofp_action_output(port = of.OFPP_IN_PORT))
    msg.in_port = event.port
    self.connection.send(msg)

We build an ARP packet by calling the arp() function from pox.lib.packet, and it initialises the packet as follows:

def __init__(self, raw=None, prev=None, **kw):

        self.prev = prev

        self.hwtype     = arp.HW_TYPE_ETHERNET
        self.prototype  = arp.PROTO_TYPE_IP
        self.hwsrc      = ETHER_ANY
        self.hwdst      = ETHER_ANY
        self.hwlen      = 6
        self.opcode     = 0
        self.protolen   = 4
        self.protosrc   = IP_ANY
        self.protodst   = IP_ANY = b''

        if raw is not None:
            self.parse(raw)

We just need to set the OPCODE, HWSRC, HWDST, PROTOSRC and PROTODST fields of this. I've done this in the body of the code, but we can simplify it by passing extra arguments as follows:

r = arp( opcode=arp.REPLY, 
         protosrc = IPAddr(""),
         protodst = match.nw_src)

Once we've created the ARP packet, we need to create an Ethernet packet to put this into. This isn't perfect (we should check for VLAN tags and add them, or steal the body of the original packet and modify that), but it works if we're just dealing with a straight Ethernet network.

e = ethernet(type=packet.ARP_TYPE, src=r.hwsrc, dst=r.hwdst)

Then we send this off to the controller, which sends it out the port it came through. Now we have an IP address that people can find, let's make it respond to something.

Step 2: Ping replies

If we can reply to ARP requests, we can reply to pings. This has a few more layers, but that just makes the code a little longer, not any more complicated.

The ARP reply is easy - we make an ARP packet, then put that in an Ethernet packet. For ping reply, this is what we do:
  1. Get the payload from the echo request (ping)
  2. Create an echo reply packet, insert the old payload
  3. Create an ICMP packet, insert the echo reply
  4. Create an IPv4 packet, insert the ICMP
  5. Create an Ethernet packet, insert the IPv4
Here's what the code looks like:

  def RespondToPing(self, ping, match, event):
    p = ping
    # we know this is an ICMP Echo packet, so loop through
    # maybe this needs a try... except?
    while not isinstance(p, echo):
      p =
    r = echo(, seq=p.seq)
    i = icmp(type=0, code=0)   # type 0 = echo reply
    i.payload = r
    ip = ipv4(protocol=ipv4.ICMP_PROTOCOL,
              srcip=match.nw_dst, dstip=match.nw_src)
    ip.payload = i
    e = ethernet(type=ping.IP_TYPE, src=ping.dst, dst=ping.src)
    e.payload = ip
    log.debug("%i %i answering PING for %s" % (
              event.dpid, event.port, str(match.nw_src)))
    msg = of.ofp_packet_out() = e.pack()
    msg.actions.append(of.ofp_action_output(port = of.OFPP_IN_PORT))
    msg.in_port = event.port
    self.connection.send(msg)

Simple, just slightly longer than the ARP code.


Here's a look at the controller output, and the view from Wireshark.

The OpenFlow dissector for Wireshark is part of the OpenFlow reference switch. It's a few years old, and uses an obsolete API call - I can put up a patch if anyone gets stuck.

Next steps

  • ARP tables - if we're going to route traffic, we need to find the MAC addresses of destination IPs so that we send traffic to them
  • Routing protocol - RIP and OSPF will be fairly easy, BGP will be a bit harder due to relying on TCP. These can all be added to POX as modules
  • TUN/TAP support - we can create TUN/TAP interfaces and let the linux TCP stack do the hard work for us. This means a BGP module would create the TUN/TAP interface and handle OpenFlow encapsulation/decapsulation, but could offload the TCP to the Linux stack.

Tuesday, August 21, 2012

DjangoFlow part two: Quick and simple UI

In the last episode...

My previous blog post showed how to integrate POX with Django, and didn't have too much colour. It took a bit of playing to get it to integrate for the first time, but now it's done, it is incredibly easy to do cool stuff with this software combo, and make new apps for easy OpenFlow access.

Part 2: An actual User Interface

I put in about 10 minutes extra just so I could give you all some lovely pictures, and here's what it all looks like:

This is all Django - I've just turned on the admin interface. Here we can add and remove flows from the database.

Here's a list of the two flows that I put in for testing

And here's the output from my OpenFlow switch.

It doesn't update in real time, but you can manipulate the database via the interweb and push those flows directly out to your switch. Making this update in real time is going to be another half-hour's work, tops.

What's new?

I changed the model a bit - now we can choose ports as well. Remember to delete your old database before syncing again or else it'll get upset

class Flow(models.Model):
    internalip = models.CharField(max_length=200)
    externalip = models.CharField(max_length=200)
    internalport = models.IntegerField()
    externalport = models.IntegerField()
    idletime = models.IntegerField()
    hardtime = models.IntegerField()

    def __unicode__(self):
        return ("Internal: IP=" + self.internalip + " port=" + str(self.internalport)
                + ", External: IP=" + self.externalip + " port=" + str(self.externalport))
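The `__unicode__` method is only there to control how each flow renders in the admin list. Outside Django the same concatenation looks like this (class name and sample values are made up for illustration):

```python
class FlowRepr:
    # Plain-Python stand-in for the Django model's __unicode__ output
    def __init__(self, internalip, internalport, externalip, externalport):
        self.internalip = internalip
        self.internalport = internalport
        self.externalip = externalip
        self.externalport = externalport

    def label(self):
        return ("Internal: IP=%s port=%d, External: IP=%s port=%d"
                % (self.internalip, self.internalport,
                   self.externalip, self.externalport))

f = FlowRepr("", 1, "", 80)
print(f.label())  # Internal: IP= port=1, External: IP= port=80
```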

The code in got a bit of a birthday too:

class LearningSwitch (EventMixin):
  def __init__ (self, connection, transparent):
    # Switch we'll be adding L2 learning switch capabilities to
    self.connection = connection
    self.transparent = transparent

    # Our table
    self.macToPort = {}

    # We want to hear PacketIn messages, so we listen
    self.listenTo(connection)

    #log.debug("Initializing LearningSwitch, transparent=%s",
    #          str(self.transparent))
    # add new flows by default
    for flow in Flow.objects.all():
      self.AddFlowFromModel(flow)

  def AddFlowFromModel(self, flow):
    # add outgoing flow
    msg = of.ofp_flow_mod()
    msg.match = of.ofp_match()
    msg.match.dl_type = ethernet.IP_TYPE
    msg.match.nw_src = str(flow.internalip)
    msg.match.nw_dst = str(flow.externalip)
    msg.match.in_port = flow.internalport
    msg.idle_timeout = flow.idletime
    msg.hard_timeout = flow.hardtime
    msg.actions.append(of.ofp_action_output(port = flow.externalport))
    self.connection.send(msg)
    # add incoming flow
    msg = of.ofp_flow_mod()
    msg.match = of.ofp_match()
    msg.match.dl_type = ethernet.IP_TYPE
    msg.match.nw_src = str(flow.externalip)
    msg.match.nw_dst = str(flow.internalip)
    msg.match.in_port = flow.externalport
    msg.idle_timeout = flow.idletime
    msg.hard_timeout = flow.hardtime
    msg.actions.append(of.ofp_action_output(port = flow.internalport))
    self.connection.send(msg)

After this, it was just a case of enabling the admin interface as per - this is our

from django.contrib import admin
from flew.models import Flow

admin.site.register(Flow)

And for the sake of completeness, here's how to start them - the web interface is

python runserver

and the controller is

python forwarding.l2_learning

What next?

I don't know... I thought this would be much harder? Authentication will be fun, as will some way to dynamically check the database and update flows in real time (and maybe remove them) - but this is left as an exercise for the reader.

DjangoFlow - Web UI for the POX OpenFlow controller

The Plan

OpenFlow is a simple concept - an open interface to switch and router hardware. Can we tie this into an open web framework to create a foundation for a really simple Web UI?

The tools

I've been learning how to use Django recently, and it seems like a perfect choice for this task. It's written in Python, so it integrates easily with the POX controller. I've also used virtualenv to make it more portable - this means the project is largely self-contained and can be copied to a new system by simply copying the folder.


To start, I created a virtualenv called "djangoflow" on an Ubuntu machine and installed django in it. There are lots of ways to do this - do a google for "django setup virtualenv ubuntu" and you'll get tons of hits.

I made a project called mysite, and an app called flew (NZ english for "flow"), and left that bit for the time being.

The folder structure then looks something like this:

- bin
- include
- lib
- local
- mysite
--- flew
--- mysite

I then did a git clone of the latest build of POX from the NOXREPO github, into the djangoflow folder, and then copied everything from the base mysite folder into the pox folder. The folder structure then looks like:

- bin
- include
- lib
- local
- pox
--- .git
--- flew
--- mysite
--- pox

This is great - now back to Django.

The model that I used was really simple, here's what I've got so far:

from django.db import models

# Create your models here.

class Flow(models.Model):
    internalip = models.CharField(max_length=200)
    externalip = models.CharField(max_length=200)
    idletime = models.IntegerField()
    hardtime = models.IntegerField()

    def __unicode__(self):
        return "Internal: " + self.internalip + ", External: " + self.externalip

class User(models.Model):
    name = models.CharField(max_length=200)

class Device(models.Model):
    dpid = models.CharField(max_length=200)

We'll only use the Flow model today, the others are for expansion later. To make this work, we'll need to edit the settings in mysite.settings - set up the databases and installed_apps sections as follows:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3', # Add 'postgresql_psycopg2', 'mysql', 'sqlite3' or 'oracle'.
        'NAME': 'flew.db',                      # Or path to database file if using sqlite3.
        'USER': '',                      # Not used with sqlite3.
        'PASSWORD': '',                  # Not used with sqlite3.
        'HOST': '',                      # Set to empty string for localhost. Not used with sqlite3.
        'PORT': '',                      # Set to empty string for default. Not used with sqlite3.
    }
}

INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    # 'django.contrib.sites',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'flew',
    # Uncomment the next line to enable the admin:
    # 'django.contrib.admin',
    # Uncomment the next line to enable admin documentation:
    # 'django.contrib.admindocs',
)

Awesome, now run syncdb (check against whatever tutorial you use to see how this is done) and populate the database. Now, fire up the shell and create your first flow:

python shell
from flew.models import Flow
f = Flow(internalip = "", externalip = "", idletime = 300, hardtime = 3600)

If that all worked, then you'll have a new entry in your database. We are never going to access the database directly though - we can access all the Django goodness from POX.

Open up pox/forwarding/ and have a look through - if you've used this before, then good, if not, then see what it all does.

I've hijacked this just for this example, but it sets a starting point for any POX-Django integrated tools. Make sure your imports section looks like this:

# import django stuff
from import setup_environ
from mysite import settings
setup_environ(settings)
from flew.models import Flow

from pox.core import core
import pox.openflow.libopenflow_01 as of
from pox.lib.revent import *
from pox.lib.util import dpidToStr
from pox.lib.util import str_to_bool
from pox.lib.packet import ethernet
from pox.lib.packet import ipv4
import time

Then go down to the __init__ function for LearningSwitch and add this to the end:

    # add new flow by default
    flow1 = Flow.objects.all()[0]
    msg = of.ofp_flow_mod()
    msg.match = of.ofp_match()
    msg.match.dl_type = ethernet.IP_TYPE
    msg.match.nw_src = str(flow1.internalip)
    msg.match.nw_dst = str(flow1.externalip)
    msg.idle_timeout = flow1.idletime
    msg.hard_timeout = flow1.hardtime
    msg.actions.append(of.ofp_action_output(port = 1))
    #msg.buffer_id = event.ofp.buffer_id # 6a
    self.connection.send(msg)

What does this do? It takes the first Flow out of our database, creates a flow to allow traffic from internalip, going to externalip, to go out port 1, which should be pointing at the outside world. This flow will stay in the switch for an hour, or 5 minutes without being triggered - whichever happens first.
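The interaction between the two timeouts can be sketched in a few lines: a flow expires at install time plus the hard timeout, or at its last hit plus the idle timeout, whichever comes first. This is just the semantics, not OpenFlow code:

```python
def flow_expired(now, installed, last_hit, idle_timeout, hard_timeout):
    # Hard timeout: absolute lifetime since installation
    if now - installed >= hard_timeout:
        return True
    # Idle timeout: time since the flow last matched a packet
    if now - last_hit >= idle_timeout:
        return True
    return False

# Installed at t=0 with idle=300, hard=3600 (the values from our Flow)
print(flow_expired(3700, 0, 3500, 300, 3600))  # True - hard timeout hit
print(flow_expired(3400, 0, 3000, 300, 3600))  # True - idle for 400s
print(flow_expired(3400, 0, 3200, 300, 3600))  # False - still active
```

So a busy flow lives at most an hour, and a quiet one disappears after five idle minutes.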

Question time

Q: Is it really this easy?
A: Yes. Python is easy, Django is easy, Virtualenv is easy, POX is easy. It's just a little tricky making them all work together - that's what this guide is for.

Q: Why add flows this way when we already have a CLI?
A: Django has a clean admin interface that I haven't covered here (check the setup tutorial on the Django website) - you can set flows in there, and every time a switch connects, it will use those flows.

Q: Could I start using this right now?
A: Totally. If you want your switches to retrieve their configuration from the controller automatically when they start up, you can set a bunch of super specific flows here and roll it out now. If you want something a bit more clever, then you can jump into the Django and POX API and make it happen yourself.

Q: What next?
A: There are two extra models that we didn't use - one for user authentication, and one for managing devices. If your OpenFlow devices connect by SSL (which they should in the real world) then you can create models for them that hold particular flows - this can all be managed via a web interface. As for user authentication, there are millions of ways to do this - you can set privileges for who can set certain flows, make requests that admins can approve - the possibilities are endless!

Thursday, May 31, 2012

How do you use szi-python?

A long time ago I wrote a program called SZI - Ściągnij z IPLY ("download from IPLA"). It was hard to use - a grim CLI and nothing more. It worked, sure, but for most people it was too difficult.

Szi-python can be downloaded at

A few weeks ago I started writing a new version in Python. It's a lovely language, easy to write, and you can do a lot with it. After only a few hours I already had a new version - simple, clean, and working well (I hope).

So what does it look like? Boring. I open the program and see this:

I click the + next to "IPLA", and I can already see the categories.

It's easy to dig through the tree to whatever you're after. When you find something you like, double-click it, and wait...

I haven't added multi-threading yet, so you have to wait until the episode list arrives from the server.

Double-click the episode you want to download, and wait even longer. Episodes of Waga are tiny (only 3.5MB) so you only have to wait a couple of minutes, but bigger episodes take longer.

And that's all there is to it. I plan to add multi-threading, tidy up the interface and so on... but that's not needed right now. It works and it's easy to use - that's what matters most for now.

If you want to help me with this app, I'd be happy to talk - my twitter is @samrussellnz, tweet me and maybe we'll write a great app together!

Friday, May 4, 2012

Accident of simplicity: why the Internet trumps everything


Once upon a time, there were book writers, and there were book publishers. They were dependent on each other - the book writer couldn't reach a large audience, and the book publisher couldn't produce good material by itself. So they joined forces - a book writer would allow a book publisher to sell copies of her books, in return for a royalty on each copy sold.

Then along came copyright. Copyright made sense - if the licensed publisher had the additional cost of royalties to the writer, then another publisher could produce copies of the same book, at the same cost, and sell them cheaper than the licensed publisher. This isn't necessarily bad for book writers - stories have been created and shared since the beginning of time - but it destroys any incentive for publishers to go into business with writers.

Fast-forward to 2012, and what has changed? Writing is less about creating art and information to share with the world, and more about the lofty goal of earning a lifetime's salary from a few weeks' work. Publishers take a much larger cut, writers are locked into more oppressive contracts, and copyright allows this unbalanced system to keep working.

The world has completely changed though... WRITERS DON'T NEED PUBLISHERS! A writer can pen a book, procure her own artwork and editing, and release a professional ebook, or charge for hard copies. And yet, writers are enslaved by publishers, and the dream of becoming rich from a disproportionately small amount of work.


I mean no disrespect to content creators (we love and need good content), but creating content doesn't automatically entitle you to a monetary reward. To quote Fight Club:

"Advertising has these people chasing cars and clothes they don't need.  Generations have been working in jobs they hate, just so they can buy what they don't really need."

You may notice that this quote is from the original novel by Chuck Palahniuk, and not from the movie that made a $10 million profit... go figure

Kids these days chase jobs as writers, musicians, programmers and actors, trying to make money selling their content. This is back-to-front - we should be passionate about the content we create first and foremost.

Punchline #1

Copyright just happened to work, back when the only way to distribute books was by hiring a book publisher. Today, copying is free, books can go around the world in seconds, and copyright isn't relevant anymore. Writers can distribute their content to billions of people at the click of a mouse, but this is no longer the goal. Copyright is now used to enslave writers, and punish their potential fans.


In the beginning, the telephone was connected to one end of a long copper cable, with another telephone on the other end. Wiretapping was just that - connecting your equipment to that cable and listening in on the conversation. Because of how simple it was to do, wiretapping was a tool that police forces (the FBI if you're in the US of A) added to their arsenal.

The telephone system has changed a lot since then. There are no longer rooms full of operators that patch phone calls through to different houses, but digital systems that the government is allowed to spy on. This is already totally different to the original wire tap, but nobody has really complained too much - it's still the same telephone company that has always been allowed to spy on us.

But now, the FBI wants to have this power to spy on every digital communication! No longer an accident of the simplicity of the telephone, the FBI is so used to being able to spy on every single citizen, that it demands to have this right extended to where it doesn't make sense anymore. There's no Facebook wire that you can tap, and there's no magical black box that decrypts Skype calls - but the FBI is demanding that these companies betray the trust of their users and re-engineer their products to let them continue to spy on us.

The biggest thing here is that the original wiretap didn't deal with encryption. But now, Twitter uses HTTPS encryption, Skype and other VOIP providers use encryption, and the FBI wants the right to be able to break that encryption so that it can continue to violate your privacy.


Have you ever read an End User License Agreement (EULA)? Most EULAs for networking software (including web browsers and email clients) mention that you cannot export the "product" to certain countries because it includes encryption. Cryptography exported to any country had to be limited to 40 bits, and even today, cryptography with more than 64 bits (Internet Explorer uses 128) cannot be exported to the following countries:

North Korea

Done a tour of Asia recently? Had your smartphone with you? You broke the law! Does that mean that you're a bad person, or does it mean that the law is totally out of touch with reality?


You can't fight change, and creating new versions of old technology rarely makes sense. Copying is free, and encryption is standard with most communication products. These are both good things, and every law that limits them will hold us back from finding the next good thing.

Tuesday, April 3, 2012

Creating an Olive with JunOS 12.1 on VirtualBox

First, download jinstall-12.1R1.9-domestic-signed.tgz from the Juniper website.

You’ll need to unpack it and play around with it a little. Unpack the tgz

sam@laptop:~/Downloads$ mkdir jinst-signed
sam@laptop:~/Downloads$ cd jinst-signed/
sam@laptop:~/Downloads/jinst-signed$ tar -xzf ../jinstall-12.1R1.9-domestic-signed.tgz
sam@laptop:~/Downloads/jinst-signed$ ls -l
razem 437104
-rw-r--r-- 1 sam sam      7153 2012-03-24 19:57 certs.pem
-rw-r--r-- 1 sam sam        50 2012-03-25 03:31 +COMMENT
-rw-r--r-- 1 sam sam      1154 2012-03-25 03:31 +CONTENTS
-rw-r--r-- 1 sam sam       195 2012-03-25 03:31 +DESC
-rw-r--r-- 1 sam sam     87216 2012-03-25 03:31 +INSTALL
-rw-r--r-- 1 sam sam      6267 2012-03-25 02:25 issu-indb.tgz
-rw-r--r-- 1 sam sam 447458263 2012-03-25 02:40 jinstall-12.1R1.9-domestic.tgz
-rw-r--r-- 1 sam sam        33 2012-03-25 03:19 jinstall-12.1R1.9-domestic.tgz.md5
-rw-r--r-- 1 sam sam        41 2012-03-25 03:19 jinstall-12.1R1.9-domestic.tgz.sha1
-rw-r--r-- 1 sam sam       525 2012-03-25 03:31 jinstall-12.1R1.9-domestic.tgz.sig

There are tons of files in this archive, and lots of guides say to play with it and pack it back up, but it's actually a lot easier to just use the unsigned archive jinstall-12.1R1.9-domestic.tgz

If we unpack this, we get the following files:

-rw-r--r-- 1 sam sam  11673600 2012-03-25 02:38 bootstrap-install-12.1R1.9.tar
-rw-r--r-- 1 sam sam        39 2012-03-25 02:39 +COMMENT
-rw-r--r-- 1 sam sam       702 2012-03-25 02:39 +CONTENTS
-rw-r--r-- 1 sam sam    106121 2012-03-25 02:39 +DEINSTALL
-rw-r--r-- 1 sam sam       244 2012-03-25 02:39 +DESC
-rw-r--r-- 1 sam sam    107634 2012-04-02 19:57 +INSTALL
-rw-r--r-- 1 sam sam 440390207 2012-03-25 02:26 jbundle-12.1R1.9-domestic.tgz
-rw-r--r-- 1 sam sam      5933 2012-04-02 19:56 pkgtools.tgz
-rw-r--r-- 1 sam sam    106669 2012-04-02 19:57 +REQUIRE

We need to edit the +INSTALL and +REQUIRE files here. Do a search for "re_name" and you'll find the following lines:

    re_name=`/sbin/sysctl -n 2>/dev/null`
    if [ -z "$re_name" ]; then
        Error " sysctl not supported."

This will make the install fail, so we need to replace it with the following:

check_arch_compatibility() {
    #re_name=`/sbin/sysctl -n 2>/dev/null`
    re_name="olive"
    if [ -z "$re_name" ]; then
        Error " sysctl not supported."
    fi

Once you've updated both of these files, unpack the pkgtools.tgz into a new directory

sam@laptop:~/Downloads/jinst-signed/jinst$ mkdir pkgtools
sam@laptop:~/Downloads/jinst-signed/jinst$ cd pkgtools/
sam@laptop:~/Downloads/jinst-signed/jinst/pkgtools$ tar -xzf ../pkgtools.tgz
sam@laptop:~/Downloads/jinst-signed/jinst/pkgtools$ ls -l
razem 8
drwxrwxr-x 2 sam sam 4096 2012-04-02 18:52 bin
drwxrwxr-x 2 sam sam 4096 2012-04-02 18:52 pkg
sam@laptop:~/Downloads/jinst-signed/jinst/pkgtools$ ls -l bin/
razem 4
-rwxr-xr-x 1 sam sam 2980 2012-04-02 19:55 checkpic

The checkpic file needs to be replaced with the "true" program from FreeBSD, so now is a good time to install FreeBSD on our virtual machine. I've used FreeBSD 7.3, because FreeBSD 8 and higher use gpart instead of bsdlabel for changing the boot partition, but if you're keen to play with a newer version, look in the +INSTALL file for the line "bsdlabel -B -b /boot/boot $labeldrive"

Anyway, get the i386 ISO for FreeBSD (the NZ mirror is on Citylink, so if your ISP zero-rates traffic through WIX/APE then it won't add to your data cap), and make up a virtual machine. I've used VirtualBox, but it should work in any virtual machine software

I used 512MB at first, but it wasn't enough for JunOS 12.1 - bumping it up to 640MB for the install made it work though. As you'll see in the later screenshots, I did this on my netbook with 990MB of RAM, so I couldn't bump it up too much further.

 An 8GB virtual hard drive gives you lots of room for new versions of software in your /var/tmp directory

This part is important - when booting, you can only see the output from JunOS through the console port. The easiest way to get this set up is to install socat and screen, and use the following command:
socat UNIX-CONNECT:/tmp/olivecom1 PTY,link=olivescreen
This will hang (only works once the virtual machine is running and the pipe is created) - so in a second terminal use this command:
screen olivescreen

 Boot up the installer, and go for a Standard install.
At the partition screen, push C to create a new slice, and allocate all the disk space to it. Then press Q and go to the next screen
 Choose a Standard boot loader
 The next screen lets you create partitions. Create the first partition with 1024M, as a file system partition, and mount point /

 Once this is done, create a 1024M swap partition, then a 16M file system partition mounted at /dummy (this makes the labels line up so the installer doesn't fail), a 1024M file system partition at /config, and the last partition (with all the left over space) on /var

 Looking good so far? If the drive labels don't match up (i.e. no partition ad0s1f) then the JunOS install will fail

 Do a minimum install, install from CD/DVD, say no to all the questions, and you're done!
 Now you can exit the installer and reboot (don't forget to eject the virtual DVD!)
 When it reboots, use "dhclient em0" to get an address, and then copy the file /usr/bin/true to your computer. The command "scp /usr/bin/true name@" should work for some computers, but under Lubuntu it fails, so I had to use my GuruPlug instead ;)

Let's leave that virtual machine cooking for a bit. Now that you have this "true" file, copy it in place of checkpic in your pkgtools/bin directory, then pack it all up:

sam@laptop:~/Downloads/jinst-signed/jinst/pkgtools$ tar -czf ../pkgtools.tgz *

Delete the pkgtools directory, and pack this all up into another .tgz

sam@laptop:~/Downloads/jinst-signed/jinst$ tar -czf ../jinstall-12.1R1.9-domestic-olive.tgz *

Then you're good to go. Use scp to copy this onto your virtual machine, and use the pkg_add command to install JunOS 12.1
 When it's all done, reboot, and watch it all in your terminal
And here is your JunOS 12.1 Olive. I haven't tested mine yet, but it will end up as a testbed for big changes at work. Make sure you use the "cli" command to get the JunOS command line, then run it like normal - use "request system power-off" to shut it down before turning it off. Check my older posts for how to network them, and remember - Ethernet multicast fails on these, but you can make it work over GRE or IP-IP tunnels (older posts cover this too).

Thursday, March 29, 2012

Polishing pyswitch


I've had my modified version of pyswitch running on NOX for a couple of weeks, and it's working fine. The key to OpenFlow is the controller - if your controller is processing a lot of packets, then it's a bottleneck; but if all your traffic is matching flows in the switch, then it will work at line speed.

As I've been using the switch for more and more test servers, I've noticed that my modifications have oversimplified things a little. Here's a summary of the current pyswitch logic:

  1. If a packet doesn't match a flow in the switch, send to the controller
  2. For each packet sent to the controller, save the source address and source port
  3. If the controller gets a packet with a destination address it knows, it sends it to that port and installs a new flow into the switch
Do you see the problem? With two computers on the switch it's fine:
  1. PC A sends a packet to PC B. No flows in the switch so the controller gets the packet, saves the address and port of A, and floods the packet
  2. PC B replies. No flow matched, controller gets the packet, saves the address and port of B, and recognises PC A. Controller then forwards the packet to the port that PC A was seen on, and installs a flow into the switch
  3. PC A sends another packet to PC B. No flow matched, controller gets the packet, recognises address of PC B so it forwards the packet and stores a flow in the switch.
  4. Flows are in the switch for both PC A and PC B, so packets to them are sent at line speed without touching the controller
What happens when PC C comes along?
  1. PC C sends a packet to PC A. There is a flow for this, so it is forwarded at line speed in the switch
  2. PC A replies to PC C. No flow, so the controller gets the packet, saves the source details (address and port of PC A), doesn't have details of PC C so it floods the packet
Do you see the problem? The source details of PC C never get stored, because all its outbound packets match flows in the switch. This is a serious problem - it means that all of the traffic back to PC C goes through the OpenFlow controller at about 10 packets per second, breaking the network.
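To make the failure mode concrete, here's a toy simulation of the simplified logic above - plain Python, not NOX code, with mac_table and flows standing in for the controller's MAC cache and the switch's flow table:

```python
mac_table = {}  # controller's MAC -> port cache
flows = set()   # destination MACs that have a flow in the switch

def send(src, dst, src_port):
    if dst in flows:
        return 'line rate'        # switch forwards it, controller never sees it
    mac_table[src] = src_port     # controller learns the source
    if dst in mac_table:
        flows.add(dst)            # install a flow for the known destination
        return 'forwarded by controller'
    return 'flooded'

send('A', 'B', 1)  # A learned, packet flooded
send('B', 'A', 2)  # B learned, flow to A installed
send('A', 'B', 1)  # flow to B installed
send('C', 'A', 3)  # matches the flow to A, so C is never learned
print('C' in mac_table)  # False - every reply to C keeps hitting the controller
```

PC C's traffic always matches an existing flow before the controller can learn it, which is exactly the hole the source-match flows below are meant to plug.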

The original pyswitch didn't have this problem - it created very specific flows based on all the source and destination attributes. I could have fixed it up to handle VLANs better (by making it recognise ethertype 0x8100 as VLAN and read past the VLAN header for the actual ethertype), but this isn't efficient - a connection to a website would have 2 flows for the initial ARP requests, another 2 for the DNS lookup, and another 2 for the TCP connection - 6 flows for a single web page?

We could strike a compromise and set flows based on the source and destination MAC addresses, but I still don't like that. It means that for N MAC addresses on the switch, you go from N flows to NxN flows; for a 48-port switch, this is from 48 flows to 2,304 flows. It may be a case of trading extra flows for simpler code, but I think I have a better solution.
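The arithmetic from the paragraph above, spelled out (assuming one MAC per port on a 48-port switch):

```python
n = 48
dest_only_flows = n      # one flow per learned destination MAC
src_dst_flows = n * n    # one flow per (source, destination) MAC pair
print(dest_only_flows, src_dst_flows)  # 48 2304
```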

My new addition to pyswitch adds a flow to the switch whenever it has to flood a packet. The idea is, when PC C comes along and sends a packet, we want that to go to the controller, even if we know the destination. Here's the new code:

# --
# If we've learned the destination MAC set up a flow and
# send only out of its inport.  Else, flood.
# --
def forward_l2_packet(dpid, inport, packet, buf, bufid):
    dstaddr = packet.dst.tostring()
    if not ord(dstaddr[0]) & 1 and[dpid].has_key(dstaddr):
        prt =[dpid][dstaddr]
        if prt[0] == inport:
            log.err('**warning** learned port = inport', system="pyswitch")
  '**warning** learned port = inport')
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_ALL, inport)
        else:
            # We know the outport, set up a flow
            log.msg('installing flow for ' + mac_to_str(packet.dst), system="pyswitch")
  'installing flow for ' + mac_to_str(packet.dst))
            # delete src flow if exists
            delflow = {}
            delflow[core.DL_SRC] = packet.dst
            inst.delete_datapath_flow(dpid, delflow)
            # sam edit - just load dest address, the rest doesn't matter
            flow = create_l2_out_flow(packet)
            actions = [[openflow.OFPAT_OUTPUT, [0, prt[0]]]]
            inst.install_datapath_flow(dpid, flow, CACHE_TIMEOUT,
                                       openflow.OFP_FLOW_PERMANENT, actions,
                                       bufid, openflow.OFP_DEFAULT_PRIORITY,
                                       inport, buf)
    else:
        # haven't learned destination MAC. Flood
        if ord(dstaddr[0]) & 1:
  'broadcast/multicast packet to ' + mac_to_str(packet.dst) + ', flooding')
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_ALL, inport)
        else:
  'no MAC known for ' + mac_to_str(packet.dst) + ', flooding')
            # set up flow to capture source packet
            flow = {}
            flow[core.DL_SRC] = packet.dst
            actions = [[openflow.OFPAT_OUTPUT, [65535, openflow.OFPP_CONTROLLER]]]
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_ALL, inport)
            inst.install_datapath_flow(dpid, flow, CACHE_TIMEOUT,
                                       1, actions,
                                       None, openflow.OFP_DEFAULT_PRIORITY+1,
                                       None, None)

Pay attention to the install_datapath_flow() functions. If we start from the bottom, you'll see that the else statement is a lot larger. Broadcast/multicast packets get flooded, but unknown packets also install a flow (at default priority+1) so that any packets from this unknown host come to the controller. This is matched by a delete_datapath_flow() call further up the function, so that when a new flow is installed, it tries to delete any flows that match the source address.

How does it perform? Each new flow sends roughly 3 packets to the controller (the first unknown packet, plus a couple more because our source-match flow doesn't get deleted before the next queued packet comes through), but we get our O(N) number of flows in the table. Looking at the ARP + UDP + TCP example from before: the old scheme's 6 flows cost the controller 6 packets, and our 2 flows also cost about 6 packets. So it loads the controller about the same as the old, specific pyswitch, but uses a fraction of the flows.
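As a quick sanity check on that accounting (the ~3 packets per flow is the rough figure from above, not a measured constant):

```python
old_flows, old_pkts_per_flow = 6, 1  # specific flows: one packet-in each
new_flows, new_pkts_per_flow = 2, 3  # dest-only flows: ~3 packet-ins each
print(old_flows * old_pkts_per_flow, new_flows * new_pkts_per_flow)  # 6 6
```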


One extra note for those of you who haven't spotted it - I've changed the action from OFPP_FLOOD to OFPP_ALL. The Pronto 3290 we have at work has always responded to FLOOD messages weirdly - it looks like it sets up individual flows for each active port, and after trawling through the OpenFlow spec I've figured out why:

OpenFlow-only switches support only the required actions below, while OpenFlow-enabled switches, routers, and access points may also support the NORMAL action. Either type of switch can also support the FLOOD action.
Required Action: Forward. OpenFlow switches must support forwarding the packet to physical ports and the following virtual ones:
• ALL: Send the packet out all interfaces, not including the incoming interface.
• CONTROLLER: Encapsulate and send the packet to the controller.
• LOCAL: Send the packet to the switch's local networking stack.
• TABLE: Perform actions in flow table. Only for packet-out messages.
• IN PORT: Send the packet out the input port.
Optional Action: Forward. The switch may optionally support the following virtual ports:
• NORMAL: Process the packet using the traditional forwarding path supported by the switch (i.e., traditional L2, VLAN, and L3 processing.) The switch may check the VLAN field to determine whether or not to forward the packet along the normal processing route. If the switch cannot forward entries for the OpenFlow-specific VLAN back to the normal processing route, it must indicate that it does not support this action.
• FLOOD: Flood the packet along the minimum spanning tree, not including the incoming interface.

See the difference? FLOOD is an optional action that activates any spanning-tree code in the switch. It's not as intensive as NORMAL (which only true hybrid switches will support), but it isn't what pyswitch is supposed to do. Changing the code to use OFPP_ALL instead of OFPP_FLOOD seems to make the switch work less on each packet that comes back from the controller - and this means the controller can handle even more flows per second!

Here's a code dump of my latest version, I may polish it and send it back to the NOX dudes later if I get the time:

# Copyright 2008 (C) Nicira, Inc.
# This file is part of NOX. Additions from Sam Russell for
# compatibility with OVS on Pronto 3290
# NOX is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# NOX is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with NOX.  If not, see <>.
# Python L2 learning switch 
# ----------------------------------------------------------------------
# This app functions as the control logic of an L2 learning switch for
# all switches in the network. On each new switch join, it creates 
# an L2 MAC cache for that switch. 
# In addition to learning, flows are set up in the switch for learned
# destination MAC addresses.  Therefore, in the absence of flow-timeout,
# pyswitch should only see one packet per flow (where flows are
# considered to be unidirectional)

from nox.lib.core     import *

from nox.lib.packet.ethernet     import ethernet
from nox.lib.packet.packet_utils import mac_to_str, mac_to_int

from twisted.python import log

import logging
from time import time
from socket import htons
from struct import unpack

logger = logging.getLogger('nox.coreapps.examples.pyswitch')

# Global pyswitch instance 
inst = None

# Timeout for cached MAC entries
CACHE_TIMEOUT = 5

# Modified extract_flow except just dest address - another sam edit
def create_l2_out_flow(ethernet):
    attrs = {}
    attrs[core.DL_DST] = ethernet.dst
#    attrs[core.DL_SRC] = ethernet.src
    return attrs

# --
# Given a packet, learn the source and peg to a switch/inport 
# --
def do_l2_learning(dpid, inport, packet):
    global inst
'learning MAC for incoming packet...' + mac_to_str(packet.src))
    # learn MAC on incoming port
    srcaddr = packet.src.tostring()
    if ord(srcaddr[0]) & 1:
        log.msg('MAC is null', system='pyswitch')
'MAC is null')
        return
    if[dpid].has_key(srcaddr):
        dst =[dpid][srcaddr]
        if dst[0] != inport:
            log.msg('MAC has moved from '+str(dst)+' to '+str(inport), system='pyswitch')
  'MAC has moved from '+str(dst)+' to '+str(inport))
    else:
'learned MAC '+mac_to_str(packet.src)+' on %d %d'% (dpid,inport))

    # learn or update timestamp of entry[dpid][srcaddr] = (inport, time(), packet)

    # Replace any old entry for (switch,mac).
    mac = mac_to_int(packet.src)

# --
# If we've learned the destination MAC set up a flow and
# send only out of its inport.  Else, flood.
# --
def forward_l2_packet(dpid, inport, packet, buf, bufid):
    dstaddr = packet.dst.tostring()
    if not ord(dstaddr[0]) & 1 and[dpid].has_key(dstaddr):
        prt =[dpid][dstaddr]
        if prt[0] == inport:
            log.err('**warning** learned port = inport', system="pyswitch")
  '**warning** learned port = inport')
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_ALL, inport)
        else:
            # We know the outport, set up a flow
            log.msg('installing flow for ' + mac_to_str(packet.dst), system="pyswitch")
  'installing flow for ' + mac_to_str(packet.dst))
            # delete src flow if exists
            delflow = {}
            delflow[core.DL_SRC] = packet.dst
            inst.delete_datapath_flow(dpid, delflow)
            # sam edit - just load dest address, the rest doesn't matter
            flow = create_l2_out_flow(packet)
            actions = [[openflow.OFPAT_OUTPUT, [0, prt[0]]]]
            inst.install_datapath_flow(dpid, flow, CACHE_TIMEOUT,
                                       openflow.OFP_FLOW_PERMANENT, actions,
                                       bufid, openflow.OFP_DEFAULT_PRIORITY,
                                       inport, buf)
    else:
        # haven't learned destination MAC. Flood
        if ord(dstaddr[0]) & 1:
  'broadcast/multicast packet to ' + mac_to_str(packet.dst) + ', flooding')
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_ALL, inport)
        else:
  'no MAC known for ' + mac_to_str(packet.dst) + ', flooding')
            # set up flow to capture source packet
            flow = {}
            flow[core.DL_SRC] = packet.dst
            actions = [[openflow.OFPAT_OUTPUT, [65535, openflow.OFPP_CONTROLLER]]]
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_ALL, inport)
            inst.install_datapath_flow(dpid, flow, CACHE_TIMEOUT,
                                       1, actions,
                                       None, openflow.OFP_DEFAULT_PRIORITY+1,
                                       None, None)
# --
# Responsible for timing out cache entries.
# Is called every 1 second.
# --
def timer_callback():
    global inst

    curtime  = time()
    for dpid in
        for entry in[dpid].keys():
            if (curtime -[dpid][entry][1]) > CACHE_TIMEOUT:
                log.msg('timing out entry'+mac_to_str(entry)+str([dpid][entry])+' on switch %x' % dpid, system='pyswitch')
      [dpid].pop(entry)

    inst.post_callback(1, timer_callback)
    return True

def datapath_leave_callback(dpid):'Switch %x has left the network' % dpid)

def datapath_join_callback(dpid, stats):'Switch %x has joined the network' % dpid)

# --
# Packet entry method.
# Drop LLDP packets (or we get confused) and attempt learning and
# forwarding
# --
def packet_in_callback(dpid, inport, reason, len, bufid, packet):

    if not packet.parsed:
        log.msg('Ignoring incomplete packet',system='pyswitch')
    if not
        log.msg('registering new switch %x' % dpid,system='pyswitch')[dpid] = {}

    # don't forward lldp packets    
    if packet.type == ethernet.LLDP_TYPE:
        return CONTINUE

    # learn MAC on incoming port
    do_l2_learning(dpid, inport, packet)

    forward_l2_packet(dpid, inport, packet, packet.arr, bufid)

    return CONTINUE

class pyswitch(Component):

    def __init__(self, ctxt):
        global inst
        Component.__init__(self, ctxt) = {}

        inst = self

    def install(self):
        inst.post_callback(1, timer_callback)

    def getInterface(self):
        return str(pyswitch)

def getFactory():
    class Factory:
        def instance(self, ctxt):
            return pyswitch(ctxt)

    return Factory()