This article is more technical than my normal posts. We are, after all, selling high tech!
Before I delve into the challenges of selling SDN, I thought it would be interesting to explain what it is, why everyone is getting excited, and my personal perspective, since in reality SDN is an old idea in the telecom world but very new in the networking world.
At the heart of OpenFlow is the idea of separating control from data. In data networking today, the messages that control set-up, build routing tables and so on are all carried in the same channel as the data. For example, these control messages are carried alongside the web page you're downloading.
Routers and networking switches have in-built intelligence and when configured correctly will make decisions about how your data gets from A to B.
Separating control from data happened in the telecom world around 40 years ago, so the concept is far from new. In fact it happened several times in telecom. The very old systems used tones carried in the speech path for signalling purposes. The problem with this is that users could “hack” the telecom network by sending those tones to trigger the controls. This problem still exists to some extent and is called phreaking. DTMF is a signalling (control) mechanism that tells the network, from your fixed line phone, what number you want to dial; this control information is carried in the data (speech) plane. Hackers manipulate PABX, voice-mail and other systems which use DTMF for control. Hacking the core telecom network in this way is no longer possible. Mobile phones send call set-up signalling information separately from the voice since they use more up-to-date technology.
The other change in the telecom network, which happened 30-40 years ago, was in the core; it was more fundamental and had a profound impact. Telecom circuits have a separate channel in which signalling information is carried. The fundamental change was not the physical separation of the signalling from the data, which was already present, but the centralisation of signalling. Digital telecom networks relied on the dialled digits to signal to the next switch how to route the call. The numbering plan was cleverly designed so the dialled digits described how to get the call from A to B. You can kind of think of it as source routed packets. However, the problem was that all these signalling channels were difficult to maintain (read: labour intensive), since each switch needed to hold part of the information about how to get from A to B. The transition to centralised signalling, or STPs (Signal Transfer Points), was a fundamental change. It meant each switch simply needed enough signalling information to know how to handle queries, while the STPs handled the details of the call set-up. It is this change which has a close parallel to what is happening in networking with SDN.
So why was the change to centralised signalling in telecom such a fundamental change? Up until that point, dialled digits were geographical: the call was set up hop-by-hop, with each switch holding lots of configuration data. It meant things like freephone numbers were difficult if not impossible to establish. In the old model each node would need to know how to route freephone numbers. With the centralised model it was possible to add new functionality (Service Control Points, or SCPs) which sat above the STPs, allowing number abstraction. Suddenly numbers were not tied to geographical locations. This meant someone dialling an 800 number could be routed to a physical phone based on their originating location or based on the state of the destination, e.g. busy. Because signalling was centralised, operator-based controls became possible, for example call blocking. These were not services for users; these were services to protect the network. If a telephone number was in high demand, for example for ticket booking or voting, call volumes could be throttled to stop excessive network load.
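The number abstraction described above can be pictured as a lookup made by central service logic rather than by every switch. The following is only a minimal sketch in Python: the table contents, the region names and the function `translate_freephone` are all hypothetical and do not represent any real signalling protocol.

```python
from typing import Dict, Set

# Hypothetical translation table: one dialled freephone number maps to
# different physical destinations depending on where the caller is.
FREEPHONE_TABLE: Dict[str, Dict[str, str]] = {
    "0800123456": {"north": "01615550100", "south": "02075550100"},
}

# Hypothetical overflow destination, used when the preferred line is busy
# or the caller's region has no dedicated destination.
OVERFLOW: Dict[str, str] = {"0800123456": "01215550199"}


def translate_freephone(dialled: str, caller_region: str, busy: Set[str]) -> str:
    """Return the physical number that a dialled number should reach."""
    by_region = FREEPHONE_TABLE.get(dialled)
    if by_region is None:
        # Not an abstract number: route on the digits as dialled.
        return dialled
    target = by_region.get(caller_region, OVERFLOW[dialled])
    if target in busy:
        # The state of the destination changes the routing decision.
        return OVERFLOW[dialled]
    return target
```

Because the table lives in one place, adding a new freephone number means changing one entry rather than reconfiguring every switch along the path.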
So let's jump forward to SDN. Separating the control and data planes means handing the routing information to a controller, which is analogous to the STP. Controllers will be connected via northbound interfaces to systems that allow applications to control the network (analogous to SCPs). In telecom, this service abstraction was labelled Intelligent Networking.
So one of the key problems with selling SDN is actually answering “what problem does it solve?”. The messages for the past 2 days at the ONF conference have been about
- Flexibility
- Responsiveness
- Lower costs
We may think of IP routing as being fresh and modern, but the reality is it has progressed very little in the last 20 years. There have been lots of bolt-on protocols attempting to overcome shortcomings. SDN concepts have been born out of frustration. Cloud services and virtualisation in the server space have reduced the time to create a new server from days to minutes. Creating a new server is a low cost, almost labour free activity, and yet the bottleneck is now the networking. The moment networking changes are needed there is no automation: it takes days to implement the manual changes and, because humans are involved, the changes are error-prone. Companies like Google have therefore resorted to writing their own SDN since they want networking to be exactly like the server space. This covers the points of flexibility and responsiveness.
The other area which has been covered over the last couple of days is lower costs. The same networking vendors that have progressed things little in the last 20 years have kept prices high. There is nothing fundamental in SDN to make things cheaper, at least from a capital cost perspective. What is really happening is that SDN is a disruptive change allowing new players to emerge, and it is shifting value from boxes to the higher layers. Networking boxes therefore become more of a commodity. It is this area where I have been involved for the last 18 months with the LINC OpenFlow switch, which enables new low cost vendors to enter the networking space.
Returning to our telecom parallel, it is at the STP (OpenFlow controller) and SCP (OpenFlow application) layers where new opportunities and players are emerging. The volume is in the switch/router layer whilst the value, and therefore profit, will be with the higher layers. Switches will be commodity with little opportunity to differentiate.
The fundamental change in telecom was not about capital cost; in reality there were extra boxes to buy, so the cost was actually higher. Operators moved to a centralised control model because support costs were reduced. The network was far easier to maintain, since routing could be changed dynamically (least cost routing) and adds and changes could be made in a simple manner from a central location.
Previously, throttling was a complex configuration activity and therefore prone to human error, but with a centralised control function it became possible to automate it.
Number translation and number abstraction were also a powerful and fundamental change. It meant numbers were no longer tied to a location. Call control became more flexible, allowing dynamic changes to call routing and call manipulation. This led to virtualised PABX-type services like Centrex. The operators had a powerful new tool to create new services, for example private dial/numbering plans, in a simple way.
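To illustrate why a central control point makes throttling easy to automate, here is a small sketch of a per-number admission check. It is an assumption-laden illustration, not how any particular telecom platform implements call gapping: the class name `CallThrottle`, the sliding-window rule and the limits are all hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class CallThrottle:
    """Admit at most max_calls per dialled number in any window_seconds span."""

    def __init__(self, max_calls: int, window_seconds: float) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        # Timestamps of recently admitted calls, kept per dialled number.
        self._admitted: Dict[str, Deque[float]] = defaultdict(deque)

    def admit(self, dialled: str, now: float) -> bool:
        recent = self._admitted[dialled]
        # Forget admitted calls that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_calls:
            # Reject at the control point, before the call loads the network.
            return False
        recent.append(now)
        return True
```

A voting line could then be capped with something like `CallThrottle(max_calls=100, window_seconds=1.0)` held in one place, instead of matching limits being configured by hand on every switch.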
So fast forward to today. The sales messages for SDN and OpenFlow are about lower cost and flexibility (easier support and faster provision). The reality is OpenFlow will probably not lead to lower capital costs – at least not initially. If you already have a router network and it works then you won't save money by replacing it with OpenFlow since you will have new investment costs and scrappage costs for your existing network. Depending on how much service and network provisioning change is taking place in your network you may have reduced support and administration costs resulting from OpenFlow's flexibility – this becomes a simple business case. Are my support cost reduction gains worth the capital investment?
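That business case can be roughed out as a simple payback calculation. The sketch below assumes the only inputs are a one-off capital figure (new investment plus scrappage) and a steady annual support saving; the function name and the figures in the usage note are illustrative, not real data.

```python
def payback_years(capital_cost: float, annual_support_saving: float) -> float:
    """Years until cumulative support savings cover the up-front spend."""
    if annual_support_saving <= 0:
        # With no saving, the investment never pays for itself.
        return float("inf")
    return capital_cost / annual_support_saving
```

For instance, `payback_years(500_000, 125_000)` gives 4.0, meaning four years before the change breaks even under these assumptions. A network with little provisioning change will see a small saving and a correspondingly long payback.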
Contrasting with telecoms is useful. Load control and throttling are comparable to today's denial of service threats. Today if we have an attack, it can be quite a manual task to apply blocking rules across a large distributed network. Ironically, building resilient, fault-tolerant networks actually creates more problems compared to a simple linear service chain. More meshing means more options for threats and attacks to find alternative routes.
OpenFlow's centralisation enables similar functionality to telecom's traffic management. Centralised blocking control means that policies such as “block all traffic from this IP address” or “block all traffic to this IP address” can be simply created and applied on a network-wide basis. If you are in the firewall business you'd better watch out. OpenFlow is going to impact your business.
For example, it would now be possible to block particular DNS lookups. Schools may restrict access to facebook.com during specific hours so that the lookup simply fails!
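The idea of creating a blocking policy once and applying it everywhere can be sketched with a toy model. This is not the OpenFlow wire protocol or any real controller's API; `DropRule`, `Switch` and `Controller` are hypothetical classes used only to show the shape of centralised control.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DropRule:
    src_ip: Optional[str] = None  # None acts as a wildcard
    dst_ip: Optional[str] = None  # None acts as a wildcard

    def matches(self, src_ip: str, dst_ip: str) -> bool:
        return self.src_ip in (None, src_ip) and self.dst_ip in (None, dst_ip)


@dataclass
class Switch:
    name: str
    drop_rules: List[DropRule] = field(default_factory=list)

    def forwards(self, src_ip: str, dst_ip: str) -> bool:
        # A packet is forwarded unless some installed rule drops it.
        return not any(rule.matches(src_ip, dst_ip) for rule in self.drop_rules)


@dataclass
class Controller:
    switches: Dict[str, Switch] = field(default_factory=dict)

    def add_switch(self, name: str) -> Switch:
        switch = Switch(name)
        self.switches[name] = switch
        return switch

    def block(self, rule: DropRule) -> None:
        # One policy decision, installed on every switch the controller knows.
        for switch in self.switches.values():
            switch.drop_rules.append(rule)
```

Calling `controller.block(DropRule(src_ip="203.0.113.7"))` once is enough for every registered switch to stop forwarding traffic from that address, which is the network-wide behaviour described above.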
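The school example could be expressed as a time-based policy check that a control application makes before a DNS query is allowed through. The sketch below is hypothetical throughout: the policy table, the hours and the function `dns_lookup_allowed` are assumptions for illustration, and a real deployment would also need to identify DNS traffic in the first place.

```python
from datetime import time
from typing import Dict, Tuple

# Hypothetical policy: domain name -> (block from, block until), local time.
BLOCKED_NAMES: Dict[str, Tuple[time, time]] = {
    "facebook.com": (time(9, 0), time(15, 30)),
}


def dns_lookup_allowed(name: str, now: time) -> bool:
    """Return False if a lookup for this name should fail at this time."""
    qname = name.rstrip(".").lower()
    for blocked, (start, end) in BLOCKED_NAMES.items():
        # Match the domain itself and any name underneath it.
        if qname == blocked or qname.endswith("." + blocked):
            if start <= now < end:
                return False
    return True
```

With this table, a lookup for www.facebook.com at 10:00 is refused while the same lookup at 16:00 goes through, and unrelated names are never affected.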
It's worth briefly considering IPv4's addressing. In reality it is hierarchical, with the numbering having a geographical binding: a given subnet relates to a country and a location. It is not so different from old world telecoms, yet we like to fool ourselves that IP is a fresh new idea!
The decoupling of control from the data forwarding plane in telecom allowed address abstraction, and the numbers pretty much became irrelevant. We still needed “unique” numbers but they were no longer cast in stone.
It is this decoupling in telecoms that spawned service innovation and flexibility. I believe that exactly the same will happen in IP and OpenFlow enables this control abstraction.
So let's look at telecom services and see what insights we can gain for the future of SDN. Firstly, numbering abstraction. We probably won't have freephone equivalents since IP bearer costs are not exposed to the end user (yet). However, being able to route traffic to an address (the equivalent of a telephone number) and route that traffic optimally is an interesting and potentially powerful feature. Consider Google. Currently there is coarse granularity traffic routing, e.g. google.com, google.co.uk etc., and there is data centre load balancing similar to ACD functionality in the telecom world. We now have IP addresses which the control plane can decide to route in different ways from today. We could for instance allow the network to handle routing to the nearest Google data centre without lots of special hacks, such as DNS or special routes, to do the data centre assignment.
Virtual, overlapping or private address spaces suddenly become possible. The traffic can be routed based on the originator. The actual address becomes in many respects irrelevant.
OpenFlow is currently based around applying routes for persistent flows, yet it doesn't have to be that way. Telecom services allowed calls (flows) to change their route based on conditions (the end-point was busy, so forward to voice-mail) or actions such as “I'll forward your call on to xyz as they can help you”. Doing the same in IP has some challenges, since TCP assumes that endpoints remain constant and hold state information, but there's nothing stopping this model for UDP traffic sources or future protocols. It's probably easier to adopt redirection approaches for TCP.
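Letting the control plane pick the nearest data centre for a single service address might look something like the sketch below. Everything here is hypothetical: the switch and data centre names, the path costs and the function `choose_data_centre` are made up for illustration and say nothing about how Google or any controller actually does this.

```python
from typing import Dict, Set

# Hypothetical path costs from each ingress switch to each data centre.
PATH_COST: Dict[str, Dict[str, int]] = {
    "london-edge": {"dc-dublin": 2, "dc-frankfurt": 3},
    "berlin-edge": {"dc-dublin": 5, "dc-frankfurt": 1},
}


def choose_data_centre(ingress: str, healthy: Set[str]) -> str:
    """Pick the cheapest healthy data centre for traffic entering at ingress."""
    candidates = {
        dc: cost for dc, cost in PATH_COST[ingress].items() if dc in healthy
    }
    if not candidates:
        raise LookupError(f"no healthy data centre reachable from {ingress}")
    # The same service address is steered differently per ingress point.
    return min(candidates, key=lambda dc: candidates[dc])
```

The client always uses the same address; the decision about where that traffic lands is made centrally and can change when a data centre drops out of the `healthy` set.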
It is not clear what SDN based applications and services will emerge yet it is clear that this base functionality will simplify the creation of innovative new services.
Telecom may seem old world but it has continued to move forward, particularly in the mobile space. Today's mobile networks are IP based, with specialist boxes to overcome the limitations of IP, e.g. GSNs for mobility functions. The latest generation is adopting a key element called a PCRF (Policy and Charging Rules Function) which manages the service level and privileges at a user level. These parallel worlds can learn from each other!
So back to selling SDN. Today the message is one of flexibility and cost. It's hard to sell these application layer services when we don't yet know what they will be or the value they deliver. If the customer values on-the-fly network provisioning and dimensioning then there is real value which can be sold. It's feasible to randomly wire an OpenFlow network and it will sort it out. Today's legacy IP networks struggle with this. The types of customers that would value this dynamic network provisioning are cloud providers or companies with large, constantly changing data centres.
Customers that place a high value on service availability and up-time would also value OpenFlow. Many IP boxes have found their way into the traffic path creating long serial service chains. When there are lots of boxes in a chain there is more chance of a service outage if one fails and more chance for unexpected interactions between boxes. Examples of these boxes are firewalls, load balancers, deep packet inspection. Generally these boxes are control functions.
It is rare that these types of boxes modify content in the payload. Usually they make a decision: do I allow this, or where should I send this? These are control functions. Occasionally there are boxes that modify payload, for example content caches, content adaptation and maybe deliberate injection of jitter on VoIP calls! For the most part they are control functions, so they can be moved into the control plane by OpenFlow. The end result is that the data path becomes much shorter with fewer boxes, which means higher reliability.
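The reliability argument is simple arithmetic: if every box in a serial chain must work for traffic to pass, the availability of the chain is the product of the availabilities of the boxes. The sketch below assumes the boxes fail independently, which is a simplification, and the 99.9% figure in the usage note is illustrative rather than measured.

```python
from math import prod
from typing import Iterable

HOURS_PER_YEAR = 8760


def chain_availability(box_availabilities: Iterable[float]) -> float:
    """Availability of a serial chain, assuming independent failures."""
    return prod(box_availabilities)


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of outage in a year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR
```

Under these assumptions, five boxes at 99.9% each give a chain availability of roughly 99.5%, or about 44 hours of outage a year, while cutting the chain to two boxes brings that down to roughly 17.5 hours.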
Today building fault tolerant IP networks is a challenge. Adding protection, redundancy and fail-over creates complexity. Complexity often leads to reduced availability so ironically the additional equipment built to provide resilience and higher availability can lead to reduced availability!
As the saying goes, “every change has a consequence”. The consequence of simplifying the data path is that the control layer potentially becomes more complex. The network has the potential to become more fluid and dynamic. Particular services, such as video traffic, might request routing for low latency and therefore have a very different traffic path to web browsing. Sure, the services get the best available performance for their needs, but it creates a new problem for the operations staff when they want to troubleshoot an issue. Where was the traffic?
OpenFlow does allow traffic forwarding, so the job of monitoring and troubleshooting can be assisted. Just like firewall vendors, if you're a probe vendor then watch out. OpenFlow is set to change your business!