History of Ouroboros
Disclaimer: this article expresses the view of its author.
The origin of the Ouroboros project lies within European-funded research and experiences implementing RINA. These research projects started off as validation projects for the ideas that are embedded in the 2007 book Patterns in Network Architecture (PNA). After initial experiences, it became clear that software development within research projects has its limitations, and two parties that were working on the OpenIRATI prototype decided to roll their own RINA projects: Vincenzo Maffione at Nextworks started rlite and we at IMEC started Ouroboros. rlite stayed more-or-less true to RINA specifications, but our ambition was to fully retrace the methodology that John Day introduced to arrive at the RINA architecture, rather than start from its conclusions. We started from scratch, without any bias towards any predetermined outcomes, and trod lightly. This is a more detailed account for those interested in the back story.
2012: IBBT meets RINA
In 2012, a new research project was in the proposal stage for the "Future Internet Research and Experimentation" (FIRE) area of the 7th Framework Programme of the European Commission. The objectives of FIRE are "hands-on", aimed at building and deploying Internet technologies.
The project, Investigating RINA as an Alternative to TCP/IP (IRATI), was going to experimentally verify the ideas in PNA. The coordinating partner was i2CAT, a research center on Internet technologies located in Barcelona. Eduard Grasa was the technical lead for the project.
The IRATI project promised an implementation of RINA in Linux _and_ FreeBSD/JunOS, with detailed comparisons of RINA against TCP/IP in various scenarios, and would also demonstrate interoperability with RINA proof-of-concept prototypes with very limited functionality: the TINOS prototype and the TRIA prototype.
The Interdisciplinary Institute for Broad-Band Technology (IBBT) -- soon to be known as iMinds, and now fully integrated in IMEC -- had a very convincing track record in the FIRE program because of its networking testbed facilities, and was invited as a consortium member. IBBT and i2CAT were both part of a testbed project, called OFELIA (OpenFlow in Europe: Linking Infrastructure and Applications). The IRATI project would adopt (and extend) the OFELIA testbed as its platform for experimental deployment of the RINA prototypes.
Three partners were responsible for the new RINA implementation: i2CAT, who had experience with RINA; Nextworks, an Italian SME with expertise in networking and software development; and us at IBBT.
The Italian branch of Interoute (now part of GTT), Europe's largest Cloud Service Platform provider, had a role in the project overseeing its use-cases, and Boston University (BU) joined as an unfunded partner with an advisory role.
The project was subdivided into three phases, each starting with development activities followed by experimentation. Phase I was projected to end in November 2013 (initial closed source prototype, demonstrate capabilities comparable to TCP/IP), Phase II in June 2014 (open source prototype, demonstrate capabilities beyond TCP/IP) and Phase III was to conclude in December 2014 (final prototype, complete implementation, demonstrate interoperability with previous RINA prototypes) with the end of the project.
2013: IRATI Phase I
IRATI kicked off in January 2013 at the i2CAT offices in Barcelona. The consortium meeting was followed by a RINA workshop, bringing the project team in touch with the RINA community. John Day gave a 2-day in-depth tutorial on the subject. Eduard presented an outline of the IRATI objectives. With the RINA community gathered at the workshop, there were already initial ideas for a follow-up research proposal to IRATI.
The first work was determining the software design of the implementation. IRATI was going to build an in-kernel OS/Linux implementation of RINA. A lot of the heavy lifting on the design was already done during the project proposal preparation phase, and about 3 months into the project, the components to be implemented were well-defined and quickly published. Broadly speaking, there were 3 things to implement: the IPCPs that make up the RINA layers (Distributed IPC Facilities, DIFs), the component that is responsible for creating and starting these IPCPs (the IPC manager), and the core library to communicate between these components, called librina. The prototype was planned to be built in 3 phases over the course of 2 years.
i2CAT was going to get started on most of the management parts (IPC Manager, based on their existing Java implementation; librina, including the Common Distributed Application Protocol (CDAP) and the DIF management functions in the normal IPCP) and the Data Transfer Protocol (DTP). iMinds was going to be responsible for the kernel modules that would allow the prototype to run on top of Ethernet. Nextworks was taking a crucial software-architectural role on kernel development and software integration. For most of these parts we had access to a rough draft of what they were supposed to do, John Day's RINA reference model, which was usually referred to as _the specs_.
i2CAT had a vested interest in RINA and was putting in a lot of development effort with 3 people working on the project: Eduard, Leonardo Bergesio and Miquel Tarzán. Nextworks assigned Francesco Salvestrini, an experienced kernel developer, to the project. From iMinds, the development effort would come from Sander.
The project established efficient lines of communication, mostly using Skype and the mailing lists, and the implementation work got underway swiftly. There was a genuine sense of excitement in everybody involved in the project.
Sander's first task was to implement the shim DIF over Ethernet. This is a Linux loadable kernel module (LKM) that wraps the Ethernet 802.1Q VLAN with a thin software layer to present itself using the RINA API. The VLAN ID would double as the DIF name. No functionality would be added to the existing Ethernet protocol, so with only the src and dst address fields left, this _shim DIF_ was restricted to having only a single application registered at a time, and to a single RINA "flow" between the endpoints. We could deploy about 4000 of these _shim DIFs_ in parallel (each one is a VLAN) to support larger RINA networks. The name resolution for endpoint applications was planned to use the Address Resolution Protocol (ARP), which resolves a protocol address (L3) to a hardware address (L2). ARP was readily available in the Linux kernel, or so we thought.
The ARP implementation in the kernel only supports IPv4 protocol addresses (note that IPv6 doesn't use ARP), so it could not handle the resolution of generic RINA _application names_ to MAC addresses, which we needed for the shim DIF. So after some deliberation, the project decided to implement an RFC 826 compliant version of ARP to support the shim DIF.
In the meantime, we also submitted a small 3-partner project proposal to the GEANT framework, tailored to researching RINA in a National Research and Education Network (NREN) environment. The project, IRINA, was led by us (iMinds), partnering with i2CAT and teaming up with TSSG. IRINA would kick off in October 2013, meaning there were to be 2 parallel projects on RINA.
Despite the unforeseen development burden of an RFC 826 compliant ARP, the IRATI project had made quite some progress in its first 6 months. There were minimal working implementations for most of the needed components, and in terms of prototype functionality, IRATI was quickly overtaking the existing RINA prototypes. However, the pace of development in the kernel was still much slower than anticipated and some of the implementation objectives were revised (the FreeBSD/JunOS implementation was dropped in favor of a shim DIF for Hypervisors). With an eye on testbed deployments, Sander started designing a second shim DIF, one that would allow us to run the IRATI prototype over TCP/UDP.
In the meantime, the follow-up project that was conceived during the first RINA workshop took shape and was submitted. Led by our IRINA partner TSSG, it was envisioned to be a relatively large project, about 3.3 million Euros in EC contributions, running for 30 months and bringing together 13 partners with the objective to build the IRATI prototype into what was essentially a carrier network demonstrator for RINA, adding policies for mobility, security and reliability. PRISTINE got funded. This was an enormous boon to the RINA community, but also a bit of a shock to us as IRATI developers. The software was already behind schedule and now we had a really big third project on the horizon. The furthest we could push back the start of PRISTINE was January 2014.
As the IRATI project was framed within FIRE there was a strong implied commitment to get experimental results with the software prototype. By the last quarter of 2013, the experimentation needed to get started, and the prototype was getting its first deployment trials on the FIRE testbeds. With this move from virtual machines used for development to real hardware came more problems. The network switches in the OFELIA testbed weren't agreeing very well with our RFC-compliant ARP implementation, dropping everything that didn't carry IPv4 network addresses. We got to experience TCP/IP network ossification first hand. The main testbed at the imec computer lab also relied on VLANs to separate experiments, which didn't fare well with our idea to use them within an experiment for the shim DIF. While Sander developed the shim DIFs using the actual testbed hardware, other components had been developed predominantly in a virtual machine environment and had not been subjected to the massive parallelism that was available on dual Xeon hardware. The stability of the implementation had to be substantially improved to get stable and reliable measurements. These initial trials in deploying IRATI also showed that configuring the prototype was very time consuming. The components used JSON configuration files which had to be created for each experiment deployment, causing substantial overhead - and giving us nightmares to this very day.
The clock was ticking, and the IRATI development team was working tirelessly to stabilize the stack and to create (kernel) patches and fixes for the testbeds so they could use VLANs (on a non-standard Ethertype) within experiments. IRATI (internally) released prototype 1 just before the end of the year.
2014: PRISTINE, IRATI phase II and IRINA
The PRISTINE kick-off was organized in January 2014, together with a workshop, where John Day presented RINA, similar to the IRATI kick-off one year earlier, except this time it was in Dublin and the project was substantially bigger, especially in headcount. It brought together experts in various fields of networking with the intent of them applying that experience into developing policies for RINA.
Many of the participants in the PRISTINE project were very new to RINA, still getting to grips with some of the concepts. The first couple of months of PRISTINE were mostly about getting the participants up-to-speed with the RINA architecture and defining the use-case, which centered on a 5G scenario with highly mobile end-users and intelligent edge nodes. The scenarios were very elaborate, and the associated deliverables were absolute dreadnoughts.
During this PRISTINE ramp-up phase, development of the IRATI prototype was continuing at a fierce pace. The PRISTINE project brought in some extra developers to work on the IRATI core: Bernat Gaston (i2CAT), Vincenzo Maffione (Nextworks), and Douwe de Bock (a master's student at iMinds). i2CAT focused on management and flow control, and porting the Java user-space parts to C++. Vincenzo was focusing on the shim Hypervisor, which would allow communications between processes running in a VM host and its guest, and we were building the shim layer to run RINA over TCP and UDP.
By this time, some frustrations were starting to creep in. Despite all the effort in development, the prototype was arguably not in good shape. The development effort was also highly skewed, with i2CAT putting in the bulk of the work. The research dynamic was also changing. At the start of IRATI, there were a lot of ongoing architectural discussions about what each component should do, to improve the specs, but due to the ever increasing time pressure, the teams were working more and more in isolation. Getting it done became a lot more important than getting it right.
All this development had led to very little scientific output, which didn't go unnoticed at project reviews. The upshot of the large time-overlap between the two projects was that, in combination with the IRATI design paper that got published early-on in the project, we could afford to lose out a bit on dissemination in IRATI and try to catch up in PRISTINE. But apart from the relatively low output in research papers, the IRATI project had no real contributions to standardization bodies, which is one of the objectives put forward in the funding scheme.
In any case, the project had no choice but to push on with development, and, despite all difficulties, somewhere in mid-2014 IRATI had most basic functionalities in place to bring the software in a limited way into PRISTINE so it could start development of the PRISTINE software development kit (SDK) (which was developed by people also in IRATI).
Mostly to please the reviewers, the IRATI and PRISTINE projects tried to get some standardization going, presenting RINA at an ISO JTC1/SC6 meeting in London and also at IETF91. Miquel and I would continue to follow up on standardization in SC6 WG7 on "Future Network" as part of PRISTINE, gathering feedback on the specs and getting them on the track towards ISO RINA standards.
The IRATI project was officially ending soon, and the development was targeting the last functions of the Data Transfer Control Protocol (DTCP) component of EFCP, such as retransmission logic. Other development was now shifted completely out of IRATI towards the PRISTINE SDK.
In the meantime, the project also sorely needed some experimental results. Experimentation with the prototype was a painful and very time-consuming undertaking. A publication at Globecom 2014 was derived from some test results, and it was combined with a RINA tutorial session.
2015: End of IRATI and IRINA, PRISTINE, and the RINAiSense IoT project
January 2015, a new year and another RINA workshop. This time in Ghent, as part of a Flemish research project called RINAiSense -- which should be pronounced like the French renaissance -- that would investigate RINA applied to sensor networks - which now falls under the umbrella of "Internet of Things" (IoT). John Day again presented RINA to new members of the consortium, and this was also the time to properly introduce the IRATI prototype to everyone with a hands-on VM tutorial session, and to introduce RINAsim, an OMNET++ RINA simulator under development within the PRISTINE project.
After the workshop, it was time to wrap up IRATI. To an external observer it may lack impact and show little output in publications, and it definitely didn't deliver a convincing case for RINA as an alternative to TCP/IP. But despite that, the project made some achievements in terms of building, for the first time, some open source tools that can be used to explore the RINA architecture.
IRINA was also wrapping up, and the paper that outlined the shim DIF over hypervisors was submitted and attributed to this project.
In the spring of 2015, the clock was starting to tick on PRISTINE, as the project was now already halfway through its anticipated 30-month runtime. With the IRATI project now officially over, the IRATI project team could re-align their focus.
The main objective for the iMinds team was on policies for resilient routing: making sure the DIF survives underlying link failures. This has been a long-time research topic in our group, so we pretty much already knew how to do it at a conceptual level and just had to repackage that research in a RINA jacket. But there were three additional requirements to get credible results. First and foremost, we needed scale: experiments needed to be able to run with many nodes and many flows in the network. Second, we needed stability: to measure the recovery time, we needed to send packets at small but -- more importantly -- steady intervals. Third, we needed measurement tools.
As part of IRINA, we developed a very rudimentary traffic-generator, which could be extended for PRISTINE and tailored to suit our needs. Stability of the IRATI prototype was a bigger problem, but improved gradually over time. Our real problem was scale, where the main hurdle was the configuration of the IRATI stack, which required a preconfiguration in _json_. Vincenzo had developed a tool called the demonstrator based on tiny buildroot VMs to create setups for local testing, but this wasn't compatible with the Fed4FIRE testbeds. So Sander developed one of the first orchestrators for RINA, called the configurator, for deploying IRATI on Emulab.
Somewhere around that time, the one-flow-only limitation of the shim DIF over VLAN was hampering our scalability and a shim DIF over Ethernet Logical Link Control (LLC) was drafted and developed. By mapping endpoints to LLC Service Access Points (SAPs), this shim DIF could support parallel flows (data flows and management flows) between the client IPCPs in the layer above.
With the PRISTINE SDK released as part of OpenIRATI a good month earlier, somewhere after the January workshop, there was another influx of code into the prototype for all the new features (a.k.a. policies). Around that time, Francesco, who had been managing a lot of the software integration, was leaving the RINA projects. This is roughly where faith in the IRATI codebase started to wane and the first ideas of branching off -- or even starting over -- began to emerge.
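As a rough illustration of why SAPs lift the one-flow limit, the sketch below keys each flow on the remote MAC address plus a locally chosen SAP, so two flows to the same peer no longer collide. The class, its method names and the pool of SAP values are hypothetical and are not taken from the IRATI code; the only thing assumed about LLC is that a SAP fits in a one-byte field.

```python
from typing import Dict, Optional, Tuple


class SapDemux:
    """Toy demultiplexer: one flow per (remote MAC, local SAP) pair."""

    def __init__(self, sap_pool=range(2, 256, 2)):
        # Assumption: SAPs are one-byte values; this pool is illustrative only.
        self._free = list(sap_pool)
        self._flows: Dict[Tuple[str, int], int] = {}  # (mac, SAP) -> flow id
        self._next_flow_id = 0

    def allocate(self, remote_mac: str) -> Tuple[int, int]:
        """Reserve a local SAP for a new flow; returns (flow id, SAP)."""
        if not self._free:
            raise RuntimeError("no free SAPs left")
        sap = self._free.pop(0)
        flow_id = self._next_flow_id
        self._next_flow_id += 1
        self._flows[(remote_mac, sap)] = flow_id
        return flow_id, sap

    def lookup(self, src_mac: str, dsap: int) -> Optional[int]:
        """Map an incoming frame to its flow, or None if nothing matches."""
        return self._flows.get((src_mac, dsap))


# Two parallel flows (say, one for data and one for management) to one peer.
demux = SapDemux()
peer = "02:00:00:00:00:01"
data_flow, data_sap = demux.allocate(peer)
mgmt_flow, mgmt_sap = demux.allocate(peer)
print(demux.lookup(peer, data_sap), demux.lookup(peer, mgmt_sap))
```

With only source and destination MAC addresses to go on, as in the VLAN shim, both flows above would map to the same key; the extra SAP field is what tells them apart.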
The next Horizon-2020-proposal deadline was also approaching, so our struggles at that point also inspired us to propose developing a more elaborate RINA orchestrator and make deployment and experimentation with (open)IRATI a less painful ordeal. That project, ARCFIRE, would start in 2016.
Now, to get the resiliency work done on the IRATI prototype, we were still struggling with the basics: getting link state routing running, adding a simple loop-free alternates (LFA) policy to it based on the operation of IP FRR, and running a bunch of flows over that network to measure packet loss when we break a link. Sander was focusing on the policy design and implementation; I was going to have a look at the IRATI code for scaling up the flow counts, which needed non-blocking I/O in the application. After that short hands-on stint in the IRATI codebase, planning started for building a RINA implementation beyond IRATI.
It was now summer 2015, PRISTINE would end in 12 months and as it was reliant on openIRATI there was no choice but to plow on. A couple of frustrating months lay ahead of us, trying to get experimental results out of a prototype that was nowhere near ready for it. The code base was also becoming so big and complex that it was impossible to maintain for anyone but the original developers. This is unfortunately the seemingly inescapable fate of any software project whose development cycle is heavily stressed by external deadlines, especially so when these deadlines are set within the rigid timeline of a publicly funded research project.
By the end of summer, we were still a long way off the mark in terms of what we hoped to achieve. The traffic generator tool and configurator were ready, and the implementation of LFA was as good as done, so we could deploy the machines for the use case scenarios on the testbeds. But due to stability problems, the deployment that actually worked on IRATI was limited to about 3 nodes in a triangle that showed the traffic getting routed over the two remaining links if a link got severed.
In the meantime, Vincenzo was making good progress on his own RINA implementation, rlite, and Sander and I started discussing options on a more and more regular basis on what to do. Should we branch off IRATI and try to clean it up? Keep only IRATI kernel space and rewrite user space? Hop on the rlite train? Or just start over entirely? Should we go user-space entirely or keep parts in-kernel?
In the last semester of 2015, Sander was heading for a 3-month research stint at Boston University to work on routing in RINA with John Day and the BU team. By that time, we had ruled out branching off of openIRATI. Our estimate was that cleaning up the code base would be more work than starting over. We'd also have IRATI as an upstream dependency, and trying to merge contributions upstream would lead to endless discussions and further hamper progress for both projects. IRATI was out. Contributing to rlite then? Vincenzo was making progress fast, and we knew he was extremely talented and competent. But we were also afraid of running into disagreements about how to proceed. In the meantime, Sander's original research plans in Boston got subverted by a 'major review' decision on the _shim hypervisor_ article, putting his priority on getting that accepted and published. When I visited Sander in Boston at the end of October, we were again assessing the situation, and agreed that the best decision was to start our own prototype, to avoid rlite having _too many cooks in the kitchen_. Our development was not going to be part of some funded project, so we were free to evaluate and scrutinize all design decisions, and we could get feedback on the RINA mailing lists on our findings. When all considerations settled, our own RINA implementation was going to be in user-space and target POSIX.
We were confident we could get it done, so we took the gamble. ARCFIRE was going to start soon, but the first part of the ARCFIRE project would be tool development. Our experimentation contributions to PRISTINE were planned to wrap up by April -- the project was planned to end in June, but a 4-month extension pushed it to the end of October. So starting May, we'd have some time to work on this new RINA prototype relatively undisturbed. In the very worst case, if our project went down the drain, we could still use IRATI or rlite to meet any objectives for ARCFIRE. We named our new RINA-implementation-to-be _Ouroboros_: the mythical snake that eats its own tail represents recursion, and also -- with a touch of imagination -- resembles the operation of a _ring buffer_.
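For readers unfamiliar with loop-free alternates, here is a minimal sketch of the basic loop-free condition from IP fast reroute (RFC 5286): a neighbour N of router S is a safe backup next hop towards destination D if dist(N, D) < dist(N, S) + dist(S, D), meaning N's own shortest path does not lead back through S. The function names and the triangle topology are illustrative assumptions; this is not the PRISTINE policy code.

```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra over a dict-of-dicts graph: graph[u][v] is the link cost."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def loop_free_alternates(graph, s, dest):
    """Neighbours of s usable as backup next hops towards dest.

    Assumes a connected graph with symmetric, positive link costs.
    """
    dist = {node: shortest_distances(graph, node) for node in graph}
    alternates = []
    for n, cost in graph[s].items():
        on_shortest_path = cost + dist[n][dest] == dist[s][dest]
        # Basic inequality: n's shortest path to dest avoids s.
        loop_free = dist[n][dest] < dist[n][s] + dist[s][dest]
        if loop_free and not on_shortest_path:
            alternates.append(n)
    return sorted(alternates)


# Three nodes in a triangle, all links with cost 1.
triangle = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "C": 1},
    "C": {"A": 1, "B": 1},
}
print(loop_free_alternates(triangle, "A", "C"))
```

In the triangle, A reaches C directly and B qualifies as the alternate, so traffic can be switched to B as soon as the A-C link fails, without waiting for routing to reconverge.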
2016: ARCFIRE and the start of Ouroboros
Another year, and we kept the tradition of RINA project kick-offs going. This time it was again in Barcelona, but without a co-located workshop. ARCFIRE (like IRATI before it) was within the FIRE framework, and the project committed to get experiments running with a reasonable number of nodes (on the order of 100) to demonstrate stability and scale of the prototypes and also to bring more tooling to the RINA community. The project was coordinated by Sven van der Meer (Ericsson at that time), who had done significant work on the PRISTINE use cases, and would focus on the impact of RINA on network management. The industry-inspired use cases were brought by Diego López (Telefónica), _acteur incontournable_ in the Network Functions Virtualization (NFV) world. The project was of course topped off with the usual suspects - i2CAT, Nextworks, and ourselves. The task at hand for us was to develop a fleshed-out testbed deployment framework for RINA, which we named Rumba. The name derives from rhumba, the collective noun for a group of rattlesnakes: an Ouroboros is a mythical snake, and Rumba was written in Python. A rhumba project already existed, but rumba was an accepted alternate spelling.
In early 2016, the RINA landscape was very different compared to when we embarked on IRATI in 2013. There were now 2 open source prototypes: IRATI was the de-facto standard used in European-funded projects, but Vincenzo's rlite was also becoming available at the time and would also be used in ARCFIRE. And soon, the development of a third prototype -- _Ouroboros_ -- would start. External perception of RINA in the scientific community had also been shifting, and not in a positive direction. At the start of the IRATI project, we had the position paper with project plans and outlines, and the papers on the _shims_ showed some ways in which RINA could be deployed. But other articles trying to demonstrate the benefits of RINA were -- despite all the efforts and good will of all people involved -- lacking in quality, mostly due to the limitations of the software. All these subpar publications did more harm than good, as the quality of the publications rubbed off on the perceived merits of the RINA architecture as a whole. We were always feeling this pressure to publish _something_, _anything_ -- and reviewers were always looking for a value proposition -- _Why is this better than TCP/IP or my preferred solution?_, _Compare this in depth to my preferred solution_ -- that we simply couldn't support with data at this point in time. And not for lack of want or a lack of trying. But at least, ARCFIRE had 2 years to look forward to, a focused scope and by now, the team had a lot of experience in the bag. But for the future of RINA, we knew the pressure was on -- this was a _now or never_ type of situation.
We laid the first stone on Ouroboros on Friday February 12th, 2016. At that point in time Ouroboros was still planned as a RINA implementation, so we started from the beginning: an empty git repository under our cursor, renewed enthusiasm in our minds, fresh _specs_ -- still warm from the printer and smelling of toner -- in our hands, and Sander's initial software design and APIs in colored marker on the whiteboard. Days were long -- we still had work to do on PRISTINE -- and evenings were short. I could now imagine the frustration of the i2CAT people, who a couple of years prior were probably also spending their evenings and nights enthusiastically coding on IRATI while, for us, IRATI was still a (very interesting) job rather than a passion. We would feel no such frustrations as we knew from the outset that the development of Ouroboros was going to be a two-man job for the foreseeable future.
While we were spending half our days gathering and compiling results from our _LFA_ experiments for PRISTINE, and half our days on the rumba framework, our early mornings and evenings were filled with discussions on the RINA API used in Ouroboros. It was initially based on IRATI. Flow allocation used source and destination _naming information_ -- 4 objects that the RINA _specs_ (correctly, might I add) say should be named: Application Process Name, Application Process Instance Id, Application Entity Name and Application Entity Instance Id. This _naming information_, as in IRATI, was built into a single structure -- a 4-tuple -- and we were quickly running into a mess, because, while these names need to be identified, they are not resolved at the same time, nor in the same place. Putting them in a single struct and passing that around with NULL values all the time was really ugly. The naming API in Ouroboros changed quickly over time, initially saving some state in an _init_ call (the naming information of the current application, for instance) and later on removing the source naming information from the flow allocation protocol altogether, because it could so easily be filled with fake garbage that one shouldn't rely on it for anything. The four-tuple was then broken up into two 2-tuples of name and instance id, one for the Process and the other for the Entity. But we considered these changes to be just a footnote in the RINA service definition -- a matter of taste, one could take it or leave it, no big deal. Little did we know that these small changes were just the start, and Ouroboros would diverge significantly from RINA almost exactly one year later.
Another small change was with the _register_ function. To be able to reach a RINA application, you need to register it in the _DIF_. When we were implementing this, it just struck us that this code was being repeated over and over again in applications. And just think about it, how does an application know which DIFs there are in the system? And if new DIFs are created while the application is running, how is that information fed into the application? That's all functionality that would have to be included in _every_ RINA application. IRATI has this as a whole set of library calls. But we did something rather different. We moved the registering of applications _outside_ of the applications themselves. It's _application management_, not _IPC_. Think about how much simpler this small change makes life for an application developer, and a network administrator. That's what the bind() and register() calls in Ouroboros do: you bind some program to a name _from the command line_, you register that name in the layer _from the command line_, and all the (server) program has to do is call flow_accept() and it will receive incoming flows. It is this change in the RINA API that inspired us to name our very first public presentation about Ouroboros, at FOSDEM 2018, IPC in 1-2-3.
We had also implemented our first shim DIF, which would allow us to run the Ouroboros prototype over UDP/IPv4. We started with a UDP shim because there is a POSIX sockets API for UDP. Recall that we were targeting POSIX, including FreeBSD and Mac OS X, to make the Ouroboros prototype more accessible. Programming interfaces into Ethernet, such as _raw sockets_, are not standard between operating systems, so we would implement an Ethernet _shim DIF_ later. Now, the Ouroboros _shim DIF_ stopped being a _shim_ pretty fast. When we were developing the _shim DIFs_ for IRATI, there was one very important rule: we were not allowed to add functionality to the protocol we were wrapping with the RINA API; we could only _map_ functions that were existing in the (Ethernet/UDP) protocol. This -- so the underlying reasoning went -- would show that the protocols/layers in the current Internet were _incomplete_ layers. But that also meant that the functions that were not present -- the flow allocator in particular -- would need to be circumvented through manual configuration at the endpoints. We weren't going to have any of that -- the Ouroboros IPCP daemons all implement a flow allocator. You may also be wondering why none of the prototypes have a shim DIF directly over IP. It's perfectly possible! But the reason is simple: it would use a non-standardized value for the protocol field in the IP header, and most IP routers drop such packets. It is for that same reason QUIC/HTTP3 is over UDP and not over IP. TCP/IP protocol ossification at work again!
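The difference between the single 4-tuple and the two 2-tuples can be sketched in a few lines. The class and field names below are hypothetical, chosen for illustration only; they are not the IRATI or Ouroboros data structures.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class NamingInfo:
    """Single 4-tuple: all four names travel together, known or not."""
    ap_name: Optional[str] = None
    ap_instance: Optional[str] = None
    ae_name: Optional[str] = None
    ae_instance: Optional[str] = None


@dataclass(frozen=True)
class Name:
    """2-tuple of a name and an optional instance id."""
    name: str
    instance: Optional[str] = None


def unset_fields(info: NamingInfo) -> List[str]:
    """List the fields a caller had to leave empty."""
    return [f.name for f in fields(info) if getattr(info, f.name) is None]


# A client that only knows which application it wants to reach still has
# to hand over the full structure, mostly filled with placeholders.
request = NamingInfo(ap_name="echo-server")
print(unset_fields(request))

# Split form: each pair is passed only where it is actually resolved.
process = Name("echo-server")
entity = Name("echo", instance="1")
print(process, entity)
```

The split form keeps the same four pieces of information, but a call that only needs the process name no longer carries three empty slots along with it.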
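To make the division of labour concrete, here is a toy in-memory model of the idea, with management actions on one side and a single accept call on the server side. It borrows the terms bind, register and flow_accept from the text, but the class and all signatures are invented for illustration and do not reflect the real Ouroboros library or command line tools; in particular, this toy accept returns immediately instead of waiting for a flow.

```python
from collections import deque


class ToyNameService:
    """In-memory stand-in for the management side of one system."""

    def __init__(self):
        self.bound = {}       # name -> program that serves it
        self.registered = {}  # layer -> set of names reachable in it
        self.incoming = {}    # program -> queue of pending flows

    # Management actions: issued by an administrator, not by the program.
    def bind(self, program, name):
        self.bound[name] = program
        self.incoming.setdefault(program, deque())

    def register(self, name, layer):
        self.registered.setdefault(layer, set()).add(name)

    # Client side: ask a layer for a flow to a name.
    def flow_alloc(self, layer, name, client):
        if name not in self.registered.get(layer, set()):
            raise LookupError(f"{name!r} is not registered in {layer!r}")
        if name not in self.bound:
            raise LookupError(f"no program is bound to {name!r}")
        self.incoming[self.bound[name]].append(client)

    # Server side: the only call the server program itself needs.
    def flow_accept(self, program):
        queue = self.incoming.get(program)
        return queue.popleft() if queue else None


system = ToyNameService()
system.bind("echo-server", "my.echo")        # done by the administrator
system.register("my.echo", "home-layer")     # done by the administrator
system.flow_alloc("home-layer", "my.echo", client="client-1")
print(system.flow_accept("echo-server"))     # done by the server program
```

Note that the server never learns which layers exist or which names it is known by: adding a layer or renaming the service only changes the management calls.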
Somewhere around April, we were starting the implementation of a _normal_ IPCP in Ouroboros, and another RINA component was quickly becoming a nuisance: the Common Distributed Application Protocol (CDAP). While I had no problem with the objectives of CDAP, I was -- to put it mildly -- not a big fan of the object-oriented paradigm that was underneath it. Its methods, _read/write, create/destroy, start/stop_ make sense to many, but just like the HTTP methods PUT/GET/DELETE/POST/... there is nothing fundamental about it. It might as well have just one method, execute.
Summer was approaching fast. Most of the contributions to PRISTINE were in, so the ARCFIRE partners could start to focus on that project. There was a risk: ARCFIRE depended on the Fed4FIRE testbeds, and Fed4FIRE was ending and its future was not certain. The projected target API for _rumba_ was jFed. To mitigate the risk, we made an inventory of other potential testbeds, and to accomodate for the wait for the results of the funding calls, we proposed (and got) an extention to ARCFIRE with 6 months to a 30-month project duration. In the end Fed4FIRE+ was funded, ARCFIRE had some breathing space -- after all, we had to fire on all cylinders to get the best possible results and make a case for RINA -- and Sander and myself had some extra time to get Ouroboros up and running.
Sander quickly developed an Ethernet LLC shim DIF for Ouroboros based on the UDP shim DIF, and after that, we both moved our focus to the key components in the normal IPCP, implementing the full flow allocator and building the data transfer protocol (DTP) and the routing and forwarding functionality. CDAP was getting more and more annoying, but apart from that, this part of the RINA specs was fairly mature following the implementation work in IRATI, and the implementation progress was steady and rather uneventful. For now.
At the end of October, work on the PRISTINE project was wrapped up, and the final deliverables were submitted. PRISTINE was a tough project for us, with very few outcomes. Together with Miquel, I did make some progress with RINA standardization in ISO JTC1/SC6. But we could show few research results and no published papers where we were the main authors. PRISTINE as a whole also fell short in its main objectives: the RINA community hadn't substantially grown, and its research results were still -- from an external vantage point -- mediocre. For us, it was a story of trying to do too much too soon. Everyone tried their best, and I think we achieved what was achievable given the time and resources we had. The project definitely had some nice outcomes. Standardization at least got somewhere, with a project in ISO and also some traction within the Next Generation Protocols (NGP) group at ETSI. RINAsim was a nice educational tool, especially for visualizing the operation of RINA.
2017: Ouroboros diverges from RINA
By January 2017, we had a minimal working normal IPCP. Sander was looking into routing, working on a component we called the graph adjacency manager (GAM). As its name suggests, the GAM would be responsible for managing links in the network, what would be referred to as the _network topology_, and would get policies that instruct it how to maintain the graph based on certain parameters. This component, however, was short-lived and replaced by an API to connect IPCPs, so the actual layer management logic could be a standalone program outside of the IPCPs instead of a module inside the IPCPs, which is far more flexible.
In the meantime, I was implementing and revising CACEP, the Common Application Connection Establishment Phase that was accompanying CDAP in RINA. Discussions on CACEP between Sander and myself were interesting and sometimes heated -- whiteboard markers have experienced flight and sudden deceleration. CDAP was supposed to support different encoding schemes -- the OSI _presentation layer_. We were only going to implement Google Protocol Buffers, which was also used in IRATI, but the support for others should be there. The flow allocator and the RIB were built on top of our CDAP implementation. And something was becoming more and more obvious. What we were implementing -- agreeing on protocol versions, encoding etc -- was something rather universal to all protocols. Now, you may remember that the flow allocator is passing something -- the information needed to connect to a specific Application Entity or Application Entity Instance -- that was actually only needed after the flow allocation procedure was basically established. But after a while, it was clear to me that this information should be there in that CACEP part, and was rather universal for all application connections, not just CDAP. After I presented this to Sander (despair) over IRC, he quickly recognized how this -- to me seemingly small -- change impacted the entire architecture. Now, I will never forget the exchange, and I actually saved that conversation as a text file. The date was February 24th, 2017.
<despair> nice, so then dev.h is even simpler
<despair> ae name is indeed not on the layer boundary
<dstaesse> wait why is dev.h simpler?
<despair> since ae name will be removed there
<dstaesse> no
<dstaesse> would you?
<despair> yes
<despair> nobody likes balls on the line
<despair> it's balls out
Now, RINA experts will (or should) gasp for air when reading this. It refers to something that traces back to John's ISO JTC1/SC16 days working on Open Systems Interconnection (OSI), when there was a heavy discussion ongoing about the "Application Entity": where was it located? If it was in the application, it would be outside of SC16, which was dealing with networks; if it was in the network, it would be dealt with only in SC16. It was a turf battle between two ISO groups, and because Application Entities were usually drawn as a set of circles, and the boundary between the network and the application as a line, that battle was internally nicknamed -- boys will be boys -- the balls-in, balls-out question. If you ever attended one of John's presentations, he would take a short pause and then continue: "this was the only time that a major scientific insight came from a turf war: the balls were on the line". The Application Entity needed to be known in both the application and the network. Alas! Our implementation was clearly showing that this was not the case. The balls were above the line: the network (or more precisely, the flow allocator in the IPCP) doesn't need to know anything about application entities!
Ouroboros now had a clean, crisp boundary between the flow in a DIF and any connections using that flow in the layer above. Flow allocation creates a flow between Application Instances and after that, a connection phase would create a connection between Application Entity Instances. So roughly speaking -- without the OSI terminology -- first the network connects the running programs, and after that, the programs decide which protocols to use (which can be implicit). What was in the RINA specs, what the RINA API was actually doing, was mixing these exchanges! Now, we have no issues with that from an operational perspective: the Ouroboros flow allocator has a piggyback API. But the contents of the piggybacked information in Ouroboros are opaque. And all this has another, even bigger, implication. One that I would only figure out via another line of reasoning some time later.
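The separation described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Ouroboros API: `ToyLayer`, `register`, `flow_alloc`, `server_accept` and the JSON encoding of the connection request are all hypothetical names and choices made for illustration. The point it tries to show is that the layer looks only at the destination name, while the piggybacked bytes pass through untouched and are interpreted solely by the programs at the endpoints.

```python
import json


class ToyLayer:
    """Hypothetical stand-in for a layer: it resolves names, nothing more."""

    def __init__(self):
        self._registry = {}        # application name -> accept handler
        self.seen_by_layer = []    # everything the layer itself inspected

    def register(self, name, on_flow):
        """Make an application reachable under a name (illustrative only)."""
        self._registry[name] = on_flow

    def flow_alloc(self, dst_name, piggyback=b""):
        """Phase 1: connect the programs. The piggyback stays opaque bytes."""
        self.seen_by_layer.append(dst_name)
        handler = self._registry[dst_name]
        return handler(piggyback)  # handed over without being parsed


def server_accept(piggyback):
    """Phase 2: the programs, not the layer, agree on which protocol to use."""
    request = json.loads(piggyback.decode())
    supported = {"echo": {1, 2}}   # assumed protocol table of this toy server
    accepted = request["version"] in supported.get(request["protocol"], set())
    return json.dumps({"accepted": accepted}).encode()


layer = ToyLayer()
layer.register("server.app", server_accept)
request = json.dumps({"protocol": "echo", "version": 2}).encode()
reply = json.loads(layer.flow_alloc("server.app", request).decode())
```

In this toy, swapping JSON for any other encoding requires no change to `ToyLayer`, which is one way to read "the balls are above the line".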
With ARCFIRE rolling along and the implementation of the _rumba_ framework in full swing, Sander was working on the link-state routing policy for Ouroboros, and I started implementing a Distributed Hash Table (DHT) that would serve as the directory -- think of the equivalent of DNS SRV for a RINA DIF -- a key-value store mapping application names to addresses. The link-state routing component was something that was really closely related to the Resource Information Base -- the RIB. That RIB was closely coupled with CDAP. Remember that prediction that I made about a year prior, somewhere in April 2016? On September 9th 2017, two weeks before the ARCFIRE RINA hackathon, CDAP was removed from Ouroboros. From that point, Ouroboros could definitely not be considered a RINA implementation anymore.
It was time to get started on the last big component: DTCP -- the Data Transfer Control Protocol. When implementing this, a couple of things were again quickly becoming clear. First, the implementation was proving to be completely independent of DTP. The RINA specs propose a state vector between DTP and DTCP. This solves the IP fragmentation problem -- if an IP fragment gets lost, TCP would resend all fragments. Hence TCP needs to know about the fragmentation in IP and only retransmit the bytes in that fragment. But the code was again speaking otherwise. It was basically telling us: TCP was independent of IP. But fragmentation should be in TCP, and IP should specify its maximum packet size. Anything else would result in an intolerable mess. So that's how we split the Flow and Retransmission Control Protocol (FRCP) and the Data Transfer Protocol (DTP) in Ouroboros.
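To make the argument about fragmentation concrete, here is a minimal sketch, assuming a sender-side protocol that does its own fragmentation against a maximum packet size advertised by the layer below. It is not the FRCP wire format or implementation; `fragment`, `Reassembler` and the `(index, total, chunk)` tuple are hypothetical names chosen for illustration. Because the same protocol fragments and retransmits, it can resend exactly the lost fragment instead of the whole payload.

```python
def fragment(payload: bytes, max_packet_size: int):
    """Split payload into (index, total, chunk) tuples of bounded chunk size."""
    if max_packet_size <= 0:
        raise ValueError("max_packet_size must be positive")
    chunks = [payload[i:i + max_packet_size]
              for i in range(0, len(payload), max_packet_size)] or [b""]
    total = len(chunks)
    return [(index, total, chunk) for index, chunk in enumerate(chunks)]


class Reassembler:
    """Receiver side: collects fragments and reports which are still missing."""

    def __init__(self):
        self.total = None
        self.chunks = {}

    def receive(self, frag):
        index, total, chunk = frag
        self.total = total
        self.chunks[index] = chunk   # duplicates simply overwrite

    def missing(self):
        """Indices not yet received, or None if nothing has arrived at all."""
        if self.total is None:
            return None
        return [i for i in range(self.total) if i not in self.chunks]

    def payload(self):
        """The reassembled payload, or None while fragments are missing."""
        if self.total is None or self.missing():
            return None
        return b"".join(self.chunks[i] for i in range(self.total))


data = bytes(range(10))
frags = fragment(data, max_packet_size=4)   # the lower layer only states "4"
rx = Reassembler()
for frag in frags:
    if frag[0] != 1:                        # simulate loss of fragment 1
        rx.receive(frag)
lost = rx.missing()
for index in lost:                          # retransmit only what was lost
    rx.receive(frags[index])
```

The only thing the lower layer contributes in this sketch is the number passed as `max_packet_size`; a real protocol would also have to subtract its own header size from that value.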
With FRCP split from DTP roughly along the same line as TCP was originally split from IP, we had a new question: where to put FRCP? RINA has DTCP/DTP in the DIF as EFCP. And this resulted in something that I found rather ugly: a normal layer would "bootstrap" its traffic (e.g. flow allocator) over its own EFCP implementation to deal with underlying layers that do not have EFCP (such as the _shim DIFs_). Well, fair enough I guess. But there is another thing. One that bugged me even more. RINA has an assumption on the _system_, one that has to be true. The EFCP implementation -- which is the guarantee that packets are delivered, and that they are delivered in-order -- is in the IPCP. But the application process that makes use of the IPCP is a _different process_. So, in effect, the transfer of data, the IPC, between the Application Process and the IPCP has to be reliable and preserve data order _by itself_. RINA has no control over this part. RINA is not controlling ALL IPC; there is IPC outside of RINA. Another way of stating the problem is like this: If a set of processes (IPCPs) are needed to provide reliable state synchronization between two applications A and B, who is providing reliable state synchronization between A and its IPCP and between B and its IPCP? If it's again an IPCP, that's infinite recursion! Now -- granted -- this is a rather academic issue, because most (all?) computer hardware does provide this kind of reliable, order-preserving IPC. However, even theoretical issues are issues, and I preferred that Ouroboros be able to guarantee ALL IPC, even between its own components, without having to make any external (hardware) assumptions. Then, and only then, the unification of networking and IPC promised by RINA would be achieved.
The third change in the architecture was the big one. And in hindsight, we should already have seen that coming with our realization that the application entity was above the line: we moved FRCP into the application. It would be implemented in the library, not in the IPCP, as a set of function calls, just like HTTP libraries. Sander was initially skeptical, because to his taste, if a single-threaded application uses the library, it should remain single-threaded. How could it send acknowledgements, retransmit packets, etc.? And the RINA specs also had congestion avoidance as part of EFCP/DTCP. At least that shouldn't be in the application!? I agreed, but said I was confident that we could make the single-threaded thing work by running the functionality as part of the IPC calls, read/write/fevent (this has been reverted now to simplify the implementation and will probably be moved to the IRMd). We moved congestion avoidance logic to the IPCP in the flow allocator. All this meant that Ouroboros Layers were not DIFs, and we stopped using that terminology.
By now, the prototype was running stable enough for us to go open source. We got approval from IMEC to release it to the public under the GPLv2 / LGPL license, and in early 2018, almost exactly 2 years after we started the project, we presented the first public version of Ouroboros at FOSDEM 2018 in Brussels.
But we were still running against the clock. ARCFIRE was soon to end, and Ouroboros had undergone quite some unanticipated changes that meant the implementation was also facing the reality of Hofstadter's Law.
We were again under pressure to get some publications out, in order to meet ARCFIRE objectives, and Sander had to meet some publication quota to finish his PhD. The design of Rumba was interesting enough for a paper: the implementation allowed us to deploy 3 Recursive Network prototypes (IRATI, rlite and Ouroboros) on testbeds using different APIs: jFed for Fed4FIRE and GENI, Emulab for the iMinds virtual wall testbed, QEMU using virtual machines, Docker using -- well -- Docker containers, and a local option only for Ouroboros. But we needed more publications, so for ARCFIRE Sander had implemented Loop-Free Alternates routing in Ouroboros and was getting some larger-scale results with them. And I reluctantly started working on a paper on Ouroboros -- I still felt the time wasn't right, and we first needed to have a full FRCP implementation and full congestion avoidance to make a worthwhile analysis.
We finished the experiments for ARCFIRE, but as with PRISTINE, the results were not accepted for publication. During the writing of the paper, a final realization came. We had implemented our link-state routing a while ago, and it was doing something interesting, akin to all link-state routing protocols: a link-state packet that came in on some flow was sent out on all other flows. It was -- in effect -- doing broadcast. But... OSPF is doing the same. Wait a minute. OSPF uses a multicast IP address. But of course! Multicast wasn't what it seemed to be. Multicast was broadcast on a layer; creating a multicast group was enrollment in that layer. A multicast IP address is a broadcast Layer name! Based on the link-state routing code in the normal IPCP, I implemented the broadcast IPCP in a single night. The normal IPCP was renamed unicast IPCP. It had all fallen into place; the Ouroboros architecture was shaped.
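The flooding behaviour that led to this realization can be sketched in a few lines. This is an illustration under assumptions, not the Ouroboros broadcast IPCP: the function name `flood`, the adjacency-dictionary representation and the `seen` set used for duplicate suppression are all hypothetical (real link-state protocols typically use sequence numbers for that purpose). Each node forwards a packet on every flow except the one it arrived on, and drops copies it has already handled.

```python
from collections import deque


def flood(adjacency, origin):
    """Flood one packet from origin; return (reached nodes, copies per node).

    adjacency maps each node to the set of neighbours it has a flow with.
    """
    seen = {origin}
    received = {node: 0 for node in adjacency}
    # Each queue entry is one packet in flight: (sending node, receiving node).
    in_flight = deque((origin, nbr) for nbr in sorted(adjacency[origin]))
    while in_flight:
        sender, node = in_flight.popleft()
        received[node] += 1
        if node in seen:
            continue                    # duplicate: drop, do not forward again
        seen.add(node)
        for nbr in sorted(adjacency[node]):
            if nbr != sender:           # every flow except the arrival flow
                in_flight.append((node, nbr))
    return seen, received


# A triangle a-b-c with a tail b-d; the members form one broadcast layer.
topology = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
reached, copies = flood(topology, "a")
```

Every member of the toy layer receives the packet at least once, and nodes on a cycle see an extra copy that they discard; joining the layer is what makes a node a receiver, which mirrors the reading of a multicast group as enrollment in a broadcast layer.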