From bdacab729662dfd1e5c64dba9a739c6d6c5775d6 Mon Sep 17 00:00:00 2001
From: Dimitri Staessens
Date: Sun, 12 Jun 2022 10:32:35 +0200
Subject: content: Add link to taiga on contribute page

---
 content/en/docs/Contributions/_index.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/content/en/docs/Contributions/_index.md b/content/en/docs/Contributions/_index.md
index ad33af3..558298e 100644
--- a/content/en/docs/Contributions/_index.md
+++ b/content/en/docs/Contributions/_index.md
@@ -7,6 +7,14 @@ description: >
 How to contribute to Ouroboros.
 ---
 
+### Ongoing work
+
+Ouroboros is far from complete. Plenty of things need to be researched
+and implemented. We don't really keep a list, but this
+[epic board](https://tree.taiga.io/project/dstaesse-ouroboros/epics) can
+give you some ideas of what is still on our mind and where you may be
+able to contribute.
+
 ### Communication
 
 There are 2 ways that will be used to communicate: The mailing list
-- 
cgit v1.2.3

From 84c6cf1980da5cb749657425b99419d80ffc0d15 Mon Sep 17 00:00:00 2001
From: Dimitri Staessens
Date: Wed, 7 Dec 2022 22:15:28 +0100
Subject: blog: Add post on loc-id split

---
 content/en/blog/20221207-loc-id-mobility-1.png | Bin 0 -> 403093 bytes
 content/en/blog/20221207-loc-id-mobility-2.png | Bin 0 -> 411592 bytes
 content/en/blog/20221207-loc-id-split.md | 205 +++++++++++++++++++++++++
 content/en/blog/20221207-loc-id.png | Bin 0 -> 81278 bytes
 4 files changed, 205 insertions(+)
 create mode 100644 content/en/blog/20221207-loc-id-mobility-1.png
 create mode 100644 content/en/blog/20221207-loc-id-mobility-2.png
 create mode 100644 content/en/blog/20221207-loc-id-split.md
 create mode 100644 content/en/blog/20221207-loc-id.png

diff --git a/content/en/blog/20221207-loc-id-mobility-1.png b/content/en/blog/20221207-loc-id-mobility-1.png
new file mode 100644
index 0000000..87bb04a
Binary files /dev/null and b/content/en/blog/20221207-loc-id-mobility-1.png differ
diff --git a/content/en/blog/20221207-loc-id-mobility-2.png b/content/en/blog/20221207-loc-id-mobility-2.png
new file mode 100644
index 0000000..4fedee9
Binary files /dev/null and b/content/en/blog/20221207-loc-id-mobility-2.png differ
diff --git a/content/en/blog/20221207-loc-id-split.md b/content/en/blog/20221207-loc-id-split.md
new file mode 100644
index 0000000..8c2a068
--- /dev/null
+++ b/content/en/blog/20221207-loc-id-split.md
@@ -0,0 +1,205 @@
+---
+date: 2022-12-07
+title: "Loc/Id split and the Ouroboros network model"
+linkTitle: "On Loc/Id split"
+author: Dimitri Staessens
+---
+
+A few weeks back I had a drink with a Thijs who is now doing a
+master's thesis on Loc/Id split, so we dug into the concepts behind
+Locators and Identifiers and see if matches or in anyway interferes
+with the Ouroboros network model.
+
+For this, we started from the paper _Locator/Identifier Split
+Networking: A Promising Future Internet Architecture_[^1].
+
+# Loc/Id split?
+
+In a nutshell, Loc/Id split starts from the observation that the
+transport layer (TCP, UDP) is tightly coupled to network (IP)
+addresses via a certain TCP/UDP port.
+
+Assuming our IPv4 local address is 10.10.0.1/24 and there is an SSH
+server on 10.10.5.253/24 listening on port 22, after making a
+connection, our client application could be bound to 10.10.0.1/24 on
+port 25406. If we move our laptop to another room that is on an access
+point in a different subnet, and we receive IP address 10.10.4.7/24,
+our TCP connection to the SSL server will break.
+
+Loc/Id split suggest to split the "address" into two parts, an
+Identifier that is location-independent and specifies the _who_ at the
+transport layer, and a locator that is location-dependent and
+specifies the _where_ at the network layer. Since an IPv6 address has
+more than enough (128) bits, there's plenty of space to chop it up and
+attach some semantics to the individual pieces.
+
+Of course, after the split, identifiers need to be mapped to locators,
+so there is a mapping system needed to resolve the locator given the
+identifier. This mapping system resides in a Sub-Layer between the
+transport layer and the network layer. If this mapping system sounds a
+lot like DNS to you, then you're right, but then remember that TCP
+doesn't bind to a DNS name + port, but to an IP address + port. That's
+where the issue lies that the Identifier tries to solve.
+
+Resolving the Locator from the Identifier usually happens in the
+end-host, but some Loc/Id split proposals may forward this
+responsibility to other nodes in the network. When only end-hosts
+perfom Id->Loc resolution, it's called a host-based Loc/Id split
+architecture, if some other nodes perform Id->Loc resolution it's
+called a network-based architecture. In a network-based architecture,
+the identifier MUST be part of the packet header (in a host-based
+architecture it's optional), and the network nodes forward towards a
+resolver node based on the identifier and then when the locator is
+known based on the locator towards the end-host. I have my doubts that
+this can ever scale, so in this article, I'll focus on host based
+Loc/Id split. Host-based architectures are summarized in the figure
+below, taken from the survey paper[^1].
+
+{{ }}
+
+My first reaction to seeing that was _sounds about right to me_, it's
+almost identical to what O7s proposes for a fully scalable and
+evolvable architecture. But before I get to that, let's first dig a
+bit deeper into those locators and identifiers. What _are_ these
+beasts?
+
+# Mobility in Loc/Id split
+
+{{ }}
+
+Let's assume the previous example where, from my laptop, I'm connected
+to some SSH server, but this time we're in a Loc/Id split network. So
+my laptop got a different address for its interface, an identifier,
+say COFF33D00D, and, since I'm in the green network, a locator that is
+conveniently the IPv4 address for my wireless LAN interface,
+10.10.0.1/24. The TCP connection in the SSH client is Loc/Id aware,
+and now bound to C0FF33D00D:25406. After connecting to the client at
+008BADF00D, It learns that I'm C0FF33D00D and my locator is 10.10.0.1.
+
+When I move to another floor, the laptop WLAN interface gets a new
+locator, but my identifier stays the same. It's now
+C0FF33D00D:10.10.4.7. The OS is implementing a host-based Loc/Id split
+architecture, so I quickly send a _loc/id update_ message to the
+server at 10.10.5.253 that my locator for C0FF33D00D has changed to
+10.10.4.7, and it updates its mapping. The Loc/Id-aware TCP state
+machine in my laptop had some packet loss to deal with while I was in
+the elevator, but other than that, since it was bound to my identifier
+the connection remains intact.
+
+Nice! Splitting an address into a locator and identifier has a pretty
+elegant solution to mobility.
+
+Notice I didn't give the routers identifiers parts in their
+address? That's on purpose.
+
+Let's take a little thought experiment.
+
+Instead of moving to the other floor, I already have a laptop already
+sitting there. Its WLAN interface has address COFFEEBABE:10.10.4.7.
+
+{{ }}
+
+Now, what I do in this thought experiment, is copy the entire _program
+state_ of my SSH client to that other laptop, _including_ the TCP
+state[^2] and fork it as a new process on the other laptop. What is
+needed to make it work from a network perspective?
+
+Well, like when actually moving with my laptop, I need to update the
+server that my identifier C0FF33D00D has moved to another locator at
+10.10.4.7. That should do the trick, quite easy.
+
+Unless there was already another application connected on port 25406
+on that destination laptop. Then there is no way for the incoming
+laptop to know where to deliver the packets to. Unless the identifier
+is in the packet header. But host-based Loc/Id split had them
+optional? This seems to hint that host-based Loc/Id split supports
+device mobility but not real application mobility[^3].
+
+# What does the Ouroboros model say?
+
+Now, Ouroboros does things a little bit differently, but it maps quite
+well. Ouroboros[^4] gives each application process a name, which (well
+its hash) is mapped to a network address. That application name
+basically maps to the _identifier_, and the network address maps to
+the _locator_.
+
+{{ }}
+
+Let's compare the architecture of Ouroboros above with the figure at
+the top.
+
+First, the similarities. The Ouroboros model conjectures a split of
+the transport layer into an _application end-to-end layer_ (roughly
+TCP without congestion avoidance) and a network end-to-end layer that
+includes the _flow allocator_.
+
+The _flow allocator_ in O7s performs the name <--> address mapping
+that is similar to id <--> loc mapping. Interesting to note is that
+the Flow allocator is present in every network host, which is needed
+for Congestion Notifications. Given that identifiers are mapping to
+application names, resolving in name <--> address in other nodes than
+the source, like in network-based Loc/Id split, is not violating the
+O7s architecture. But we haven't considered this as it doesn't look
+feasible from a scalability perspective.
+
+Now, the differences. First, the naming. The "identifier" in Ouroboros
+is a network-wide unique application name[^7]. Processes[^7] can be
+_bound_ to an application name. If a single process binds to an
+application name it's unicast, if multiple processes on the same
+server, it provides per-connection load-balancing between these
+processes. If multiple processes on different servers bind to the same
+name, it's anycast.
+
+Ouroboros endpoint identifiers (EIDs) are only known to the Flow
+Allocator. This allows allocating a new flow (including new EIDs)
+while keeping the connection state in the process (FRCP) intact, and
+thus allowing application mobility in addition to device mobility.
+
+Taking another look at the Loc/Id split figure, note that Ouroboros
+splits "network" from "application" just above the "Sub-layer", instead
+of above the transport layer.
+
+# Wrapping up
+
+The discussions on Loc/Id split were quite interesting. A lot of the
+steps and solutions it proposes are in line with the O7s model. What
+strikes me most is that LoC/Id split is still not very well-defined as
+a _model_. What exactly _are_ identifiers? What exactly _are_
+locators? The thing that sets O7s apart is that the model consists of
+a limited amount of objects (forwarding elements and flooding
+elements, which form Layers[^7], application, process, ...) that have
+well-defined names[^8] that are immutable and exist only for as long
+as the object exists. But that's a whole post by itself.
+
+
+[^1]: https://doi.org/10.1109/COMST.2017.2728478
+
+[^2]: This is hard to do with TCP state being in the kernel, but let's
+ forget about that and memory addresses and others stuff for a
+ moment and assume the complete application state is a nice
+ containerized package.
+
+[^3]: The Ouroboros model does allow complete application
+ mobility. The problem in this Loc/Id proposal is that the port
+ is still part of the Transport Layer state (see the figure at
+ the start of the post).
+
+[^4]: This, and a lot of other things in O7s, were proposed in the
+ RINA architecture, that's where the credit should go.
+
+[^5]: We might change that to "service name" but terminology is hard
+ to get right.
+
+[^6]: In O7s, processes are named with a process name (which in the
+ implementation maps to the linux process id (pid). Process names
+ are only local (system) scope and live until the process dies.
+
+[^7]: I capitalize layers, as these are have a different meaning than
+ the layers in the figure above. Maybe we should call them
+ _strata_ instead of layers. Again, terminology is hard.
+
+[^8]: Synonyms are allowed, but they serve no function in the
+ architecture. As an example, application names are hashed (a
+ synonym) which has practical implications for security and
+ implementation simplicity, but the architecture is theoretically
+ identical without that hash.
\ No newline at end of file
diff --git a/content/en/blog/20221207-loc-id.png b/content/en/blog/20221207-loc-id.png
new file mode 100644
index 0000000..51a046d
Binary files /dev/null and b/content/en/blog/20221207-loc-id.png differ
-- 
cgit v1.2.3

From 8d39895ee24ce004cebd91dffa464e00263dd1e4 Mon Sep 17 00:00:00 2001
From: Dimitri Staessens
Date: Wed, 7 Dec 2022 22:20:34 +0100
Subject: blog: Fix some typos in loc-id post

---
 content/en/blog/20221207-loc-id-split.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/content/en/blog/20221207-loc-id-split.md b/content/en/blog/20221207-loc-id-split.md
index 8c2a068..6544837 100644
--- a/content/en/blog/20221207-loc-id-split.md
+++ b/content/en/blog/20221207-loc-id-split.md
@@ -7,7 +7,7 @@ author: Dimitri Staessens
 
 A few weeks back I had a drink with a Thijs who is now doing a
 master's thesis on Loc/Id split, so we dug into the concepts behind
-Locators and Identifiers and see if matches or in anyway interferes
+Locators and Identifiers and see if matches or in any way interferes
 with the Ouroboros network model.
 
 For this, we started from the paper _Locator/Identifier Split
@@ -19,11 +19,11 @@ In a nutshell, Loc/Id split starts from the observation that the
 transport layer (TCP, UDP) is tightly coupled to network (IP)
 addresses via a certain TCP/UDP port.
 
-Assuming our IPv4 local address is 10.10.0.1/24 and there is an SSH
-server on 10.10.5.253/24 listening on port 22, after making a
-connection, our client application could be bound to 10.10.0.1/24 on
+Assuming our IPv4 local address is 10.10.0.1 /24 and there is an SSH
+server on 10.10.5.253 /24 listening on port 22, after making a
+connection, our client application could be bound to 10.10.0.1 /24 on
 port 25406. If we move our laptop to another room that is on an access
-point in a different subnet, and we receive IP address 10.10.4.7/24,
+point in a different subnet, and we receive IP address 10.10.4.7 /24,
 our TCP connection to the SSL server will break.
 
 Loc/Id split suggest to split the "address" into two parts, an
@@ -72,7 +72,7 @@ to some SSH server, but this time we're in a Loc/Id split network. So
 my laptop got a different address for its interface, an identifier,
 say COFF33D00D, and, since I'm in the green network, a locator that is
 conveniently the IPv4 address for my wireless LAN interface,
-10.10.0.1/24. The TCP connection in the SSH client is Loc/Id aware,
+10.10.0.1 /24. The TCP connection in the SSH client is Loc/Id aware,
 and now bound to C0FF33D00D:25406. After connecting to the client at
 008BADF00D, It learns that I'm C0FF33D00D and my locator is 10.10.0.1.
 
-- 
cgit v1.2.3

From f35d305da7728cecb4ea91aed5b9c1fd0ddaebb6 Mon Sep 17 00:00:00 2001
From: Dimitri Staessens
Date: Sun, 11 Dec 2022 12:59:40 +0100
Subject: blog: Small update to loc-id-split

---
 content/en/blog/20221207-loc-id-split.md | 87 ++++++++++++++++++--------------
 1 file changed, 49 insertions(+), 38 deletions(-)

diff --git a/content/en/blog/20221207-loc-id-split.md b/content/en/blog/20221207-loc-id-split.md
index 6544837..bad82ac 100644
--- a/content/en/blog/20221207-loc-id-split.md
+++ b/content/en/blog/20221207-loc-id-split.md
@@ -5,10 +5,10 @@ linkTitle: "On Loc/Id split"
 author: Dimitri Staessens
 ---
 
-A few weeks back I had a drink with a Thijs who is now doing a
-master's thesis on Loc/Id split, so we dug into the concepts behind
-Locators and Identifiers and see if matches or in any way interferes
-with the Ouroboros network model.
+A few weeks back I had a drink with Thijs who is now doing a master's
+thesis on Loc/Id split, so we dug into the concepts behind Locators
+and Identifiers and see if matches or in any way interferes with the
+Ouroboros network model.
 
 For this, we started from the paper _Locator/Identifier Split
 Networking: A Promising Future Internet Architecture_[^1].
@@ -113,15 +113,19 @@ on that destination laptop. Then there is no way for the incoming
 laptop to know where to deliver the packets to. Unless the identifier
 is in the packet header. But host-based Loc/Id split had them
 optional? This seems to hint that host-based Loc/Id split supports
-device mobility but not real application mobility[^3].
+device mobility but cannot fully support application mobility[^3].
+
+So, what is that identifier actually naming? Well, all that moved was
+the application state, and the identifier seemed to move with
+it... And since the routers in the example don't run "end-host"
+applications, they don't need identifiers.
 
 # What does the Ouroboros model say?
 
-Now, Ouroboros does things a little bit differently, but it maps quite
-well. Ouroboros[^4] gives each application process a name, which (well
-its hash) is mapped to a network address. That application name
-basically maps to the _identifier_, and the network address maps to
-the _locator_.
+Ouroboros[^4] gives each application process a name, which is mapped
+to an IPCP's address[^5]. The O7s application name basically
+corresponds to the _identifier_, and the IPCPs address maps to the
+_locator_.
 
 {{ }}
 
@@ -134,30 +138,33 @@ TCP without congestion avoidance) and a network end-to-end layer that
 includes the _flow allocator_.
 
 The _flow allocator_ in O7s performs the name <--> address mapping
-that is similar to id <--> loc mapping. Interesting to note is that
-the Flow allocator is present in every network host, which is needed
-for Congestion Notifications. Given that identifiers are mapping to
+that is similar to id <--> loc mapping. Interesting to note is that in
+O7s, the Flow allocator is present in every IPCP, which is needed for
+Congestion Notifications. Given that identifiers are mapping to
 application names, resolving in name <--> address in other nodes than
 the source, like in network-based Loc/Id split, is not violating the
 O7s architecture. But we haven't considered this as it doesn't look
 feasible from a scalability perspective.
 
 Now, the differences. First, the naming. The "identifier" in Ouroboros
-is a network-wide unique application name[^7]. Processes[^7] can be
-_bound_ to an application name. If a single process binds to an
+is a network/globally unique application name[^6]. Processes[^7] can
+be _bound_ to an application name. If a single process binds to an
 application name it's unicast, if multiple processes on the same
-server, it provides per-connection load-balancing between these
-processes. If multiple processes on different servers bind to the same
-name, it's anycast.
-
-Ouroboros endpoint identifiers (EIDs) are only known to the Flow
-Allocator. This allows allocating a new flow (including new EIDs)
-while keeping the connection state in the process (FRCP) intact, and
-thus allowing application mobility in addition to device mobility.
-
-Taking another look at the Loc/Id split figure, note that Ouroboros
-splits "network" from "application" just above the "Sub-layer", instead
-of above the transport layer.
+server bind to the same name, it provides per-connection
+load-balancing between these processes. If multiple processes on
+different servers bind to the same name, it provides a form of anycast
+name-based load-balancing.
+
+Second, Ouroboros endpoint identifiers (EIDs) are only known to the
+Flow Allocator at the endpoint and specify the application. The O7s
+EID can be viewed as a combination of the L3 _protocol_ field and the
+L4 _port_ field into a single field that sits in between L3 and L4
+(the Loc/Id proposed sublayer). This allows O7s to allocate a new flow
+(assigning new EIDs) while keeping the connection state in the process
+(FRCP) intact, and thus allowing full application mobility in addition
+to device mobility. Taking another look at the Loc/Id split figure,
+note that Ouroboros splits "network" from "application" just above the
+"Sub-layer", instead of above the "transport layer".
 
 # Wrapping up
 
@@ -167,9 +174,9 @@ strikes me most is that LoC/Id split is still not very well-defined as
 a _model_. What exactly _are_ identifiers? What exactly _are_
 locators? The thing that sets O7s apart is that the model consists of
 a limited amount of objects (forwarding elements and flooding
-elements, which form Layers[^7], application, process, ...) that have
-well-defined names[^8] that are immutable and exist only for as long
-as the object exists. But that's a whole post by itself.
+elements, which form Layers[^8], application, process, ...) that have
+well-defined names[^9] that are immutable and exist only for as long
+as the object exists.
 
 
 [^1]: https://doi.org/10.1109/COMST.2017.2728478
@@ -187,18 +194,22 @@ as the object exists. But that's a whole post by itself.
 [^4]: This, and a lot of other things in O7s, were proposed in the
  RINA architecture, that's where the credit should go.
 
-[^5]: We might change that to "service name" but terminology is hard
- to get right.
+[^5]: To be accurate: we hash the application name.
+
+[^6]: At least, for a public Internetwork, they should be globally
+ unique.
 
-[^6]: In O7s, processes are named with a process name (which in the
+[^7]: In O7s, processes are named with a process name (which in the
  implementation maps to the linux process id (pid). Process names
- are only local (system) scope and live until the process dies.
+ are only local (system) scope.
 
-[^7]: I capitalize layers, as these are have a different meaning than
- the layers in the figure above. Maybe we should call them
- _strata_ instead of layers. Again, terminology is hard.
+[^8]: I capitalize Layers, as these Layers that are made up of
+ forwarding elements (unicast Layers) or flooding elements
+ (broadcast Layers) have a different meaning than the layers in
+ the discussion above. Maybe we should call them _strata_ instead
+ of Layers...
 
-[^8]: Synonyms are allowed, but they serve no function in the
+[^9]: Synonyms are allowed, but they serve no function in the
  architecture. As an example, application names are hashed (a
  synonym) which has practical implications for security and
  implementation simplicity, but the architecture is theoretically
-- 
cgit v1.2.3
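
The host-based Loc/Id mobility example from the post can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from Ouroboros or from any Loc/Id split implementation: the names `LocIdHost`, `accept`, `loc_id_update`, `locator_for` and `can_migrate` are hypothetical, and the identifiers and locators are the example values used in the post. The sketch shows that transport state is keyed on the identifier, so a locator update leaves the connection intact, and that without the identifier in the packet header the receiving laptop can only demultiplex on the port.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class LocIdHost:
    """Hypothetical end-host in a host-based Loc/Id split network."""

    # Mapping system: peer identifier -> current peer locator.
    id_to_loc: Dict[str, str] = field(default_factory=dict)
    # Transport state is keyed on (peer identifier, peer port), never on the locator.
    connections: Dict[Tuple[str, int], str] = field(default_factory=dict)

    def accept(self, peer_id: str, peer_loc: str, peer_port: int) -> None:
        """Record a new connection and learn the peer's current locator."""
        self.id_to_loc[peer_id] = peer_loc
        self.connections[(peer_id, peer_port)] = "ESTABLISHED"

    def loc_id_update(self, peer_id: str, new_loc: str) -> None:
        """Handle a loc/id update: only the mapping changes, not the connection."""
        self.id_to_loc[peer_id] = new_loc

    def locator_for(self, peer_id: str, peer_port: int) -> str:
        """Resolve where to send packets for an existing connection."""
        if (peer_id, peer_port) not in self.connections:
            raise KeyError((peer_id, peer_port))
        return self.id_to_loc[peer_id]


def can_migrate(bound_ports: Dict[int, str], port: int) -> bool:
    """Assumption: the identifier is not in the packet header, so the target
    host delivers on the port alone and a port that is already taken blocks
    the migrated application."""
    return port not in bound_ports


# Device mobility: the laptop moves from 10.10.0.1 to 10.10.4.7.
server = LocIdHost()
server.accept("C0FF33D00D", "10.10.0.1", 25406)
before = server.locator_for("C0FF33D00D", 25406)
server.loc_id_update("C0FF33D00D", "10.10.4.7")
after = server.locator_for("C0FF33D00D", 25406)
state = server.connections[("C0FF33D00D", 25406)]
print(before, after, state)
```

In this sketch, `can_migrate({25406: "other-app"}, 25406)` returns `False`, which mirrors the post's observation that host-based Loc/Id split handles device mobility but cannot fully support application mobility when the port is still part of the transport state.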