From b116e6470da9177c178ad63a7d5e3a705448fba3 Mon Sep 17 00:00:00 2001
From: Dimitri Staessens
Date: Mon, 28 Feb 2022 19:53:05 +0100
Subject: blog: Post on application-level flow monitoring

---
 content/en/blog/20211229-flow-vs-connection.md     |  17 ++-
 content/en/blog/20220228-flm-app.png               | Bin 0 -> 80382 bytes
 .../en/blog/20220228-flow-liveness-monitoring.md   | 150 +++++++++++++++++++++
 3 files changed, 165 insertions(+), 2 deletions(-)
 create mode 100644 content/en/blog/20220228-flm-app.png
 create mode 100644 content/en/blog/20220228-flow-liveness-monitoring.md

diff --git a/content/en/blog/20211229-flow-vs-connection.md b/content/en/blog/20211229-flow-vs-connection.md
index 88b3654..3806dd2 100644
--- a/content/en/blog/20211229-flow-vs-connection.md
+++ b/content/en/blog/20211229-flow-vs-connection.md
@@ -230,6 +230,10 @@ applications' half of the flow deallocation, but not the complete deallocation. If an IPCP crashes, applications still hold the FRCP state and can recover the connection over a different flow[^6].
+**Edit: the below section is not correct, but it's interesting to read
+anyway**[^7]. There is a new post, documenting the
+[actual implementation](/blog/2022/02/28/application-level-flow-liveness-monitoring/).
+
 So, now it should be clear that the liveness of a flow has to be detected in the flow allocator of the IPCPs, not in the application (again, reminder: FRCP state is maintained inside the application).
@@ -264,7 +268,7 @@ in a way similar to the OSI/TCP models. I omitted the "physical layer", which is handled by dedicated IPCP implementations, such as the ipcpd-local, ipcpd-eth, etc. It's not that important here. What is important is that O7s splits functionality that is in TCP/IP in two
-layers (L3/L4), into **3 independent layers**[^7] (and protocols). Let's
+layers (L3/L4), into **3 independent layers**[^8] (and protocols). Let's
 go through O7s from bottom to top. {{
}}
@@ -332,7 +336,16 @@ Dimitri [^6]: This has not been implemented yet, and should make for a nice demo.
-[^7]: The "recursive layer boundary" in the figure uses the word layer
+[^7]: After implementing the solution below, it became apparent to me
+      that something was off. I needed to leak the FRCP timeout into
+      the IPCP, which is a layer violation. I noted this fact in my
+      commit message, but after more thought, I decided to retract my
+      patch... it just couldn't be right. This layer violation didn't
+      come up when we implemented FLM in the flow allocator in RINA,
+      because RINA puts the whole retransmission logic (called DTCP)
+      in the IPCP.
+
+[^8]: The "recursive layer boundary" in the figure uses the word layer
 in the sense of a RINA DIF. We didn't adopt the terminology DIF, since it has special meaning in RINA, and O7s' recursive layers are not interchangeable or compatible with RINA DIFs.
\ No newline at end of file
diff --git a/content/en/blog/20220228-flm-app.png b/content/en/blog/20220228-flm-app.png
new file mode 100644
index 0000000..13df9bd
Binary files /dev/null and b/content/en/blog/20220228-flm-app.png differ
diff --git a/content/en/blog/20220228-flow-liveness-monitoring.md b/content/en/blog/20220228-flow-liveness-monitoring.md
new file mode 100644
index 0000000..ba9190f
--- /dev/null
+++ b/content/en/blog/20220228-flow-liveness-monitoring.md
@@ -0,0 +1,150 @@
+---
+date: 2022-02-28
+title: "Application-level flow liveness monitoring"
+linkTitle: "Flows vs connections/sockets (3)"
+author: Dimitri Staessens
+---
+
+This week I completed the (probably final) implementation of flow
+liveness monitoring, this time in the application. In the next prototype
+version (0.19) Ouroboros will allow setting a keepalive timeout on
+flows. If there is no other traffic to send, either side will send
+periodic keepalive packets to keep the flow alive. 
If no activity has
+been observed for the duration of the keepalive timeout, the peer is
+considered down, and IPC calls (flow_read / flow_write) will fail with
+-EFLOWPEER. This does not remove any flow state in the system; it only
+notifies each side that the peer is unresponsive (presumed dead:
+either it crashed or it deallocated the flow). It's up to the
+application to decide how to respond to this event.
+
+The duration can be set using the timeout value in the QoS
+specification. It is specified in milliseconds, currently as a 32-bit
+unsigned integer. This allows timeouts up to about 50 days. Each side
+will send a keepalive packet at 1/4 of the specified period (not
+configurable yet, but this may be useful at some point). To disable
+keepalive, set the timeout to 0. I've set the current default value to
+2 minutes, but I'm open to other suggestions.
+
+The modified oecho application looks as follows (decluttered). On the
+server side, we have:
+
+```C
+        while (true) {
+                fd = flow_accept(NULL, NULL); /* wait for a client flow */
+
+                printf("New flow.\n");
+
+                count = flow_read(fd, buf, BUF_SIZE);
+
+                printf("Message from client is %.*s.\n", (int) count, buf);
+
+                flow_write(fd, buf, count); /* echo the message back */
+
+                flow_dealloc(fd);
+        }
+
+        return 0;
+```
+
+And on the client side, the following example sets a keepalive of 4 seconds:
+```C
+        char * message = "Client says hi!";
+        qosspec_t qs = qos_raw;
+        qs.timeout = 4000; /* keepalive timeout in milliseconds */
+
+        fd = flow_alloc("oecho", &qs, NULL);
+
+        flow_write(fd, message, strlen(message) + 1);
+
+        count = flow_read(fd, buf, BUF_SIZE);
+
+        printf("Server replied with %.*s\n", (int) count, buf);
+
+        /* The server has deallocated the flow, this read should fail. */
+        count = flow_read(fd, buf, BUF_SIZE);
+        if (count < 0) {
+                printf("Failed to read packet: %zd.\n", count);
+                flow_dealloc(fd);
+                return -1;
+        }
+
+        flow_dealloc(fd);
+```
+
+Running the client against the server results in the following output (1006 indicates EFLOWPEER):
+
+```
+[dstaesse@heteropoda website]$ oecho
+Server replied with Client says hi!
+Failed to read packet: -1006.
+```
+
+How does it work?
+
+In the
+[first post on this topic](/blog/2021/12/29/behaviour-of-ouroboros-flows-vs-udp-sockets-and-tcp-connections/sockets/),
+I explained my reasoning on how Ouroboros should deal with half-closed
+flows (flow deallocation from one side should eventually result in a
+terminated flow at the other side). The implementation should work
+with any kind of flow, which means we can't put it in the FRCP
+protocol. And thus, I argued, it had to be below the application, in
+the flow allocator. This is also where we implemented it in RINA a few
+years back, so it was easy to think this would directly translate to
+O7s. I was convinced it was right.
+
+I was wrong.
+
+After the initial implementation, I noticed that I needed to leak the
+FRCP timeout (remaining Delta-t) into the IPCP. I was not planning on
+doing that, as it's a _layer violation_. In RINA that is not as
+obvious, as DTCP is already in the IPCP. But in O7s, the deallocation
+first waits for Delta-t to expire in the application[^1] before
+telling the IPCP to get rid of the flow (where it's an instantaneous
+operation). This means that for flows with retransmission, the
+keepalive timeout will first wait for the peer's Delta-t timer to
+expire (because the flow isn't deallocated in the peer's IPCP until it
+does), and then again wait for the keepalive to expire in its own
+IPCP. With 2 minutes each, that means the application would only
+time out 4 minutes after the deallocation. To solve that with
+keepalive in the flow allocator, I would need to pass the timeout to
+the flow allocator, and on dealloc tell it to stop sending keepalives,
+and wait for the longer of [keepalive, Delta-t] to expire before
+getting rid of the flow state. It would work; it wouldn't even be a
+huge mess to most eyes. But it bugged me tremendously. It had to be in
+the application, as shown in the figure below.
+
+{{
}}
+
+But this poses a different problem: how to tell keepalive packets apart
+from regular traffic. As I said many times before, it can't be in FRCP,
+as it wouldn't work with raw flows. It also has to work with
+encryption. Raw flows have no header, so I can't mark them easily, and
+adding a header just for marking keepalive packets is also a bridge too
+far.
+
+I think I found an elegant solution: _0-length packets_. No header. No
+flags. Nothing. Nada. The flow at the receiver gets notified of a
+packet with a length of 0 bytes from the flow, updates its last
+activity time, and drops the packet without waking up application
+reads. Works with any type of traffic on the flow. 0-byte reads on the
+receiver already have the semantics of a partial read that was completed
+with exactly the buffer size[^2]. The sending application can still
+write 0-length packets itself, but the effect will be that of a
+purposeful keepalive initiated at the sender.
+
+[^1]: Logically in the application. After all packets are
+      acknowledged, the application will exit and the IRMd will just
+      wait for the remaining timeout before telling the IPCP to
+      deallocate the flow. This is also a leak of the timeout from the
+      application to the IRMd, but it's an optimization that is really
+      needed. Nobody wants to wait 4 minutes for an application to
+      terminate after hitting Ctrl-C. This isn't really a clear-cut
+      "layer violation" as the IRMd should be considered part of the
+      Operating System. It's similar to TCP connections being in
+      TIME_WAIT in the kernel for 2 MSL.
+
+
+[^2]: If flow\_read(fd, buf, 128) returns 128, it should be called
+      again. If it then returns 0, the message was exactly 128 bytes
+      long; if it returns another value, that data is still part of
+      the previous message.
\ No newline at end of file
-- 
cgit v1.2.3
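
A minimal sketch of the keepalive bookkeeping described in the post, written as a standalone C model rather than the actual Ouroboros code. The struct `flm_state` and the functions `flm_should_send_keepalive`, `flm_peer_down` and `flm_on_packet` are hypothetical names, and measuring the 1/4 interval from the last transmission is an assumption. The post itself only states that keepalives go out at 1/4 of the timeout when there is no other traffic to send, that a timeout of 0 disables them, and that 0-length packets update the activity time without waking up application reads.

```C
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-flow keepalive state; not taken from the Ouroboros source. */
struct flm_state {
        uint32_t timeout_ms;  /* keepalive timeout from the QoS spec, 0 = off */
        uint64_t last_rcv_ms; /* time the last packet arrived from the peer   */
        uint64_t last_snd_ms; /* time the last packet was sent to the peer    */
};

/* Assumption: a keepalive is due after 1/4 of the timeout without sending. */
static bool flm_should_send_keepalive(const struct flm_state * s,
                                      uint64_t                 now_ms)
{
        if (s->timeout_ms == 0)
                return false;

        return now_ms - s->last_snd_ms >= s->timeout_ms / 4;
}

/* The peer is presumed dead after a full timeout without any activity. */
static bool flm_peer_down(const struct flm_state * s,
                          uint64_t                 now_ms)
{
        if (s->timeout_ms == 0)
                return false;

        return now_ms - s->last_rcv_ms > s->timeout_ms;
}

/*
 * Every received packet counts as activity. Returns true if the packet
 * should be handed to the application, false for a 0-length keepalive,
 * which is dropped without waking up a reader.
 */
static bool flm_on_packet(struct flm_state * s,
                          size_t             len,
                          uint64_t           now_ms)
{
        s->last_rcv_ms = now_ms;

        return len > 0;
}
```

With the 4000 ms timeout from the oecho client, this model would send a keepalive after 1000 ms without transmissions and report the peer as down once more than 4000 ms pass without receiving anything.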