diff options
| author | Dimitri Staessens <dimitri@ouroboros.rocks> | 2026-05-10 19:06:21 +0200 |
|---|---|---|
| committer | Sander Vrijders <sander@ouroboros.rocks> | 2026-05-20 08:17:07 +0200 |
| commit | 63d3aa9ab8d8b0b6d8a10362e112a431dcb5b4e9 (patch) | |
| tree | 88f0827466b40d0e83da7954123d00cbb5f6c676 /cmake/config | |
| parent | f33769c818cb1f01079405f543b36aa294764112 (diff) | |
| download | ouroboros-63d3aa9ab8d8b0b6d8a10362e112a431dcb5b4e9.tar.gz ouroboros-63d3aa9ab8d8b0b6d8a10362e112a431dcb5b4e9.zip | |
lib: Update FRCP implementation
The Flow and Retransmission Control Protocol (FRCP) runs end-to-end
between two peers over a flow. It provides reliability, in-order
delivery, flow control, and liveness. Note that congestion avoidance
is orthogonal to FRCP and handled in the IPCP.
A fixed 16-octet header, network byte order, is prefixed to every FRCP
packet:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| flags | hcs |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| window |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| seqno |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ackno |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| payload (variable) ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
hcs is a CRC-16-CCITT-FALSE checksum over the PCI (and the stream
extension when present), verified before any flag-driven dispatch. A
single packet can simultaneously carry DATA + ACK + FC + RXM by OR-ing
flag bits. An optional CRC trailer covers the body on DATA when qs.ber
== 0, and on every SACK packet; an optional AEAD wrap (per-flow keys)
sits outermost.
Flag bits (MSB-first; bits 13..15 reserved, MUST be zero):
+------+--------+--------+----------------------------------------+
| Bit | Mask | Name | Meaning |
+------+--------+--------+----------------------------------------+
| 0 | 0x8000 | DATA | Carries caller payload |
| 1 | 0x4000 | DRF | Start of a fresh data run |
| 2 | 0x2000 | ACK | ackno field valid |
| 3 | 0x1000 | NACK | Pre-DRF nudge (seqno informational) |
| 4 | 0x0800 | FC | window field valid (rwe advertisement) |
| 5 | 0x0400 | RDVS | Rendezvous probe (window-closed) |
| 6 | 0x0200 | FFGM | First Fragment of a multi-fragment SDU |
| 7 | 0x0100 | LFGM | Last Fragment of a multi-fragment SDU |
| 8 | 0x0080 | RXM | Retransmission |
| 9 | 0x0040 | SACK | Block list follows in payload |
| 10 | 0x0020 | RTTP | RTT probe / echo (payload follows) |
| 11 | 0x0010 | KA | Keepalive |
| 12 | 0x0008 | FIN | End of stream marker |
| 13-15| -- | -- | Reserved (MUST be zero) |
+------+--------+--------+----------------------------------------+
(FFGM, LFGM) encodes the fragment role of a DATA packet (SCTP-style
B/E): 11=SOLE, 10=FIRST, 00=MID, 01=LAST. Each fragment carries its
own seqno; Retransmission recovers fragments individually, reassembly
runs at consume time. In stream mode FFGM/LFGM are unused; per-byte
position is carried by the stream extension below and end-of-stream is
signalled by FIN on a 0-byte DATA packet.
SACK payload (FRCT_ACK | FRCT_FC | FRCT_SACK):
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| n_blocks | padding (2 octets) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| start[0] |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| end[0] |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
... n_blocks pairs total ...
Each block describes a *present* (received) range strictly above the
cumulative ACK in the PCI ackno. D-SACK (RFC 2883) is signalled
in-band as block[0] - no flag bit, no extra framing - and consumed by
the RACK reo_wnd_mult scaler (RFC 8985 sec. 7.2).
RTTP payload (FRCT_RTTP only; 24 octets):
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| probe_id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| echo_id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ nonce (16 octets, echoed verbatim) +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Stream PCI extension (in_order == STREAM only; 8 octets after the base
PCI on every DATA packet):
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| start |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| end |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
start, end are monotonic 32-bit byte offsets; end - start equals the
on-wire payload length. Stream mode is negotiated at flow allocation;
the extension is present iff stream mode is in use, never on a
per-packet basis.
Service modes are an orthogonal (in_order, loss, ber) vector selected
at flow_alloc; the cubes above map to the axes:
+----------------+---------+------+-----+-----------------------+
| Cube | in_order| loss | ber | Engaged |
+----------------+---------+------+-----+-----------------------+
| qos_raw | 0 | 1 | 1 | Raw passthrough |
| qos_raw_safe | 0 | 1 | 0 | Raw + CRC trailer |
| qos_rt | 1 | 1 | 1 | FRCP, no FRTX, no CRC |
| qos_rt_safe | 1 | 1 | 0 | FRCP, no FRTX, CRC |
| qos_msg | 1 | 0 | 0 | FRCP + FRTX |
| qos_stream | 2 | 0 | 0 | FRCP + FRTX, stream |
+----------------+---------+------+-----+-----------------------+
in_order=0 sends raw datagrams with no PCI (UDP-equivalent);
in_order=1 engages FRCP with SDU framing; in_order=2 (stream) requires
loss=0 and is rejected otherwise. loss=0 engages the FRTX retransmit
machinery. ber=0 appends the CRC-32 trailer; QOS_DISABLE_CRC at build
time forces ber=1 for development. Encryption is a separate per-flow
attribute layered as an AEAD wrap outside the FRCP packet.
Heritage: delta-t (Watson 1981) supplies timer-based connection
management - no SYN/FIN handshake, the DRF marker, the t_mpl / t_a /
t_r timers. RINA (Day 2008) supplies the unified flow_alloc(name, qos,
...) primitive and the orthogonal QoS-cube axes. Loss detection
follows TCP/QUIC practice (RFCs 2018, 2883, 6582, 6298, 8985); RTT
probing is nonce-authenticated like QUIC PATH_CHALLENGE.
Adds oftp, a minimal file-transfer tool over an FRCP stream flow. The
client reads from stdin or --in FILE and writes through a
flow_alloc(qos_stream); the server (--listen) calls flow_accept and
writes to stdout or --out FILE. Both sides compute a CRC-64/NVMe over
the bytes they handle and print the result. The server rejects flows
whose negotiated qs.in_order != STREAM.
Two FRCP knobs are exposed via env vars on either side:
OFTP_FRCT_RTO_MIN fccntl FRCTSRTOMIN (ns)
OFTP_FRCT_STREAM_RING_SZ fccntl FRCTSRRINGSZ (octets)
The ocbr_client gains an OCBR_QOS env var to pick the cube the client
uses for flow_alloc; recognised values are raw, safe, rt, rt_safe,
msg, stream. Unknown values fall back to raw with a warning on
stderr. Without the env set behaviour is unchanged.
Removes the deprecated lib/timerwheel.c
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
Diffstat (limited to 'cmake/config')
| -rw-r--r-- | cmake/config/ipcp/local.cmake | 2 | ||||
| -rw-r--r-- | cmake/config/lib.cmake | 24 | ||||
| -rw-r--r-- | cmake/config/ssm.cmake | 59 |
3 files changed, 66 insertions, 19 deletions
diff --git a/cmake/config/ipcp/local.cmake b/cmake/config/ipcp/local.cmake index 60ff23ee..70423cd1 100644 --- a/cmake/config/ipcp/local.cmake +++ b/cmake/config/ipcp/local.cmake @@ -2,7 +2,7 @@ set(IPCP_LOCAL_TARGET ipcpd-local) -set(IPCP_LOCAL_MPL 100 CACHE STRING +set(IPCP_LOCAL_MPL 50 CACHE STRING "Default maximum packet lifetime for the Local IPCP, in ms") set(IPCP_LOCAL_MTU 65000 CACHE STRING diff --git a/cmake/config/lib.cmake b/cmake/config/lib.cmake index 1e863013..25130519 100644 --- a/cmake/config/lib.cmake +++ b/cmake/config/lib.cmake @@ -39,18 +39,17 @@ else() message(STATUS "CRC-64/NVMe backend: byte table (no acceleration)") endif() -# Delta-t protocol timers -set(DELTA_T_MPL 60 CACHE STRING - "Maximum packet lifetime (s)") -set(DELTA_T_ACK 10 CACHE STRING - "Maximum time to acknowledge a packet (s)") -set(DELTA_T_RTX 120 CACHE STRING - "Maximum time to retransmit a packet (s)") +# Delta-t protocol timers (Watson bound: 3*MPL + A + R). +# MPL is reported per IPCP (IPCP_*_MPL); A and R are FRCT-wide. +set(DELTA_T_ACK 1000 CACHE STRING + "Maximum time to acknowledge a packet (ms)") +set(DELTA_T_RTX 30000 CACHE STRING + "Maximum time to retransmit a packet (ms)") # FRCT configuration -set(FRCT_REORDER_QUEUE_SIZE 256 CACHE STRING +set(FRCT_REORDER_QUEUE_SIZE 128 CACHE STRING "Size of the reordering queue, must be a power of 2") -set(FRCT_START_WINDOW 64 CACHE STRING +set(FRCT_START_WINDOW 128 CACHE STRING "Start window, must be a power of 2") set(FRCT_LINUX_RTT_ESTIMATOR TRUE CACHE BOOL "Use Linux RTT estimator formula instead of the TCP RFC formula") @@ -59,13 +58,13 @@ set(FRCT_RTO_MDEV_MULTIPLIER 2 CACHE STRING set(FRCT_RTO_INC_FACTOR 0 CACHE STRING "Divisor for RTO increase after timeout: RTO += RTX >> X, 0: Karn/Partridge") set(FRCT_RTO_MIN 250 CACHE STRING - "Minimum Retransmission Timeout (RTO) for FRCT (us)") + "Hard floor for Retransmission Timeout (RTO) for FRCT (us)") set(FRCT_TICK_TIME 5000 CACHE STRING "Tick time for FRCT activity (retransmission, acknowledgments) (us)") +set(FRCT_DEBUG_STDOUT FALSE CACHE BOOL + "Print FRCT final counters to stdout at flow teardown") # Retransmission (RXM) configuration -set(RXM_BLOCKING TRUE CACHE BOOL - "Use blocking writes for retransmission") set(RXM_MIN_RESOLUTION 20 CACHE STRING "Minimum retransmission delay (ns), as a power to 2") set(RXM_WHEEL_MULTIPLIER 4 CACHE STRING @@ -101,3 +100,4 @@ if(HAVE_FUSE) message(STATUS "Application flow statistics disabled") endif() endif() + diff --git a/cmake/config/ssm.cmake b/cmake/config/ssm.cmake index aa369e53..913396ec 100644 --- a/cmake/config/ssm.cmake +++ b/cmake/config/ssm.cmake @@ -19,8 +19,18 @@ set(SSM_PK_BUFF_HEADSPACE 256 CACHE STRING "Bytes of headspace to reserve for future headers") set(SSM_PK_BUFF_TAILSPACE 32 CACHE STRING "Bytes of tailspace to reserve for future tails") -set(SSM_RBUFF_SIZE 1024 CACHE STRING +# Sized to absorb burst arrivals from fragmented SDUs without +# overflowing at the eth->FRCT boundary. Must hold at least one +# full FRCT reorder window plus margin for transient app-thread +# unavailability; 4x FRCT_REORDER_QUEUE_SIZE leaves comfortable +# burst headroom. Floor at 1024 for small RQ configs. +math(EXPR _SSM_RBUFF_DEFAULT "${FRCT_REORDER_QUEUE_SIZE} * 4") +if(_SSM_RBUFF_DEFAULT LESS 1024) + set(_SSM_RBUFF_DEFAULT 1024) +endif() +set(SSM_RBUFF_SIZE ${_SSM_RBUFF_DEFAULT} CACHE STRING "Number of blocks in rbuff buffer, must be a power of 2") +unset(_SSM_RBUFF_DEFAULT) set(SSM_RBUFF_PREFIX "/${SHM_PREFIX}.rbuff." CACHE INTERNAL "Prefix for rbuff POSIX shared memory filenames") set(SSM_FLOW_SET_PREFIX "/${SHM_PREFIX}.set." CACHE INTERNAL @@ -34,7 +44,7 @@ set(SSM_POOL_SHARDS 4 CACHE STRING # Shared by all processes in 'ouroboros' group (~60 MB total) set(SSM_GSPP_256_BLOCKS 1024 CACHE STRING "GSPP: Number of 256B blocks") -set(SSM_GSPP_512_BLOCKS 768 CACHE STRING +set(SSM_GSPP_512_BLOCKS 2048 CACHE STRING "GSPP: Number of 512B blocks") set(SSM_GSPP_1K_BLOCKS 512 CACHE STRING "GSPP: Number of 1KB blocks") @@ -53,13 +63,13 @@ set(SSM_GSPP_1M_BLOCKS 16 CACHE STRING # Per-User Pool (PUP) - for unprivileged applications # Each unprivileged app gets its own smaller pool (~7.5 MB total) -set(SSM_PUP_256_BLOCKS 128 CACHE STRING +set(SSM_PUP_256_BLOCKS 512 CACHE STRING "PUP: Number of 256B blocks") -set(SSM_PUP_512_BLOCKS 96 CACHE STRING +set(SSM_PUP_512_BLOCKS 512 CACHE STRING "PUP: Number of 512B blocks") -set(SSM_PUP_1K_BLOCKS 64 CACHE STRING +set(SSM_PUP_1K_BLOCKS 512 CACHE STRING "PUP: Number of 1KB blocks") -set(SSM_PUP_2K_BLOCKS 48 CACHE STRING +set(SSM_PUP_2K_BLOCKS 512 CACHE STRING "PUP: Number of 2KB blocks") set(SSM_PUP_4K_BLOCKS 32 CACHE STRING "PUP: Number of 4KB blocks") @@ -72,6 +82,23 @@ set(SSM_PUP_256K_BLOCKS 2 CACHE STRING set(SSM_PUP_1M_BLOCKS 0 CACHE STRING "PUP: Number of 1MB blocks") +# Zero classes too small for spb header + HEADSPACE + TAILSPACE + 1 B. +math(EXPR _SSM_MIN_USEFUL_CLASS + "32 + ${SSM_PK_BUFF_HEADSPACE} + ${SSM_PK_BUFF_TAILSPACE}") +foreach(_pair "256:256" "512:512" "1K:1024" "2K:2048") + string(REPLACE ":" ";" _p "${_pair}") + list(GET _p 0 _suffix) + list(GET _p 1 _size) + if(_size LESS _SSM_MIN_USEFUL_CLASS) + set(SSM_GSPP_${_suffix}_BLOCKS 0) + set(SSM_PUP_${_suffix}_BLOCKS 0) + endif() +endforeach() +unset(_SSM_MIN_USEFUL_CLASS) +unset(_p) +unset(_suffix) +unset(_size) + # SSM pool size calculations include(utils/HumanReadable) @@ -127,3 +154,23 @@ message(STATUS " Blocks: ${SSM_PUP_256_BLOCKS}, ${SSM_PUP_512_BLOCKS}, " "${SSM_PUP_1K_BLOCKS}, ${SSM_PUP_2K_BLOCKS}, ${SSM_PUP_4K_BLOCKS}, " "${SSM_PUP_16K_BLOCKS}, ${SSM_PUP_64K_BLOCKS}, ${SSM_PUP_256K_BLOCKS}, " "${SSM_PUP_1M_BLOCKS}") + +# FRCT reorder queue must fit in every enabled size class. If RQ_SIZE +# >= any backing pool, the receiver advertises a window the pool +# cannot back; np1_flow_write fails under load and a single dropped +# fragment wedges the flow. Auto-zeroed classes are skipped. +foreach(_class 256 512 1K 2K) + if(SSM_PUP_${_class}_BLOCKS GREATER 0 + AND NOT FRCT_REORDER_QUEUE_SIZE LESS SSM_PUP_${_class}_BLOCKS) + message(FATAL_ERROR + "FRCT_REORDER_QUEUE_SIZE (${FRCT_REORDER_QUEUE_SIZE}) must be " + "< SSM_PUP_${_class}_BLOCKS (${SSM_PUP_${_class}_BLOCKS}): " + "the FC window cannot exceed the pool that backs OOO stashing.") + endif() + if(SSM_GSPP_${_class}_BLOCKS GREATER 0 + AND NOT FRCT_REORDER_QUEUE_SIZE LESS SSM_GSPP_${_class}_BLOCKS) + message(FATAL_ERROR + "FRCT_REORDER_QUEUE_SIZE (${FRCT_REORDER_QUEUE_SIZE}) must be " + "< SSM_GSPP_${_class}_BLOCKS (${SSM_GSPP_${_class}_BLOCKS}).") + endif() +endforeach() |
