Flow and Retransmission Control Protocol

From Ouroboros
Revision as of 15:24, 17 May 2026 by Dimitri (talk | contribs) (→‎1.3. SACK payload)


FRCP runs end-to-end between two peers over a flow. It provides reliability, in-order delivery, flow control, and liveness. Congestion Control (CC) is not part of FRCP; it lives in the IPC Process (IPCP) Congestion Avoidance (CA) policies and is orthogonal to FRCP. Flow allocation, naming, and IPCP lifecycle are handled by the IPC Resource Manager daemon (IRMd).

FRCT (Flow and Retransmission Control Task) is the libouroboros implementation of FRCP; the task lives in src/lib/frct.c. The remainder of this document describes the FRCP wire protocol and the behaviour FRCT realises. Code symbols retain the FRCT_ prefix (FRCT_DATA, FRCT_RXM, ...) because they belong to the implementing task; this document references them verbatim.

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 (Best Current Practice; RFC 2119, RFC 8174) when, and only when, they appear in all capitals.


Notation

u32, u8
Unsigned 32-bit / 8-bit integers (kernel-C style).
ns
Nanoseconds.

Modular sequence-number comparators (32-bit, modulo 2^32):

before(a, b)
(int32_t)(a - b) < 0
after(a, b)
before(b, a)

Used throughout for ackno / seqno ordering checks.
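
The two comparators can be sketched in C as below. This is a minimal illustration rather than the libouroboros source; it assumes the usual two's-complement conversion for the uint32_t to int32_t cast, which every mainstream compiler provides.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a precedes b in modulo-2^32 sequence space. */
bool before(uint32_t a, uint32_t b)
{
        /* The unsigned difference wraps; the sign of the cast decides. */
        return (int32_t)(a - b) < 0;
}

/* True when a follows b: the mirrored comparison. */
bool after(uint32_t a, uint32_t b)
{
        return before(b, a);
}
```

The comparison stays correct across the wrap: 0xFFFFFFF0 is before 0x00000010 because the unsigned difference, read as a signed value, is -32. The result is only meaningful while the two values are less than 2^31 apart.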

Round-Trip Time (RTT) abbreviations used throughout:

SRTT
Smoothed RTT estimate (RFC 6298).
mdev
Mean deviation of RTT (Linux variance estimator).
EWMA
Exponentially Weighted Moving Average.
RTO
Retransmission Timeout, max(RTO_MIN, srtt + (mdev << MDEV_MUL)).
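
A small sketch of the RTO formula, assuming nanosecond inputs and the default MDEV_MUL of 2 from Section 3 (so the deviation term is 4 * mdev). The function name rto_ns and its signature are illustrative only.

```c
#include <stdint.h>

#define MDEV_MUL 2 /* default shift from Section 3; build-tunable */

/* RTO = max(rto_min, srtt + (mdev << MDEV_MUL)), all values in ns. */
uint64_t rto_ns(uint64_t srtt, uint64_t mdev, uint64_t rto_min)
{
        uint64_t rto = srtt + (mdev << MDEV_MUL);

        return rto < rto_min ? rto_min : rto;
}
```

With srtt = 1 ms and mdev = 100 us this gives 1.4 ms; a very small srtt is lifted to the floor.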

Timer-bound symbols t_a (a-timer, ACK delay) and t_r (r-timer, retransmission window) are defined in Section 8; t_mpl (Maximum Packet Lifetime) is introduced in Section 2.1 (the inact field) with heritage in Section 15.

Wire-format diagrams follow the IETF convention: bit 0 is the leftmost (most significant) bit and fields are in network byte order unless stated otherwise.


1. Wire format

1.1. PCI header

Fixed 16-octet base Protocol-Control Information (PCI) header prefixed to every FRCP packet (RFC convention: bit 0 leftmost, most-significant bit first). All multi-byte fields except hcs are in network byte order; hcs is an opaque 16-bit value that the receiver recomputes from the wire bytes and compares to the in-place pci->hcs read, so its on-wire byte order need only match between peers running compatible builds. DATA packets on stream-mode flows carry an additional 8-octet extension (see Section 1.5); SACK and RTTP carry their own payloads after the base PCI.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |             flags             |              hcs              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            window                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            seqno                              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            ackno                              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                     payload (variable) ...
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
flags
feature/type bitmap (see Section 1.2).
hcs
CRC-16-CCITT-FALSE Header Check Sequence (HCS) over flags + window + seqno + ackno (+ stream extension when present); the two octets of the hcs field itself are omitted from the CRC input. Verified on receive before any flag-driven dispatch.
window
receiver-advertised right window edge (valid iff FC).
seqno
per-flow sequence number.
ackno
cumulative Acknowledgement (ACK) (valid iff ACK).
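
The layout and HCS coverage can be illustrated with a byte-level serialiser. This is a sketch, not the FRCT code: the helper names are hypothetical, and storing the HCS most-significant octet first is an assumption (the text only requires that both peers agree).

```c
#include <stddef.h>
#include <stdint.h>

#define PCI_LEN 16

/* CRC-16/CCITT-FALSE: poly 0x1021, init 0xFFFF, no reflection, no xor-out. */
uint16_t crc16_ccitt_false(uint16_t crc, const uint8_t *buf, size_t len)
{
        size_t i;
        int    b;

        for (i = 0; i < len; ++i) {
                crc ^= (uint16_t)(buf[i] << 8);
                for (b = 0; b < 8; ++b)
                        crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                             : (uint16_t)(crc << 1);
        }
        return crc;
}

void put_be32(uint8_t *p, uint32_t v)
{
        p[0] = (uint8_t)(v >> 24);
        p[1] = (uint8_t)(v >> 16);
        p[2] = (uint8_t)(v >> 8);
        p[3] = (uint8_t)v;
}

/* HCS over flags (0..1) and window/seqno/ackno (4..15), skipping hcs. */
uint16_t pci_hcs(const uint8_t pci[PCI_LEN])
{
        uint16_t hcs = crc16_ccitt_false(0xFFFF, pci, 2);

        return crc16_ccitt_false(hcs, pci + 4, PCI_LEN - 4);
}

void pci_pack(uint8_t out[PCI_LEN], uint16_t flags, uint32_t window,
              uint32_t seqno, uint32_t ackno)
{
        uint16_t hcs;

        out[0] = (uint8_t)(flags >> 8);
        out[1] = (uint8_t)flags;
        put_be32(out + 4, window);
        put_be32(out + 8, seqno);
        put_be32(out + 12, ackno);
        hcs = pci_hcs(out);
        out[2] = (uint8_t)(hcs >> 8); /* byte order is an assumption */
        out[3] = (uint8_t)hcs;
}

/* Receive side: recompute and compare before any flag dispatch. */
int pci_hcs_ok(const uint8_t pci[PCI_LEN])
{
        uint16_t hcs = pci_hcs(pci);

        return pci[2] == (uint8_t)(hcs >> 8) && pci[3] == (uint8_t)hcs;
}
```

Because this CRC variant has no reflection and no final XOR, feeding the two covered regions through the same running state is equivalent to a CRC over their concatenation.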

A single packet can simultaneously carry DATA + ACK + FC (Flow Control) + RXM (Retransmission) by ORing flag bits; the PCI multiplexes control on the same wire frame in the spirit of SCTP chunk bundling (RFC 9260 sec. 6.10) and QUIC frame multiplexing (RFC 9000 sec. 12.4). DATA-bearing packets carry the caller's payload after the PCI; SACK (Selective Acknowledgement) and RTTP (Round-Trip Time Probe) carry their own typed payloads after the PCI.

Optional framing (per-flow, see Section 2.2). On the wire, the order from inside out is:

    Layer                      Scope
    [ PCI + body ]             The FRCP packet.
    [ PCI + body + CRC-32 ]    CRC-32 covers the body only (PCI is in HCS); appended iff qs.ber == 0 on DATA, or on every SACK packet.
    [ AEAD-wrap of above ]     Iff Authenticated Encryption with Associated Data (AEAD) is enabled.
  • HCS in the PCI covers the header fields on every packet and is verified before any flag-driven dispatch.
  • The CRC-32 trailer (IEEE 802.3 / zlib reflected polynomial 0xEDB88320, init 0xFFFFFFFF, xor-out 0xFFFFFFFF) covers the body on DATA when qs.ber == 0 and on every SACK packet; the trailer is written as a raw uint32_t (the same convention as hcs: opaque on the wire as long as both peers run compatible builds). The PCI is not under the CRC (Cyclic Redundancy Check) because the HCS already protects it. It is appended before AEAD encryption and therefore rides inside the AEAD wrap when both are active; the AEAD tag (~2^-128 forgery probability) dominates the CRC (~2^-32) for integrity in that mode but the CRC trailer is currently retained.
  • When encryption is enabled, the entire (possibly-CRC'd) FRCP packet is wrapped with AEAD inside the shared-memory packet buffer (spb, struct ssm_pk_buff); the packet grows by the AEAD overhead, namely a leading nonce / Initialization Vector (IV) of headsz bytes (crypt_get_ivsz) and a trailing authentication tag of tailsz bytes (crypt_get_tagsz).

Both CRC and AEAD are layered around the FRCP wire format and are not visible to the FRCP machinery itself.
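
The CRC-32 trailer described above can be sketched as follows. The bitwise loop is a plain reference form of the reflected IEEE 802.3 CRC, not the implementation's routine; crc32_append and crc32_check are hypothetical helper names, and the trailer is copied as a raw host-order uint32_t to mirror the "opaque on the wire" convention.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reflected CRC-32: poly 0xEDB88320, init and xor-out 0xFFFFFFFF. */
uint32_t crc32_ieee(const uint8_t *buf, size_t len)
{
        uint32_t crc = 0xFFFFFFFFu;
        size_t   i;
        int      b;

        for (i = 0; i < len; ++i) {
                crc ^= buf[i];
                for (b = 0; b < 8; ++b)
                        crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u
                                         : crc >> 1;
        }
        return crc ^ 0xFFFFFFFFu;
}

/* Append the trailer; the caller guarantees 4 spare octets. */
size_t crc32_append(uint8_t *body, size_t len)
{
        uint32_t crc = crc32_ieee(body, len);

        memcpy(body + len, &crc, sizeof(crc));
        return len + sizeof(crc);
}

/* Verify a body that ends in a 4-octet trailer. */
int crc32_check(const uint8_t *body, size_t len)
{
        uint32_t got;

        if (len < sizeof(got))
                return 0;
        memcpy(&got, body + len - sizeof(got), sizeof(got));
        return got == crc32_ieee(body, len - sizeof(got));
}
```

Only the body is passed in; the PCI stays outside the CRC input because the HCS already covers it.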


1.2. Flag bits

Flag bits are numbered most-significant-bit first to match the wire diagram (bit numbering per Section 1.1; bit 0 is the MSB of the 16-bit flags field and lands at wire-position 0 in network byte order). Bits 13..15 are reserved and MUST be transmitted as zero.

    Bit    Mask    Name  Meaning
    0      0x8000  DATA  Carries caller payload
    1      0x4000  DRF   Data Run Flag: start of a fresh run
    2      0x2000  ACK   Acknowledgement: ackno field valid
    3      0x1000  NACK  Negative ACK; seqno = arrival_seqno-1
    4      0x0800  FC    Flow Control: window field valid (rwe)
    5      0x0400  RDVS  Rendezvous probe (window-closed)
    6      0x0200  FFGM  First Fragment (role bit 0; see below)
    7      0x0100  LFGM  Last Fragment (role bit 1; see below)
    8      0x0080  RXM   Retransmission
    9      0x0040  SACK  Selective ACK block list in payload
    10     0x0020  RTTP  RTT Probe / echo (payload follows)
    11     0x0010  KA    Keepalive
    12     0x0008  FIN   End-of-stream marker (stream mode)
    13-15  --      --    Reserved (MUST be zero)

The (FFGM, LFGM) pair encodes the fragment role of a DATA-bearing Service Data Unit (SDU), SCTP-style begin/end flags (RFC 9260 sec. 3.3.1):

    FFGM  LFGM  Role
    1     1     Sole / un-fragmented SDU (begin AND end)
    1     0     First fragment of a multi-fragment SDU
    0     0     Middle fragment
    0     1     Last fragment

Each fragment is carried in its own FRCP packet with its own seqno; FRTX (the FRCT Retransmission service mode, see Section 2.2) recovers individual fragments via the normal Retransmission Timeout (RTO) / SACK / Recent Acknowledgement (RACK, RFC 8985) path. The receiver reassembles the SDU at consume time once the contiguous [FIRST .. LAST] run has fully arrived. On non-DATA packets the role bits are unused and MUST be transmitted as zero.
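
Decoding the role pair from the flags field might look like the sketch below. The masks come from the flag table; the symbol names FRCT_FFGM and FRCT_LFGM are assumed by analogy with FRCT_DATA, and flags_wellformed is a hypothetical sender-side sanity check, not a documented receiver rule.

```c
#include <stdint.h>

enum {
        FRCT_DATA     = 0x8000,
        FRCT_FFGM     = 0x0200, /* name assumed; mask from the flag table */
        FRCT_LFGM     = 0x0100, /* name assumed; mask from the flag table */
        FRCT_RESERVED = 0x0007  /* bits 13..15 */
};

enum frag_role { ROLE_MID, ROLE_FIRST, ROLE_LAST, ROLE_SOLE };

/* Map the (FFGM, LFGM) pair of a DATA packet onto its fragment role. */
enum frag_role frag_role_of(uint16_t flags)
{
        int first = (flags & FRCT_FFGM) != 0;
        int last  = (flags & FRCT_LFGM) != 0;

        if (first && last)
                return ROLE_SOLE;
        if (first)
                return ROLE_FIRST;
        return last ? ROLE_LAST : ROLE_MID;
}

/* Reserved bits are zero; role bits appear only together with DATA. */
int flags_wellformed(uint16_t flags)
{
        if (flags & FRCT_RESERVED)
                return 0;
        if (!(flags & FRCT_DATA) && (flags & (FRCT_FFGM | FRCT_LFGM)))
                return 0;
        return 1;
}
```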

In stream mode (qos.service == SVC_STREAM, see Section 16) there are no SDU boundaries to encode, so FFGM and LFGM are unused and MUST be transmitted as zero. End-of-stream uses a dedicated bit (FIN, bit 12) carried on a 0-byte DATA packet, emitted at write-half close (fccntl to FLOWFRDONLY), during linger drain, and at flow_dealloc; emission is idempotent (first call wins). After contiguous delivery of the FIN-bearing slot, the receiver latches byte_fin at the FIN's start offset; flow_read returns 0 (end-of-file, EOF) once buffered bytes have been drained up to byte_fin. Per-byte position is carried by the [start, end) extension (Section 1.5).


1.3. SACK payload

A SACK packet has the FRCT_ACK | FRCT_FC | FRCT_SACK flag bits set (bit numbering per Section 1.1). Following the 16-octet PCI, the payload is a 2-octet block count (network byte order), 2 octets of padding to 4-byte align the block list, then n_blocks pairs of 32-bit start/end seqnos describing present (received) ranges above the cumulative ACK.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           n_blocks            |        padding (2 octets)     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                           start[0]                            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            end[0]                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                           start[1]                            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       ... n_blocks pairs total ...

n_blocks <= SACK_MAX_BLOCKS (2048). The per-flow effective cap is further bounded by (frag_mtu - PCI - 4) / 8 blocks per packet; SACK packets carry no stream extension, so PCI here is the 16-octet base header even on stream-mode flows.
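
The per-packet block cap and the payload layout can be sketched as follows. Function names are hypothetical, the caller is assumed to provide a buffer of at least 4 + 8 * n octets, and the block list is passed as a flat array of start/end pairs.

```c
#include <stddef.h>
#include <stdint.h>

#define PCI_LEN         16
#define SACK_MAX_BLOCKS 2048

/* Blocks that fit in one packet: (frag_mtu - PCI - 4) / 8, wire-capped. */
size_t sack_block_cap(size_t frag_mtu)
{
        size_t cap;

        if (frag_mtu < PCI_LEN + 4 + 8)
                return 0;
        cap = (frag_mtu - PCI_LEN - 4) / 8;
        return cap > SACK_MAX_BLOCKS ? SACK_MAX_BLOCKS : cap;
}

/* Write n_blocks, 2 octets of padding, then n start/end pairs. */
size_t sack_payload_encode(uint8_t *out, const uint32_t *pairs, uint16_t n)
{
        size_t off = 4;
        size_t i;

        out[0] = (uint8_t)(n >> 8);
        out[1] = (uint8_t)n;
        out[2] = 0;
        out[3] = 0;
        for (i = 0; i < (size_t)n * 2; ++i) {
                out[off++] = (uint8_t)(pairs[i] >> 24);
                out[off++] = (uint8_t)(pairs[i] >> 16);
                out[off++] = (uint8_t)(pairs[i] >> 8);
                out[off++] = (uint8_t)pairs[i];
        }
        return off;
}
```

For a 1500-octet frag_mtu the cap works out to (1500 - 16 - 4) / 8 = 185 blocks, far below the 2048 wire cap.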

Wire invariant: every block produced by the receiver, except an optional leading Duplicate SACK (D-SACK) block as described below, describes a range strictly above the cumulative ACK carried in the PCI ackno field (after(start[i], ackno)). This makes the D-SACK convention below unambiguous; the receiver-side builder MUST preserve it.

Duplicate SACK (D-SACK, RFC 2883) is signalled in-band: no flag bit, no extra framing. Modular seqno arithmetic uses the before() / after() comparators defined in the Notation block.

Encoding. When a duplicate is observed the receiver arms a single-slot pending report (dsack_seqno + dsack_valid, latest-wins across multiple arms before the next emit). On the next outbound SACK the receiver prepends block[0] = [dsack_seqno, dsack_seqno + 1) - always a one-seqno range - and clears the flag. The three arm sites are listed in Section 10; case-1 sites yield dsack_seqno < rcv_cr.lwe (the next pci.ackno), and the case-2 site (rq_accept conflict) yields dsack_seqno in [rcv_cr.lwe, rcv_cr.rwe).

Detection. The sender classifies block[0] by its relation to pci.ackno:

case 1 (RFC 2883 sec. 4.1.1, full duplicate)
before(blocks[0].start, pci.ackno) AND pci.ackno - blocks[0].start <= MAX_DSACK_LAG (== RQ_SIZE). The lag bound rejects stale or spoofed reports beyond one receive window.
case 2 (RFC 2883 sec. 4.1.2, partial duplicate)
blocks[0] is a sub-range (with at least one endpoint differing) of some blocks[i>0] - i.e. the same packet's remaining SACK blocks already describe the duplicated seqno as received.

On detect, the sender:

  • bumps reo_wnd_mult by 1, capped at REO_WND_MULT_MAX (= 20), per RFC 8985 sec. 6.2 step 4;
  • snapshots dsack_lwe_snap = snd_cr.lwe, resetting the 16-cum-ACK halving counter so the multiplier doesn't decay while D-SACK evidence is still arriving;
  • excludes block[0] from the gap-marking loop (n_real = n - 1), so a D-SACK alone never enters NewReno-careful recovery (see Section 8); only non-D-SACK blocks count as gaps.
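
The two detection cases and the multiplier bump can be sketched as below. This is an illustration under assumptions: the struct and function names are hypothetical, MAX_DSACK_LAG uses the default RQ_SIZE of 128, and blocks are taken as half-open [start, end) ranges as in the encoding paragraph.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DSACK_LAG    128u /* == RQ_SIZE, default value assumed */
#define REO_WND_MULT_MAX 20

struct sack_block {
        uint32_t start;
        uint32_t end;
};

bool before(uint32_t a, uint32_t b)
{
        return (int32_t)(a - b) < 0;
}

/* Returns 1 or 2 for the matching D-SACK case, 0 when block[0] is plain. */
int dsack_case(const struct sack_block *b, size_t n, uint32_t ackno)
{
        size_t i;

        if (n == 0)
                return 0;
        /* Case 1: below the cumulative ACK and within the lag bound. */
        if (before(b[0].start, ackno) && ackno - b[0].start <= MAX_DSACK_LAG)
                return 1;
        /* Case 2: strict sub-range of a later block in the same packet. */
        for (i = 1; i < n; ++i) {
                bool inside  = !before(b[0].start, b[i].start) &&
                               !before(b[i].end, b[0].end);
                bool differs = b[0].start != b[i].start ||
                               b[0].end != b[i].end;
                if (inside && differs)
                        return 2;
        }
        return 0;
}

/* On detect: bump the reorder-window multiplier, saturating at the cap. */
unsigned reo_wnd_mult_bump(unsigned mult)
{
        return mult < REO_WND_MULT_MAX ? mult + 1 : REO_WND_MULT_MAX;
}
```

A caller would skip block[0] in the gap-marking loop whenever dsack_case returns non-zero.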

The reo_wnd_mult halving cadence (once per 16 cumulatively-ACK'd seqnos since the most-recent D-SACK arrival or halve event) and the reset-to-1 on a head-of-line (HoL) RTO fire are both per the same RFC 8985 clause. The clamp-and-skip path in the regular SACK-mark loop is incidentally idempotent on any leftover case-1 or case-2 block (start < snd_cr.lwe clamps to snd_cr.lwe and the inner loop skips k == snd_cr.lwe; case-2 re-NULLs slots already marked received by later blocks), so block[0] is harmless even when fed to the loop.

1.4. RTTP payload

An RTTP (Round-Trip Time Probe) packet has only the FRCT_RTTP flag set (bit numbering per Section 1.1). Following the 16-octet PCI, the payload is 24 octets (packed):

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          probe_id                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          echo_id                              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    +                  nonce (16 octets, echoed verbatim)           +
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
probe_id
sender's probe counter; 0 on a reply (the value 0 is reserved for this purpose).
echo_id
peer's probe_id, 0 on outbound probe.
nonce
random, echoed unmodified, memcmp'd to defeat spoof.
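
A sketch of building and validating an echo, assuming the three fields are already in host order (byte-order conversion is left out). The struct and function names are hypothetical; the field layout follows the diagram above.

```c
#include <stdint.h>
#include <string.h>

struct rttp_payload {
        uint32_t probe_id;  /* sender counter; 0 on a reply */
        uint32_t echo_id;   /* peer's probe_id; 0 on a probe */
        uint8_t  nonce[16]; /* random, echoed unmodified */
};

/* Responder: mirror the probe id and copy the nonce verbatim. */
void rttp_make_echo(const struct rttp_payload *probe,
                    struct rttp_payload       *echo)
{
        echo->probe_id = 0;
        echo->echo_id  = probe->probe_id;
        memcpy(echo->nonce, probe->nonce, sizeof(echo->nonce));
}

/* Prober: accept an echo only if the id and nonce both match. */
int rttp_echo_valid(const struct rttp_payload *sent,
                    const struct rttp_payload *echo)
{
        if (echo->probe_id != 0 || echo->echo_id != sent->probe_id)
                return 0;
        return memcmp(sent->nonce, echo->nonce, sizeof(sent->nonce)) == 0;
}
```

The id match selects the outstanding probe; the nonce comparison is what makes a blind spoof of the echo impractical.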


1.5. Stream PCI extension

A stream-mode flow (qos.service == SVC_STREAM) carries an extra 8-octet extension after the 16-octet base PCI on every DATA packet (bit numbering per Section 1.1):

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            start                              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                             end                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
start
octet offset of the first payload byte in the stream.
end
octet offset one past the last payload byte; end - start equals the on-wire payload length.

Total stream-mode PCI for DATA packets is 24 octets (16 base + 8 extension); control packets (SACK, RTTP, bare ACK, KA, etc.) retain the 16-octet base PCI. Stream mode MUST be negotiated at flow allocation; the extension is present iff stream mode is in use, never on a per-packet basis. Both peers MUST treat start/end as monotonic 32-bit byte offsets; when a slot reaches the head of the contiguous run with start not equal to the prior packet's end the slot is silently dropped at delivery time (Section 16) rather than rejected at stash.
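
The delivery-time continuity rule can be illustrated with a small predicate. The names are hypothetical; the check mirrors the two conditions stated above (start equals the prior packet's end, and end - start equals the payload length), with the subtraction done modulo 2^32.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct stream_ext {
        uint32_t start; /* offset of the first payload byte */
        uint32_t end;   /* offset one past the last payload byte */
};

/* True when the slot continues the stream exactly where it left off. */
bool stream_slot_ok(uint32_t prev_end, struct stream_ext e,
                    size_t payload_len)
{
        if (e.start != prev_end)
                return false; /* such a slot is dropped at delivery time */
        return (uint32_t)(e.end - e.start) == payload_len;
}
```

The unsigned subtraction keeps the length check valid when the offsets wrap past 4 GiB.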

This is the QUIC STREAM-frame reassembly model (RFC 9000 sec. 19.8): each packet carries its packet seqno (this PCI's seqno field) and a separate stream byte position (start/end). Separating the two avoids TCP's conflation of packet identity with byte position, which forces Karn's algorithm for Round-Trip Time (RTT) sampling (no RTT sample on retransmits, RFC 6298 sec. 3). FRCP applies the Karn-equivalent gate via a combination of per-packet FRCT_RXM, per-slot SND_RTX flags, and a sample-fence rtt_lwe (see Section 2.1 and Section 12).

FRCP's fixed 32-bit start/end fields wrap at 4 GiB of wire bytes, narrower than QUIC's 62-bit varint offset (cf. RFC 9000 sec. 16). The on-wire wrap is handled by the same modular before() / after() comparators (see Notation) that FRCP uses for seqnos, which remain unambiguous as long as the in-flight byte window stays strictly under 2 GiB (the half-range of the signed-int32 difference in before()). The default per-flow ring is 1 MiB; the implementation caps ring_sz at 128 MiB (FRCT_STREAM_RING_SZ_MAX), well below the 2 GiB half-range bound. The runtime byte counters exposed via FUSE (Filesystem in Userspace) in the Ouroboros Resource Information Base (RIB, a virtual-filesystem introspection bridge) are platform size_t and do not wrap on 64-bit hosts.


2. Per-flow state and service modes

2.1. Per-flow state

Each flow keeps a sender control record and a receiver control record:

lwe
u32
snd: oldest unacked seqno (cumulative ACK boundary as seen by sender); rcv: next in-order seqno expected
rwe
u32
snd: peer-advertised right window edge; rcv: locally-advertised right window edge
cflags
u8
per-direction feature flags: retransmission (FRCTFRTX), receiver flow control (FRCTFRESCNTL), linger-on-close (FRCTFLINGER); see <ouroboros/fccntl.h>
seqno
u32
snd: next seqno to send; rcv: force-ACK trigger - set on a stale or dup DATA so the next ack_snd emits a fresh cumulative ACK
ackno
u32
snd: seqno counter for standalone ACK-bearing control packets (delayed ACK, SACK, final ACK on dealloc); not bumped on piggybacked ACK riding a DATA packet (which uses the DATA seqno). Used by wire-dup ACK detection; rcv: incoming-ACK dedup tracker
act
ns
last activity (used by inactivity / DRF)
inact
ns
inactivity threshold; sender = 3*mpl + a + r + 1s, receiver = 2*mpl + a + r + 1s. mpl is the Maximum Packet Lifetime (delta-t terminology; see Section 15); a and r are the FRCT a-timer and r-timer bounds (see Section 8). The asymmetry is load-bearing for pre-DRF NACK (Section 9).
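
The two inactivity thresholds can be written out as below; the function names are illustrative and all quantities are assumed to be in nanoseconds.

```c
#include <stdint.h>

#define NS_PER_S 1000000000ull

/* Sender threshold: 3*mpl + a + r + 1 s. */
uint64_t snd_inact_ns(uint64_t mpl, uint64_t a, uint64_t r)
{
        return 3 * mpl + a + r + NS_PER_S;
}

/* Receiver threshold: 2*mpl + a + r + 1 s. */
uint64_t rcv_inact_ns(uint64_t mpl, uint64_t a, uint64_t r)
{
        return 2 * mpl + a + r + NS_PER_S;
}
```

For the same timer bounds the sender threshold always exceeds the receiver threshold by exactly one mpl, so the receiver goes stale first.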

The sender holds a per-slot ring snd_slots[RQ_SIZE] keyed by (seqno mod RQ_SIZE). Each slot tracks its retransmit entry (rxm), last-send timestamp, and retransmit flag bits: SND_RTX (a retransmit is pending or has fired, gates the next RTT sample under Karn) and SND_FAST_RXM (one-shot fast-retransmit staged for this loss event).

The receiver holds a parallel reorder ring rcv_slots[RQ_SIZE] (referred to as rq[] in prose) holding stashed out-of-order packet-buffer indexes; both FRTX and best-effort flows share this path. The invariant rwe - lwe <= RQ_SIZE holds: on each consume the receiver advances rwe by the consumed count, capping the receive window at RQ_SIZE seqno slots.

A separate fence variable rtt_lwe is bumped on every retransmit (timer-fire, SACK-driven, fast-rxm, NACK-driven) and on every seqno_rotate (Section 4) to mark the seqno range whose RTT samples MUST be discarded.


2.2. Service modes (orthogonal axes)

FRCP exposes its wire features as a vector of independent QoS axes selected at flow allocation time. All flows go through the same flow_alloc(name, qos, ...) primitive; the qosspec_t passed in determines which protocol machinery engages on the wire. This contrasts with the POSIX BSD socket model where TCP and UDP require different socket types (SOCK_STREAM / SOCK_DGRAM).

The axes:

service
0 = unordered (no FRCP engagement: raw datagrams, no PCI on the wire, UDP-equivalent at this layer); 1 = message-ordered (FRCP engaged; SDU boundaries preserved across fragmentation); 2 = stream (byte-oriented, no SDU boundaries; FRTX required)
loss
0 = lossless service requested: FRTX retransmit machinery engages (Section 8); MUST be 0 for service=2. Non-zero = best-effort, FRTX off.
ber
Bit Error Rate tolerance. 0 = error-free service requested: a CRC-32 trailer is appended after the body of DATA packets and verified on receive (added / checked outside the FRCP PCI; see Section 1.1). Non-zero = peer accepts errors; trailer omitted. SACK control packets carry a CRC-32 trailer regardless of ber; the ber gate applies to DATA only.
timeout
Peer-timeout (ms); 0 disables the keepalive timer. Independent of FRCP engagement.

Encryption is a separate per-flow attribute set at flow setup; when enabled it wraps the FRCP packet (PCI + body, plus the CRC trailer if any) under AEAD, expanding the spb by headsz + tailsz octets (nonce / tag). The CRC trailer is currently kept inside the AEAD wrap (see Section 1.1).

Reachable combinations exported by include/ouroboros/qos.h:

    Cube          service  loss  ber  Engaged
    qos_raw       0        1     1    Raw passthrough
    qos_raw_safe  0        1     0    Raw + CRC trailer
    qos_rt        1        1     1    FRCP, no FRTX, no CRC
    qos_rt_safe   1        1     0    FRCP, no FRTX, CRC
    qos_msg       1        0     0    FRCP + FRTX
    qos_stream    2        0     0    FRCP + FRTX, stream

Forced couplings actually enforced by the public API:

  • service == SVC_STREAM (2) requires loss == 0; flow_alloc / flow_accept reject the pair otherwise with -EINVAL.
  • FRTX requires FRCP engagement (service != SVC_RAW); requesting loss = 0 with service = SVC_RAW is structurally a no-op because no frcti is created.
  • The QOS_DISABLE_CRC build flag globally forces ber = 1. Note: this flag defaults to ON, so default builds ship with CRC disabled until QOS_DISABLE_CRC is set to OFF.
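
The couplings listed above can be expressed as a few predicates. The struct below is a hypothetical stand-in for qosspec_t that carries only the axes discussed in this section; the numeric SVC_* values follow the service axis definition, and the helper names are illustrative.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

enum { SVC_RAW = 0, SVC_MESSAGE = 1, SVC_STREAM = 2 };

/* Hypothetical stand-in for qosspec_t (not the real definition). */
struct qos_sketch {
        uint8_t  service;
        uint32_t loss;
        uint32_t ber;
        uint32_t timeout;
};

/* Stream service requires lossless delivery. */
int qos_check(const struct qos_sketch *q)
{
        if (q->service == SVC_STREAM && q->loss != 0)
                return -EINVAL;
        return 0;
}

/* FRCP engages on every service except raw. */
bool qos_frcp_engaged(const struct qos_sketch *q)
{
        return q->service != SVC_RAW;
}

/* FRTX needs FRCP engagement plus loss == 0. */
bool qos_frtx_engaged(const struct qos_sketch *q)
{
        return qos_frcp_engaged(q) && q->loss == 0;
}

/* The DATA CRC trailer is gated by ber alone. */
bool qos_data_crc(const struct qos_sketch *q)
{
        return q->ber == 0;
}
```

Note that loss = 0 on a raw flow passes qos_check but leaves FRTX disengaged, matching the "structurally a no-op" remark.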

Caveat: the API does NOT force ber = 0 when service != SVC_RAW. qos_rt has service = SVC_MESSAGE with ber = 1, which means the body is not CRC-protected on that cube; the HCS (Section 1.1) remains the only integrity check, and it covers the header only.

The FRCP-no-FRTX regime (service = SVC_MESSAGE, loss > 0) is meaningful and live: sequence numbering, in-order delivery, flow-control advertisement, KA, DRF rotation, and SDU fragmentation / reassembly (Section 7.2) all run. Lost packets are dropped rather than retransmitted; a permanently-lost mid-fragment is dropped via skip-past-gap once a later SDU is visible in the reorder ring.


3. Protocol parameters

Parameter Value Role
RQ_SIZE compile-time, power of 2 (default 128) Slot ring / rcv window width
START_WINDOW compile-time, power of 2 (default 128) Initial rwe-lwe after rotate
RTO_MIN MAX(250 us build-tunable, 1<<RXMQ_RES); per-flow via fccntl (FRCTSRTOMIN). Default ~1 ms with RXMQ_RES=20. RTO floor; also floored at the retransmit-wheel resolution (~1 ms by default).
MAX_RTO_MUL 20 Backoff shift cap
RACK window R MIN(reo_wnd_mult * min_RTT/4, SRTT) with MIN_REORDER_NS = 250 us floor; reo_wnd_mult scales on D-SACK, cap 20 Reorder window; per RFC 8985 sec. 6.2; reo_wnd_mult per sec. 6.2 step 4
MIN_RTT_WIN_NS 300 s (5 min, Linux tcp_min_rtt_wlen) min_RTT windowed re-anchor
REO_WND_MULT_MAX 20 (RFC 8985 sec. 6.2 step 4) reo_wnd_mult cap
REO_DECAY_PKTS 16 (RFC 8985 sec. 6.2 step 4 / RACK.reo_wnd_persist) Fresh-ACK'd seq count per halving
MAX_DSACK_LAG RQ_SIZE D-SACK sanity cap
RTT_QUARANTINE 32 (seqno steps) NewReno gate pad
SACK rate-limit SACK_MIN_GAP_NS (250 us, fixed) Min SACK gap
SACK_MAX_BLOCKS 2048 (wire cap; per-flow capped at (frag_mtu-PCI-4)/8) Per-SACK block cap
SACK_RXM_MAX 32 Per-pass staged retransmit cap
DUP_THRESH 3 (RFC 8985 default) Hybrid fast-rxm trigger (Section 8)
MDEV_MUL 2 (build-tunable via FRCT_RTO_MDEV_MULTIPLIER) mdev shift in RTO = srtt + (mdev << MDEV_MUL)
RTTP nonce 16 octets Echoed verbatim
RTTP_RING 8 In-flight probes
RTT clamp 16 * srtt Probe-sample upper bound (ACK-derived RTT samples gated by Karn / recovery only)
Cold-probe cadence 100 ms (rx-driven; see Section 12) Pre-srtt RTTP rate
DELT_RDV 100 ms RDVS emit cadence
MAX_RDV 1 s RDVS give-up
Delayed-ACK fire 2 * TICTIME (TICTIME = FRCT tick granularity, default 5 ms; 2*TICTIME = 10 ms by default) Fired after the first in-order DATA arrival; tick is build-tunable
NACK send cooldown srtt when an srtt sample exists, else 100 ms Pre-DRF NACK rate-limit
MAX_SDU 1 MiB Max reassembled SDU; configurable per flow

The per-flow fragment Maximum Transmission Unit (MTU) is computed at flow setup from the lower IPCP's mtu minus encryption headsz / tailsz and CRC trailer; there is no FRCT-level default or environment-variable override.


4. Sequence-number rotation (DRF)

The DRF (Data Run Flag) bit on an outbound packet means "this is the start of a fresh data run" and is set whenever the sender has nothing in flight (snd_cr.seqno == snd_cr.lwe).

Independently of that, if the sender has been idle longer than snd_cr.inact AND the pipe is empty (snd_cr.seqno == snd_cr.lwe), seqno_rotate() rolls a random new seqno before the send and resets

    snd_cr.seqno  = random()
    snd_cr.lwe    = snd_cr.seqno
    snd_cr.rwe    = snd_cr.seqno + START_WINDOW
    rtt_lwe       = snd_cr.seqno
    in_recovery   = false   (recovery state, see Section 8)
    recovery_high = snd_cr.seqno

The receiver, on observing rcv-side inactivity (now - rcv_cr.act > rcv_cr.inact), requires a DRF on the next DATA packet; otherwise it replies with a rate-limited NACK (see below). Non-DATA control packets pass through without the DRF requirement. On DRF the receiver releases the rq[] slots and rebases

    rcv_cr.lwe   = seqno
    rcv_cr.rwe   = seqno + RQ_SIZE
    rcv_cr.seqno = seqno

If the inactive packet has DATA but no DRF, a rate-limited NACK is fired back to the sender (cooldown per Section 3); non-DATA stale arrivals fall through to normal processing (no NACK, no drop).
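
The receiver-side decision on a stale flow can be summarised in a small classifier. This is a sketch: the enum and function names are hypothetical, the masks come from Section 1.2, and the cooldown test (now - last_nack >= cooldown) is an assumed reading of "rate-limited".

```c
#include <stdbool.h>
#include <stdint.h>

#define FRCT_DATA 0x8000
#define FRCT_DRF  0x4000
#define RQ_SIZE   128u /* default value assumed */

enum stale_action {
        STALE_PASS,      /* not stale, or non-DATA control packet */
        STALE_REBASE,    /* DRF seen: release rq[] and rebase rcv_cr */
        STALE_NACK_DROP, /* stale DATA without DRF, NACK allowed */
        STALE_DROP       /* stale DATA without DRF, NACK in cooldown */
};

enum stale_action rcv_stale_action(bool stale, uint16_t flags, uint64_t now,
                                   uint64_t last_nack, uint64_t cooldown)
{
        if (!stale)
                return STALE_PASS;
        if (flags & FRCT_DRF)
                return STALE_REBASE;
        if (!(flags & FRCT_DATA))
                return STALE_PASS;
        return now - last_nack >= cooldown ? STALE_NACK_DROP : STALE_DROP;
}

struct rcv_cr_sketch {
        uint32_t lwe;
        uint32_t rwe;
        uint32_t seqno;
};

/* Rebase the receiver record on the DRF packet's seqno. */
void rcv_rebase(struct rcv_cr_sketch *cr, uint32_t seqno)
{
        cr->lwe   = seqno;
        cr->rwe   = seqno + RQ_SIZE;
        cr->seqno = seqno;
}
```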


5. Send path

  1. If the SDU exceeds (frag_mtu - data_hdr_len), the caller (dev.c) fans it out into ceil(count / (frag_mtu - data_hdr_len)) fragments, each emitted via frcti_snd as its own DATA packet with a per-fragment role (Section 7.2); both FRTX and best-effort flows fragment. Raw flows (no FRCP engagement, qos.service == SVC_RAW) carry no PCI and return -EMSGSIZE for any SDU larger than one packet at the layer below. An SDU that fits in a single packet is sent as SOLE. frcti_snd reserves PCI head room; sets DATA, plus DRF when the pipe is empty (snd_cr.seqno == snd_cr.lwe).
  2. seqno_rotate() if past sender inactivity and the pipe is empty (Section 4).
  3. Advertise FC (pci.window = frcti_advert_rwe(frcti), i.e. rcv_cr.rwe clamped to rcv_cr.lwe + ring_seq_cap in stream mode) when the receiver side is recent: now - rcv_cr.act < rcv_cr.inact.
  4. Reliable mode (FRTX): leave snd_cr.lwe where it is; reset the slot at RQ_SLOT(seqno) (snd_slots[p].time = now, snd_slots[p].flags = 0); queue an rxm_entry (saves a packet copy, arms a wheel timer at now + (rto << rto_mul)). Piggyback ACK (pci.ackno = rcv_cr.lwe) while the a-timer for the most recent received DATA packet has not yet expired (now - rcv_cr.act <= t_a); on piggyback, set rcv_cr.seqno = rcv_cr.lwe so the next delayed-ACK fire is suppressed. See Section 8 for t_a / t_r semantics.
  5. Best-effort mode (no FRTX): advance snd_cr.lwe immediately (snd_cr.lwe = snd_cr.lwe + 1, snd_cr.rwe = snd_cr.lwe + RQ_SIZE); no retransmit state. No send-side RTT probe is armed in this mode (rtt_probe_arm requires an in-flight seqno, which best-effort never has); the rx-driven cold seeder in frcti_rcv is the only probe path.
  6. In reliable mode, optionally arm an RTT probe (Section 12).
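
Two of the timing decisions in step 4 can be sketched as below. The function names are hypothetical; clamping rto_mul to MAX_RTO_MUL inside the deadline helper is an assumption about where the backoff cap from Section 3 is applied.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_RTO_MUL 20 /* backoff shift cap from Section 3 */

/* Retransmit deadline: now + (rto << rto_mul), with the shift capped. */
uint64_t rxm_deadline(uint64_t now, uint64_t rto, unsigned rto_mul)
{
        if (rto_mul > MAX_RTO_MUL)
                rto_mul = MAX_RTO_MUL;
        return now + (rto << rto_mul);
}

/* Piggyback an ACK while the a-timer of the last DATA is still running. */
bool should_piggyback_ack(uint64_t now, uint64_t rcv_act, uint64_t t_a)
{
        return now - rcv_act <= t_a;
}
```

Each unit of rto_mul doubles the wait, so a flow at rto_mul = 2 retransmits after four base RTOs.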


6. Receive path

6.1. Early-exit dispatch

Keepalive (KA), RTT probe (RTTP), pre-DRF NACK, and rendezvous (RDVS) packets short-circuit out of frcti_rcv before the locked main path; each handler takes its own lock internally.

      incoming packet
            |
            v
       +---------+
       | KA?     |---yes--> ka_rcv  ; return
       +---------+
            |no
            v
       +---------+
       | RTTP?   |---yes--> rttp_rcv; return
       +---------+
            |no
            v
       +---------+
       | NACK?   |---yes--> nack_rcv; return  (see Section 9)
       +---------+
            |no
            v
       +---------+
       | RDVS?   |---yes--> rdv_rcv ; return  (reply bare FC, ackno=0)
       +---------+
            |no
            v
       acquire wrlock; enter locked main path
KA
refresh t_ka_rcv, honour piggybacked ACK.
RTTP
probe (echo back nonce) or echo (verify nonce, sample RTT).
NACK
pre-DRF, sender-side handler. See Section 9.
RDVS
reply with a bare FC packet (ackno = 0); rdlock only.
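
The dispatch order in the diagram amounts to a fixed-priority test on the flags field. The sketch below uses the masks from Section 1.2; the enum and function names are hypothetical, and the actual handlers (ka_rcv, rttp_rcv, nack_rcv, rdv_rcv) are only named in the comments.

```c
#include <stdint.h>

enum {
        FLAG_NACK = 0x1000,
        FLAG_RDVS = 0x0400,
        FLAG_RTTP = 0x0020,
        FLAG_KA   = 0x0010
};

enum early_exit {
        EXIT_KA,   /* ka_rcv */
        EXIT_RTTP, /* rttp_rcv */
        EXIT_NACK, /* nack_rcv */
        EXIT_RDVS, /* rdv_rcv */
        EXIT_NONE  /* fall through to the locked main path */
};

/* Test the flags in the same order as the diagram: KA, RTTP, NACK, RDVS. */
enum early_exit early_dispatch(uint16_t flags)
{
        if (flags & FLAG_KA)
                return EXIT_KA;
        if (flags & FLAG_RTTP)
                return EXIT_RTTP;
        if (flags & FLAG_NACK)
                return EXIT_NACK;
        if (flags & FLAG_RDVS)
                return EXIT_RDVS;
        return EXIT_NONE;
}
```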


6.2. Locked main path

Steps below run with the per-flow frcti.lock held for writing (pthread_rwlock_wrlock) unless noted.

rcv_inact_check
Only meaningful when the receive side is stale. On DRF (Data Run Flag): release rq[] slots, rebase rcv_cr, continue. On stale DATA without DRF: fire a pre-DRF NACK if cooldown allows (Section 9), then discard the packet; on cooldown, drop without sending a NACK (a pending cumulative ACK from drop_packet may still go out). Non-DATA, non-DRF arrivals bypass rcv_inact_check entirely; pure-DRF stale arrivals fall through after the DRF rebase branch.
DATA-only act refresh
Refresh rcv_cr.act only when FRCT_DATA is set, so that non-DATA packets never block the next DRF rebase.
Wire-dup gate
Before flag-driven dispatch, drop wire-duplicate ACKs and wire-duplicate DATA (is_dup_ack / is_dup_data). The DATA check is bypassed for FRCT_RXM-bearing arrivals so the piggybacked ACK / SACK / FC carried on a retransmitted DATA at an already-ACK'd seqno is still applied; the stale-in-window branch below then drops the packet.
ACK
Drop ACKs whose ackno falls outside [snd_cr.lwe, snd_cr.seqno]. If ackno == snd_cr.lwe (non-advancing cumulative ACK), drive RACK fast-retransmit consideration (Section 8). Otherwise: advance snd_cr.lwe = ackno; collapse rto_mul to 0 (Karn-gated by SND_RTX on the just-acknowledged slot, the old head-of-line); reset dup_thresh to 0; update t_latest_ack to the send-time of the slot at ackno-1 (consumed by RACK and SACK below); decay reo_wnd_mult per RFC 8985 sec. 6.2 step 4; exit NewReno-careful recovery (see Section 8) on ackno >= recovery_high or ackno == snd_cr.seqno; and feed an RTT sample if eligible (Section 12).
SACK
Walk the block list. For each block (a present range above lwe) NULL out snd_slots[k].rxm, clear the slot's per-send flags, and advance t_latest_ack to the latest send-time covered (the Forward Acknowledgement / fack equivalent, Mathis & Mahdavi 1996); the first block whose start clamps to snd_cr.lwe skips this fack update so that a head-of-line clamp does not falsely advance fack. For un-SACKed gaps below hi_sacked, stage a retransmit per slot that is (1) still owned (rxm != NULL), (2) not already SND_FAST_RXM, (3) not aged out past t_r, and (4) either outside the RACK reorder window R OR with dup_thresh >= DUP_THRESH (the RFC 8985 sec. 6.2 hybrid trigger). Mark the slot SND_FAST_RXM and NULL the rxm at stage time. Capped at SACK_RXM_MAX staged retransmits per receive pass; what's left rides the next SACK.
FC
Bump snd_cr.rwe (clamped to lwe + RQ_SIZE, never shrinks) and mark window open.
DATA
Bounds-check seqno against window. On stale-dup (before(seqno, rcv_cr.lwe)), set rcv_cr.seqno = seqno to force a fresh ACK on the next ack_snd, then drop. On accept: both FRTX and best-effort stash the packet-buffer index into rq[seqno mod RQ_SIZE]. Fragments stash unchanged - the role bits are inspected only at consume time (Section 7.2). On out-of-order arrival, build a SACK reply if not rate-limited (per Section 3) and not deduplicated against the previous (rcv_cr.lwe, n_blocks) pair; D-SACK reports always bypass the dedup. If both rate-limit and dedup suppress the reply, neither SACK nor delayed-ACK fires (the sender picks up the gap on its next ACK). On in-order arrival, arm the delayed-ACK timer.
drop_packet exit
Releases the per-packet shared-memory buffer (spb), then calls ack_snd synchronously after the spb release to surface any pending cumulative ACK.
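
The FC step (bump snd_cr.rwe, clamp to lwe + RQ_SIZE, never shrink) can be sketched with the modular comparators. The function name is hypothetical and RQ_SIZE uses the default from Section 3.

```c
#include <stdbool.h>
#include <stdint.h>

#define RQ_SIZE 128u /* default value assumed */

bool after(uint32_t a, uint32_t b)
{
        return (int32_t)(b - a) < 0;
}

/* New sender-side right window edge after an FC advertisement. */
uint32_t fc_update_rwe(uint32_t lwe, uint32_t rwe, uint32_t window)
{
        uint32_t cap = lwe + RQ_SIZE;

        if (after(window, cap))
                window = cap;   /* clamp to lwe + RQ_SIZE */
        /* Only ever move the edge forward. */
        return after(window, rwe) ? window : rwe;
}
```

An advertisement that lies behind the current edge is ignored, which keeps a reordered or stale FC packet from closing the window.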


7. Read path and reassembly

7.1. Read path

flow_read returns a full reassembled SDU (Service Data Unit) via frcti_consume on every FRCP SDU-mode flow (FRTX or best-effort); stream-mode is covered in Section 16. An incomplete head-of-line (HoL) run yields -EAGAIN; an oversized run yields -EMSGSIZE (the run is dropped so the flow does not stall). On best-effort flows, a permanently-lost mid-fragment is dropped as soon as a later complete SDU becomes visible in the ring (skip-past-gap, Section 7.2).

Raw flows carry no frcti, so flow_read returns the next pending packet-buffer index directly, with no role-bit inspection. (Raw service is selected via qos.service == SVC_RAW at flow allocation, which suppresses frcti creation.)

frcti_pdu_ready is the no-advance peek used by fevent (the Ouroboros flow-event multiplexer, the poll(2)-equivalent on flows). It returns ready only when the head-of-line run is complete and the lead packet (a Protocol Data Unit, here one FRCP packet) is present at rcv_cr.rwe - RQ_SIZE; any other state (including the best-effort skip-past-gap case) returns not ready, and frcti_consume is left to drop the broken prefix and re-inspect.


7.2. Fragmentation and reassembly

Send side (flow_write_frag). An SDU larger than (frag_mtu - PCI) is split into ceil(count / (frag_mtu - PCI)) fragments; each fragment is its own FRCP packet with its own seqno and a per-fragment role flag pair (Section 1.2). Roles are assigned at emit time:

Condition Role
n=1 SOLE
i=0 FIRST
i=n-1 LAST
else MID

A mid-loop allocation or transmit failure may yield a partial write: the call returns the bytes already enqueued (off > 0) or the underlying error (off == 0). Best-effort flows fragment identically; on the receiver, a partial run with a permanently-lost fragment is dropped when a later complete SDU is visible in the ring (see skip-past-gap below). Raw flows carry no PCI and refuse anything larger than the layer's user MTU (-EMSGSIZE).
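The fragment count and the role table above can be illustrated with a minimal sketch. The names frag_count, frag_role_at, and enum frag_role are hypothetical and not taken from frct.c; payload stands for (frag_mtu - PCI), and a non-empty SDU is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical role names mirroring the table; not the FRCT symbols. */
enum frag_role { ROLE_SOLE, ROLE_FIRST, ROLE_MID, ROLE_LAST };

/* ceil(count / payload), where payload = frag_mtu - PCI. */
static size_t frag_count(size_t count, size_t payload)
{
        assert(count > 0 && payload > 0);
        return count / payload + (count % payload != 0);
}

/* Role of fragment i out of n; the n == 1 case takes precedence. */
static enum frag_role frag_role_at(size_t i, size_t n)
{
        assert(n > 0 && i < n);
        if (n == 1)
                return ROLE_SOLE;
        if (i == 0)
                return ROLE_FIRST;
        if (i == n - 1)
                return ROLE_LAST;
        return ROLE_MID;
}
```

Checking n == 1 before i == 0 matters: a single-fragment SDU has i == 0 and i == n - 1 at once, and must be marked SOLE rather than FIRST or LAST.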

Wire-level recovery is fragment-agnostic on FRTX flows: each fragment's seqno flows through SACK / RACK / RTO / NACK exactly as for a SOLE DATA packet, and reassembly does not re-enter the loss-detection path. Best-effort flows run the same seqno machinery (DRF, FC, ACK piggyback, pre-DRF NACK emit) but queue no rxm state at the sender, so a lost MID is unrecoverable; skip-past-gap handles it (below).

Receive side. Fragments stash into rq[seqno] unchanged; role bits are read only at consume time. frag_run_inspect, called from frcti_consume, walks the ring starting at the oldest still-undelivered seqno base = rcv_cr.rwe - RQ_SIZE (equal to rcv_cr.lwe only when no partial run is in progress; during a partial run lwe has already advanced past base). It produces one of three outcomes:

Outcome Cause
DELIVER (n) rq[base]=SOLE (n=1), or rq[base]=FIRST and a LAST follows in slots [base+1..base+n-1] with all intermediate roles in {MID,FIRST,LAST} contiguous.
DROP (n) rq[base] is MID or LAST without a preceding FIRST (n=1); a FIRST..[non-LAST]..new-FIRST or new-SOLE mid-run (drop the broken prefix with n = run length minus 1, so the new FIRST/SOLE stays); or, on best-effort flows, a gap at base with a FIRST/SOLE later in the ring (drop up to the new run start).
NOT_READY rq[base] absent or FIRST..[non-LAST] with no later FIRST/SOLE in the ring (FRTX waits for retx; best-effort waits for arrival).

DELIVER triggers frag_gather: a scatter-gather memcpy of the n consecutive fragments at rq[base..base+n-1] directly into the caller's buffer; each per-packet shared-memory buffer (spb) is released and rwe advances by n. lwe was already advanced incrementally as each contiguous fragment arrived; frag_gather only restores the fixed-width invariant rwe == lwe + RQ_SIZE. No intermediate reassembly buffer is allocated.

DROP advances rwe past the broken prefix (releasing the spbs) and pulls lwe up to the new trailing edge if needed; the next consume retries from the new base. Oversize or arithmetically overflowing delivery (sum of fragment lengths > max_rcv_sdu, sum > caller's buffer, or running-sum overflow) also drops the run with -EMSGSIZE.

Skip-past-gap (best-effort only). On FRTX, a gap in the run means "waiting for retransmit" and frag_run_inspect returns NOT_READY. On best-effort flows the gap is permanent, so frag_run_inspect scans forward in the ring for the next FIRST or SOLE; if one is visible within RQ_SIZE, it returns DROP for the broken prefix and the consume loop retries at the new lwe. Memory hold is bounded by RQ_SIZE; the partial run is released on the next consume call once a later complete run exists. Voice-like flows (one SOLE per SDU) see no extra wait: any later SOLE makes the prior gap droppable immediately.
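The three outcomes and the skip-past-gap rule can be sketched over a flat array of slot roles. This is a simplified illustration, not the frag_run_inspect code: run_inspect, next_run_start, and the enums are hypothetical names, rq[0] stands for the slot at base, ROLE_ABSENT models an empty slot, and the length and overflow checks that lead to -EMSGSIZE are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot states; ROLE_ABSENT models an empty rq[] slot. */
enum role { ROLE_ABSENT, ROLE_SOLE, ROLE_FIRST, ROLE_MID, ROLE_LAST };
enum outcome { NOT_READY, DELIVER, DROP };

/* Index of the next FIRST/SOLE at or after from, or len if none. */
static size_t next_run_start(const enum role *rq, size_t len, size_t from)
{
        size_t j;

        for (j = from; j < len; ++j)
                if (rq[j] == ROLE_FIRST || rq[j] == ROLE_SOLE)
                        return j;
        return len;
}

/* rq[0] is the slot at base; *n is written for DELIVER and DROP only. */
static enum outcome run_inspect(const enum role *rq, size_t len,
                                bool best_effort, size_t *n)
{
        size_t i;
        size_t j;

        assert(rq != NULL && n != NULL && len > 0);

        if (rq[0] == ROLE_SOLE) {
                *n = 1;
                return DELIVER;
        }
        if (rq[0] == ROLE_MID || rq[0] == ROLE_LAST) {
                *n = 1;                 /* orphan without a FIRST */
                return DROP;
        }
        /* rq[0] is FIRST or ABSENT: walk until the run resolves. */
        for (i = rq[0] == ROLE_FIRST ? 1 : 0; i < len; ++i) {
                switch (rq[i]) {
                case ROLE_MID:
                        continue;
                case ROLE_LAST:
                        *n = i + 1;
                        return DELIVER;
                case ROLE_FIRST:
                case ROLE_SOLE:
                        *n = i;         /* keep the new run start */
                        return DROP;
                case ROLE_ABSENT:
                        if (!best_effort)
                                return NOT_READY; /* wait for retx */
                        j = next_run_start(rq, len, i + 1);
                        if (j == len)
                                return NOT_READY;
                        *n = j;         /* skip-past-gap */
                        return DROP;
                }
        }
        return NOT_READY;
}
```

The same input (FIRST, gap, SOLE) therefore yields NOT_READY on an FRTX flow and DROP with n = 2 on a best-effort flow, which is the only place where the two service classes diverge in this walk.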

The choice to defer reassembly to consume time keeps the receive path zero-copy: fragments stay in the shared-memory ring until the application pulls, and the SDU lands directly in the caller's buffer.


8. Retransmission

FRCP is bounded by two delta-t-derived timers (Watson 1981, see Section 15):

  • t_a (a-timer): upper bound on ACK delay. An ACK for a received DATA packet MUST be emitted within t_a of receipt; an attempt to send an ACK after the a-timer has expired is suppressed (the sender's RTO is already in motion).
  • t_r (r-timer): upper bound on retransmission. A given DATA packet MUST NOT be retransmitted after t_r has elapsed since its first send (t0); when the bound is hit, the flow is declared down (raising the Ouroboros asynchronous flow condition ACL_FLOWDOWN, which marks the flow dead to both endpoints) rather than retransmitted again.

Each in-flight FRTX seqno owns one rxm_entry, armed in a hashed timing wheel; the wheel deadline is the slot's next eligible retransmit time.

RTO timer
On fire (rxm_due), re-emit with FRCT_RXM, mark SND_RTX (Karn-suppress next ACK's RTT sample), and (for the head-of-line (HoL) slot only) bump rto_mul up to MAX_RTO_MUL. Wheel deadline is t_send + (rto << rto_mul). Re-armed unless consumed. The RTO timer also clears SND_FAST_RXM (re-arming fast-retransmit eligibility), resets reo_wnd_mult to 1 on a HoL fire (RFC 8985 sec. 6.2 step 4 reset clause), and marks the flow ACL_FLOWDOWN if its frct_tx call fails.
r-timer guard
Before any retransmit attempt, check (now - t0) against t_r. If exceeded, the slot is no longer eligible for retransmit. Only the RTO timer (rxm_due) treats r-timer expiry as terminal: it marks the flow ACL_FLOWDOWN (peer unreachable). Fast-retransmit, SACK-driven retransmit, and NACK-driven head-of-line re-emit silently skip aged-out slots and defer the flow-down decision to the next RTO fire.
Fast retransmit (hybrid trigger, RFC 8985 sec. 6.2)
On a non-advancing cumulative ACK with the scoreboard advanced, fire one fast retransmit when EITHER (a) the head-of-line slot's latest send is older than the RACK reorder window R (Section 3) and not yet aged out, OR (b) the SACK dup-thresh count above snd_cr.lwe reaches DUP_THRESH (= 3, RFC 8985 sec. 6.2 step 4). Fires at most once per non-advancing cumulative-ACK value, gated by rack_fired_lwe (the snd_cr.lwe at which fast-retransmit last fired). Set SND_FAST_RXM on the slot (one-shot per-slot gate) and enter NewReno-style careful recovery (see NewReno below in this section).
The RACK reorder window R uses the RFC 8985 sec. 6.2 form R = MIN(reo_wnd_mult * min_RTT / 4, SRTT) with a MIN_REORDER_NS = 250 us floor. Before the first RTT sample seeds min_rtt, R falls back to MIN(reo_wnd_mult * SRTT / 4, SRTT), still floored at MIN_REORDER_NS (consistent with the windowed-minimum fallback described in Section 12). min_rtt is a windowed minimum over the last MIN_RTT_WIN_NS = 5 min of RTT samples (matches the Linux tcp_min_rtt_wlen default) so a route change to a longer path eventually re-anchors the reorder window without relying on reo_wnd_mult growth alone.
SACK-driven retransmit
For each gap below hi_sacked whose slot is (1) still owned, (2) not already SND_FAST_RXM, (3) not aged out past t_r, and (4) either outside the RACK window R OR with dup_thresh >= DUP_THRESH (same hybrid as fast-retransmit, see Section 6.2), re-emit. Each SACK-driven retransmit re-arms a fresh rxm so a lost retransmit can still be recovered by its own RTO timer.
NewReno
On entry, recovery_high = snd_cr.seqno + RTT_QUARANTINE. Exit when ackno >= recovery_high or ackno == snd_cr.seqno (the latter means everything sent has been acknowledged). seqno_rotate also clears recovery.
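The reorder window and the hybrid trigger described above can be sketched as follows. This is an illustration under stated assumptions: the function signatures and the fast_rxm_due name are hypothetical, all times are nanoseconds, the r-timer guard is read as blocking both trigger branches, and the one-shot gates (rack_fired_lwe, SND_FAST_RXM) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIN_REORDER_NS 250000ULL /* 250 us floor, from the text */
#define DUP_THRESH     3U

/* R = MIN(reo_wnd_mult * min_RTT / 4, SRTT), floored at MIN_REORDER_NS. */
static uint64_t rack_reorder_window(uint64_t min_rtt, uint64_t srtt,
                                    uint32_t reo_wnd_mult)
{
        /* min_rtt == 0 is the unset sentinel: fall back to SRTT. */
        uint64_t base = min_rtt != 0 ? min_rtt : srtt;
        uint64_t r;

        assert(reo_wnd_mult >= 1);
        r = (uint64_t) reo_wnd_mult * base / 4;
        if (r > srtt)
                r = srtt;
        if (r < MIN_REORDER_NS)
                r = MIN_REORDER_NS;
        return r;
}

/* Hybrid trigger: RACK timeout OR SACK dup-thresh, never past t_r. */
static bool fast_rxm_due(uint64_t now, uint64_t t_snd, uint64_t t0,
                         uint64_t t_r, uint64_t r, unsigned int sack_dups)
{
        assert(now >= t_snd && t_snd >= t0);
        if (now - t0 > t_r)
                return false;           /* r-timer guard: slot aged out */
        if (now - t_snd > r)
                return true;            /* (a) older than reorder window */
        return sack_dups >= DUP_THRESH; /* (b) dup-thresh reached */
}
```

Applying the floor after the SRTT cap means R can exceed SRTT when SRTT itself drops below 250 us, which matches the stated purpose of the floor as a guard against srtt collapse.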


9. Pre-DRF NACK

The two sides have different inactivity thresholds (snd_cr.inact > rcv_cr.inact), so a receiver can detect "stale data run" before the sender's own DRF logic kicks in. NACK is the receiver-driven nudge that asks the sender to re-transmit the head of the run.

Send (frcti_nack_snd, called by frcti_rcv when rcv_inact_check returns FRCT_INACT_NEED_NACK)
When an incoming DATA packet has no DRF and rcv-side activity is older than rcv_cr.inact, the receiver emits a bare packet with flags = FRCT_NACK and seqno = arrival_seqno - 1 (informational only, not consulted by the receive handler). The cooldown in Section 3 rate-limits the burst. Non-DATA non-DRF arrivals bypass rcv_inact_check entirely; non-DATA DRF still rebases via the DRF branch.
Receive (frcti_nack_rcv)
Dispatched in the early-exit branch (Section 6.1), before rcv_inact_check. The sender copies the head-of-line (HoL) rxm packet, marks the slot SND_RTX | SND_FAST_RXM (Karn-suppress next ACK, one-shot fast-rxm gate), sets rtt_lwe = snd_cr.lwe + 1, and re-emits via fast_rxm_send with FRCT_RXM and a refreshed ackno. The original rxm_entry and its RTO timer are left armed - the NACK emit is additive to the normal retransmit machinery, not a replacement. No-op if nothing is in flight, the HoL slot has aged past t_r, or the HoL rxm pointer has been cleared by SACK or RACK.

NACK has exactly one role: lost first-of-run (DRF) packet recovery. Until the DRF packet arrives, the receiver cannot rebase its window, so any subsequent in-flight packets look stale to the receiver. The NACK fires the moment a stale receiver sees DATA without DRF, telling the sender to re-emit the head-of-line (DRF) packet at NACK-cooldown latency rather than waiting for the initial RTO (which is the configured default until srtt is seeded by the first probe round-trip). Mid-stream loss is NOT NACK-driven; it is recovered by the sender's RTO, fast retransmit, and SACK-driven retransmit paths (Section 8) only.

The existing rxm_entry and its RTO timer are left armed on a NACK re-emit, so the RTO path remains the eventual fallback.

10. Cumulative + selective ACK

Cumulative ACK is ackno = rcv_cr.lwe. On out-of-order arrival the receiver also emits a SACK packet (Section 1.3) whose payload lists present blocks above lwe (analogous to TCP SACK / QUIC ACK ranges). SACKs are rate-limited per Section 3 and suppressed when neither lwe nor block count has changed since the last SACK.

D-SACK reports (RFC 2883) are emitted in-band as block[0] of an otherwise normal SACK frame (see Section 1.3 for the encoding). Two receiver triggers arm a pending D-SACK report (single-slot, latest-wins):

  • DATA arrival with seqno < rcv_cr.lwe, both wire-dup (no RXM, is_dup_data path) and retransmit (RXM, post-FC branch) (RFC 2883 sec. 4.1.1, full duplicate)
  • rq_accept conflict, slot already occupied in [lwe, rwe) (RFC 2883 sec. 4.1.2, partial duplicate)

When a D-SACK is pending and the standard scoreboard SACK would be suppressed by dedup or rate-limit, the report is still emitted as a stand-alone SACK frame through the normal ack_snd path. A pending D-SACK report bypasses dedup and the TICTIME rate-limit; the a-timer suppression on rcv inactivity still applies.

Bare ACKs are deferred via a per-flow delayed-ACK timer (one in flight at a time, atomic test-and-set dedup; fires per Section 3 after the first in-order arrival). Suppressed if (1) no new seqno, (2) rcv side is inactive (older than t_a), or (3) the sender just sent within TICTIME. A pending D-SACK ride-through bypasses (1) and (3); the a-timer gate (2) is unconditional.
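The three suppression gates and the D-SACK bypass can be condensed into one predicate. A minimal sketch, assuming a hypothetical ack_should_send helper with all timestamps in the same unit; tictime is passed in because the TICTIME value is not given in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the delayed-ACK fire should put an ACK on the wire. */
static bool ack_should_send(bool new_seqno, bool dsack_pending,
                            uint64_t now, uint64_t rcv_act, uint64_t t_a,
                            uint64_t snd_act, uint64_t tictime)
{
        assert(now >= rcv_act && now >= snd_act);

        if (now - rcv_act > t_a)
                return false;   /* (2) a-timer gate, unconditional */
        if (dsack_pending)
                return true;    /* bypasses (1) and (3) */
        if (!new_seqno)
                return false;   /* (1) nothing new to acknowledge */
        if (now - snd_act < tictime)
                return false;   /* (3) sender just sent */
        return true;
}
```

Evaluating the a-timer gate first is what makes it unconditional: a pending D-SACK report cannot override it.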


11. Flow control

The receiver advertises rwe in every FC field. The sender treats its snd_cr.rwe as the absolute right edge: when snd_cr.seqno >= snd_cr.rwe the window is closed and flow_write yields. While closed, the sender periodically emits RDVS (rendezvous) packets (cadence DELT_RDV); the receiver replies with a bare FC packet (ackno = 0) that reopens the window. Once the window has been closed for longer than MAX_RDV the sender stops emitting RDVS but does not tear the flow down - the writer keeps blocking until either a peer-driven FC arrives or the KA (keepalive) / r-timer marks the flow.

rwe is clamped to lwe + RQ_SIZE on receipt and MUST NOT shrink: a backward rwe is silently clamped to the current snd_cr.rwe; the FC packet still reopens the window.
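The window test and the non-shrink clamp can be sketched with the modular comparator from the Notation section. The helper names snd_window_closed and snd_rwe_update are hypothetical, and rq_size is a parameter because the RQ_SIZE value is not stated here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Modular comparator from the Notation section. */
static bool before(uint32_t a, uint32_t b)
{
        return (int32_t)(a - b) < 0;
}

/* Window is closed when seqno >= rwe in modular arithmetic. */
static bool snd_window_closed(uint32_t seqno, uint32_t rwe)
{
        return !before(seqno, rwe);
}

/* New snd_cr.rwe after an FC field: clamp to lwe + RQ_SIZE, never shrink. */
static uint32_t snd_rwe_update(uint32_t cur_rwe, uint32_t lwe,
                               uint32_t adv_rwe, uint32_t rq_size)
{
        uint32_t max_rwe = lwe + rq_size;

        assert(rq_size > 0 && rq_size < 0x80000000U);
        if (before(max_rwe, adv_rwe))
                adv_rwe = max_rwe;      /* upper clamp */
        if (before(adv_rwe, cur_rwe))
                return cur_rwe;         /* backward rwe is ignored */
        return adv_rwe;
}
```

Using before() rather than a plain < keeps both checks correct across a seqno wrap, for example seqno = 0xFFFFFFF0 against rwe = 0x10.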


12. RTT estimation

Active RTTP probes (Section 1.4) carry a 32-bit probe_id (0 reserved) and a 16-byte random nonce echoed verbatim - defends against spoofed replies. A ring of RTTP_RING in-flight probes is kept; an echo whose (id, nonce) doesn't match the ring slot is dropped. A single RTTP sample is clamped to RTT_CLAMP_MUL * srtt (compile-time RTT_CLAMP_MUL = 16) once srtt is seeded; the first cold-probe sample feeds rtt_update raw.

Probe arming gates:

Cold (no srtt yet)
the receive path arms at most one probe per 100 ms via frcti_rcv_probe (PROBE_DUE_COLD); arming requires an incoming packet. Active send-path arming bails while srtt == 0.
Warm (rtt_probe_arm, called from frcti_snd)
outstanding data (snd_cr.seqno > snd_cr.lwe), AND at least 2 * srtt since t_rcv_rtt (last RTT receive of any kind), AND at least srtt since t_snd_probe (last probe emit).
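The two arming gates can be written as predicates. A sketch only: rtt_probe_cold_due, rtt_probe_warm_due, and PROBE_COLD_NS are hypothetical names, times are nanoseconds, and the cold gate models just the 100 ms spacing (the requirement that an incoming packet triggers it is left to the caller).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PROBE_COLD_NS 100000000ULL /* 100 ms spacing while srtt == 0 */

/* Cold gate: at most one probe per 100 ms, checked on packet arrival. */
static bool rtt_probe_cold_due(uint64_t now, uint64_t t_snd_probe)
{
        assert(now >= t_snd_probe);
        return now - t_snd_probe >= PROBE_COLD_NS;
}

/* Warm gate: outstanding data AND 2 * srtt quiet AND srtt since probe. */
static bool rtt_probe_warm_due(uint32_t seqno, uint32_t lwe, uint64_t srtt,
                               uint64_t now, uint64_t t_rcv_rtt,
                               uint64_t t_snd_probe)
{
        assert(now >= t_rcv_rtt && now >= t_snd_probe);

        if (srtt == 0)
                return false;   /* send-path arming bails while cold */
        if ((int32_t)(seqno - lwe) <= 0)
                return false;   /* nothing outstanding */
        if (now - t_rcv_rtt < 2 * srtt)
                return false;   /* a recent RTT receive is fresh enough */
        return now - t_snd_probe >= srtt;
}
```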

Sample feeds either Linux's asymmetric mdev estimator (FRCT_LINUX_RTT_ESTIMATOR, default ON) or RFC 6298 symmetric EWMA (compile option). srtt is floored at 10 ms when seeded from a hint, at 1 us after every update (including the first seeding sample); mdev floored at 100 ns.

RTO = max(rto_min, 2 * srtt, srtt + (mdev << MDEV_MUL))

(the 2 * srtt floor is an FRCT addition not in RFC 6298). Effective wheel deadline capped per Section 3.
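As a worked illustration of the formula, the sketch below computes the RTO and the backed-off wheel deadline t_send + (rto << rto_mul) from Section 8. The function names are hypothetical; rto_min and mdev_mul are parameters because their configured values are not fixed in this section, and the Section 3 cap on the wheel deadline is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* RTO = max(rto_min, 2 * srtt, srtt + (mdev << mdev_mul)), all in ns. */
static uint64_t rto_calc(uint64_t srtt, uint64_t mdev, uint64_t rto_min,
                         unsigned int mdev_mul)
{
        uint64_t rto;

        assert(mdev_mul < 32);
        rto = srtt + (mdev << mdev_mul);
        if (rto < 2 * srtt)
                rto = 2 * srtt;         /* FRCT floor, not in RFC 6298 */
        if (rto < rto_min)
                rto = rto_min;
        return rto;
}

/* Wheel deadline with exponential backoff applied. */
static uint64_t rxm_deadline(uint64_t t_send, uint64_t rto,
                             unsigned int rto_mul)
{
        assert(rto_mul < 32);
        return t_send + (rto << rto_mul);
}
```

With srtt = 10 ms, mdev = 1 ms, and mdev_mul = 2, the variance term gives 14 ms, so the 2 * srtt floor decides and the RTO is 20 ms; with mdev = 5 ms the variance term (30 ms) wins.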

ACK-derived samples (frcti_ack_rcv -> rtt_sample_eligible), beyond the cum-ACK advance gate in frcti_ack_rcv (ackno > lwe and ackno <= seqno), require all of: not in recovery; ACK packet does not carry FRCT_RXM; HoL slot's SND_RTX bit clear; slot's rxm pointer non-NULL (not SACK-consumed); lwe not below the rtt_lwe fence; srtt already seeded by an RTTP probe. There is no ACK-only seeding.

Every eligible sample also feeds RACK.min_RTT (RFC 8985 sec. 6.2) via a windowed minimum: replace whenever the sample is strictly smaller OR more than MIN_RTT_WIN_NS (5 min, matches Linux tcp_min_rtt_wlen) has elapsed since the current min was set. The downward branch is immediate (faster path picked up at once); the upward branch is gated on the window (a transient queue burst does not poison the estimate, but a sustained route change to a longer path re-anchors min_RTT after at most one window). Seeded from rtt_hint at rtt_init; 0 acts as the unset sentinel and the base in rack_reorder_window falls back from min_RTT to SRTT (so R = mult * SRTT/4, capped at SRTT, floored at MIN_REORDER_NS) until the first sample. See Section 6.2.
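The windowed-minimum update rule can be sketched in a few lines. The struct and function names are hypothetical; only MIN_RTT_WIN_NS and the replace conditions come from the text, and 0 is used as the unset sentinel as described.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_RTT_WIN_NS 300000000000ULL /* 5 min in nanoseconds */

struct min_rtt {
        uint64_t val;   /* current minimum; 0 means unset */
        uint64_t t_set; /* time the current minimum was recorded */
};

/* Replace on a strictly smaller sample or once the window has elapsed. */
static void min_rtt_update(struct min_rtt *m, uint64_t sample, uint64_t now)
{
        assert(m != NULL && sample > 0 && now >= m->t_set);

        if (m->val == 0 || sample < m->val ||
            now - m->t_set > MIN_RTT_WIN_NS) {
                m->val   = sample;
                m->t_set = now;
        }
}
```

A larger sample inside the window leaves both fields untouched, so the window is measured from when the minimum was set, not from the latest sample.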


13. Liveness (keepalive)

When qs.timeout > 0 a per-flow KA (keepalive) timer is armed. Arming uses rcv_cr.act for the deadline computation:

deadline = min(snd_act + qs.timeout/4, rcv_act + qs.timeout)

(clamped to now + qs.timeout/4 if already past). The timer fires either on sender idleness (to send a KA) or on receiver idleness (to declare the peer dead). On fire (ka_snd) the peer-dead test uses max(rcv_cr.act, t_ka_rcv) so a recent KA reply counts even when no DATA has arrived:

  • If now - max(rcv_cr.act, t_ka_rcv) > qs.timeout, mark the flow ACL_FLOWPEER and notify the per-process flow-event set (proc.fqset) with FLOW_PEER.
  • Else if snd_idle > qs.timeout/4, emit a bare KA | ACK (ackno = rcv_cr.lwe) and re-arm.
  • Else just re-arm.
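The deadline computation and the three-way decision on fire can be sketched as below. The names ka_deadline, ka_fire, and enum ka_action are hypothetical; all times are assumed to share one unit, and "already past" is read as deadline <= now.

```c
#include <assert.h>
#include <stdint.h>

enum ka_action { KA_PEER_DEAD, KA_SEND, KA_REARM };

/* deadline = min(snd_act + timeout/4, rcv_act + timeout), clamped. */
static uint64_t ka_deadline(uint64_t now, uint64_t snd_act,
                            uint64_t rcv_act, uint64_t timeout)
{
        uint64_t d_snd = snd_act + timeout / 4;
        uint64_t d_rcv = rcv_act + timeout;
        uint64_t d     = d_snd < d_rcv ? d_snd : d_rcv;

        assert(timeout > 0);
        if (d <= now)
                d = now + timeout / 4;  /* already past: push forward */
        return d;
}

/* Decision taken when the keepalive timer fires. */
static enum ka_action ka_fire(uint64_t now, uint64_t snd_act,
                              uint64_t rcv_act, uint64_t t_ka_rcv,
                              uint64_t timeout)
{
        uint64_t last_rcv = rcv_act > t_ka_rcv ? rcv_act : t_ka_rcv;

        assert(now >= last_rcv && now >= snd_act);
        if (now - last_rcv > timeout)
                return KA_PEER_DEAD;    /* mark ACL_FLOWPEER */
        if (now - snd_act > timeout / 4)
                return KA_SEND;         /* bare KA | ACK, then re-arm */
        return KA_REARM;
}
```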

Note: rx_rb and tx_rb are the receive and transmit shared-memory ring buffers. The r-timer raises ACL_FLOWDOWN on both (route is broken); keepalive raises ACL_FLOWPEER on rx_rb only and notifies the flow-event set (peer is silent, writer keeps tx_rb usable) - distinct ACLs. qs.timeout == 0 disables keepalive entirely; a silent peer crash is then undetected.


14. Linger / teardown

On flow_dealloc, frcti_dealloc computes a grace timeout

max(rcv_cr.act + rcv_cr.inact, snd_cr.act + snd_cr.inact) - now

(floored at 0 and converted to seconds) and returns it; flow_dealloc forwards this to the IRMd as the dealloc grace. The IRMd, not FRCT, performs the wait. Before computing the timeout, FRCT may emit a final ACK when rcv_cr.lwe != rcv_cr.seqno (the peer has not been told the most recent cumulative ACK) AND the rcv side has been active within t_a (a-timer not aged out).
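The grace computation and the final-ACK condition can be sketched as follows. The helper names are hypothetical, timestamps are nanoseconds, and truncation toward zero in the seconds conversion is an assumption: the text does not state the rounding direction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NS_PER_S 1000000000ULL

/* max(rcv_act + rcv_inact, snd_act + snd_inact) - now, floored at 0. */
static uint64_t dealloc_grace_s(uint64_t now,
                                uint64_t rcv_act, uint64_t rcv_inact,
                                uint64_t snd_act, uint64_t snd_inact)
{
        uint64_t r   = rcv_act + rcv_inact;
        uint64_t s   = snd_act + snd_inact;
        uint64_t end = r > s ? r : s;

        if (end <= now)
                return 0;
        return (end - now) / NS_PER_S; /* assumption: truncating */
}

/* Final ACK: unreported cumulative ACK AND rcv side active within t_a. */
static bool final_ack_due(uint32_t lwe, uint32_t acked_seqno, uint64_t now,
                          uint64_t rcv_act, uint64_t t_a)
{
        assert(now >= rcv_act);
        return lwe != acked_seqno && now - rcv_act <= t_a;
}
```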

FRCTFLINGER is honoured only when snd_cr.lwe < edge, where edge = snd_fin_seqno after FIN has been sent in stream mode and snd_cr.seqno otherwise (data or FIN still in flight). The drain itself runs in flow_dealloc's while (FRCTI_LINGERING) loop, not in frcti_dealloc.

The fd is single-reader / single-writer (documented in the manpages). flow_write pumps rx_rb on every call (via flow_wait_window -> flow_drain_rx_nb) and additionally blocks on rx_rb when the send window is closed. A pure-writer thread thus consumes ACKs without a dedicated reader.


15. Heritage and adopted techniques

Delta-t (Watson, 1981) is the primary heritage; FRCP descends from the delta-t protocol family via the Recursive InterNetwork Architecture (RINA; Day, "Patterns in Network Architecture", 2008, ch. 9). Timer-based connection management (no SYN/FIN handshake, per-flow state born on first DATA and reclaimed after t_mpl + t_a + t_r of silence), the DRF marker, and the t_mpl / t_a / t_r timers all come from delta-t. See Watson, "Timer-Based Mechanisms in Reliable Transport Protocol Connection Management", Computer Networks 5 (1981).

The unified flow_alloc(name, qos, ...) primitive and its multi-axis QoS-cube argument (Section 2.2) also come from RINA (Day 2008, ch. 6; Grasa et al., "IRATI: investigating RINA as an alternative to TCP/IP", Computer Networks 92 (2015)) - reliability, ordering, CRC presence, and encryption are flow attributes, not separate sockets or protocols.

The table below summarises additional adopted techniques and their references.

FRCP mechanism Heritage Reference / note
Random new seqno on seqno_rotate TCP ISN RFC 6528 (Gont & Bellovin, 2012). QUIC PN-space reset (RFC 9000 sec. 12.3) is a structural analogue.
Cumulative ACK, left-window-edge advance TCP RFC 793 / RFC 9293
Receive window with non-shrink rule TCP RFC 793 sec. 3.7 / RFC 9293 sec. 3.8.6; RFC 1122 sec. 4.2.2.16 for the explicit non-shrink prohibition
Modular seqno arithmetic (before/after helpers) TCP RFC 793 sec. 3.3 / RFC 9293 sec. 3.4
Selective ACK block list TCP RFC 2018 (Mathis et al., 1996). Encoded as a typed FRCP packet rather than a TCP option, so framing is closer to QUIC ACK frames. D-SACK (RFC 2883) carried in-band as block[0]; see Section 1.3.
NewReno-careful recovery with recovery_high gate TCP RFC 6582 (Henderson et al., 2012); QUIC builds on the same model in RFC 9002 sec. 7.3.2. Cwnd half absent (CC in IPCP).
RACK reordering window for fast retransmit TCP RFC 8985 (Cheng et al., 2021). FRCP R = MIN(reo_wnd_mult * min_RTT / 4, SRTT) with a MIN_REORDER_NS = 250 us floor against srtt collapse; matches RFC 8985 sec. 6.2 and Linux tcp_rack_reo_wnd. DSACK-driven reo_wnd_mult (sec. 6.2 step 4) is adopted; see Section 1.3 for the wire encoding. The hybrid RACK-or-DUP_THRESH trigger from RFC 8985 sec. 6.2 step 4 is adopted (Section 8). QUIC's analogue in RFC 9002 sec. 6.1.2 uses max(srtt, latest_rtt) as the base.
Karn's algorithm: no RTT sample on retransmits, RTO-collapse freeze TCP Karn & Partridge, "Improving Round-Trip Time Estimates in Reliable Transport Protocols", SIGCOMM 1987; RFC 6298 sec. 3.
RTO formula RTO = max(RTO_MIN, srtt + (mdev << MDEV_MUL)) TCP RFC 6298 (Paxson et al., 2011). RTO_MIN = 250 us is below RFC 6298 sec. 2.4's 1 s SHOULD-floor - a recursive-layer choice.
Linux asymmetric mdev estimator (default) Linux kernel tcp_rtt_estimator() in net/ipv4/tcp_input.c; the if(delta<0) m>>=3 dampening is a kernel divergence from RFC 6298. RFC 6298 EWMA available behind a compile flag.
Delayed ACK with rate suppression TCP RFC 813 (Clark, 1982); RFC 1122 sec. 4.2.3.2; RFC 5681 sec. 4.2. Single-deadline coalescing rather than "ack-every-other-segment".
Zero-window-probe / persist-timer analogue (RDVS) TCP RFC 1122 sec. 4.2.2.17 / RFC 9293 sec. 3.8.6.1. RDVS solicits an FC reply, distinct from QUIC DATA_BLOCKED (RFC 9000 sec. 19.12), which is one-way notification. MAX_RDV give-up departs from TCP.
Multiplexed control on a single PCI SCTP / QUIC SCTP chunk bundling (RFC 9260 sec. 6.10); QUIC frame multiplexing (RFC 9000 sec. 12.4). Cleaner fit than TCP's separate-flag-bits design.
ACK ranges as multiple discontiguous acked blocks QUIC QUIC ACK frame (RFC 9000 sec. 19.3). FRCP SACK is conceptually QUIC-frame-shaped even though encoded as absolute [start,end] pairs.
Nonce-authenticated active RTT / liveness probing (RTTP) QUIC PATH_CHALLENGE PATH_CHALLENGE / PATH_RESPONSE (RFC 9000 sec. 8.2, sec. 19.17, sec. 19.18). WebRTC ICE consent-freshness (RFC 7675) is the same pattern. QUIC's nonce is 8 octets; FRCP chooses 16.
Probing distinct from keepalive QUIC KA timer answers "peer alive?", RTTP answers "path measurable?", as in QUIC PING (RFC 9000 sec. 19.2) vs PATH_CHALLENGE.
Bare KA + ACK keepalive packets QUIC / SCTP QUIC PING (RFC 9000 sec. 19.2); SCTP HEARTBEAT / HEARTBEAT-ACK (RFC 9260 sec. 8.3). SCTP HEARTBEAT also carries an opaque echoed blob, structurally similar to FRCP RTTP.
(FFGM, LFGM) fragment-role bits (Section 7.2) SCTP RFC 9260 sec. 3.3.1 DATA chunk B/E bits encode the same four states (B+E=SOLE, B-only=FIRST, neither=MID, E-only=LAST). Each fragment carries its own seqno/TSN and is independently retransmitted.
Stream byte-offset reassembly (Sections 1.5, 16) QUIC QUIC STREAM frame (RFC 9000 sec. 19.8) uses Offset + Length varints; FRCP uses fixed 32-bit start / end. One stream per flow vs QUIC's many streams multiplexed.
FIN end-of-stream marker (Sections 1.2, 16) TCP / QUIC TCP FIN flag (RFC 9293 sec. 3.1) closes one half of the byte stream; QUIC STREAM frame FIN bit (RFC 9000 sec. 19.8) does the same per stream with an immutable final-size invariance (RFC 9000 sec. 4.5: the final size is fixed once observed). FRCP's FIN consumes one packet seqno (not one byte of stream space) and is idempotent on the sender side.
Stream byte-credit flow control (Section 16) QUIC MAX_STREAM_DATA (RFC 9000 sec. 4.1, sec. 19.10). FRCP projects a per-flow byte budget onto the seqno-space rwe. Single stream per flow collapses QUIC's MAX_DATA / MAX_STREAM_DATA distinction.
Header protection (encrypted seqnos) QUIC QUIC RFC 9001 sec. 5.4 applies header protection on top of AEAD to mask the packet number. FRCP's per-flow AEAD wrap (Section 16) is wider: it encrypts the entire PCI including seqno because the IPCP below already routes, so no destination connection-ID needs to stay in clear (cf. RFC 9000 sec. 5.2).
Two-bit fragment role polarity SCTP The (FFGM, LFGM) pair follows SCTP B/E (begin = 1 / end = 1) rather than IPv4 MF (RFC 791 sec. 3.2), which has the inverse polarity (MF = 1 means NOT last).
Orthogonal reliability / ordering axes (Section 2.2) SCTP PR-SCTP (RFC 3758, per-message partial reliability) and SCTP DATA U-bit (RFC 9260 sec. 3.3.1, per-message unordered) are the closest precedents for decoupling reliability from ordering; FRCP sets them per-flow rather than per-message.
Orthogonal CRC (qs.ber == 0) UDP-Lite RFC 3828 (Larzon et al., 2004) lets the sender pick a per-packet Checksum Coverage and the receiver enforce a locally configured minimum (no in-band negotiation; sec. 3.1, sec. 3.3). FRCP gates a full CRC trailer on qs.ber == 0 at flow setup. Contrast TCP / SCTP (mandatory checksum) and QUIC (AEAD subsumes CRC).
Setup-time service negotiation DCCP / SCTP / QUIC DCCP Service Codes (RFC 4340 sec. 8.1.2, RFC 5595); SCTP INIT parameters (RFC 9260 sec. 3.3.2); QUIC transport parameters (RFC 9000 sec. 7.4). All negotiate service properties at connection setup; only RINA's QoS cube exposes them as an orthogonal vector.


15.1. Original to FRCP (no clean prior art)

  • Pre-DRF NACK (Section 9): receiver-driven nudge exploiting snd_cr.inact > rcv_cr.inact. Closest analogues are SCTP Gap Ack Blocks (RFC 9260 sec. 3.3.4) and DCCP Ack Vector (RFC 4340 sec. 11.4) - both let the receiver describe gaps to the sender, but neither targets the cross-epoch / pre-DRF case.
  • MAX_RDV window-probe give-up: neither TCP (persist-timer probes until application or R2 abort, RFC 9293 sec. 3.8.6.1) nor QUIC has an explicit FC-give-up counter. A recursive-network choice: outer layers can drop the flow.
  • Skip-past-gap reassembly (Section 7.2): SCTP fragments and reassembles every flow regardless of reliability/ordering, using its own per-stream reassembly queue; QUIC fragments via STREAM offsets. FRCP fragments best-effort flows too, but the receiver drops the broken prefix the moment a later run-start (FIRST or SOLE role) is visible inside the RQ_SIZE-wide reorder ring - no IP-frag-style timeout, no SCTP-style explicit abort. If no later run-start arrives within the ring, frag_run_inspect returns NOT_READY and the partial run keeps its slots; the next inspect retries. The trade-off: a permanently-lost MID in a long isolated run holds slots until either a later FIRST/SOLE appears in the ring or the writer stops, at which point the slots are reclaimed on flow teardown.
  • Reassembly deferred to consume time (Section 7.2), message mode only (qos.service == SVC_MESSAGE): SCTP (RFC 9260 sec. 6.9), QUIC (RFC 9000 sec. 2.2), and TCP (RFC 9293) all hold reassembly state at the receive boundary. FRCP message-mode leaves fragments in the shared-memory ring until flow_read pulls and lands the SDU directly in the caller's buffer. Stream mode (Section 16) uses the standard QUIC-style direct ring placement on receive and does not defer. The optimisation is enabled by the Shared-Memory Subsystem (SSM) packet-buffer ring (see struct ssm_pk_buff at Section 1.1); the analogue is OS-level scatter-gather I/O (recvmsg+iovec), not a transport-layer prior art.
  • TLP-equivalent tail-loss recovery (RFC 8985 sec. 7; RFC 9002 sec. 6.2): FRCP does not emit an explicit Tail Loss Probe packet, but the same goal is met implicitly by RACK loss detection (Section 8) firing on a non-advancing cumulative ACK once the head-of-line slot ages past the RACK reorder window R = MIN(reo_wnd_mult * min_RTT / 4, SRTT) - well below RTO = max(2 * SRTT, SRTT + (mdev << MDEV_MUL)). A receiver-driven nudge is also available via the pre-DRF NACK (Section 9).


15.2. Not adopted

  • Slow start, congestion window (cwnd), Additive Increase / Multiplicative Decrease (AIMD), NewReno cwnd inflation. Congestion control lives in the IPCP CA policies and is driven by Explicit Congestion Notification (ECN, RFC 3168).
  • Nagle / silly-window-syndrome (SWS) avoidance (RFC 896, RFC 1122 sec. 4.2.3.4). (Deferred work, not adopted in the current spec.)
  • TCP Timestamps (RFC 7323) / Protection Against Wrapped Sequences (PAWS) - RTT measurement uses RTTP, not per-segment timestamps. A peer-supplied timestamp echoed on every ACK lets a malicious peer drive the srtt estimate arbitrarily low, collapsing the RTO and triggering a self-inflicted retransmit storm. RTTP confines RTT measurement to nonce-authenticated probe round-trips, where a forged echo is rejected before it can reach the estimator.
  • ECN (Explicit Congestion Notification) response inside FRCP (consumed by IPCP Congestion Avoidance / CA).
  • IP-style fragment-offset reassembly (RFC 791 sec. 3.2; RFC 8200 sec. 4.5). Message-mode FRCP relies on the FRCT rq[] reorder ring keyed by seqno (shared by FRTX and best-effort flows) to put fragments back in order; no separate offset field is needed and no IP-style hole-list reassembly buffer is kept. Stream-mode FRCP does carry [start, end) byte offsets (Section 1.5) for direct ring placement on receive.
  • QUIC STREAM offset+length framing on every flow (RFC 9000 sec. 19.8). Message-mode FRCP uses the SCTP-style B/E flag-bit encoding (FFGM/LFGM) and skips the offsets; stream-mode FRCP adopts the QUIC offset model (heritage table above).

16. Stream-mode flows

When a flow is allocated with qos.service == SVC_STREAM both peers switch to byte-stream semantics, layered on top of the FRTX reorder machinery already described in Sections 6-8.

16.1. Send

The sender splits the caller's octets into chunks of at most (frag_mtu - base PCI - stream PCI extension) octets (Sections 1.1 and 1.5). Each chunk is one DATA packet with its own seqno and a [start, end) byte range copied from a monotonic stream counter. In stream mode FFGM and LFGM are unused and MUST be transmitted as zero; the per-byte position is carried by the [start, end) extension instead.

End-of-stream is signalled with a 0-byte DATA packet that has FIN (bit 12) set, emitted on the FIN triggers listed in Section 1.2 (WR-half close, flow_dealloc, and any other path that yields the final byte). The sender MUST emit at most one FIN per flow; its [start, end) MUST equal [final-byte, final-byte) (i.e., empty interval at the final byte position; final-size invariance, analogous to QUIC RFC 9000 sec. 4.5). Idempotency is enforced by an snd_fin_sent guard.

16.2. Receive

On arrival the receiver places the payload directly into a per-flow byte-indexed receive ring of width ring_sz (octets) at the position indicated by start, with a two-segment memcpy across the ring boundary if needed. Receipt is recorded in the FRTX reorder machinery (Section 6.2) augmented with the packet's start, end, and FIN bit per slot. When a packet's [start, end) front-overlaps bytes already at or below the byte high-water mark, the overlap is trimmed before placement so the same byte is never written twice. After stashing, the receiver advances lwe and the byte high-water mark across any newly-contiguous prefix. Each slot advanced MUST satisfy start == the last-delivered slot's end; a slot whose start does not equal that end is silently dropped at delivery time (the seqno is consumed, no stream bytes contributed) and the high-water mark does not advance past it. The byte stream stalls at that point - there is no flow tear-down on mismatch. This check filters spliced or off-path-injected in-window slots even when the flow has no strong cryptographic authentication.

A FIN slot marks end-of-stream at advance time only if its byte position equals the last-delivered slot's end; otherwise the FIN is ignored and the corresponding seqno occupies a slot but contributes no stream bytes. No packet buffer is held after the ring copy.
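The front-overlap trim and the two-segment ring copy from Section 16.2 can be illustrated as below. This is a sketch with hypothetical names (trim_front, ring_place); it assumes ring_sz is a power of two so that start % ring_sz stays consistent across a 32-bit offset wrap, and it uses signed-difference comparison for the byte offsets in the same way as the seqno comparators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* New start after trimming bytes below the high-water mark hwm. */
static uint32_t trim_front(uint32_t start, uint32_t end, uint32_t hwm)
{
        if ((int32_t)(start - hwm) >= 0)
                return start;                   /* no overlap */
        /* Partial overlap starts at hwm; full overlap leaves nothing. */
        return (int32_t)(hwm - end) < 0 ? hwm : end;
}

/* Copy len octets to ring position start, splitting at the boundary. */
static void ring_place(uint8_t *ring, size_t ring_sz, uint32_t start,
                       const uint8_t *src, size_t len)
{
        size_t off   = start % ring_sz;
        size_t first = ring_sz - off;           /* room before the wrap */

        assert(ring != NULL && src != NULL && len <= ring_sz);
        if (first > len)
                first = len;
        memcpy(ring + off, src, first);         /* segment up to the end */
        memcpy(ring, src + first, len - first); /* wrapped remainder */
}
```

For an 8-octet ring and a 6-octet payload at start = 5, the first copy fills positions 5..7 and the second fills positions 0..2.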

16.3. Read

flow_read returns up to count octets from the contiguous prefix [next, high-water), where next is the byte the application has already consumed up to and high-water is the rightmost contiguous byte received. When the stream is fully drained AND end-of-stream (EOS) was observed (next == EOS byte position), flow_read returns 0 (EOF) - the same shape POSIX read(2) uses on TCP after a peer FIN.

16.4. Flow control

ACK / SACK / RACK / RTO machinery is unchanged; the FRTX reorder ring is reused as a per-seqno received-bitmap. Let per_pkt = (frag_mtu - base PCI - stream PCI extension), the maximum stream-byte payload one DATA packet can carry (Section 16.1). The receive window advertised in FC is clamped so the byte window (ring_sz) cannot be overrun: the seqno-space rwe is at most rcv_cr.lwe + ring_sz / per_pkt.

This is the QUIC byte-credit flow-control model (MAX_STREAM_DATA, RFC 9000 sec. 4.1 and sec. 19.10) projected onto seqno space. With one stream per flow there is no MAX_DATA / MAX_STREAM_DATA distinction. Receiver-side silly-window-syndrome (SWS) avoidance (RFC 9293 sec. 3.8.6.2.2) is achieved by combining the consume-time rwe bump with the global non-shrink rule from Section 11.
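The projection of the byte window onto seqno space amounts to a single clamp. A minimal sketch, assuming a hypothetical stream_rwe_clamp helper; ring_sz and per_pkt are as defined in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Limit the advertised rwe to lwe + ring_sz / per_pkt (modular). */
static uint32_t stream_rwe_clamp(uint32_t lwe, uint32_t rwe,
                                 size_t ring_sz, size_t per_pkt)
{
        uint32_t max_rwe;

        assert(per_pkt > 0 && ring_sz >= per_pkt);
        max_rwe = lwe + (uint32_t)(ring_sz / per_pkt);
        /* after(rwe, max_rwe): the seqno window would overrun the ring. */
        return (int32_t)(rwe - max_rwe) > 0 ? max_rwe : rwe;
}
```

With ring_sz = 65536 and per_pkt = 1024 the receiver never advertises more than 64 packets beyond lwe, so even full-size packets cannot exceed the byte ring.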

16.5. Security considerations

Threat model. An attacker that can observe (on-path passive) or predict (off-path blind) the flow's seqnos and byte offsets on an unencrypted stream flow can inject DATA or FIN at any in-window position. The in-line consistency checks above (start == prior end on advance; FIN MUST be 0-byte; FIN MUST sit at the final byte position) realise the spirit of RFC 5961's "sequence-window plus exact-position match for control bits" without an explicit challenge-ACK probe; they make a few specific blind attack shapes harder but are not cryptographic authentication. This is comparable to TCP without the TCP Authentication Option (TCP-AO, RFC 5925), tighter than a pre-RFC-5961 TCP stack, and roughly equivalent to a modern RFC 5961 stack against blind off-path injection - none of these help once the attacker can sniff. TLS over TCP (RFC 8446) encrypts only the TCP payload and leaves TCP seqnos, ACKs, FIN, and RST in the clear, so TLS does NOT defend against TCP-header-level injection; QUIC (RFC 9000) hides packet numbers under header protection (RFC 9001 sec. 5.4), so this specific weakness does not apply to QUIC.

Mitigation: AEAD. When the flow has encryption enabled, the recommended AEAD ciphers (AES-GCM, RFC 5288; or ChaCha20-Poly1305, RFC 8439) wrap the entire FRCP packet on the wire - PCI, stream extension, body, and the CRC trailer when ber == 0 - under a per-flow symmetric key derived from the flow's own key exchange (Section 1.1). The AEAD tag (~2^-128 forgery probability) dominates the CRC (~2^-32) for integrity in this mode, but the CRC trailer is currently retained inside the wrap (see Section 1.1). Implementations MUST NOT rely on the security properties below when a non-AEAD cipher (e.g. AES-CTR alone) is negotiated; non-AEAD modes provide confidentiality only and the threat-model claims do not hold.

With an AEAD cipher in use, seqnos, byte offsets, and the FIN bit are both authenticated and confidential. Against an off-path or on-path-passive attacker this is:

  • Stronger than TCP+TLS (TCP header in the clear).
  • Stronger than TCP+TCP-AO (header authenticated but visible).
  • Comparable to IPsec ESP transport mode (RFC 4303), which similarly authenticates and encrypts the upper-layer header plus payload, and to QUIC packet protection (RFC 9001 sec. 5), with the difference that QUIC must leave the destination connection ID in the clear for routing whereas FRCP relies on the IPCP below for delivery and can therefore encrypt its entire PCI.

Keying granularity. Ouroboros flow allocation runs key exchange (kex) per flow, so each flow_alloc yields independent symmetric keys. This is finer-grained than QUIC (per-connection, RFC 9001, where one handshake covers all multiplexed streams) and finer-grained than typical IPsec deployment (per-host-pair Security Associations, SAs). Forward secrecy follows from the kex when an ephemeral Diffie-Hellman exchange (DHE), or a hybrid mode (classical DH + post-quantum Key Encapsulation Mechanism / KEM), is selected.

Replay protection. The AEAD layer itself does NOT carry an explicit anti-replay window (unlike IPsec ESP, RFC 4303 sec. 3.4.3, or DTLS, RFC 9147 sec. 4.5.1). For FRCP-engaged flows the seqno-space duplicate-suppression in Section 6.2 rejects replayed DATA after the AEAD strips the wrap, because the AEAD authenticates the seqno and a replay re-presents an old seqno that is then discarded either as a duplicate (still inside the receive window) or as outside the receive window, depending on how far lwe has advanced since the original packet was delivered. RAW (qos.service == SVC_RAW) flows have no FRCP layer and therefore no replay protection at the AEAD layer either; deployments that need replay rejection on RAW flows SHOULD use SVC_MESSAGE.
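The two outcomes for a replayed seqno can be sketched as below. This is an illustration under stated assumptions, not the Section 6.2 logic itself: the names rx_classify and enum rx_verdict are hypothetical, the received-bitmap is modelled as one byte per slot indexed by seqno modulo the ring size, and the ring is assumed to be at least as large as the window [lwe, rwe).

```c
#include <stddef.h>
#include <stdint.h>

/* Modular comparator from the Notation section. */
static int before(uint32_t a, uint32_t b)
{
        return (int32_t)(a - b) < 0;
}

/* Hypothetical verdicts for an authenticated, decrypted DATA packet. */
enum rx_verdict {
        RX_ACCEPT,
        RX_DUPLICATE,
        RX_OUT_OF_WINDOW
};

/*
 * Classify seqno against the receive window [lwe, rwe).
 * seen[] holds one flag per ring slot (assumed layout), and
 * ring_sz is assumed to be >= rwe - lwe.
 */
static enum rx_verdict rx_classify(uint32_t        seqno,
                                   uint32_t        lwe,
                                   uint32_t        rwe,
                                   const uint8_t * seen,
                                   size_t          ring_sz)
{
        /* Old replay: lwe has already advanced past this seqno. */
        if (before(seqno, lwe) || !before(seqno, rwe))
                return RX_OUT_OF_WINDOW;

        /* Recent replay: still in window, but already received. */
        if (seen[seqno % ring_sz])
                return RX_DUPLICATE;

        return RX_ACCEPT;
}
```

Because the AEAD tag covers the seqno, an attacker cannot rewrite a captured packet to carry a fresh in-window seqno, so every replay lands in one of the two rejecting branches.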

Layering. The AEAD wrap sits below FRCP on the data path, so RAW best-effort flows (qos.service == SVC_RAW, the UDP-equivalent service of Section 2.2) inherit the same per-flow integrity + confidentiality scope as FRCP-engaged flows - whatever the process and FRCP (if any) put on the wire is what the AEAD authenticates. No DTLS-equivalent layering is required for confidentiality and integrity; replay protection above AEAD is a separate concern as noted above.

17. References

This section lists the IETF documents, published works, and source-code references cited inline elsewhere in this document. IETF documents are cited inline as "RFC NNNN sec. X.Y"; books, journal papers, and source-code references are cited inline by author and year (or by file and function name) and are listed here for convenience.


17.1. IETF documents

[RFC 791]
J. Postel, "Internet Protocol", STD 5, RFC 791, September 1981.
[RFC 793]
J. Postel, "Transmission Control Protocol", STD 7, RFC 793, September 1981. Obsoleted by RFC 9293.
[RFC 813]
D. D. Clark, "Window and Acknowledgement Strategy in TCP", RFC 813, July 1982.
[RFC 896]
J. Nagle, "Congestion Control in IP/TCP Internetworks", RFC 896, January 1984.
[RFC 1122]
R. Braden (ed.), "Requirements for Internet Hosts -- Communication Layers", STD 3, RFC 1122, October 1989.
[RFC 2018]
M. Mathis, J. Mahdavi, S. Floyd, A. Romanow, "TCP Selective Acknowledgment Options", RFC 2018, October 1996.
[RFC 2119]
S. Bradner, "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC 2883]
S. Floyd, J. Mahdavi, M. Mathis, M. Podolsky, "An Extension to the Selective Acknowledgement (SACK) Option for TCP", RFC 2883, July 2000.
[RFC 3758]
R. Stewart, M. Ramalho, Q. Xie, M. Tuexen, P. Conrad, "Stream Control Transmission Protocol (SCTP) Partial Reliability Extension", RFC 3758, May 2004.
[RFC 3828]
L-A. Larzon, M. Degermark, S. Pink, L-E. Jonsson (ed.), G. Fairhurst (ed.), "The Lightweight User Datagram Protocol (UDP-Lite)", RFC 3828, July 2004.
[RFC 4303]
S. Kent, "IP Encapsulating Security Payload (ESP)", RFC 4303, December 2005.
[RFC 4340]
E. Kohler, M. Handley, S. Floyd, "Datagram Congestion Control Protocol (DCCP)", RFC 4340, March 2006.
[RFC 5288]
J. Salowey, A. Choudhury, D. McGrew, "AES Galois Counter Mode (GCM) Cipher Suites for TLS", RFC 5288, August 2008.
[RFC 5595]
G. Fairhurst, "The Datagram Congestion Control Protocol (DCCP) Service Codes", RFC 5595, September 2009.
[RFC 5681]
M. Allman, V. Paxson, E. Blanton, "TCP Congestion Control", RFC 5681, September 2009.
[RFC 5925]
J. Touch, A. Mankin, R. Bonica, "The TCP Authentication Option", RFC 5925, June 2010.
[RFC 5961]
A. Ramaiah, R. Stewart, M. Dalal, "Improving TCP's Robustness to Blind In-Window Attacks", RFC 5961, August 2010.
[RFC 6298]
V. Paxson, M. Allman, J. Chu, M. Sargent, "Computing TCP's Retransmission Timer", RFC 6298, June 2011.
[RFC 6528]
F. Gont, S. Bellovin, "Defending against Sequence Number Attacks", RFC 6528, February 2012. Obsoletes RFC 1948.
[RFC 6582]
T. Henderson, S. Floyd, A. Gurtov, Y. Nishida, "The NewReno Modification to TCP's Fast Recovery Algorithm", RFC 6582, April 2012.
[RFC 7323]
D. Borman, B. Braden, V. Jacobson, R. Scheffenegger (ed.), "TCP Extensions for High Performance", RFC 7323, September 2014.
[RFC 7675]
M. Perumal, D. Wing, R. Ravindranath, T. Reddy, M. Thomson, "Session Traversal Utilities for NAT (STUN) Usage for Consent Freshness", RFC 7675, October 2015.
[RFC 8174]
B. Leiba, "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, May 2017.
[RFC 8200]
S. Deering, R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", STD 86, RFC 8200, July 2017.
[RFC 8439]
Y. Nir, A. Langley, "ChaCha20 and Poly1305 for IETF Protocols", RFC 8439, June 2018.
[RFC 8446]
E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, August 2018.
[RFC 8985]
Y. Cheng, N. Cardwell, N. Dukkipati, P. Jha, "The RACK-TLP Loss Detection Algorithm for TCP", RFC 8985, February 2021.
[RFC 9000]
J. Iyengar (ed.), M. Thomson (ed.), "QUIC: A UDP-Based Multiplexed and Secure Transport", RFC 9000, May 2021.
[RFC 9001]
M. Thomson (ed.), S. Turner (ed.), "Using TLS to Secure QUIC", RFC 9001, May 2021.
[RFC 9002]
J. Iyengar (ed.), I. Swett (ed.), "QUIC Loss Detection and Congestion Control", RFC 9002, May 2021.
[RFC 9147]
E. Rescorla, H. Tschofenig, N. Modadugu, "The Datagram Transport Layer Security (DTLS) Protocol Version 1.3", RFC 9147, April 2022.
[RFC 9260]
R. Stewart, M. Tuexen, K. Nielsen, "Stream Control Transmission Protocol", RFC 9260, June 2022. Obsoletes RFC 4960.
[RFC 9293]
W. Eddy (ed.), "Transmission Control Protocol (TCP)", STD 7, RFC 9293, August 2022. Obsoletes RFC 793 and several follow-ons; updates RFC 1122 and others.


17.2. Books and journal papers

[Day08]
J. Day, "Patterns in Network Architecture: A Return to Fundamentals", Prentice Hall, 2008.
[Grasa15]
E. Grasa et al., "IRATI: investigating RINA as an alternative to TCP/IP", Computer Networks, Vol. 92, December 2015.
[KP87]
P. Karn, C. Partridge, "Improving Round-Trip Time Estimates in Reliable Transport Protocols", ACM SIGCOMM, August 1987.
[Wat81]
R. W. Watson, "Timer-Based Mechanisms in Reliable Transport Protocol Connection Management", Computer Networks, Vol. 5, 1981.


17.3. Source-code references

[Linux-RTT]
tcp_rtt_estimator() in net/ipv4/tcp_input.c of the Linux kernel, defining the asymmetric mdev variance update used as FRCP's default RTT estimator (Section 12). Line-stable browseable copy at https://elixir.bootlin.com/linux/latest/source/net/ipv4/tcp_input.c.