Flow and Retransmission Control Protocol
FRCP runs end-to-end between two peers over a flow. It provides
reliability, in-order delivery, flow control, and liveness.
Congestion Control (CC) is not in FRCP - that lives in the IPC
Process (IPCP) Congestion Avoidance (CA) policies, orthogonal to
FRCP. Flow allocation, naming, and IPCP lifecycle are handled by
the IPC Resource Manager daemon (IRMd).
FRCT (Flow and Retransmission Control Task) is the libouroboros
implementation of FRCP; the task lives in src/lib/frct.c. The
remainder of this document describes the FRCP wire protocol and the
behaviour FRCT realises. Code symbols retain the FRCT_ prefix
(FRCT_DATA, FRCT_RXM, ...) because they belong to the implementing
task; this document references them verbatim.
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
this document are to be interpreted as described in BCP 14 (Best
Current Practice; RFC 2119, RFC 8174) when, and only when, they
appear in all capitals.
Notation
u32, u8 - Unsigned 32-bit / 8-bit integers (kernel-C style).
ns - Nanoseconds.
Modular sequence-number comparators (32-bit, modulo 2^32):
before(a, b) - (int32_t)(a - b) < 0
after(a, b) - before(b, a)
Used throughout for ackno / seqno ordering checks.
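A minimal C sketch of the two comparators, written directly from the definitions above. The function bodies are illustrative and not taken from src/lib/frct.c; the cast relies on two's-complement conversion, which holds on all common platforms.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a precedes b in modulo-2^32 sequence space. */
static bool before(uint32_t a, uint32_t b)
{
        return (int32_t)(a - b) < 0;
}

/* True when a follows b; defined as before(b, a). */
static bool after(uint32_t a, uint32_t b)
{
        return before(b, a);
}
```

The comparison stays correct across the 2^32 wrap as long as the two values are less than 2^31 apart.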
Round-Trip Time (RTT) abbreviations used throughout:
SRTT - Smoothed RTT estimate (RFC 6298).
mdev - Mean deviation of RTT (Linux variance estimator).
EWMA - Exponentially Weighted Moving Average.
RTO - Retransmission Timeout, max(RTO_MIN, srtt + (mdev << MDEV_MUL)).
Timer-bound symbols t_a (a-timer, ACK delay) and t_r (r-timer,
retransmission window) are defined in Section 8; t_mpl (Maximum
Packet Lifetime) is introduced in Section 2.1 (the inact field)
with heritage in Section 15.
Wire-format diagrams follow the IETF convention: bit 0 is the leftmost (most significant) bit and fields are in network byte order unless stated otherwise.
1. Wire format
1.1. PCI header
Fixed 16-octet base Protocol-Control Information (PCI) header
prefixed to every FRCP packet (RFC convention: bit 0 leftmost,
most-significant bit first). All multi-byte fields except hcs
are in network byte order; hcs is an opaque 16-bit value that
the receiver recomputes from the wire bytes and compares to the
in-place pci->hcs read, so its on-wire byte order need only
match between peers running compatible builds. DATA packets on
stream-mode flows carry an additional 8-octet extension (see
Section 1.5); SACK and RTTP carry their own payloads after the
base PCI.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             flags             |              hcs              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            window                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             seqno                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             ackno                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     payload (variable) ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
flags - feature/type bitmap (see Section 1.2).
hcs - CRC-16-CCITT-FALSE Header Check Sequence (HCS) over flags + window + seqno + ackno (+ stream extension when present); the two octets of the hcs field itself are omitted from the CRC input. Verified on receive before any flag-driven dispatch.
window - receiver-advertised right window edge (valid iff FC).
seqno - per-flow sequence number.
ackno - cumulative Acknowledgement (ACK) (valid iff ACK).
A single packet can simultaneously carry DATA + ACK + FC (Flow Control) + RXM (Retransmission) by ORing flag bits; the PCI multiplexes control on the same wire frame in the spirit of SCTP chunk bundling (RFC 9260 sec. 6.10) and QUIC frame multiplexing (RFC 9000 sec. 12.4). DATA-bearing packets carry the caller's payload after the PCI; SACK (Selective Acknowledgement) and RTTP (Round-Trip Time Probe) carry their own typed payloads after the PCI.
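The HCS computation over the 16-octet base PCI can be sketched as below. CRC-16/CCITT-FALSE uses polynomial 0x1021, initial value 0xFFFF, no reflection and no final xor. The helper names (crc16_update, pci_hcs) and the offset macros are hypothetical and chosen for this example; the stream extension is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define PCI_LEN 16 /* base PCI size in octets */
#define HCS_OFF 2  /* hcs occupies octets 2..3 */

/* CRC-16/CCITT-FALSE, bitwise, continuing from a running crc value. */
static uint16_t crc16_update(uint16_t crc, const uint8_t *buf, size_t len)
{
        size_t i;
        int    b;

        for (i = 0; i < len; ++i) {
                crc ^= (uint16_t) (buf[i] << 8);
                for (b = 0; b < 8; ++b)
                        crc = (crc & 0x8000) ? (uint16_t) ((crc << 1) ^ 0x1021)
                                             : (uint16_t) (crc << 1);
        }

        return crc;
}

/* HCS over flags, then window + seqno + ackno; the hcs octets are skipped. */
static uint16_t pci_hcs(const uint8_t pci[PCI_LEN])
{
        uint16_t crc = 0xFFFF;

        crc = crc16_update(crc, pci, HCS_OFF);

        return crc16_update(crc, pci + HCS_OFF + 2, PCI_LEN - HCS_OFF - 2);
}
```

A receiver following this sketch would recompute pci_hcs over the received octets and compare the result with the hcs field before looking at any flag.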
Optional framing (per-flow, see Section 2.2). On the wire, the order from inside out is:
| Layer | Scope |
|---|---|
| [ PCI + body ] | The FRCP packet. |
| [ PCI + body + CRC-32 ] | CRC-32 covers the body only (PCI is in HCS); appended iff qs.ber == 0 on DATA, or on every SACK packet. |
| [ AEAD-wrap of above ] | Iff Authenticated Encryption with Associated Data (AEAD) is enabled. |
- HCS in the PCI covers the header fields on every packet and is verified before any flag-driven dispatch.
- The CRC-32 trailer (IEEE 802.3 / zlib reflected polynomial 0xEDB88320, init 0xFFFFFFFF, xor-out 0xFFFFFFFF) covers the body on DATA when qs.ber == 0 and on every SACK packet; the trailer is written as a raw uint32_t (the same convention as hcs: opaque on the wire as long as both peers run compatible builds). The PCI is not under the CRC (Cyclic Redundancy Check) because the HCS already protects it. It is appended before AEAD encryption and therefore rides inside the AEAD wrap when both are active; the AEAD tag (~2^-128 forgery probability) dominates the CRC (~2^-32) for integrity in that mode but the CRC trailer is currently retained.
- When encryption is enabled, the entire (possibly-CRC'd) FRCP packet is wrapped with AEAD inside the shared-memory packet buffer (spb, struct ssm_pk_buff); the packet grows by the AEAD overhead, namely a leading nonce / Initialization Vector (IV) of headsz bytes (crypt_get_ivsz) and a trailing authentication tag of tailsz bytes (crypt_get_tagsz).
Both CRC and AEAD are layered around the FRCP wire format and are not visible to the FRCP machinery itself.
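For reference, the CRC-32 variant named above (reflected polynomial 0xEDB88320, init and xor-out 0xFFFFFFFF) can be computed bitwise as in this sketch. The function name crc32_body is hypothetical; a production build would more likely use a table-driven or hardware-assisted routine.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32: reflected poly 0xEDB88320, init and xor-out 0xFFFFFFFF. */
static uint32_t crc32_body(const uint8_t *buf, size_t len)
{
        uint32_t crc = 0xFFFFFFFFu;
        size_t   i;
        int      b;

        for (i = 0; i < len; ++i) {
                crc ^= buf[i];
                for (b = 0; b < 8; ++b)
                        crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u
                                         : crc >> 1;
        }

        return crc ^ 0xFFFFFFFFu;
}
```

The input is the packet body only; the PCI is excluded because the HCS already covers it.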
1.2. Flag bits
Flag bits are numbered most-significant-bit first to match the wire
diagram (bit numbering per Section 1.1; bit 0 is the MSB of the
16-bit flags field and lands at wire-position 0 in network byte
order). Bits 13..15 are reserved and MUST be transmitted as zero.
| Bit | Mask | Name | Meaning |
|---|---|---|---|
| 0 | 0x8000 | DATA | Carries caller payload |
| 1 | 0x4000 | DRF | Data Run Flag: start of a fresh run |
| 2 | 0x2000 | ACK | Acknowledgement: ackno field valid |
| 3 | 0x1000 | NACK | Negative ACK; seqno = arrival_seqno-1 |
| 4 | 0x0800 | FC | Flow Control: window field valid (rwe) |
| 5 | 0x0400 | RDVS | Rendezvous probe (window-closed) |
| 6 | 0x0200 | FFGM | First Fragment (role bit 0; see below) |
| 7 | 0x0100 | LFGM | Last Fragment (role bit 1; see below) |
| 8 | 0x0080 | RXM | Retransmission |
| 9 | 0x0040 | SACK | Selective ACK block list in payload |
| 10 | 0x0020 | RTTP | RTT Probe / echo (payload follows) |
| 11 | 0x0010 | KA | Keepalive |
| 12 | 0x0008 | FIN | End-of-stream marker (stream mode) |
| 13-15 | -- | -- | Reserved (MUST be zero) |
The (FFGM, LFGM) pair encodes the fragment role of a DATA-bearing
Service Data Unit (SDU), SCTP-style begin/end flags (RFC 9260
sec. 3.3.1):
| FFGM | LFGM | Role |
|---|---|---|
| 1 | 1 | Sole / un-fragmented SDU (begin AND end) |
| 1 | 0 | First fragment of a multi-fragment SDU |
| 0 | 0 | Middle fragment |
| 0 | 1 | Last fragment |
Each fragment is carried in its own FRCP packet with its own seqno;
FRTX (the FRCT Retransmission service mode, see Section 2.2)
recovers individual fragments via the normal Retransmission Timeout
(RTO) / SACK / Recent Acknowledgement (RACK, RFC 8985) path. The
receiver reassembles the SDU at consume time once the contiguous
[FIRST .. LAST] run has fully arrived. On non-DATA packets the role
bits are unused and MUST be transmitted as zero.
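The role decoding from the (FFGM, LFGM) pair can be sketched as follows. The mask values come from the flag table; the symbol names FRCT_FFGM, FRCT_LFGM and FRCT_RESERVED, and the enum, are assumptions made for this example.

```c
#include <stdint.h>

/* Masks from the flag table; the symbol names are assumed. */
#define FRCT_FFGM     0x0200
#define FRCT_LFGM     0x0100
#define FRCT_RESERVED 0x0007 /* bits 13..15 */

enum frag_role { FRAG_MID, FRAG_LAST, FRAG_FIRST, FRAG_SOLE };

/* Map the two role bits of a DATA packet to its fragment role. */
static enum frag_role frag_role(uint16_t flags)
{
        int first = (flags & FRCT_FFGM) != 0;
        int last  = (flags & FRCT_LFGM) != 0;

        if (first && last)
                return FRAG_SOLE;
        if (first)
                return FRAG_FIRST;
        if (last)
                return FRAG_LAST;

        return FRAG_MID;
}

/* Reserved bits MUST be zero on the wire. */
static int flags_reserved_clear(uint16_t flags)
{
        return (flags & FRCT_RESERVED) == 0;
}
```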
In stream mode (qos.service == SVC_STREAM, see Section 16) there are
no SDU boundaries to encode, so FFGM and LFGM are unused and MUST
be transmitted as zero. End-of-stream uses a dedicated bit (FIN,
bit 12) carried on a 0-byte DATA packet, emitted at write-half close
(fccntl to FLOWFRDONLY), during linger drain, and at flow_dealloc;
emission is idempotent (first call wins). After contiguous delivery
of the FIN-bearing slot, the receiver latches byte_fin at the FIN's
start offset; flow_read returns 0 (end-of-file, EOF) once buffered
bytes have been drained up to byte_fin. Per-byte position is
carried by the [start, end) extension (Section 1.5).
1.3. SACK payload
A SACK packet has the FRCT_ACK | FRCT_FC | FRCT_SACK flag bits set
(bit numbering per Section 1.1). Following the 16-octet PCI, the
payload is a 2-octet block count (network byte order), 2 octets of
padding to 4-byte align the block list, then n_blocks pairs of
32-bit start/end seqnos describing present (received) ranges
above the cumulative ACK.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           n_blocks            |      padding (2 octets)       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           start[0]                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            end[0]                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           start[1]                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                  ... n_blocks pairs total ...
n_blocks <= SACK_MAX_BLOCKS (2048). The per-flow effective cap is
further bounded by (frag_mtu - PCI - 4) / 8 blocks per packet; SACK
packets carry no stream extension, so PCI here is the 16-octet base
header even on stream-mode flows.
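The per-packet block cap follows directly from the sizes above. This sketch computes it; the macro names other than SACK_MAX_BLOCKS, and both function names, are hypothetical.

```c
#include <stddef.h>

#define PCI_LEN         16   /* base PCI, also on stream-mode flows */
#define SACK_HDR_LEN    4    /* n_blocks (2) + padding (2) */
#define SACK_BLOCK_LEN  8    /* start (4) + end (4) */
#define SACK_MAX_BLOCKS 2048 /* wire cap */

/* Effective number of SACK blocks that fit in one packet. */
static size_t sack_block_cap(size_t frag_mtu)
{
        size_t cap;

        if (frag_mtu < PCI_LEN + SACK_HDR_LEN)
                return 0;

        cap = (frag_mtu - PCI_LEN - SACK_HDR_LEN) / SACK_BLOCK_LEN;

        return cap < SACK_MAX_BLOCKS ? cap : SACK_MAX_BLOCKS;
}

/* Octets of SACK payload following the PCI for n_blocks blocks. */
static size_t sack_payload_len(size_t n_blocks)
{
        return SACK_HDR_LEN + n_blocks * SACK_BLOCK_LEN;
}
```

With a 1500-octet frag_mtu, for example, the cap works out to (1500 - 16 - 4) / 8 = 185 blocks.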
Wire invariant: every block produced by the receiver, except an
optional leading Duplicate SACK (D-SACK) block as described below,
describes a range strictly above the cumulative ACK carried in the
PCI ackno field (after(start[i], ackno)). This makes the D-SACK
convention below unambiguous; the receiver-side builder MUST
preserve it.
Duplicate SACK (D-SACK, RFC 2883) is signalled in-band: no flag
bit, no extra framing. Modular seqno arithmetic uses the
before() / after() comparators defined in the Notation block.
Encoding. When a duplicate is observed the receiver arms a
single-slot pending report (dsack_seqno + dsack_valid,
latest-wins across multiple arms before the next emit). On the
next outbound SACK the receiver prepends block[0] = [dsack_seqno,
dsack_seqno + 1) - always a one-seqno range - and clears the
flag. The three arm sites are listed in Section 10; case-1 sites
yield dsack_seqno < rcv_cr.lwe (the next pci.ackno), and the
case-2 site (rq_accept conflict) yields dsack_seqno in
[rcv_cr.lwe, rcv_cr.rwe).
Detection. The sender classifies block[0] by its relation to
pci.ackno:
- case 1 (RFC 2883 sec. 4.1.1, full duplicate): before(blocks[0].start, pci.ackno) AND pci.ackno - blocks[0].start <= MAX_DSACK_LAG (== RQ_SIZE). The lag bound rejects stale or spoofed reports beyond one receive window.
- case 2 (RFC 2883 sec. 4.1.2, partial duplicate): blocks[0] is a sub-range (with at least one endpoint differing) of some blocks[i>0], i.e. the same packet's remaining SACK blocks already describe the duplicated seqno as received.
On detect, the sender:
- bumps reo_wnd_mult by 1, capped at REO_WND_MULT_MAX (= 20), per RFC 8985 sec. 6.2 step 4;
- snapshots dsack_lwe_snap = snd_cr.lwe, resetting the 16-cum-ACK halving counter so the multiplier doesn't decay while D-SACK evidence is still arriving;
- excludes block[0] from the gap-marking loop (n_real = n - 1), so a D-SACK alone never enters NewReno-careful recovery (see Section 8); only non-D-SACK blocks count as gaps.
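The two detection cases can be sketched as a single classifier over the received block list. The struct and function names are hypothetical, RQ_SIZE is taken at its documented default of 128, and the sub-range test is one reading of "sub-range with at least one endpoint differing".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RQ_SIZE       128
#define MAX_DSACK_LAG RQ_SIZE

struct sack_block {
        uint32_t start;
        uint32_t end;
};

static bool before(uint32_t a, uint32_t b)
{
        return (int32_t)(a - b) < 0;
}

/* Returns 1 or 2 for the RFC 2883 case matched by blocks[0], else 0. */
static int dsack_classify(uint32_t                 ackno,
                          const struct sack_block *blocks,
                          size_t                   n)
{
        size_t i;

        if (n == 0)
                return 0;

        /* Case 1: block[0] lies below the cumulative ACK, within the lag. */
        if (before(blocks[0].start, ackno) &&
            ackno - blocks[0].start <= MAX_DSACK_LAG)
                return 1;

        /* Case 2: block[0] is a proper sub-range of a later block. */
        for (i = 1; i < n; ++i) {
                bool inside  = !before(blocks[0].start, blocks[i].start) &&
                               !before(blocks[i].end, blocks[0].end);
                bool differs = blocks[0].start != blocks[i].start ||
                               blocks[0].end != blocks[i].end;
                if (inside && differs)
                        return 2;
        }

        return 0;
}
```

A non-zero result is the point at which a sender following this sketch would apply the three on-detect steps listed above.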
The reo_wnd_mult halving cadence (once per 16 cumulatively-ACK'd
seqnos since the most-recent D-SACK arrival or halve event) and
the reset-to-1 on a HoL RTO fire are both per the same RFC 8985
clause. The clamp-and-skip path in the regular SACK-mark loop is
incidentally idempotent on any leftover case-1 or case-2 block
(start < snd_cr.lwe clamps to snd_cr.lwe and the inner loop
skips k == snd_cr.lwe; case-2 re-NULLs slots already marked
received by later blocks), so block[0] is harmless even when fed
to the loop.
1.4. RTTP payload
An RTTP (Round-Trip Time Probe) packet has only the FRCT_RTTP flag
set (bit numbering per Section 1.1). Following the 16-octet PCI,
the payload is 24 octets (packed):
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           probe_id                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            echo_id                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+              nonce (16 octets, echoed verbatim)               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
probe_id - sender counter, 0 on reply, 0 reserved.
echo_id - peer's probe_id, 0 on outbound probe.
nonce - random, echoed unmodified, memcmp'd to defeat spoof.
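A sketch of the 24-octet payload and the echo check described above. The struct and function names are hypothetical, and fields are kept in host order here; byte-order conversion for the wire is omitted.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RTTP_NONCE_LEN 16

struct rttp_pl {
        uint32_t probe_id; /* sender counter; 0 on reply */
        uint32_t echo_id;  /* peer's probe_id; 0 on outbound probe */
        uint8_t  nonce[RTTP_NONCE_LEN];
};

/* Build the echo for a received probe: copy the id and the nonce back. */
static struct rttp_pl rttp_echo(const struct rttp_pl *probe)
{
        struct rttp_pl echo;

        echo.probe_id = 0;
        echo.echo_id  = probe->probe_id;
        memcpy(echo.nonce, probe->nonce, RTTP_NONCE_LEN);

        return echo;
}

/* Accept an echo only if it names our probe and returns our nonce. */
static bool rttp_echo_ok(const struct rttp_pl *sent,
                         const struct rttp_pl *echo)
{
        return echo->echo_id == sent->probe_id &&
               memcmp(echo->nonce, sent->nonce, RTTP_NONCE_LEN) == 0;
}
```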
1.5. Stream PCI extension
A stream-mode flow (qos.service == SVC_STREAM) carries an extra
8-octet extension after the 16-octet base PCI on every DATA packet
(bit numbering per Section 1.1):
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             start                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              end                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
start - octet offset of the first payload byte in the stream.
end - octet offset one past the last payload byte; end - start equals the on-wire payload length.
Total stream-mode PCI for DATA packets is 24 octets (16 base + 8
extension); control packets (SACK, RTTP, bare ACK, KA, etc.) retain
the 16-octet base PCI. Stream mode MUST be negotiated at flow
allocation; the extension is present iff stream mode is in use,
never on a per-packet basis. Both peers MUST treat start/end as
monotonic 32-bit byte offsets; when a slot reaches the head of the
contiguous run with start not equal to the prior packet's end the
slot is silently dropped at delivery time (Section 16) rather
than rejected at stash.
This is the QUIC STREAM-frame reassembly model (RFC 9000 sec. 19.8):
each packet carries its packet seqno (this PCI's seqno field) and a
separate stream byte position (start/end). Separating the two
avoids TCP's conflation of packet identity with byte position which
forces Karn's algorithm for Round-Trip Time (RTT) sampling (no RTT
sample on retransmits, RFC 6298 sec. 3); FRCP applies the
Karn-equivalent gate via a combination of per-packet FRCT_RXM,
per-slot SND_RTX flags, and a sample-fence rtt_lwe (see Section 2.1
and Section 12). FRCP's fixed-32-bit start/end wrap at 4 GiB of
wire bytes, narrower than QUIC's 62-bit varint offset (cf. RFC 9000
sec. 16); the on-wire wrap is handled by the same modular before()
/ after() comparators (Section 1.3) FRCP uses for seqnos, which
remain unambiguous as long as the in-flight byte window stays
strictly under 2 GiB (the half-range of the signed-int32 difference
in before()). The default per-flow ring is 1 MiB; the
implementation caps ring_sz at 128 MiB (FRCT_STREAM_RING_SZ_MAX),
well below the 2 GiB half-range bound. The runtime byte counters
exposed via FUSE (Filesystem in Userspace) in the Ouroboros
Resource Information Base (RIB, a virtual-filesystem introspection
bridge) are platform size_t and do not wrap on 64-bit hosts.
2. Per-flow state and service modes
2.1. Per-flow state
Each flow keeps a sender control record and a receiver control record:
lwe (u32) - snd: oldest unacked seqno (cumulative ACK boundary as seen by sender); rcv: next in-order seqno expected.
rwe (u32) - snd: peer-advertised right window edge; rcv: locally-advertised right window edge.
cflags (u8) - per-direction feature flags: retransmission (FRCTFRTX), receiver flow control (FRCTFRESCNTL), linger-on-close (FRCTFLINGER); see <ouroboros/fccntl.h>.
seqno (u32) - snd: next seqno to send; rcv: force-ACK trigger - set on a stale or dup DATA so the next ack_snd emits a fresh cumulative ACK.
ackno (u32) - snd: seqno counter for standalone ACK-bearing control packets (delayed ACK, SACK, final ACK on dealloc); not bumped on piggybacked ACK riding a DATA packet (which uses the DATA seqno). Used by wire-dup ACK detection; rcv: incoming-ACK dedup tracker.
act (ns) - last activity (used by inactivity / DRF).
inact (ns) - inactivity threshold; sender = 3*mpl + a + r + 1s, receiver = 2*mpl + a + r + 1s. mpl is the Maximum Packet Lifetime (delta-t terminology; see Section 15); a and r are the FRCT a-timer and r-timer bounds (see Section 8). The asymmetry is load-bearing for pre-DRF NACK (Section 9).
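The two inactivity thresholds differ by exactly one mpl, as this small sketch shows. All inputs are in nanoseconds; the function names are hypothetical.

```c
#include <stdint.h>

#define NS_PER_S 1000000000ULL

/* Sender threshold: 3*mpl + a + r + 1s. */
static uint64_t snd_inact_ns(uint64_t mpl, uint64_t a, uint64_t r)
{
        return 3 * mpl + a + r + NS_PER_S;
}

/* Receiver threshold: 2*mpl + a + r + 1s. */
static uint64_t rcv_inact_ns(uint64_t mpl, uint64_t a, uint64_t r)
{
        return 2 * mpl + a + r + NS_PER_S;
}
```

Because the receiver's threshold is the shorter one, the receiver always considers the flow stale before the sender decides to rotate.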
The sender holds a per-slot ring snd_slots[RQ_SIZE] keyed by
(seqno mod RQ_SIZE). Each slot tracks its retransmit entry (rxm),
last-send timestamp, and retransmit flag bits: SND_RTX (a
retransmit is pending or has fired, gates the next RTT sample
under Karn) and SND_FAST_RXM (one-shot fast-retransmit staged for
this loss event).
The receiver holds a parallel reorder ring rcv_slots[RQ_SIZE]
(referred to as rq[] in prose) holding stashed out-of-order
packet-buffer indexes; both FRTX and best-effort flows share this
path. The invariant rwe - lwe <= RQ_SIZE holds: on each consume
the receiver advances rwe by the consumed count, capping the
receive window at RQ_SIZE seqno slots.
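Since RQ_SIZE is a power of two, the slot index can be taken with a mask, and slot numbering stays continuous across the 2^32 seqno wrap. The sketch below assumes that definition for RQ_SLOT (the macro is named in Section 5 but its body is not given in this document); in_rcv_window is a hypothetical helper.

```c
#include <stdint.h>

#define RQ_SIZE        128u
#define RQ_SLOT(seqno) ((seqno) & (RQ_SIZE - 1)) /* seqno mod RQ_SIZE */

/* True when lwe <= seqno < rwe in modular sequence space. */
static int in_rcv_window(uint32_t seqno, uint32_t lwe, uint32_t rwe)
{
        return (uint32_t)(seqno - lwe) < (uint32_t)(rwe - lwe);
}
```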
A separate fence variable rtt_lwe is bumped on every retransmit
(timer-fire, SACK-driven, fast-rxm, NACK-driven) and on every
seqno_rotate (Section 4) to mark the seqno range whose RTT samples
MUST be discarded.
2.2. Service modes (orthogonal axes)
FRCP exposes its wire features as a vector of independent QoS
axes selected at flow allocation time. All flows go through the
same flow_alloc(name, qos, ...) primitive; the qosspec_t passed
in determines which protocol machinery engages on the wire. This
contrasts with the POSIX BSD socket model where TCP and UDP
require different socket types (SOCK_STREAM / SOCK_DGRAM).
The axes:
service - 0 = unordered (no FRCP engagement: raw datagrams, no PCI on the wire, UDP-equivalent at this layer); 1 = message-ordered (FRCP engaged; SDU boundaries preserved across fragmentation); 2 = stream (byte-oriented, no SDU boundaries; FRTX required).
loss - 0 = lossless service requested: FRTX retransmit machinery engages (Section 8); MUST be 0 for service=2. Non-zero = best-effort, FRTX off.
ber - Bit Error Rate tolerance. 0 = error-free service requested: a CRC trailer is appended after the body of DATA packets and verified on receive (added / checked outside the FRCP PCI; see Section 1.1). Non-zero = peer accepts errors; trailer omitted. SACK control packets carry a CRC32 trailer regardless of ber; the ber gate applies to DATA only.
timeout - Peer-timeout (ms); 0 disables the keepalive timer. Independent of FRCP engagement.
Encryption is a separate per-flow attribute set at flow setup;
when enabled it wraps the FRCP packet (PCI + body, plus the CRC
trailer if any) under AEAD, expanding the spb by headsz + tailsz
octets (nonce / tag). The CRC trailer is currently kept inside
the AEAD wrap (see Section 1.1).
Reachable combinations exported by include/ouroboros/qos.h:
| Cube | service | loss | ber | Engaged |
|---|---|---|---|---|
| qos_raw | 0 | 1 | 1 | Raw passthrough |
| qos_raw_safe | 0 | 1 | 0 | Raw + CRC trailer |
| qos_rt | 1 | 1 | 1 | FRCP, no FRTX, no CRC |
| qos_rt_safe | 1 | 1 | 0 | FRCP, no FRTX, CRC |
| qos_msg | 1 | 0 | 0 | FRCP + FRTX |
| qos_stream | 2 | 0 | 0 | FRCP + FRTX, stream |
Forced couplings actually enforced by the public API:
- service == SVC_STREAM (2) requires loss == 0; flow_alloc / flow_accept reject the pair otherwise with -EINVAL.
- FRTX requires FRCP engagement (service != SVC_RAW); requesting loss = 0 with service = SVC_RAW is structurally a no-op because no frcti is created.
- The QOS_DISABLE_CRC build flag globally forces ber = 1. Note: this flag defaults to ON, so default builds ship with CRC disabled until QOS_DISABLE_CRC is set to OFF.
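The couplings above can be expressed as a few predicates over the three axes. This is a sketch only: struct qos_sketch is a hypothetical stand-in for qosspec_t, the numeric SVC_ values follow the service axis description, and the function names are invented for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

enum { SVC_RAW = 0, SVC_MESSAGE = 1, SVC_STREAM = 2 };

/* Hypothetical stand-in for the relevant qosspec_t fields. */
struct qos_sketch {
        uint8_t  service;
        uint32_t loss;
        uint32_t ber;
};

/* Stream service requires lossless delivery. */
static int qos_check(const struct qos_sketch *q)
{
        if (q->service == SVC_STREAM && q->loss != 0)
                return -EINVAL;

        return 0;
}

/* FRTX needs FRCP engagement and a lossless request. */
static bool frtx_engaged(const struct qos_sketch *q)
{
        return q->service != SVC_RAW && q->loss == 0;
}

/* The CRC-32 trailer on DATA is gated by ber alone. */
static bool data_crc_engaged(const struct qos_sketch *q)
{
        return q->ber == 0;
}
```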
Caveat: the API does NOT force ber = 0 when service != SVC_RAW.
qos_rt has service = SVC_MESSAGE with ber = 1, which means DATA
bodies carry no CRC-32 trailer on that cube; the HCS (Section 1.1)
remains the only integrity check, and it covers the header only.
The FRCP-no-FRTX regime (service = SVC_MESSAGE, loss > 0) is meaningful
and live: sequence numbering, in-order delivery, flow-control
advertisement, KA, DRF rotation, and SDU fragmentation /
reassembly (Section 7.2) all run. Lost packets are dropped
rather than retransmitted; a permanently-lost mid-fragment is
dropped via skip-past-gap once a later SDU is visible in the
reorder ring.
3. Protocol parameters
| Parameter | Value | Role |
|---|---|---|
| RQ_SIZE | compile-time, power of 2 (default 128) | Slot ring / rcv window width |
| START_WINDOW | compile-time, power of 2 (default 128) | Initial rwe-lwe after rotate |
| RTO_MIN | MAX(250 us build-tunable, 1<<RXMQ_RES); per-flow via fccntl (FRCTSRTOMIN). Default ~1 ms with RXMQ_RES=20. | RTO floor; also floored at the retransmit-wheel resolution (~1 ms by default). |
| MAX_RTO_MUL | 20 | Backoff shift cap |
| RACK window R | MIN(reo_wnd_mult * min_RTT/4, SRTT) with MIN_REORDER_NS = 250 us floor; reo_wnd_mult scales on D-SACK, cap 20 | Reorder window; per RFC 8985 sec. 6.2; reo_wnd_mult per sec. 6.2 step 4 |
| MIN_RTT_WIN_NS | 300 s (5 min, Linux tcp_min_rtt_wlen) | min_RTT windowed re-anchor |
| REO_WND_MULT_MAX | 20 (RFC 8985 sec. 6.2 step 4) | reo_wnd_mult cap |
| REO_DECAY_PKTS | 16 (RFC 8985 sec. 6.2 step 4 / RACK.reo_wnd_persist) | Fresh-ACK'd seq count per halving |
| MAX_DSACK_LAG | RQ_SIZE | D-SACK sanity cap |
| RTT_QUARANTINE | 32 (seqno steps) | NewReno gate pad |
| SACK rate-limit | SACK_MIN_GAP_NS (250 us, fixed) | Min SACK gap |
| SACK_MAX_BLOCKS | 2048 (wire cap; per-flow capped at (frag_mtu-PCI-4)/8) | Per-SACK block cap |
| SACK_RXM_MAX | 32 | Per-pass staged retransmit cap |
| DUP_THRESH | 3 (RFC 8985 default) | Hybrid fast-rxm trigger (Section 8) |
| MDEV_MUL | 2 (build-tunable via FRCT_RTO_MDEV_MULTIPLIER) | mdev shift in RTO = srtt + (mdev << MDEV_MUL) |
| RTTP nonce | 16 octets | Echoed verbatim |
| RTTP_RING | 8 | In-flight probes |
| RTT clamp | 16 * srtt | Probe-sample upper bound (ACK-derived RTT samples gated by Karn / recovery only) |
| Cold-probe cadence | 100 ms (rx-driven; see Section 12) | Pre-srtt RTTP rate |
| DELT_RDV | 100 ms | RDVS emit cadence |
| MAX_RDV | 1 s | RDVS give-up |
| Delayed-ACK fire | 2 * TICTIME (TICTIME = FRCT tick granularity, default 5 ms; 2*TICTIME = 10 ms by default) | Fired after the first in-order DATA arrival; tick is build-tunable |
| NACK send cooldown | srtt when an srtt sample exists, else 100 ms | Pre-DRF NACK rate-limit |
| MAX_SDU | 1 MiB | Max reassembled SDU; configurable per flow |
The per-flow fragment Maximum Transmission Unit (MTU) is computed
at flow setup from the lower IPCP's mtu minus encryption
headsz / tailsz and CRC trailer; there is no FRCT-level default or
environment-variable override.
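Two of the derived quantities in the table, the RTO and the RACK reorder window R, can be sketched as below. All values are in nanoseconds. Applying MIN_REORDER_NS as a floor on the final result is this example's reading of the table entry, and the function names are hypothetical.

```c
#include <stdint.h>

#define MDEV_MUL       2
#define MIN_REORDER_NS 250000ULL /* 250 us */

/* RTO = max(rto_min, srtt + (mdev << MDEV_MUL)). */
static uint64_t rto_ns(uint64_t srtt, uint64_t mdev, uint64_t rto_min)
{
        uint64_t rto = srtt + (mdev << MDEV_MUL);

        return rto > rto_min ? rto : rto_min;
}

/* R = MIN(reo_wnd_mult * min_rtt / 4, srtt), floored at MIN_REORDER_NS. */
static uint64_t rack_window_ns(uint64_t reo_wnd_mult,
                               uint64_t min_rtt,
                               uint64_t srtt)
{
        uint64_t r = reo_wnd_mult * min_rtt / 4;

        if (r > srtt)
                r = srtt;

        return r < MIN_REORDER_NS ? MIN_REORDER_NS : r;
}
```

For example, with srtt = 10 ms and mdev = 1 ms the RTO is 10 ms + 4 * 1 ms = 14 ms, well above the default ~1 ms floor.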
4. Sequence-number rotation (DRF)
The DRF (Data Run Flag) bit on an outbound packet means "this is
the start of a fresh data run" and is set whenever the sender has
nothing in flight (snd_cr.seqno == snd_cr.lwe).
Independently of that, if the sender has been idle longer than
snd_cr.inact AND the pipe is empty (snd_cr.seqno == snd_cr.lwe),
seqno_rotate() rolls a random new seqno before the send and
resets
snd_cr.seqno = random()
snd_cr.lwe = snd_cr.seqno
snd_cr.rwe = snd_cr.seqno + START_WINDOW
rtt_lwe = snd_cr.seqno
in_recovery = false (recovery state, see Section 8)
recovery_high = snd_cr.seqno
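The rotation condition and the reset list above can be sketched in C as follows. struct snd_sketch is a hypothetical reduction of the sender state, and the fresh seqno is passed in by the caller so the example stays deterministic; the source of randomness is not specified here.

```c
#include <stdbool.h>
#include <stdint.h>

#define START_WINDOW 128u

/* Hypothetical subset of the sender-side state. */
struct snd_sketch {
        uint32_t seqno;
        uint32_t lwe;
        uint32_t rwe;
        uint32_t rtt_lwe;
        uint32_t recovery_high;
        bool     in_recovery;
        uint64_t act;   /* last activity, ns */
        uint64_t inact; /* inactivity threshold, ns */
};

/* Rotate only when idle past inact AND nothing is in flight. */
static bool rotate_due(const struct snd_sketch *s, uint64_t now)
{
        return s->seqno == s->lwe && now - s->act > s->inact;
}

/* Apply the reset list; fresh is the caller-supplied random seqno. */
static void seqno_rotate(struct snd_sketch *s, uint32_t fresh)
{
        s->seqno         = fresh;
        s->lwe           = fresh;
        s->rwe           = fresh + START_WINDOW;
        s->rtt_lwe       = fresh;
        s->in_recovery   = false;
        s->recovery_high = fresh;
}
```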
The receiver, on observing rcv-side inactivity
(now - rcv_cr.act > rcv_cr.inact), requires a DRF on the next
DATA packet; otherwise it replies with a rate-limited NACK (see
below). Non-DATA control packets pass through without the DRF
requirement. On DRF the receiver releases the rq[] slots and
rebases
rcv_cr.lwe = seqno
rcv_cr.rwe = seqno + RQ_SIZE
rcv_cr.seqno = seqno
If the inactive packet has DATA but no DRF, a rate-limited NACK is fired back to the sender (cooldown per Section 3); non-DATA stale arrivals fall through to normal processing (no NACK, no drop).
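The receiver-side decision on a stale flow can be summarised as a small verdict function. The flag masks come from Section 1.2 (the FRCT_DRF name is assumed); the enum, the function name, and the two boolean inputs are hypothetical simplifications of the state the receiver actually consults.

```c
#include <stdbool.h>
#include <stdint.h>

#define FRCT_DATA 0x8000
#define FRCT_DRF  0x4000 /* name assumed; mask from the flag table */

enum rcv_verdict { RCV_PASS, RCV_REBASE, RCV_NACK_DROP, RCV_DROP };

/*
 * stale:         now - rcv_cr.act > rcv_cr.inact
 * cooldown_over: the NACK rate-limit allows another NACK
 */
static enum rcv_verdict rcv_inact_verdict(uint16_t flags,
                                          bool     stale,
                                          bool     cooldown_over)
{
        if (!stale)
                return RCV_PASS;
        if (flags & FRCT_DRF)
                return RCV_REBASE;    /* release rq[], rebase rcv_cr */
        if (!(flags & FRCT_DATA))
                return RCV_PASS;      /* control packets fall through */

        return cooldown_over ? RCV_NACK_DROP : RCV_DROP;
}
```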
5. Send path
- If the SDU exceeds (frag_mtu - data_hdr_len), the caller (dev.c) fans it out into ceil(count / (frag_mtu - data_hdr_len)) fragments, each emitted via frcti_snd as its own DATA packet with a per-fragment role (Section 7.2); both FRTX and best-effort flows fragment. Raw flows (no FRCP engagement, qos.service == SVC_RAW) carry no PCI and return -EMSGSIZE for any SDU larger than one packet at the layer below. An SDU that fits in a single packet is sent as SOLE.
- frcti_snd reserves PCI head room; sets DATA, plus DRF when the pipe is empty (snd_cr.seqno == snd_cr.lwe).
- seqno_rotate() if past sender inactivity and the pipe is empty (Section 4).
- Advertise FC (pci.window = frcti_advert_rwe(frcti), i.e. rcv_cr.rwe clamped to rcv_cr.lwe + ring_seq_cap in stream mode) when the receiver side is recent: now - rcv_cr.act < rcv_cr.inact.
- Reliable mode (FRTX): leave snd_cr.lwe where it is; reset the slot at RQ_SLOT(seqno) (snd_slots[p].time = now, snd_slots[p].flags = 0); queue an rxm_entry (saves a packet copy, arms a wheel timer at now + (rto << rto_mul)). Piggyback ACK (pci.ackno = rcv_cr.lwe) while the a-timer for the most recent received DATA packet has not yet expired (now - rcv_cr.act <= t_a); on piggyback, set rcv_cr.seqno = rcv_cr.lwe so the next delayed-ACK fire is suppressed. See Section 8 for t_a / t_r semantics.
- Best-effort mode (no FRTX): advance snd_cr.lwe immediately (snd_cr.lwe = snd_cr.lwe + 1, snd_cr.rwe = snd_cr.lwe + RQ_SIZE); no retransmit state. No send-side RTT probe is armed in this mode (rtt_probe_arm requires an in-flight seqno, which best-effort never has); the rx-driven cold seeder in frcti_rcv is the only probe path.
- In reliable mode, optionally arm an RTT probe (Section 12).
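Two small pieces of the send path, the DATA / DRF flag choice and the retransmit timer deadline, can be sketched as below. The function names are hypothetical, FRCT_DRF is an assumed symbol name, and clamping rto_mul at MAX_RTO_MUL inside the helper is this example's way of showing the backoff shift cap.

```c
#include <stdint.h>

#define FRCT_DATA   0x8000
#define FRCT_DRF    0x4000 /* name assumed; mask from the flag table */
#define MAX_RTO_MUL 20u

/* DATA always; DRF additionally when nothing is in flight. */
static uint16_t data_flags(uint32_t snd_seqno, uint32_t snd_lwe)
{
        uint16_t flags = FRCT_DATA;

        if (snd_seqno == snd_lwe)
                flags |= FRCT_DRF;

        return flags;
}

/* Wheel-timer deadline: now + (rto << rto_mul), shift capped. */
static uint64_t rxm_deadline(uint64_t now, uint64_t rto, unsigned rto_mul)
{
        if (rto_mul > MAX_RTO_MUL)
                rto_mul = MAX_RTO_MUL;

        return now + (rto << rto_mul);
}
```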
6. Receive path
6.1. Early-exit dispatch
Keepalive (KA), RTT probe (RTTP), pre-DRF NACK, and rendezvous
(RDVS) packets short-circuit out of frcti_rcv before the locked
main path; each handler takes its own lock internally.
incoming packet
      |
      v
 +---------+
 |   KA?   |---yes--> ka_rcv  ; return
 +---------+
      |no
      v
 +---------+
 |  RTTP?  |---yes--> rttp_rcv; return
 +---------+
      |no
      v
 +---------+
 |  NACK?  |---yes--> nack_rcv; return (see Section 9)
 +---------+
      |no
      v
 +---------+
 |  RDVS?  |---yes--> rdv_rcv ; return (reply bare FC, ackno=0)
 +---------+
      |no
      v
acquire wrlock; enter locked main path
KA - refresh t_ka_rcv, honour piggybacked ACK.
RTTP - probe (echo back nonce) or echo (verify nonce, sample RTT).
NACK - pre-DRF, sender-side handler. See Section 9.
RDVS - reply with a bare FC packet (ackno = 0); rdlock only.
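The dispatch order in the diagram maps to a simple chain of flag tests. The masks are those of Section 1.2; FRCT_KA, FRCT_NACK and FRCT_RDVS are assumed symbol names following the FRCT_ prefix, and the enum and function are hypothetical.

```c
#include <stdint.h>

#define FRCT_NACK 0x1000
#define FRCT_RDVS 0x0400
#define FRCT_RTTP 0x0020
#define FRCT_KA   0x0010

enum early { EARLY_KA, EARLY_RTTP, EARLY_NACK, EARLY_RDVS, EARLY_NONE };

/* First match wins, in the order KA, RTTP, NACK, RDVS. */
static enum early early_dispatch(uint16_t flags)
{
        if (flags & FRCT_KA)
                return EARLY_KA;
        if (flags & FRCT_RTTP)
                return EARLY_RTTP;
        if (flags & FRCT_NACK)
                return EARLY_NACK;
        if (flags & FRCT_RDVS)
                return EARLY_RDVS;

        return EARLY_NONE; /* continue to the locked main path */
}
```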
6.2. Locked main path
Steps below run with the per-flow frcti.lock held for writing
(pthread_rwlock_wrlock) unless noted.
rcv_inact_check - Only meaningful when the receive side is stale. On DRF (Data Run Flag): release rq[] slots, rebase rcv_cr, continue. On stale DATA without DRF: fire a pre-DRF NACK if cooldown allows (Section 9), then discard the packet; on cooldown, drop without sending a NACK (a pending cumulative ACK from drop_packet may still go out). Non-DATA, non-DRF arrivals bypass rcv_inact_check entirely; pure-DRF stale arrivals fall through after the DRF rebase branch.
DATA-only act refresh - Refresh rcv_cr.act only when FRCT_DATA is set, so that non-DATA packets never block the next DRF rebase.
Wire-dup gate - Before flag-driven dispatch, drop wire-duplicate ACKs and wire-duplicate DATA (is_dup_ack / is_dup_data). The DATA check is bypassed for FRCT_RXM-bearing arrivals so the piggybacked ACK / SACK / FC carried on a retransmitted DATA at an already-ACK'd seqno is still applied; the stale-in-window branch below then drops the packet.
ACK - Drop ACKs whose ackno falls outside (snd_cr.lwe, snd_cr.seqno]. If ackno == snd_cr.lwe (non-advancing cumulative ACK), drive RACK fast-retransmit consideration (Section 8). Otherwise advance snd_cr.lwe = ackno, collapse rto_mul to 0 (Karn-gated by SND_RTX on the just-acknowledged slot, the old head-of-line), reset dup_thresh to 0, update t_latest_ack to the send-time of the slot at ackno-1 (consumed by RACK and SACK below), decay reo_wnd_mult per RFC 8985 sec. 6.2 step 4, exit NewReno-careful recovery (see Section 8) on ackno >= recovery_high or ackno == snd_cr.seqno, and feed an RTT sample if eligible (Section 12).
SACK - Walk the block list. For each block (a present range above lwe) NULL out snd_slots[k].rxm, clear the slot's per-send flags, and advance t_latest_ack to the latest send-time covered (the Forward Acknowledgement / fack equivalent, Mathis & Mahdavi 1996); the first block whose start clamps to snd_cr.lwe skips this fack update so that a head-of-line clamp does not falsely advance fack. For un-SACKed gaps below hi_sacked, stage a retransmit per slot that is (1) still owned (rxm != NULL), (2) not already SND_FAST_RXM, (3) not aged out past t_r, and (4) either outside the RACK reorder window R OR with dup_thresh >= DUP_THRESH (the RFC 8985 sec. 6.2 hybrid trigger). Mark the slot SND_FAST_RXM and NULL the rxm at stage time. Capped at SACK_RXM_MAX staged retransmits per receive pass; what's left rides the next SACK.
FC - Bump snd_cr.rwe (clamped to lwe + RQ_SIZE, never shrinks) and mark window open.
DATA - Bounds-check seqno against window. On stale-dup (seqno < rcv_cr.lwe), set rcv_cr.seqno = seqno to force a fresh ACK on the next ack_snd, then drop. On accept: both FRTX and best-effort stash the packet-buffer index into rq[seqno mod RQ_SIZE]. Fragments stash unchanged - the role bits are inspected only at consume time (Section 7.2). On out-of-order arrival, build a SACK reply if not rate-limited (per Section 3) and not deduplicated against the previous (rcv_cr.lwe, n_blocks) pair; D-SACK reports always bypass the dedup. If both rate-limit and dedup suppress the reply, neither SACK nor delayed-ACK fires (the sender picks up the gap on its next ACK). On in-order arrival, arm the delayed-ACK timer.
drop_packet exit - Releases the per-packet shared-memory buffer (spb), then calls ack_snd synchronously after the spb release to surface any pending cumulative ACK.
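As one concrete instance of the steps above, the FC update ("clamped to lwe + RQ_SIZE, never shrinks") can be sketched with the modular comparators. The function name fc_update is hypothetical, and RQ_SIZE is taken at its default of 128.

```c
#include <stdbool.h>
#include <stdint.h>

#define RQ_SIZE 128u

static bool before(uint32_t a, uint32_t b)
{
        return (int32_t)(a - b) < 0;
}

static bool after(uint32_t a, uint32_t b)
{
        return before(b, a);
}

/* New snd rwe: advertised edge, clamped to lwe + RQ_SIZE, never lowered. */
static uint32_t fc_update(uint32_t lwe, uint32_t rwe, uint32_t adv)
{
        uint32_t cap = lwe + RQ_SIZE;

        if (after(adv, cap))
                adv = cap;

        return after(adv, rwe) ? adv : rwe;
}
```

An advertised edge that lies behind the current rwe is ignored, which keeps a reordered or stale FC packet from closing the window.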
7. Read path and reassembly
7.1. Read path
flow_read returns a full reassembled SDU (Service Data Unit) via
frcti_consume on every FRCP SDU-mode flow (FRTX or best-effort);
stream-mode is covered in Section 16. An incomplete head-of-line
(HoL) run yields -EAGAIN; an oversized run yields -EMSGSIZE (the
run is dropped so the flow does not stall). On best-effort flows,
a permanently-lost mid-fragment is dropped as soon as a later
complete SDU becomes visible in the ring (Section 7.2 skip-past-
gap).
Raw flows carry no frcti, so flow_read returns the next pending
packet-buffer index directly, with no role-bit inspection. (Raw
service is selected via qos.service == SVC_RAW at flow allocation,
which suppresses frcti creation.)
frcti_pdu_ready is the no-advance peek used by fevent (the
Ouroboros flow-event multiplexer, the poll(2)-equivalent on
flows). It returns ready only when the head-of-line run is
complete and the lead packet (a Protocol Data Unit, here one FRCP
packet) is present at rcv_cr.rwe - RQ_SIZE; any other state
(including the best-effort skip-past-gap case) returns not ready,
and frcti_consume is left to drop the broken prefix and re-
inspect.
7.2. Fragmentation and reassembly
Send side (flow_write_frag). An SDU larger than
(frag_mtu - PCI) is split into ceil(count / (frag_mtu - PCI))
fragments; each fragment is its own FRCP packet with its own
seqno and a per-fragment role flag pair (Section 1.2). Roles are
assigned at emit time:
| i | Role |
|---|---|
| n=1 | SOLE |
| i=0 | FIRST |
| i=n-1 | LAST |
| else | MID |
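The fragment count and the role assignment in the table can be sketched as two small helpers. Names are hypothetical; max_pl stands for the per-fragment payload room (frag_mtu minus the header), and treating a zero-length SDU as a single SOLE packet is an assumption of this example.

```c
#include <stddef.h>

enum frag_role { ROLE_SOLE, ROLE_FIRST, ROLE_MID, ROLE_LAST };

/* ceil(count / max_pl); max_pl must be non-zero. */
static size_t frag_count(size_t count, size_t max_pl)
{
        if (count == 0)
                return 1; /* assumption: empty SDU goes out as SOLE */

        return (count + max_pl - 1) / max_pl;
}

/* Role of fragment i out of n, assigned at emit time. */
static enum frag_role frag_role_at(size_t i, size_t n)
{
        if (n == 1)
                return ROLE_SOLE;
        if (i == 0)
                return ROLE_FIRST;
        if (i == n - 1)
                return ROLE_LAST;

        return ROLE_MID;
}
```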
A mid-loop allocation or transmit failure may yield a partial
write: the call returns the bytes already enqueued (off > 0) or
the underlying error (off == 0). Best-effort flows fragment
identically; on the receiver, a partial run with a permanently-
lost fragment is dropped when a later complete SDU is visible in
the ring (see skip-past-gap below). Raw flows carry no PCI and
refuse anything larger than the layer's user MTU (-EMSGSIZE).
Wire-level recovery is fragment-agnostic on FRTX flows: each
fragment's seqno flows through SACK / RACK / RTO / NACK exactly
as for a SOLE DATA packet, and reassembly does not re-enter the
loss-detection path. Best-effort flows run the same seqno
machinery (DRF, FC, ACK piggyback, pre-DRF NACK emit) but queue
no rxm state at the sender, so a lost MID is unrecoverable;
skip-past-gap handles it (below).
Receive side. Fragments stash into rq[seqno] unchanged; role bits
are read only at consume time. frag_run_inspect, called from
frcti_consume, walks the ring starting at the oldest still-
undelivered seqno base = rcv_cr.rwe - RQ_SIZE (equal to rcv_cr.lwe
only when no partial run is in progress; during a partial run lwe
has already advanced past base). It produces one of three
outcomes:
| Outcome | Cause |
|---|---|
| DELIVER (n) | rq[base]=SOLE (n=1), or rq[base]=FIRST and a LAST follows in slots [base+1..base+n-1] with all intermediate roles in {MID,FIRST,LAST} contiguous. |
| DROP (n) | rq[base] is MID or LAST without a preceding FIRST (n=1); a FIRST..[non-LAST]..new-FIRST or new-SOLE mid-run (drop the broken prefix with n = run length minus 1, so the new FIRST/SOLE stays); or, on best-effort flows, a gap at base with a FIRST/SOLE later in the ring (drop up to the new run start). |
| NOT_READY | rq[base] absent or FIRST..[non-LAST] with no later FIRST/SOLE in the ring (FRTX waits for retx; best-effort waits for arrival). |
DELIVER triggers frag_gather: a scatter-gather memcpy of the n
consecutive fragments at rq[base..base+n-1] directly into the
caller's buffer; each per-packet shared-memory buffer (spb) is
released and rwe advances by n. lwe was already advanced
incrementally as each contiguous fragment arrived; frag_gather
only restores the fixed-width invariant rwe == lwe + RQ_SIZE.
No intermediate reassembly buffer is allocated.
DROP advances rwe past the broken prefix (releasing the spbs)
and pulls lwe up to the new trailing edge if needed; the next
consume retries from the new base. Oversize or arithmetically
overflowing delivery (sum of fragment lengths > max_rcv_sdu, sum
> caller's buffer, or running-sum overflow) also drops the run
with -EMSGSIZE.
Skip-past-gap (best-effort only). On FRTX, a gap in the run means
"waiting for retransmit" and frag_run_inspect returns NOT_READY.
On best-effort flows the gap is permanent, so frag_run_inspect
scans forward in the ring for the next FIRST or SOLE; if one is
visible within RQ_SIZE, it returns DROP for the broken prefix and
the consume loop retries at the new lwe. Memory hold is bounded
by RQ_SIZE; the partial releases on the next consume call once a
later complete run exists. Voice-like flows (one SOLE per SDU)
see no extra wait: any later SOLE makes the prior gap droppable
immediately.
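The three outcomes and the skip-past-gap rule can be modelled on a flat array of slot roles. The sketch below is a simplified model, not frag_run_inspect itself: the ring is presented as an array whose index 0 is base, the type and function names are hypothetical, and only MID is accepted as an intermediate role between FIRST and LAST.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot states; ABSENT models an empty rq[] slot. */
enum slot_role { ABSENT, SOLE, FIRST, MID, LAST };
enum outcome   { DELIVER, DROP, NOT_READY };

struct verdict {
        enum outcome out;
        size_t       n;  /* slots to deliver or drop; 0 if NOT_READY */
};

/* Best-effort only: find a later run start at index >= from. */
static struct verdict skip_past_gap(const enum slot_role *rq, size_t len,
                                    size_t from, bool best_effort)
{
        struct verdict v = { NOT_READY, 0 };
        size_t         j;

        if (!best_effort)
                return v;  /* FRTX: the gap means "wait for retransmit" */

        for (j = from; j < len; ++j) {
                if (rq[j] == FIRST || rq[j] == SOLE) {
                        v.out = DROP;
                        v.n   = j;  /* drop [0, j), keep the new run start */
                        break;
                }
        }

        return v;
}

/* rq[0] is the oldest undelivered slot (base); len is the ring width. */
struct verdict run_inspect(const enum slot_role *rq, size_t len,
                           bool best_effort)
{
        struct verdict v = { NOT_READY, 0 };
        size_t         i;

        assert(rq != NULL || len == 0);

        if (len == 0)
                return v;

        switch (rq[0]) {
        case ABSENT:
                return skip_past_gap(rq, len, 1, best_effort);
        case SOLE:
                v.out = DELIVER; v.n = 1;
                return v;
        case MID:
        case LAST:  /* no preceding FIRST: drop the orphan */
                v.out = DROP; v.n = 1;
                return v;
        case FIRST:
                break;
        }

        for (i = 1; i < len; ++i) {
                switch (rq[i]) {
                case MID:
                        continue;
                case LAST:
                        v.out = DELIVER; v.n = i + 1;
                        return v;
                case FIRST:
                case SOLE:  /* new run started before the old one closed */
                        v.out = DROP; v.n = i;
                        return v;
                case ABSENT:
                        return skip_past_gap(rq, len, i + 1, best_effort);
                }
        }

        return v;  /* FIRST..MID.. with no LAST yet */
}
```

For the ring contents FIRST, (gap), LAST, SOLE this model returns NOT_READY on an FRTX flow and DROP with n = 3 on a best-effort flow, leaving the SOLE at the new base.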
The choice to defer reassembly to consume time keeps the receive path zero-copy: fragments stay in the shared-memory ring until the application pulls, and the SDU lands directly in the caller's buffer.
8. Retransmission
FRCP is bounded by two delta-t-derived timers (Watson 1981, see Section 15):
- t_a (a-timer): upper bound on ACK delay. An ACK for a received DATA packet MUST be emitted within t_a of receipt; an attempt to send an ACK after the a-timer has expired is suppressed (the sender's RTO is already in motion).
- t_r (r-timer): upper bound on retransmission. A given DATA packet MUST NOT be retransmitted after t_r has elapsed since its first send (t0); when the bound is hit, the flow is declared down (raising the Ouroboros asynchronous flow condition ACL_FLOWDOWN, which marks the flow dead to both endpoints) rather than retransmitted again.
Each in-flight FRTX seqno owns one rxm_entry, armed in a hashed
timing wheel; the wheel deadline is the slot's next eligible
retransmit time.
- RTO timer: On fire (rxm_due), re-emit with FRCT_RXM, mark SND_RTX (Karn-suppress next ACK's RTT sample), and (for the head-of-line (HoL) slot only) bump rto_mul up to MAX_RTO_MUL. Wheel deadline is t_send + (rto << rto_mul). Re-armed unless consumed. The RTO timer also clears SND_FAST_RXM (re-arming fast-retransmit eligibility), resets reo_wnd_mult to 1 on a HoL fire (RFC 8985 sec. 6.2 step 4 reset clause), and marks the flow ACL_FLOWDOWN if its frct_tx call fails.
- r-timer guard: Before any retransmit attempt, check (now - t0) against t_r. If exceeded, the slot is no longer eligible for retransmit. Only the RTO timer (rxm_due) treats r-timer expiry as terminal: it marks the flow ACL_FLOWDOWN (peer unreachable). Fast-retransmit, SACK-driven retransmit, and NACK-driven head-of-line re-emit silently skip aged-out slots and defer the flow-down decision to the next RTO fire.
- Fast retransmit (hybrid trigger, RFC 8985 sec. 6.2): On a non-advancing cumulative ACK with the scoreboard advanced, fire one fast retransmit when EITHER (a) the head-of-line slot's latest send is older than the RACK reorder window R (Section 3) and not yet aged out, OR (b) the SACK dup-thresh count above snd_cr.lwe reaches DUP_THRESH (= 3, RFC 8985 sec. 6.2 step 4). Fires at most once per non-advancing cumulative-ACK value, gated by rack_fired_lwe (the snd_cr.lwe at which fast-retransmit last fired). Set SND_FAST_RXM on the slot (one-shot per-slot gate) and enter NewReno-style careful recovery (see NewReno below in this section).
  The RACK reorder window R uses the RFC 8985 sec. 6.2 form R = MIN(reo_wnd_mult * min_RTT / 4, SRTT) with a MIN_REORDER_NS = 250 us floor. Before the first RTT sample seeds min_rtt, R falls back to MIN(reo_wnd_mult * SRTT / 4, SRTT), still floored at MIN_REORDER_NS (consistent with the windowed-minimum fallback described in Section 12). min_rtt is a windowed minimum over the last MIN_RTT_WIN_NS = 5 min of RTT samples (matches the Linux tcp_min_rtt_wlen default) so a route change to a longer path eventually re-anchors the reorder window without relying on reo_wnd_mult growth alone.
- SACK-driven retransmit: For each gap below hi_sacked whose slot is (1) still owned, (2) not already SND_FAST_RXM, (3) not aged out past t_r, and (4) either outside the RACK window R OR with dup_thresh >= DUP_THRESH (same hybrid as fast-retransmit, see Section 6.2), re-emit. Each SACK-driven retransmit re-arms a fresh rxm so a lost retransmit can still be recovered by its own RTO timer.
- NewReno: On entry, recovery_high = snd_cr.seqno + RTT_QUARANTINE. Exit when ackno >= recovery_high or ackno == snd_cr.seqno (the latter means everything sent has been acknowledged). seqno_rotate also clears recovery.
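The reorder-window formula and the hybrid fast-retransmit trigger described in the list above can be sketched as follows. This is a minimal model under stated assumptions: all times are nanoseconds on one monotonic clock, the function names are hypothetical, the r-timer guard is applied to both trigger arms, and the once-per-lwe gate (rack_fired_lwe) and the per-slot SND_FAST_RXM gate are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIN_REORDER_NS 250000ULL  /* 250 us floor */
#define DUP_THRESH     3

/* R = MIN(reo_wnd_mult * base / 4, srtt), floored at MIN_REORDER_NS.
 * base is min_rtt once seeded; min_rtt == 0 means "unset" and falls
 * back to srtt. */
uint64_t rack_reorder_window(uint64_t min_rtt, uint64_t srtt,
                             uint32_t reo_wnd_mult)
{
        uint64_t base = min_rtt != 0 ? min_rtt : srtt;
        uint64_t r;

        assert(reo_wnd_mult >= 1);

        r = (uint64_t) reo_wnd_mult * base / 4;
        if (r > srtt)
                r = srtt;
        if (r < MIN_REORDER_NS)
                r = MIN_REORDER_NS;  /* the floor wins over the SRTT cap */

        return r;
}

/* Hybrid trigger for the head-of-line slot on a non-advancing ACK.
 * t0 is the slot's first send, t_send its latest send. */
bool fast_rxm_due(uint64_t now, uint64_t t_send, uint64_t t0,
                  uint64_t t_r, uint64_t r, unsigned sack_dups)
{
        if (now - t0 > t_r)
                return false;  /* r-timer guard: slot aged out, skip */

        return now - t_send > r || sack_dups >= DUP_THRESH;
}
```

With srtt = 8 ms and no min_rtt sample yet, the window is 2 ms; once min_rtt = 4 ms is seeded it tightens to 1 ms, and a reo_wnd_mult of 16 saturates it at srtt.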
9. Pre-DRF NACK
The two sides have different inactivity thresholds (snd_cr.inact > rcv_cr.inact), so a receiver can detect "stale data run" before the sender's own DRF logic kicks in. NACK is the receiver-driven nudge that asks the sender to re-transmit the head of the run.
- Send (frcti_nack_snd, called by frcti_rcv when rcv_inact_check returns FRCT_INACT_NEED_NACK): When an incoming DATA packet has no DRF and rcv-side activity is older than rcv_cr.inact, the receiver emits a bare packet with flags = FRCT_NACK and seqno = arrival_seqno - 1 (informational only, not consulted by the receive handler). The cooldown in Section 3 rate-limits the burst. Non-DATA non-DRF arrivals bypass rcv_inact_check entirely; non-DATA DRF still rebases via the DRF branch.
- Receive (frcti_nack_rcv): Dispatched in the early-exit branch (Section 6.1), before rcv_inact_check. The sender copies the head-of-line (HoL) rxm packet, marks the slot SND_RTX | SND_FAST_RXM (Karn-suppress next ACK, one-shot fast-rxm gate), sets rtt_lwe = snd_cr.lwe + 1, and re-emits via fast_rxm_send with FRCT_RXM and a refreshed ackno. The original rxm_entry and its RTO timer are left armed - the NACK emit is additive to the normal retransmit machinery, not a replacement. No-op if nothing is in flight, the HoL slot has aged past t_r, or the HoL rxm pointer has been cleared by SACK or RACK.
NACK has exactly one role: lost first-of-run (DRF) packet recovery. Until the DRF packet arrives, the receiver cannot rebase its window, so any subsequent in-flight packets look stale to the receiver. The NACK fires the moment a stale receiver sees DATA without DRF, telling the sender to re-emit the head-of-line (DRF) packet at NACK-cooldown latency rather than waiting for the initial RTO (which is the configured default until srtt is seeded by the first probe round-trip). Mid-stream loss is NOT NACK-driven; it is recovered by the sender's RTO, fast retransmit, and SACK-driven retransmit paths (Section 8) only.
The existing rxm_entry and its RTO timer are left armed on a NACK re-emit, so the RTO path remains the eventual fallback.
10. Cumulative + selective ACK
Cumulative ACK is ackno = rcv_cr.lwe. On out-of-order arrival the
receiver also emits a SACK packet (Section 1.3) whose payload lists
present blocks above lwe (analogous to TCP SACK / QUIC ACK
ranges). SACKs are rate-limited per Section 3 and suppressed when
neither lwe nor block count has changed since the last SACK.
D-SACK reports (RFC 2883) are emitted in-band as block[0] of an
otherwise normal SACK frame (see Section 1.3 for the encoding).
Two receiver triggers arm a pending D-SACK report (single-slot,
latest-wins):
- DATA arrival with seqno < rcv_cr.lwe, both wire-dup (no RXM, is_dup_data path) and retransmit (RXM, post-FC branch) (RFC 2883 sec. 4.1.1, full duplicate)
- rq_accept conflict, slot already occupied in [lwe, rwe) (RFC 2883 sec. 4.1.2, partial duplicate)
When a D-SACK is pending and the standard scoreboard SACK would be
suppressed by dedup or rate-limit, the report is emitted as a
stand-alone SACK frame through the normal ack_snd path; when a
D-SACK report is pending the path bypasses dedup and the TICTIME
rate-limit, but the a-timer suppression on rcv inactivity still
applies.
Bare ACKs are deferred via a per-flow delayed-ACK timer (one in
flight at a time, atomic test-and-set dedup; fires per Section 3
after the first in-order arrival). Suppressed if (1) no new
seqno, (2) rcv side is inactive (older than t_a), or (3) the
sender just sent within TICTIME. A pending D-SACK ride-through
bypasses (1) and (3); the a-timer gate (2) is unconditional.
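The three suppression conditions and the D-SACK bypass can be written as a single predicate. The sketch below is an assumption-laden illustration: struct ack_gate and bare_ack_suppressed are hypothetical names, times are in nanoseconds, and the boundary choices ("older than" as strictly greater, "within" as strictly less) are guesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the inputs to the delayed-ACK decision. */
struct ack_gate {
        bool     new_seqno;      /* something new to acknowledge        */
        bool     dsack_pending;  /* a D-SACK report is waiting          */
        uint64_t rcv_idle_ns;    /* time since last rcv-side activity   */
        uint64_t since_snd_ns;   /* time since this side last sent      */
};

/* Returns true when the bare ACK must not be emitted. */
bool bare_ack_suppressed(const struct ack_gate *g,
                         uint64_t t_a_ns, uint64_t tictime_ns)
{
        assert(g != NULL);

        if (g->rcv_idle_ns > t_a_ns)
                return true;   /* (2) a-timer gate, unconditional */

        if (g->dsack_pending)
                return false;  /* ride-through bypasses (1) and (3) */

        return !g->new_seqno                    /* (1) nothing new  */
                || g->since_snd_ns < tictime_ns; /* (3) just sent    */
}
```

The a-timer check comes first so that no later branch can override it.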
11. Flow control
The receiver advertises rwe in every FC field. The sender treats
its snd_cr.rwe as the absolute right edge: when
snd_cr.seqno >= snd_cr.rwe the window is closed and flow_write
yields. While closed, the sender periodically emits RDVS
(rendezvous) packets (cadence DELT_RDV); the receiver replies with
a bare FC packet (ackno = 0) that reopens the window. Once the
window has been closed for longer than MAX_RDV the sender stops
emitting RDVS but does not tear the flow down - the writer keeps
blocking until either a peer-driven FC arrives or the KA
(keepalive) / r-timer marks the flow.
rwe is clamped to lwe + RQ_SIZE on receipt and MUST NOT shrink:
a backward rwe is silently clamped to the current snd_cr.rwe;
the FC packet still reopens the window.
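The closed-window test and the clamp / non-shrink rule can be sketched with the modular comparator from the Notation section. The function names are hypothetical and RQ_SIZE is passed as a parameter because its value is not stated here; the sketch assumes it is below 2^31 so that modular comparison stays meaningful.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Modular comparator from the Notation section. */
static bool before(uint32_t a, uint32_t b)
{
        return (int32_t) (a - b) < 0;
}

/* The window is closed when seqno has reached the right edge. */
bool window_closed(uint32_t seqno, uint32_t rwe)
{
        return !before(seqno, rwe);
}

/* Apply an advertised rwe to the sender's current right edge:
 * clamp to lwe + rq_size, and never move backwards. */
uint32_t fc_rwe_update(uint32_t cur_rwe, uint32_t lwe,
                       uint32_t adv_rwe, uint32_t rq_size)
{
        uint32_t max_rwe = lwe + rq_size;  /* wraps modulo 2^32 */

        assert(rq_size > 0 && rq_size < 0x80000000u);

        if (before(max_rwe, adv_rwe))
                adv_rwe = max_rwe;   /* clamp to lwe + RQ_SIZE */

        if (before(adv_rwe, cur_rwe))
                return cur_rwe;      /* backward rwe: keep current edge */

        return adv_rwe;
}
```

Because both checks use before(), the update behaves the same when the edges straddle the 2^32 wrap.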
12. RTT estimation
Active RTTP probes (Section 1.4) carry a 32-bit probe_id (0
reserved) and a 16-byte random nonce echoed verbatim - defends
against spoofed replies. A ring of RTTP_RING in-flight probes is
kept; an echo whose (id, nonce) doesn't match the ring slot is
dropped. A single RTTP sample is clamped to RTT_CLAMP_MUL * srtt
(compile-time RTT_CLAMP_MUL = 16) once srtt is seeded; the first
cold-probe sample feeds rtt_update raw.
Probe arming gates:
- Cold (no srtt yet): the receive path arms at most one probe per 100 ms via frcti_rcv_probe (PROBE_DUE_COLD); arming requires an incoming packet. Active send-path arming bails while srtt == 0.
- Warm (rtt_probe_arm, called from frcti_snd): outstanding data (snd_cr.seqno > snd_cr.lwe), AND at least 2 * srtt since t_rcv_rtt (last RTT receive of any kind), AND at least srtt since t_snd_probe (last probe emit).
Sample feeds either Linux's asymmetric mdev estimator
(FRCT_LINUX_RTT_ESTIMATOR, default ON) or RFC 6298 symmetric EWMA
(compile option). srtt is floored at 10 ms when seeded from a
hint, at 1 us after every update (including the first seeding
sample); mdev floored at 100 ns.
RTO = max(rto_min, 2 * srtt, srtt + (mdev << MDEV_MUL))
(the 2 * srtt floor is an FRCT addition not in RFC 6298).
Effective wheel deadline capped per Section 3.
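As a worked illustration of the formula above, the following sketch computes the RTO in nanoseconds. rto_compute is a hypothetical name, and MDEV_MUL is passed in as a parameter because its compile-time value is not given in this section; the wheel-deadline cap from Section 3 is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* RTO = max(rto_min, 2 * srtt, srtt + (mdev << mdev_mul)), all in ns. */
uint64_t rto_compute(uint64_t srtt, uint64_t mdev,
                     uint64_t rto_min, unsigned mdev_mul)
{
        uint64_t rto;

        assert(mdev_mul < 64);  /* shift must stay below the type width */

        rto = srtt + (mdev << mdev_mul);
        if (rto < 2 * srtt)
                rto = 2 * srtt;  /* FRCT-specific floor */
        if (rto < rto_min)
                rto = rto_min;

        return rto;
}
```

For srtt = 10 ms, mdev = 1 ms and a shift of 2, the variance term gives 14 ms, so the 2 * srtt floor decides and the RTO is 20 ms; with mdev = 5 ms the variance term dominates at 30 ms.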
ACK-derived samples (frcti_ack_rcv -> rtt_sample_eligible), beyond
the cum-ACK advance gate in frcti_ack_rcv (ackno > lwe and
ackno <= seqno), require all of: not in recovery; ACK packet does
not carry FRCT_RXM; HoL slot's SND_RTX bit clear; slot's rxm
pointer non-NULL (not SACK-consumed); lwe not below the rtt_lwe
fence; srtt already seeded by an RTTP probe. There is no ACK-only
seeding.
Every eligible sample also feeds RACK.min_RTT (RFC 8985 sec. 6.2)
via a windowed minimum: replace whenever the sample is strictly
smaller OR more than MIN_RTT_WIN_NS (5 min, matches Linux
tcp_min_rtt_wlen) has elapsed since the current min was set. The
downward branch is immediate (faster path picked up at once); the
upward branch is gated on the window (a transient queue burst does
not poison the estimate, but a sustained route change to a longer
path re-anchors min_RTT after at most one window). Seeded from
rtt_hint at rtt_init; 0 acts as the unset sentinel and the base
in rack_reorder_window falls back from min_RTT to SRTT (so
R = mult * SRTT/4, capped at SRTT, floored at MIN_REORDER_NS)
until the first sample. See Section 6.2.
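The windowed-minimum rule can be sketched as a two-field tracker. struct min_rtt and min_rtt_update are hypothetical names; treating val == 0 as "take the first sample unconditionally" is an assumption that follows from the unset-sentinel description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_RTT_WIN_NS (5ULL * 60 * 1000000000ULL)  /* 5 min */

struct min_rtt {
        uint64_t val;    /* current windowed minimum, 0 = unset  */
        uint64_t t_set;  /* time at which val was last replaced  */
};

/* Feed one eligible RTT sample (ns) taken at time now (ns). */
void min_rtt_update(struct min_rtt *m, uint64_t sample, uint64_t now)
{
        assert(m != NULL);

        if (m->val == 0                          /* unset sentinel     */
            || sample < m->val                   /* downward: at once  */
            || now - m->t_set > MIN_RTT_WIN_NS)  /* upward: after win  */
        {
                m->val   = sample;
                m->t_set = now;
        }
}
```

A larger sample is ignored until the window has elapsed, so a short queueing burst leaves the minimum untouched while a lasting route change replaces it after at most one window.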
13. Liveness (keepalive)
When qs.timeout > 0 a per-flow KA (keepalive) timer is armed.
Arming uses rcv_cr.act for the deadline computation:
deadline = min(snd_act + qs.timeout/4, rcv_act + qs.timeout)
(clamped to now + qs.timeout/4 if already past). The timer fires
either on sender idleness (to send a KA) or on receiver idleness
(to declare the peer dead). On fire (ka_snd) the peer-dead test
uses max(rcv_cr.act, t_ka_rcv) so a recent KA reply counts even
when no DATA has arrived:
- If now - max(rcv_cr.act, t_ka_rcv) > qs.timeout, mark the flow ACL_FLOWPEER and notify the per-process flow-event set (proc.fqset) with FLOW_PEER.
- Else if snd_idle > qs.timeout/4, emit a bare KA | ACK (ackno = rcv_cr.lwe) and re-arm.
- Else just re-arm.
Note: rx_rb and tx_rb are the receive and transmit shared-memory
ring buffers. The r-timer raises ACL_FLOWDOWN on both (route is
broken); keepalive raises ACL_FLOWPEER on rx_rb only and notifies
the flow-event set (peer is silent, writer keeps tx_rb usable) -
distinct ACLs. qs.timeout == 0 disables keepalive entirely; a
silent peer crash is then undetected.
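The deadline computation and the three-way decision on fire can be sketched as below. The names ka_deadline, ka_on_fire and the KA_* enumerators are hypothetical, all arguments are assumed to share one time unit, and snd_idle is taken to mean now - snd_act.

```c
#include <assert.h>
#include <stdint.h>

enum ka_action { KA_PEER_DEAD, KA_SEND, KA_REARM };

/* deadline = min(snd_act + timeout/4, rcv_act + timeout), pushed to
 * now + timeout/4 when that moment is already in the past. */
uint64_t ka_deadline(uint64_t now, uint64_t snd_act,
                     uint64_t rcv_act, uint64_t timeout)
{
        uint64_t snd_dl = snd_act + timeout / 4;
        uint64_t rcv_dl = rcv_act + timeout;
        uint64_t dl     = snd_dl < rcv_dl ? snd_dl : rcv_dl;

        assert(timeout > 0);

        if (dl < now)
                dl = now + timeout / 4;

        return dl;
}

/* Decision taken when the KA timer fires. */
enum ka_action ka_on_fire(uint64_t now, uint64_t snd_act,
                          uint64_t rcv_act, uint64_t t_ka_rcv,
                          uint64_t timeout)
{
        /* A recent KA reply counts even when no DATA has arrived. */
        uint64_t last_rcv = rcv_act > t_ka_rcv ? rcv_act : t_ka_rcv;

        assert(timeout > 0);

        if (now - last_rcv > timeout)
                return KA_PEER_DEAD;  /* ACL_FLOWPEER + FLOW_PEER   */

        if (now - snd_act > timeout / 4)
                return KA_SEND;       /* bare KA | ACK, then re-arm */

        return KA_REARM;
}
```

The peer-dead test is evaluated first, so a silent peer is reported even if the local sender has itself been idle.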
14. Linger / teardown
On flow_dealloc, frcti_dealloc computes a grace timeout
max(rcv_cr.act + rcv_cr.inact, snd_cr.act + snd_cr.inact) - now
(floored at 0 and converted to seconds) and returns it; flow_dealloc
forwards this to the IRMd as the dealloc grace. The IRMd, not FRCT,
performs the wait. Before computing the timeout, FRCT may emit a
final ACK when rcv_cr.lwe != rcv_cr.seqno (the peer has not been
told the most recent cumulative ACK) AND the rcv side has been
active within t_a (a-timer not aged out).
FRCTFLINGER is honoured only when snd_cr.lwe < edge, where edge =
snd_fin_seqno after FIN has been sent in stream mode and
snd_cr.seqno otherwise (data or FIN still in flight). The drain
itself runs in flow_dealloc's while (FRCTI_LINGERING) loop, not in
frcti_dealloc.
The fd is single-reader / single-writer (documented in the
manpages). flow_write pumps rx_rb on every call (via
flow_wait_window -> flow_drain_rx_nb) and additionally blocks on
rx_rb when the send window is closed. A pure-writer thread thus
consumes ACKs without a dedicated reader.
15. Heritage and adopted techniques
Delta-t (Watson, 1981) is the primary heritage; FRCP descends from
the delta-t protocol family via the Recursive InterNetwork
Architecture (RINA; Day, "Patterns in Network Architecture", 2008,
ch. 9). Timer-based connection management
(no SYN/FIN handshake, per-flow state born on first DATA and
reclaimed after t_mpl + a + r of silence), the DRF marker, and the
t_mpl / t_a / t_r timers all come from delta-t. See Watson,
"Timer-Based Mechanisms in Reliable Transport Protocol Connection
Management", Computer Networks 5 (1981).
The unified flow_alloc(name, qos, ...) primitive and its
multi-axis QoS-cube argument (Section 2.2) also come from RINA
(Day 2008, ch. 6; Grasa et al., "IRATI: investigating RINA as an
alternative to TCP/IP", Computer Networks 92 (2015)) - reliability,
ordering, CRC presence, and encryption are flow attributes, not
separate sockets or protocols.
The table below summarises additional adopted techniques and their references.
| FRCP mechanism | Heritage | Reference / note |
|---|---|---|
| Random new seqno on seqno_rotate | TCP ISN | RFC 6528 (Gont & Bellovin, 2012). QUIC PN-space reset (RFC 9000 sec. 12.3) is a structural analogue. |
| Cumulative ACK, left-window-edge advance | TCP | RFC 793 / RFC 9293 |
| Receive window with non-shrink rule | TCP | RFC 793 sec. 3.7 / RFC 9293 sec. 3.8.6; RFC 1122 sec. 4.2.2.16 for the explicit non-shrink prohibition |
| Modular seqno arithmetic (before/after helpers) | TCP | RFC 793 sec. 3.3 / RFC 9293 sec. 3.4 |
| Selective ACK block list | TCP | RFC 2018 (Mathis et al., 1996). Encoded as a typed FRCP packet rather than a TCP option, so framing is closer to QUIC ACK frames. D-SACK (RFC 2883) carried in-band as block[0]; see Section 1.3. |
| NewReno-careful recovery with recovery_high gate | TCP | RFC 6582 (Henderson et al., 2012); QUIC builds on the same model in RFC 9002 sec. 7.3.2. Cwnd half absent (CC in IPCP). |
| RACK reordering window for fast retransmit | TCP | RFC 8985 (Cheng et al., 2021). FRCP R = MIN(reo_wnd_mult * min_RTT / 4, SRTT) with a MIN_REORDER_NS = 250 us floor against srtt collapse; matches RFC 8985 sec. 6.2 and Linux tcp_rack_reo_wnd. DSACK-driven reo_wnd_mult (sec. 6.2 step 4) is adopted; see Section 1.3 for the wire encoding. The hybrid RACK-or-DUP_THRESH trigger from RFC 8985 sec. 6.2 step 4 is adopted (Section 8). QUIC's analogue in RFC 9002 sec. 6.1.2 uses max(srtt, latest_rtt) as the base. |
| Karn's algorithm: no RTT sample on retransmits, RTO-collapse freeze | TCP | Karn & Partridge, "Improving Round-Trip Time Estimates in Reliable Transport Protocols", SIGCOMM 1987; RFC 6298 sec. 3. |
| RTO formula RTO = max(RTO_MIN, srtt + (mdev << MDEV_MUL)) | TCP | RFC 6298 (Paxson et al., 2011). RTO_MIN = 250 us is below RFC 6298 sec. 2.4's 1 s SHOULD-floor - a recursive-layer choice. |
| Linux asymmetric mdev estimator (default) | Linux kernel | tcp_rtt_estimator() in net/ipv4/tcp_input.c; the if(delta<0) m>>=3 dampening is a kernel divergence from RFC 6298. RFC 6298 EWMA available behind a compile flag. |
| Delayed ACK with rate suppression | TCP | RFC 813 (Clark, 1982); RFC 1122 sec. 4.2.3.2; RFC 5681 sec. 4.2. Single-deadline coalescing rather than "ack-every-other-segment". |
| Zero-window-probe / persist-timer analogue (RDVS) | TCP | RFC 1122 sec. 4.2.2.17 / RFC 9293 sec. 3.8.6.1. RDVS solicits an FC reply, distinct from QUIC DATA_BLOCKED (RFC 9000 sec. 19.12), which is one-way notification. MAX_RDV give-up departs from TCP. |
| Multiplexed control on a single PCI | SCTP / QUIC | SCTP chunk bundling (RFC 9260 sec. 6.10); QUIC frame multiplexing (RFC 9000 sec. 12.4). Cleaner fit than TCP's separate-flag-bits design. |
| ACK ranges as multiple discontiguous acked blocks | QUIC | QUIC ACK frame (RFC 9000 sec. 19.3). FRCP SACK is conceptually QUIC-frame-shaped even though encoded as absolute [start,end] pairs. |
| Nonce-authenticated active RTT / liveness probing (RTTP) | QUIC PATH_CHALLENGE | PATH_CHALLENGE / PATH_RESPONSE (RFC 9000 sec. 8.2, sec. 19.17, sec. 19.18). WebRTC ICE consent-freshness (RFC 7675) is the same pattern. QUIC's nonce is 8 octets; FRCP chooses 16. |
| Probing distinct from keepalive | QUIC | KA timer answers "peer alive?", RTTP answers "path measurable?", as in QUIC PING (RFC 9000 sec. 19.2) vs PATH_CHALLENGE. |
| Bare KA + ACK keepalive packets | QUIC / SCTP | QUIC PING (RFC 9000 sec. 19.2); SCTP HEARTBEAT / HEARTBEAT-ACK (RFC 9260 sec. 8.3). SCTP HEARTBEAT also carries an opaque echoed blob, structurally similar to FRCP RTTP. |
| (FFGM, LFGM) fragment-role bits (Section 7.2) | SCTP | RFC 9260 sec. 3.3.1 DATA chunk B/E bits encode the same four states (B+E=SOLE, B-only=FIRST, neither=MID, E-only=LAST). Each fragment carries its own seqno/TSN and is independently retransmitted. |
| Stream byte-offset reassembly (Sections 1.5, 16) | QUIC | QUIC STREAM frame (RFC 9000 sec. 19.8) uses Offset + Length varints; FRCP uses fixed 32-bit start / end. One stream per flow vs QUIC's many streams multiplexed. |
| FIN end-of-stream marker (Sections 1.2, 16) | TCP / QUIC | TCP FIN flag (RFC 9293 sec. 3.1) closes one half of the byte stream; QUIC STREAM frame FIN bit (RFC 9000 sec. 19.8) does the same per stream with an immutable final-size invariance (RFC 9000 sec. 4.5: the final size is fixed once observed). FRCP's FIN consumes one packet seqno (not one byte of stream space) and is idempotent on the sender side. |
| Stream byte-credit flow control (Section 16) | QUIC | MAX_STREAM_DATA (RFC 9000 sec. 4.1, sec. 19.10). FRCP projects a per-flow byte budget onto the seqno-space rwe. Single stream per flow collapses QUIC's MAX_DATA / MAX_STREAM_DATA distinction. |
| Header protection (encrypted seqnos) | QUIC | QUIC RFC 9001 sec. 5.4 applies header protection on top of AEAD to mask the packet number. FRCP's per-flow AEAD wrap (Section 16) is wider: it encrypts the entire PCI including seqno because the IPCP below already routes, so no destination connection-ID needs to stay in clear (cf. RFC 9000 sec. 5.2). |
| Two-bit fragment role polarity | SCTP | The (FFGM, LFGM) pair follows SCTP B/E (begin = 1 / end = 1) rather than IPv4 MF (RFC 791 sec. 3.2), which has the inverse polarity (MF = 1 means NOT last). |
| Orthogonal reliability / ordering axes (Section 2.2) | SCTP | PR-SCTP (RFC 3758, per-message partial reliability) and SCTP DATA U-bit (RFC 9260 sec. 3.3.1, per-message unordered) are the closest precedents for decoupling reliability from ordering; FRCP sets them per-flow rather than per-message. |
| Orthogonal CRC (qs.ber == 0) | UDP-Lite | RFC 3828 (Larzon et al., 2004) lets the sender pick a per-packet Checksum Coverage and the receiver enforce a locally configured minimum (no in-band negotiation; sec. 3.1, sec. 3.3). FRCP gates a full CRC trailer on qs.ber == 0 at flow setup. Contrast TCP / SCTP (mandatory checksum) and QUIC (AEAD subsumes CRC). |
| Setup-time service negotiation | DCCP / SCTP / QUIC | DCCP Service Codes (RFC 4340 sec. 8.1.2, RFC 5595); SCTP INIT parameters (RFC 9260 sec. 3.3.2); QUIC transport parameters (RFC 9000 sec. 7.4). All negotiate service properties at connection setup; only RINA's QoS cube exposes them as an orthogonal vector. |
15.1. Original to FRCP (no clean prior art)
- Pre-DRF NACK (Section 9): receiver-driven nudge exploiting snd_cr.inact > rcv_cr.inact. Closest analogues are SCTP Gap Ack Blocks (RFC 9260 sec. 3.3.4) and DCCP Ack Vector (RFC 4340 sec. 11.4) - both let the receiver describe gaps to the sender, but neither targets the cross-epoch / pre-DRF case.
- MAX_RDV window-probe give-up: neither TCP (persist-timer probes until application or R2 abort, RFC 9293 sec. 3.8.6.1) nor QUIC has an explicit FC-give-up counter. A recursive-network choice: outer layers can drop the flow.
- Skip-past-gap reassembly (Section 7.2): SCTP fragments and reassembles every flow regardless of reliability/ordering, using its own per-stream reassembly queue; QUIC fragments via STREAM offsets. FRCP fragments best-effort flows too, but the receiver drops the broken prefix the moment a later run-start (FIRST or SOLE role) is visible inside the RQ_SIZE-wide reorder ring - no IP-frag-style timeout, no SCTP-style explicit abort. If no later run-start arrives within the ring, frag_run_inspect returns NOT_READY and the partial run keeps its slots; the next inspect retries. The trade-off: a permanently-lost MID in a long isolated run holds slots until either a later FIRST/SOLE appears in the ring or the writer stops, at which point the slots are reclaimed on flow teardown.
- Reassembly deferred to consume time (Section 7.2), message mode only (qos.service == SVC_MESSAGE): SCTP (RFC 9260 sec. 6.9), QUIC (RFC 9000 sec. 2.2), and TCP (RFC 9293) all hold reassembly state at the receive boundary. FRCP message-mode leaves fragments in the shared-memory ring until flow_read pulls and lands the SDU directly in the caller's buffer. Stream mode (Section 16) uses the standard QUIC-style direct ring placement on receive and does not defer. The optimisation is enabled by the Shared-Memory Subsystem (SSM) packet-buffer ring (see struct ssm_pk_buff at Section 1.1); the analogue is OS-level scatter-gather I/O (recvmsg + iovec), not a transport-layer prior art.
- TLP-equivalent tail-loss recovery (RFC 8985 sec. 7; RFC 9002 sec. 6.2): FRCP does not emit an explicit Tail Loss Probe packet, but the same goal is met implicitly by RACK loss detection (Section 8) firing on a non-advancing cumulative ACK once the head-of-line slot ages past the RACK reorder window R = MIN(reo_wnd_mult * min_RTT / 4, SRTT) - well below RTO = max(2 * SRTT, SRTT + (mdev << MDEV_MUL)). A receiver-driven nudge is also available via the pre-DRF NACK (Section 9).
15.2. Not adopted
- Slow start, congestion window (cwnd), Additive Increase / Multiplicative Decrease (AIMD), NewReno cwnd inflation. Congestion control lives in the IPCP CA policies and is driven by Explicit Congestion Notification (ECN, RFC 3168).
- Nagle / silly-window-syndrome (SWS) avoidance (RFC 896, RFC 1122 sec. 4.2.3.4). (Deferred work, not adopted in the current spec.)
- TCP Timestamps (RFC 7323) / Protection Against Wrapped Sequences (PAWS) - RTT measurement uses RTTP, not per-segment timestamps. A peer-supplied timestamp echoed on every ACK lets a malicious peer drive the srtt estimate arbitrarily low, collapsing the RTO and triggering a self-inflicted retransmit storm. RTTP confines RTT measurement to nonce-authenticated probe round-trips, where a forged echo is rejected before it can reach the estimator.
- ECN (Explicit Congestion Notification) response inside FRCP (consumed by IPCP Congestion Avoidance / CA).
- IP-style fragment-offset reassembly (RFC 791 sec. 3.2; RFC 8200 sec. 4.5). Message-mode FRCP relies on the FRCT rq[] reorder ring keyed by seqno (shared by FRTX and best-effort flows) to put fragments back in order; no separate offset field is needed and no IP-style hole-list reassembly buffer is kept. Stream-mode FRCP does carry [start, end) byte offsets (Section 1.5) for direct ring placement on receive.
- QUIC STREAM offset+length framing on every flow (RFC 9000 sec. 19.8). Message-mode FRCP uses the SCTP-style B/E flag-bit encoding (FFGM / LFGM) and skips the offsets; stream-mode FRCP adopts the QUIC offset model (heritage table above).
16. Stream-mode flows
When a flow is allocated with qos.service == SVC_STREAM both peers
switch to byte-stream semantics, layered on top of the FRTX reorder
machinery already described in Sections 6-8.
16.1. Send
The sender splits the caller's octets into chunks of at most
(frag_mtu - base PCI - stream PCI extension) octets (Sections 1.1
and 1.5). Each chunk is one DATA packet with its own seqno and a
[start, end) byte range copied from a monotonic stream counter.
In stream mode FFGM and LFGM are unused and MUST be transmitted as
zero; the per-byte position is carried by the [start, end)
extension instead.
End-of-stream is signalled with a 0-byte DATA packet that has FIN
(bit 12) set, emitted on the FIN triggers listed in Section 1.2
(WR-half close, flow_dealloc, and any other path that yields the
final byte). The sender MUST emit at most one FIN per flow; its
[start, end) MUST equal [final-byte, final-byte) (i.e., empty
interval at the final byte position; final-size invariance,
analogous to QUIC RFC 9000 sec. 4.5). Idempotency is enforced by
an snd_fin_sent guard.
16.2. Receive
On arrival the receiver places the payload directly into a per-flow
byte-indexed receive ring of width ring_sz (octets) at the position
indicated by start, with a two-segment memcpy across the ring
boundary if needed. Receipt is recorded in the FRTX reorder
machinery (Section 6.2) augmented with the packet's start, end, and
FIN bit per slot. When a packet's [start, end) front-overlaps
bytes already at or below the byte high-water mark, the overlap is
trimmed before placement so the same byte is never written twice.
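Direct placement with front-overlap trimming and the two-segment copy can be sketched as follows. This is an illustration under several assumptions: ring_place is a hypothetical name, ring_sz is taken to be a power of two so that start % ring_sz stays consistent across the 32-bit offset wrap, and the caller is assumed to have already checked (via flow control) that the range does not overrun unread data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy len payload octets starting at stream offset start into the
 * ring, skipping anything at or below the high-water mark hwm.
 * Returns the number of octets actually written. */
size_t ring_place(uint8_t *ring, size_t ring_sz, uint32_t hwm,
                  uint32_t start, const uint8_t *data, size_t len)
{
        size_t pos;
        size_t head;

        assert(ring != NULL && ring_sz > 0);
        assert((ring_sz & (ring_sz - 1)) == 0);  /* assumed power of 2 */

        if ((int32_t) (start - hwm) < 0) {       /* front overlap */
                uint32_t trim = hwm - start;
                if (trim >= len)
                        return 0;                /* nothing new */
                data  += trim;
                len   -= trim;
                start  = hwm;
        }

        if (len > ring_sz)
                return 0;  /* cannot fit; flow control should prevent it */

        pos  = start % ring_sz;
        head = ring_sz - pos;      /* room before the ring boundary */
        if (head > len)
                head = len;

        memcpy(ring + pos, data, head);           /* first segment  */
        memcpy(ring, data + head, len - head);    /* wrapped part   */

        return len;
}
```

In an 8-octet ring, four octets placed at offset 6 land in slots 6, 7, 0 and 1; a later packet covering offsets 8 to 11 with hwm = 10 writes only its last two octets.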
After stashing, the receiver advances lwe and the byte high-water
mark across any newly-contiguous prefix. Each slot advanced MUST
satisfy start == the last-delivered slot's end; a slot whose
start does not equal that end is silently dropped at delivery time
(the seqno is consumed, no stream bytes contributed) and the
high-water mark does not advance past it. The byte stream
stalls at that point - there is no flow-tear-down on mismatch.
This filters spliced or off-path-injected slots that fall in
window without strong cryptographic authentication.
A FIN slot marks end-of-stream at advance time only if its byte
position equals the last-delivered slot's end; otherwise the FIN
is ignored and the corresponding seqno occupies a slot but
contributes no stream bytes. No packet buffer is held after the
ring copy.
16.3. Read
flow_read returns up to count octets from the contiguous prefix
[next, high-water), where next is the byte the application has
already consumed up to and high-water is the rightmost contiguous
byte received. When the stream is fully drained AND end-of-stream
(EOS) was observed (next == EOS byte position), flow_read returns
0 (EOF) - the same shape POSIX read(2) uses on TCP after a peer
FIN.
16.4. Flow control
ACK / SACK / RACK / RTO machinery is unchanged; the FRTX reorder
ring is reused as a per-seqno received-bitmap. Let per_pkt =
(frag_mtu - base PCI - stream PCI extension), the maximum stream-
byte payload one DATA packet can carry (Section 16.1). The
receive window advertised in FC is clamped so the byte window
(ring_sz) cannot be overrun: the seqno-space rwe is at most
rcv_cr.lwe + ring_sz / per_pkt.
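The clamp amounts to one division and a modular comparison. In the sketch below stream_rwe_clamp is a hypothetical name, want_rwe stands for whatever right edge the receiver would otherwise advertise, and per_pkt is supplied by the caller since the stream PCI extension size is defined in Section 1.5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Limit the advertised seqno-space rwe so that the packets it admits
 * cannot carry more than ring_sz stream octets. */
uint32_t stream_rwe_clamp(uint32_t lwe, uint32_t want_rwe,
                          size_t ring_sz, size_t per_pkt)
{
        uint32_t max_rwe;

        assert(per_pkt > 0);

        max_rwe = lwe + (uint32_t) (ring_sz / per_pkt);  /* modulo 2^32 */

        /* before(max_rwe, want_rwe): the wanted edge is too far right. */
        if ((int32_t) (max_rwe - want_rwe) < 0)
                return max_rwe;

        return want_rwe;
}
```

With a 65536-octet ring and per_pkt = 1000, at most 65 packets beyond lwe are admitted, which bounds the in-window stream bytes at 65000.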
This is the QUIC byte-credit flow-control model
(MAX_STREAM_DATA, RFC 9000 sec. 4.1 and sec. 19.10) projected onto
seqno space. With one stream per flow there is no MAX_DATA /
MAX_STREAM_DATA distinction. Receiver-side silly-window-syndrome
(SWS) avoidance (RFC 9293 sec. 3.8.6.2.2) is achieved by combining
the consume-time rwe bump with the global non-shrink rule from
Section 11.
16.5. Security considerations
Threat model. An attacker that can observe (on-path passive) or
predict (off-path blind) the flow's seqnos and byte offsets on an
unencrypted stream flow can inject DATA or FIN at any in-window
position. The in-line consistency checks above (start == prior
end on advance; FIN MUST be 0-byte; FIN MUST sit at the final
byte position) realise the spirit of RFC 5961's "sequence-window
plus exact-position match for control bits" without an explicit
challenge-ACK probe; they make a few specific blind attack shapes
harder but are not cryptographic authentication. This is
comparable to TCP without the TCP Authentication Option (TCP-AO,
RFC 5925), tighter than a
pre-RFC-5961 TCP stack, and roughly equivalent to a modern
RFC 5961 stack against blind off-path injection - none of these
help once the attacker can sniff. TLS over TCP (RFC 8446)
encrypts only the TCP payload and leaves TCP seqnos, ACKs, FIN,
and RST in the clear, so TLS does NOT defend against TCP-header-
level injection; QUIC (RFC 9000) hides packet numbers under
header protection (RFC 9001 sec. 5.4), so this specific weakness
does not apply to QUIC.
Mitigation: AEAD. When the flow has encryption enabled the
recommended AEAD ciphers (AES-GCM, RFC 5288; or ChaCha20-Poly1305,
RFC 8439) wrap the entire FRCP packet on the wire - PCI, stream
extension, body, and the CRC trailer when ber == 0 - under a
per-flow symmetric key derived from the flow's own key exchange
(Section 1.1). The AEAD tag (~2^-128 forgery probability)
dominates the CRC (~2^-32) for integrity in this mode but the CRC
trailer is currently retained inside the wrap (see Section 1.1).
Implementations MUST NOT rely on the security properties below
when a non-AEAD cipher (e.g. AES-CTR alone) is negotiated; non-
AEAD modes provide confidentiality only and the threat-model
claims do not hold.
With an AEAD cipher in use, seqnos, byte offsets, and the FIN bit are both authenticated and confidential. Against an off-path or on-path-passive attacker this is:
- Stronger than TCP+TLS (TCP header in the clear).
- Stronger than TCP+TCP-AO (header authenticated but visible).
- Comparable to IPsec ESP transport mode (RFC 4303), which similarly authenticates and encrypts the upper-layer header plus payload, and to QUIC packet protection (RFC 9001 sec. 5), with the difference that QUIC must leave the destination connection ID in the clear for routing whereas FRCP relies on the IPCP below for delivery and can therefore encrypt its entire PCI.
Keying granularity. Ouroboros flow allocation runs key exchange (kex) per flow, so each flow_alloc yields independent symmetric keys. This is finer-grained than QUIC (per-connection, RFC 9001, where one handshake covers all multiplexed streams) and finer-grained than typical IPsec deployment (per-host-pair Security Associations, SAs). Forward secrecy follows from the kex when an ephemeral Diffie-Hellman exchange (DHE), or a hybrid mode (classical DH + post-quantum Key Encapsulation Mechanism / KEM), is selected.
Replay protection. The AEAD layer itself does NOT carry an explicit anti-replay window (unlike IPsec ESP, RFC 4303 sec. 3.4.3, or DTLS, RFC 9147 sec. 4.5.1). For FRCP-engaged flows the seqno-space duplicate-suppression in Section 6.2 rejects replayed DATA after the AEAD strips the wrap: the AEAD authenticates the seqno, so a replay re-presents an old seqno that is then discarded either as a duplicate (still inside the receive window) or as outside the receive window, depending on how far lwe has advanced since the original packet was delivered. RAW (qos.service == SVC_RAW) flows have no FRCP layer, and the AEAD layer provides no replay protection of its own, so replays are not rejected on them; deployments that need replay rejection SHOULD use SVC_MESSAGE rather than SVC_RAW.
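The way an authenticated but replayed seqno ends up discarded can be sketched as follows. This is a minimal illustration, not the logic in frct.c: the rx_win structure, the 64-slot bitmap, and the names rx_classify and rwe are assumptions introduced here, and the sketch further assumes rwe - lwe never exceeds WIN_SLOTS. Only before(), lwe, and the two discard outcomes come from this document. The function is meant to run on a seqno that has already passed AEAD verification.

```c
#include <stdbool.h>
#include <stdint.h>

#define WIN_SLOTS 64u /* assumption: rwe - lwe <= 64, one bitmap word */

/* Modular comparator from the Notation section. */
static bool before(uint32_t a, uint32_t b)
{
        return (int32_t)(a - b) < 0;
}

enum rx_verdict { RX_ACCEPT, RX_DUPLICATE, RX_OUT_OF_WINDOW };

struct rx_win {
        uint32_t lwe;  /* left window edge: lowest seqno not yet delivered */
        uint32_t rwe;  /* right window edge: first seqno not accepted      */
        uint64_t seen; /* bit i set: seqno lwe + i is already buffered     */
};

/* Classify an authenticated seqno against the receive window. */
static enum rx_verdict rx_classify(const struct rx_win *w, uint32_t seqno)
{
        uint32_t off = seqno - w->lwe; /* modular offset into the window */

        if (before(seqno, w->lwe) || !before(seqno, w->rwe))
                return RX_OUT_OF_WINDOW; /* already delivered, or too new */

        if (off < WIN_SLOTS && ((w->seen >> off) & 1u))
                return RX_DUPLICATE; /* original still sits in the window */

        return RX_ACCEPT;
}
```

A replay of a packet that was buffered out of order hits the bitmap and returns RX_DUPLICATE; a replay of a packet that has since been delivered falls before lwe and returns RX_OUT_OF_WINDOW. Because the comparison is modular, the same checks hold across the 2^32 wrap.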
Layering. The AEAD wrap sits below FRCP on the data path, so RAW best-effort flows (qos.service == SVC_RAW, the UDP-equivalent service of Section 2.2) inherit the same per-flow integrity + confidentiality scope as FRCP-engaged flows - whatever the process and FRCP (if any) put on the wire is what the AEAD authenticates. No DTLS-equivalent layering is required for confidentiality and integrity; replay protection above AEAD is a separate concern as noted above.
17. References
This section lists the IETF documents, published works, and source-code references cited inline elsewhere in this document. IETF documents are cited inline as "RFC NNNN sec. X.Y"; books, journal papers, and source-code references are cited inline by author and year (or by file and function name) and are listed here for convenience.
17.1. IETF documents
- [RFC 791]
- J. Postel, "Internet Protocol", STD 5, RFC 791, September 1981.
- [RFC 793]
- J. Postel, "Transmission Control Protocol", STD 7, RFC 793, September 1981. Obsoleted by RFC 9293.
- [RFC 813]
- D. D. Clark, "Window and Acknowledgement Strategy in TCP", RFC 813, July 1982.
- [RFC 896]
- J. Nagle, "Congestion Control in IP/TCP Internetworks", RFC 896, January 1984.
- [RFC 1122]
- R. Braden (ed.), "Requirements for Internet Hosts -- Communication Layers", STD 3, RFC 1122, October 1989.
- [RFC 2018]
- M. Mathis, J. Mahdavi, S. Floyd, A. Romanow, "TCP Selective Acknowledgment Options", RFC 2018, October 1996.
- [RFC 2119]
- S. Bradner, "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
- [RFC 2883]
- S. Floyd, J. Mahdavi, M. Mathis, M. Podolsky, "An Extension to the Selective Acknowledgement (SACK) Option for TCP", RFC 2883, July 2000.
- [RFC 3758]
- R. Stewart, M. Ramalho, Q. Xie, M. Tuexen, P. Conrad, "Stream Control Transmission Protocol (SCTP) Partial Reliability Extension", RFC 3758, May 2004.
- [RFC 3828]
- L-A. Larzon, M. Degermark, S. Pink, L-E. Jonsson (ed.), G. Fairhurst (ed.), "The Lightweight User Datagram Protocol (UDP-Lite)", RFC 3828, July 2004.
- [RFC 4303]
- S. Kent, "IP Encapsulating Security Payload (ESP)", RFC 4303, December 2005.
- [RFC 4340]
- E. Kohler, M. Handley, S. Floyd, "Datagram Congestion Control Protocol (DCCP)", RFC 4340, March 2006.
- [RFC 5288]
- J. Salowey, A. Choudhury, D. McGrew, "AES Galois Counter Mode (GCM) Cipher Suites for TLS", RFC 5288, August 2008.
- [RFC 5595]
- G. Fairhurst, "The Datagram Congestion Control Protocol (DCCP) Service Codes", RFC 5595, September 2009.
- [RFC 5681]
- M. Allman, V. Paxson, E. Blanton, "TCP Congestion Control", RFC 5681, September 2009.
- [RFC 5925]
- J. Touch, A. Mankin, R. Bonica, "The TCP Authentication Option", RFC 5925, June 2010.
- [RFC 5961]
- A. Ramaiah, R. Stewart, M. Dalal, "Improving TCP's Robustness to Blind In-Window Attacks", RFC 5961, August 2010.
- [RFC 6298]
- V. Paxson, M. Allman, J. Chu, M. Sargent, "Computing TCP's Retransmission Timer", RFC 6298, June 2011.
- [RFC 6528]
- F. Gont, S. Bellovin, "Defending against Sequence Number Attacks", RFC 6528, February 2012. Obsoletes RFC 1948.
- [RFC 6582]
- T. Henderson, S. Floyd, A. Gurtov, Y. Nishida, "The NewReno Modification to TCP's Fast Recovery Algorithm", RFC 6582, April 2012.
- [RFC 7323]
- D. Borman, B. Braden, V. Jacobson, R. Scheffenegger (ed.), "TCP Extensions for High Performance", RFC 7323, September 2014.
- [RFC 7675]
- M. Perumal, D. Wing, R. Ravindranath, T. Reddy, M. Thomson, "Session Traversal Utilities for NAT (STUN) Usage for Consent Freshness", RFC 7675, October 2015.
- [RFC 8174]
- B. Leiba, "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, May 2017.
- [RFC 8200]
- S. Deering, R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", STD 86, RFC 8200, July 2017.
- [RFC 8439]
- Y. Nir, A. Langley, "ChaCha20 and Poly1305 for IETF Protocols", RFC 8439, June 2018.
- [RFC 8446]
- E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, August 2018.
- [RFC 8985]
- Y. Cheng, N. Cardwell, N. Dukkipati, P. Jha, "The RACK-TLP Loss Detection Algorithm for TCP", RFC 8985, February 2021.
- [RFC 9000]
- J. Iyengar (ed.), M. Thomson (ed.), "QUIC: A UDP-Based Multiplexed and Secure Transport", RFC 9000, May 2021.
- [RFC 9001]
- M. Thomson (ed.), S. Turner (ed.), "Using TLS to Secure QUIC", RFC 9001, May 2021.
- [RFC 9002]
- J. Iyengar (ed.), I. Swett (ed.), "QUIC Loss Detection and Congestion Control", RFC 9002, May 2021.
- [RFC 9147]
- E. Rescorla, H. Tschofenig, N. Modadugu, "The Datagram Transport Layer Security (DTLS) Protocol Version 1.3", RFC 9147, April 2022.
- [RFC 9260]
- R. Stewart, M. Tuexen, K. Nielsen, "Stream Control Transmission Protocol", RFC 9260, June 2022. Obsoletes RFC 4960.
- [RFC 9293]
- W. Eddy (ed.), "Transmission Control Protocol (TCP)", STD 7, RFC 9293, August 2022. Obsoletes RFC 793 and several follow-ons; updates RFC 1122 and others.
17.2. Books and journal papers
- [Day08]
- J. Day, "Patterns in Network Architecture: A Return to Fundamentals", Prentice Hall, 2008.
- [Grasa15]
- E. Grasa et al., "IRATI: investigating RINA as an alternative to TCP/IP", Computer Networks, Vol. 92, December 2015.
- [KP87]
- P. Karn, C. Partridge, "Improving Round-Trip Time Estimates in Reliable Transport Protocols", ACM SIGCOMM, August 1987.
- [Wat81]
- R. W. Watson, "Timer-Based Mechanisms in Reliable Transport Protocol Connection Management", Computer Networks, Vol. 5, 1981.
17.3. Source-code references
- [Linux-RTT]
- tcp_rtt_estimator() in net/ipv4/tcp_input.c of the Linux kernel, defining the asymmetric mdev variance update used as FRCP's default RTT estimator (Section 12). Line-stable browseable copy at https://elixir.bootlin.com/linux/latest/source/net/ipv4/tcp_input.c.