| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds congestion avoidance policies to the unicast IPCP. The
default policy is a multi-bit explicit congestion avoidance algorithm
based on data-center TCP congestion avoidance (DCTCP) to relay
information about the maximum queue depth that packets experienced to
the receiver. There's also a "nop" policy to disable congestion
avoidance for testing and benchmarking purposes.
The (initial) API for congestion avoidance policies is:
void * (* ctx_create)(void);
void (* ctx_destroy)(void * ctx);
These calls create / and or destroy a context for congestion control
for a specific flow. Thread-safety of the context is the
responsability of the flow allocator (operations on the ctx should be
performed under a lock).
ca_wnd_t (* ctx_update_snd)(void * ctx,
size_t len);
This is the sender call to update the context, and should be called
for every packet that is sent on the flow. The len parameter in this
API is the packet length, which allows calculating the bandwidth. It
returns an opaque union type that is used for the call to check/wait
if the congestion window is open or closed (and allowing to release
locks before waiting).
bool (* ctx_update_rcv)(void * ctx,
size_t len,
uint8_t ecn,
uint16_t * ece);
This is the call to update the flow congestion context on the receiver
side. It should be called for every received packet. It gets the ecn
value from the packet and its length, and returns the ECE (explicit
congestion experienced) value to be sent to the sender in case of
congestion. The boolean returned signals whether or not a congestion
update needs to be sent.
void (* ctx_update_ece)(void * ctx,
uint16_t ece);
This is the call for the sending side top update the context when it
receives an ECE update from the receiver.
void (* wnd_wait)(ca_wnd_t wnd);
This is a (blocking) call that waits for the congestion window to
clear. It should be stateless (to avoid waiting under locks). This may
change later on if passing the context is needed for different algorithms.
uint8_t (* calc_ecn)(int fd,
size_t len);
This is the call that intermediate IPCPs(routers) should use to update
the ECN field on passing packets.
The multi-bit ECN policy bases the value for the ECN field on the
depth of the rbuff queue packets will be sent on. I created another
call to grab the queue depth as fccntl is write-locking the
application. We can further optimize this to avoid most locking on the
rbuff.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the rendez-vous mechanism to handle the case where the
sending window is closed and window updates get lost. If the sending
window is closed, the sender side will send an RDVS every DELT_RDV
time (100ms), and give up after MAX_RDV time (1 second). Upon
reception of a RDVS packet, a window update is sent immediately. We
can make this much more configurable later on (build options for
defaults, fccntl for runtime tuning).
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
| |
If the sending window for flow control is closed, the sending
application will now block until the window opens. Beware that until
the rendez-vous mechanism is implemented, shutting down a server while
the client is sending (with non-timed-out blocking write) will cause
the client to hang indefinitely because its window will close.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
| |
Refactor flow_write cleanup.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows configuring some parameters for FRCP at compile time, such
as default values for Delta-t and configuration of the timerwheel. The
timerwheel will now reschedule when it fails to create a packet,
instead of setting the flow down immediately. Some new things added
are options to store packets for retransmission on the heap, and using
non-blocking calls for retransmission. The defaults do not change the
current behaviour.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
Flows should be locked when moving the timerwheel. For frcti_snd, a
rdlock is enough.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This completes the retransmission (automated repeat-request, ARQ)
logic, sending (delayed) ACK messages when needed.
On deallocation, flows will ACK try to retransmit any remaining
unacknowledged messages (unless the FRCTFLINGER flag is turned off;
this is on by default). Applications can safely shut down as soon as
everything is ACK'd (i.e. the current Delta-t run is done). The
activity timeout is now passed to the IPCP for it to sleep before
completing deallocation (and releasing the flow_id). That should be
moved to the IRMd in due time.
The timerwheel is revised to be multi-level to reduce memory
consumption. The resolution bumps by a factor of 1 << RXMQ_BUMP (16)
and each level has RXMQ_SLOTS (1 << 8) slots. The lowest level has a
resolution of (1 << RXMQ_RES) (20) ns, which is roughly a
millisecond. Currently, 3 levels are defined, so the largest delay we
can schedule at each level is:
Level 0: 256ms
Level 1: 4s
Level 2: about a minute.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the logic to send a pure acknowledgment packet without any
data to send. This needed the event filter for the fqueue, as these
non-data packets should not trigger application PKT events. The
default timeout is now 10ms, until we have FRCP tuning as part of
fccntl.
Karn's algorithm seems to be very unstable with low (sub-ms) RTT
estimates. Doubling RTO (every RTO) seems still too slow to prevent
rtx storms when the measured rtt suddenly spikes several orders of
magnitude. Just assuming the ACK'd packet is the last one transmitted
seems to be a lot more stable. It can lead to temporary
underestimation, but this is not a throughput-killer in FRCP.
Changes most time units to nanoseconds for faster computation.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
This is a small refactor of FRCT because I found some things a bit
hard to read. I tried to refactor frcti_rcv to always queue the
packet, but that causes unnecessarily retaking the lock when calling
queued_pdu and thus returning idx is a tiny bit faster.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
| |
The single retransmission wheel caused locking headaches as the calls
for different flows could block on the same rxmwheel. This stabilizes
the stack, but if the rdrbuff gets full there can now be big delays.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
| |
This is more in line with the write() system call and prepares for
partial writes. Partial writes are disabled by default (and not yet
implemented).
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
The return type was still an int, but since it returns the number of
events, it should be an ssize_t.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
| |
The initial implementation for the ECDHE key exchange was doing the
key exchange after a flow was established. The public keys are now
sent allowg on the flow allocation messages, so that an encrypted
tunnel can be created within 1 RTT. The flow allocation steps had to
be extended to pass the opaque data ('piggybacking').
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
| |
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
In fset_add, the flow_id was passed to the shm_flow_set without
checking if it was actually valid.
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a per-message symmetric encryption using the OpenSSL
library. At flow allocation, an Elliptic Curve Diffie-Hellman exchange
is performed to derive a shared secret, which is then hashed using
SHA3-256 to be used as a key for symmetric AES-256 encryption. Each
message on an encrypted flow adds a small crypto header that includes
a random 128-bit Initialization Vector (IV). If the server does not
have OpenSSL enabled, the flow allocation will fail with an -ECRYPT
error.
Future optimizations are to piggyback the public keys on the flow
allocation message, and to enable per-flow encryption that maintains
the context of the encryption over multiple packets and doesn't
require sending IVs.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
The proper initialization of libgrypt requires a call to
gcry_check_version. The library initialization should first run a
check if the application (or some other library) hasn't already
initialized libgcrypt before attempting to initialize libgcrypt.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
This allows setting the size of the rbuffs in a system independently
of the main packet buffer using SHM_RBUFF_SIZE. The benefit of setting
a smaller rbuff size is that a single process can't fully occupy the
main packet buffer.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
The compiler flags for the SWIG target were added to the global
CMAKE_C_FLAGS used for the entire project. This sets the flags
uniquely for the SWIG target. The eth has a similar case for the c99
flag. There was a lingering include in dev.c that was removed.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
| |
The API calls for the IPCP to inform the IRMd of IPCP creation and
incoming flow request had the pid_t in the call. This pid_t is removed
and the getpid() call is now placed inside the function. Also
refactors the cleanup for the main() functions of some of the lower
IPCPs.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
| |
Updates the copyright notice in all sources to 2019.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
| |
This fixes the deallocation of non-initialized IPCP flows. These can
occur when some operations are not implemented.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a new flow_join operaiton for broadcast, which is a much
safer solution than overloading destination name semantics. The
internal API now also has a different IPCP_FLOW_JOIN operation. The
IRMd doesn't need to query broadcasts IPCPs for the name, it can just
check if an IPCP with the layer name exists. The broadcast IPCP
doesn't need to implement the query proxy call anymore.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
|
| |
This moves the creation and destruction of shm_flow_set shared memory
structures from the init to the IRMd. Now the management of all shared
data objects is performed by the IRMd.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
| |
The fccntl call was reading from the RX queue instead of the TX queue.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
|
| |
This changes the API to the rdrbuff to treat it as a pool memory
allocator. The head and tailspace to allocate in a buffer is now set
system-wide instead of being passed as a parameter.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
|
| |
This will check if the flow id is valid instead of asserting. It
avoids assertion failures in the IPCP if an application crashes and
the IRMd deallocates the flow while the IPCP still has pending writes.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| | |
This initializes libgcrypt before use in the library. This fixes the
"called in non-operational state" error when CRC checking is enabled.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|\| |
|
| |
| |
| |
| |
| |
| |
| |
| | |
This splits off the CRC from FRCT so it can be set
independently. Ouroboros now allows raw flows with error checking.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
| |
| |
| |
| |
| |
| |
| | |
Renames port_id to flow_id according to updated nomenclature.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This will change SDU (Service Data Unit) to packet everywhere. SDU is
OSI terminology, whereas packet is Ouroboros terminology.
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The flow allocator now passes the full qos specification to the
endpoint, instead of just a cube. This is a more flexible
architecture, as it makes QoS cubes internal to the layers.
Adds endianness transforms for the flow allocator protocol in the
normal IPCP.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This removes configuration from the FRCT protocol to send it during
flow allocation.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This removes the _DEFAULT_SOURCE definition in the endian header as it
should not be there. This avoids double and conflicting definitions.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds rudimentary support for sending and processing
acknowledgments and doing retransmission.
It replaces the generic timerwheel with a specific one for
retransmission. This is currently a fixed wheel allowing
retransmissions to be scheduled up to about 32 seconds into the
future. It currently has an 8ms resolution. This could be made
configurable in the future. Failures of the flow (i.e. rtx not
working) are indicated by the rxmwheel_move() function returning a fd.
This is currently not yet handled (maybe just setting the state of the
flow to FLOWDOWN is a better solution).
The shm_rdrbuff tracks the number of users of a du_buff. One user is
the full stack, each retransmission will increment the refs counter
(which effectively acts as a semaphore). The refs counter is
decremented when a packet is acked. The du_buff is only allowed to be
removed if there is only one user left (the "stack").
When a packet is retransmitted, it is copied in the rdrbuff. This is
to ensure integrity of the packet when multiple layers do
retransmission and it is passed down the stack again.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
| |
This adds the infrastructure to actively react to flow up, down and
deallocated events.
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
|
|\ |
|
| |
| |
| |
| |
| |
| |
| | |
This will mark flows down when they are finalized.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|/
|
|
|
|
|
|
| |
This adds a data qos cube that is reliable. Reliable qos can be
selected by setting the loss parameter of the qosspec to 0.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
|
| |
This adds a QoS cube that allows sending packets directly over a raw
flow, without an FRCT state machine. Flow allocation with a NULL
qosspec will now default to such raw flows.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
| |
The reordering queue is replaced by a fixed ring buffer for speed and
simplicity.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
|
|
| |
The flow_fini() function was marking flows as wronly (so rdonly for
the application) when the flow was deallocated. With the recent
addition of the flowdown state, we can now mark them as down. Also
fixes some bounds checks and alignment.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
|
|
|
| |
The event type of the current event in the fqueue can now be requested
using the fqueue_type() command. Currently events for packets
(FLOW_PKT), flows (FLOW_UP, FLOW_DOWN) and allocation (FLOW_ALLOC,
FLOW_DEALLOC) are specified. The implementation only tracks FLOW_PKT
at this point.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
|
|
| |
If a program exits, it cleans its read buffer. However, another
process could still write a packet in that buffer, which would cause
the IPCP or IRMd to run into an assertion failure on shutdown. Setting
the rbuff to ACL_FLOWDOWN prevents this.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The reg/unreg API is simplified to registering and unregistering a
single name with a single IPCP. The functionality associated with
registering names was moved from the IRMd to the irm tool. The
function to list IPCPs was simplified to return all IPCPs in the
system with their basic properties needed for management.
The above changes led to some needed changes in the irm tool and the
management functions that were depending on the previous behaviour of
list_ipcps.
Command line functionality to list IPCPs in the system is also added
to the irm tool.
Some older code was refactored.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows disabling partial reads. It adds a flag FLOWFRNOPART that
disables partial reads. Partial read is different from partial
delivery (FRCTFPARTIAL), which allows delivery of fragments of an
incomplete packet and thus potentially corrupted data. FLOWFRNOPART
will never deliver corrupted data (unless FRCTFPARTIAL is also set).
If FLOWFRNOPART is set and the buffer provided to flow_read is too
small for the SDU, that SDU will be discarded and -EMSGSIZE is
returned;
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements partial read of packets if the buffer supplied to
flow_read() is smaller than the packet in the buffer. If the number of
bytes returned by flow_read equals the size of the buffer, the next
read() will deliver the next bytes of the packet (or 0 if the packet
was exactly the size of the buffer on the previous read).
Implements #7.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|
|
|
|
|
|
|
|
|
|
| |
This completes the implementation of the SNDTIMEO for a blocking
write.
Fixes #6.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|