| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
| |
Happy New Year, Ouroboros!
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
DH key creation was returning -ECRYPT if OpenSSL is not installed,
instead of success (0).
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
| |
This causes builds to fail on systems where OpenSSL is not available.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds congestion avoidance policies to the unicast IPCP. The
default policy is a multi-bit explicit congestion avoidance algorithm
based on data-center TCP congestion avoidance (DCTCP) to relay
information about the maximum queue depth that packets experienced to
the receiver. There's also a "nop" policy to disable congestion
avoidance for testing and benchmarking purposes.
The (initial) API for congestion avoidance policies is:
void * (* ctx_create)(void);
void (* ctx_destroy)(void * ctx);
These calls create and/or destroy a context for congestion control
for a specific flow. Thread-safety of the context is the
responsibility of the flow allocator (operations on the ctx should be
performed under a lock).
ca_wnd_t (* ctx_update_snd)(void * ctx,
size_t len);
This is the sender call to update the context, and should be called
for every packet that is sent on the flow. The len parameter in this
API is the packet length, which allows calculating the bandwidth. It
returns an opaque union type that is passed to the call that checks or
waits until the congestion window is open (which allows releasing
locks before waiting).
bool (* ctx_update_rcv)(void * ctx,
size_t len,
uint8_t ecn,
uint16_t * ece);
This is the call to update the flow congestion context on the receiver
side. It should be called for every received packet. It gets the ecn
value from the packet and its length, and returns the ECE (explicit
congestion experienced) value to be sent to the sender in case of
congestion. The boolean returned signals whether or not a congestion
update needs to be sent.
void (* ctx_update_ece)(void * ctx,
uint16_t ece);
This is the call for the sending side to update the context when it
receives an ECE update from the receiver.
void (* wnd_wait)(ca_wnd_t wnd);
This is a (blocking) call that waits for the congestion window to
clear. It should be stateless (to avoid waiting under locks). This may
change later on if passing the context is needed for different algorithms.
uint8_t (* calc_ecn)(int fd,
size_t len);
This is the call that intermediate IPCPs (routers) should use to update
the ECN field on passing packets.
The multi-bit ECN policy bases the value for the ECN field on the
depth of the rbuff queue packets will be sent on. I created another
call to grab the queue depth as fccntl is write-locking the
application. We can further optimize this to avoid most locking on the
rbuff.
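As an illustration of how the calls listed above could be grouped into
a policy, here is a minimal sketch of an operations table with a "nop"
policy behind it. The struct name ca_ops, the layout of ca_wnd_t and
all nop_* function names are assumptions made for this example; only
the function signatures are taken from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the real ca_wnd_t is opaque; this layout is invented. */
typedef union {
        uint64_t wait_ns; /* hypothetical: time to wait before sending */
} ca_wnd_t;

/* Function-pointer table mirroring the signatures in the message. */
struct ca_ops {
        void *   (* ctx_create)(void);
        void     (* ctx_destroy)(void * ctx);
        ca_wnd_t (* ctx_update_snd)(void * ctx, size_t len);
        bool     (* ctx_update_rcv)(void * ctx, size_t len,
                                    uint8_t ecn, uint16_t * ece);
        void     (* ctx_update_ece)(void * ctx, uint16_t ece);
        void     (* wnd_wait)(ca_wnd_t wnd);
        uint8_t  (* calc_ecn)(int fd, size_t len);
};

static int nop_dummy; /* non-NULL placeholder context */

static void * nop_ctx_create(void)
{
        return &nop_dummy;
}

static void nop_ctx_destroy(void * ctx)
{
        (void) ctx;
}

static ca_wnd_t nop_ctx_update_snd(void * ctx, size_t len)
{
        ca_wnd_t wnd = { 0 }; /* window always open, nothing to wait for */

        (void) ctx;
        (void) len;

        return wnd;
}

static bool nop_ctx_update_rcv(void * ctx, size_t len,
                               uint8_t ecn, uint16_t * ece)
{
        (void) ctx;
        (void) len;
        (void) ecn;

        *ece = 0;

        return false; /* never ask for a congestion update */
}

static void nop_ctx_update_ece(void * ctx, uint16_t ece)
{
        (void) ctx;
        (void) ece;
}

static void nop_wnd_wait(ca_wnd_t wnd)
{
        (void) wnd; /* returns immediately */
}

static uint8_t nop_calc_ecn(int fd, size_t len)
{
        (void) fd;
        (void) len;

        return 0; /* never mark congestion */
}

const struct ca_ops nop_ca_ops = {
        nop_ctx_create,
        nop_ctx_destroy,
        nop_ctx_update_snd,
        nop_ctx_update_rcv,
        nop_ctx_update_ece,
        nop_wnd_wait,
        nop_calc_ecn
};
```

A sender following the description above would call ctx_update_snd
under its flow lock, release the lock, and only then pass the returned
ca_wnd_t to wnd_wait.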
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
| |
The timerwheel is checked during IPC calls (fevent, flow_read),
causing a huge CPU load in IPCPs, since they have a lot of fevent()
threads for QoS. The timerwheel will need further optimization, but
for now I reduced the default tick time to 5 ms and added a boolean to
check that the wheel is actually used.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
I mistakenly set the default to the (buggy) lockless rbuff
implementation instead of the pthread one in commit 3aec660e.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
This reverts commit 978266fe4beba21292daad2d341fe5ff22e08aba.
We were incorrectly unmounting the directory under normal conditions.
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the rendez-vous mechanism to handle the case where the
sending window is closed and window updates get lost. If the sending
window is closed, the sender side will send an RDVS every DELT_RDV
time (100ms), and give up after MAX_RDV time (1 second). Upon
reception of a RDVS packet, a window update is sent immediately. We
can make this much more configurable later on (build options for
defaults, fccntl for runtime tuning).
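The timing rule described above can be sketched as a small decision
function. DELT_RDV and MAX_RDV are the names from the message; the
enum, the helper rdv_check and the choice of nanosecond timestamps are
assumptions for this example, not the actual implementation.

```c
#include <assert.h>
#include <stdint.h>

#define MILLION  1000000ULL
#define DELT_RDV (100 * MILLION)  /* ns between RDVS packets (100 ms) */
#define MAX_RDV  (1000 * MILLION) /* ns before giving up (1 second)   */

enum rdv_action {
        RDV_WAIT,   /* too early for the next RDVS         */
        RDV_SEND,   /* send an RDVS to solicit a window update */
        RDV_GIVE_UP /* window stayed closed for MAX_RDV     */
};

/*
 * Hypothetical helper. All arguments are monotonic timestamps in ns:
 * now, the time the window closed, and the time of the last RDVS
 * (equal to t_closed if none was sent yet).
 */
enum rdv_action rdv_check(uint64_t now,
                          uint64_t t_closed,
                          uint64_t t_last_rdv)
{
        if (now - t_closed >= MAX_RDV)
                return RDV_GIVE_UP;

        if (now - t_last_rdv >= DELT_RDV)
                return RDV_SEND;

        return RDV_WAIT;
}
```

The receiver side needs no timer in this scheme: per the message, it
answers every RDVS with an immediate window update.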
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
| |
If the sending window for flow control is closed, the sending
application will now block until the window opens. Beware that until
the rendez-vous mechanism is implemented, shutting down a server while
the client is sending (with non-timed-out blocking write) will cause
the client to hang indefinitely because its window will close.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
| |
Refactor flow_write cleanup.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
| |
This adds sending and receiving window updates for flow control. I
used the 8 pad bits as part of the window update field, so it's 24
bits, allowing for ~16 million packets in flight.
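To make the 24-bit figure concrete: 2^24 - 1 = 16777215 is the largest
value such a field can carry. The sketch below stores and reads a
24-bit value in three bytes. The macro and function names and the
big-endian byte order are assumptions for illustration; the actual
header layout is not given in the message.

```c
#include <assert.h>
#include <stdint.h>

#define WND_BITS 24
#define WND_MAX  ((1U << WND_BITS) - 1) /* 16777215 */

/* Store the low 24 bits of wnd in buf[0..2], most significant first. */
void wnd_put(uint8_t * buf, uint32_t wnd)
{
        wnd &= WND_MAX;

        buf[0] = (uint8_t) (wnd >> 16);
        buf[1] = (uint8_t) ((wnd >> 8) & 0xFF);
        buf[2] = (uint8_t) (wnd & 0xFF);
}

/* Read back a 24-bit value written by wnd_put. */
uint32_t wnd_get(const uint8_t * buf)
{
        return ((uint32_t) buf[0] << 16) |
               ((uint32_t) buf[1] << 8)  |
                (uint32_t) buf[2];
}
```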
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows configuring some parameters for FRCP at compile time, such
as default values for Delta-t and configuration of the timerwheel. The
timerwheel will now reschedule when it fails to create a packet,
instead of setting the flow down immediately. Some new things added
are options to store packets for retransmission on the heap, and using
non-blocking calls for retransmission. The defaults do not change the
current behaviour.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
Flows should be locked when moving the timerwheel. For frcti_snd, a
rdlock is enough.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This completes the retransmission (automated repeat-request, ARQ)
logic, sending (delayed) ACK messages when needed.
On deallocation, flows will ACK and try to retransmit any remaining
unacknowledged messages (unless the FRCTFLINGER flag is turned off;
this is on by default). Applications can safely shut down as soon as
everything is ACK'd (i.e. the current Delta-t run is done). The
activity timeout is now passed to the IPCP for it to sleep before
completing deallocation (and releasing the flow_id). That should be
moved to the IRMd in due time.
The timerwheel is revised to be multi-level to reduce memory
consumption. The resolution bumps by a factor of 1 << RXMQ_BUMP (16)
and each level has RXMQ_SLOTS (1 << 8) slots. The lowest level has a
resolution of (1 << RXMQ_RES) (20) ns, which is roughly a
millisecond. Currently, 3 levels are defined, so the largest delay we
can schedule at each level is:
Level 0: 256ms
Level 1: 4s
Level 2: about a minute.
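The per-level figures above follow directly from the constants. The
sketch below recomputes them. RXMQ_RES, RXMQ_BUMP and RXMQ_SLOTS are
the names from the message, with RXMQ_BUMP = 4 inferred from "a factor
of 1 << RXMQ_BUMP (16)"; RXMQ_LVLS and the helper functions are
hypothetical names for this example.

```c
#include <assert.h>
#include <stdint.h>

#define RXMQ_RES   20       /* lowest level: 1 << 20 ns per slot (~1 ms) */
#define RXMQ_BUMP  4        /* inferred: each level is 1 << 4 = 16x coarser */
#define RXMQ_SLOTS (1 << 8) /* 256 slots per level */
#define RXMQ_LVLS  3        /* hypothetical name for the level count */

/* Slot resolution of a level in ns. */
uint64_t lvl_res_ns(unsigned lvl)
{
        return 1ULL << (RXMQ_RES + lvl * RXMQ_BUMP);
}

/* Total time span covered by all slots of a level in ns. */
uint64_t lvl_span_ns(unsigned lvl)
{
        return lvl_res_ns(lvl) * RXMQ_SLOTS;
}

/* Lowest level whose span covers the delay, or -1 if none does. */
int lvl_for_delay(uint64_t delay_ns)
{
        unsigned lvl;

        for (lvl = 0; lvl < RXMQ_LVLS; ++lvl)
                if (delay_ns < lvl_span_ns(lvl))
                        return (int) lvl;

        return -1;
}
```

The exact spans are 2^28 ns (about 268 ms), 2^32 ns (about 4.3 s) and
2^36 ns (about 68.7 s), which the list above rounds to 256ms, 4s and
about a minute.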
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the logic to send a pure acknowledgment packet without any
data to send. This needed the event filter for the fqueue, as these
non-data packets should not trigger application PKT events. The
default timeout is now 10ms, until we have FRCP tuning as part of
fccntl.
Karn's algorithm seems to be very unstable with low (sub-ms) RTT
estimates. Doubling RTO (every RTO) seems still too slow to prevent
rtx storms when the measured rtt suddenly spikes several orders of
magnitude. Just assuming the ACK'd packet is the last one transmitted
seems to be a lot more stable. It can lead to temporary
underestimation, but this is not a throughput-killer in FRCP.
Changes most time units to nanoseconds for faster computation.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
The sanitize function in the rdrbuff should only be compiled if robust
mutexes are present on the system.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
This is a small refactor of FRCT because I found some things a bit
hard to read. I tried to refactor frcti_rcv to always queue the
packet, but that causes unnecessarily retaking the lock when calling
queued_pdu and thus returning idx is a tiny bit faster.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
The retransmission was always disabling the DRF flag. This caused
problems with the loss of the first packet, which of course needs a
DRF flag set. The retransmitted packet will now contain the original
DRF flag and an updated ack number.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
| |
The single retransmission wheel caused locking headaches as the calls
for different flows could block on the same rxmwheel. This stabilizes
the stack, but if the rdrbuff gets full there can now be big delays.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
Fixes infinite rescheduling with RTO getting lower than the timerwheel
resolution. For very low RTO values we'd need a big packet buffer with
the current memory allocator implementation (rdrbuff). Setting a
(configurable) minimum RTO (250 us) reduces this need.
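A rough sketch of why a floor on the RTO helps: if the wheel position
is derived by shifting the RTO by the wheel resolution, any RTO below
that resolution maps to an offset of zero slots, so the entry lands in
the slot being processed. The names RTO_MIN_NS, rto_clamp and
rto_slots, and the resolution used in the usage note, are assumptions
for illustration; only the 250 us minimum comes from the message.

```c
#include <assert.h>
#include <stdint.h>

#define RTO_MIN_NS 250000ULL /* 250 us; the macro name is hypothetical */

/* Apply the minimum RTO. */
uint64_t rto_clamp(uint64_t rto_ns)
{
        return rto_ns < RTO_MIN_NS ? RTO_MIN_NS : rto_ns;
}

/*
 * Number of slots ahead at which to schedule a retransmission, for
 * an assumed wheel resolution of (1 << res_shift) ns per slot.
 */
uint64_t rto_slots(uint64_t rto_ns, unsigned res_shift)
{
        return rto_clamp(rto_ns) >> res_shift;
}
```

With an assumed resolution of 1 << 17 ns (about 131 us), an unclamped
RTO of 100 us would give 100000 >> 17 = 0 slots, while the clamped
value gives 250000 >> 17 = 1 slot.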
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
If Ouroboros crashed, the RIB directory might still be mounted. This
checks if this is the case, then unmounts it.
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
| |
There were a bunch of bugs in FRCP that urgently needed fixing. Now
data QoS is usable even with heavy packet loss (within some
parameters). The current RTT estimator is the IETF one. It should be
updated to the improved one used in the Linux kernel once the A-timer
(ACKs without data) and graceful shutdown are implemented.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
| |
The shm_flowset destroy was using the irmd pid, resulting in wrong
unlinks. The irmd was not cleaning up the process table, resulting in
shm leaks if there were still running processes on exit.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
| |
The thread pool manager wasn't counting working threads when deciding
to create new ones, resulting in constant starting of new threads when
threads were busy.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
| |
This is more in line with the write() system call and prepares for
partial writes. Partial writes are disabled by default (and not yet
implemented).
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
The return type was still an int, but since it returns the number of
events, it should be an ssize_t.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This revises the naming API to treat names (or reg_name in the source)
as first-class citizens of the architecture. This is more in line with
the way they are described in the article.
Operations have been added to create/destroy names independently of
registering. This was previously done only as part of register, and
there was no way to delete a name from the IRMd. The create call now
allows specifying a policy for load-balancing incoming flows for a
name. The default is the new round-robin load-balancer; the previous
behaviour is still available as a spillover load-balancer.
The register calls will still create a name if it doesn't exist, with
the default round-robin load-balancer.
The tools now have a "name" section, so the format is now
irm name <operation> <name> ...
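One plausible reading of the two load-balancing policies named above
is sketched below: spillover keeps handing flows to the first process
that can accept them, while round-robin continues from the process
after the previous pick. The struct and function names, and this
interpretation of "spillover", are assumptions and not taken from the
IRMd sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of a name: processes that may accept its flows. */
struct name_entry {
        const bool * ready; /* ready[i]: process i can accept a flow */
        size_t       n;     /* number of processes */
        size_t       next;  /* round-robin cursor */
};

/* Spillover (assumed): always prefer the first ready process. */
int lb_spillover(const struct name_entry * e)
{
        size_t i;

        for (i = 0; i < e->n; ++i)
                if (e->ready[i])
                        return (int) i;

        return -1; /* nobody can accept */
}

/* Round-robin (assumed): start looking after the previous pick. */
int lb_round_robin(struct name_entry * e)
{
        size_t i;

        for (i = 0; i < e->n; ++i) {
                size_t idx = (e->next + i) % e->n;
                if (e->ready[idx]) {
                        e->next = (idx + 1) % e->n;
                        return (int) idx;
                }
        }

        return -1; /* nobody can accept */
}
```

With three ready processes, repeated spillover picks keep returning
index 0, whereas round-robin cycles through 0, 1, 2 and back to 0.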
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
There was a rare deadlock upon destruction of the threadpool manager
because the threads were cancelled/joined under lock.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
| |
The initial implementation for the ECDHE key exchange was doing the
key exchange after a flow was established. The public keys are now
sent along on the flow allocation messages, so that an encrypted
tunnel can be created within 1 RTT. The flow allocation steps had to
be extended to pass the opaque data ('piggybacking').
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
| |
The rbuff_destroy function asserts that we do not try to destroy an
rbuff that still contains packets. The test now empties the rbuff
before destroying it.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Packet Forwarding Function (PFF) was user-configurable using the
irm tool. However, this isn't really wanted since the PFF is dictated
by the routing algorithm. This moves the responsibility for selecting
the correct PFF from the network admin to the unicast IPCP
implementation. Each routing policy now has to specify which PFF it
will use.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
The hashtable is only used for forwarding tables in the unicast
IPCP. This moves the generic hashtable out of the library into the
unicast IPCP to prepare a more tailored implementation specific to
routing tables containing address lists.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
| |
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
| |
The node construction path is revised using gotos to avoid repetition.
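For readers unfamiliar with the idiom, this is the general shape of
goto-based cleanup in a constructor: each failure jumps to a label
that undoes only what was already set up. The struct node shown here
and its fields are invented for the example and do not reflect the
actual node type touched by this commit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node with two separately allocated members. */
struct node {
        char * name;
        int *  vals;
        size_t n_vals;
};

struct node * node_create(const char * name, size_t n_vals)
{
        struct node * n;

        n = malloc(sizeof(*n));
        if (n == NULL)
                goto fail_node;

        n->name = malloc(strlen(name) + 1);
        if (n->name == NULL)
                goto fail_name;

        strcpy(n->name, name);

        /* Allocate at least one element so NULL always means failure. */
        n->vals = calloc(n_vals > 0 ? n_vals : 1, sizeof(*n->vals));
        if (n->vals == NULL)
                goto fail_vals;

        n->n_vals = n_vals;

        return n;

 fail_vals:
        free(n->name);
 fail_name:
        free(n);
 fail_node:
        return NULL;
}

void node_destroy(struct node * n)
{
        free(n->vals);
        free(n->name);
        free(n);
}
```

The labels are ordered in reverse of the allocations, so every error
path frees exactly the resources acquired before it and nothing is
repeated per branch.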
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
In fset_add, the flow_id was passed to the shm_flow_set without
checking if it was actually valid.
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
| |
Ubuntu 16 comes with older versions of OpenSSL, glibc and
libgcrypt. Ouroboros will now fall back to OpenSSL even if the version
is <= 1.1.0.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
| |
The cryptographic functions require at least OpenSSL 1.1.0. The build
will now check for this version and disable OpenSSL support when this
requirement is not met.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
The client and server side were swapped. This wasn't a big issue, but
now we are sure that the flow allocation response for the server has
arrived at the client (packet reordering could cause the server key to
arrive before the flow is allocated at the client).
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
| |
The wrong pointer was being free'd in case of a derivation error.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a per-message symmetric encryption using the OpenSSL
library. At flow allocation, an Elliptic Curve Diffie-Hellman exchange
is performed to derive a shared secret, which is then hashed using
SHA3-256 to be used as a key for symmetric AES-256 encryption. Each
message on an encrypted flow adds a small crypto header that includes
a random 128-bit Initialization Vector (IV). If the server does not
have OpenSSL enabled, the flow allocation will fail with an -ECRYPT
error.
Future optimizations are to piggyback the public keys on the flow
allocation message, and to enable per-flow encryption that maintains
the context of the encryption over multiple packets and doesn't
require sending IVs.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
The proper initialization of libgcrypt requires a call to
gcry_check_version. The library initialization should first check
whether the application (or some other library) has already
initialized libgcrypt before attempting to initialize it.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
| |
This completes the renaming of the normal IPCP to the unicast IPCP in
the sources, to get everything consistent with the documentation.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
This fixes a use after free in an error condition, and makes sure that
pid is set in the flow_set early on, so flow_set_destroy won't create
a prefix with an uninitialized pid in case of an error in
shm_flow_set_create.
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
| |
This adds some tests for the shm_rbuff after some reports that the
queue length would be erroneously reported as 0 when the rbuff was
full. The test passes for the reported case.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
| |
This fixes writing at high speeds when the rbuff is smaller than the
rdrbuff. The pthread_cond_wait calls were blocking on the wrong
condition variable.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
This allows setting the size of the rbuffs in a system independently
of the main packet buffer using SHM_RBUFF_SIZE. The benefit of setting
a smaller rbuff size is that a single process can't fully occupy the
main packet buffer.
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
| |
The compiler flags for the SWIG target were added to the global
CMAKE_C_FLAGS used for the entire project. This sets the flags
uniquely for the SWIG target. The eth has a similar case for the c99
flag. There was a lingering include in dev.c that was removed.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
| |
The UDP IPCP now uses a fixed server UDP port (default 3435) for all
communications. This allows passing firewalls more easily since only a
single port needs to be opened. The client port can be fixed as well
if needed (default random). It uses an internal eid, so the MTU of the
UDP layer is reduced by 4 bytes, similar to the Ethernet IPCPs.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
| |
The API calls for the IPCP to inform the IRMd of IPCP creation and
incoming flow request had the pid_t in the call. This pid_t is removed
and the getpid() call is now placed inside the function. Also
refactors the cleanup for the main() functions of some of the lower
IPCPs.
Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks>
Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a simple round-trip time estimator to FRCT. The estimate is
a weighted average with deviation. The retransmission is scheduled
after rtt + 2 times the deviation. A retransmit doubles the rtt
estimate to avoid the no-update case when rtt suddenly increases. The
rtt is estimated in microseconds and the granularity for retransmits
is 256 microseconds.
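A minimal sketch of an estimator along these lines is given below. The
retransmission timeout of rtt + 2 times the deviation, the doubling on
retransmit and the 256 microsecond granularity come from the message;
the weights of 1/8 and 1/4 are borrowed from the common TCP estimator
as an assumption, and all identifiers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RTO_GRAN_US 256 /* retransmit granularity in microseconds */

struct rtt_est {
        bool     init; /* false until the first sample arrives */
        uint32_t srtt; /* smoothed RTT (us) */
        uint32_t mdev; /* smoothed mean deviation (us) */
};

/* Fold a new RTT sample (us) into the weighted averages. */
void rtt_update(struct rtt_est * e, uint32_t sample)
{
        int64_t err;
        int64_t dev;

        if (!e->init) {
                e->init = true;
                e->srtt = sample;
                e->mdev = sample / 2;
                return;
        }

        err = (int64_t) sample - e->srtt;
        dev = (err < 0 ? -err : err) - e->mdev;

        e->srtt = (uint32_t) (e->srtt + err / 8); /* assumed weight 1/8 */
        e->mdev = (uint32_t) (e->mdev + dev / 4); /* assumed weight 1/4 */
}

/* Timeout: rtt + 2 * deviation, rounded up to the granularity. */
uint32_t rtt_rto(const struct rtt_est * e)
{
        uint32_t rto = e->srtt + 2 * e->mdev;

        return (rto + RTO_GRAN_US - 1) & ~(uint32_t) (RTO_GRAN_US - 1);
}

/* On a retransmit, double the estimate so it can catch up. */
void rtt_backoff(struct rtt_est * e)
{
        e->srtt *= 2;
}
```

Doubling on retransmit addresses the no-update case mentioned above:
if the real RTT jumps past the timeout, every packet is retransmitted
before its ACK arrives, so the estimator has to be pushed upwards
explicitly.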
Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
|