summaryrefslogtreecommitdiff
path: root/src/lib/timerwheel.c
Commit message (Collapse)AuthorAgeFilesLines
* lib: Fix RTO update on timeoutDimitri Staessens2022-04-031-12/+6
| | | | | | | | | | | | | This fixes the RTO doubling on timeout according to Karn/Partridge. Exponentially increasing RTO when it times out (e.g. doubling) fixes the problem that a sudden increase in real RTT starves the sRTT updates by never getting out of backoff as retransmitted packets can't update RTT. Added an parameter to make it less aggressive, default is doubling. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* build: Update copyright to 2022Dimitri Staessens2022-04-031-1/+1
| | | | | | | Growing pains. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Fix timing of delayed ACKsDimitri Staessens2022-04-011-2/+1
| | | | | | | | | | Delayed ACKs are now sent after twice the internal tick time. Fixes initial ACK record (rcv_cr.seqno) being uninitialized (0) when the first ACK was to be sent. Adds some FRCT metrics for number of received delayed (bare) ACKs and the RTT estimator. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Fix unidirectional FRCT traffic handlingDimitri Staessens2022-03-301-2/+0
| | | | | | | | | | | | | Unidirectional traffic has one of the peers only send bare FRCT packets. These never set a DRF, since they have no sequence number. At the receiver, all these ACKs and window updates were always dropped as the receiver connection record was timed out. Also fixes a SEGV if flow control kicks in (passing NULL timeout to pthread_cond_timedwait). Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Move timerwheel processing to its own threadDimitri Staessens2022-03-301-9/+0
| | | | | | | | | | | | | | | This is the first step moving away from scheduling the FRCT and flow monitoring functions as part of the IPC calls (flow_read / flow_write / fevent) and towards the more scalable (and far less complicated) implementation to take care of these functions in separate threads. If a process creates the first flow that requires FRCT, it will spin up a thread to process events on the timerwheel (retransmissions and delayed ACKs). This single thread lives until the last flow with FRCT is deallocated. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Refactor reading packet from rbuffDimitri Staessens2022-03-301-1/+1
| | | | | | | | | | | | Reading packets from the rbuff and checking their validity (non-zero size, pass crc check, pass decryption) is now extracted into a function. Also adds a function to get the length of an sdu_du_buff instead of subtracting the tail and head pointers. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Non-configurable delayed acks in FRCPDimitri Staessens2022-03-301-9/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It doesn't really make sense to manually and one-sidedly configure the timeout of delayed acknowledgements, as setting it too high upsets the peer's sRTT estimates. Even worse, it also causes a lot of spurious retransmissions if it exceeds the sRTT mean deviation calculated by the receiver. Compensating on bare acknowledgment for the ack delay could improve the RTT estimate deviation, but not the spurious retransmissions if it was set too high. This sets the delayed ack to wait for a single RTT mean deviation. Probably needs more tweaking to further reduce differences between the RTT estimates at the sender and receiver, e.g. compensate the RTT estimate for delayed acks, or increase the RTO to add 8 mdevs to sRTT instead of 4. However, it looks like the mdev estimate is the trickiest one to get to sync, not the RTT average. Linux reduces the sample weight for mdev from 1/4 to 1/32 in some cases, will give that a shot some day too to see if that further align sRTT estimates. In any case, this patch already improves things a lot. Also fixes a bug where the sender was sending acknowlegments on the first packets in flight for the 0 sequence number. The receiver activity was measured in seconds but compared to a timeout value in nanoseconds. There's still a lot of spurious retransmissions that start after actual packet loss occurs, I'm still investigating what causes it. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Expose flow control metrics to RIB0.19.1Dimitri Staessens2022-03-161-2/+2
| | | | | | | | | | This exposes some additional metrics relating to FRCT / Flow control: the number of duplicate packets received, number of packets received out of the flow control window and / or reordering queue, and the number of rendez-vous messages sent. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Fix retransmission schedulingDimitri Staessens2022-03-161-72/+60
| | | | | | | | | | | | | There still were a couple of bugs in the timerwheel. If the future schedule was coinciding with the slot currently being processed (i.e. exactly RXMQ_SLOTS in the future), the list_add_tail caused an infinite loop. Another bug was causing the slots at higher levels to be processed too soon. Retransmissions should now schedule correctly. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Fix buffer allocation when retransmitting0.19.0Dimitri Staessens2022-03-111-9/+29
| | | | | | | | | | | | | | | The timerwheel was retransmitting packets and the error check for negative values of the rbuff allocation was instead checking for non-zero values, causing a buffer allocation to succeed but the program to continue down the unhappy path leaving that packet stuck in the buffer unattended. Also fixes wrongly scheduled retransmissions that cause packet storms. FRCP is much more stable now. Still needs some work for high bandwidth-delay products (fast-retransmit). Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Remove dead code in timerwheelDimitri Staessens2022-03-031-6/+0
| | | | | | | The checked condition can't happen. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Ease lock in timerwheel0.18.4Dimitri Staessens2021-12-221-2/+2
| | | | | | | It was taking a write lock when a read lock was sufficient. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib, ipcpd, irmd: Wrap pthread unlocks for cleanupDimitri Staessens2021-06-231-2/+1
| | | | | | | | | | | | This add an ouroboros/pthread.h header that wraps the pthread_..._unlock() functions for cleanup using pthread_cleanup_push() as this casting is not safe (and there were definitely bad casts in the code). The close() function is now also wrapped for cleanup in ouroboros/sockets.h. This allows enabling more compiler checks. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* build: Update email addressesDimitri Staessens2021-01-031-2/+2
| | | | | | | | The ugent email addresses are shut down, updated to Ouroboros mail addresses. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* build: Update copyright to 2021Dimitri Staessens2021-01-031-1/+1
| | | | | | | Happy New Year, Ouroboros! Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Reduce timerwheel CPU consumptionDimitri Staessens2020-11-251-0/+9
| | | | | | | | | | | The timerwheel is checked during IPC calls (fevent, flow_read), causing huge load on CPU consumption in IPCPs, since they have a lot of fevent() threads for QoS. The timerwheel will need further optimization), but for now I reduced the default tick time to 5 ms and added a boolean to check that the wheel is actually used. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Add Rendez-Vous mechanism for flow controlDimitri Staessens2020-10-111-1/+1
| | | | | | | | | | | | | This adds the rendez-vous mechanism to handle the case where the sending window is closed and window updates get lost. If the sending window is closed, the sender side will send an RDVS every DELT_RDV time (100ms), and give up after MAX_RDV time (1 second). Upon reception of a RDVS packet, a window update is sent immediately. We can make this much more configurable later on (build options for defaults, fccntl for runtime tuning). Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Add compiler configuration for FRCPDimitri Staessens2020-10-111-71/+82
| | | | | | | | | | | | | This allows configuring some parameters for FRCP at compile time, such as default values for Delta-t and configuration of the timerwheel. The timerwheel will now reschedule when it fails to create a packet, instead of setting the flow down immediately. Some new things added are options to store packets for retransmission on the heap, and using non-blocking calls for retransmission. The defaults do not change the current behaviour. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Complete retransmission logicDimitri Staessens2020-09-251-0/+409
| | | | | | | | | | | | | | | | | | | | | | | | | | | This completes the retransmission (automated repeat-request, ARQ) logic, sending (delayed) ACK messages when needed. On deallocation, flows will ACK try to retransmit any remaining unacknowledged messages (unless the FRCTFLINGER flag is turned off; this is on by default). Applications can safely shut down as soon as everything is ACK'd (i.e. the current Delta-t run is done). The activity timeout is now passed to the IPCP for it to sleep before completing deallocation (and releasing the flow_id). That should be moved to the IRMd in due time. The timerwheel is revised to be multi-level to reduce memory consumption. The resolution bumps by a factor of 1 << RXMQ_BUMP (16) and each level has RXMQ_SLOTS (1 << 8) slots. The lowest level has a resolution of (1 << RXMQ_RES) (20) ns, which is roughly a millisecond. Currently, 3 levels are defined, so the largest delay we can schedule at each level is: Level 0: 256ms Level 1: 4s Level 2: about a minute. Signed-off-by: Dimitri Staessens <dimitri@ouroboros.rocks> Signed-off-by: Sander Vrijders <sander@ouroboros.rocks>
* lib: Support for rudimentary retransmissionDimitri Staessens2018-07-271-232/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | This adds rudimentary support for sending and processing acknowledgments and doing retransmission. It replaces the generic timerwheel with a specific one for retransmission. This is currently a fixed wheel allowing retransmissions to be scheduled up to about 32 seconds into the future. It currently has an 8ms resolution. This could be made configurable in the future. Failures of the flow (i.e. rtx not working) are indicated by the rxmwheel_move() function returning a fd. This is currently not yet handled (maybe just setting the state of the flow to FLOWDOWN is a better solution). The shm_rdrbuff tracks the number of users of a du_buff. One user is the full stack, each retransmission will increment the refs counter (which effectively acts as a semaphore). The refs counter is decremented when a packet is acked. The du_buff is only allowed to be removed if there is only one user left (the "stack"). When a packet is retransmitted, it is copied in the rdrbuff. This is to ensure integrity of the packet when multiple layers do retransmission and it is passed down the stack again. Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be> Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
* lib: Fix assertions in timerwheel0.9.6Sander Vrijders2018-01-291-2/+2
| | | | | | | | The previous commit only fixed the issue for release builds. This fixes it for debug builds as well. Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be> Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
* lib: Fix bad comparison in timerwheel0.9.5Sander Vrijders2018-01-291-3/+3
| | | | | | | | A comparison was done in the timerwheel between an unsigned value and a time_t. Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be> Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be>
* include, src: Update copyright to 2018Dimitri Staessens2018-01-091-1/+1
| | | | | | | Happy New Year, Ouroboros. Signed-off-by: Dimitri Staessens <dimitri.staessens@ugent.be> Signed-off-by: Sander Vrijders <sander.vrijders@ugent.be>
* lib: Make timerwheel a passive componentSander Vrijders2017-08-221-150/+8
| | | | | | This turns the timerwheel into a passive component since it is used by application using the library. The user of the timerwheel now has to call timerwheel_move to advance the timerwheel.
* build: Revise the build systemdimitri staessens2017-08-211-1/+4
| | | | | | | | | | This revises the build system to have configuration per system component. System settings can now be set using cmake. The standard compliance defines were removed from configuration header and are set in the sources where needed. Also some small code refactors, such as moving the data for shims out of the ipcp structure to the respective shims were performed.
* lib: Add basic FRCT mechanismsSander Vrijders2017-08-171-0/+371
This adds the basic FRCT mechanisms to the library. Upon flow alloc or accept an FRCT instance is now created and used when reading or writing to the flow. The timerwheel has been refactored to allow recharging timers and removing them and is now part of the library. The first SDU sent over the connection has the DRF set and this initializes the connection. Sender and receiver inactivity timers are added.