Definitions

From Ouroboros
Jump to navigation Jump to search

Definitions: Flow, Routing, and Forwarding

This document contains formal definitions used in the Ouroboros architecture. The graph definitions follow Dieter Jungnickel's excellent Graph, Networks and Algorithms[1].

Basic Graph Definitions

A graph is a pair consisting of a set of vertices and a set of edges of two-element subsets of . An edge has (distinct) end vertices and .

A directed graph or digraph is a pair consisting of a set of vertices and a set of ordered pairs (called arcs) where .

A network is a digraph on which a mapping from the edgeset to the reals is defined; the number is called the weight of the arc [note 1].

If , then is adjacent to and incident with . The set of neighbors of a vertex is the set of vertices that are adjacent to , where contains all such that .

Walks, Paths, and Cycles

A walk is a sequence of vertices so that . So each walk also implies a sequence of edges. We define the weight of a walk as the sum of the weights of its arcs, .

If the source and destination are the same (), the walk is closed.

A walk where each of the arcs is distinct is called a trail.

A path is a trail for which each of the are distinct.

A closed trail for which the are distinct is called a cycle[note 2].

A directed acyclic graph (DAG) is a digraph that does not contain any cycles.

If for every pair of vertices there is at least one path , we call the (di)graph connected.

Distance and Metrics

The distance function defined by the weight of the shortest path between two vertices in a network (or if no such path exists) is called the geodesic distance.

A metric is a (distance) function fulfilling positivity, symmetry and triangle inequality:

A geodesic distance is in general not a metric since it doesn't always fulfill the positivity and symmetry requirements in a digraph.

Routing, Forwarding and Flooding

With routing we typically mean the process of selecting a path for traffic in a network. In computer science literature, there are two main groups of approaches to routing.

On the one hand there are the hierarchical routing solutions. This is the approach taken in IP networks, where a set of subnetworks is defined using prefixes or subnet masks.[2] A scalability issue with IP stems from not following the partial ordering implied by the subnetting in the delegation of IP addresses, causing fragmentation of the IP address space and making prefix aggregration in routers increasingly inefficient.[note 3]

On the other hand we have geographic or geometric routing,[3] where each node is assigned a coordinate so next hops can be calculated making use of the coordinates of the direct neighbors.[note 4]

The examples above illustrate that the concept of routing encompasses both the dissemination and gathering of information about the network, and the algorithms for calculation and selection of the paths. We will now define the concepts underpinning "routing" more formally. The reasoning behind it is similar to the reasoning commonly used in formulating (integer) linear programming solutions to problems in graph theory.

Let be a network.

ROUTING is any algorithm that, given source and destination vertices and , for each vertex in returns a subset of neighbors with the associated set of arcs , so that the following 4 conditions are met:

  1. The graph is a directed acyclic subgraph of ;
  2. is the only vertex in with only outgoing arcs;
  3. is the only vertex in with only incoming arcs; and
  4. if and only if there is no path between and in .

Equivalently,[note 5] ROUTING is any algorithm that, given source and destination vertices and , for each vertex in returns a subset of neighbors so that:

  1. for any chosen distance function on ; and
  2. if and only if there is no path between and in .

In other words, either the distance function bounds the routing solution, or the routing solution bounds the distance function.

The formal definition says that a routing algorithm is equivalent a tree-like structure where every node along any valid path selects one or more neighbors that are strictly "closer" to the destination than itself. This structure can contain multiple paths from source to destination, allowing routing solution that move different packets (or fragments) along different routes through the network. The key guarantee is that all these paths always make progress toward their destination and can not contain (permanent) loops - think of it like water flowing downhill where it may split into multiple streams, but each stream always flows to a lower elevation and can never flow back uphill.

This definition is very broad "one or more neighbors" selection allows solutions that send duplicates, erasure coding packet fragments etc.

We define FORWARDING as any algorithm that, for each vertex in returns the set of arcs . FORWARDING is often implemented as ROUTING with an additional edge selection step.

The necessary and sufficient condition for ROUTING is full knowledge of the graph and a valid geodesic distance . For a given vertex , the necessary and sufficient set of information to obtain is knowledge of its neighbor set and the subset of the geodesic distances originating at its neighbors, .

In less formal terms, ROUTING and FORWARDING provide a set of vertices and arcs, respectively, so that there are never loops if one travels to a next vertex or along an arc in the set. If implemented in a centralized way, ROUTING and FORWARDING roughly need to know the full network. When implemented in a distributed way, a node roughly needs to know its neighbors and the distances to the destination from itself and from all its neighbors.

The definitions above show what information needs to be disseminated in a network to allow FORWARDING. Let's assume that vertices know their neighbors or incident outgoing arcs, then what is needed is a dissemination procedure for the (geodesic) distance function . This is implemented in a class of dissemination protocols, called link-state routing protocols.

FLOODING is any algorithm that, for each vertex in returns a subset of neighbors (with the associated set of arcs ) so that:

  1. The graph is a connected subgraph of ; and
  2. is the only vertex in with only outgoing arcs.

Equivalently, FLOODING is any algorithm that, for each vertex in returns a subset of neighbors so that:

  1. for any chosen distance function on .

In other words, FLOODING creates a connected tree rooted at that reaches all (reachable) vertices in the network. Each vertex selects neighbors that are strictly farther from the source than itself, ensuring the flood propagates outward without backtracking.

Flow

A flow is the abstraction of a collection of resources within a network layer that allow bidirectional communications using packets between two processes that are clients of this layer. A flow enables a point-to-point packet delivery service and can be viewed as a bidirectional pipe that has a number of observable quantities associated with it that describe the probability of a packet of given size being transferred within a certain period of time . The maximum probability for error-free transfer depends on the packet-drop-rate (PDR) and bit-error-rate (BER) of the flow.

The Ouroboros architecture ensures that flows are content neutral, i.e. the probability above is independent of the bits that make up the packets sent along a flow.

Flow Characteristics

The delay or latency describes how long packets take to traverse the flow, and the variation on the delay is called jitter, or more precisely, packet delay variation.[4] The delay for a flow has four main components:

  • Propagation delay
  • Queuing delay
  • Transmission delay
  • Processing delay

Flow Bounds

There are 2 upper bounds:

  • The Maximum Packet Lifetime (MPL) is the maximum amount of time that a packet can take to transfer over the flow
  • The Maximum Packet Size (MPS) is the maximum length for a packet to be acceptable for transfer

In other words, the probability for a packet to arrive after MPL expires should be close to 0,[note 6] and the probability for a packet to arrive that exceeds the MPS is equal to 0.

Similarly, there can be lower bounds such as Minimum Packet Lifetime (mPL) and Minimum Packet Size (mPS).

Flow Resources

The resources that make up a layer are finite, limiting the total number of packets that can traverse the network layer in a given period of time. Flows provide the mechanism to put a network layer fully in control of its own resources. The resources that constitute the flow can either be shared with other flows or dedicated (reserved) for this particular flow.

Other externally measureable quantities can be associated with a flow, such as bandwidth and load for flows with reserved resources. The probability function may depend on these quantities (e.g. if the load reaches a certain threshold, delay could increase).

Flow Identifiers

A flow endpoint is identified in a system by a flow ID (FID), which defines the layer boundary[note 7]. For security reasons, a process has no direct access to the flow, but rather accesses the flow through a Flow Descriptor (FD) to read and write from a flow. Flow identifiers are unique within the scope of a system, flow descriptors are unique within the scope of a process[note 8].

Flows are an important concept for enabling Quality-of-Service (QoS) in a layer.

Footnotes

  1. The weight of an edge is often called a metric in computer science literature (e.g. BGP metric). To avoid confusion, we use the term metric only in its mathematical meaning.
  2. A Hamiltonian cycle is a special case of a closed walk that includes every vertex in the graph exactly once. This is necessarily also a closed trail.
  3. Rekhter's Law: Addressing can follow topology or topology can follow addressing. Choose one. RFC 4984.
  4. Geographic coordinates are a compound name, consisting of latitudes and longitudes, but there is no order implied between these two coordinate spaces. Hence the partial ordering in our definition of an address.
  5. Proof of Equivalence: choose for the length of the longest path between the corresponding vertices in .
  6. This may be hard to guarantee with 100 percent certainty, so MPL should be large enough so that this probability is 0 in practice. Ouroboros estimates MPL based on the Layer characteristics. IP has a bound on the Maximum Datagram Lifetime (MDL) via the Time-To-Live (v4) or Hop Limit (v6) decrement in each router to a maximum of 255 seconds (RFC 791), with a recommended value of 64 seconds (RFC 1700). In addition, TCP defines the Maximum Segment Lifetime (MSL) as 120s (RFC 793).
  7. In this respect, a flow id is similar to an OSI Service Access Point (SAP) or RINA port id.
  8. This is similar in function to a UNIX file descriptor.

References

  1. Jungnickel, D. (2007). Graphs, Networks and Algorithms.
  2. RFC 4632: Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan.
  3. Kuhn, F., Wattenhofer, R., and Zollinger, A. (2003). Worst-case optimal and average-case efficient geometric ad-hoc routing.
  4. RFC 3393: IP Packet Delay Variation Metric for IP Performance Metrics (IPPM).