Commit 22a815a

Refinement of blog post on decentralized coordination
1 parent 4c891f4 commit 22a815a

2 files changed

+74
-26
lines changed

blog/2025-09-18-decentralized-coord.md

Lines changed: 73 additions & 26 deletions
@@ -2,52 +2,99 @@
22
slug: decentralized-coordination
33
title: "Consistency and Availability Challenges with Decentralized Coordination"
44
authors: [fra-p, eal, rcakella]
5-
tags: [lingua franca, federation, decentralized]
5+
tags: [lingua franca, federation, decentralized, STA]
66
---
77

8-
The design of [distributed applications](/docs/writing-reactors/distributed-execution) in Lingua Franca requires care, particularly if the coordination of the federation is [decentralized](/docs/writing-reactors/distributed-execution#decentralized-coordination).
8+
The design of [distributed applications](/docs/writing-reactors/distributed-execution) in Lingua Franca requires care, particularly if the coordination of the federation is [decentralized](/docs/writing-reactors/distributed-execution#decentralized-coordination). The intent of this post is to illustrate and handle the challenges arising from designing distributed applications in Lingua Franca, focusing on a realistic automotive use case.
9+
10+
## Automatic emergency braking use case
11+
![AutomaticEmergencyBrakingSystem diagram](../static/img/blog/AutomaticEmergencyBrakingSystem.svg)
912

1013
Consider the above Lingua Franca implementation of an automatic emergency braking system, one of the most critical ADAS systems which modern cars are equipped with.
11-
The controller system reads data coming from two sensors, a lidar and a radar, and uses both to detect if objects or pedestrians cross the path of the car, thus performing sensor fusion.
12-
When either of the two signals the presence of a close object, the controller triggers the brake to stop the car and avoid crashing into it.
14+
The controller system modeled by the `AutomaticEmergencyBraking` reactor reads data coming from two sensors, a lidar and a radar, and uses both to detect if objects or pedestrians cross the trajectory of the car, thus performing _sensor fusion_.
15+
When one of the two sensors signals the presence of an object at a distance shorter than a configurable threshold, the controller triggers the brake to stop the car and avoid crashing into it.
16+
17+
The sensors are modeled with their own timer that triggers the generation of data. The clocks of all federates are automatically synchronized by the [clock synchronization algorithm](/docs/writing-reactors/distributed-execution#clock-synchronization) of the Lingua Franca runtime.
18+
Typically, in a real use case of this kind, the clock of sensor devices cannot be controlled by Lingua Franca, but a way to work around this limitation is to resample the data collected by sensors with the timing given by a clock that the runtime can control.
19+
The sensor reactors of our application are then modeling this resampling of sensor data that fits well with the Lingua Franca semantics for time determinism.
20+
21+
The lidar sensor has a sampling frequency that is twice that of the radar, and this is reflected by the timer in the corresponding reactors: the lidar timer has a period of 50ms, while the radar timer has a period of 100ms.
22+
Their deadline is equal to their period and is enforced using the dedicated `DeadlineCheck` reactors, following the guidelines of how to [work with deadlines](/blog/deadlines).
1323

14-
The lidar sensor has a higher sampling frequency, while the radar is slower, and this is reflected by the timer in the corresponding reactors.
15-
Their deadline is equal to their period and is enforced using dedicated deadline checking reactors, following the guidelines of how to [work with deadlines](/blog/deadlines).
24+
The sensor behavior in the application is simulated in a way that each sensor constantly produces distance values above the threshold (i.e., no objects in the way), and then at a random time it sends a distance value below the threshold, indicating the presence of a close object. When the `AutomaticEmergencyBraking` reactor receives that message, it signals the `BrakingSystem` reactor to brake the car, and the whole system shuts down.
25+
26+
### Desired system properties
1627
Availability is a crucial property of this application, because we want the automatic emergency braking system to brake as fast as possible when a close object is detected. Consistency is also necessary: sensor fusion happens with sensor data produced at the same logical time, so in-order data processing is critical.
1728

29+
### Challenges of decentralized coordination
1830
The application is implemented as a federated program with decentralized coordination, which means that the advancement of logical time in each single federate is not subject to approval from any centralized entities, but it is done locally based on the input it receives from the other federates.
19-
Consistency problems may arise when a federate receives data from two or more federates, as it is the case of the automatic emergency braking reactor.
20-
As an example, the controller expects to receive input from both sensors at times 0ms, 100ms, 200ms, etc. Let's consider the case where the remote connection between the controller and the radar has a slightly larger delay than that between the controller and the lidar. The lidar input will arrive slightly earlier than the radar one. When the controller receives the lidar input, should it process the data immediately, or should it wait for the radar input to come? Sensor fusion requires consistency: if the controller processes the input from the lidar and then the radar data comes, the elaborated control action did not take into account both sensors even though it should have.
2131

22-
The desired behavior with simultaneous inputs is highly dependent on the application under analysis, and Lingua Franca lets you customize it. Each federate has a parameter called [STA (safe-to-advance)](/docs/writing-reactors/distributed-execution#safe-to-advance-sta) that controls how long the federate should wait for inputs from other federates before processing an input it has just received.
23-
More precisely, the STA is how much time a federate waits before advancing its tag to that of the just received event, when it is not known if the other input ports will receive data at the same or an earlier tag. At the expiration of the STA, the federate assumes that those unresolved ports will not receive data at earlier tags, and advances its logical time to the tag of the received event.
32+
#### Consistency challenge
33+
Consistency problems may arise when a federate receives data from two or more federates, as is the case for the `AutomaticEmergencyBraking` reactor.
34+
The controller expects to receive input from both sensors at times 0ms, 100ms, 200ms, etc. Let's consider as an example the case where the remote connection between the controller and the radar has a slightly larger delay than that between the controller and the lidar. The lidar input will then always arrive slightly earlier than the radar one. When the controller receives the lidar input, should it process the data immediately, or should it wait for the radar input to come? Sensor fusion requires consistency: if the controller processes the input from the lidar and then the radar data comes, the control action elaborated upon the arrival of the lidar data does not take into account both sensors, even though it should. Hence, in our use case, the `AutomaticEmergencyBraking` reactor needs to wait for both inputs before processing new data.
35+
36+
In general, the desired behavior with simultaneous inputs and decentralized coordination is highly dependent on the application under analysis, and Lingua Franca lets you customize it. Each federate has a parameter called [`STA` (safe-to-advance)](/docs/writing-reactors/distributed-execution#safe-to-advance-sta) that controls how long the federate should wait for inputs from other federates before processing an input it has just received.
37+
More precisely, the `STA` is how much time a federate waits before advancing its tag to that of the just received event, when it is not known if the other input ports will receive data at the same or an earlier tag. At the expiration of the `STA`, the federate assumes that those unresolved ports will not receive any data at earlier tags, and advances its logical time to the tag of the received event.
38+
39+
When a reactor commits to a tag after the `STA` expires, it may happen that one of the unresolved ports receives new data at an earlier logical time.
40+
Since the current tag is greater than the just received one, this event cannot be processed, as it would result in out-of-order handling of messages, thus violating the Lingua Franca semantics.
41+
In such cases, a safe-to-process (`STP`) violation occurs, the received event is dropped and a [fault handler](/docs/writing-reactors/distributed-execution#safe-to-process-stp-violation-handling) is executed instead: consistency is then preserved.
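The check behind an `STP` violation can be sketched in plain C. This is only an illustration of the rule described above, not the Lingua Franca runtime's implementation: the `sketch_tag_t` type and the `sketch_tag_compare` and `is_stp_violation` helpers are hypothetical names invented for this example, and the sketch assumes that a tag is a pair of logical time and microstep.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical tag type for this sketch: logical time plus microstep.
typedef struct {
    int64_t time_ns;
    uint32_t microstep;
} sketch_tag_t;

// Returns a negative, zero, or positive value when a is earlier than,
// equal to, or later than b.
int sketch_tag_compare(sketch_tag_t a, sketch_tag_t b) {
    if (a.time_ns != b.time_ns) return a.time_ns < b.time_ns ? -1 : 1;
    if (a.microstep != b.microstep) return a.microstep < b.microstep ? -1 : 1;
    return 0;
}

// A message whose tag is strictly earlier than the tag the federate has
// already committed to cannot be processed in order.
bool is_stp_violation(sketch_tag_t current, sketch_tag_t incoming) {
    return sketch_tag_compare(incoming, current) < 0;
}
```

For example, a federate that has already advanced to 100ms and then receives a message tagged 50ms would report a violation under this check, while a message tagged 100ms or later would not.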
2442

25-
The maximum consistency guarantee is given by indefinitely waiting for the radar input before processing the radar, i.e., STA = forever, but this is viable only if the following two conditions are always satisfied:
43+
In our application, we aim to avoid `STP` violations and process all incoming data for sensor fusion. The maximum consistency guarantee is given by _indefinitely waiting_ for the radar input before processing the lidar input, i.e., `STA = forever`, but this is viable only if the following two conditions are always satisfied:
2644
* the communication medium between the sensors and the controller is perfectly reliable; and
2745
* none of the three federates is subject to faults.
2846

29-
These conditions guarantee that all expected data will be generated, sent and correctly received by the communication parties.
47+
These conditions guarantee that all expected data will be generated, sent and correctly received by the communicating parties. If either of the two does not hold, the application may block indefinitely.
3048

31-
However, setting the STA to forever creates problems when only the lidar input is expected (50ms, 150ms, 250ms, etc): the controller cannot process that input until an input from the radar comes, because the STA will never expire. For example, if the single lidar input comes at 50ms, it has to wait until time 100ms before being processed. If that input was signaling the presence of a close object, the detection would be delayed by 50ms, which may potentially mean crashing into the object. The automatic emergency braking system must be available, otherwise it might not brake in time to avoid collisions.
32-
The ideal STA value for maximum availability in the time instants with only the lidar input is 0, because if a single input is expected, no wait is necessary.
49+
#### Availability challenge
50+
However, setting the `STA` to `forever` creates problems when only the lidar input is expected (50ms, 150ms, 250ms, etc): the controller cannot process that input until an input from the radar comes, because the `STA` will never expire. For example, if the single lidar input comes at time 50ms, it has to wait until time 100ms before being processed. If that input was signaling the presence of a close object, the detection would be delayed by 50ms, which may potentially mean crashing into the object. The automatic emergency braking system must be available, otherwise it might not brake in time to avoid collisions.
51+
The ideal `STA` value for maximum availability in the time instants with only the lidar input is 0, because if a single input is expected, no wait is necessary.
3352

34-
Summing up, consistency for sensor fusion requires STA=forever when inputs from both sensors are expected, while availability calls for STA=0 when only the lidar input is coming. The two values are at odds, and any value in between would mean sacrificing both properties at the same time.
53+
Summing up, consistency for sensor fusion requires `STA = forever` when inputs from both sensors are expected, while availability calls for `STA = 0` when only the lidar input is coming. The two values are at odds, and any value in between would mean sacrificing both properties at the same time.
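As a rough illustration of this tension, the C sketch below lists which inputs are expected at a given logical time and the `STA` value that would be ideal at that instant. It is not Lingua Franca code: the helper names `lidar_expected`, `radar_expected`, and `ideal_sta_ms`, as well as the `STA_FOREVER` sentinel, are assumptions made up for this example, and the periods are the 50ms and 100ms values stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical sentinel standing in for "wait forever";
// not the constant used by the LF runtime.
#define STA_FOREVER INT64_MAX

// Sensor periods taken from the post: lidar every 50ms, radar every 100ms.
#define LIDAR_PERIOD_MS 50
#define RADAR_PERIOD_MS 100

bool lidar_expected(int64_t t_ms) {
    assert(t_ms >= 0);
    return t_ms % LIDAR_PERIOD_MS == 0;
}

bool radar_expected(int64_t t_ms) {
    assert(t_ms >= 0);
    return t_ms % RADAR_PERIOD_MS == 0;
}

// Ideal STA at logical time t_ms: wait indefinitely when both inputs are
// expected (consistency), do not wait when at most one is expected
// (availability).
int64_t ideal_sta_ms(int64_t t_ms) {
    if (lidar_expected(t_ms) && radar_expected(t_ms)) return STA_FOREVER;
    return 0;
}
```

Evaluating `ideal_sta_ms` at 0ms, 50ms, 100ms, and 150ms alternates between `STA_FOREVER` and 0, which is why no single static value can serve both properties.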
3554

36-
The knowledge of the timing properties of the application under analysis enables the a priori determination of the time instants when both inputs are expected and those when only the lidar has new data available.
37-
Lingua Franca allows to dynamically change the STA in the reaction body using the lf_set_maxwait API, that takes as input parameter the new STA value to set.
55+
### Dynamic adjustment of STA
56+
The knowledge of the timing properties of the application under analysis enables the _a priori_ determination of the time instants when both inputs are expected and those when only the lidar has new data available.
57+
Lingua Franca allows the `STA` to be changed dynamically in the reaction body using the `lf_set_sta` API, which takes the new `STA` value as its parameter.
3858
This capability of the language permits the automatic emergency braking federate to:
39-
* start with the STA statically set to forever, because at time 0 (startup) both sensors produce data;
40-
* set the STA to 0 after processing both inputs arrived at the same logical time, because the next data will be sent by the lidar only;
41-
* set the STA back to forever after processing the radar input alone, because the next data will be sent by both sensors.
59+
* start with the `STA` statically set to `forever`, because at time 0 (startup) both sensors produce data;
60+
* set the `STA` to 0 after processing both inputs that arrived at the same logical time, because the next data will be sent by the lidar only;
61+
* set the `STA` back to `forever` after processing the lidar input alone, because the next data will be sent by both sensors.
4262

4363
This dynamic solution guarantees both consistency and availability in all input cases.
64+
The implementation of the `AutomaticEmergencyBraking` reactor is shown below:
4465

45-
Knowing the LF decentralized coordination:
46-
- consistency = in-order processing of events even with multiple events
47-
- availability = the system is responsive even with a single input
66+
```lf-c
67+
reactor AutomaticEmergencyBraking(dist_thld: float = 20.0) {
68+
input lidar_in: float
69+
input radar_in: float
70+
output brake: int
71+
state n_invocs: int = 0
4872
49-
Oh, maybe mention that the clock of the two sensors is synced because we're resampling the data
73+
reaction (lidar_in, radar_in) -> brake {=
74+
if (lf_is_present(lidar_in) && lidar_in->value < self->dist_thld) {
75+
printf("Lidar has detected close object -> signaling braking\n");
76+
lf_set(brake, 1);
77+
lf_request_stop();
78+
} else if (lf_is_present(radar_in) && radar_in->value < self->dist_thld) {
79+
printf("Radar has detected close object -> signaling braking\n");
80+
lf_set(brake, 1);
81+
lf_request_stop();
82+
}
5083
51-
I might also say that forever does not work when one of the sensors is delayed too much or when the medium fails for too much time, in which cases a finite STA is better (like a period or something) (this is gonna be the topic of a new blog post)
84+
self->n_invocs++;
85+
if (self->n_invocs % 2) {
86+
lf_set_sta(0);
87+
} else {
88+
lf_set_sta(FOREVER);
89+
}
90+
=} deadline(100ms) {=
91+
printf("AEB deadline violated\n");
92+
=} STA(forever) {=
93+
printf("STP violation on AEB\n");
94+
=}
95+
}
96+
```
5297

53-
-maybe a little bit of what happens when out-of-order msg.s are received? (not sure this is really needed though)
98+
The `dist_thld` parameter is the distance threshold from detected objects below which the `AutomaticEmergencyBraking` reactor activates the brakes.
99+
The reaction body reads the distance reported by the lidar and the radar, and if either is less than the threshold, it sends a signal to the `BrakingSystem` reactor.
100+
The `n_invocs` integer state variable counts the number of times the reaction of the `AutomaticEmergencyBraking` reactor is invoked. This variable is used to determine how many inputs the reaction will see at the next invocation and set the `STA` accordingly. Even invocation numbers mean that the next reaction invocation will happen with both sensor inputs present, so the `STA` is set to `forever`; with odd invocation numbers, the next reaction invocation will see new data from the lidar only, and the `STA` is then set to 0.
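The alternation driven by `n_invocs` can be checked in isolation with a small C sketch that mirrors only the counter logic at the end of the reaction, without the Lingua Franca runtime. The `aeb_state_t` struct, the `next_sta` helper, and the `STA_FOREVER` sentinel are hypothetical names introduced for this example; in the reactor itself the value is passed to `lf_set_sta` rather than returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical sentinel standing in for "wait forever";
// not the constant used by the LF runtime.
#define STA_FOREVER INT64_MAX

// Hypothetical stand-in for the reactor's state.
typedef struct {
    int n_invocs;
} aeb_state_t;

// Increments the invocation counter and returns the STA to apply while
// waiting for the next invocation: 0 after an odd invocation (lidar-only
// input comes next), forever after an even one (both inputs come next).
int64_t next_sta(aeb_state_t *self) {
    assert(self != NULL);
    self->n_invocs++;
    return (self->n_invocs % 2) ? 0 : STA_FOREVER;
}
```

Starting from `n_invocs = 0`, the first call (the invocation at time 0 with both inputs) yields 0, the second (lidar only at 50ms) yields `STA_FOREVER`, and the pattern repeats. Note that this scheme assumes every expected invocation actually happens; a dropped input would shift the parity.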

static/img/blog/AutomaticEmergencyBrakingSystem.svg

Lines changed: 1 addition & 0 deletions
