Introduction to Real-time Programming

Author: Tobit Flatscher (2023 - 2024)

0. Introduction

This guide gives an introduction to programming real-time systems, focusing on C++ and ROS 2. It outlines common mistakes that beginners make when starting to program real-time applications and gives tips on what to pay attention to when programming with ROS. The ROS part is strongly influenced by the ROS 2 real-time working group presentation at ROSCon 2023.

1. Basics

Real-time programming requires a good understanding of how computers, their operating systems and programming languages work under the hood. While you will find several books and articles, in particular from people working in high-frequency trading, that discuss advanced aspects of low-latency programming, there is little beginner-friendly literature.

One can find a few developer checklists for real-time programming such as this and this one. Here is a more complete checklist of important aspects to consider when writing low-latency code. The examples make use of C/C++ but these paradigms apply to all programming languages:

  • Take care when designing your own code and implementing your own algorithms:

    • Select algorithms by their worst-case run-time and not their average run-time (see also here). Keep in mind though that these complexities are derived from asymptotic analysis for a large number of elements and hold only up to a constant factor (which can be quite large): two O(n) algorithms might differ considerably in computational speed (number of instructions and CPU cycles taken) even though they scale similarly, and an O(n²) algorithm might be faster than an O(n) one for small container sizes. Therefore, in particular for small or fixed-size containers, it is necessary to benchmark the chosen algorithm. For small containers the cache locality and memory allocation behavior of an algorithm will often be far more important than its asymptotic scaling behavior.
    • Split your code into a part that has to be real-time and a non real-time part and make them communicate using lockless programming techniques.
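One common lockless technique for passing data between a real-time and a non real-time thread is a single-producer/single-consumer ring buffer. The following is only a minimal sketch under stated assumptions: the class name `SpscQueue` and its interface are hypothetical and not taken from any library, exactly one thread pushes and exactly one other thread pops, and the element type is trivially copyable. One slot is always kept free to distinguish a full from an empty queue.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <type_traits>

// Hypothetical single-producer/single-consumer ring buffer (illustrative name).
// Assumption: exactly one thread calls push() and exactly one other thread calls pop().
template <typename T, std::size_t Capacity>
class SpscQueue {
  static_assert(Capacity >= 2, "One slot is always kept free");
  static_assert(std::is_trivially_copyable<T>::value, "Copying elements must not allocate");

 public:
  // Called by the producer (e.g. the real-time thread). Never blocks or allocates.
  bool push(T const& value) noexcept {
    std::size_t const head {head_.load(std::memory_order_relaxed)};
    std::size_t const next {(head + 1) % Capacity};
    if (next == tail_.load(std::memory_order_acquire)) {
      return false;  // Queue full: report failure instead of waiting
    }
    buffer_[head] = value;
    head_.store(next, std::memory_order_release);  // Publish the element to the consumer
    return true;
  }

  // Called by the consumer (e.g. the non real-time thread).
  bool pop(T& value) noexcept {
    std::size_t const tail {tail_.load(std::memory_order_relaxed)};
    if (tail == head_.load(std::memory_order_acquire)) {
      return false;  // Queue empty
    }
    value = buffer_[tail];
    tail_.store((tail + 1) % Capacity, std::memory_order_release);  // Free the slot
    return true;
  }

 private:
  std::array<T, Capacity> buffer_ {};
  std::atomic<std::size_t> head_ {0};  // Next slot to write, modified only by the producer
  std::atomic<std::size_t> tail_ {0};  // Next slot to read, modified only by the consumer
};
```

A real-time thread would call `push` and simply drop (or count) elements when the queue is full, while the non real-time thread polls `pop` at its own pace. In production code a well-tested library implementation is likely the better choice.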
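  • Set a real-time priority for your real-time code (see here). On Linux the real-time scheduling policies use static priorities from 1 to 99, which are separate from the nice values used by the default scheduling policy. 80 is a good starting point. Using too high priorities is not advised as this might result in problems with kernel threads: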

    #include <iostream>
    #include <pthread.h>
    #include <sched.h>
    
    ::pthread_t const current_thread {::pthread_self()}; // or t.native_handle() for an std::thread
    int policy {};
    struct ::sched_param param {};
    ::pthread_getschedparam(current_thread, &policy, &param);
    // Priorities above 0 are only valid for real-time policies such as SCHED_FIFO or SCHED_RR:
    // under the default policy SCHED_OTHER the call below fails. It also requires sufficient privileges.
    param.sched_priority = 80; // or use ::sched_get_priority_max(some_policy)
    if (::pthread_setschedparam(current_thread, policy, &param) == 0) {
      std::cout << "Set thread priority to '" << param.sched_priority << "'." << std::endl;
    } else {
      std::cerr << "Failed to set thread priority to '" << param.sched_priority << "'!" << std::endl;
    }
  • Set a scheduling policy that fits your needs (see here). SCHED_FIFO is likely the one you want to go for if you do not have a particular reason to do otherwise:

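    #include <iostream>
    #include <pthread.h>
    #include <sched.h>
    
    ::pthread_t const current_thread {::pthread_self()};
    int policy {};
    struct ::sched_param param {};
    ::pthread_getschedparam(current_thread, &policy, &param);
    policy = SCHED_FIFO;
    // SCHED_FIFO requires a priority of at least ::sched_get_priority_min(SCHED_FIFO) (1 on Linux)
    if (param.sched_priority < ::sched_get_priority_min(SCHED_FIFO)) {
      param.sched_priority = ::sched_get_priority_min(SCHED_FIFO);
    }
    if (::pthread_setschedparam(current_thread, policy, &param) == 0) {
      std::cout << "Set scheduling policy to '" << policy << "'." << std::endl;
    } else {
      std::cerr << "Failed to set scheduling policy to '" << policy << "'!" << std::endl;
    }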
  • Pin the thread to a CPU core that was previously isolated on the operating system. This way the thread does not have to compete for resources with other processes running on the same core:

    #include <iostream>
    #include <pthread.h>
    #include <sched.h>
    
    constexpr int cpu_core {0}; // Index of the isolated core, adapt it to your system
    ::pthread_t const current_thread {::pthread_self()};
    ::cpu_set_t cpuset {};
    CPU_ZERO(&cpuset);
    CPU_SET(cpu_core, &cpuset);
    if (::pthread_setaffinity_np(current_thread, sizeof(::cpu_set_t), &cpuset) == 0) {
      std::cout << "Set thread affinity to cpu '" << cpu_core << "'." << std::endl;
    } else {
      std::cerr << "Failed to set thread affinity to cpu '" << cpu_core << "'!" << std::endl;
    }

    This can be tested by stressing the system e.g. with stress-ng. In a process viewer like htop you should see that the non-isolated cores are fully used while the isolated CPU cores only run the intended code and are only partially used.

  • Dynamic memory allocation (reserving virtual and physical memory) is slow and so is copying. Both are generally not real-time safe. Avoid any form of dynamic memory allocation inside real-time code:

    • Do not use explicit dynamic memory allocation. Pre-allocate all required memory before entering a real-time section (e.g. with std::vector<T,Alloc>::reserve).

    • Also avoid structures that are using dynamic memory allocation under the hood such as std::string in C++. Mutate strings to eliminate temporary copies.

    • Lock memory pages with mlockall. This locks the process's virtual address space into RAM, preventing that memory from being paged out to the swap area.

      #include <sys/mman.h>
      
      ::mlockall(MCL_CURRENT | MCL_FUTURE);
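The pre-allocation advice above can be sketched as follows. This is only an illustration under stated assumptions: the class name `SampleBuffer` is hypothetical, the maximum number of elements is known before the real-time section starts, and the constructor is called outside of the real-time section.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sample buffer (illustrative name): all memory is allocated in the
// constructor, i.e. before the real-time section is entered.
class SampleBuffer {
 public:
  explicit SampleBuffer(std::size_t const max_samples) : max_samples_ {max_samples} {
    samples_.reserve(max_samples);  // The only allocation, done outside the real-time section
  }

  // Intended for the real-time section: refuses the sample instead of growing the vector
  bool add(double const sample) noexcept {
    if (samples_.size() >= max_samples_) {
      return false;
    }
    samples_.push_back(sample);  // No reallocation as size() < capacity()
    return true;
  }

  void clear() noexcept { samples_.clear(); }  // Keeps the capacity, does not free memory
  std::size_t size() const noexcept { return samples_.size(); }

 private:
  std::size_t max_samples_;
  std::vector<double> samples_ {};
};
```

The design choice here is to fail explicitly when the buffer is full: a `push_back` beyond the reserved capacity would trigger a reallocation and copy inside the real-time section.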
  • Generally real-time code needs to communicate with other non real-time code. Do not use standard mutexes (e.g. std::mutex) when communicating between threads with different priorities as this is known to potentially result in priority inversion: a low-priority task that holds the lock might be preempted by another task with medium priority and therefore block the high-priority task that is waiting for the low-priority task to release the lock.
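Where a lock cannot be avoided, one common mitigation (not taken from the guide itself) is a POSIX mutex with the priority-inheritance protocol: while a high-priority thread waits for the mutex, the thread holding it temporarily inherits the higher priority. The following is a minimal sketch; the helper name `init_priority_inheritance_mutex` is hypothetical, and the approach assumes a platform that supports `PTHREAD_PRIO_INHERIT`.

```cpp
#include <pthread.h>

// Hypothetical helper (the name is an assumption): initializes `mutex` with the
// priority-inheritance protocol. Returns 0 on success or an error number.
int init_priority_inheritance_mutex(::pthread_mutex_t& mutex) {
  ::pthread_mutexattr_t attributes;
  int result {::pthread_mutexattr_init(&attributes)};
  if (result != 0) {
    return result;
  }
  result = ::pthread_mutexattr_setprotocol(&attributes, PTHREAD_PRIO_INHERIT);
  if (result == 0) {
    result = ::pthread_mutex_init(&mutex, &attributes);
  }
  ::pthread_mutexattr_destroy(&attributes);  // The attributes are no longer needed after init
  return result;
}
```

The mutex is then used with `::pthread_mutex_lock` and `::pthread_mutex_unlock` as usual and released with `::pthread_mutex_destroy`. Critical sections should still be kept as short as possible, since the high-priority thread has to wait for them to finish.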

  • Take special care when logging from real-time processes. Traditional logging tools generally involve mutexes and dynamic memory allocation.

    • Do not log from real-time sections if it can be avoided
    • Use dedicated real-time logging tools. These use asynchronous logging that passes the format string pointer and the format arguments from the real-time thread to a non real-time thread in a lockless way. Here are a few libraries that might be helpful for this:
      • Quill: An asynchronous low-latency logger for C++
      • PAL statistics: A real-time logging framework for ROS
  • Similarly, writing to files is several orders of magnitude slower than RAM access. Do not write to files from real-time code.

    • Use a dedicated asynchronous logger framework for it as discussed above.
    • An acceptable solution might also be a RAM disk where a part of memory is formatted with a file system.
  • Make sure all of your external library calls respect the above criteria as well.

    • Read their documentation and review their source code to make sure that their latencies are bounded and that they do not dynamically allocate memory, do not use normal mutexes, do not rely on non-O(1) algorithms and do not perform I/O or logging during the calls.
    • You will likely have to refactor external code to make sure that it is usable inside real-time capable code.
  • Take care when using timing libraries: Linux has multiple clocks. While CLOCK_REALTIME might sound like the right choice for a real-time system, it is not, as it can jump forwards and backwards due to time synchronization (e.g. NTP). You will want to use CLOCK_MONOTONIC or CLOCK_BOOTTIME.

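    • Take care when relying on external libraries to time events and measure durations, e.g. std::chrono, where std::chrono::steady_clock is monotonic while std::chrono::system_clock follows the wall-clock time.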
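A typical use of CLOCK_MONOTONIC is a periodic loop that sleeps until an absolute deadline with `clock_nanosleep` and `TIMER_ABSTIME`, so that the execution time of the work does not accumulate as drift. The following is a minimal sketch; the helper names `advance` and `run_periodic` are hypothetical, and the period is assumed to be non-negative.

```cpp
#include <cstdint>
#include <time.h>

constexpr std::int64_t nanoseconds_per_second {1000000000};

// Hypothetical helper: advances `time` by a non-negative number of nanoseconds
// and keeps tv_nsec inside the range [0, 1e9)
void advance(struct ::timespec& time, std::int64_t const nanoseconds) {
  std::int64_t const total {static_cast<std::int64_t>(time.tv_nsec) + nanoseconds};
  time.tv_sec += total / nanoseconds_per_second;
  time.tv_nsec = total % nanoseconds_per_second;
}

// Hypothetical helper: calls `work` `iterations` times with a fixed period.
// Returns 0 on success, -1 if the clock could not be read or the error number
// reported by clock_nanosleep (e.g. EINTR if interrupted by a signal).
int run_periodic(int const iterations, std::int64_t const period_ns, void (*work)()) {
  struct ::timespec next_wakeup {};
  if (::clock_gettime(CLOCK_MONOTONIC, &next_wakeup) != 0) {
    return -1;
  }
  for (int i {0}; i < iterations; ++i) {
    work();
    advance(next_wakeup, period_ns);
    // Absolute deadline on the monotonic clock: not affected by jumps of the wall-clock time
    int const result {::clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next_wakeup, nullptr)};
    if (result != 0) {
      return result;
    }
  }
  return 0;
}
```

For example `run_periodic(1000, 1000000, &control_step)` would call a (hypothetical) function `control_step` at 1 kHz for one second.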
  • Benchmark the performance of your code and use a tracing library to track your real-time performance. You can always test with simulated load.
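For a first impression, before reaching for a full tracing tool, the worst-case execution time of a piece of code can be recorded with a monotonic clock. The following is only a rough sketch: the names `LatencyStats` and `measure` are hypothetical, and such a measurement in user space only gives a lower bound for the true worst case.

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical result type (illustrative name)
struct LatencyStats {
  std::chrono::nanoseconds worst {0};  // Longest single execution
  std::chrono::nanoseconds total {0};  // Sum over all executions
  int samples {0};
};

// Hypothetical helper: runs `function` `iterations` times and records its execution times
template <typename Function>
LatencyStats measure(Function&& function, int const iterations) {
  LatencyStats stats {};
  for (int i {0}; i < iterations; ++i) {
    auto const start = std::chrono::steady_clock::now();  // Monotonic clock
    function();
    auto const elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - start);
    stats.worst = std::max(stats.worst, elapsed);
    stats.total += elapsed;
    ++stats.samples;
  }
  return stats;
}
```

What matters for real-time code is `worst` rather than the average `total / samples`, ideally recorded over a long run while the system is under simulated load.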

  • For network applications that need to communicate over a high-speed NIC, look into kernel bypass instead of relying on POSIX sockets (see e.g. here and here).

A good resource for real-time programming is the book "Building Low Latency Applications with C++" as well as this CppCon 2021 talk. You might also want to have a look at the following two-part guide for audio developers (1 and 2).