ros2_tracing and Performance Analysis
Source: ros2_tracing and the ROS 2 Jazzy tracing tutorial
Why This Matters
Logs tell you what software decided to say. Traces tell you what actually executed, when it executed, and how long it took. That difference matters when a robot feels slow, callbacks bunch up under load, executor behavior becomes suspicious, or a distributed pipeline drops performance without a clean crash.
Distilled Takeaways
- `ros2_tracing` gives ROS 2 core instrumentation on Linux using LTTng, so you can inspect callback execution, message flow, and runtime structure with much less guesswork.
- For Jazzy-era Linux systems, tracing is usually available without rebuilding ROS 2, but kernel events still require the kernel tracer and correct tracing-group permissions.
- The `Trace` launch action is often the safest way to capture useful data because tracing needs to start before the application if you want initialization metadata and early runtime context.
- Snapshot mode and dual-session mode are especially useful for intermittent failures because they preserve recent history without constantly writing full traces to disk.
- Trace data becomes valuable when paired with analysis, not just collection. `babeltrace` is good for quick inspection; `tracetools_analysis` is good for callback-duration and message-flow work.
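The permission requirement in the second takeaway is easy to check before a field session. The following is a minimal preflight sketch, not part of `ros2_tracing`: the function name `tracing_preflight` is hypothetical, and it assumes the LTTng command-line client is called `lttng` and that the tracing group uses LTTng's conventional name, `tracing`. It only inspects the current process, so it does not prove that kernel tracing will work.

```python
import grp
import os
import shutil


def tracing_preflight(group_name="tracing"):
    """Report whether this process looks ready to record LTTng traces.

    `group_name` is an assumption: "tracing" is the conventional LTTng group.
    """
    # The lttng command-line client must be on PATH to create sessions.
    lttng_cli_found = shutil.which("lttng") is not None

    # Check the groups of the *current process*: a group added with usermod
    # only takes effect after logging in again.
    try:
        gid = grp.getgrnam(group_name).gr_gid
        in_group = gid == os.getegid() or gid in os.getgroups()
    except KeyError:
        # The group does not exist on this machine.
        in_group = False

    return {
        "lttng_cli_found": lttng_cli_found,
        "in_tracing_group": in_group,
        "is_root": os.geteuid() == 0,
    }


if __name__ == "__main__":
    for check, ok in tracing_preflight().items():
        print(f"{check}: {'yes' if ok else 'no'}")
```

If `in_tracing_group` is reported as false right after adding the user to the group, logging out and back in is the usual fix, because group membership is fixed at login.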
Practical Workflow
- Start from a symptom: delayed control, frame drops, executor stalls, or autonomy that degrades only under realistic load.
- Capture a trace before launch with `ros2 trace` or a launch-file `Trace` action.
- Use snapshot mode when you need a flight recorder and dual-session mode when startup behavior matters but continuous disk writes do not.
- Inspect the raw trace quickly with `babeltrace` to confirm that the session contains the events you expected.
- Use `tracetools_analysis` notebooks or APIs to plot callback duration and inspect message flow through the pipeline.
- Only then change executor structure, callback grouping, QoS, or algorithm design.
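The inspection and analysis steps of this workflow reduce to two questions: did the session contain the events you expected, and how long did each callback run? The sketch below illustrates that logic in plain Python. It is not the `tracetools_analysis` API. It assumes the trace has already been converted into `(timestamp_ns, event_name, fields)` records, and the sample records are hand-written for illustration. The event names `ros2:callback_start`, `ros2:callback_end`, and `ros2:rclcpp_publish` are `ros2_tracing` tracepoints. The helper names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-written records: (timestamp_ns, event_name, fields).
EVENTS = [
    (1_000_000, "ros2:callback_start", {"callback": 0xA1}),
    (1_400_000, "ros2:callback_end", {"callback": 0xA1}),
    (2_000_000, "ros2:callback_start", {"callback": 0xB2}),
    (2_050_000, "ros2:rclcpp_publish", {"message": 0xC3}),
    (2_900_000, "ros2:callback_end", {"callback": 0xB2}),
    (3_000_000, "ros2:callback_start", {"callback": 0xA1}),
    (5_500_000, "ros2:callback_end", {"callback": 0xA1}),
]


def missing_events(events, expected):
    """Return the expected event names that never appear in the trace."""
    seen = Counter(name for _, name, _ in events)
    return sorted(name for name in expected if seen[name] == 0)


def callback_durations_ms(events):
    """Pair callback_start/callback_end per callback handle.

    Returns {callback_handle: [duration_ms, ...]}. Simplification: a real
    analysis would also key on the process ID, because handles are only
    unique within one process.
    """
    open_starts = {}
    durations = defaultdict(list)
    for timestamp_ns, name, fields in sorted(events, key=lambda e: e[0]):
        if name == "ros2:callback_start":
            open_starts[fields["callback"]] = timestamp_ns
        elif name == "ros2:callback_end":
            start_ns = open_starts.pop(fields["callback"], None)
            if start_ns is not None:  # skip an end whose start was not captured
                durations[fields["callback"]].append((timestamp_ns - start_ns) / 1e6)
    return dict(durations)


if __name__ == "__main__":
    print(missing_events(EVENTS, ["ros2:callback_start", "ros2:callback_end"]))
    for handle, values in callback_durations_ms(EVENTS).items():
        print(hex(handle), values)
```

Checking for missing events first is deliberate: an empty duration table is ambiguous until you know whether the callbacks never ran or the tracepoints were never enabled.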
Operational Guidance
- Reach for tracing when logs and CPU graphs tell you something is wrong but not where the latency is accumulating.
- Pair traces with rosbag, diagnostics, and behavior-tree logs when investigating autonomy failures that span sensing, execution, and decision layers.
- Treat callback outliers as system clues, not only algorithm clues. Executor contention, blocking I/O, and queueing structure often matter as much as pure compute time.
- Keep tracing in reserve for field failures by deciding ahead of time where trace data should be stored and when snapshot capture should be triggered.
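Deciding ahead of time where data goes and what triggers a capture can be made concrete with a small flight-recorder sketch. This is a plain-Python analogy and not how LTTng implements snapshot mode. The class `FlightRecorder`, its parameters, and the trigger rule (a duration more than `outlier_factor` times the recent median for the same callback) are all assumptions chosen for illustration. In a real deployment the dump step would more plausibly ask the tracer to record a snapshot of the running session.

```python
import json
import statistics
import tempfile
from collections import deque
from pathlib import Path


class FlightRecorder:
    """Keep recent callback durations in memory; write them out on a trigger."""

    def __init__(self, snapshot_dir, capacity=1000, outlier_factor=5.0, min_samples=20):
        self.snapshot_dir = Path(snapshot_dir)  # storage location decided up front
        self.recent = deque(maxlen=capacity)    # oldest samples fall off automatically
        self.outlier_factor = outlier_factor
        self.min_samples = min_samples
        self.snapshots_written = 0

    def record(self, callback_name, duration_ms):
        """Store one sample; return the snapshot path if it triggered a dump."""
        # Baseline is the history for this callback *before* the new sample.
        baseline = [d for name, d in self.recent if name == callback_name]
        self.recent.append((callback_name, duration_ms))
        if len(baseline) < self.min_samples:
            return None  # not enough history to judge an outlier yet
        if duration_ms > self.outlier_factor * statistics.median(baseline):
            return self.snapshot()
        return None

    def snapshot(self):
        """Write the in-memory history to the preselected directory."""
        self.snapshot_dir.mkdir(parents=True, exist_ok=True)
        self.snapshots_written += 1
        path = self.snapshot_dir / f"snapshot_{self.snapshots_written:04d}.json"
        path.write_text(json.dumps(list(self.recent)))
        return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        recorder = FlightRecorder(tmp, capacity=50)
        for _ in range(30):
            recorder.record("control_loop", 2.0)
        # A 40 ms sample against a 2 ms median exceeds the 5x rule.
        print(recorder.record("control_loop", 40.0))
```

The median is used as the baseline because a single slow callback barely moves it, so one outlier does not raise the bar for detecting the next one.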
Corroborating References
- ROS 2 Jazzy tracing tutorial
- ros2_tracing repository
- tracetools_analysis repository
- ros2_tracing paper
When to Read the Original Source
Go to the original sources when you want the exact CLI and launch-file tracing controls, snapshot and dual-session semantics, kernel-tracing requirements, or the sample analysis notebooks for callback durations and message-flow studies.