32: Capture backtraces for mutex dependencies r=bertptrs a=bertptrs

Builds on top of #28.

This PR adds backtrace data to the dependency graph, so you can figure out what series of events might have introduced the cycle in dependencies. Only the first backtrace

These changes do have a performance penalty, with a worst case of 20-50% degradation over previous results. This applies to the worst case scenario where every dependency between mutexes is new and thus is unlikely to be as severe.

Below is an example of what this can look like, generated with `examples/mutex_cycle.rs`. The formatting is decidedly suboptimal but backtraces cannot be formatted very well in stable Rust at the moment. The exact performance hit depends on a lot of things, such as the level of backtraces captured (off, 1, or full), and how many dependencies are involved.

```
thread 'main' panicked at 'Found cycle in mutex dependency graph:
   0: tracing_mutex::MutexDep::capture
             at ./src/lib.rs:278:23
   1: core::ops::function::FnOnce::call_once
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:250:5
   2: tracing_mutex::graph::DiGraph<V,E>::add_edge
             at ./src/graph.rs:131:50
   3: tracing_mutex::MutexId::mark_held::{{closure}}
             at ./src/lib.rs:146:17
   4: std::thread::local::LocalKey<T>::try_with
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/local.rs:270:16
   5: std::thread::local::LocalKey<T>::with
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/local.rs:246:9
   6: tracing_mutex::MutexId::mark_held
             at ./src/lib.rs:142:25
   7: tracing_mutex::MutexId::get_borrowed
             at ./src/lib.rs:129:9
   8: tracing_mutex::stdsync::tracing::Mutex<T>::lock
             at ./src/stdsync.rs:110:25
   9: mutex_cycle::main
             at ./examples/mutex_cycle.rs:20:18
  10: core::ops::function::FnOnce::call_once
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:250:5
  11: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/sys_common/backtrace.rs:135:18
  12: std::rt::lang_start::{{closure}}
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:166:18
  13: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:284:13
  14: std::panicking::try::do_call
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:500:40
  15: std::panicking::try
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:464:19
  16: std::panic::catch_unwind
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panic.rs:142:14
  17: std::rt::lang_start_internal::{{closure}}
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:148:48
  18: std::panicking::try::do_call
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:500:40
  19: std::panicking::try
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:464:19
  20: std::panic::catch_unwind
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panic.rs:142:14
  21: std::rt::lang_start_internal
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:148:20
  22: std::rt::lang_start
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:165:17
  23: main
  24: <unknown>
  25: __libc_start_main
  26: _start

   0: tracing_mutex::MutexDep::capture
             at ./src/lib.rs:278:23
   1: core::ops::function::FnOnce::call_once
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:250:5
   2: tracing_mutex::graph::DiGraph<V,E>::add_edge
             at ./src/graph.rs:131:50
   3: tracing_mutex::MutexId::mark_held::{{closure}}
             at ./src/lib.rs:146:17
   4: std::thread::local::LocalKey<T>::try_with
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/local.rs:270:16
   5: std::thread::local::LocalKey<T>::with
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/local.rs:246:9
   6: tracing_mutex::MutexId::mark_held
             at ./src/lib.rs:142:25
   7: tracing_mutex::MutexId::get_borrowed
             at ./src/lib.rs:129:9
   8: tracing_mutex::stdsync::tracing::Mutex<T>::lock
             at ./src/stdsync.rs:110:25
   9: mutex_cycle::main
             at ./examples/mutex_cycle.rs:14:18
  10: core::ops::function::FnOnce::call_once
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:250:5
  11: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/sys_common/backtrace.rs:135:18
  12: std::rt::lang_start::{{closure}}
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:166:18
  13: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:284:13
  14: std::panicking::try::do_call
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:500:40
  15: std::panicking::try
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:464:19
  16: std::panic::catch_unwind
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panic.rs:142:14
  17: std::rt::lang_start_internal::{{closure}}
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:148:48
  18: std::panicking::try::do_call
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:500:40
  19: std::panicking::try
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:464:19
  20: std::panic::catch_unwind
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panic.rs:142:14
  21: std::rt::lang_start_internal
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:148:20
  22: std::rt::lang_start
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:165:17
  23: main
  24: <unknown>
  25: __libc_start_main
  26: _start
', src/lib.rs:163:13
stack backtrace:
   0: rust_begin_unwind
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/panicking.rs:67:14
   2: tracing_mutex::MutexId::mark_held
             at ./src/lib.rs:163:13
   3: tracing_mutex::MutexId::get_borrowed
             at ./src/lib.rs:129:9
   4: tracing_mutex::stdsync::tracing::Mutex<T>::lock
             at ./src/stdsync.rs:110:25
   5: mutex_cycle::main
             at ./examples/mutex_cycle.rs:25:14
   6: core::ops::function::FnOnce::call_once
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:250:5
```

Importantly, the error shows all the dependencies that are already part of the graph, not the one that was just added, since that is already visible from the immediate panic.

Co-authored-by: Bert Peters <bert@bertptrs.nl>
Tracing Mutex
Avoid deadlocks in your mutexes by acquiring them in a consistent order, or else.
Background
In any code that uses mutexes or locks, you quickly run into the possibility of deadlock. With just
two mutexes Foo and Bar you can already deadlock, assuming one thread first locks Foo then
attempts to get Bar and another first gets Bar then tries to get Foo. Now both threads are
waiting for each other to release the lock they already have.
One simple way to get around this is by ensuring that, whenever you need both Foo and Bar, you
always acquire Foo first. Then you can never deadlock. Of course, with just two mutexes, this is
easy to keep track of, but once your code starts to grow you might lose track of all these
dependencies. That's where this crate comes in.
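The fixed-order rule can be illustrated with plain `std::sync` mutexes. This is a minimal sketch, not code from this crate: the helper names `add_to_both` and `run_workers` are made up for the example, and Foo and Bar are represented by two counters.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Adds `amount` to both counters, always locking `foo` before `bar`.
fn add_to_both(foo: &Mutex<u32>, bar: &Mutex<u32>, amount: u32) {
    let mut foo_guard = foo.lock().unwrap();
    let mut bar_guard = bar.lock().unwrap();
    *foo_guard += amount;
    *bar_guard += amount;
}

/// Spawns `threads` workers that all follow the same lock order
/// and returns the final value of both counters.
fn run_workers(threads: u32) -> (u32, u32) {
    let foo = Arc::new(Mutex::new(0u32));
    let bar = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let (foo, bar) = (Arc::clone(&foo), Arc::clone(&bar));
            thread::spawn(move || add_to_both(&foo, &bar, 1))
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Bind the result to a local so the lock guards are released
    // before `foo` and `bar` go out of scope.
    let totals = (*foo.lock().unwrap(), *bar.lock().unwrap());
    totals
}

fn main() {
    let (foo, bar) = run_workers(8);
    println!("foo = {foo}, bar = {bar}");
}
```

Because every worker takes Foo before Bar, no thread can hold Bar while waiting for Foo, so the program always terminates. Swapping the two `lock()` calls in only some call sites is exactly the kind of mistake that becomes hard to spot as a code base grows.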
This crate tracks the order in which you acquire locks in your code, builds a dependency graph from it, and panics if a new dependency would create a cycle. It provides replacements for existing synchronization primitives with an identical API, so they should work as drop-in replacements.
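To make the idea of tracking acquisition order concrete, here is a toy model of such a dependency structure. It is a sketch under simplifying assumptions and not this crate's actual implementation: locks are plain `u32` ids, `LockOrderGraph` and its methods are hypothetical names, and a conflict is reported as an `Err` rather than a panic.

```rust
use std::collections::{HashMap, HashSet};

/// Toy lock-order graph: an edge `a -> b` means
/// "`b` was acquired while `a` was already held".
#[derive(Default)]
struct LockOrderGraph {
    edges: HashMap<u32, HashSet<u32>>,
}

impl LockOrderGraph {
    /// Returns true if `to` can be reached from `from` via recorded edges.
    fn reaches(&self, from: u32, to: u32) -> bool {
        let mut stack = vec![from];
        let mut seen = HashSet::new();
        while let Some(node) = stack.pop() {
            if node == to {
                return true;
            }
            if seen.insert(node) {
                if let Some(next) = self.edges.get(&node) {
                    stack.extend(next.iter().copied());
                }
            }
        }
        false
    }

    /// Records that `acquired` was locked while `held` was held.
    /// Returns `Err` and leaves the graph unchanged if the new edge
    /// would close a cycle.
    fn add_dependency(&mut self, held: u32, acquired: u32) -> Result<(), String> {
        if self.reaches(acquired, held) {
            return Err(format!(
                "lock {acquired} is already ordered before lock {held}"
            ));
        }
        self.edges.entry(held).or_default().insert(acquired);
        Ok(())
    }
}

fn main() {
    let (foo, bar) = (1, 2);
    let mut graph = LockOrderGraph::default();

    // One code path locks Foo, then Bar.
    graph.add_dependency(foo, bar).unwrap();

    // Another code path locks Bar, then Foo: the reverse order closes a cycle.
    let err = graph.add_dependency(bar, foo).unwrap_err();
    println!("rejected: {err}");
}
```

The check runs when a dependency is first recorded, so the conflicting order is reported even if the two code paths never actually run at the same time.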
Inspired by this blogpost, which references a similar behaviour implemented by Abseil for their mutexes. This article goes into more depth on the exact implementation.
Usage
Add this dependency to your Cargo.toml file like any other:
[dependencies]
tracing-mutex = "0.2"
Then use the locks provided by this library instead of the ones you would use otherwise.
Replacements for the synchronization primitives in std::sync can be found in the stdsync module.
Support for other synchronization primitives is planned.
use tracing_mutex::stdsync::Mutex;

fn main() {
    // Construct and lock it just as you would a std::sync::Mutex.
    let some_mutex = Mutex::new(42);
    *some_mutex.lock().unwrap() += 1;
    println!("{:?}", some_mutex);
}
The interdependencies between locks are automatically tracked. If any locking operation would introduce a cyclic dependency between your locks, the operation panics instead. This allows you to immediately notice the cyclic dependency rather than be eventually surprised by it in production.
Mutex tracing is efficient, but it is not completely overhead-free. If you cannot spare the
performance penalty in your production environment, this library also offers debug-only tracing.
DebugMutex, also found in the stdsync module, is a type alias that evaluates to TracingMutex
when debug assertions are enabled, and to Mutex when they are not. Similar helper types are
available for other synchronization primitives.
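A debug-only alias of this kind can be built with `cfg(debug_assertions)`. The following is only a sketch of the general pattern, not this crate's source: `TracedMutex` is a hypothetical stand-in that forwards to `std::sync::Mutex` without doing any tracking, and `bump` is a helper made up for the example.

```rust
use std::sync::{LockResult, Mutex, MutexGuard};

/// Hypothetical stand-in for a tracing mutex. Real dependency
/// tracking is omitted; it only mirrors the `std::sync::Mutex` API.
pub struct TracedMutex<T>(Mutex<T>);

impl<T> TracedMutex<T> {
    pub fn new(value: T) -> Self {
        TracedMutex(Mutex::new(value))
    }

    pub fn lock(&self) -> LockResult<MutexGuard<'_, T>> {
        // A real implementation would record the acquisition order here.
        self.0.lock()
    }
}

// With debug assertions on, the alias selects the traced wrapper.
#[cfg(debug_assertions)]
pub type DebugMutex<T> = TracedMutex<T>;

// Without them, it is the plain standard library mutex.
#[cfg(not(debug_assertions))]
pub type DebugMutex<T> = Mutex<T>;

/// Increments the counter and returns the new value. The body compiles
/// unchanged against either alias target because the APIs match.
pub fn bump(counter: &DebugMutex<u32>) -> u32 {
    let mut guard = counter.lock().unwrap();
    *guard += 1;
    *guard
}

fn main() {
    let counter = DebugMutex::new(41);
    println!("counter = {}", bump(&counter));
    println!("debug assertions enabled: {}", cfg!(debug_assertions));
}
```

Since the alias is resolved at compile time, calling code does not change between debug and release builds, and a release build pays nothing for the wrapper.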
The minimum supported Rust version is 1.70. Increasing this is not considered a breaking change, but will be avoided within semver-compatible releases if possible.
Features
- Dependency-tracking wrappers for all locking primitives
- Optional opt-out for release mode code
- Support for primitives from:
  - std::sync
  - parking_lot
  - Any library that implements the lock_api traits
Future improvements
- Improve performance in lock tracing
- Optional logging to make debugging easier
- Better and configurable error handling when detecting cyclic dependencies
- Support for other locking libraries
- Support for async locking libraries
- Support for Send mutex guards
Note: parking_lot has already begun work on its own deadlock detection mechanism, which works
in a different way. Both can be complementary.
License
Licensed under either of
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contributing
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.