terminal/src/inc/til/ticket_lock.h
Leonard Hecker f68324cd09
Fix output stuttering using a ticket lock (#10653)
`SRWLOCK`, as used by `std::shared_mutex`, is an inherently unfair mutex
and makes no guarantee whatsoever as to whether a thread will acquire
the lock in a timely manner. This is problematic for our renderer, which
relies on being able to acquire the lock in a timely and predictable
manner. Drawing stalls of up to one minute have been observed in tests.

This issue can be solved with a primitive ticket lock, which is 10x
slower than a `SRWLOCK` but still sufficiently fast for our use case
(10M lock acquisitions per second per thread). Any non-trivial lock hold
time will likely diminish that difference to a negligible level.
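
A minimal usage sketch (the shared state and thread functions below are
illustrative stand-ins, not the renderer's actual call sites), showing the
intended pattern of wrapping the lock in `std::scoped_lock`:

```cpp
#include "til/ticket_lock.h"

#include <mutex>
#include <string>

// Hypothetical shared state guarded by the lock; illustrative only.
struct SharedState
{
    til::ticket_lock lock;
    std::wstring buffer;
};

void writerThread(SharedState& state)
{
    // std::scoped_lock guarantees a balanced lock()/unlock() pair,
    // which ticket_lock requires in order to stay functional.
    const std::scoped_lock guard{ state.lock };
    state.buffer.append(L"hello");
}

void renderThread(SharedState& state)
{
    // Tickets are served in FIFO order, so this thread acquires the
    // lock after at most the writers that arrived before it, bounding
    // the wait before the next frame can be drawn.
    const std::scoped_lock guard{ state.lock };
    // ... read state.buffer and draw ...
}
```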

## Validation Steps Performed

* It still blends ✔️
2021-07-14 23:41:22 +00:00

// Copyright (c) Microsoft Corporation.
// Licensed under the MIT license.

#pragma once

#include "atomic.h"

namespace til
{
    // ticket_lock implements a classic fair lock.
    //
    // Compared to a SRWLOCK this implementation is significantly more unsafe to use:
    // Forgetting to call unlock, or calling unlock more than once, will lead to deadlocks,
    // as _now_serving will remain out of sync with _next_ticket and prevent any further locking.
    //
    // I recommend using the following with this class:
    // * A low number of concurrent accesses (this lock doesn't scale well beyond 2 threads)
    // * alignas(std::hardware_destructive_interference_size) to prevent false sharing
    // * std::unique_lock or std::scoped_lock to prevent unbalanced lock/unlock calls
    struct ticket_lock
    {
        void lock() noexcept
        {
            // Draw the next ticket. Relaxed ordering suffices: we only need a
            // unique number; the acquire load below provides synchronization.
            const auto ticket = _next_ticket.fetch_add(1, std::memory_order_relaxed);

            for (;;)
            {
                const auto current = _now_serving.load(std::memory_order_acquire);
                if (current == ticket)
                {
                    break;
                }

                // Block until _now_serving changes away from `current`.
                til::atomic_wait(_now_serving, current);
            }
        }
        void unlock() noexcept
        {
            // The release store pairs with the acquire load in lock(),
            // publishing all writes made inside the critical section.
            _now_serving.fetch_add(1, std::memory_order_release);
            til::atomic_notify_all(_now_serving);
        }

    private:
        // You may be inclined to add alignas(std::hardware_destructive_interference_size)
        // here to force the two atomics onto separate cache lines, but I suggest carefully
        // benchmarking such a change. Since this ticket_lock is primarily used to synchronize
        // exactly 2 threads, it actually helps us that these atomics are on the same cache line,
        // as any change by one thread is flushed to the other, which will then read it anyway.
        //
        // Integer overflow doesn't break the algorithm, as these two
        // atomics are treated more like "IDs" and less like counters.
        std::atomic<uint32_t> _next_ticket{ 0 };
        std::atomic<uint32_t> _now_serving{ 0 };
    };
}
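
For reference outside this codebase: `til::atomic_wait` and
`til::atomic_notify_all` from `atomic.h` are presumably thin wrappers over
the Win32 WaitOnAddress/WakeByAddressAll facilities. A self-contained,
portable analogue of the same algorithm can be sketched with C++20's
`std::atomic` wait/notify (an approximation for experimentation, not the
file above):

```cpp
#include <atomic>
#include <cstdint>

// Portable sketch of the same ticket lock using C++20 atomic wait/notify.
struct ticket_lock_cpp20
{
    void lock() noexcept
    {
        const auto ticket = _next_ticket.fetch_add(1, std::memory_order_relaxed);
        for (;;)
        {
            const auto current = _now_serving.load(std::memory_order_acquire);
            if (current == ticket)
            {
                break;
            }
            // Blocks until _now_serving changes away from `current`.
            _now_serving.wait(current, std::memory_order_acquire);
        }
    }

    void unlock() noexcept
    {
        _now_serving.fetch_add(1, std::memory_order_release);
        // Wake all waiters; each re-checks whether its ticket is up.
        _now_serving.notify_all();
    }

private:
    std::atomic<uint32_t> _next_ticket{ 0 };
    std::atomic<uint32_t> _now_serving{ 0 };
};
```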