diff options
author | Carlos Llamas <cmllamas@google.com> | 2023-12-01 20:21:57 +0300 |
---|---|---|
committer | Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 2023-12-05 03:23:41 +0300 |
commit | 7710e2cca32e7f3958480e8bd44f50e29d0c2509 (patch) | |
tree | dd750a11cdc5c1fd884f653c7c7437561c65a16a /drivers/android/binder_alloc.h | |
parent | e50f4e6cc9bfaca655d3b6a3506d27cf2caa1d40 (diff) | |
download | linux-7710e2cca32e7f3958480e8bd44f50e29d0c2509.tar.xz |
binder: switch alloc->mutex to spinlock_t
The alloc->mutex is a highly contended lock that causes performance
issues on Android devices. When a low-priority task is given this lock
and it sleeps, it becomes difficult for the task to wake up and complete
its work. This delays other tasks that are also waiting on the mutex.
The problem gets worse when there is memory pressure in the system,
because this increases the contention on the alloc->mutex while the
shrinker reclaims binder pages.
Switching to a spinlock helps to keep the waiters running and avoids the
overhead of waking up tasks. This significantly improves the transaction
latency when the problematic scenario occurs.
The performance impact of this patchset was measured by stress-testing
the binder alloc contention. In this test, several clients of different
priorities send thousands of transactions of different sizes to a single
server. In parallel, pages get reclaimed using the shinker's debugfs.
The test was run on a Pixel 8, Pixel 6 and qemu machine. The results
were similar on all three devices:
after:
| sched | prio | average | max | min |
|--------+------+---------+-----------+---------|
| fifo | 99 | 0.135ms | 1.197ms | 0.022ms |
| fifo | 01 | 0.136ms | 5.232ms | 0.018ms |
| other | -20 | 0.180ms | 7.403ms | 0.019ms |
| other | 19 | 0.241ms | 58.094ms | 0.018ms |
before:
| sched | prio | average | max | min |
|--------+------+---------+-----------+---------|
| fifo | 99 | 0.350ms | 248.730ms | 0.020ms |
| fifo | 01 | 0.357ms | 248.817ms | 0.024ms |
| other | -20 | 0.399ms | 249.906ms | 0.020ms |
| other | 19 | 0.477ms | 297.756ms | 0.022ms |
The key metrics above are the average and max latencies (wall time).
These improvements should roughly translate to p95-p99 latencies on real
workloads. The response time is up to 200x faster in these scenarios and
there is no penalty in the regular path.
Note that it is only possible to convert this lock after a series of
changes made by previous patches. These mainly include refactoring the
sections that might_sleep() and changing the locking order with the
mmap_lock amongst others.
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-29-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Diffstat (limited to 'drivers/android/binder_alloc.h')
-rw-r--r-- | drivers/android/binder_alloc.h | 10 |
1 files changed, 5 insertions, 5 deletions
diff --git a/drivers/android/binder_alloc.h b/drivers/android/binder_alloc.h index a5181916942e..70387234477e 100644 --- a/drivers/android/binder_alloc.h +++ b/drivers/android/binder_alloc.h @@ -9,7 +9,7 @@ #include <linux/rbtree.h> #include <linux/list.h> #include <linux/mm.h> -#include <linux/rtmutex.h> +#include <linux/spinlock.h> #include <linux/vmalloc.h> #include <linux/slab.h> #include <linux/list_lru.h> @@ -72,7 +72,7 @@ struct binder_lru_page { /** * struct binder_alloc - per-binder proc state for binder allocator - * @mutex: protects binder_alloc fields + * @lock: protects binder_alloc fields * @vma: vm_area_struct passed to mmap_handler * (invariant after mmap) * @mm: copy of task->mm (invariant after open) @@ -96,7 +96,7 @@ struct binder_lru_page { * struct binder_buffer objects used to track the user buffers */ struct binder_alloc { - struct mutex mutex; + spinlock_t lock; struct vm_area_struct *vma; struct mm_struct *mm; unsigned long buffer; @@ -153,9 +153,9 @@ binder_alloc_get_free_async_space(struct binder_alloc *alloc) { size_t free_async_space; - mutex_lock(&alloc->mutex); + spin_lock(&alloc->lock); free_async_space = alloc->free_async_space; - mutex_unlock(&alloc->mutex); + spin_unlock(&alloc->lock); return free_async_space; } |