From 1dfc9d60a69ec148e1cb709256617d86e5f0e8f8 Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Date: Thu, 5 Mar 2026 22:45:40 +0100
Subject: workqueue: devres: Add device-managed allocate workqueue

Add a Resource-managed version of alloc_workqueue() to fix common
problem of drivers mixing devm() calls with destroy_workqueue.  Such
naive and discouraged driver approach leads to difficult to debug bugs
when the driver:

1. Allocates workqueue in standard way and destroys it in driver
   remove() callback,
2. Sets work struct with devm_work_autocancel(),
3. Registers interrupt handler with devm_request_threaded_irq().

Which leads to following unbind/removal path:

1. destroy_workqueue() via driver remove(),
   Any interrupt coming now would still execute the interrupt handler,
   which queues work on destroyed workqueue.
2. devm_irq_release(),
3. devm_work_drop() -> cancel_work_sync() on destroyed workqueue.

devm_alloc_workqueue() has two benefits:
1. Solves above problem of mix-and-match devres and non-devres code in
   driver,
2. Simplify any sane drivers which were correctly using
   alloc_workqueue() + devm_add_action_or_reset().

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 Documentation/driver-api/driver-model/devres.rst | 4 ++++
 1 file changed, 4 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/driver-api/driver-model/devres.rst b/Documentation/driver-api/driver-model/devres.rst
index 7d2b897d66fa..017fb155a5bc 100644
--- a/Documentation/driver-api/driver-model/devres.rst
+++ b/Documentation/driver-api/driver-model/devres.rst
@@ -464,3 +464,7 @@ SPI
 
 WATCHDOG
   devm_watchdog_register_device()
+
+WORKQUEUE
+  devm_alloc_workqueue()
+  devm_alloc_ordered_workqueue()
-- 
cgit v1.2.3


From 41e3ccca00b374b7f39cf68e818b59a921cd7069 Mon Sep 17 00:00:00 2001
From: Breno Leitao <leitao@debian.org>
Date: Wed, 1 Apr 2026 06:03:57 -0700
Subject: docs: workqueue: document WQ_AFFN_CACHE_SHARD affinity scope

Update kernel-parameters.txt and workqueue.rst to reflect the new
cache_shard affinity scope and the default change from cache to
cache_shard.

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 Documentation/admin-guide/kernel-parameters.txt |  3 ++-
 Documentation/core-api/workqueue.rst            | 14 ++++++++++----
 2 files changed, 12 insertions(+), 5 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index cb850e5290c2..2c182a3e002d 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -8519,7 +8519,8 @@ Kernel parameters
         workqueue.default_affinity_scope=
 			Select the default affinity scope to use for unbound
 			workqueues. Can be one of "cpu", "smt", "cache",
-			"numa" and "system". Default is "cache". For more
+			"cache_shard", "numa" and "system". Default is
+			"cache_shard". For more
 			information, see the Affinity Scopes section in
 			Documentation/core-api/workqueue.rst.
 
diff --git a/Documentation/core-api/workqueue.rst b/Documentation/core-api/workqueue.rst
index 165ca73e8351..411e1b28b8de 100644
--- a/Documentation/core-api/workqueue.rst
+++ b/Documentation/core-api/workqueue.rst
@@ -378,9 +378,9 @@ Affinity Scopes
 
 An unbound workqueue groups CPUs according to its affinity scope to improve
 cache locality. For example, if a workqueue is using the default affinity
-scope of "cache", it will group CPUs according to last level cache
-boundaries. A work item queued on the workqueue will be assigned to a worker
-on one of the CPUs which share the last level cache with the issuing CPU.
+scope of "cache_shard", it will group CPUs into sub-LLC shards. A work item
+queued on the workqueue will be assigned to a worker on one of the CPUs
+within the same shard as the issuing CPU.
 Once started, the worker may or may not be allowed to move outside the scope
 depending on the ``affinity_strict`` setting of the scope.
 
@@ -402,7 +402,13 @@ Workqueue currently supports the following affinity scopes.
 ``cache``
   CPUs are grouped according to cache boundaries. Which specific cache
   boundary is used is determined by the arch code. L3 is used in a lot of
-  cases. This is the default affinity scope.
+  cases.
+
+``cache_shard``
+  CPUs are grouped into sub-LLC shards of at most ``wq_cache_shard_size``
+  cores (default 8, tunable via the ``workqueue.cache_shard_size`` boot
+  parameter). Shards are always split on core (SMT group) boundaries.
+  This is the default affinity scope.
 
 ``numa``
   CPUs are grouped according to NUMA boundaries.
-- 
cgit v1.2.3