Diffstat (limited to 'Documentation')
-rw-r--r--  Documentation/RCU/trace.txt           2
-rw-r--r--  Documentation/circular-buffers.txt   45
-rw-r--r--  Documentation/kernel-parameters.txt   9
3 files changed, 32 insertions, 24 deletions
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index b8c3c813ea57..910870b15acd 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -447,7 +447,7 @@ The output of "cat rcu/rcuboost" looks as follows:
 	balk: nt=0 egt=6541 bt=0 nb=0 ny=126 nos=0
 
 This information is output only for rcu_preempt. Each two-line entry
-corresponds to a leaf rcu_node strcuture. The fields are as follows:
+corresponds to a leaf rcu_node structure. The fields are as follows:
 
 o	"n:m" is the CPU-number range for the corresponding two-line
 	entry. In the sample output above, the first entry covers
diff --git a/Documentation/circular-buffers.txt b/Documentation/circular-buffers.txt
index 8117e5bf6065..88951b179262 100644
--- a/Documentation/circular-buffers.txt
+++ b/Documentation/circular-buffers.txt
@@ -160,6 +160,7 @@ The producer will look something like this:
 	spin_lock(&producer_lock);
 
 	unsigned long head = buffer->head;
+	/* The spin_unlock() and next spin_lock() provide needed ordering. */
 	unsigned long tail = ACCESS_ONCE(buffer->tail);
 
 	if (CIRC_SPACE(head, tail, buffer->size) >= 1) {
@@ -168,9 +169,8 @@ The producer will look something like this:
 
 		produce_item(item);
 
-		smp_wmb(); /* commit the item before incrementing the head */
-
-		buffer->head = (head + 1) & (buffer->size - 1);
+		smp_store_release(buffer->head,
+				  (head + 1) & (buffer->size - 1));
 
 		/* wake_up() will make sure that the head is committed before
 		 * waking anyone up */
@@ -183,9 +183,14 @@ This will instruct the CPU that the contents of the new item must be written
 before the head index makes it available to the consumer and then instructs the
 CPU that the revised head index must be written before the consumer is woken.
 
-Note that wake_up() doesn't have to be the exact mechanism used, but whatever
-is used must guarantee a (write) memory barrier between the update of the head
-index and the change of state of the consumer, if a change of state occurs.
+Note that wake_up() does not guarantee any sort of barrier unless something
+is actually awakened. We therefore cannot rely on it for ordering. However,
+there is always one element of the array left empty. Therefore, the
+producer must produce two elements before it could possibly corrupt the
+element currently being read by the consumer. Therefore, the unlock-lock
+pair between consecutive invocations of the consumer provides the necessary
+ordering between the read of the index indicating that the consumer has
+vacated a given element and the write by the producer to that same element.
 
 
 THE CONSUMER
@@ -195,21 +200,20 @@ The consumer will look something like this:
 
 	spin_lock(&consumer_lock);
 
-	unsigned long head = ACCESS_ONCE(buffer->head);
+	/* Read index before reading contents at that index. */
+	unsigned long head = smp_load_acquire(buffer->head);
 	unsigned long tail = buffer->tail;
 
 	if (CIRC_CNT(head, tail, buffer->size) >= 1) {
-		/* read index before reading contents at that index */
-		smp_read_barrier_depends();
 
 		/* extract one item from the buffer */
 		struct item *item = buffer[tail];
 
 		consume_item(item);
 
-		smp_mb(); /* finish reading descriptor before incrementing tail */
-
-		buffer->tail = (tail + 1) & (buffer->size - 1);
+		/* Finish reading descriptor before incrementing tail. */
+		smp_store_release(buffer->tail,
+				  (tail + 1) & (buffer->size - 1));
 	}
 
 	spin_unlock(&consumer_lock);
@@ -218,12 +222,17 @@ This will instruct the CPU to make sure the index is up to date before reading
 the new item, and then it shall make sure the CPU has finished reading the item
 before it writes the new tail pointer, which will erase the item.
 
-
-Note the use of ACCESS_ONCE() in both algorithms to read the opposition index.
-This prevents the compiler from discarding and reloading its cached value -
-which some compilers will do across smp_read_barrier_depends(). This isn't
-strictly needed if you can be sure that the opposition index will _only_ be
-used the once.
+Note the use of ACCESS_ONCE() and smp_load_acquire() to read the
+opposition index. This prevents the compiler from discarding and
+reloading its cached value - which some compilers will do across
+smp_read_barrier_depends(). This isn't strictly needed if you can
+be sure that the opposition index will _only_ be used the once.
+The smp_load_acquire() additionally forces the CPU to order against
+subsequent memory references. Similarly, smp_store_release() is used
+in both algorithms to write the thread's index. This documents the
+fact that we are writing to something that can be read concurrently,
+prevents the compiler from tearing the store, and enforces ordering
+against previous accesses.
 
 
 ===============
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 50680a59a2ff..9063e74ba5c6 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2625,7 +2625,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			for RCU-preempt, and "s" for RCU-sched, and "N"
 			is the CPU number. This reduces OS jitter on the
 			offloaded CPUs, which can be useful for HPC and
-
 			real-time workloads. It can also improve energy
 			efficiency for asymmetric multiprocessors.
 
@@ -2641,8 +2640,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			periodically wake up to do the polling.
 
 	rcutree.blimit=	[KNL]
-			Set maximum number of finished RCU callbacks to process
-			in one batch.
+			Set maximum number of finished RCU callbacks to
+			process in one batch.
 
 	rcutree.rcu_fanout_leaf= [KNL]
 			Increase the number of CPUs assigned to each
@@ -2661,8 +2660,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			value is one, and maximum value is HZ.
 
 	rcutree.qhimark= [KNL]
-			Set threshold of queued
-			RCU callbacks over which batch limiting is disabled.
+			Set threshold of queued RCU callbacks beyond which
+			batch limiting is disabled.
 
 	rcutree.qlowmark= [KNL]
 			Set threshold of queued RCU callbacks below which
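
The acquire/release pairing that the circular-buffers.txt hunks introduce can be tried outside the kernel. What follows is a minimal userspace sketch, not kernel code: it assumes a single producer and a single consumer, and it substitutes C11 atomic_load_explicit() and atomic_store_explicit() for smp_load_acquire() and smp_store_release(). The names struct ring, RING_SIZE, ring_cnt(), ring_space(), ring_produce() and ring_consume() are hypothetical and appear nowhere in the patch. Because the sketch has no spinlocks, the producer reads the tail with an acquire load where the kernel text relies on the unlock-lock pair for that ordering. Also note that the kernel macros take a pointer as their first argument, so real kernel code would pass &buffer->head and &buffer->tail; the hunks are quoted here as committed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u	/* hypothetical size; must be a power of two */

static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "RING_SIZE must be 2^n");

struct ring {
	_Atomic unsigned long head;	/* written only by the producer */
	_Atomic unsigned long tail;	/* written only by the consumer */
	int slots[RING_SIZE];
};

static void ring_init(struct ring *r)
{
	atomic_init(&r->head, 0);
	atomic_init(&r->tail, 0);
}

/* Same arithmetic as the kernel's CIRC_CNT(): items currently queued. */
static unsigned long ring_cnt(unsigned long head, unsigned long tail)
{
	return (head - tail) & (RING_SIZE - 1);
}

/* Same arithmetic as CIRC_SPACE(): free slots, one always left empty. */
static unsigned long ring_space(unsigned long head, unsigned long tail)
{
	return ring_cnt(tail, head + 1);
}

static bool ring_produce(struct ring *r, int item)
{
	unsigned long head = atomic_load_explicit(&r->head, memory_order_relaxed);
	/* Acquire stands in for the lock-provided ordering in the kernel text. */
	unsigned long tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (ring_space(head, tail) < 1)
		return false;

	r->slots[head] = item;
	/* Release: the slot contents are visible before the new head is. */
	atomic_store_explicit(&r->head, (head + 1) & (RING_SIZE - 1),
			      memory_order_release);
	return true;
}

static bool ring_consume(struct ring *r, int *item)
{
	/* Acquire: read the index before reading contents at that index. */
	unsigned long head = atomic_load_explicit(&r->head, memory_order_acquire);
	unsigned long tail = atomic_load_explicit(&r->tail, memory_order_relaxed);

	if (ring_cnt(head, tail) < 1)
		return false;

	*item = r->slots[tail];
	/* Release: finish reading the slot before handing it back. */
	atomic_store_explicit(&r->tail, (tail + 1) & (RING_SIZE - 1),
			      memory_order_release);
	return true;
}
```

With RING_SIZE set to 8, the ring holds at most seven items, which matches the observation in the patch that one element of the array is always left empty.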