<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include, branch v3.0.1</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v3.0.1</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v3.0.1'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2011-08-05T04:58:40+00:00</updated>
<entry>
<title>NFS: Fix spurious readdir cookie loop messages</title>
<updated>2011-08-05T04:58:40+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>Trond.Myklebust@netapp.com</email>
</author>
<published>2011-07-30T16:45:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c14acb19a4b1482b6dd6e9d0874b2c8e32d6599d'/>
<id>urn:sha1:c14acb19a4b1482b6dd6e9d0874b2c8e32d6599d</id>
<content type='text'>
commit 0c0308066ca53fdf1423895f3a42838b67b3a5a8 upstream.

If the directory contents change, then we have to accept that the
file-&gt;f_pos value may shrink if we do a 'search-by-cookie'. In that
case, we should turn off the loop detection and let the NFS client
try to recover.

The patch also fixes a second loop detection bug by ensuring
that after turning on the ctx-&gt;duped flag, we read at least one new
cookie into ctx-&gt;dir_cookie before attempting to match with
ctx-&gt;dup_cookie.

Reported-by: Petr Vandrovec &lt;petr@vandrovec.name&gt;
Signed-off-by: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>mm/futex: fix futex writes on archs with SW tracking of dirty &amp; young</title>
<updated>2011-08-05T04:58:38+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2011-07-26T00:12:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b045b9a265fb46d8197b7d78aff1a8f6ab8e23df'/>
<id>urn:sha1:b045b9a265fb46d8197b7d78aff1a8f6ab8e23df</id>
<content type='text'>
commit 2efaca927f5cd7ecd0f1554b8f9b6a9a2c329c03 upstream.

I haven't reproduced it myself but the fail scenario is that on such
machines (notably ARM and some embedded powerpc), if you manage to hit
that futex path on a writable page whose dirty bit has gone from the PTE,
you'll livelock inside the kernel from what I can tell.

It will go in a loop of trying the atomic access, failing, trying gup to
"fix it up", getting succcess from gup, go back to the atomic access,
failing again because dirty wasn't fixed etc...

So I think you essentially hang in the kernel.

The scenario is probably rare'ish because affected architecture are
embedded and tend to not swap much (if at all) so we probably rarely hit
the case where dirty is missing or young is missing, but I think Shan has
a piece of SW that can reliably reproduce it using a shared writable
mapping &amp; fork or something like that.

On archs who use SW tracking of dirty &amp; young, a page without dirty is
effectively mapped read-only and a page without young unaccessible in the
PTE.

Additionally, some architectures might lazily flush the TLB when relaxing
write protection (by doing only a local flush), and expect a fault to
invalidate the stale entry if it's still present on another processor.

The futex code assumes that if the "in_atomic()" access -EFAULT's, it can
"fix it up" by causing get_user_pages() which would then be equivalent to
taking the fault.

However that isn't the case.  get_user_pages() will not call
handle_mm_fault() in the case where the PTE seems to have the right
permissions, regardless of the dirty and young state.  It will eventually
update those bits ...  in the struct page, but not in the PTE.

Additionally, it will not handle the lazy TLB flushing that can be
required by some architectures in the fault case.

Basically, gup is the wrong interface for the job.  The patch provides a
more appropriate one which boils down to just calling handle_mm_fault()
since what we are trying to do is simulate a real page fault.

The futex code currently attempts to write to user memory within a
pagefault disabled section, and if that fails, tries to fix it up using
get_user_pages().

This doesn't work on archs where the dirty and young bits are maintained
by software, since they will gate access permission in the TLB, and will
not be updated by gup().

In addition, there's an expectation on some archs that a spurious write
fault triggers a local TLB flush, and that is missing from the picture as
well.

I decided that adding those "features" to gup() would be too much for this
already too complex function, and instead added a new simpler
fixup_user_fault() which is essentially a wrapper around handle_mm_fault()
which the futex code can call.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix some nits Darren saw, fiddle comment layout]
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Reported-by: Shan Hai &lt;haishan.bai@gmail.com&gt;
Tested-by: Shan Hai &lt;haishan.bai@gmail.com&gt;
Cc: David Laight &lt;David.Laight@ACULAB.COM&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Darren Hart &lt;darren.hart@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>pnfs: let layoutcommit handle a list of lseg</title>
<updated>2011-08-05T04:58:37+00:00</updated>
<author>
<name>Peng Tao</name>
<email>peng_tao@emc.com</email>
</author>
<published>2011-07-31T00:52:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=892cd4a38f0d379dfdbc1a0a45eaa31a89976796'/>
<id>urn:sha1:892cd4a38f0d379dfdbc1a0a45eaa31a89976796</id>
<content type='text'>
commit a9bae5666d0510ad69bdb437371c9a3e6b770705 upstream.

There can be multiple lseg per file, so layoutcommit should be
able to handle it.

[Needed in v3.0]
Signed-off-by: Peng Tao &lt;peng_tao@emc.com&gt;
Signed-off-by: Boaz Harrosh &lt;bharrosh@panasas.com&gt;
Signed-off-by: Jim Rees &lt;rees@umich.edu&gt;
Signed-off-by: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>firewire: cdev: prevent race between first get_info ioctl and bus reset event queuing</title>
<updated>2011-08-05T04:58:34+00:00</updated>
<author>
<name>Stefan Richter</name>
<email>stefanr@s5r6.in-berlin.de</email>
</author>
<published>2011-07-09T14:43:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6f437783919467437f19ec534a0317aef2fd2584'/>
<id>urn:sha1:6f437783919467437f19ec534a0317aef2fd2584</id>
<content type='text'>
commit 93b37905f70083d6143f5f4dba0a45cc64379a62 upstream.

Between open(2) of a /dev/fw* and the first FW_CDEV_IOC_GET_INFO
ioctl(2) on it, the kernel already queues FW_CDEV_EVENT_BUS_RESET events
to be read(2) by the client.  The get_info ioctl is practically always
issued right away after open, hence this condition only occurs if the
client opens during a bus reset, especially during a rapid series of bus
resets.

The problem with this condition is twofold:

  - These bus reset events carry the (as yet undocumented) @closure
    value of 0.  But it is not the kernel's place to choose closures;
    they are privat to the client.  E.g., this 0 value forced from the
    kernel makes it unsafe for clients to dereference it as a pointer to
    a closure object without NULL pointer check.

  - It is impossible for clients to determine the relative order of bus
    reset events from get_info ioctl(2) versus those from read(2),
    except in one way:  By comparison of closure values.  Again, such a
    procedure imposes complexity on clients and reduces freedom in use
    of the bus reset closure.

So, change the ABI to suppress queuing of bus reset events before the
first FW_CDEV_IOC_GET_INFO ioctl was issued by the client.

Note, this ABI change cannot be version-controlled.  The kernel cannot
distinguish old from new clients before the first FW_CDEV_IOC_GET_INFO
ioctl.

We will try to back-merge this change into currently maintained stable/
longterm series, and we only document the new behaviour.  The old
behavior is now considered a kernel bug, which it basically is.

Signed-off-by: Stefan Richter &lt;stefanr@s5r6.in-berlin.de&gt;
Cc: &lt;stable@kernel.org&gt;

</content>
</entry>
<entry>
<title>gro: Only reset frag0 when skb can be pulled</title>
<updated>2011-08-05T04:58:31+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2011-07-27T13:16:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=97edbc901240090ca75b81aa8955bcef8d570434'/>
<id>urn:sha1:97edbc901240090ca75b81aa8955bcef8d570434</id>
<content type='text'>
commit 17dd759c67f21e34f2156abcf415e1f60605a188 upstream.

Currently skb_gro_header_slow unconditionally resets frag0 and
frag0_len.  However, when we can't pull on the skb this leaves
the GRO fields in an inconsistent state.

This patch fixes this by only resetting those fields after the
pskb_may_pull test.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip</title>
<updated>2011-07-20T22:56:25+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-07-20T22:56:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cf6ace16a3cd8b728fb0afa68368fd40bbeae19f'/>
<id>urn:sha1:cf6ace16a3cd8b728fb0afa68368fd40bbeae19f</id>
<content type='text'>
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  signal: align __lock_task_sighand() irq disabling and RCU
  softirq,rcu: Inform RCU of irq_exit() activity
  sched: Add irq_{enter,exit}() to scheduler_ipi()
  rcu: protect __rcu_read_unlock() against scheduler-using irq handlers
  rcu: Streamline code produced by __rcu_read_unlock()
  rcu: Fix RCU_BOOST race handling current-&gt;rcu_read_unlock_special
  rcu: decrease rcu_report_exp_rnp coupling with scheduler
</content>
</entry>
<entry>
<title>Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/urgent</title>
<updated>2011-07-20T18:59:26+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2011-07-20T18:59:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d1e9ae47a0285d3f1699e8219ce50f656243b93f'/>
<id>urn:sha1:d1e9ae47a0285d3f1699e8219ce50f656243b93f</id>
<content type='text'>
</content>
</entry>
<entry>
<title>sched: Allow for overlapping sched_domain spans</title>
<updated>2011-07-20T16:32:41+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-07-15T08:35:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e3589f6c81e4764d32a25d2a2a0afe54fa344f5c'/>
<id>urn:sha1:e3589f6c81e4764d32a25d2a2a0afe54fa344f5c</id>
<content type='text'>
Allow for sched_domain spans that overlap by giving such domains their
own sched_group list instead of sharing the sched_groups amongst
each-other.

This is needed for machines with more than 16 nodes, because
sched_domain_node_span() will generate a node mask from the
16 nearest nodes without regard if these masks have any overlap.

Currently sched_domains have a sched_group that maps to their child
sched_domain span, and since there is no overlap we share the
sched_group between the sched_domains of the various CPUs. If however
there is overlap, we would need to link the sched_group list in
different ways for each cpu, and hence sharing isn't possible.

In order to solve this, allocate private sched_groups for each CPU's
sched_domain but have the sched_groups share a sched_group_power
structure such that we can uniquely track the power.

Reported-and-tested-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: http://lkml.kernel.org/n/tip-08bxqw9wis3qti9u5inifh3y@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>sched: Break out cpu_power from the sched_group structure</title>
<updated>2011-07-20T16:32:40+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-07-14T11:00:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9c3f75cbd144014bea6af866a154cc2e73ab2287'/>
<id>urn:sha1:9c3f75cbd144014bea6af866a154cc2e73ab2287</id>
<content type='text'>
In order to prepare for non-unique sched_groups per domain, we need to
carry the cpu_power elsewhere, so put a level of indirection in.

Reported-and-tested-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: http://lkml.kernel.org/n/tip-qkho2byuhe4482fuknss40ad@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>rcu: Fix RCU_BOOST race handling current-&gt;rcu_read_unlock_special</title>
<updated>2011-07-20T04:38:52+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paul.mckenney@linaro.org</email>
</author>
<published>2011-07-14T19:24:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7765be2fec0f476fcd61812d5f9406b04c765020'/>
<id>urn:sha1:7765be2fec0f476fcd61812d5f9406b04c765020</id>
<content type='text'>
The RCU_BOOST commits for TREE_PREEMPT_RCU introduced an other-task
write to a new RCU_READ_UNLOCK_BOOSTED bit in the task_struct structure's
-&gt;rcu_read_unlock_special field, but, as noted by Steven Rostedt, without
correctly synchronizing all accesses to -&gt;rcu_read_unlock_special.
This could result in bits in -&gt;rcu_read_unlock_special being spuriously
set and cleared due to conflicting accesses, which in turn could result
in deadlocks between the rcu_node structure's -&gt;lock and the scheduler's
rq and pi locks.  These deadlocks would result from RCU incorrectly
believing that the just-ended RCU read-side critical section had been
preempted and/or boosted.  If that RCU read-side critical section was
executed with either rq or pi locks held, RCU's ensuing (incorrect)
calls to the scheduler would cause the scheduler to attempt to once
again acquire the rq and pi locks, resulting in deadlock.  More complex
deadlock cycles are also possible, involving multiple rq and pi locks
as well as locks from multiple rcu_node structures.

This commit fixes synchronization by creating -&gt;rcu_boosted field in
task_struct that is accessed and modified only when holding the -&gt;lock
in the rcu_node structure on which the task is queued (on that rcu_node
structure's -&gt;blkd_tasks list).  This results in tasks accessing only
their own current-&gt;rcu_read_unlock_special fields, making unsynchronized
access once again legal, and keeping the rcu_read_unlock() fastpath free
of atomic instructions and memory barriers.

The reason that the rcu_read_unlock() fastpath does not need to access
the new current-&gt;rcu_boosted field is that this new field cannot
be non-zero unless the RCU_READ_UNLOCK_BLOCKED bit is set in the
current-&gt;rcu_read_unlock_special field.  Therefore, rcu_read_unlock()
need only test current-&gt;rcu_read_unlock_special: if that is zero, then
current-&gt;rcu_boosted must also be zero.

This bug does not affect TINY_PREEMPT_RCU because this implementation
of RCU accesses current-&gt;rcu_read_unlock_special with irqs disabled,
thus preventing races on the !SMP systems that TINY_PREEMPT_RCU runs on.

Maybe-reported-by: Dave Jones &lt;davej@redhat.com&gt;
Maybe-reported-by: Sergey Senozhatsky &lt;sergey.senozhatsky@gmail.com&gt;
Reported-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Paul E. McKenney &lt;paul.mckenney@linaro.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Reviewed-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
</feed>
