<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include, branch v4.14.196</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.14.196</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.14.196'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2020-09-03T09:22:33+00:00</updated>
<entry>
<title>overflow.h: Add allocation size calculation helpers</title>
<updated>2020-09-03T09:22:33+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2018-05-07T23:47:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=80459b71e2ce64b39eafc2422212e4c99066d118'/>
<id>urn:sha1:80459b71e2ce64b39eafc2422212e4c99066d118</id>
<content type='text'>
commit 610b15c50e86eb1e4b77274fabcaea29ac72d6a8 upstream.

In preparation for replacing unchecked overflows for memory allocations,
this creates helpers for the 3 most common calculations:

array_size(a, b): 2-dimensional array
array3_size(a, b, c): 3-dimensional array
struct_size(ptr, member, n): struct followed by n-many trailing members

Each of these return SIZE_MAX on overflow instead of wrapping around.

(Additionally renames a variable named "array_size" to avoid future
collision.)

Co-developed-by: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>writeback: Fix sync livelock due to b_dirty_time processing</title>
<updated>2020-09-03T09:22:32+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2020-05-29T14:08:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=77322edb99f0176ec67a300b9f263719ba391b89'/>
<id>urn:sha1:77322edb99f0176ec67a300b9f263719ba391b89</id>
<content type='text'>
commit f9cae926f35e8230330f28c7b743ad088611a8de upstream.

When we are processing writeback for sync(2), move_expired_inodes()
didn't set any inode expiry value (older_than_this). This can result in
writeback never completing if there's steady stream of inodes added to
b_dirty_time list as writeback rechecks dirty lists after each writeback
round whether there's more work to be done. Fix the problem by using
sync(2) start time is inode expiry value when processing b_dirty_time
list similarly as for ordinarily dirtied inodes. This requires some
refactoring of older_than_this handling which simplifies the code
noticeably as a bonus.

Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
CC: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>writeback: Avoid skipping inode writeback</title>
<updated>2020-09-03T09:22:32+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2020-05-29T13:05:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=856fa4ebf57c083fb5d6b190f0243f0abd27cbea'/>
<id>urn:sha1:856fa4ebf57c083fb5d6b190f0243f0abd27cbea</id>
<content type='text'>
commit 5afced3bf28100d81fb2fe7e98918632a08feaf5 upstream.

Inode's i_io_list list head is used to attach inode to several different
lists - wb-&gt;{b_dirty, b_dirty_time, b_io, b_more_io}. When flush worker
prepares a list of inodes to writeback e.g. for sync(2), it moves inodes
to b_io list. Thus it is critical for sync(2) data integrity guarantees
that inode is not requeued to any other writeback list when inode is
queued for processing by flush worker. That's the reason why
writeback_single_inode() does not touch i_io_list (unless the inode is
completely clean) and why __mark_inode_dirty() does not touch i_io_list
if I_SYNC flag is set.

However there are two flaws in the current logic:

1) When inode has only I_DIRTY_TIME set but it is already queued in b_io
list due to sync(2), concurrent __mark_inode_dirty(inode, I_DIRTY_SYNC)
can still move inode back to b_dirty list resulting in skipping
writeback of inode time stamps during sync(2).

2) When inode is on b_dirty_time list and writeback_single_inode() races
with __mark_inode_dirty() like:

writeback_single_inode()		__mark_inode_dirty(inode, I_DIRTY_PAGES)
  inode-&gt;i_state |= I_SYNC
  __writeback_single_inode()
					  inode-&gt;i_state |= I_DIRTY_PAGES;
					  if (inode-&gt;i_state &amp; I_SYNC)
					    bail
  if (!(inode-&gt;i_state &amp; I_DIRTY_ALL))
  - not true so nothing done

We end up with I_DIRTY_PAGES inode on b_dirty_time list and thus
standard background writeback will not writeback this inode leading to
possible dirty throttling stalls etc. (thanks to Martijn Coenen for this
analysis).

Fix these problems by tracking whether inode is queued in b_io or
b_more_io lists in a new I_SYNC_QUEUED flag. When this flag is set, we
know flush worker has queued inode and we should not touch i_io_list.
On the other hand we also know that once flush worker is done with the
inode it will requeue the inode to appropriate dirty list. When
I_SYNC_QUEUED is not set, __mark_inode_dirty() can (and must) move inode
to appropriate dirty list.

Reported-by: Martijn Coenen &lt;maco@android.com&gt;
Reviewed-by: Martijn Coenen &lt;maco@android.com&gt;
Tested-by: Martijn Coenen &lt;maco@android.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>efi: provide empty efi_enter_virtual_mode implementation</title>
<updated>2020-09-03T09:22:28+00:00</updated>
<author>
<name>Andrey Konovalov</name>
<email>andreyknvl@google.com</email>
</author>
<published>2020-08-07T06:25:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ab0c78f79ec6b69d0aaa5d4fd1b22359eaa3c9d5'/>
<id>urn:sha1:ab0c78f79ec6b69d0aaa5d4fd1b22359eaa3c9d5</id>
<content type='text'>
[ Upstream commit 2c547f9da0539ad1f7ef7f08c8c82036d61b011a ]

When CONFIG_EFI is not enabled, we might get an undefined reference to
efi_enter_virtual_mode() error, if this efi_enabled() call isn't inlined
into start_kernel().  This happens in particular, if start_kernel() is
annodated with __no_sanitize_address.

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Elena Petrova &lt;lenaptr@google.com&gt;
Cc: Marco Elver &lt;elver@google.com&gt;
Cc: Vincenzo Frascino &lt;vincenzo.frascino@arm.com&gt;
Cc: Walter Wu &lt;walter-zh.wu@mediatek.com&gt;
Link: http://lkml.kernel.org/r/6514652d3a32d3ed33d6eb5c91d0af63bf0d1a0c.1596544734.git.andreyknvl@google.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>genirq/affinity: Make affinity setting if activated opt-in</title>
<updated>2020-08-21T07:48:23+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2020-07-24T20:44:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6a7a29935d98567331ae26b183422e22f4d0ae90'/>
<id>urn:sha1:6a7a29935d98567331ae26b183422e22f4d0ae90</id>
<content type='text'>
commit f0c7baca180046824e07fc5f1326e83a8fd150c7 upstream.

John reported that on a RK3288 system the perf per CPU interrupts are all
affine to CPU0 and provided the analysis:

 "It looks like what happens is that because the interrupts are not per-CPU
  in the hardware, armpmu_request_irq() calls irq_force_affinity() while
  the interrupt is deactivated and then request_irq() with IRQF_PERCPU |
  IRQF_NOBALANCING.

  Now when irq_startup() runs with IRQ_STARTUP_NORMAL, it calls
  irq_setup_affinity() which returns early because IRQF_PERCPU and
  IRQF_NOBALANCING are set, leaving the interrupt on its original CPU."

This was broken by the recent commit which blocked interrupt affinity
setting in hardware before activation of the interrupt. While this works in
general, it does not work for this particular case. As contrary to the
initial analysis not all interrupt chip drivers implement an activate
callback, the safe cure is to make the deferred interrupt affinity setting
at activation time opt-in.

Implement the necessary core logic and make the two irqchip implementations
for which this is required opt-in. In hindsight this would have been the
right thing to do, but ...

Fixes: baedb87d1b53 ("genirq/affinity: Handle affinity setting on inactive interrupts correctly")
Reported-by: John Keeping &lt;john@metanate.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Marc Zyngier &lt;maz@kernel.org&gt;
Acked-by: Marc Zyngier &lt;maz@kernel.org&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/87blk4tzgm.fsf@nanos.tec.linutronix.de
[fllinden@amazon.com - backported to 4.14]
Signed-off-by: Frank van der Linden &lt;fllinden@amazon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>iommu/vt-d: Enforce PASID devTLB field mask</title>
<updated>2020-08-21T07:48:21+00:00</updated>
<author>
<name>Liu Yi L</name>
<email>yi.l.liu@intel.com</email>
</author>
<published>2020-07-24T01:49:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=40f782ead6b1d0bcebff304a6a55d10d7ebcb275'/>
<id>urn:sha1:40f782ead6b1d0bcebff304a6a55d10d7ebcb275</id>
<content type='text'>
[ Upstream commit 5f77d6ca5ca74e4b4a5e2e010f7ff50c45dea326 ]

Set proper masks to avoid invalid input spillover to reserved bits.

Signed-off-by: Liu Yi L &lt;yi.l.liu@intel.com&gt;
Signed-off-by: Jacob Pan &lt;jacob.jun.pan@linux.intel.com&gt;
Signed-off-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20200724014925.15523-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel &lt;jroedel@suse.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net/compat: Add missing sock updates for SCM_RIGHTS</title>
<updated>2020-08-21T07:48:18+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2020-06-09T23:11:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e34237a26c04308c721b6ce460b0beaa7d7e0e28'/>
<id>urn:sha1:e34237a26c04308c721b6ce460b0beaa7d7e0e28</id>
<content type='text'>
commit d9539752d23283db4692384a634034f451261e29 upstream.

Add missed sock updates to compat path via a new helper, which will be
used more in coming patches. (The net/core/scm.c code is left as-is here
to assist with -stable backports for the compat path.)

Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Sargun Dhillon &lt;sargun@sargun.me&gt;
Cc: Jakub Kicinski &lt;kuba@kernel.org&gt;
Cc: stable@vger.kernel.org
Fixes: 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
Fixes: d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bitfield.h: don't compile-time validate _val in FIELD_FIT</title>
<updated>2020-08-21T07:48:15+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2020-08-10T18:21:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b4840e848efa4648a551e3d833bfe7cebde344d2'/>
<id>urn:sha1:b4840e848efa4648a551e3d833bfe7cebde344d2</id>
<content type='text'>
commit 444da3f52407d74c9aa12187ac6b01f76ee47d62 upstream.

When ur_load_imm_any() is inlined into jeq_imm(), it's possible for the
compiler to deduce a case where _val can only have the value of -1 at
compile time. Specifically,

/* struct bpf_insn: _s32 imm */
u64 imm = insn-&gt;imm; /* sign extend */
if (imm &gt;&gt; 32) { /* non-zero only if insn-&gt;imm is negative */
  /* inlined from ur_load_imm_any */
  u32 __imm = imm &gt;&gt; 32; /* therefore, always 0xffffffff */
  if (__builtin_constant_p(__imm) &amp;&amp; __imm &gt; 255)
    compiletime_assert_XXX()

This can result in tripping a BUILD_BUG_ON() in __BF_FIELD_CHECK() that
checks that a given value is representable in one byte (interpreted as
unsigned).

FIELD_FIT() should return true or false at runtime for whether a value
can fit for not. Don't break the build over a value that's too large for
the mask. We'd prefer to keep the inlining and compiler optimizations
though we know this case will always return false.

Cc: stable@vger.kernel.org
Fixes: 1697599ee301a ("bitfield.h: add FIELD_FIT() helper")
Link: https://lore.kernel.org/kernel-hardening/CAK7LNASvb0UDJ0U5wkYYRzTAdnEs64HjXpEUL7d=V0CXiAXcNw@mail.gmail.com/
Reported-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Debugged-by: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>net: refactor bind_bucket fastreuse into helper</title>
<updated>2020-08-21T07:48:14+00:00</updated>
<author>
<name>Tim Froidcoeur</name>
<email>tim.froidcoeur@tessares.net</email>
</author>
<published>2020-08-11T18:33:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6de6d1ca5f3e8383cb175777f0e9aba7a3c397f5'/>
<id>urn:sha1:6de6d1ca5f3e8383cb175777f0e9aba7a3c397f5</id>
<content type='text'>
[ Upstream commit 62ffc589abb176821662efc4525ee4ac0b9c3894 ]

Refactor the fastreuse update code in inet_csk_get_port into a small
helper function that can be called from other places.

Acked-by: Matthieu Baerts &lt;matthieu.baerts@tessares.net&gt;
Signed-off-by: Tim Froidcoeur &lt;tim.froidcoeur@tessares.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ipvs: allow connection reuse for unconfirmed conntrack</title>
<updated>2020-08-21T07:48:08+00:00</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2020-07-01T15:17:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ff9e162946e1b8051ffe357994d382077d909b2e'/>
<id>urn:sha1:ff9e162946e1b8051ffe357994d382077d909b2e</id>
<content type='text'>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ]

YangYuxi is reporting that connection reuse
is causing one-second delay when SYN hits
existing connection in TIME_WAIT state.
Such delay was added to give time to expire
both the IPVS connection and the corresponding
conntrack. This was considered a rare case
at that time but it is causing problem for
some environments such as Kubernetes.

As nf_conntrack_tcp_packet() can decide to
release the conntrack in TIME_WAIT state and
to replace it with a fresh NEW conntrack, we
can use this to allow rescheduling just by
tuning our check: if the conntrack is
confirmed we can not schedule it to different
real server and the one-second delay still
applies but if new conntrack was created,
we are free to select new real server without
any delays.

YangYuxi lists some of the problem reports:

- One second connection delay in masquerading mode:
https://marc.info/?t=151683118100004&amp;r=1&amp;w=2

- IPVS low throughput #70747
https://github.com/kubernetes/kubernetes/issues/70747

- Apache Bench can fill up ipvs service proxy in seconds #544
https://github.com/cloudnativelabs/kube-router/issues/544

- Additional 1s latency in `host -&gt; service IP -&gt; pod`
https://github.com/kubernetes/kubernetes/issues/90854

Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack")
Co-developed-by: YangYuxi &lt;yx.atom1@gmail.com&gt;
Signed-off-by: YangYuxi &lt;yx.atom1@gmail.com&gt;
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Reviewed-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
