<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/vhost, branch linux-4.20.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-4.20.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-4.20.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2019-03-19T12:11:56+00:00</updated>
<entry>
<title>vhost/vsock: fix vhost vsock cid hashing inconsistent</title>
<updated>2019-03-19T12:11:56+00:00</updated>
<author>
<name>Zha Bin</name>
<email>zhabin@linux.alibaba.com</email>
</author>
<published>2019-01-08T08:07:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9ad64a53fd0f2dbb54c80280c1bc573e71666e27'/>
<id>urn:sha1:9ad64a53fd0f2dbb54c80280c1bc573e71666e27</id>
<content type='text'>
commit 7fbe078c37aba3088359c9256c1a1d0c3e39ee81 upstream.

The vsock core only supports 32bit CID, but the Virtio-vsock spec define
CID (dst_cid and src_cid) as u64 and the upper 32bits is reserved as
zero. This inconsistency causes one bug in vhost vsock driver. The
scenarios is:

  0. A hash table (vhost_vsock_hash) is used to map an CID to a vsock
  object. And hash_min() is used to compute the hash key. hash_min() is
  defined as:
  (sizeof(val) &lt;= 4 ? hash_32(val, bits) : hash_long(val, bits)).
  That means the hash algorithm has dependency on the size of macro
  argument 'val'.
  0. In function vhost_vsock_set_cid(), a 64bit CID is passed to
  hash_min() to compute the hash key when inserting a vsock object into
  the hash table.
  0. In function vhost_vsock_get(), a 32bit CID is passed to hash_min()
  to compute the hash key when looking up a vsock for an CID.

Because the different size of the CID, hash_min() returns different hash
key, thus fails to look up the vsock object for an CID.

To fix this bug, we keep CID as u64 in the IOCTLs and virtio message
headers, but explicitly convert u64 to u32 when deal with the hash table
and vsock core.

Fixes: 834e772c8db0 ("vhost/vsock: fix use-after-free in network stack callers")
Link: https://github.com/stefanha/virtio/blob/vsock/trunk/content.tex
Signed-off-by: Zha Bin &lt;zhabin@linux.alibaba.com&gt;
Reviewed-by: Liu Jiang &lt;gerry@linux.alibaba.com&gt;
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Shengjing Zhu &lt;i@zhsj.me&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</content>
</entry>
<entry>
<title>vhost/scsi: Use copy_to_iter() to send control queue response</title>
<updated>2019-02-27T09:09:48+00:00</updated>
<author>
<name>Bijan Mottahedeh</name>
<email>bijan.mottahedeh@oracle.com</email>
</author>
<published>2018-12-04T00:48:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=643bb331b53c89574df9d8ae85eff61e92a488ee'/>
<id>urn:sha1:643bb331b53c89574df9d8ae85eff61e92a488ee</id>
<content type='text'>
[ Upstream commit 8e5dadfe76cf2862ebf3e4f22adef29982df7766 ]

Uses copy_to_iter() instead of __copy_to_user() in order to ensure we
support arbitrary layouts and an input buffer split across iov entries.

Fixes: 0d02dbd68c47b ("vhost/scsi: Respond to control queue operations")
Signed-off-by: Bijan Mottahedeh &lt;bijan.mottahedeh@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>vhost: return EINVAL if iovecs size does not match the message size</title>
<updated>2019-02-27T09:09:48+00:00</updated>
<author>
<name>Pavel Tikhomirov</name>
<email>ptikhomirov@virtuozzo.com</email>
</author>
<published>2018-12-13T14:53:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5a908828ed3c4cb681bf5d714de7b1fe05cda690'/>
<id>urn:sha1:5a908828ed3c4cb681bf5d714de7b1fe05cda690</id>
<content type='text'>
[ Upstream commit 74ad7419489ddade8044e3c9ab064ad656520306 ]

We've failed to copy and process vhost_iotlb_msg so let userspace at
least know about it. For instance before these patch the code below runs
without any error:

int main()
{
  struct vhost_msg msg;
  struct iovec iov;
  int fd;

  fd = open("/dev/vhost-net", O_RDWR);
  if (fd == -1) {
    perror("open");
    return 1;
  }

  iov.iov_base = &amp;msg;
  iov.iov_len = sizeof(msg)-4;

  if (writev(fd, &amp;iov,1) == -1) {
    perror("writev");
    return 1;
  }

  return 0;
}

Signed-off-by: Pavel Tikhomirov &lt;ptikhomirov@virtuozzo.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>vhost: correctly check the return value of translate_desc() in log_used()</title>
<updated>2019-02-23T08:08:06+00:00</updated>
<author>
<name>Jason Wang</name>
<email>jasowang@redhat.com</email>
</author>
<published>2019-02-19T06:53:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=67df61cef0ea269336234c0986fe983da22026fe'/>
<id>urn:sha1:67df61cef0ea269336234c0986fe983da22026fe</id>
<content type='text'>
[ Upstream commit 816db7663565cd23f74ed3d5c9240522e3fb0dda ]

When fail, translate_desc() returns negative value, otherwise the
number of iovs. So we should fail when the return value is negative
instead of a blindly check against zero.

Detected by CoverityScan, CID# 1442593:  Control flow issues  (DEADCODE)

Fixes: cc5e71075947 ("vhost: log dirty page correctly")
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reported-by: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>vhost: fix OOB in get_rx_bufs()</title>
<updated>2019-02-06T16:27:37+00:00</updated>
<author>
<name>Jason Wang</name>
<email>jasowang@redhat.com</email>
</author>
<published>2019-01-28T07:05:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6706210b3897492ffe74907b5126444ddbf9f166'/>
<id>urn:sha1:6706210b3897492ffe74907b5126444ddbf9f166</id>
<content type='text'>
[ Upstream commit b46a0bf78ad7b150ef5910da83859f7f5a514ffd ]

After batched used ring updating was introduced in commit e2b3b35eb989
("vhost_net: batch used ring update in rx"). We tend to batch heads in
vq-&gt;heads for more than one packet. But the quota passed to
get_rx_bufs() was not correctly limited, which can result a OOB write
in vq-&gt;heads.

        headcount = get_rx_bufs(vq, vq-&gt;heads + nvq-&gt;done_idx,
                    vhost_len, &amp;in, vq_log, &amp;log,
                    likely(mergeable) ? UIO_MAXIOV : 1);

UIO_MAXIOV was still used which is wrong since we could have batched
used in vq-&gt;heads, this will cause OOB if the next buffer needs more
than 960 (1024 (UIO_MAXIOV) - 64 (VHOST_NET_BATCH)) heads after we've
batched 64 (VHOST_NET_BATCH) heads:
Acked-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;

=============================================================================
BUG kmalloc-8k (Tainted: G    B            ): Redzone overwritten
-----------------------------------------------------------------------------

INFO: 0x00000000fd93b7a2-0x00000000f0713384. First byte 0xa9 instead of 0xcc
INFO: Allocated in alloc_pd+0x22/0x60 age=3933677 cpu=2 pid=2674
    kmem_cache_alloc_trace+0xbb/0x140
    alloc_pd+0x22/0x60
    gen8_ppgtt_create+0x11d/0x5f0
    i915_ppgtt_create+0x16/0x80
    i915_gem_create_context+0x248/0x390
    i915_gem_context_create_ioctl+0x4b/0xe0
    drm_ioctl_kernel+0xa5/0xf0
    drm_ioctl+0x2ed/0x3a0
    do_vfs_ioctl+0x9f/0x620
    ksys_ioctl+0x6b/0x80
    __x64_sys_ioctl+0x11/0x20
    do_syscall_64+0x43/0xf0
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
INFO: Slab 0x00000000d13e87af objects=3 used=3 fp=0x          (null) flags=0x200000000010201
INFO: Object 0x0000000003278802 @offset=17064 fp=0x00000000e2e6652b

Fixing this by allocating UIO_MAXIOV + VHOST_NET_BATCH iovs for
vhost-net. This is done through set the limitation through
vhost_dev_init(), then set_owner can allocate the number of iov in a
per device manner.

This fixes CVE-2018-16880.

Fixes: e2b3b35eb989 ("vhost_net: batch used ring update in rx")
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>vhost: log dirty page correctly</title>
<updated>2019-01-31T07:15:33+00:00</updated>
<author>
<name>Jason Wang</name>
<email>jasowang@redhat.com</email>
</author>
<published>2019-01-16T08:54:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=44df799e619844bd6f81fdece1e4e172c8bf024b'/>
<id>urn:sha1:44df799e619844bd6f81fdece1e4e172c8bf024b</id>
<content type='text'>
[ Upstream commit cc5e710759470bc7f3c61d11fd54586f15fdbdf4 ]

Vhost dirty page logging API is designed to sync through GPA. But we
try to log GIOVA when device IOTLB is enabled. This is wrong and may
lead to missing data after migration.

To solve this issue, when logging with device IOTLB enabled, we will:

1) reuse the device IOTLB translation result of GIOVA-&gt;HVA mapping to
   get HVA, for writable descriptor, get HVA through iovec. For used
   ring update, translate its GIOVA to HVA
2) traverse the GPA-&gt;HVA mapping to get the possible GPA and log
   through GPA. Pay attention this reverse mapping is not guaranteed
   to be unique, so we should log each possible GPA in this case.

This fix the failure of scp to guest during migration. In -next, we
will probably support passing GIOVA-&gt;GPA instead of GIOVA-&gt;HVA.

Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API")
Reported-by: Jintack Lim &lt;jintack@cs.columbia.edu&gt;
Cc: Jintack Lim &lt;jintack@cs.columbia.edu&gt;
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Revert "net: vhost: lock the vqs one by one"</title>
<updated>2018-12-13T05:56:20+00:00</updated>
<author>
<name>Jason Wang</name>
<email>jasowang@redhat.com</email>
</author>
<published>2018-12-13T02:53:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=86a07da34380bc9467fa407dc3e484488a8ad40d'/>
<id>urn:sha1:86a07da34380bc9467fa407dc3e484488a8ad40d</id>
<content type='text'>
This reverts commit 78139c94dc8c96a478e67dab3bee84dc6eccb5fd. We don't
protect device IOTLB with vq mutex, which will lead e.g use after free
for device IOTLB entries. And since we've switched to use
mutex_trylock() in previous patch, it's safe to revert it without
having deadlock.

Fixes: commit 78139c94dc8c ("net: vhost: lock the vqs one by one")
Cc: Tonghao Zhang &lt;xiangxia.m.yue@gmail.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>vhost_net: switch to use mutex_trylock() in vhost_net_busy_poll()</title>
<updated>2018-12-13T05:56:19+00:00</updated>
<author>
<name>Jason Wang</name>
<email>jasowang@redhat.com</email>
</author>
<published>2018-12-13T02:53:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=476e8ba712fa6a479192dfc4301cd3765d9ba713'/>
<id>urn:sha1:476e8ba712fa6a479192dfc4301cd3765d9ba713</id>
<content type='text'>
We used to hold the mutex of paired virtqueue in
vhost_net_busy_poll(). But this will results an inconsistent lock
order which may cause deadlock if we try to bring back the protection
of device IOTLB with vq mutex that requires to hold mutex of all
virtqueues at the same time.

Fix this simply by switching to use mutex_trylock(), when fail just
skip the busy polling. This can happen when device IOTLB is under
updating which should be rare.

Fixes: commit 78139c94dc8c ("net: vhost: lock the vqs one by one")
Cc: Tonghao Zhang &lt;xiangxia.m.yue@gmail.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>vhost: make sure used idx is seen before log in vhost_add_used_n()</title>
<updated>2018-12-13T05:56:19+00:00</updated>
<author>
<name>Jason Wang</name>
<email>jasowang@redhat.com</email>
</author>
<published>2018-12-13T02:53:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=841df922417eb82c835e93d4b93eb6a68c99d599'/>
<id>urn:sha1:841df922417eb82c835e93d4b93eb6a68c99d599</id>
<content type='text'>
We miss a write barrier that guarantees used idx is updated and seen
before log. This will let userspace sync and copy used ring before
used idx is update. Fix this by adding a barrier before log_write().

Fixes: 8dd014adfea6f ("vhost-net: mergeable buffers support")
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2018-12-09T23:12:33+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-12-09T23:12:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d48f782e4fb20dc7ec935ca0ca41ae31e4a69362'/>
<id>urn:sha1:d48f782e4fb20dc7ec935ca0ca41ae31e4a69362</id>
<content type='text'>
Pull networking fixes from David Miller:
 "A decent batch of fixes here. I'd say about half are for problems that
  have existed for a while, and half are for new regressions added in
  the 4.20 merge window.

   1) Fix 10G SFP phy module detection in mvpp2, from Baruch Siach.

   2) Revert bogus emac driver change, from Benjamin Herrenschmidt.

   3) Handle BPF exported data structure with pointers when building
      32-bit userland, from Daniel Borkmann.

   4) Memory leak fix in act_police, from Davide Caratti.

   5) Check RX checksum offload in RX descriptors properly in aquantia
      driver, from Dmitry Bogdanov.

   6) SKB unlink fix in various spots, from Edward Cree.

   7) ndo_dflt_fdb_dump() only works with ethernet, enforce this, from
      Eric Dumazet.

   8) Fix FID leak in mlxsw driver, from Ido Schimmel.

   9) IOTLB locking fix in vhost, from Jean-Philippe Brucker.

  10) Fix SKB truesize accounting in ipv4/ipv6/netfilter frag memory
      limits otherwise namespace exit can hang. From Jiri Wiesner.

  11) Address block parsing length fixes in x25 from Martin Schiller.

  12) IRQ and ring accounting fixes in bnxt_en, from Michael Chan.

  13) For tun interfaces, only iface delete works with rtnl ops, enforce
      this by disallowing add. From Nicolas Dichtel.

  14) Use after free in liquidio, from Pan Bian.

  15) Fix SKB use after passing to netif_receive_skb(), from Prashant
      Bhole.

  16) Static key accounting and other fixes in XPS from Sabrina Dubroca.

  17) Partially initialized flow key passed to ip6_route_output(), from
      Shmulik Ladkani.

  18) Fix RTNL deadlock during reset in ibmvnic driver, from Thomas
      Falcon.

  19) Several small TCP fixes (off-by-one on window probe abort, NULL
      deref in tail loss probe, SNMP mis-estimations) from Yuchung
      Cheng"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits)
  net/sched: cls_flower: Reject duplicated rules also under skip_sw
  bnxt_en: Fix _bnxt_get_max_rings() for 57500 chips.
  bnxt_en: Fix NQ/CP rings accounting on the new 57500 chips.
  bnxt_en: Keep track of reserved IRQs.
  bnxt_en: Fix CNP CoS queue regression.
  net/mlx4_core: Correctly set PFC param if global pause is turned off.
  Revert "net/ibm/emac: wrong bit is used for STA control"
  neighbour: Avoid writing before skb-&gt;head in neigh_hh_output()
  ipv6: Check available headroom in ip6_xmit() even without options
  tcp: lack of available data can also cause TSO defer
  ipv6: sr: properly initialize flowi6 prior passing to ip6_route_output
  mlxsw: spectrum_switchdev: Fix VLAN device deletion via ioctl
  mlxsw: spectrum_router: Relax GRE decap matching check
  mlxsw: spectrum_switchdev: Avoid leaking FID's reference count
  mlxsw: spectrum_nve: Remove easily triggerable warnings
  ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes
  sctp: frag_point sanity check
  tcp: fix NULL ref in tail loss probe
  tcp: Do not underestimate rwnd_limited
  net: use skb_list_del_init() to remove from RX sublists
  ...
</content>
</entry>
</feed>
