summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-12-05Linux 4.14.86v4.14.86Greg Kroah-Hartman1-1/+1
2018-12-05f2fs: fix missing up_readJaegeuk Kim1-1/+3
commit 89d13c38501df730cbb2e02c4499da1b5187119d upstream. This patch fixes missing up_read call. Fixes: c9b60788fc76 ("f2fs: fix to do sanity check with block address in main area") Cc: <stable@vger.kernel.org> # 4.19+ Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05libceph: check authorizer reply/challenge length before readingIlya Dryomov1-0/+7
commit 130f52f2b203aa0aec179341916ffb2e905f3afd upstream. Avoid scribbling over memory if the received reply/challenge is larger than the buffer supplied with the authorizer. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05libceph: weaken sizeof check in ceph_x_verify_authorizer_reply()Ilya Dryomov1-2/+4
commit f1d10e04637924f2b00a0fecdd2ca4565f5cfc3f upstream. Allow for extending ceph_x_authorize_reply in the future. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05binder: fix race that allows malicious free of live bufferTodd Kjos3-19/+19
commit 7bada55ab50697861eee6bb7d60b41e68a961a9c upstream. Malicious code can attempt to free buffers using the BC_FREE_BUFFER ioctl to binder. There are protections against a user freeing a buffer while in use by the kernel, however there was a window where BC_FREE_BUFFER could be used to free a recently allocated buffer that was not completely initialized. This resulted in a use-after-free detected by KASAN with a malicious test program. This window is closed by setting the buffer's allow_user_free attribute to 0 when the buffer is allocated or when the user has previously freed it instead of waiting for the caller to set it. The problem was that when the struct buffer was recycled, allow_user_free was stale and set to 1 allowing a free to go through. Signed-off-by: Todd Kjos <tkjos@google.com> Acked-by: Arve Hjønnevåg <arve@android.com> Cc: stable <stable@vger.kernel.org> # 4.14 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05misc: mic/scif: fix copy-paste error in scif_create_remote_lookupYueHaibing1-1/+1
commit 6484a677294aa5d08c0210f2f387ebb9be646115 upstream. gcc '-Wunused-but-set-variable' warning: drivers/misc/mic/scif/scif_rma.c: In function 'scif_create_remote_lookup': drivers/misc/mic/scif/scif_rma.c:373:25: warning: variable 'vmalloc_num_pages' set but not used [-Wunused-but-set-variable] 'vmalloc_num_pages' should be used to determine if the address is within the vmalloc range. Fixes: ba612aa8b487 ("misc: mic: SCIF memory registration and unregistration") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05Drivers: hv: vmbus: check the creation_status in vmbus_establish_gpadl()Dexuan Cui1-0/+8
commit eceb05965489784f24bbf4d61ba60e475a983016 upstream. This is a longstanding issue: if the vmbus upper-layer drivers try to consume too many GPADLs, the host may return with an error 0xC0000044 (STATUS_QUOTA_EXCEEDED), but currently we forget to check the creation_status, and hence we can pass an invalid GPADL handle into the OPEN_CHANNEL message, and get an error code 0xc0000225 in open_info->response.open_result.status, and finally we hang in vmbus_open() -> "goto error_free_info" -> vmbus_teardown_gpadl(). With this patch, we can exit gracefully on STATUS_QUOTA_EXCEEDED. Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05mm: use swp_offset as key in shmem_replace_page()Yu Zhao1-2/+4
commit c1cb20d43728aa9b5393bd8d489bc85c142949b2 upstream. We changed the key of swap cache tree from swp_entry_t.val to swp_offset. We need to do so in shmem_replace_page() as well. Hugh said: "shmem_replace_page() has been wrong since the day I wrote it: good enough to work on swap "type" 0, which is all most people ever use (especially those few who need shmem_replace_page() at all), but broken once there are any non-0 swp_type bits set in the higher order bits" Link: http://lkml.kernel.org/r/20181121215442.138545-1-yuzhao@google.com Fixes: f6ab1f7f6b2d ("mm, swap: use offset of swap entry as key of swap cache") Signed-off-by: Yu Zhao <yuzhao@google.com> Reviewed-by: Matthew Wilcox <willy@infradead.org> Acked-by: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> [4.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05lib/test_kmod.c: fix rmmod double freeLuis Chamberlain1-1/+0
commit 5618cf031fecda63847cafd1091e7b8bd626cdb1 upstream. We free the misc device string twice on rmmod; fix this. Without this we cannot remove the module without crashing. Link: http://lkml.kernel.org/r/20181124050500.5257-1-mcgrof@kernel.org Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reported-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@vger.kernel.org> [4.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05iio:st_magn: Fix enable device after triggerMartin Kelly1-9/+3
commit fe5192ac81ad0d4dfe1395d11f393f0513c15f7f upstream. Currently, we enable the device before we enable the device trigger. At high frequencies, this can cause interrupts that don't yet have a poll function associated with them and are thus treated as spurious. At high frequencies with level interrupts, this can even cause an interrupt storm of repeated spurious interrupts (~100,000 on my Beagleboard with the LSM9DS1 magnetometer). If these repeat too much, the interrupt will get disabled and the device will stop functioning. To prevent these problems, enable the device prior to enabling the device trigger, and disable the divec prior to disabling the trigger. This means there's no window of time during which the device creates interrupts but we have no trigger to answer them. Fixes: 90efe055629 ("iio: st_sensors: harden interrupt handling") Signed-off-by: Martin Kelly <martin@martingkelly.com> Tested-by: Denis Ciocca <denis.ciocca@st.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05Revert "usb: dwc3: gadget: skip Set/Clear Halt when invalid"Felipe Balbi1-5/+0
commit 38317f5c0f2faae5110854f36edad810f841d62f upstream. This reverts commit ffb80fc672c3a7b6afd0cefcb1524fb99917b2f3. Turns out that commit is wrong. Host controllers are allowed to use Clear Feature HALT as means to sync data toggle between host and periperal. Cc: <stable@vger.kernel.org> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05usb: core: quirks: add RESET_RESUME quirk for Cherry G230 Stream seriesMichael Niewöhner1-0/+3
commit effd14f66cc1ef6701a19c5a56e39c35f4d395a5 upstream. Cherry G230 Stream 2.0 (G85-231) and 3.0 (G85-232) need this quirk to function correctly. This fixes a but where double pressing numlock locks up the device completely with need to replug the keyboard. Signed-off-by: Michael Niewöhner <linux@mniewoehner.de> Tested-by: Michael Niewöhner <linux@mniewoehner.de> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05USB: usb-storage: Add new IDs to ums-realtekKai-Heng Feng1-0/+10
commit a84a1bcc992f0545a51d2e120b8ca2ef20e2ea97 upstream. There are two new Realtek card readers require ums-realtek to work correctly. Add the new IDs to support them. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05staging: rtl8723bs: Add missing return for cfg80211_rtw_get_stationLarry Finger1-1/+1
commit 8561fb31a1f9594e2807681f5c0721894e367f19 upstream. With Androidx86 8.1, wificond returns "failed to get nl80211_sta_info_tx_failed" and wificondControl returns "Invalid signal poll result from wificond". The fix is to OR sinfo->filled with BIT_ULL(NL80211_STA_INFO_TX_FAILED). This missing bit is apparently not needed with NetworkManager, but it does no harm in that case. Reported-and-Tested-by: youling257 <youling257@gmail.com> Cc: linux-wireless@vger.kernel.org Cc: youling257 <youling257@gmail.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05staging: vchiq_arm: fix compat VCHIQ_IOC_AWAIT_COMPLETIONBen Wolsieffer1-1/+6
commit 5a96b2d38dc054c0bbcbcd585b116566cbd877fe upstream. The compatibility ioctl wrapper for VCHIQ_IOC_AWAIT_COMPLETION assumes that the native ioctl always uses a message buffer and decrements msgbufcount. Certain message types do not use a message buffer and in this case msgbufcount is not decremented, and completion->header for the message is NULL. Because the wrapper unconditionally decrements msgbufcount, the calling process may assume that a message buffer has been used even when it has not. This results in a memory leak in the userspace code that interfaces with this driver. When msgbufcount is decremented, the userspace code assumes that the buffer can be freed though the reference in completion->header, which cannot happen when the reference is NULL. This patch causes the wrapper to only decrement msgbufcount when the native ioctl decrements it. Note that we cannot simply copy the native ioctl's value of msgbufcount, because the wrapper only retrieves messages from the native ioctl one at a time, while userspace may request multiple messages. See https://github.com/raspberrypi/linux/pull/2703 for more discussion of this patch. Fixes: 5569a1260933 ("staging: vchiq_arm: Add compatibility wrappers for ioctls") Signed-off-by: Ben Wolsieffer <benwolsieffer@gmail.com> Acked-by: Stefan Wahren <stefan.wahren@i2se.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05btrfs: release metadata before running delayed refsJosef Bacik1-3/+3
We want to release the unused reservation we have since it refills the delayed refs reserve, which will make everything go smoother when running the delayed refs if we're short on our reservation. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-05dmaengine: at_hdmac: fix module unloadingRichard Genoud1-0/+2
commit 77e75fda94d2ebb86aa9d35fb1860f6395bf95de upstream. of_dma_controller_free() was not called on module onloading. This lead to a soft lockup: watchdog: BUG: soft lockup - CPU#0 stuck for 23s! Modules linked in: at_hdmac [last unloaded: at_hdmac] when of_dma_request_slave_channel() tried to call ofdma->of_dma_xlate(). Cc: stable@vger.kernel.org Fixes: bbe89c8e3d59 ("at_hdmac: move to generic DMA binding") Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Richard Genoud <richard.genoud@gmail.com> Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05dmaengine: at_hdmac: fix memory leak in at_dma_xlate()Richard Genoud1-1/+7
commit 98f5f932254b88ce828bc8e4d1642d14e5854caa upstream. The leak was found when opening/closing a serial port a great number of time, increasing kmalloc-32 in slabinfo. Each time the port was opened, dma_request_slave_channel() was called. Then, in at_dma_xlate(), atslave was allocated with devm_kzalloc() and never freed. (Well, it was free at module unload, but that's not what we want). So, here, kzalloc is more suited for the job since it has to be freed in atc_free_chan_resources(). Cc: stable@vger.kernel.org Fixes: bbe89c8e3d59 ("at_hdmac: move to generic DMA binding") Reported-by: Mario Forner <m.forner@be4energy.com> Suggested-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Richard Genoud <richard.genoud@gmail.com> Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05ARM: dts: rockchip: Remove @0 from the veyron memory nodeHeiko Stuebner1-1/+5
commit 672e60b72bbe7aace88721db55b380b6a51fb8f9 upstream. The Coreboot version on veyron ChromeOS devices seems to ignore memory@0 nodes when updating the available memory and instead inserts another memory node without the address. This leads to 4GB systems only ever be using 2GB as the memory@0 node takes precedence. So remove the @0 for veyron devices. Fixes: 0b639b815f15 ("ARM: dts: rockchip: Add missing unit name to memory nodes in rk3288 boards") Cc: stable@vger.kernel.org Reported-by: Heikki Lindholm <holin@iki.fi> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05ext2: fix potential use after freePan Bian1-1/+1
commit ecebf55d27a11538ea84aee0be643dd953f830d5 upstream. The function ext2_xattr_set calls brelse(bh) to drop the reference count of bh. After that, bh may be freed. However, following brelse(bh), it reads bh->b_data via macro HDR(bh). This may result in a use-after-free bug. This patch moves brelse(bh) after reading field. CC: stable@vger.kernel.org Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05ALSA: hda/realtek - fix headset mic detection for MSI MS-B171Anisse Astier1-0/+1
commit 8cd65271f8e545ddeed10ecc2e417936bdff168e upstream. MSI Cubi N 8GL (MS-B171) needs the same fixup as its older model, the MS-B120, in order for the headset mic to be properly detected. They both use a single 3-way jack for both mic and headset with an ALC283 codec, with the same pins used. Cc: stable@vger.kernel.org Signed-off-by: Anisse Astier <anisse@astier.eu> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05ALSA: hda/realtek - Support ALC300Kailang Yang1-0/+8
commit 1078bef0cd9291355a20369b21cd823026ab8eaa upstream. This patch will enable ALC300. [ It's almost equivalent with other ALC269-compatible ones, and apparently has no loopback mixer -- tiwai ] Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05ALSA: sparc: Fix invalid snd_free_pages() at error pathTakashi Iwai1-6/+2
commit 9a20332ab373b1f8f947e0a9c923652b32dab031 upstream. Some spurious calls of snd_free_pages() have been overlooked and remain in the error paths of sparc cs4231 driver code. Since runtime->dma_area is managed by the PCM core helper, we shouldn't release manually. Drop the superfluous calls. Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05ALSA: control: Fix race between adding and removing a user elementTakashi Iwai1-35/+45
commit e1a7bfe3807974e66f971f2589d4e0197ec0fced upstream. The procedure for adding a user control element has some window opened for race against the concurrent removal of a user element. This was caught by syzkaller, hitting a KASAN use-after-free error. This patch addresses the bug by wrapping the whole procedure to add a user control element with the card->controls_rwsem, instead of only around the increment of card->user_ctl_count. This required a slight code refactoring, too. The function snd_ctl_add() is split to two parts: a core function to add the control element and a part calling it. The former is called from the function for adding a user control element inside the controls_rwsem. One change to be noted is that snd_ctl_notify() for adding a control element gets called inside the controls_rwsem as well while it was called outside the rwsem. But this should be OK, as snd_ctl_notify() takes another (finer) rwlock instead of rwsem, and the call of snd_ctl_notify() inside rwsem is already done in another code path. Reported-by: syzbot+dc09047bce3820621ba2@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05ALSA: ac97: Fix incorrect bit shift at AC97-SPSA control writeTakashi Iwai1-1/+1
commit 7194eda1ba0872d917faf3b322540b4f57f11ba5 upstream. The function snd_ac97_put_spsa() gets the bit shift value from the associated private_value, but it extracts too much; the current code extracts 8 bit values in bits 8-15, but this is a combination of two nibbles (bits 8-11 and bits 12-15) for left and right shifts. Due to the incorrect bits extraction, the actual shift may go beyond the 32bit value, as spotted recently by UBSAN check: UBSAN: Undefined behaviour in sound/pci/ac97/ac97_codec.c:836:7 shift exponent 68 is too large for 32-bit type 'int' This patch fixes the shift value extraction by masking the properly with 0x0f instead of 0xff. Reported-and-tested-by: Meelis Roos <mroos@linux.ee> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05ALSA: wss: Fix invalid snd_free_pages() at error pathTakashi Iwai1-2/+0
commit 7b69154171b407844c273ab4c10b5f0ddcd6aa29 upstream. Some spurious calls of snd_free_pages() have been overlooked and remain in the error paths of wss driver code. Since runtime->dma_area is managed by the PCM core helper, we shouldn't release manually. Drop the superfluous calls. Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05fs: fix lost error code in dio_completeMaximilian Heyne1-2/+2
commit 41e817bca3acd3980efe5dd7d28af0e6f4ab9247 upstream. commit e259221763a40403d5bb232209998e8c45804ab8 ("fs: simplify the generic_write_sync prototype") reworked callers of generic_write_sync(), and ended up dropping the error return for the directio path. Prior to that commit, in dio_complete(), an error would be bubbled up the stack, but after that commit, errors passed on to dio_complete were eaten up. This was reported on the list earlier, and a fix was proposed in https://lore.kernel.org/lkml/20160921141539.GA17898@infradead.org/, but never followed up with. We recently hit this bug in our testing where fencing io errors, which were previously erroring out with EIO, were being returned as success operations after this commit. The fix proposed on the list earlier was a little short -- it would have still called generic_write_sync() in case `ret` already contained an error. This fix ensures generic_write_sync() is only called when there's no pending error in the write. Additionally, transferred is replaced with ret to bring this code in line with other callers. Fixes: e259221763a4 ("fs: simplify the generic_write_sync prototype") Reported-by: Ravi Nankani <rnankani@amazon.com> Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Christoph Hellwig <hch@lst.de> CC: Torsten Mehlan <tomeh@amazon.de> CC: Uwe Dannowski <uwed@amazon.de> CC: Amit Shah <aams@amazon.de> CC: David Woodhouse <dwmw@amazon.co.uk> CC: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts()Jiri Olsa2-18/+12
commit 67266c1080ad56c31af72b9c18355fde8ccc124a upstream. Currently we check the branch tracing only by checking for the PERF_COUNT_HW_BRANCH_INSTRUCTIONS event of PERF_TYPE_HARDWARE type. But we can define the same event with the PERF_TYPE_RAW type. Changing the intel_pmu_has_bts() code to check on event's final hw config value, so both HW types are covered. Adding unlikely to intel_pmu_has_bts() condition calls, because it was used in the original code in intel_bts_constraints. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20181121101612.16272-2-jolsa@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05perf/x86/intel: Move branch tracing setup to the Intel-specific source fileJiri Olsa2-21/+40
commit ed6101bbf6266ee83e620b19faa7c6ad56bb41ab upstream. Moving branch tracing setup to Intel core object into separate intel_pmu_bts_config function, because it's Intel specific. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20181121101612.16272-1-jolsa@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05x86/fpu: Disable bottom halves while loading FPU registersSebastian Andrzej Siewior1-2/+2
commit 68239654acafe6aad5a3c1dc7237e60accfebc03 upstream. The sequence fpu->initialized = 1; /* step A */ preempt_disable(); /* step B */ fpu__restore(fpu); preempt_enable(); in __fpu__restore_sig() is racy in regard to a context switch. For 32bit frames, __fpu__restore_sig() prepares the FPU state within fpu->state. To ensure that a context switch (switch_fpu_prepare() in particular) does not modify fpu->state it uses fpu__drop() which sets fpu->initialized to 0. After fpu->initialized is cleared, the CPU's FPU state is not saved to fpu->state during a context switch. The new state is loaded via fpu__restore(). It gets loaded into fpu->state from userland and ensured it is sane. fpu->initialized is then set to 1 in order to avoid fpu__initialize() doing anything (overwrite the new state) which is part of fpu__restore(). A context switch between step A and B above would save CPU's current FPU registers to fpu->state and overwrite the newly prepared state. This looks like a tiny race window but the Kernel Test Robot reported this back in 2016 while we had lazy FPU support. Borislav Petkov made the link between that report and another patch that has been posted. Since the removal of the lazy FPU support, this race goes unnoticed because the warning has been removed. Disable bottom halves around the restore sequence to avoid the race. BH need to be disabled because BH is allowed to run (even with preemption disabled) and might invoke kernel_fpu_begin() by doing IPsec. [ bp: massage commit message a bit. ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: stable@vger.kernel.org Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/20181120102635.ddv3fvavxajjlfqk@linutronix.de Link: https://lkml.kernel.org/r/20160226074940.GA28911@pd.tnic Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05x86/MCE/AMD: Fix the thresholding machinery initialization orderBorislav Petkov1-13/+6
commit 60c8144afc287ef09ce8c1230c6aa972659ba1bb upstream. Currently, the code sets up the thresholding interrupt vector and only then goes about initializing the thresholding banks. Which is wrong, because an early thresholding interrupt would cause a NULL pointer dereference when accessing those banks and prevent the machine from booting. Therefore, set the thresholding interrupt vector only *after* having initialized the banks successfully. Fixes: 18807ddb7f88 ("x86/mce/AMD: Reset Threshold Limit after logging error") Reported-by: Rafał Miłecki <rafal@milecki.pl> Reported-by: John Clemens <clemej@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Rafał Miłecki <rafal@milecki.pl> Tested-by: John Clemens <john@deater.net> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac@vger.kernel.org Cc: stable@vger.kernel.org Cc: Tony Luck <tony.luck@intel.com> Cc: x86@kernel.org Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Link: https://lkml.kernel.org/r/20181127101700.2964-1-zajec5@gmail.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=201291 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou.Christoph Muellner1-1/+1
commit c1d91f86a1b4c9c05854d59c6a0abd5d0f75b849 upstream. This patch fixes the wrong polarity setting for the PCIe host driver's pre-reset pin for rk3399-puma-haikou. Without this patch link training will most likely fail. Fixes: 60fd9f72ce8a ("arm64: dts: rockchip: add Haikou baseboard with RK3399-Q7 SoM") Cc: stable@vger.kernel.org Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05PCI: layerscape: Fix wrong invocation of outbound window disable accessorHou Zhiqiang1-1/+1
commit c6fd6fe9dea44732cdcd970f1130b8cc50ad685a upstream. The order of parameters is not correct when invoking the outbound window disable routine. Fix it. Fixes: 4a2745d760fa ("PCI: layerscape: Disable outbound windows configured by bootloader") Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> [lorenzo.pieralisi@arm.com: commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05btrfs: relocation: set trans to be NULL after ending transactionPan Bian1-0/+1
commit 42a657f57628402c73237547f0134e083e2f6764 upstream. The function relocate_block_group calls btrfs_end_transaction to release trans when update_backref_cache returns 1, and then continues the loop body. If btrfs_block_rsv_refill fails this time, it will jump out the loop and the freed trans will be accessed. This may result in a use-after-free bug. The patch assigns NULL to trans after trans is released so that it will not be accessed. Fixes: 0647bf564f1 ("Btrfs: improve forever loop when doing balance relocation") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05Btrfs: ensure path name is null terminated at btrfs_control_ioctlFilipe Manana1-0/+1
commit f505754fd6599230371cb01b9332754ddc104be1 upstream. We were using the path name received from user space without checking that it is null terminated. While btrfs-progs is well behaved and does proper validation and null termination, someone could call the ioctl and pass a non-null terminated patch, leading to buffer overrun problems in the kernel. The ioctl is protected by CAP_SYS_ADMIN. So just set the last byte of the path to a null character, similar to what we do in other ioctls (add/remove/resize device, snapshot creation, etc). CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05xtensa: fix coprocessor part of ptrace_{get,set}xregsMax Filippov1-4/+38
commit 38a35a78c5e270cbe53c4fef6b0d3c2da90dd849 upstream. Layout of coprocessor registers in the elf_xtregs_t and xtregs_coprocessor_t may be different due to alignment. Thus it is not always possible to copy data between the xtregs_coprocessor_t structure and the elf_xtregs_t and get correct values for all registers. Use a table of offsets and sizes of individual coprocessor register groups to do coprocessor context copying in the ptrace_getxregs and ptrace_setxregs. This fixes incorrect coprocessor register values reading from the user process by the native gdb on an xtensa core with multiple coprocessors and registers with high alignment requirements. Cc: stable@vger.kernel.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05xtensa: fix coprocessor context offset definitionsMax Filippov1-8/+8
commit 03bc996af0cc71c7f30c384d8ce7260172423b34 upstream. Coprocessor context offsets are used by the assembly code that moves coprocessor context between the individual fields of the thread_info::xtregs_cp structure and coprocessor registers. This fixes coprocessor context clobbering on flushing and reloading during normal user code execution and user process debugging in the presence of more than one coprocessor in the core configuration. Cc: stable@vger.kernel.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05xtensa: enable coprocessors that are being flushedMax Filippov1-1/+4
commit 2958b66694e018c552be0b60521fec27e8d12988 upstream. coprocessor_flush_all may be called from a context of a thread that is different from the thread being flushed. In that case contents of the cpenable special register may not match ti->cpenable of the target thread, resulting in unhandled coprocessor exception in the kernel context. Set cpenable special register to the ti->cpenable of the target register for the duration of the flush and restore it afterwards. This fixes the following crash caused by coprocessor register inspection in native gdb: (gdb) p/x $w0 Illegal instruction in kernel: sig: 9 [#1] PREEMPT Call Trace: ___might_sleep+0x184/0x1a4 __might_sleep+0x41/0xac exit_signals+0x14/0x218 do_exit+0xc9/0x8b8 die+0x99/0xa0 do_illegal_instruction+0x18/0x6c common_exception+0x77/0x77 coprocessor_flush+0x16/0x3c arch_ptrace+0x46c/0x674 sys_ptrace+0x2ce/0x3b4 system_call+0x54/0x80 common_exception+0x77/0x77 note: gdb[100] exited with preempt_count 1 Killed Cc: stable@vger.kernel.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05KVM: X86: Fix scan ioapic use-before-initializationWanpeng Li1-1/+2
commit e97f852fd4561e77721bb9a4e0ea9d98305b1e93 upstream. Reported by syzkaller: BUG: unable to handle kernel NULL pointer dereference at 00000000000001c8 PGD 80000003ec4da067 P4D 80000003ec4da067 PUD 3f7bfa067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 7 PID: 5059 Comm: debug Tainted: G OE 4.19.0-rc5 #16 RIP: 0010:__lock_acquire+0x1a6/0x1990 Call Trace: lock_acquire+0xdb/0x210 _raw_spin_lock+0x38/0x70 kvm_ioapic_scan_entry+0x3e/0x110 [kvm] vcpu_enter_guest+0x167e/0x1910 [kvm] kvm_arch_vcpu_ioctl_run+0x35c/0x610 [kvm] kvm_vcpu_ioctl+0x3e9/0x6d0 [kvm] do_vfs_ioctl+0xa5/0x690 ksys_ioctl+0x6d/0x80 __x64_sys_ioctl+0x1a/0x20 do_syscall_64+0x83/0x6e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe The reason is that the testcase writes hyperv synic HV_X64_MSR_SINT6 msr and triggers scan ioapic logic to load synic vectors into EOI exit bitmap. However, irqchip is not initialized by this simple testcase, ioapic/apic objects should not be accessed. This can be triggered by the following program: #define _GNU_SOURCE #include <endian.h> #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/syscall.h> #include <sys/types.h> #include <unistd.h> uint64_t r[3] = {0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff}; int main(void) { syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0); long res = 0; memcpy((void*)0x20000040, "/dev/kvm", 9); res = syscall(__NR_openat, 0xffffffffffffff9c, 0x20000040, 0, 0); if (res != -1) r[0] = res; res = syscall(__NR_ioctl, r[0], 0xae01, 0); if (res != -1) r[1] = res; res = syscall(__NR_ioctl, r[1], 0xae41, 0); if (res != -1) r[2] = res; memcpy( (void*)0x20000080, "\x01\x00\x00\x00\x00\x5b\x61\xbb\x96\x00\x00\x40\x00\x00\x00\x00\x01\x00" "\x08\x00\x00\x00\x00\x00\x0b\x77\xd1\x78\x4d\xd8\x3a\xed\xb1\x5c\x2e\x43" "\xaa\x43\x39\xd6\xff\xf5\xf0\xa8\x98\xf2\x3e\x37\x29\x89\xde\x88\xc6\x33" "\xfc\x2a\xdb\xb7\xe1\x4c\xac\x28\x61\x7b\x9c\xa9\xbc\x0d\xa0\x63\xfe\xfe" "\xe8\x75\xde\xdd\x19\x38\xdc\x34\xf5\xec\x05\xfd\xeb\x5d\xed\x2e\xaf\x22" "\xfa\xab\xb7\xe4\x42\x67\xd0\xaf\x06\x1c\x6a\x35\x67\x10\x55\xcb", 106); syscall(__NR_ioctl, r[2], 0x4008ae89, 0x20000080); syscall(__NR_ioctl, r[2], 0xae80, 0); return 0; } This patch fixes it by bailing out scan ioapic if ioapic is not initialized in kernel. Reported-by: Wei Wu <ww9210@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Wei Wu <ww9210@gmail.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05KVM: x86: Fix kernel info-leak in KVM_HC_CLOCK_PAIRING hypercallLiran Alon1-0/+1
commit bcbfbd8ec21096027f1ee13ce6c185e8175166f6 upstream. kvm_pv_clock_pairing() allocates local var "struct kvm_clock_pairing clock_pairing" on stack and initializes all it's fields besides padding (clock_pairing.pad[]). Because clock_pairing var is written completely (including padding) to guest memory, failure to init struct padding results in kernel info-leak. Fix the issue by making sure to also init the padding with zeroes. Fixes: 55dd00a73a51 ("KVM: x86: add KVM_HC_CLOCK_PAIRING hypercall") Reported-by: syzbot+a8ef68d71211ba264f56@syzkaller.appspotmail.com Reviewed-by: Mark Kanda <mark.kanda@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcbJim Mattson1-5/+15
commit fd65d3142f734bc4376053c8d75670041903134d upstream. Previously, we only called indirect_branch_prediction_barrier on the logical CPU that freed a vmcb. This function should be called on all logical CPUs that last loaded the vmcb in question. Fixes: 15d45071523d ("KVM/x86: Add IBPB support") Reported-by: Neel Natu <neelnatu@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05kvm: mmu: Fix race in emulated page table writesJunaid Shahid1-18/+9
commit 0e0fee5c539b61fdd098332e0e2cc375d9073706 upstream. When a guest page table is updated via an emulated write, kvm_mmu_pte_write() is called to update the shadow PTE using the just written guest PTE value. But if two emulated guest PTE writes happened concurrently, it is possible that the guest PTE and the shadow PTE end up being out of sync. Emulated writes do not mark the shadow page as unsync-ed, so this inconsistency will not be resolved even by a guest TLB flush (unless the page was marked as unsync-ed at some other point). This is fixed by re-reading the current value of the guest PTE after the MMU lock has been acquired instead of just using the value that was written prior to calling kvm_mmu_pte_write(). Signed-off-by: Junaid Shahid <junaids@google.com> Reviewed-by: Wanpeng Li <wanpengli@tencent.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05x86/speculation: Provide IBPB always command line optionsThomas Gleixner2-11/+35
commit 55a974021ec952ee460dc31ca08722158639de72 upstream Provide the possibility to enable IBPB always in combination with 'prctl' and 'seccomp'. Add the extra command line options and rework the IBPB selection to evaluate the command instead of the mode selected by the STIPB switch case. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185006.144047038@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05x86/speculation: Add seccomp Spectre v2 user space protection modeThomas Gleixner3-2/+25
commit 6b3e64c237c072797a9ec918654a60e3a46488e2 upstream If 'prctl' mode of user space protection from spectre v2 is selected on the kernel command-line, STIBP and IBPB are applied on tasks which restrict their indirect branch speculation via prctl. SECCOMP enables the SSBD mitigation for sandboxed tasks already, so it makes sense to prevent spectre v2 user space to user space attacks as well. The Intel mitigation guide documents how STIPB works: Setting bit 1 (STIBP) of the IA32_SPEC_CTRL MSR on a logical processor prevents the predicted targets of indirect branches on any logical processor of that core from being controlled by software that executes (or executed previously) on another logical processor of the same core. Ergo setting STIBP protects the task itself from being attacked from a task running on a different hyper-thread and protects the tasks running on different hyper-threads from being attacked. While the document suggests that the branch predictors are shielded between the logical processors, the observed performance regressions suggest that STIBP simply disables the branch predictor more or less completely. Of course the document wording is vague, but the fact that there is also no requirement for issuing IBPB when STIBP is used points clearly in that direction. The kernel still issues IBPB even when STIBP is used until Intel clarifies the whole mechanism. IBPB is issued when the task switches out, so malicious sandbox code cannot mistrain the branch predictor for the next user space task on the same logical processor. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185006.051663132@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05x86/speculation: Enable prctl mode for spectre_v2_userThomas Gleixner2-10/+38
commit 7cc765a67d8e04ef7d772425ca5a2a1e2b894c15 upstream Now that all prerequisites are in place: - Add the prctl command line option - Default the 'auto' mode to 'prctl' - When SMT state changes, update the static key which controls the conditional STIBP evaluation on context switch. - At init update the static key which controls the conditional IBPB evaluation on context switch. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185005.958421388@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05x86/speculation: Add prctl() control for indirect branch speculationThomas Gleixner6-0/+92
commit 9137bb27e60e554dab694eafa4cca241fa3a694f upstream Add the PR_SPEC_INDIRECT_BRANCH option for the PR_GET_SPECULATION_CTRL and PR_SET_SPECULATION_CTRL prctls to allow fine grained per task control of indirect branch speculation via STIBP and IBPB. Invocations: Check indirect branch speculation status with - prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0); Enable indirect branch speculation with - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0); Disable indirect branch speculation with - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0); Force disable indirect branch speculation with - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0); See Documentation/userspace-api/spec_ctrl.rst. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185005.866780996@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05x86/speculation: Prepare arch_smt_update() for PRCTL modeThomas Gleixner1-21/+25
commit 6893a959d7fdebbab5f5aa112c277d5a44435ba1 upstream The upcoming fine grained per task STIBP control needs to be updated on CPU hotplug as well. Split out the code which controls the strict mode so the prctl control code can be added later. Mark the SMP function call argument __unused while at it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185005.759457117@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05x86/speculation: Prevent stale SPEC_CTRL msr contentThomas Gleixner4-18/+40
commit 6d991ba509ebcfcc908e009d1db51972a4f7a064 upstream The seccomp speculation control operates on all tasks of a process, but only the current task of a process can update the MSR immediately. For the other threads the update is deferred to the next context switch. This creates the following situation with Process A and B: Process A task 2 and Process B task 1 are pinned on CPU1. Process A task 2 does not have the speculation control TIF bit set. Process B task 1 has the speculation control TIF bit set. CPU0 CPU1 MSR bit is set ProcB.T1 schedules out ProcA.T2 schedules in MSR bit is cleared ProcA.T1 seccomp_update() set TIF bit on ProcA.T2 ProcB.T1 schedules in MSR is not updated <-- FAIL This happens because the context switch code tries to avoid the MSR update if the speculation control TIF bits of the incoming and the outgoing task are the same. In the worst case ProcB.T1 and ProcA.T2 are the only tasks scheduling back and forth on CPU1, which keeps the MSR stale forever. In theory this could be remedied by IPIs, but chasing the remote task which could be migrated is complex and full of races. The straight forward solution is to avoid the asychronous update of the TIF bit and defer it to the next context switch. The speculation control state is stored in task_struct::atomic_flags by the prctl and seccomp updates already. Add a new TIF_SPEC_FORCE_UPDATE bit and set this after updating the atomic_flags. Check the bit on context switch and force a synchronous update of the speculation control if set. Use the same mechanism for updating the current task. Reported-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1811272247140.1875@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05x86/speculation: Split out TIF updateThomas Gleixner1-12/+23
commit e6da8bb6f9abb2628381904b24163c770e630bac upstream The update of the TIF_SSBD flag and the conditional speculation control MSR update is done in the ssb_prctl_set() function directly. The upcoming prctl support for controlling indirect branch speculation via STIBP needs the same mechanism. Split the code out and make it reusable. Reword the comment about updates for other tasks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185005.652305076@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRSThomas Gleixner2-27/+0
commit 46f7ecb1e7359f183f5bbd1e08b90e10e52164f9 upstream The IBPB control code in x86 removed the usage. Remove the functionality which was introduced for this. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185005.559149393@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>