Age | Commit message (Collapse) | Author | Files | Lines |
|
It is always the BSP.
No functional changes.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230130161709.11615-2-bp@alien8.de
|
|
Microcode gets reloaded late only if "1" is written to the reload file.
However, the code silently treats any other unsigned integer as a
successful write even though no actions are performed to load microcode.
Make the loader more strict to accept only "1" as a trigger value and
return an error otherwise.
[ bp: Massage commit message. ]
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230130213955.6046-3-ashok.raj@intel.com
|
|
Make early loading message match late loading message and print both old
and new revisions.
This is helpful to know what the BIOS loaded revision is before an early
update.
Cache the early BIOS revision before the microcode update and have
print_ucode_info() print both the old and new revision in the same
format as microcode_reload_late().
[ bp: Massage, remove useless comment. ]
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230120161923.118882-6-ashok.raj@intel.com
|
|
print_ucode_info() takes a struct ucode_cpu_info pointer as parameter.
Its sole purpose is to print the microcode revision.
The only available ucode_cpu_info always describes the currently loaded
microcode revision. After a microcode update is successful, this is the
new revision, or on failure it is the original revision.
In preparation for future changes, replace the struct ucode_cpu_info
pointer parameter with a plain integer which contains the revision
number and adjust the call sites accordingly.
No functional change.
[ bp:
- Fix + cleanup commit message.
- Revert arbitrary, unrelated change.
]
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230120161923.118882-5-ashok.raj@intel.com
|
|
During late microcode loading, the "Reload completed" message is issued
unconditionally, regardless of success or failure.
Adjust the message to report the result of the update.
[ bp: Massage. ]
Fixes: 9bd681251b7c ("x86/microcode: Announce reload operation's completion")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/lkml/874judpqqd.ffs@tglx/
|
|
The kernel caches each CPU's feature bits at boot in an x86_capability[]
structure. However, the capabilities in the BSP's copy can be turned off
as a result of certain command line parameters or configuration
restrictions, for example the SGX bit. This can cause a mismatch when
comparing the values before and after the microcode update.
Another example is X86_FEATURE_SRBDS_CTRL which gets added only after
microcode update:
--- cpuid.before 2023-01-21 14:54:15.652000747 +0100
+++ cpuid.after 2023-01-21 14:54:26.632001024 +0100
@@ -10,7 +10,7 @@ CPU:
0x00000004 0x04: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
0x00000005 0x00: eax=0x00000040 ebx=0x00000040 ecx=0x00000003 edx=0x11142120
0x00000006 0x00: eax=0x000027f7 ebx=0x00000002 ecx=0x00000001 edx=0x00000000
- 0x00000007 0x00: eax=0x00000000 ebx=0x029c6fbf ecx=0x40000000 edx=0xbc002400
+ 0x00000007 0x00: eax=0x00000000 ebx=0x029c6fbf ecx=0x40000000 edx=0xbc002e00
^^^
and which proves for a gazillionth time that late loading is a bad bad
idea.
microcode_check() is called after an update to report any previously
cached CPUID bits which might have changed due to the update.
Therefore, store the cached CPU caps before the update and compare them
with the CPU caps after the microcode update has succeeded.
Thus, the comparison is done between the CPUID *hardware* bits before
and after the upgrade instead of using the cached, possibly runtime
modified values in BSP's boot_cpu_data copy.
As a result, false warnings about CPUID bits changes are avoided.
[ bp:
- Massage.
- Add SRBDS_CTRL example.
- Add kernel-doc.
- Incorporate forgotten review feedback from dhansen.
]
Fixes: 1008c52c09dc ("x86/CPU: Add a microcode loader callback")
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230109153555.4986-3-ashok.raj@intel.com
|
|
Add a parameter to store CPU capabilities before performing a microcode
update so that CPU capabilities can be compared before and after update.
[ bp: Massage. ]
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230109153555.4986-2-ashok.raj@intel.com
|
|
Use DEVICE_ATTR_RO() helper instead of open-coded DEVICE_ATTR(),
which makes the code a bit shorter and easier to read.
No change in functionality.
Signed-off-by: Guangju Wang[baidu] <wgj900@163.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230118023554.1898-1-wgj900@163.com
|
|
It can happen that - especially during testing - the microcode
blobs of all families are all glued together in the initrd. The
current code doesn't check whether the current container matched
a microcode patch and continues to the next one, which leads to
save_microcode_in_initrd_amd() to look at the next and thus wrong one:
microcode: parse_container: ucode: 0xffff88807e9d9082
microcode: verify_patch: buf: 0xffff88807e9d90ce, buf_size: 26428
microcode: verify_patch: proc_id: 0x8082, patch_fam: 0x17, this family: 0x17
microcode: verify_patch: buf: 0xffff88807e9d9d56, buf_size: 23220
microcode: verify_patch: proc_id: 0x8012, patch_fam: 0x17, this family: 0x17
microcode: parse_container: MATCH: eq_id: 0x8012, patch proc_rev_id: 0x8012
<-- matching patch found
microcode: verify_patch: buf: 0xffff88807e9da9de, buf_size: 20012
microcode: verify_patch: proc_id: 0x8310, patch_fam: 0x17, this family: 0x17
microcode: verify_patch: buf: 0xffff88807e9db666, buf_size: 16804
microcode: Invalid type field (0x414d44) in container file section header.
microcode: Patch section fail
<-- checking chokes on the microcode magic value of the next container.
microcode: parse_container: saving container 0xffff88807e9d9082
microcode: save_microcode_in_initrd_amd: scanned containers, data: 0xffff88807e9d9082, size: 9700a
and now if there's a next (and last container) it'll use that in
save_microcode_in_initrd_amd() and not find a proper patch, ofc.
Fix that by moving the out: label up, before the desc->mc check which
jots down the pointer of the matching patch and is used to signal to the
caller that it has found a matching patch in the current container.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20221219210656.5140-2-bp@alien8.de
|
|
- Rename apply_microcode_early_amd() to early_apply_microcode():
simplify the name so that it is clear what it does and when does it do
it.
- Rename __load_ucode_amd() to find_blobs_in_containers(): the new name
actually explains what it does.
Document some.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20221219210656.5140-1-bp@alien8.de
|
|
Convert the final two users of prandom_u32_max() that slipped in during
6.2-rc1 to use get_random_u32_below().
Then, with no more users left, we can finally remove the deprecated
function.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Support zstd-compressed debug info
- Allow W=1 builds to detect objects shared among multiple modules
- Add srcrpm-pkg target to generate a source RPM package
- Make the -s option detection work for future GNU Make versions
- Add -Werror to KBUILD_CPPFLAGS when CONFIG_WERROR=y
- Allow W=1 builds to detect -Wundef warnings in any preprocessed files
- Raise the minimum supported version of binutils to 2.25
- Use $(intcmp ...) to compare integers if GNU Make >= 4.4 is used
- Use $(file ...) to read a file if GNU Make >= 4.2 is used
- Print error if GNU Make older than 3.82 is used
- Allow modpost to detect section mismatches with Clang LTO
- Include vmlinuz.efi into kernel tarballs for arm64 CONFIG_EFI_ZBOOT=y
* tag 'kbuild-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (29 commits)
buildtar: fix tarballs with EFI_ZBOOT enabled
modpost: Include '.text.*' in TEXT_SECTIONS
padata: Mark padata_work_init() as __ref
kbuild: ensure Make >= 3.82 is used
kbuild: refactor the prerequisites of the modpost rule
kbuild: change module.order to list *.o instead of *.ko
kbuild: use .NOTINTERMEDIATE for future GNU Make versions
kconfig: refactor Makefile to reduce process forks
kbuild: add read-file macro
kbuild: do not sort after reading modules.order
kbuild: add test-{ge,gt,le,lt} macros
Documentation: raise minimum supported version of binutils to 2.25
kbuild: add -Wundef to KBUILD_CPPFLAGS for W=1 builds
kbuild: move -Werror from KBUILD_CFLAGS to KBUILD_CPPFLAGS
kbuild: Port silent mode detection to future gnu make.
init/version.c: remove #include <generated/utsrelease.h>
firmware_loader: remove #include <generated/utsrelease.h>
modpost: Mark uuid_le type to be suitable only for MEI
kbuild: add ability to make source rpm buildable using koji
kbuild: warn objects shared among multiple modules
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Add powerpc qspinlock implementation optimised for large system
scalability and paravirt. See the merge message for more details
- Enable objtool to be built on powerpc to generate mcount locations
- Use a temporary mm for code patching with the Radix MMU, so the
writable mapping is restricted to the patching CPU
- Add an option to build the 64-bit big-endian kernel with the ELFv2
ABI
- Sanitise user registers on interrupt entry on 64-bit Book3S
- Many other small features and fixes
Thanks to Aboorva Devarajan, Angel Iglesias, Benjamin Gray, Bjorn
Helgaas, Bo Liu, Chen Lifu, Christoph Hellwig, Christophe JAILLET,
Christophe Leroy, Christopher M. Riedl, Colin Ian King, Deming Wang,
Disha Goel, Dmitry Torokhov, Finn Thain, Geert Uytterhoeven, Gustavo A.
R. Silva, Haowen Bai, Joel Stanley, Jordan Niethe, Julia Lawall, Kajol
Jain, Laurent Dufour, Li zeming, Miaoqian Lin, Michael Jeanson, Nathan
Lynch, Naveen N. Rao, Nayna Jain, Nicholas Miehlbradt, Nicholas Piggin,
Pali Rohár, Randy Dunlap, Rohan McLure, Russell Currey, Sathvika
Vasireddy, Shaomin Deng, Stephen Kitt, Stephen Rothwell, Thomas
Weißschuh, Tiezhu Yang, Uwe Kleine-König, Xie Shaowen, Xiu Jianfeng,
XueBing Chen, Yang Yingliang, Zhang Jiaming, ruanjinjie, Jessica Yu,
and Wolfram Sang.
* tag 'powerpc-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (181 commits)
powerpc/code-patching: Fix oops with DEBUG_VM enabled
powerpc/qspinlock: Fix 32-bit build
powerpc/prom: Fix 32-bit build
powerpc/rtas: mandate RTAS syscall filtering
powerpc/rtas: define pr_fmt and convert printk call sites
powerpc/rtas: clean up includes
powerpc/rtas: clean up rtas_error_log_max initialization
powerpc/pseries/eeh: use correct API for error log size
powerpc/rtas: avoid scheduling in rtas_os_term()
powerpc/rtas: avoid device tree lookups in rtas_os_term()
powerpc/rtasd: use correct OF API for event scan rate
powerpc/rtas: document rtas_call()
powerpc/pseries: unregister VPA when hot unplugging a CPU
powerpc/pseries: reset the RCU watchdogs after a LPM
powerpc: Take in account addition CPU node when building kexec FDT
powerpc: export the CPU node count
powerpc/cpuidle: Set CPUIDLE_FLAG_POLLING for snooze state
powerpc/dts/fsl: Fix pca954x i2c-mux node names
cxl: Remove unnecessary cxl_pci_window_alignment()
selftests/powerpc: Fix resource leaks
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Dave Hansen:
"New Feature:
- Randomize the per-cpu entry areas
Cleanups:
- Have CR3_ADDR_MASK use PHYSICAL_PAGE_MASK instead of open coding it
- Move to "native" set_memory_rox() helper
- Clean up pmd_get_atomic() and i386-PAE
- Remove some unused page table size macros"
* tag 'x86_mm_for_6.2_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
x86/mm: Ensure forced page table splitting
x86/kasan: Populate shadow for shared chunk of the CPU entry area
x86/kasan: Add helpers to align shadow addresses up and down
x86/kasan: Rename local CPU_ENTRY_AREA variables to shorten names
x86/mm: Populate KASAN shadow for entire per-CPU range of CPU entry area
x86/mm: Recompute physical address for every page of per-CPU CEA mapping
x86/mm: Rename __change_page_attr_set_clr(.checkalias)
x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias()
x86/mm: Untangle __change_page_attr_set_clr(.checkalias)
x86/mm: Add a few comments
x86/mm: Fix CR3_ADDR_MASK
x86/mm: Remove P*D_PAGE_MASK and P*D_PAGE_SIZE macros
mm: Convert __HAVE_ARCH_P..P_GET to the new style
mm: Remove pointless barrier() after pmdp_get_lockless()
x86/mm/pae: Get rid of set_64bit()
x86_64: Remove pointless set_64bit() usage
x86/mm/pae: Be consistent with pXXp_get_and_clear()
x86/mm/pae: Use WRITE_ONCE()
x86/mm/pae: Don't (ab)use atomic64
mm/gup: Fix the lockless PMD access
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core and kernfs changes for 6.2-rc1.
The "big" change in here is the addition of a new macro,
container_of_const() that will preserve the "const-ness" of a pointer
passed into it.
The "problem" of the current container_of() macro is that if you pass
in a "const *", out of it can comes a non-const pointer unless you
specifically ask for it. For many usages, we want to preserve the
"const" attribute by using the same call. For a specific example, this
series changes the kobj_to_dev() macro to use it, allowing it to be
used no matter what the const value is. This prevents every subsystem
from having to declare 2 different individual macros (i.e.
kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce
the const value at build time, which having 2 macros would not do
either.
The driver for all of this have been discussions with the Rust kernel
developers as to how to properly mark driver core, and kobject,
objects as being "non-mutable". The changes to the kobject and driver
core in this pull request are the result of that, as there are lots of
paths where kobjects and device pointers are not modified at all, so
marking them as "const" allows the compiler to enforce this.
So, a nice side affect of the Rust development effort has been already
to clean up the driver core code to be more obvious about object
rules.
All of this has been bike-shedded in quite a lot of detail on lkml
with different names and implementations resulting in the tiny version
we have in here, much better than my original proposal. Lots of
subsystem maintainers have acked the changes as well.
Other than this change, included in here are smaller stuff like:
- kernfs fixes and updates to handle lock contention better
- vmlinux.lds.h fixes and updates
- sysfs and debugfs documentation updates
- device property updates
All of these have been in the linux-next tree for quite a while with
no problems"
* tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (58 commits)
device property: Fix documentation for fwnode_get_next_parent()
firmware_loader: fix up to_fw_sysfs() to preserve const
usb.h: take advantage of container_of_const()
device.h: move kobj_to_dev() to use container_of_const()
container_of: add container_of_const() that preserves const-ness of the pointer
driver core: fix up missed drivers/s390/char/hmcdrv_dev.c class.devnode() conversion.
driver core: fix up missed scsi/cxlflash class.devnode() conversion.
driver core: fix up some missing class.devnode() conversions.
driver core: make struct class.devnode() take a const *
driver core: make struct class.dev_uevent() take a const *
cacheinfo: Remove of_node_put() for fw_token
device property: Add a blank line in Kconfig of tests
device property: Rename goto label to be more precise
device property: Move PROPERTY_ENTRY_BOOL() a bit down
device property: Get rid of __PROPERTY_ENTRY_ARRAY_EL*SIZE*()
kernfs: fix all kernel-doc warnings and multiple typos
driver core: pass a const * into of_device_uevent()
kobject: kset_uevent_ops: make name() callback take a const *
kobject: kset_uevent_ops: make filter() callback take a const *
kobject: make kobject_namespace take a const *
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
- Add options to the osnoise tracer:
- 'panic_on_stop' option that panics the kernel if osnoise is
greater than some user defined threshold.
- 'preempt' option, to test noise while preemption is disabled
- 'irq' option, to test noise when interrupts are disabled
- Add .percent and .graph suffix to histograms to give different
outputs
- Add nohitcount to disable showing hitcount in histogram output
- Add new __cpumask() to trace event fields to annotate that a unsigned
long array is a cpumask to user space and should be treated as one.
- Add trace_trigger kernel command line parameter to enable trace event
triggers at boot up. Useful to trace stack traces, disable tracing
and take snapshots.
- Fix x86/kmmio mmio tracer to work with the updates to lockdep
- Unify the panic and die notifiers
- Add back ftrace_expect reference that is used to extract more
information in the ftrace_bug() code.
- Have trigger filter parsing errors show up in the tracing error log.
- Updated MAINTAINERS file to add kernel tracing mailing list and
patchwork info
- Use IDA to keep track of event type numbers.
- And minor fixes and clean ups
* tag 'trace-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (44 commits)
tracing: Fix cpumask() example typo
tracing: Improve panic/die notifiers
ftrace: Prevent RCU stall on PREEMPT_VOLUNTARY kernels
tracing: Do not synchronize freeing of trigger filter on boot up
tracing: Remove pointer (asterisk) and brackets from cpumask_t field
tracing: Have trigger filter parsing errors show up in error_log
x86/mm/kmmio: Remove redundant preempt_disable()
tracing: Fix infinite loop in tracing_read_pipe on overflowed print_trace_line
Documentation/osnoise: Add osnoise/options documentation
tracing/osnoise: Add preempt and/or irq disabled options
tracing/osnoise: Add PANIC_ON_STOP option
Documentation/osnoise: Escape underscore of NO_ prefix
tracing: Fix some checker warnings
tracing/osnoise: Make osnoise_options static
tracing: remove unnecessary trace_trigger ifdef
ring-buffer: Handle resize in early boot up
tracing/hist: Fix issue of losting command info in error_log
tracing: Fix issue of missing one synthetic field
tracing/hist: Fix out-of-bound write on 'action_data.var_ref_idx'
tracing/hist: Fix wrong return value in parse_action_params()
...
|
|
Pull kvm updates from Paolo Bonzini:
"ARM64:
- Enable the per-vcpu dirty-ring tracking mechanism, together with an
option to keep the good old dirty log around for pages that are
dirtied by something other than a vcpu.
- Switch to the relaxed parallel fault handling, using RCU to delay
page table reclaim and giving better performance under load.
- Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
option, which multi-process VMMs such as crosvm rely on (see merge
commit 382b5b87a97d: "Fix a number of issues with MTE, such as
races on the tags being initialised vs the PG_mte_tagged flag as
well as the lack of support for VM_SHARED when KVM is involved.
Patches from Catalin Marinas and Peter Collingbourne").
- Merge the pKVM shadow vcpu state tracking that allows the
hypervisor to have its own view of a vcpu, keeping that state
private.
- Add support for the PMUv3p5 architecture revision, bringing support
for 64bit counters on systems that support it, and fix the
no-quite-compliant CHAIN-ed counter support for the machines that
actually exist out there.
- Fix a handful of minor issues around 52bit VA/PA support (64kB
pages only) as a prefix of the oncoming support for 4kB and 16kB
pages.
- Pick a small set of documentation and spelling fixes, because no
good merge window would be complete without those.
s390:
- Second batch of the lazy destroy patches
- First batch of KVM changes for kernel virtual != physical address
support
- Removal of a unused function
x86:
- Allow compiling out SMM support
- Cleanup and documentation of SMM state save area format
- Preserve interrupt shadow in SMM state save area
- Respond to generic signals during slow page faults
- Fixes and optimizations for the non-executable huge page errata
fix.
- Reprogram all performance counters on PMU filter change
- Cleanups to Hyper-V emulation and tests
- Process Hyper-V TLB flushes from a nested guest (i.e. from a L2
guest running on top of a L1 Hyper-V hypervisor)
- Advertise several new Intel features
- x86 Xen-for-KVM:
- Allow the Xen runstate information to cross a page boundary
- Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
- Add support for 32-bit guests in SCHEDOP_poll
- Notable x86 fixes and cleanups:
- One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
- Reinstate IBPB on emulated VM-Exit that was incorrectly dropped
a few years back when eliminating unnecessary barriers when
switching between vmcs01 and vmcs02.
- Clean up vmread_error_trampoline() to make it more obvious that
params must be passed on the stack, even for x86-64.
- Let userspace set all supported bits in MSR_IA32_FEAT_CTL
irrespective of the current guest CPUID.
- Fudge around a race with TSC refinement that results in KVM
incorrectly thinking a guest needs TSC scaling when running on a
CPU with a constant TSC, but no hardware-enumerated TSC
frequency.
- Advertise (on AMD) that the SMM_CTL MSR is not supported
- Remove unnecessary exports
Generic:
- Support for responding to signals during page faults; introduces
new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
Selftests:
- Fix an inverted check in the access tracking perf test, and restore
support for asserting that there aren't too many idle pages when
running on bare metal.
- Fix build errors that occur in certain setups (unsure exactly what
is unique about the problematic setup) due to glibc overriding
static_assert() to a variant that requires a custom message.
- Introduce actual atomics for clear/set_bit() in selftests
- Add support for pinning vCPUs in dirty_log_perf_test.
- Rename the so called "perf_util" framework to "memstress".
- Add a lightweight psuedo RNG for guest use, and use it to randomize
the access pattern and write vs. read percentage in the memstress
tests.
- Add a common ucall implementation; code dedup and pre-work for
running SEV (and beyond) guests in selftests.
- Provide a common constructor and arch hook, which will eventually
be used by x86 to automatically select the right hypercall (AMD vs.
Intel).
- A bunch of added/enabled/fixed selftests for ARM64, covering
memslots, breakpoints, stage-2 faults and access tracking.
- x86-specific selftest changes:
- Clean up x86's page table management.
- Clean up and enhance the "smaller maxphyaddr" test, and add a
related test to cover generic emulation failure.
- Clean up the nEPT support checks.
- Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
- Fix an ordering issue in the AMX test introduced by recent
conversions to use kvm_cpu_has(), and harden the code to guard
against similar bugs in the future. Anything that tiggers
caching of KVM's supported CPUID, kvm_cpu_has() in this case,
effectively hides opt-in XSAVE features if the caching occurs
before the test opts in via prctl().
Documentation:
- Remove deleted ioctls from documentation
- Clean up the docs for the x86 MSR filter.
- Various fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits)
KVM: x86: Add proper ReST tables for userspace MSR exits/flags
KVM: selftests: Allocate ucall pool from MEM_REGION_DATA
KVM: arm64: selftests: Align VA space allocator with TTBR0
KVM: arm64: Fix benign bug with incorrect use of VA_BITS
KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow
KVM: x86: Advertise that the SMM_CTL MSR is not supported
KVM: x86: remove unnecessary exports
KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic"
tools: KVM: selftests: Convert clear/set_bit() to actual atomics
tools: Drop "atomic_" prefix from atomic test_and_set_bit()
tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers
KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests
perf tools: Use dedicated non-atomic clear/set bit helpers
tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers
KVM: arm64: selftests: Enable single-step without a "full" ucall()
KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself
KVM: Remove stale comment about KVM_REQ_UNHALT
KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR
KVM: Reference to kvm_userspace_memory_region in doc and comments
KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
...
|
|
There are a few kernel users like kfence that require 4k pages to work
correctly and do not support large mappings. They use set_memory_4k()
to break down those large mappings.
That, in turn relies on cpa_data->force_split option to indicate to
set_memory code that it should split page tables regardless of whether
the need to be.
But, a recent change added an optimization which would return early
if a set_memory request came in that did not change permissions. It
did not consult ->force_split and would mistakenly optimize away the
splitting that set_memory_4k() needs. This broke kfence.
Skip the same-permission optimization when ->force_split is set.
Fixes: 127960a05548 ("x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias()")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Marco Elver <elver@google.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/CA+G9fYuFxZTxkeS35VTZMXwQvohu73W3xbZ5NtjebsVvH6hCuA@mail.gmail.com/
|
|
Popuplate the shadow for the shared portion of the CPU entry area, i.e.
the read-only IDT mapping, during KASAN initialization. A recent change
modified KASAN to map the per-CPU areas on-demand, but forgot to keep a
shadow for the common area that is shared amongst all CPUs.
Map the common area in KASAN init instead of letting idt_map_in_cea() do
the dirty work so that it Just Works in the unlikely event more shared
data is shoved into the CPU entry area.
The bug manifests as a not-present #PF when software attempts to lookup
an IDT entry, e.g. when KVM is handling IRQs on Intel CPUs (KVM performs
direct CALL to the IRQ handler to avoid the overhead of INTn):
BUG: unable to handle page fault for address: fffffbc0000001d8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 16c03a067 P4D 16c03a067 PUD 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 5 PID: 901 Comm: repro Tainted: G W 6.1.0-rc3+ #410
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:kasan_check_range+0xdf/0x190
vmx_handle_exit_irqoff+0x152/0x290 [kvm_intel]
vcpu_run+0x1d89/0x2bd0 [kvm]
kvm_arch_vcpu_ioctl_run+0x3ce/0xa70 [kvm]
kvm_vcpu_ioctl+0x349/0x900 [kvm]
__x64_sys_ioctl+0xb8/0xf0
do_syscall_64+0x2b/0x50
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
Reported-by: syzbot+8cdd16fd5a6c0565e227@syzkaller.appspotmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221110203504.1985010-6-seanjc@google.com
|
|
Add helpers to dedup code for aligning shadow address up/down to page
boundaries when translating an address to its shadow.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Link: https://lkml.kernel.org/r/20221110203504.1985010-5-seanjc@google.com
|
|
Rename the CPU entry area variables in kasan_init() to shorten their
names, a future fix will reference the beginning of the per-CPU portion
of the CPU entry area, and shadow_cpu_entry_per_cpu_begin is a bit much.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Link: https://lkml.kernel.org/r/20221110203504.1985010-4-seanjc@google.com
|
|
Populate a KASAN shadow for the entire possible per-CPU range of the CPU
entry area instead of requiring that each individual chunk map a shadow.
Mapping shadows individually is error prone, e.g. the per-CPU GDT mapping
was left behind, which can lead to not-present page faults during KASAN
validation if the kernel performs a software lookup into the GDT. The DS
buffer is also likely affected.
The motivation for mapping the per-CPU areas on-demand was to avoid
mapping the entire 512GiB range that's reserved for the CPU entry area,
shaving a few bytes by not creating shadows for potentially unused memory
was not a goal.
The bug is most easily reproduced by doing a sigreturn with a garbage
CS in the sigcontext, e.g.
int main(void)
{
struct sigcontext regs;
syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
memset(®s, 0, sizeof(regs));
regs.cs = 0x1d0;
syscall(__NR_rt_sigreturn);
return 0;
}
to coerce the kernel into doing a GDT lookup to compute CS.base when
reading the instruction bytes on the subsequent #GP to determine whether
or not the #GP is something the kernel should handle, e.g. to fixup UMIP
violations or to emulate CLI/STI for IOPL=3 applications.
BUG: unable to handle page fault for address: fffffbc8379ace00
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 16c03a067 P4D 16c03a067 PUD 15b990067 PMD 15b98f067 PTE 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 3 PID: 851 Comm: r2 Not tainted 6.1.0-rc3-next-20221103+ #432
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:kasan_check_range+0xdf/0x190
Call Trace:
<TASK>
get_desc+0xb0/0x1d0
insn_get_seg_base+0x104/0x270
insn_fetch_from_user+0x66/0x80
fixup_umip_exception+0xb1/0x530
exc_general_protection+0x181/0x210
asm_exc_general_protection+0x22/0x30
RIP: 0003:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0003:0000000000000000 EFLAGS: 00000202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000001d0
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
Reported-by: syzbot+ffb4f000dc2872c93f62@syzkaller.appspotmail.com
Suggested-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Link: https://lkml.kernel.org/r/20221110203504.1985010-3-seanjc@google.com
|
|
Recompute the physical address for each per-CPU page in the CPU entry
area, a recent commit inadvertantly modified cea_map_percpu_pages() such
that every PTE is mapped to the physical address of the first page.
Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Link: https://lkml.kernel.org/r/20221110203504.1985010-2-seanjc@google.com
|
|
Now that the checkalias functionality is taken by CPA_NO_CHECK_ALIAS
rename the argument to better match is remaining purpose: primary,
matching __change_page_attr().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221110125544.661001508%40infradead.org
|
|
There is a cludge in change_page_attr_set_clr() that inhibits
propagating NX changes to the aliases (directmap and highmap) -- this
is a cludge twofold:
- it also inhibits the primary checks in __change_page_attr();
- it hard depends on single bit changes.
The introduction of set_memory_rox() triggered this last issue for
clearing both _PAGE_RW and _PAGE_NX.
Explicitly ignore _PAGE_NX in cpa_process_alias() instead.
Fixes: b38994948567 ("x86/mm: Implement native set_memory_rox()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Debugged-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221110125544.594991716%40infradead.org
|
|
The .checkalias argument to __change_page_attr_set_clr() is overloaded
and serves two different purposes:
- it inhibits the call to cpa_process_alias() -- as suggested by the
name; however,
- it also serves as 'primary' indicator for __change_page_attr()
( which in turn also serves as a recursion terminator for
cpa_process_alias() ).
Untangle these by extending the use of CPA_NO_CHECK_ALIAS to all
callsites that currently use .checkalias=0 for this purpose.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221110125544.527267183%40infradead.org
|
|
It's a shame to hide useful comments in Changelogs, add some to the
code.
Shamelessly stolen from commit:
c40a56a7818c ("x86/mm/init: Remove freed kernel image areas from alias mapping")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221110125544.460677011%40infradead.org
|
|
The mask must not include bits above physical address mask. These bits
are reserved and can be used for other things. Bits 61 and 62 are used
for Linear Address Masking.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Alexander Potapenko <glider@google.com>
Link: https://lore.kernel.org/all/20221109165140.9137-2-kirill.shutemov%40linux.intel.com
|
|
Other architectures and the common mm/ use P*D_MASK, and P*D_SIZE.
Remove the duplicated P*D_PAGE_MASK and P*D_PAGE_SIZE which are only
used in x86/*.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/20220516185202.604654-1-tatashin@google.com
|
|
Recognise that set_64bit() is a special case of our previously
introduced pxx_xchg64(), so use that and get rid of set_64bit().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221022114425.233481884%40infradead.org
|
|
The use of set_64bit() in X86_64 only code is pretty pointless, seeing
how it's a direct assignment. Remove all this nonsense.
[nathanchance: unbreak irte]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221022114425.168036718%40infradead.org
|
|
Given that ptep_get_and_clear() uses cmpxchg8b, and that should be by
far the most common case, there's no point in having an optimized
variant for pmd/pud.
Introduce the pxx_xchg64() helper to implement the common logic once.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221022114425.103392961%40infradead.org
|
|
Disallow write-tearing, that would be really unfortunate.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221022114425.038102604%40infradead.org
|
|
PAE implies CX8, write readable code.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221022114424.971450128%40infradead.org
|
|
Since it no longer applies to only PTEs, rename it to PXX.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221022114424.776404066%40infradead.org
|
|
AFAICT there's no reason to do anything different than what we do for
PTEs. Make it so (also affects SH).
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221022114424.711181252%40infradead.org
|
|
Instead of mucking about with at least 2 different ways of fudging
it, do the same thing we do for pte_t.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221022114424.580310787%40infradead.org
|
|
Provide a native implementation of set_memory_rox(), avoiding the
double set_memory_ro();set_memory_x(); calls.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
Because endlessly repeating:
set_memory_ro()
set_memory_x()
is getting tedious.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Y1jek64pXOsougmz@hirez.programming.kicks-ass.net
|
|
Straight up revert of commit:
a970174d7a10 ("x86/mm: Do not verify W^X at boot up")
now that the root cause has been fixed.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221025201058.011279208@infradead.org
|
|
Now that text_poke is available before ftrace, remove the
SYSTEM_BOOTING exceptions.
Specifically, this cures a W+X case during boot.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221025201057.945960823@infradead.org
|
|
Instead of duplicating init_mm, allocate a fresh mm. The advantage is
that mm_alloc() has much simpler dependencies. Additionally it makes
more conceptual sense, init_mm has no (and must not have) user state
to duplicate.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221025201057.816175235@infradead.org
|
|
Seth found that the CPU-entry-area; the piece of per-cpu data that is
mapped into the userspace page-tables for kPTI is not subject to any
randomization -- irrespective of kASLR settings.
On x86_64 a whole P4D (512 GB) of virtual address space is reserved for
this structure, which is plenty large enough to randomize things a
little.
As such, use a straight forward randomization scheme that avoids
duplicates to spread the existing CPUs over the available space.
[ bp: Fix le build. ]
Reported-by: Seth Jenkins <sethjenkins@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
KASAN maps shadow for the entire CPU-entry-area:
[CPU_ENTRY_AREA_BASE, CPU_ENTRY_AREA_BASE + CPU_ENTRY_AREA_MAP_SIZE]
This will explode once the per-cpu entry areas are randomized since it
will increase CPU_ENTRY_AREA_MAP_SIZE to 512 GB and KASAN fails to
allocate shadow for such big area.
Fix this by allocating KASAN shadow only for really used cpu entry area
addresses mapped by cea_map_percpu_pages()
Thanks to the 0day folks for finding and reporting this to be an issue.
[ dhansen: tweak changelog since this will get committed before peterz's
actual cpu-entry-area randomization ]
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Cc: kernel test robot <yujie.liu@intel.com>
Link: https://lore.kernel.org/r/202210241508.2e203c3d-yujie.liu@intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio updates from Bartosz Golaszewski:
"We have a new GPIO multiplexer driver, bunch of driver updates and
refactoring in the core GPIO library.
GPIO core:
- teach gpiolib to work with software nodes for HW description
- remove ARCH_NR_GPIOS treewide as we no longer impose any limit on
the number of GPIOS since the allocation became entirely dynamic
- add support for HW quirks for Cirrus CS42L56 codec, Marvell NFC
controller, Freescale PCIe and Ethernet controller, Himax LCDs and
Mediatek mt2701
- refactor OF quirk code
- some general refactoring of the OF and ACPI code, adding new
helpers, minor tweaks and fixes, making fwnode usage consistent
etc.
GPIO uAPI:
- fix an issue where the user-space can trigger a NULL-pointer
dereference in the kernel by opening a device file, forcing a
driver unbind and then calling one of the syscalls on the
associated file descriptor
New drivers:
- add gpio-latch: a new GPIO multiplexer based on latches connected
to other GPIOs
Driver updates:
- convert i2c GPIO expanders to using .probe_new()
- drop the gpio-sta2x11 driver
- factor out common code for the ACCES IDIO-16 family of controllers
and use this new library wherever applicable in drivers
- add DT support to gpio-hisi
- allow building gpio-davinci as a module and increase its maxItems
property
- add support for a new model to gpio-pca9570
- other minor changes to various drivers"
* tag 'gpio-updates-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (66 commits)
gpio: sim: set a limit on the number of GPIOs
gpiolib: protect the GPIO device against being dropped while in use by user-space
gpiolib: cdev: fix NULL-pointer dereferences
gpiolib: Provide to_gpio_device() helper
gpiolib: Unify access to the device properties
gpio: Do not include <linux/kernel.h> when not really needed.
gpio: pcf857x: Convert to i2c's .probe_new()
gpio: pca953x: Convert to i2c's .probe_new()
gpio: max732x: Convert to i2c's .probe_new()
dt-bindings: gpio: gpio-davinci: Increase maxItems in gpio-line-names
gpiolib: ensure that fwnode is properly set
gpio: sl28cpld: Replace irqchip mask_invert with unmask_base
gpiolib: of: Use correct fwnode for DT-probed chips
gpiolib: of: Drop redundant check in of_mm_gpiochip_remove()
gpiolib: of: Prepare of_mm_gpiochip_add_data() for fwnode
gpiolib: add support for software nodes
gpiolib: consolidate GPIO lookups
gpiolib: acpi: avoid leaking ACPI details into upper gpiolib layers
gpiolib: acpi: teach acpi_find_gpio() to handle data-only nodes
gpiolib: acpi: change acpi_find_gpio() to accept firmware node
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Borislav Petkov:
- Add the call depth tracking mitigation for Retbleed which has been
long in the making. It is a lighterweight software-only fix for
Skylake-based cores where enabling IBRS is a big hammer and causes a
significant performance impact.
What it basically does is, it aligns all kernel functions to 16 bytes
boundary and adds a 16-byte padding before the function, objtool
collects all functions' locations and when the mitigation gets
applied, it patches a call accounting thunk which is used to track
the call depth of the stack at any time.
When that call depth reaches a magical, microarchitecture-specific
value for the Return Stack Buffer, the code stuffs that RSB and
avoids its underflow which could otherwise lead to the Intel variant
of Retbleed.
This software-only solution brings a lot of the lost performance
back, as benchmarks suggest:
https://lore.kernel.org/all/20220915111039.092790446@infradead.org/
That page above also contains a lot more detailed explanation of the
whole mechanism
- Implement a new control flow integrity scheme called FineIBT which is
based on the software kCFI implementation and uses hardware IBT
support where present to annotate and track indirect branches using a
hash to validate them
- Other misc fixes and cleanups
* tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits)
x86/paravirt: Use common macro for creating simple asm paravirt functions
x86/paravirt: Remove clobber bitmask from .parainstructions
x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al
x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit
x86/Kconfig: Enable kernel IBT by default
x86,pm: Force out-of-line memcpy()
objtool: Fix weak hole vs prefix symbol
objtool: Optimize elf_dirty_reloc_sym()
x86/cfi: Add boot time hash randomization
x86/cfi: Boot time selection of CFI scheme
x86/ibt: Implement FineIBT
objtool: Add --cfi to generate the .cfi_sites section
x86: Add prefix symbols for function padding
objtool: Add option to generate prefix symbols
objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
objtool: Slice up elf_create_section_symbol()
kallsyms: Revert "Take callthunks into account"
x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces
x86/retpoline: Fix crash printing warning
x86/paravirt: Fix a !PARAVIRT build warning
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Optimise away self-test overhead when they are disabled
- Support symmetric encryption via keyring keys in af_alg
- Flip hwrng default_quality, the default is now maximum entropy
Algorithms:
- Add library version of aesgcm
- CFI fixes for assembly code
- Add arm/arm64 accelerated versions of sm3/sm4
Drivers:
- Remove assumption on arm64 that kmalloc is DMA-aligned
- Fix selftest failures in rockchip
- Add support for RK3328/RK3399 in rockchip
- Add deflate support in qat
- Merge ux500 into stm32
- Add support for TEE for PCI ID 0x14CA in ccp
- Add mt7986 support in mtk
- Add MaxLinear platform support in inside-secure
- Add NPCM8XX support in npcm"
* tag 'v6.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (184 commits)
crypto: ux500/cryp - delete driver
crypto: stm32/cryp - enable for use with Ux500
crypto: stm32 - enable drivers to be used on Ux500
dt-bindings: crypto: Let STM32 define Ux500 CRYP
hwrng: geode - Fix PCI device refcount leak
hwrng: amd - Fix PCI device refcount leak
crypto: qce - Set DMA alignment explicitly
crypto: octeontx2 - Set DMA alignment explicitly
crypto: octeontx - Set DMA alignment explicitly
crypto: keembay - Set DMA alignment explicitly
crypto: safexcel - Set DMA alignment explicitly
crypto: hisilicon/hpre - Set DMA alignment explicitly
crypto: chelsio - Set DMA alignment explicitly
crypto: ccree - Set DMA alignment explicitly
crypto: ccp - Set DMA alignment explicitly
crypto: cavium - Set DMA alignment explicitly
crypto: img-hash - Fix variable dereferenced before check 'hdev->req'
crypto: arm64/ghash-ce - use frame_push/pop macros consistently
crypto: arm64/crct10dif - use frame_push/pop macros consistently
crypto: arm64/aes-modes - use frame_push/pop macros consistently
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kernel hardening updates from Kees Cook:
- Convert flexible array members, fix -Wstringop-overflow warnings, and
fix KCFI function type mismatches that went ignored by maintainers
(Gustavo A. R. Silva, Nathan Chancellor, Kees Cook)
- Remove the remaining side-effect users of ksize() by converting
dma-buf, btrfs, and coredump to using kmalloc_size_roundup(), add
more __alloc_size attributes, and introduce full testing of all
allocator functions. Finally remove the ksize() side-effect so that
each allocation-aware checker can finally behave without exceptions
- Introduce oops_limit (default 10,000) and warn_limit (default off) to
provide greater granularity of control for panic_on_oops and
panic_on_warn (Jann Horn, Kees Cook)
- Introduce overflows_type() and castable_to_type() helpers for cleaner
overflow checking
- Improve code generation for strscpy() and update str*() kern-doc
- Convert strscpy and sigphash tests to KUnit, and expand memcpy tests
- Always use a non-NULL argument for prepare_kernel_cred()
- Disable structleak plugin in FORTIFY KUnit test (Anders Roxell)
- Adjust orphan linker section checking to respect CONFIG_WERROR (Xin
Li)
- Make sure siginfo is cleared for forced SIGKILL (haifeng.xu)
- Fix um vs FORTIFY warnings for always-NULL arguments
* tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (31 commits)
ksmbd: replace one-element arrays with flexible-array members
hpet: Replace one-element array with flexible-array member
um: virt-pci: Avoid GCC non-NULL warning
signal: Initialize the info in ksignal
lib: fortify_kunit: build without structleak plugin
panic: Expose "warn_count" to sysfs
panic: Introduce warn_limit
panic: Consolidate open-coded panic_on_warn checks
exit: Allow oops_limit to be disabled
exit: Expose "oops_count" to sysfs
exit: Put an upper limit on how often we can oops
panic: Separate sysctl logic from CONFIG_SMP
mm/pgtable: Fix multiple -Wstringop-overflow warnings
mm: Make ksize() a reporting-only function
kunit/fortify: Validate __alloc_size attribute results
drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid()
drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid()
driver core: Add __alloc_size hint to devm allocators
overflow: Introduce overflows_type() and castable_to_type()
coredump: Proactively round up to kmalloc bucket size
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Squash portdrv_{core,pci}.c into portdrv.c to ease maintenance and
make more things static.
- Make portdrv bind to Switch Ports that have AER. Previously, if
these Ports lacked MSI/MSI-X, portdrv failed to bind, which meant
the Ports couldn't be suspended to low-power states. AER on these
Ports doesn't use interrupts, and the AER driver doesn't need to
claim them.
- Assign PCI domain IDs using ida_alloc(), which makes host bridge
add/remove work better.
Resource management:
- To work better with recent BIOSes that use EfiMemoryMappedIO for
PCI host bridge apertures, remove those regions from the E820 map
(E820 entries normally prevent us from allocating BARs). In v5.19,
we added some quirks to disable E820 checking, but that's not very
maintainable. EfiMemoryMappedIO means the OS needs to map the
region for use by EFI runtime services; it shouldn't prevent OS
from using it.
PCIe native device hotplug:
- Build pciehp by default if USB4 is enabled, since Thunderbolt/USB4
PCIe tunneling depends on native PCIe hotplug.
- Enable Command Completed Interrupt only if supported to avoid user
confusion from lspci output that says this is enabled but not
supported.
- Prevent pciehp from binding to Switch Upstream Ports; this happened
because of interaction with acpiphp and caused devices below the
Upstream Port to disappear.
Power management:
- Convert AGP drivers to generic power management. We hope to remove
legacy power management from the PCI core eventually.
Virtualization:
- Fix pci_device_is_present(), which previously always returned
"false" for VFs, causing virtio hangs when unbinding the driver.
Miscellaneous:
- Convert drivers to gpiod API to prepare for dropping some legacy
code.
- Fix DOE fencepost error for the maximum data object length.
Baikal-T1 PCIe controller driver:
- Add driver and DT bindings.
Broadcom STB PCIe controller driver:
- Enable Multi-MSI.
- Delay 100ms after PERST# deassert to allow power and clocks to
stabilize.
- Configure Read Completion Boundary to 64 bytes.
Freescale i.MX6 PCIe controller driver:
- Initialize PHY before deasserting core reset to fix a regression in
v6.0 on boards where the PHY provides the reference.
- Fix imx6sx and imx8mq clock names in DT schema.
Intel VMD host bridge driver:
- Fix Secondary Bus Reset on VMD bridges, which allows reset of NVMe
SSDs in VT-d pass-through scenarios.
- Disable MSI remapping, which gets re-enabled by firmware during
suspend/resume.
MediaTek PCIe Gen3 controller driver:
- Add MT7986 and MT8195 support.
Qualcomm PCIe controller driver:
- Add SC8280XP/SA8540P basic interconnect support.
Rockchip DesignWare PCIe controller driver:
- Base DT schema on common Synopsys schema.
Synopsys DesignWare PCIe core:
- Collect DT items shared between Root Port and Endpoint (PERST GPIO,
PHY info, clocks, resets, link speed, number of lanes, number of
iATU windows, interrupt info, etc) to snps,dw-pcie-common.yaml.
- Add dma-ranges support for Root Ports and Endpoints.
- Consolidate DT resource retrieval for "dbi", "dbi2", "atu", etc. to
reduce code duplication.
- Add generic names for clocks and resets to encourage more
consistent naming across drivers using DesignWare IP.
- Stop advertising PTM Responder role for Endpoints, which aren't
allowed to be responders.
TI J721E PCIe driver:
- Add j721s2 host mode ID to DT schema.
- Add interrupt properties to DT schema.
Toshiba Visconti PCIe controller driver:
- Fix interrupts array max constraints in DT schema"
* tag 'pci-v6.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (95 commits)
x86/PCI: Use pr_info() when possible
x86/PCI: Fix log message typo
x86/PCI: Tidy E820 removal messages
PCI: Skip allocate_resource() if too little space available
efi/x86: Remove EfiMemoryMappedIO from E820 map
PCI/portdrv: Allow AER service only for Root Ports & RCECs
PCI: xilinx-nwl: Fix coding style violations
PCI: mvebu: Switch to using gpiod API
PCI: pciehp: Enable Command Completed Interrupt only if supported
PCI: aardvark: Switch to using devm_gpiod_get_optional()
dt-bindings: PCI: mediatek-gen3: add support for mt7986
dt-bindings: PCI: mediatek-gen3: add SoC based clock config
dt-bindings: PCI: qcom: Allow 'dma-coherent' property
PCI: mt7621: Add sentinel to quirks table
PCI: vmd: Fix secondary bus reset for Intel bridges
PCI: endpoint: pci-epf-vntb: Fix sparse ntb->reg build warning
PCI: endpoint: pci-epf-vntb: Fix sparse build warning for epf_db
PCI: endpoint: pci-epf-vntb: Replace hardcoded 4 with sizeof(u32)
PCI: endpoint: pci-epf-vntb: Remove unused epf_db_phy struct member
PCI: endpoint: pci-epf-vntb: Fix call pci_epc_mem_free_addr() in error path
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- More userfaultfs work from Peter Xu
- Several convert-to-folios series from Sidhartha Kumar and Huang Ying
- Some filemap cleanups from Vishal Moola
- David Hildenbrand added the ability to selftest anon memory COW
handling
- Some cpuset simplifications from Liu Shixin
- Addition of vmalloc tracing support by Uladzislau Rezki
- Some pagecache folioifications and simplifications from Matthew
Wilcox
- A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use
it
- Miguel Ojeda contributed some cleanups for our use of the
__no_sanitize_thread__ gcc keyword.
This series should have been in the non-MM tree, my bad
- Naoya Horiguchi improved the interaction between memory poisoning and
memory section removal for huge pages
- DAMON cleanups and tuneups from SeongJae Park
- Tony Luck fixed the handling of COW faults against poisoned pages
- Peter Xu utilized the PTE marker code for handling swapin errors
- Hugh Dickins reworked compound page mapcount handling, simplifying it
and making it more efficient
- Removal of the autonuma savedwrite infrastructure from Nadav Amit and
David Hildenbrand
- zram support for multiple compression streams from Sergey Senozhatsky
- David Hildenbrand reworked the GUP code's R/O long-term pinning so
that drivers no longer need to use the FOLL_FORCE workaround which
didn't work very well anyway
- Mel Gorman altered the page allocator so that local IRQs can remnain
enabled during per-cpu page allocations
- Vishal Moola removed the try_to_release_page() wrapper
- Stefan Roesch added some per-BDI sysfs tunables which are used to
prevent network block devices from dirtying excessive amounts of
pagecache
- David Hildenbrand did some cleanup and repair work on KSM COW
breaking
- Nhat Pham and Johannes Weiner have implemented writeback in zswap's
zsmalloc backend
- Brian Foster has fixed a longstanding corner-case oddity in
file[map]_write_and_wait_range()
- sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
Chen
- Shiyang Ruan has done some work on fsdax, to make its reflink mode
work better under xfstests. Better, but still not perfect
- Christoph Hellwig has removed the .writepage() method from several
filesystems. They only need .writepages()
- Yosry Ahmed wrote a series which fixes the memcg reclaim target
beancounting
- David Hildenbrand has fixed some of our MM selftests for 32-bit
machines
- Many singleton patches, as usual
* tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (313 commits)
mm/hugetlb: set head flag before setting compound_order in __prep_compound_gigantic_folio
mm: mmu_gather: allow more than one batch of delayed rmaps
mm: fix typo in struct pglist_data code comment
kmsan: fix memcpy tests
mm: add cond_resched() in swapin_walk_pmd_entry()
mm: do not show fs mm pc for VM_LOCKONFAULT pages
selftests/vm: ksm_functional_tests: fixes for 32bit
selftests/vm: cow: fix compile warning on 32bit
selftests/vm: madv_populate: fix missing MADV_POPULATE_(READ|WRITE) definitions
mm/gup_test: fix PIN_LONGTERM_TEST_READ with highmem
mm,thp,rmap: fix races between updates of subpages_mapcount
mm: memcg: fix swapcached stat accounting
mm: add nodes= arg to memory.reclaim
mm: disable top-tier fallback to reclaim on proactive reclaim
selftests: cgroup: make sure reclaim target memcg is unprotected
selftests: cgroup: refactor proactive reclaim code to reclaim_until()
mm: memcg: fix stale protection of reclaim target memcg
mm/mmap: properly unaccount memory on mas_preallocate() failure
omfs: remove ->writepage
jfs: remove ->writepage
...
|