summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-04-17s390/mm: Uncouple physical vs virtual address spacesAlexander Gordeev9-68/+241
The uncoupling physical vs virtual address spaces brings the following benefits to s390: - virtual memory layout flexibility; - closes the address gap between kernel and modules, it caused s390-only problems in the past (e.g. 'perf' bugs); - allows getting rid of trampolines used for module calls into kernel; - allows simplifying BPF trampoline; - minor performance improvement in branch prediction; - kernel randomization entropy is magnitude bigger, as it is derived from the amount of available virtual, not physical memory; The whole change could be described in two pictures below: before and after the change. Some aspects of the virtual memory layout setup are not clarified (number of page levels, alignment, DMA memory), since these are not a part of this change or secondary with regard to how the uncoupling itself is implemented. The focus of the pictures is to explain why __va() and __pa() macros are implemented the way they are. Memory layout in V==R mode: | Physical | Virtual | +- 0 --------------+- 0 --------------+ identity mapping start | | S390_lowcore | Low-address memory | +- 8 KB -----------+ | | | | | identity | phys == virt | | mapping | virt == phys | | | +- AMODE31_START --+- AMODE31_START --+ .amode31 rand. phys/virt start |.amode31 text/data|.amode31 text/data| +- AMODE31_END ----+- AMODE31_END ----+ .amode31 rand. phys/virt start | | | | | | +- __kaslr_offset, __kaslr_offset_phys| kernel rand. phys/virt start | | | | kernel text/data | kernel text/data | phys == kvirt | | | +------------------+------------------+ kernel phys/virt end | | | | | | | | | | | | +- ident_map_size -+- ident_map_size -+ identity mapping end | | | ... unused gap | | | +---- vmemmap -----+ 'struct page' array start | | | virtually mapped | | memory map | | | +- __abs_lowcore --+ | | | Absolute Lowcore | | | +- __memcpy_real_area | | | Real Memory Copy| | | +- VMALLOC_START --+ vmalloc area start | | | vmalloc area | | | +- MODULES_VADDR --+ modules area start | | | modules area | | | +------------------+ UltraVisor Secure Storage limit | | | ... unused gap | | | +KASAN_SHADOW_START+ KASAN shadow memory start | | | KASAN shadow | | | +------------------+ ASCE limit Memory layout in V!=R mode: | Physical | Virtual | +- 0 --------------+- 0 --------------+ | | S390_lowcore | Low-address memory | +- 8 KB -----------+ | | | | | | | | ... unused gap | | | | +- AMODE31_START --+- AMODE31_START --+ .amode31 rand. phys/virt start |.amode31 text/data|.amode31 text/data| +- AMODE31_END ----+- AMODE31_END ----+ .amode31 rand. phys/virt end (<2GB) | | | | | | +- __kaslr_offset_phys | kernel rand. phys start | | | | kernel text/data | | | | | +------------------+ | kernel phys end | | | | | | | | | | | | +- ident_map_size -+ | | | | ... unused gap | | | +- __identity_base + identity mapping start (>= 2GB) | | | identity | phys == virt - __identity_base | mapping | virt == phys + __identity_base | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | +---- vmemmap -----+ 'struct page' array start | | | virtually mapped | | memory map | | | +- __abs_lowcore --+ | | | Absolute Lowcore | | | +- __memcpy_real_area | | | Real Memory Copy| | | +- VMALLOC_START --+ vmalloc area start | | | vmalloc area | | | +- MODULES_VADDR --+ modules area start | | | modules area | | | +- __kaslr_offset -+ kernel rand. virt start | | | kernel text/data | phys == (kvirt - __kaslr_offset) + | | __kaslr_offset_phys +- kernel .bss end + kernel rand. virt end | | | ... unused gap | | | +------------------+ UltraVisor Secure Storage limit | | | ... unused gap | | | +KASAN_SHADOW_START+ KASAN shadow memory start | | | KASAN shadow | | | +------------------+ ASCE limit Unused gaps in the virtual memory layout could be present or not - depending on how partucular system is configured. No page tables are created for the unused gaps. The relative order of vmalloc, modules and kernel image in virtual memory is defined by following considerations: - start of the modules area and end of the kernel should reside within 4GB to accommodate relative 32-bit jumps. The best way to achieve that is to place kernel next to modules; - vmalloc and module areas should locate next to each other to prevent failures and extra reworks in user level tools (makedumpfile, crash, etc.) which treat vmalloc and module addresses similarily; - kernel needs to be the last area in the virtual memory layout to easily distinguish between kernel and non-kernel virtual addresses. That is needed to (again) simplify handling of addresses in user level tools and make __pa() macro faster (see below); Concluding the above, the relative order of the considered virtual areas in memory is: vmalloc - modules - kernel. Therefore, the only change to the current memory layout is moving kernel to the end of virtual address space. With that approach the implementation of __pa() macro is straightforward - all linear virtual addresses less than kernel base are considered identity mapping: phys == virt - __identity_base All addresses greater than kernel base are kernel ones: phys == (kvirt - __kaslr_offset) + __kaslr_offset_phys By contrast, __va() macro deals only with identity mapping addresses: virt == phys + __identity_base .amode31 section is mapped separately and is not covered by __pa() macro. In fact, it could have been handled easily by checking whether a virtual address is within the section or not, but there is no need for that. Thus, let __pa() code do as little machine cycles as possible. The KASAN shadow memory is located at the very end of the virtual memory layout, at addresses higher than the kernel. However, that is not a linear mapping and no code other than KASAN instrumentation or API is expected to access it. When KASLR mode is enabled the kernel base address randomized within a memory window that spans whole unused virtual address space. The size of that window depends from the amount of physical memory available to the system, the limit imposed by UltraVisor (if present) and the vmalloc area size as provided by vmalloc= kernel command line parameter. In case the virtual memory is exhausted the minimum size of the randomization window is forcefully set to 2GB, which amounts to in 15 bits of entropy if KASAN is enabled or 17 bits of entropy in default configuration. The default kernel offset 0x100000 is used as a magic value both in the decompressor code and vmlinux linker script, but it will be removed with a follow-up change. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/crash: Use old os_info to create PT_LOAD headersAlexander Gordeev3-5/+42
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. The vmcore ELF program headers describe virtual memory regions of a crashed kernel. User level tools use that information for the kernel text and data analysis (e.g vmcore-dmesg extracts the kernel log). Currently the kernel image is covered by program headers describing the identity mapping regions. But in the future the kernel image will be mapped into separate region outside of the identity mapping. Create the additional ELF program header that covers kernel image only, so that vmcore tools could locate kernel text and data. Further, the identity mapping in crashed and capture kernels will have different base address. Due to that __va() macro can not be used in the capture kernel. Instead, read crashed kernel identity mapping base address from os_info and use it for PT_LOAD type program headers creation. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/vmcoreinfo: Store virtual memory layoutAlexander Gordeev1-0/+2
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. The virtual memory layout is needed for address translation by crash tool when /proc/kcore device is used as the memory image. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/os_info: Store virtual memory layoutAlexander Gordeev2-2/+15
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. The virtual memory layout will be read out by makedumpfile, crash and other user tools for virtual address translation. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/os_info: Introduce value entriesAlexander Gordeev3-9/+31
Introduce entries that do not reference any data in memory, but rather provide values. Set the size of such entries to zero and do not compute checksum for them, since there is no data which integrity needs to be checked. The integrity of the value entries itself is still covered by the os_info checksum. Reserve the lowest unused entry index OS_INFO_RESERVED for future use - presumably for the number of entries present. That could later be used by user level tools. The existing tools would not notice any difference. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Make .amode31 section address range explicitAlexander Gordeev2-1/+4
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. Introduce .amode31 section address range AMODE31_START and AMODE31_END macros for later use. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Make identity mapping base address explicitAlexander Gordeev2-2/+5
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. Currently the identity mapping base address is implicit and is always set to zero. Make it explicit by putting into __identity_base persistent boot variable and use it in proper context - which is the value of PAGE_OFFSET. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Uncouple virtual and physical kernel offsetsAlexander Gordeev4-9/+20
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. Currently __kaslr_offset is the kernel offset in both physical memory on boot and in virtual memory after DAT mode is enabled. Uncouple these offsets and rename the physical address space variant to __kaslr_offset_phys while keep the name __kaslr_offset for the offset in virtual address space. Do not use __kaslr_offset_phys after DAT mode is enabled just yet, but still make it a persistent boot variable for later use. Use __kaslr_offset and __kaslr_offset_phys offsets in proper contexts and alter handle_relocs() function to distinguish between the two. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/mm: Create virtual memory layout structureAlexander Gordeev3-5/+12
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. Put virtual memory layout information into a structure to improve code generation when accessing the structure members, which are currently only ident_map_size and __kaslr_offset. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/mm: Move KASLR related to <asm/page.h>Alexander Gordeev2-14/+14
Move everyting KASLR related to <asm/page.h>, similarly to many other architectures. Acked-by: Heiko Carstens <hca@linux.ibm.com> Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Swap vmalloc and Lowcore/Real Memory Copy areasAlexander Gordeev2-10/+13
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. Currently the order of virtual memory areas is (the lowcore and .amode31 section are skipped, as it is irrelevant): identity mapping (the kernel is contained within) vmemmap vmalloc modules Absolute Lowcore Real Memory Copy In the future the kernel will be mapped separately and placed to the end of the virtual address space, so the layout would turn like this: identity mapping vmemmap vmalloc modules Absolute Lowcore Real Memory Copy kernel However, the distance between kernel and modules needs to be as little as possible, ideally - none. Thus, the Absolute Lowcore and Real Memory Copy areas would stay in the way and therefore need to be moved as well: identity mapping vmemmap Absolute Lowcore Real Memory Copy vmalloc modules kernel To facilitate such layout swap the vmalloc and Absolute Lowcore together with Real Memory Copy areas. As result, the current layout turns into: identity mapping (the kernel is contained within) vmemmap Absolute Lowcore Real Memory Copy vmalloc modules This will allow to locate the kernel directly next to the modules once it gets mapped separately. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Reduce size of identity mapping on overlapAlexander Gordeev1-1/+4
In case vmemmap array could overlap with vmalloc area on virtual memory layout setup, the size of vmalloc area is decreased. That could result in less memory than user requested with vmalloc= kernel command line parameter. Instead, reduce the size of identity mapping (and the size of vmemmap array as result) to avoid such overlap. Further, currently the virtual memmory allocation "rolls" from top to bottom and it is only VMALLOC_START that could get increased due to the overlap. Change that to decrease- only, which makes the whole allocation algorithm more easy to comprehend. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Consider DCSS segments on memory layout setupAlexander Gordeev2-2/+12
The maximum mappable physical address (as returned by arch_get_mappable_range() callback) is limited by the value of (1UL << MAX_PHYSMEM_BITS). The maximum physical address available to a DCSS segment is 512GB. In case the available online or offline memory size is less than the DCSS limit arch_get_mappable_range() would include never used [512GB..(1UL << MAX_PHYSMEM_BITS)] range. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Do not force vmemmap to start at MAX_PHYSMEM_BITSAlexander Gordeev1-3/+2
vmemmap is forcefully set to start at MAX_PHYSMEM_BITS at most. That could be needed in the past to limit ident_map_size to MAX_PHYSMEM_BITS. However since commit 75eba6ec0de1 ("s390: unify identity mapping limits handling") ident_map_size is limited in setup_ident_map_size() function, which is called earlier. Another reason to limit vmemmap start to MAX_PHYSMEM_BITS is because it was returned by arch_get_mappable_range() as the maximum mappable physical address. Since commit f641679dfe55 ("s390/mm: rework arch_get_mappable_range() callback") that is not required anymore. As result, there is no neccessity to limit vmemmap starting address with MAX_PHYSMEM_BITS. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17KVM: s390: vsie: Use virt_to_phys for facility control blockNina Schoetterl-Glausch1-1/+2
In order for SIE to interpretively execute STFLE, it requires the real or absolute address of a facility-list control block. Before writing the location into the shadow SIE control block, convert it from a virtual address. We currently do not run into this bug because the lower 31 bits are the same for virtual and physical addresses. Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Link: https://lore.kernel.org/r/20240319164420.4053380-3-nsg@linux.ibm.com Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Message-Id: <20240319164420.4053380-3-nsg@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-12s390/cio: fix tracepoint subchannel type fieldPeter Oberparleiter1-1/+1
The subchannel-type field "st" of s390_cio_stsch and s390_cio_msch tracepoints is incorrectly filled with the subchannel-enabled SCHIB value "ena". Fix this by assigning the correct value. Fixes: d1de8633d96a ("s390 cio: Rewrite trace point class s390_class_schib") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-12s390/cio: export CHPID operating speedPeter Oberparleiter3-2/+49
Add a per-CHPID sysfs attribute named "speed_bps" that provides the operating speed of the associated channel path in bits per second, or 0 if the operating speed is not available. Example: $ cat /sys/devices/css0/chp0.32/speed_bps 32G Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-12s390/cio: export measurement data for all CMGsPeter Oberparleiter1-4/+0
A channel-path's channel-measurement-group value (CMG) determines the format of associated measurement data and characteristics blocks. Both blocks are of fixed size and contain a generic and CMG-dependent part. Currently CIO exports these data blocks via sysfs only for a specific list of CMGs even though the kernel itself does not interpret CMG-dependent data. Change CIO to export measurement data and characteristics for all CMGs. This enables supporting new CMG data formats in userspace without the need for kernel changes. Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-12s390/cio: export extended channel-path-measurement dataPeter Oberparleiter5-32/+69
Add a per-CHPID binary sysfs attribute named "ext_measurement" that provides access to extended channel-path-measurement data for the associated channel path. Note that while not all channel-paths provide extended measurement data this attribute is created unconditionally for all channel paths because channel-path measurement capabilities might change during run-time. Reading from the attribute will only return data for channel-paths that support extended measurement data. Example: $ echo 1 > /sys/devices/css0/cm_enable $ xxd /sys/devices/css0/chp0.32/ext_measurement 00000000: 53e0 8002 0000 0095 0000 0000 59cc e034 S...........Y..4 00000010: 38b8 cc45 0000 0000 0000 0000 3e24 fe94 8..E........>$.. 00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ................ 00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ................ Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Tested-by: Vineeth Vijayan <vneethv@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-12s390/cio: simplify measurement attribute registrationPeter Oberparleiter1-34/+15
Use attribute groups to simplify registration, removal and extension of measurement related sysfs attributes. Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-12s390/cio: rework channel-utilization-block handlingPeter Oberparleiter3-25/+49
Convert channel-utilization-block (CUB) address variables from separate named fields to arrays of addresses. Also simplify error handling and introduce named constants. This is done in preparation of introducing additional CUBs. Note: With this change the __packed annotation of secm_area is required to prevent an alignment hole that would otherwise occur due to the switch from u32 to dma64_t. Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09s390/mm: Convert gmap_make_secure to use a folioMatthew Wilcox (Oracle)1-13/+14
Remove uses of deprecated page APIs, and move the check for large folios to here to avoid taking the folio lock if the folio is too large. We could do better here by attempting to split the large folio, but I'll leave that improvement for someone who can test it. Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20240322161149.2327518-3-willy@infradead.org Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09s390/mm: Convert make_page_secure to use a folioMatthew Wilcox (Oracle)1-13/+16
These page APIs are deprecated, so convert the incoming page to a folio and use the folio APIs instead. The ultravisor API cannot handle large folios, so return -EINVAL if one has slipped through. Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20240322161149.2327518-2-willy@infradead.org Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09s390/cpum_cf: make crypto counters upward compatible across machine typesThomas Richter2-9/+4
The CPU Measurement facility crypto counter set functionality is defined by the Second Counter Version Number. This number varies between machine types, but is upward compatible. Lessen the checks to reflect this behavior. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09s390: adjust indentation of RELOCS command build step outHeiko Carstens1-1/+1
Common pattern in non-verbose build output for quiet commands is that the shorthand of a command including whitespace contains at least eight characters. Adjust this for the RELOCS command, which comes only with seven characters. Before: SORTTAB vmlinux CC arch/s390/boot/version.o RELOCS arch/s390/boot/relocs.S OBJCOPY arch/s390/boot/info.bin After: SORTTAB vmlinux CC arch/s390/boot/version.o RELOCS arch/s390/boot/relocs.S OBJCOPY arch/s390/boot/info.bin Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09s390/cio: convert sprintf()/snprintf() to sysfs_emit()Li Zhijian1-7/+7
Per filesystems/sysfs.rst, show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. coccinelle complains that there are still a couple of functions that use snprintf(). Convert them to sysfs_emit(). Generally, this patch is generated by make coccicheck M=<path/to/file> MODE=patch \ COCCI=scripts/coccinelle/api/device_attr_show.cocci No functional change intended. Cc: Vineeth Vijayan <vneethv@linux.ibm.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Link: https://lore.kernel.org/r/20240314095209.1325229-1-lizhijian@fujitsu.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09s390/ap: rename ap debug configuration optionHolger Dengler3-19/+16
The configuration option ZCRYPT_DEBUG is used only in ap queue code, so rename it to AP_DEBUG. It also no longer depends on ZCRYPT but on AP. While at it, also update the help text. Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09s390/ap: modularize ap busHolger Dengler6-8/+44
There is no hard requirement to have the ap bus statically in the kernel, so add an option to compile it as module. Cc: Tony Krowiak <akrowiak@linux.ibm.com> Cc: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09s390/chsc: use notifier for AP configuration changesHolger Dengler4-15/+50
The direct dependency of chsc and the AP bus prevents the modularization of ap bus. Introduce a notifier interface for AP changes, which decouples the producer of the change events (chsc) from the consumer (ap_bus). Remove the ap_cfg_chg() interface and replace it with the notifier invocation. The ap bus module registers a notification handler, which triggers the AP bus scan. Cc: Vineeth Vijayan <vneethv@linux.ibm.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Acked-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09s390/uv: export prot_virt_guest symbol in uvHolger Dengler1-0/+1
The inline function is_prot_virt_guest() in asm/uv.h makes use of the prot_virt_guest symbol. As this inline function can be called by other parts of the kernel (modules and built-in), the symbol should be exported, similar to the prot_virt_host symbol. One consumer of is_prot_virt_guest() will be the ap bus code. Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09s390/ap: swap IRQ and bus/device registrationHolger Dengler1-9/+9
The IRQ handler may rely on the bus or the root device. Register the adapter IRQ after setting up the bus and the root device to avoid any race conditions. Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09s390/ap: rework ap initializationHolger Dengler1-30/+72
Rework the ap initialization and add missing cleanups to the error path. Errors during the registration of IRQ handler is now also detected. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09s390/ap: use static qci informationHolger Dengler2-63/+30
Since qci is available on most of the current machines, move away from the dynamic buffers for qci information and store it instead in a statically defined buffer. The new flags member in struct ap_config_info is now used as an indicator, if qci is available in the system (at least one of these bits is set). Suggested-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-01Linux 6.9-rc2v6.9-rc2Linus Torvalds1-1/+1
2024-03-31Merge tag 'kbuild-fixes-v6.9' of ↵Linus Torvalds31-87/+66
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Deduplicate Kconfig entries for CONFIG_CXL_PMU - Fix unselectable choice entry in MIPS Kconfig, and forbid this structure - Remove unused include/asm-generic/export.h - Fix a NULL pointer dereference bug in modpost - Enable -Woverride-init warning consistently with W=1 - Drop KCSAN flags from *.mod.c files * tag 'kbuild-fixes-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: Fix typo HEIGTH to HEIGHT Documentation/llvm: Note s390 LLVM=1 support with LLVM 18.1.0 and newer kbuild: Disable KCSAN for autogenerated *.mod.c intermediaries kbuild: make -Woverride-init warnings more consistent modpost: do not make find_tosym() return NULL export.h: remove include/asm-generic/export.h kconfig: do not reparent the menu inside a choice block MIPS: move unselectable FIT_IMAGE_FDT_EPM5 out of the "System type" choice cxl: remove CONFIG_CXL_PMU entry in drivers/cxl/Kconfig
2024-03-31Merge tag 'edac_urgent_for_v6.9_rc2' of ↵Linus Torvalds2-18/+43
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fixes from Borislav Petkov: - Fix more issues in the AMD FMPM driver * tag 'edac_urgent_for_v6.9_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: RAS: Avoid build errors when CONFIG_DEBUG_FS=n RAS/AMD/FMPM: Safely handle saved records of various sizes RAS/AMD/FMPM: Avoid NULL ptr deref in get_saved_records()
2024-03-31Merge tag 'irq_urgent_for_v6.9_rc2' of ↵Linus Torvalds4-4/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Fix an unused function warning on irqchip/irq-armada-370-xp - Fix the IRQ sharing with pinctrl-amd and ACPI OSL * tag 'irq_urgent_for_v6.9_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/armada-370-xp: Suppress unused-function warning genirq: Introduce IRQF_COND_ONESHOT and use it in pinctrl-amd
2024-03-31Merge tag 'perf_urgent_for_v6.9_rc2' of ↵Linus Torvalds7-16/+62
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fixes from Borislav Petkov: - Define the correct set of default hw events on AMD Zen4 - Use the correct stalled cycles PMCs on AMD Zen2 and newer - Fix detection of the LBR freeze feature on AMD * tag 'perf_urgent_for_v6.9_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/amd/core: Define a proper ref-cycles event for Zen 4 and later perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later perf/x86/amd/lbr: Use freeze based on availability x86/cpufeatures: Add new word for scattered features
2024-03-31Merge tag 'timers_urgent_for_v6.9_rc2' of ↵Linus Torvalds1-8/+27
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timers update from Borislav Petkov: - Volunteer in Anna-Maria and Frederic as timers co-maintainers so that tglx can relax more :-P * tag 'timers_urgent_for_v6.9_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Add co-maintainers for time[rs]
2024-03-31Merge tag 'objtool_urgent_for_v6.9_rc2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Borislav Petkov: - Fix a format specifier build error in objtool during an x32 build * tag 'objtool_urgent_for_v6.9_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Fix compile failure when using the x32 compiler
2024-03-31Merge tag 'x86_urgent_for_v6.9_rc2' of ↵Linus Torvalds16-64/+64
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Make sure single object builds in arch/x86/virt/ ala make ... arch/x86/virt/vmx/tdx/seamcall.o work again - Do not do ROM range scans and memory validation when the kernel is running as a SEV-SNP guest as those can get problematic and, before that, are not really needed in such a guest - Exclude the build-time generated vdso-image-x32.o object from objtool validation and in particular the return sites in there due to a warning which fires when an unpatched return thunk is being used - Improve the NMI CPUs stall message to show additional information about the state of each CPU wrt the NMI handler - Enable gcc named address spaces support only on !KCSAN configs due to compiler options incompatibility - Revert a change which was trying to use GB pages for mapping regions only when the regions would be large enough but that change lead to kexec failing - A documentation fixlet * tag 'x86_urgent_for_v6.9_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/build: Use obj-y to descend into arch/x86/virt/ x86/sev: Skip ROM range scans and validation for SEV-SNP guests x86/vdso: Fix rethunk patching for vdso-image-x32.o too x86/nmi: Upgrade NMI backtrace stall checks & messages x86/percpu: Disable named address spaces for KCSAN Revert "x86/mm/ident_map: Use gbpages only where full GB page should be mapped." Documentation/x86: Fix title underline length
2024-03-31kconfig: Fix typo HEIGTH to HEIGHTIsak Ellmer8-14/+14
Fixed a typo in some variables where height was misspelled as heigth. Signed-off-by: Isak Ellmer <isak01@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-03-31Documentation/llvm: Note s390 LLVM=1 support with LLVM 18.1.0 and newerNathan Chancellor1-1/+1
As of the first s390 pull request during the 6.9 merge window, commit 691632f0e869 ("Merge tag 's390-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux"), s390 can be built with LLVM=1 when using LLVM 18.1.0, which is the first version that has SystemZ support implemented in ld.lld and llvm-objcopy. Update the supported architectures table in the Kbuild LLVM documentation to note this explicitly to make it more discoverable by users and other developers. Additionally, this brings s390 in line with the rest of the architectures in the table, which all support LLVM=1. Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-03-31kbuild: Disable KCSAN for autogenerated *.mod.c intermediariesBorislav Petkov (AMD)1-1/+1
When KCSAN and CONSTRUCTORS are enabled, one can trigger the "Unpatched return thunk in use. This should not happen!" catch-all warning. Usually, when objtool runs on the .o objects, it does generate a section .return_sites which contains all offsets in the objects to the return thunks of the functions present there. Those return thunks then get patched at runtime by the alternatives. KCSAN and CONSTRUCTORS add this to the object file's .text.startup section: ------------------- Disassembly of section .text.startup: ... 0000000000000010 <_sub_I_00099_0>: 10: f3 0f 1e fa endbr64 14: e8 00 00 00 00 call 19 <_sub_I_00099_0+0x9> 15: R_X86_64_PLT32 __tsan_init-0x4 19: e9 00 00 00 00 jmp 1e <__UNIQUE_ID___addressable_cryptd_alloc_aead349+0x6> 1a: R_X86_64_PLT32 __x86_return_thunk-0x4 ------------------- which, if it is built as a module goes through the intermediary stage of creating a <module>.mod.c file which, when translated, receives a second constructor: ------------------- Disassembly of section .text.startup: 0000000000000010 <_sub_I_00099_0>: 10: f3 0f 1e fa endbr64 14: e8 00 00 00 00 call 19 <_sub_I_00099_0+0x9> 15: R_X86_64_PLT32 __tsan_init-0x4 19: e9 00 00 00 00 jmp 1e <_sub_I_00099_0+0xe> 1a: R_X86_64_PLT32 __x86_return_thunk-0x4 ... 0000000000000030 <_sub_I_00099_0>: 30: f3 0f 1e fa endbr64 34: e8 00 00 00 00 call 39 <_sub_I_00099_0+0x9> 35: R_X86_64_PLT32 __tsan_init-0x4 39: e9 00 00 00 00 jmp 3e <__ksymtab_cryptd_alloc_ahash+0x2> 3a: R_X86_64_PLT32 __x86_return_thunk-0x4 ------------------- in the .ko file. Objtool has run already so that second constructor's return thunk cannot be added to the .return_sites section and thus the return thunk remains unpatched and the warning rightfully fires. Drop KCSAN flags from the mod.c generation stage as those constructors do not contain data races one would be interested about. Debugged together with David Kaplan <David.Kaplan@amd.com> and Nikolay Borisov <nik.borisov@suse.com>. Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Closes: https://lore.kernel.org/r/0851a207-7143-417e-be31-8bf2b3afb57d@molgen.mpg.de Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> # Dell XPS 13 Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Reviewed-by: Marco Elver <elver@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-03-31kbuild: make -Woverride-init warnings more consistentArnd Bergmann13-23/+18
The -Woverride-init warn about code that may be intentional or not, but the inintentional ones tend to be real bugs, so there is a bit of disagreement on whether this warning option should be enabled by default and we have multiple settings in scripts/Makefile.extrawarn as well as individual subsystems. Older versions of clang only supported -Wno-initializer-overrides with the same meaning as gcc's -Woverride-init, though all supported versions now work with both. Because of this difference, an earlier cleanup of mine accidentally turned the clang warning off for W=1 builds and only left it on for W=2, while it's still enabled for gcc with W=1. There is also one driver that only turns the warning off for newer versions of gcc but not other compilers, and some but not all the Makefiles still use a cc-disable-warning conditional that is no longer needed with supported compilers here. Address all of the above by removing the special cases for clang and always turning the warning off unconditionally where it got in the way, using the syntax that is supported by both compilers. Fixes: 2cd3271b7a31 ("kbuild: avoid duplicate warning options") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Andrew Jeffery <andrew@codeconstruct.com.au> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-03-31objtool: Fix compile failure when using the x32 compilerMikulas Patocka1-1/+1
When compiling the v6.9-rc1 kernel with the x32 compiler, the following errors are reported. The reason is that we take an "unsigned long" variable and print it using "PRIx64" format string. In file included from check.c:16: check.c: In function ‘add_dead_ends’: /usr/src/git/linux-2.6/tools/objtool/include/objtool/warn.h:46:17: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 5 has type ‘long unsigned int’ [-Werror=format=] 46 | "%s: warning: objtool: " format "\n", \ | ^~~~~~~~~~~~~~~~~~~~~~~~ check.c:613:33: note: in expansion of macro ‘WARN’ 613 | WARN("can't find unreachable insn at %s+0x%" PRIx64, | ^~~~ ... Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: linux-kernel@vger.kernel.org
2024-03-30Merge tag 'xfs-6.9-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds6-32/+41
Pull xfs fixes from Chandan Babu: - Allow stripe unit/width value passed via mount option to be written over existing values in the super block - Do not set current->journal_info to avoid its value from being miused by another filesystem context * tag 'xfs-6.9-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: don't use current->journal_info xfs: allow sunit mount option to repair bad primary sb stripe values
2024-03-30Merge tag 'scsi-fixes' of ↵Linus Torvalds47-355/+479
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes and updates from James Bottomley: "Fully half this pull is updates to lpfc and qla2xxx which got committed just as the merge window opened. A sizeable fraction of the driver updates are simple bug fixes (and lock reworks for bug fixes in the case of lpfc), so rather than splitting the few actual enhancements out, we're just adding the drivers to the -rc1 pull. The enhancements for lpfc are log message removals, copyright updates and three patches redefining types. For qla2xxx it's just removing a debug message on module removal and the manufacturer detail update. The two major fixes are the sg teardown race and a core error leg problem with the procfs directory not being removed if we destroy a created host that never got to the running state. The rest are minor fixes and constifications" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (41 commits) scsi: bnx2fc: Remove spin_lock_bh while releasing resources after upload scsi: core: Fix unremoved procfs host directory regression scsi: mpi3mr: Avoid memcpy field-spanning write WARNING scsi: sd: Fix TCG OPAL unlock on system resume scsi: sg: Avoid sg device teardown race scsi: lpfc: Copyright updates for 14.4.0.1 patches scsi: lpfc: Update lpfc version to 14.4.0.1 scsi: lpfc: Define types in a union for generic void *context3 ptr scsi: lpfc: Define lpfc_dmabuf type for ctx_buf ptr scsi: lpfc: Define lpfc_nodelist type for ctx_ndlp ptr scsi: lpfc: Use a dedicated lock for ras_fwlog state scsi: lpfc: Release hbalock before calling lpfc_worker_wake_up() scsi: lpfc: Replace hbalock with ndlp lock in lpfc_nvme_unregister_port() scsi: lpfc: Update lpfc_ramp_down_queue_handler() logic scsi: lpfc: Remove IRQF_ONESHOT flag from threaded IRQ handling scsi: lpfc: Move NPIV's transport unregistration to after resource clean up scsi: lpfc: Remove unnecessary log message in queuecommand path scsi: qla2xxx: Update version to 10.02.09.200-k scsi: qla2xxx: Delay I/O Abort on PCI error scsi: qla2xxx: Change debug message during driver unload ...
2024-03-30Merge tag 'i2c-for-6.9-rc2' of ↵Linus Torvalds1-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fix from Wolfram Sang: "A fix from Andi for I2C host drivers" * tag 'i2c-for-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: i801: Fix a refactoring that broke a touchpad on Lenovo P1
2024-03-30Merge tag 'usb-6.9-rc2' of ↵Linus Torvalds27-160/+376
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a bunch of small USB fixes for reported problems and regressions for 6.9-rc2. Included in here are: - deadlock fixes for long-suffering issues - USB phy driver revert for reported problem - typec fixes for reported problems - duplicate id in dwc3 dropped - dwc2 driver fixes - udc driver warning fix - cdc-wdm race bugfix - other tiny USB bugfixes All of these have been in linux-next this past week with no reported issues" * tag 'usb-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (26 commits) USB: core: Fix deadlock in port "disable" sysfs attribute USB: core: Add hub_get() and hub_put() routines usb: typec: ucsi: Check capabilities before cable and identity discovery usb: typec: ucsi: Clear UCSI_CCI_RESET_COMPLETE before reset usb: typec: ucsi_acpi: Refactor and fix DELL quirk usb: typec: ucsi: Ack unsupported commands usb: typec: ucsi: Check for notifications after init usb: typec: ucsi: Clear EVENT_PENDING under PPM lock usb: typec: Return size of buffer if pd_set operation succeeds usb: udc: remove warning when queue disabled ep usb: dwc3: pci: Drop duplicate ID usb: dwc3: Properly set system wakeup Revert "usb: phy: generic: Get the vbus supply" usb: cdc-wdm: close race between read and workqueue usb: dwc2: gadget: LPM flow fix usb: dwc2: gadget: Fix exiting from clock gating usb: dwc2: host: Fix ISOC flow in DDMA mode usb: dwc2: host: Fix remote wakeup from hibernation usb: dwc2: host: Fix hibernation flow USB: core: Fix deadlock in usb_deauthorize_interface() ...