The following commit has been merged in the master branch: commit d68b4b6f307d155475cce541f2aee938032ed22e Merge: b96a3e9142fdf346b05b20e867b4f0dfca119e96 dce8f8ed1de1d9d6d27c5ccd202ce4ec163b100c Author: Linus Torvalds torvalds@linux-foundation.org Date: Tue Aug 29 14:53:51 2023 -0700
Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- An extensive rework of kexec and crash Kconfig from Eric DeVolder ("refactor Kconfig to consolidate KEXEC and CRASH options")
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a couple of macros to args.h")
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper commands")
- vsprintf inclusion rationalization from Andy Shevchenko ("lib/vsprintf: Rework header inclusions")
- Switch the handling of kdump from a udev scheme to in-kernel handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory hot un/plug")
- Many singleton patches to various parts of the tree
* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits) document while_each_thread(), change first_tid() to use for_each_thread() drivers/char/mem.c: shrink character device's devlist[] array x86/crash: optimize CPU changes crash: change crash_prepare_elf64_headers() to for_each_possible_cpu() crash: hotplug support for kexec_load() x86/crash: add x86 crash hotplug support crash: memory and CPU hotplug sysfs attributes kexec: exclude elfcorehdr from the segment digest crash: add generic infrastructure for crash hotplug support crash: move a few code bits to setup support of crash hotplug kstrtox: consistently use _tolower() kill do_each_thread() nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse scripts/bloat-o-meter: count weak symbol sizes treewide: drop CONFIG_EMBEDDED lockdep: fix static memory detection even more lib/vsprintf: declare no_hash_pointers in sprintf.h lib/vsprintf: split out sprintf() and friends kernel/fork: stop playing lockless games for exe_file replacement adfs: delete unused "union adfs_dirtail" definition ...
diff --combined Documentation/ABI/testing/sysfs-devices-system-cpu index 183a07c4f191,31189da7ef57..7ecd5c8161a6 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@@ -513,18 -513,17 +513,18 @@@ Description: information about CPUs het cpu_capacity: capacity of cpuX.
What: /sys/devices/system/cpu/vulnerabilities + /sys/devices/system/cpu/vulnerabilities/gather_data_sampling + /sys/devices/system/cpu/vulnerabilities/itlb_multihit + /sys/devices/system/cpu/vulnerabilities/l1tf + /sys/devices/system/cpu/vulnerabilities/mds /sys/devices/system/cpu/vulnerabilities/meltdown + /sys/devices/system/cpu/vulnerabilities/mmio_stale_data + /sys/devices/system/cpu/vulnerabilities/retbleed + /sys/devices/system/cpu/vulnerabilities/spec_store_bypass /sys/devices/system/cpu/vulnerabilities/spectre_v1 /sys/devices/system/cpu/vulnerabilities/spectre_v2 - /sys/devices/system/cpu/vulnerabilities/spec_store_bypass - /sys/devices/system/cpu/vulnerabilities/l1tf - /sys/devices/system/cpu/vulnerabilities/mds /sys/devices/system/cpu/vulnerabilities/srbds /sys/devices/system/cpu/vulnerabilities/tsx_async_abort - /sys/devices/system/cpu/vulnerabilities/itlb_multihit - /sys/devices/system/cpu/vulnerabilities/mmio_stale_data - /sys/devices/system/cpu/vulnerabilities/retbleed Date: January 2018 Contact: Linux kernel mailing list linux-kernel@vger.kernel.org Description: Information about CPU vulnerabilities @@@ -556,7 -555,6 +556,7 @@@ Description: Control Symmetric Multi Th ================ ========================================= "on" SMT is enabled "off" SMT is disabled + "<N>" SMT is enabled with N threads per core. "forceoff" SMT is force disabled. Cannot be changed. "notsupported" SMT is not supported by the CPU "notimplemented" SMT runtime toggling is not @@@ -688,3 -686,11 +688,11 @@@ Description (RO) the list of CPUs that are isolated and don't participate in load balancing. These CPUs are set by boot parameter "isolcpus=". + + What: /sys/devices/system/cpu/crash_hotplug + Date: Aug 2023 + Contact: Linux kernel mailing list linux-kernel@vger.kernel.org + Description: + (RO) indicates whether or not the kernel directly supports + modifying the crash elfcorehdr for CPU hot un/plug and/or + on/offline changes. diff --combined Documentation/admin-guide/mm/memory-hotplug.rst index 2994958c7ce8,eb99d79223a3..cfe034cf1e87 --- a/Documentation/admin-guide/mm/memory-hotplug.rst +++ b/Documentation/admin-guide/mm/memory-hotplug.rst @@@ -291,6 -291,14 +291,14 @@@ The following files are currently defin Availability depends on the CONFIG_ARCH_MEMORY_PROBE kernel configuration option. ``uevent`` read-write: generic udev file for device subsystems. + ``crash_hotplug`` read-only: when changes to the system memory map + occur due to hot un/plug of memory, this file contains + '1' if the kernel updates the kdump capture kernel memory + map itself (via elfcorehdr), or '0' if userspace must update + the kdump capture kernel memory map. + + Availability depends on the CONFIG_MEMORY_HOTPLUG kernel + configuration option. ====================== =========================================================
.. note:: @@@ -433,18 -441,6 +441,18 @@@ The following module parameters are cur memory in a way that huge pages in bigger granularity cannot be formed on hotplugged memory. + + With value "force" it could result in memory + wastage due to memmap size limitations. For + example, if the memmap for a memory block + requires 1 MiB, but the pageblock size is 2 + MiB, 1 MiB of hotplugged memory will be wasted. + Note that there are still cases where the + feature cannot be enforced: for example, if the + memmap is smaller than a single page, or if the + architecture does not support the forced mode + in all configurations. + ``online_policy`` read-write: Set the basic policy used for automatic zone selection when onlining memory blocks without specifying a target zone. @@@ -681,7 -677,7 +689,7 @@@ when still encountering permanently unm (-> BUG), memory offlining will keep retrying until it eventually succeeds.
When offlining is triggered from user space, the offlining context can be -terminated by sending a fatal signal. A timeout based offlining can easily be +terminated by sending a signal. A timeout based offlining can easily be implemented via::
% timeout $TIMEOUT offline_block | failure_handling diff --combined Documentation/core-api/cpu_hotplug.rst index b9ae591d0b18,d6d470d7dda0..9511e405aabd --- a/Documentation/core-api/cpu_hotplug.rst +++ b/Documentation/core-api/cpu_hotplug.rst @@@ -395,8 -395,8 +395,8 @@@ multi-instance state the following func * cpuhp_setup_state_multi(state, name, startup, teardown)
The @state argument is either a statically allocated state or one of the -constants for dynamically allocated states - CPUHP_PREPARE_DYN, -CPUHP_ONLINE_DYN - depending on the state section (PREPARE, ONLINE) for +constants for dynamically allocated states - CPUHP_BP_PREPARE_DYN, +CPUHP_AP_ONLINE_DYN - depending on the state section (PREPARE, ONLINE) for which a dynamic state should be allocated.
The @name argument is used for sysfs output and for instrumentation. The @@@ -588,7 -588,7 +588,7 @@@ notifications on online and offline ope Setup and teardown a dynamically allocated state in the ONLINE section for notifications on offline operations::
- state = cpuhp_setup_state(CPUHP_ONLINE_DYN, "subsys:offline", NULL, subsys_cpu_offline); + state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "subsys:offline", NULL, subsys_cpu_offline); if (state < 0) return state; .... @@@ -597,7 -597,7 +597,7 @@@ Setup and teardown a dynamically allocated state in the ONLINE section for notifications on online operations without invoking the callbacks::
- state = cpuhp_setup_state_nocalls(CPUHP_ONLINE_DYN, "subsys:online", subsys_cpu_online, NULL); + state = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "subsys:online", subsys_cpu_online, NULL); if (state < 0) return state; .... @@@ -606,7 -606,7 +606,7 @@@ Setup, use and teardown a dynamically allocated multi-instance state in the ONLINE section for notifications on online and offline operation::
- state = cpuhp_setup_state_multi(CPUHP_ONLINE_DYN, "subsys:online", subsys_cpu_online, subsys_cpu_offline); + state = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "subsys:online", subsys_cpu_online, subsys_cpu_offline); if (state < 0) return state; .... @@@ -741,6 -741,24 +741,24 @@@ will receive all events. A script like:
can process the event further.
+ When changes to the CPUs in the system occur, the sysfs file + /sys/devices/system/cpu/crash_hotplug contains '1' if the kernel + updates the kdump capture kernel list of CPUs itself (via elfcorehdr), + or '0' if userspace must update the kdump capture kernel list of CPUs. + + The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration + option. + + To skip userspace processing of CPU hot un/plug events for kdump + (i.e. the unload-then-reload to obtain a current list of CPUs), this sysfs + file can be used in a udev rule as follows: + + SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" + + For a CPU hot un/plug event, if the architecture supports kernel updates + of the elfcorehdr (which contains the list of CPUs), then the rule skips + the unload-then-reload of the kdump capture kernel. + Kernel Inline Documentations Reference ======================================
diff --combined arch/Kconfig index 63c5d6a2022b,94050a3f094e..ec49c0100550 --- a/arch/Kconfig +++ b/arch/Kconfig @@@ -11,19 -11,6 +11,6 @@@ source "arch/$(SRCARCH)/Kconfig
menu "General architecture-dependent options"
- config CRASH_CORE - bool - - config KEXEC_CORE - select CRASH_CORE - bool - - config KEXEC_ELF - bool - - config HAVE_IMA_KEXEC - bool - config ARCH_HAS_SUBPAGE_FAULTS bool help @@@ -34,9 -21,6 +21,9 @@@ config HOTPLUG_SMT bool
+config SMT_NUM_THREADS_DYNAMIC + bool + # Selected by HOTPLUG_CORE_SYNC_DEAD or HOTPLUG_CORE_SYNC_FULL config HOTPLUG_CORE_SYNC bool @@@ -748,7 -732,9 +735,9 @@@ config HAS_LTO_CLAN depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm) depends on ARCH_SUPPORTS_LTO_CLANG depends on !FTRACE_MCOUNT_USE_RECORDMCOUNT - depends on !KASAN || KASAN_HW_TAGS + # https://github.com/ClangBuiltLinux/linux/issues/1721 + depends on (!KASAN || KASAN_HW_TAGS || CLANG_VERSION >= 170000) || !DEBUG_INFO + depends on (!KCOV || CLANG_VERSION >= 170000) || !DEBUG_INFO depends on !GCOV_KERNEL help The compiler and Kconfig options support building with Clang's diff --combined arch/arm64/Kconfig index c060a26dec28,7635ac65c0dc..b10515c0200b --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@@ -78,7 -78,6 +78,7 @@@ config ARM6 select ARCH_INLINE_SPIN_UNLOCK_IRQ if !PREEMPTION select ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE if !PREEMPTION select ARCH_KEEP_MEMBLOCK + select ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE select ARCH_USE_CMPXCHG_LOCKREF select ARCH_USE_GNU_PROPERTY select ARCH_USE_MEMTEST @@@ -97,7 -96,6 +97,7 @@@ select ARCH_SUPPORTS_NUMA_BALANCING select ARCH_SUPPORTS_PAGE_TABLE_CHECK select ARCH_SUPPORTS_PER_VMA_LOCK + select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT select ARCH_WANT_DEFAULT_BPF_JIT select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT @@@ -350,6 -348,9 +350,6 @@@ config GENERIC_CSU config GENERIC_CALIBRATE_DELAY def_bool y
-config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE - def_bool y - config SMP def_bool y
@@@ -1460,60 -1461,28 +1460,28 @@@ config PARAVIRT_TIME_ACCOUNTIN
If in doubt, say N here.
- config KEXEC - depends on PM_SLEEP_SMP - select KEXEC_CORE - bool "kexec system call" - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - config KEXEC_FILE - bool "kexec file based system call" - select KEXEC_CORE - select HAVE_IMA_KEXEC if IMA - help - This is new version of kexec system call. This system call is - file based and takes file descriptors as system call argument - for kernel and initramfs as opposed to list of segments as - accepted by previous system call. + config ARCH_SUPPORTS_KEXEC + def_bool PM_SLEEP_SMP
- config KEXEC_SIG - bool "Verify kernel signature during kexec_file_load() syscall" - depends on KEXEC_FILE - help - Select this option to verify a signature with loaded kernel - image. If configured, any attempt of loading a image without - valid signature will fail. + config ARCH_SUPPORTS_KEXEC_FILE + def_bool y
- In addition to that option, you need to enable signature - verification for the corresponding kernel image type being - loaded in order for this to work. + config ARCH_SELECTS_KEXEC_FILE + def_bool y + depends on KEXEC_FILE + select HAVE_IMA_KEXEC if IMA
- config KEXEC_IMAGE_VERIFY_SIG - bool "Enable Image signature verification support" - default y - depends on KEXEC_SIG - depends on EFI && SIGNED_PE_FILE_VERIFICATION - help - Enable Image signature verification support. + config ARCH_SUPPORTS_KEXEC_SIG + def_bool y
- comment "Support for PE file signature verification disabled" - depends on KEXEC_SIG - depends on !EFI || !SIGNED_PE_FILE_VERIFICATION + config ARCH_SUPPORTS_KEXEC_IMAGE_VERIFY_SIG + def_bool y
- config CRASH_DUMP - bool "Build kdump crash kernel" - help - Generate crash dump after being started by kexec. This should - be normally only set in special crash dump kernels which are - loaded in the main kernel with kexec-tools into a specially - reserved region and then later executed after a crash by - kdump/kexec. + config ARCH_DEFAULT_KEXEC_IMAGE_VERIFY_SIG + def_bool y
- For more details see Documentation/admin-guide/kdump/kdump.rst + config ARCH_SUPPORTS_CRASH_DUMP + def_bool y
config TRANS_TABLE def_bool y @@@ -1792,6 -1761,9 +1760,6 @@@ config ARM64_PA The feature is detected at runtime, and will remain as a 'nop' instruction if the cpu does not implement the feature.
-config AS_HAS_LDAPR - def_bool $(as-instr,.arch_extension rcpc) - config AS_HAS_LSE_ATOMICS def_bool $(as-instr,.arch_extension lse)
@@@ -1929,9 -1901,6 +1897,9 @@@ config AS_HAS_ARMV8_ config AS_HAS_CFI_NEGATE_RA_STATE def_bool $(as-instr,.cfi_startproc\n.cfi_negate_ra_state\n.cfi_endproc\n)
+config AS_HAS_LDAPR + def_bool $(as-instr,.arch_extension rcpc) + endmenu # "ARMv8.3 architectural features"
menu "ARMv8.4 architectural features" diff --combined arch/ia64/Kconfig index 3ab75f36c037,88382f105301..53faa122b0f4 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@@ -47,7 -47,6 +47,7 @@@ config IA6 select GENERIC_IRQ_LEGACY select ARCH_HAVE_NMI_SAFE_CMPXCHG select GENERIC_IOMAP + select GENERIC_IOREMAP select GENERIC_SMP_IDLE_THREAD select ARCH_TASK_STRUCT_ON_STACK select ARCH_TASK_STRUCT_ALLOCATOR @@@ -362,31 -361,13 +362,13 @@@ config IA64_HP_AML_NF the "force" module parameter, e.g., with the "aml_nfw.force" kernel command line option.
- config KEXEC - bool "kexec system call" - depends on !SMP || HOTPLUG_CPU - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - The name comes from the similarity to the exec system call. - - It is an ongoing process to be certain the hardware in a machine - is properly shutdown, so do not be surprised if this code does not - initially work for you. As of this writing the exact hardware - interface is strongly in flux, so no good recommendation can be - made. + endmenu
- config CRASH_DUMP - bool "kernel crash dumps" - depends on IA64_MCA_RECOVERY && (!SMP || HOTPLUG_CPU) - help - Generate crash dump after being started by kexec. + config ARCH_SUPPORTS_KEXEC + def_bool !SMP || HOTPLUG_CPU
- endmenu + config ARCH_SUPPORTS_CRASH_DUMP + def_bool IA64_MCA_RECOVERY && (!SMP || HOTPLUG_CPU)
menu "Power management and ACPI options"
diff --combined arch/loongarch/Kconfig index 34d7a3de9a79,9f81b786c882..ecf282dee513 --- a/arch/loongarch/Kconfig +++ b/arch/loongarch/Kconfig @@@ -60,7 -60,7 +60,7 @@@ config LOONGARC select ARCH_USE_QUEUED_SPINLOCKS select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT select ARCH_WANT_LD_ORPHAN_WARN - select ARCH_WANT_OPTIMIZE_VMEMMAP + select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP select ARCH_WANTS_NO_INSTR select BUILDTIME_TABLE_SORT select COMMON_CLK @@@ -538,28 -538,16 +538,16 @@@ config CPU_HAS_PREFETC bool default y
- config KEXEC - bool "Kexec system call" - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. + config ARCH_SUPPORTS_KEXEC + def_bool y
- The name comes from the similarity to the exec system call. + config ARCH_SUPPORTS_CRASH_DUMP + def_bool y
- config CRASH_DUMP - bool "Build kdump crash kernel" + config ARCH_SELECTS_CRASH_DUMP + def_bool y + depends on CRASH_DUMP select RELOCATABLE - help - Generate crash dump after being started by kexec. This should - be normally only set in special crash dump kernels which are - loaded in the main kernel with kexec-tools into a specially - reserved region and then later executed after a crash by - kdump/kexec. - - For more details see Documentation/admin-guide/kdump/kdump.rst
config RELOCATABLE bool "Relocatable kernel" @@@ -662,3 -650,5 +650,3 @@@ source "kernel/power/Kconfig source "drivers/acpi/Kconfig"
endmenu - -source "drivers/firmware/Kconfig" diff --combined arch/loongarch/kernel/process.c index 4ee1e9d6a65f,778e8d09953e..ba457e43f5be --- a/arch/loongarch/kernel/process.c +++ b/arch/loongarch/kernel/process.c @@@ -61,6 -61,13 +61,6 @@@ EXPORT_SYMBOL(__stack_chk_guard) unsigned long boot_option_idle_override = IDLE_NO_OVERRIDE; EXPORT_SYMBOL(boot_option_idle_override);
-#ifdef CONFIG_HOTPLUG_CPU -void __noreturn arch_cpu_idle_dead(void) -{ - play_dead(); -} -#endif - asmlinkage void ret_from_fork(void); asmlinkage void ret_from_kernel_thread(void);
@@@ -338,9 -345,9 +338,9 @@@ static void raise_backtrace(cpumask_t * } }
- void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self) + void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu) { - nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace); + nmi_trigger_cpumask_backtrace(mask, exclude_cpu, raise_backtrace); }
#ifdef CONFIG_64BIT diff --combined arch/parisc/Kconfig index 4fd36642a779,2ef6843aae60..a15ab147af2e --- a/arch/parisc/Kconfig +++ b/arch/parisc/Kconfig @@@ -36,7 -36,6 +36,7 @@@ config PARIS select GENERIC_ATOMIC64 if !64BIT select GENERIC_IRQ_PROBE select GENERIC_PCI_IOMAP + select GENERIC_IOREMAP select ARCH_HAVE_NMI_SAFE_CMPXCHG select GENERIC_SMP_IDLE_THREAD select GENERIC_ARCH_TOPOLOGY if SMP @@@ -50,9 -49,6 +50,9 @@@ select TTY # Needed for pdc_cons.c select HAS_IOPORT if PCI || EISA select HAVE_DEBUG_STACKOVERFLOW + select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT + select HAVE_ARCH_MMAP_RND_BITS select HAVE_ARCH_AUDITSYSCALL select HAVE_ARCH_HASH select HAVE_ARCH_JUMP_LABEL @@@ -60,8 -56,6 +60,8 @@@ select HAVE_ARCH_KFENCE select HAVE_ARCH_SECCOMP_FILTER select HAVE_ARCH_TRACEHOOK + select HAVE_EBPF_JIT + select ARCH_WANT_DEFAULT_BPF_JIT select HAVE_REGS_AND_STACK_ACCESS_API select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU select GENERIC_SCHED_CLOCK @@@ -130,20 -124,6 +130,20 @@@ config TIME_LOW_RE depends on SMP default y
+config ARCH_MMAP_RND_BITS_MIN + default 18 if 64BIT + default 8 + +config ARCH_MMAP_RND_COMPAT_BITS_MIN + default 8 + +config ARCH_MMAP_RND_BITS_MAX + default 24 if 64BIT + default 17 + +config ARCH_MMAP_RND_COMPAT_BITS_MAX + default 17 + # unless you want to implement ACPI on PA-RISC ... ;-) config PM bool @@@ -359,29 -339,17 +359,17 @@@ config NR_CPU default "8" if 64BIT default "16"
- config KEXEC - bool "Kexec system call" - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - It is an ongoing process to be certain the hardware in a machine - shutdown, so do not be surprised if this code does not - initially work for you. - - config KEXEC_FILE - bool "kexec file based system call" - select KEXEC_CORE - select KEXEC_ELF - help - This enables the kexec_file_load() System call. This is - file based and takes file descriptors as system call argument - for kernel and initramfs as opposed to list of segments as - accepted by previous system call. - endmenu
+ config ARCH_SUPPORTS_KEXEC + def_bool y + + config ARCH_SUPPORTS_KEXEC_FILE + def_bool y + + config ARCH_SELECTS_KEXEC_FILE + def_bool y + depends on KEXEC_FILE + select KEXEC_ELF + source "drivers/parisc/Kconfig" diff --combined arch/powerpc/Kconfig index 938294c996dc,7709b62e6843..21edd664689e --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@@ -157,7 -157,6 +157,7 @@@ config PP select ARCH_HAS_UBSAN_SANITIZE_ALL select ARCH_HAVE_NMI_SAFE_CMPXCHG select ARCH_KEEP_MEMBLOCK + select ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE if PPC_RADIX_MMU select ARCH_MIGHT_HAVE_PC_PARPORT select ARCH_MIGHT_HAVE_PC_SERIO select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX @@@ -175,7 -174,6 +175,7 @@@ select ARCH_WANT_IPC_PARSE_VERSION select ARCH_WANT_IRQS_OFF_ACTIVATE_MM select ARCH_WANT_LD_ORPHAN_WARN + select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if PPC_RADIX_MMU select ARCH_WANTS_MODULES_DATA_IN_VMALLOC if PPC_BOOK3S_32 || PPC_8xx select ARCH_WEAK_RELEASE_ACQUIRE select BINFMT_ELF @@@ -195,7 -193,6 +195,7 @@@ select GENERIC_CPU_VULNERABILITIES if PPC_BARRIER_NOSPEC select GENERIC_EARLY_IOREMAP select GENERIC_GETTIMEOFDAY + select GENERIC_IOREMAP select GENERIC_IRQ_SHOW select GENERIC_IRQ_SHOW_LEVEL select GENERIC_PCI_IOMAP if PCI @@@ -592,41 -589,21 +592,21 @@@ config PPC64_SUPPORTS_MEMORY_FAILUR default "y" if PPC_POWERNV select ARCH_SUPPORTS_MEMORY_FAILURE
- config KEXEC - bool "kexec system call" - depends on PPC_BOOK3S || PPC_E500 || (44x && !SMP) - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - The name comes from the similarity to the exec system call. - - It is an ongoing process to be certain the hardware in a machine - is properly shutdown, so do not be surprised if this code does not - initially work for you. As of this writing the exact hardware - interface is strongly in flux, so no good recommendation can be - made. - - config KEXEC_FILE - bool "kexec file based system call" - select KEXEC_CORE - select HAVE_IMA_KEXEC if IMA - select KEXEC_ELF - depends on PPC64 - depends on CRYPTO=y - depends on CRYPTO_SHA256=y - help - This is a new version of the kexec system call. This call is - file based and takes in file descriptors as system call arguments - for kernel and initramfs as opposed to a list of segments as is the - case for the older kexec call. + config ARCH_SUPPORTS_KEXEC + def_bool PPC_BOOK3S || PPC_E500 || (44x && !SMP) + + config ARCH_SUPPORTS_KEXEC_FILE + def_bool PPC64 && CRYPTO=y && CRYPTO_SHA256=y
- config ARCH_HAS_KEXEC_PURGATORY + config ARCH_SUPPORTS_KEXEC_PURGATORY def_bool KEXEC_FILE
+ config ARCH_SELECTS_KEXEC_FILE + def_bool y + depends on KEXEC_FILE + select KEXEC_ELF + select HAVE_IMA_KEXEC if IMA + config PPC64_BIG_ENDIAN_ELF_ABI_V2 # Option is available to BFD, but LLD does not support ELFv1 so this is # always true there. @@@ -686,14 -663,13 +666,13 @@@ config RELOCATABLE_TES loaded at, which tends to be non-zero and therefore test the relocation code.
- config CRASH_DUMP - bool "Build a dump capture kernel" - depends on PPC64 || PPC_BOOK3S_32 || PPC_85xx || (44x && !SMP) + config ARCH_SUPPORTS_CRASH_DUMP + def_bool PPC64 || PPC_BOOK3S_32 || PPC_85xx || (44x && !SMP) + + config ARCH_SELECTS_CRASH_DUMP + def_bool y + depends on CRASH_DUMP select RELOCATABLE if PPC64 || 44x || PPC_85xx - help - Build a kernel suitable for use as a dump capture kernel. - The same kernel binary can be used as production kernel and dump - capture kernel.
config FA_DUMP bool "Firmware-assisted dump" diff --combined arch/riscv/Kconfig index b08d0e1a93b9,a39c5d03f59c..f9fbfbf11ad2 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@@ -53,7 -53,7 +53,7 @@@ config RISC select ARCH_WANT_GENERAL_HUGETLB if !RISCV_ISA_SVNAPOT select ARCH_WANT_HUGE_PMD_SHARE if 64BIT select ARCH_WANT_LD_ORPHAN_WARN if !XIP_KERNEL - select ARCH_WANT_OPTIMIZE_VMEMMAP + select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP select ARCH_WANTS_THP_SWAP if HAVE_ARCH_TRANSPARENT_HUGEPAGE select BINFMT_FLAT_NO_DATA_START_OFFSET if !MMU select BUILDTIME_TABLE_SORT if MMU @@@ -570,30 -570,24 +570,30 @@@ config TOOLCHAIN_HAS_ZIHINTPAUS config TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI def_bool y # https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=aed44286efa8ae8717... - depends on AS_IS_GNU && AS_VERSION >= 23800 + # https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=98416dbb0a62579d4a7a4a76bab51b... + depends on AS_IS_GNU && AS_VERSION >= 23600 help - Newer binutils versions default to ISA spec version 20191213 which - moves some instructions from the I extension to the Zicsr and Zifencei - extensions. + Binutils-2.38 and GCC-12.1.0 bumped the default ISA spec to the newer + 20191213 version, which moves some instructions from the I extension to + the Zicsr and Zifencei extensions. This requires explicitly specifying + Zicsr and Zifencei when binutils >= 2.38 or GCC >= 12.1.0. Zicsr + and Zifencei are supported in binutils from version 2.36 onwards. + To make life easier, and avoid forcing toolchains that default to a + newer ISA spec to version 2.2, relax the check to binutils >= 2.36. + For clang < 17 or GCC < 11.3.0, for which this is not possible or need + special treatment, this is dealt with in TOOLCHAIN_NEEDS_OLD_ISA_SPEC.
config TOOLCHAIN_NEEDS_OLD_ISA_SPEC def_bool y depends on TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI # https://github.com/llvm/llvm-project/commit/22e199e6afb1263c943c0c0d4498694e... - depends on CC_IS_CLANG && CLANG_VERSION < 170000 + # https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d29f5d6ab513c52fd872f532c492e3... + depends on (CC_IS_CLANG && CLANG_VERSION < 170000) || (CC_IS_GCC && GCC_VERSION < 110300) help - Certain versions of clang do not support zicsr and zifencei via -march - but newer versions of binutils require it for the reasons noted in the - help text of CONFIG_TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI. This - option causes an older ISA spec compatible with these older versions - of clang to be passed to GAS, which has the same result as passing zicsr - and zifencei to -march. + Certain versions of clang and GCC do not support zicsr and zifencei via + -march. This option causes an older ISA spec compatible with these older + versions of clang and GCC to be passed to GAS, which has the same result + as passing zicsr and zifencei to -march.
config FPU bool "FPU support" @@@ -656,48 -650,30 +656,30 @@@ config RISCV_BOOT_SPINWAI
If unsure what to do here, say N.
- config KEXEC - bool "Kexec system call" - depends on MMU + config ARCH_SUPPORTS_KEXEC + def_bool MMU + + config ARCH_SELECTS_KEXEC + def_bool y + depends on KEXEC select HOTPLUG_CPU if SMP - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux.
- The name comes from the similarity to the exec system call. + config ARCH_SUPPORTS_KEXEC_FILE + def_bool 64BIT && MMU
- config KEXEC_FILE - bool "kexec file based systmem call" - depends on 64BIT && MMU + config ARCH_SELECTS_KEXEC_FILE + def_bool y + depends on KEXEC_FILE select HAVE_IMA_KEXEC if IMA - select KEXEC_CORE select KEXEC_ELF - help - This is new version of kexec system call. This system call is - file based and takes file descriptors as system call argument - for kernel and initramfs as opposed to list of segments as - accepted by previous system call. - - If you don't know what to do here, say Y.
- config ARCH_HAS_KEXEC_PURGATORY + config ARCH_SUPPORTS_KEXEC_PURGATORY def_bool KEXEC_FILE depends on CRYPTO=y depends on CRYPTO_SHA256=y
- config CRASH_DUMP - bool "Build kdump crash kernel" - help - Generate crash dump after being started by kexec. This should - be normally only set in special crash dump kernels which are - loaded in the main kernel with kexec-tools into a specially - reserved region and then later executed after a crash by - kdump/kexec. - - For more details see Documentation/admin-guide/kdump/kdump.rst + config ARCH_SUPPORTS_CRASH_DUMP + def_bool y
config COMPAT bool "Kernel support for 32-bit U-mode" diff --combined arch/riscv/kernel/elf_kexec.c index c08bb5c3b385,cc556beb293a..f4099059ed8f --- a/arch/riscv/kernel/elf_kexec.c +++ b/arch/riscv/kernel/elf_kexec.c @@@ -260,7 -260,7 +260,7 @@@ static void *elf_kexec_load(struct kima cmdline = modified_cmdline; }
- #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY + #ifdef CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY /* Add purgatory to the image */ kbuf.top_down = true; kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; @@@ -274,14 -274,14 +274,14 @@@ sizeof(kernel_start), 0); if (ret) pr_err("Error update purgatory ret=%d\n", ret); - #endif /* CONFIG_ARCH_HAS_KEXEC_PURGATORY */ + #endif /* CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY */
/* Add the initrd to the image */ if (initrd != NULL) { kbuf.buffer = initrd; kbuf.bufsz = kbuf.memsz = initrd_len; kbuf.buf_align = PAGE_SIZE; - kbuf.top_down = false; + kbuf.top_down = true; kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; ret = kexec_add_buffer(&kbuf); if (ret) @@@ -425,7 -425,6 +425,7 @@@ int arch_kexec_apply_relocations_add(st * sym, instead of searching the whole relsec. */ case R_RISCV_PCREL_HI20: + case R_RISCV_CALL_PLT: case R_RISCV_CALL: *(u64 *)loc = CLEAN_IMM(UITYPE, *(u64 *)loc) | ENCODE_UJTYPE_IMM(val - addr); diff --combined arch/s390/Kbuild index 8e4d74f51115,48a3588d703c..a5d3503b353c --- a/arch/s390/Kbuild +++ b/arch/s390/Kbuild @@@ -3,11 -3,11 +3,11 @@@ obj-y += kernel obj-y += mm/ obj-$(CONFIG_KVM) += kvm/ obj-y += crypto/ -obj-$(CONFIG_S390_HYPFS_FS) += hypfs/ +obj-$(CONFIG_S390_HYPFS) += hypfs/ obj-$(CONFIG_APPLDATA_BASE) += appldata/ obj-y += net/ obj-$(CONFIG_PCI) += pci/ - obj-$(CONFIG_ARCH_HAS_KEXEC_PURGATORY) += purgatory/ + obj-$(CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY) += purgatory/
# for cleaning subdir- += boot tools diff --combined arch/s390/Kconfig index 4f364cfe9dd0,4d011f7c26e5..661b6de69c27 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@@ -127,7 -127,7 +127,7 @@@ config S39 select ARCH_WANTS_NO_INSTR select ARCH_WANT_DEFAULT_BPF_JIT select ARCH_WANT_IPC_PARSE_VERSION - select ARCH_WANT_OPTIMIZE_VMEMMAP + select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP select BUILDTIME_TABLE_SORT select CLONE_BACKWARDS2 select DMA_OPS if PCI @@@ -143,7 -143,6 +143,7 @@@ select GENERIC_SMP_IDLE_THREAD select GENERIC_TIME_VSYSCALL select GENERIC_VDSO_TIME_NS + select GENERIC_IOREMAP if PCI select HAVE_ALIGNED_STRUCT_PAGE if SLUB select HAVE_ARCH_AUDITSYSCALL select HAVE_ARCH_JUMP_LABEL @@@ -175,7 -174,6 +175,7 @@@ select HAVE_FTRACE_MCOUNT_RECORD select HAVE_FUNCTION_ARG_ACCESS_API select HAVE_FUNCTION_ERROR_INJECTION + select HAVE_FUNCTION_GRAPH_RETVAL select HAVE_FUNCTION_GRAPH_TRACER select HAVE_FUNCTION_TRACER select HAVE_GCC_PLUGINS @@@ -215,6 -213,7 +215,7 @@@ select HAVE_VIRT_CPU_ACCOUNTING_IDLE select IOMMU_HELPER if PCI select IOMMU_SUPPORT if PCI + select KEXEC select MMU_GATHER_MERGE_VMAS select MMU_GATHER_NO_GATHER select MMU_GATHER_RCU_TABLE_FREE @@@ -246,6 -245,25 +247,25 @@@ config PGTABLE_LEVEL
source "kernel/livepatch/Kconfig"
+ config ARCH_SUPPORTS_KEXEC + def_bool y + + config ARCH_SUPPORTS_KEXEC_FILE + def_bool CRYPTO && CRYPTO_SHA256 && CRYPTO_SHA256_S390 + + config ARCH_SUPPORTS_KEXEC_SIG + def_bool MODULE_SIG_FORMAT + + config ARCH_SUPPORTS_KEXEC_PURGATORY + def_bool KEXEC_FILE + + config ARCH_SUPPORTS_CRASH_DUMP + def_bool y + help - Refer to file:Documentation/s390/zfcpdump.rst for more details on this. ++ Refer to file:Documentation/arch/s390/zfcpdump.rst for more details on this. + This option also enables s390 zfcpdump. - See also file:Documentation/s390/zfcpdump.rst ++ See also file:Documentation/arch/s390/zfcpdump.rst + menu "Processor type and features"
config HAVE_MARCH_Z10_FEATURES @@@ -484,47 -502,6 +504,17 @@@ config SCHED_TOPOLOG
source "kernel/Kconfig.hz"
- config KEXEC - def_bool y - select KEXEC_CORE - - config KEXEC_FILE - bool "kexec file based system call" - select KEXEC_CORE - depends on CRYPTO - depends on CRYPTO_SHA256 - depends on CRYPTO_SHA256_S390 - help - Enable the kexec file based system call. In contrast to the normal - kexec system call this system call takes file descriptors for the - kernel and initramfs as arguments. - - config ARCH_HAS_KEXEC_PURGATORY - def_bool y - depends on KEXEC_FILE - - config KEXEC_SIG - bool "Verify kernel signature during kexec_file_load() syscall" - depends on KEXEC_FILE && MODULE_SIG_FORMAT - help - This option makes kernel signature verification mandatory for - the kexec_file_load() syscall. - - In addition to that option, you need to enable signature - verification for the corresponding kernel image type being - loaded in order for this to work. - +config CERT_STORE + bool "Get user certificates via DIAG320" + depends on KEYS + select CRYPTO_LIB_SHA256 + help + Enable this option if you want to access user-provided secure boot + certificates via DIAG 0x320. + + These certificates will be made available via the keyring named + 'cert_store'. + config KERNEL_NOBP def_bool n prompt "Enable modified branch prediction for the kernel by default" @@@ -746,22 -723,6 +736,6 @@@ config VFIO_A
endmenu
- menu "Dump support" - - config CRASH_DUMP - bool "kernel crash dumps" - select KEXEC - help - Generate crash dump after being started by kexec. - Crash dump kernels are loaded in the main kernel with kexec-tools - into a specially reserved region and then later executed after - a crash by kdump/kexec. - Refer to file:Documentation/arch/s390/zfcpdump.rst for more details on this. - This option also enables s390 zfcpdump. - See also file:Documentation/arch/s390/zfcpdump.rst - - endmenu - config CCW def_bool y
@@@ -880,24 -841,13 +854,24 @@@ config APPLDATA_NET_SU This can also be compiled as a module, which will be called appldata_net_sum.o.
-config S390_HYPFS_FS +config S390_HYPFS def_bool y + prompt "s390 hypervisor information" + help + This provides several binary files at (debugfs)/s390_hypfs/ to + provide accounting information in an s390 hypervisor environment. + +config S390_HYPFS_FS + def_bool n prompt "s390 hypervisor file system support" select SYS_HYPERVISOR + depends on S390_HYPFS help This is a virtual file system intended to provide accounting - information in an s390 hypervisor environment. + information in an s390 hypervisor environment. This file system + is deprecated and should not be used. + + Say N if you are unsure.
source "arch/s390/kvm/Kconfig"
diff --combined arch/sh/Kconfig index 6be32254211c,1cf6603781c7..33530b044953 --- a/arch/sh/Kconfig +++ b/arch/sh/Kconfig @@@ -29,7 -29,6 +29,7 @@@ config SUPER select GENERIC_SMP_IDLE_THREAD select GUP_GET_PXX_LOW_HIGH if X2TLB select HAS_IOPORT if HAS_IOPORT_MAP + select GENERIC_IOREMAP if MMU select HAVE_ARCH_AUDITSYSCALL select HAVE_ARCH_KGDB select HAVE_ARCH_SECCOMP_FILTER @@@ -549,44 -548,14 +549,14 @@@ menu "Kernel features
source "kernel/Kconfig.hz"
- config KEXEC - bool "kexec system call (EXPERIMENTAL)" - depends on MMU - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - The name comes from the similarity to the exec system call. - - It is an ongoing process to be certain the hardware in a machine - is properly shutdown, so do not be surprised if this code does not - initially work for you. As of this writing the exact hardware - interface is strongly in flux, so no good recommendation can be - made. - - config CRASH_DUMP - bool "kernel crash dumps (EXPERIMENTAL)" - depends on BROKEN_ON_SMP - help - Generate crash dump after being started by kexec. - This should be normally only set in special crash dump kernels - which are loaded in the main kernel with kexec-tools into - a specially reserved region and then later executed after - a crash by kdump/kexec. The crash dump kernel must be compiled - to a memory address not used by the main kernel using - PHYSICAL_START. - - For more details see Documentation/admin-guide/kdump/kdump.rst - - config KEXEC_JUMP - bool "kexec jump (EXPERIMENTAL)" - depends on KEXEC && HIBERNATION - help - Jump between original kernel and kexeced kernel and invoke - code via KEXEC + config ARCH_SUPPORTS_KEXEC + def_bool MMU + + config ARCH_SUPPORTS_CRASH_DUMP + def_bool BROKEN_ON_SMP + + config ARCH_SUPPORTS_KEXEC_JUMP + def_bool y
config PHYSICAL_START hex "Physical address where the kernel is loaded" if (EXPERT || CRASH_DUMP) diff --combined arch/x86/Kconfig index ad3a53e28aac,d9fc80b9ef84..bd9a1804cf72 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@@ -102,7 -102,6 +102,7 @@@ config X8 select ARCH_HAS_DEBUG_WX select ARCH_HAS_ZONE_DMA_SET if EXPERT select ARCH_HAVE_NMI_SAFE_CMPXCHG + select ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI select ARCH_MIGHT_HAVE_PC_PARPORT select ARCH_MIGHT_HAVE_PC_SERIO @@@ -129,8 -128,7 +129,8 @@@ select ARCH_WANT_GENERAL_HUGETLB select ARCH_WANT_HUGE_PMD_SHARE select ARCH_WANT_LD_ORPHAN_WARN - select ARCH_WANT_OPTIMIZE_VMEMMAP if X86_64 + select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64 + select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64 select ARCH_WANTS_THP_SWAP if X86_64 select ARCH_HAS_PARANOID_L1D_FLUSH select BUILDTIME_TABLE_SORT @@@ -1310,8 -1308,44 +1310,8 @@@ config X86_REBOOTFIXUP Say N otherwise.
config MICROCODE - bool "CPU microcode loading support" - default y + def_bool y depends on CPU_SUP_AMD || CPU_SUP_INTEL - help - If you say Y here, you will be able to update the microcode on - Intel and AMD processors. The Intel support is for the IA32 family, - e.g. Pentium Pro, Pentium II, Pentium III, Pentium 4, Xeon etc. The - AMD support is for families 0x10 and later. You will obviously need - the actual microcode binary data itself which is not shipped with - the Linux kernel. - - The preferred method to load microcode from a detached initrd is described - in Documentation/arch/x86/microcode.rst. For that you need to enable - CONFIG_BLK_DEV_INITRD in order for the loader to be able to scan the - initrd for microcode blobs. - - In addition, you can build the microcode into the kernel. For that you - need to add the vendor-supplied microcode to the CONFIG_EXTRA_FIRMWARE - config option. - -config MICROCODE_INTEL - bool "Intel microcode loading support" - depends on CPU_SUP_INTEL && MICROCODE - default MICROCODE - help - This options enables microcode patch loading support for Intel - processors. - - For the current Intel microcode data package go to - https://downloadcenter.intel.com and search for - 'Linux Processor Microcode Data File'. - -config MICROCODE_AMD - bool "AMD microcode loading support" - depends on CPU_SUP_AMD && MICROCODE - help - If you select this option, microcode patch loading support for AMD - processors will be enabled.
config MICROCODE_LATE_LOADING bool "Late microcode loading (DANGEROUS)" @@@ -2006,88 -2040,37 +2006,37 @@@ config EFI_RUNTIME_MA
source "kernel/Kconfig.hz"
- config KEXEC - bool "kexec system call" - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - The name comes from the similarity to the exec system call. - - It is an ongoing process to be certain the hardware in a machine - is properly shutdown, so do not be surprised if this code does not - initially work for you. As of this writing the exact hardware - interface is strongly in flux, so no good recommendation can be - made. - - config KEXEC_FILE - bool "kexec file based system call" - select KEXEC_CORE - select HAVE_IMA_KEXEC if IMA - depends on X86_64 - depends on CRYPTO=y - depends on CRYPTO_SHA256=y - help - This is new version of kexec system call. This system call is - file based and takes file descriptors as system call argument - for kernel and initramfs as opposed to list of segments as - accepted by previous system call. + config ARCH_SUPPORTS_KEXEC + def_bool y
- config ARCH_HAS_KEXEC_PURGATORY - def_bool KEXEC_FILE + config ARCH_SUPPORTS_KEXEC_FILE + def_bool X86_64 && CRYPTO && CRYPTO_SHA256
- config KEXEC_SIG - bool "Verify kernel signature during kexec_file_load() syscall" + config ARCH_SELECTS_KEXEC_FILE + def_bool y depends on KEXEC_FILE - help + select HAVE_IMA_KEXEC if IMA
- This option makes the kexec_file_load() syscall check for a valid - signature of the kernel image. The image can still be loaded without - a valid signature unless you also enable KEXEC_SIG_FORCE, though if - there's a signature that we can check, then it must be valid. + config ARCH_SUPPORTS_KEXEC_PURGATORY + def_bool KEXEC_FILE
- In addition to this option, you need to enable signature - verification for the corresponding kernel image type being - loaded in order for this to work. + config ARCH_SUPPORTS_KEXEC_SIG + def_bool y
- config KEXEC_SIG_FORCE - bool "Require a valid signature in kexec_file_load() syscall" - depends on KEXEC_SIG - help - This option makes kernel signature verification mandatory for - the kexec_file_load() syscall. + config ARCH_SUPPORTS_KEXEC_SIG_FORCE + def_bool y
- config KEXEC_BZIMAGE_VERIFY_SIG - bool "Enable bzImage signature verification support" - depends on KEXEC_SIG - depends on SIGNED_PE_FILE_VERIFICATION - select SYSTEM_TRUSTED_KEYRING - help - Enable bzImage signature verification support. + config ARCH_SUPPORTS_KEXEC_BZIMAGE_VERIFY_SIG + def_bool y
- config CRASH_DUMP - bool "kernel crash dumps" - depends on X86_64 || (X86_32 && HIGHMEM) - help - Generate crash dump after being started by kexec. - This should be normally only set in special crash dump kernels - which are loaded in the main kernel with kexec-tools into - a specially reserved region and then later executed after - a crash by kdump/kexec. The crash dump kernel must be compiled - to a memory address not used by the main kernel or BIOS using - PHYSICAL_START, or it must be built as a relocatable image - (CONFIG_RELOCATABLE=y). - For more details see Documentation/admin-guide/kdump/kdump.rst + config ARCH_SUPPORTS_KEXEC_JUMP + def_bool y
- config KEXEC_JUMP - bool "kexec jump" - depends on KEXEC && HIBERNATION - help - Jump between original kernel and kexeced kernel and invoke - code in physical address mode via KEXEC + config ARCH_SUPPORTS_CRASH_DUMP + def_bool X86_64 || (X86_32 && HIGHMEM) + + config ARCH_SUPPORTS_CRASH_HOTPLUG + def_bool y
config PHYSICAL_START hex "Physical address where the kernel is loaded" if (EXPERT || CRASH_DUMP) @@@ -2559,13 -2542,6 +2508,13 @@@ config CPU_IBRS_ENTR This mitigates both spectre_v2 and retbleed at great cost to performance.
+config CPU_SRSO + bool "Mitigate speculative RAS overflow on AMD" + depends on CPU_SUP_AMD && X86_64 && RETHUNK + default y + help + Enable the SRSO mitigation needed on AMD Zen1-4 machines. + config SLS bool "Mitigate Straight-Line-Speculation" depends on CC_HAS_SLS && X86_64 @@@ -2576,31 -2552,15 +2525,31 @@@ against straight line speculation. The kernel image might be slightly larger.
+config GDS_FORCE_MITIGATION + bool "Force GDS Mitigation" + depends on CPU_SUP_INTEL + default n + help + Gather Data Sampling (GDS) is a hardware vulnerability which allows + unprivileged speculative access to data which was previously stored in + vector registers. + + This option is equivalent to setting gather_data_sampling=force on the + command line. The microcode mitigation is used if present, otherwise + AVX is disabled as a mitigation. On affected systems that are missing + the microcode any userspace code that unconditionally uses AVX will + break with this option set. + + Setting this option on systems not vulnerable to GDS has no effect. + + If in doubt, say N. + endif
config ARCH_HAS_ADD_PAGES def_bool y depends on ARCH_ENABLE_MEMORY_HOTPLUG
-config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE - def_bool y - menu "Power management and ACPI options"
config ARCH_HIBERNATION_HEADER diff --combined drivers/base/cpu.c index fe6690ecf563,c9204d69e616..43dab03958f1 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@@ -282,6 -282,16 +282,16 @@@ static ssize_t print_cpus_nohz_full(str static DEVICE_ATTR(nohz_full, 0444, print_cpus_nohz_full, NULL); #endif
+ #ifdef CONFIG_CRASH_HOTPLUG + static ssize_t crash_hotplug_show(struct device *dev, + struct device_attribute *attr, + char *buf) + { + return sysfs_emit(buf, "%d\n", crash_hotplug_cpu_support()); + } + static DEVICE_ATTR_ADMIN_RO(crash_hotplug); + #endif + static void cpu_device_release(struct device *dev) { /* @@@ -469,6 -479,9 +479,9 @@@ static struct attribute *cpu_root_attrs #ifdef CONFIG_NO_HZ_FULL &dev_attr_nohz_full.attr, #endif + #ifdef CONFIG_CRASH_HOTPLUG + &dev_attr_crash_hotplug.attr, + #endif #ifdef CONFIG_GENERIC_CPU_AUTOPROBE &dev_attr_modalias.attr, #endif @@@ -509,30 -522,73 +522,30 @@@ static void __init cpu_dev_register_gen }
#ifdef CONFIG_GENERIC_CPU_VULNERABILITIES - -ssize_t __weak cpu_show_meltdown(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return sysfs_emit(buf, "Not affected\n"); -} - -ssize_t __weak cpu_show_spectre_v1(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return sysfs_emit(buf, "Not affected\n"); -} - -ssize_t __weak cpu_show_spectre_v2(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return sysfs_emit(buf, "Not affected\n"); -} - -ssize_t __weak cpu_show_spec_store_bypass(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return sysfs_emit(buf, "Not affected\n"); -} - -ssize_t __weak cpu_show_l1tf(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return sysfs_emit(buf, "Not affected\n"); -} - -ssize_t __weak cpu_show_mds(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return sysfs_emit(buf, "Not affected\n"); -} - -ssize_t __weak cpu_show_tsx_async_abort(struct device *dev, - struct device_attribute *attr, - char *buf) -{ - return sysfs_emit(buf, "Not affected\n"); -} - -ssize_t __weak cpu_show_itlb_multihit(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return sysfs_emit(buf, "Not affected\n"); -} - -ssize_t __weak cpu_show_srbds(struct device *dev, +static ssize_t cpu_show_not_affected(struct device *dev, struct device_attribute *attr, char *buf) { return sysfs_emit(buf, "Not affected\n"); }
-ssize_t __weak cpu_show_mmio_stale_data(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return sysfs_emit(buf, "Not affected\n"); -} - -ssize_t __weak cpu_show_retbleed(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return sysfs_emit(buf, "Not affected\n"); -} +#define CPU_SHOW_VULN_FALLBACK(func) \ + ssize_t cpu_show_##func(struct device *, \ + struct device_attribute *, char *) \ + __attribute__((weak, alias("cpu_show_not_affected"))) + +CPU_SHOW_VULN_FALLBACK(meltdown); +CPU_SHOW_VULN_FALLBACK(spectre_v1); +CPU_SHOW_VULN_FALLBACK(spectre_v2); +CPU_SHOW_VULN_FALLBACK(spec_store_bypass); +CPU_SHOW_VULN_FALLBACK(l1tf); +CPU_SHOW_VULN_FALLBACK(mds); +CPU_SHOW_VULN_FALLBACK(tsx_async_abort); +CPU_SHOW_VULN_FALLBACK(itlb_multihit); +CPU_SHOW_VULN_FALLBACK(srbds); +CPU_SHOW_VULN_FALLBACK(mmio_stale_data); +CPU_SHOW_VULN_FALLBACK(retbleed); +CPU_SHOW_VULN_FALLBACK(spec_rstack_overflow); +CPU_SHOW_VULN_FALLBACK(gds);
static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); @@@ -545,8 -601,6 +558,8 @@@ static DEVICE_ATTR(itlb_multihit, 0444 static DEVICE_ATTR(srbds, 0444, cpu_show_srbds, NULL); static DEVICE_ATTR(mmio_stale_data, 0444, cpu_show_mmio_stale_data, NULL); static DEVICE_ATTR(retbleed, 0444, cpu_show_retbleed, NULL); +static DEVICE_ATTR(spec_rstack_overflow, 0444, cpu_show_spec_rstack_overflow, NULL); +static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_meltdown.attr, @@@ -560,8 -614,6 +573,8 @@@ &dev_attr_srbds.attr, &dev_attr_mmio_stale_data.attr, &dev_attr_retbleed.attr, + &dev_attr_spec_rstack_overflow.attr, + &dev_attr_gather_data_sampling.attr, NULL };
diff --combined drivers/base/memory.c index 8191709c9ad2,15bb416e58ce..f3b9a4d0fa3b --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@@ -105,8 -105,7 +105,8 @@@ EXPORT_SYMBOL(unregister_memory_notifie static void memory_block_release(struct device *dev) { struct memory_block *mem = to_memory_block(dev); - + /* Verify that the altmap is freed */ + WARN_ON(mem->altmap); kfree(mem); }
@@@ -184,7 -183,7 +184,7 @@@ static int memory_block_online(struct m { unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr); unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block; - unsigned long nr_vmemmap_pages = mem->nr_vmemmap_pages; + unsigned long nr_vmemmap_pages = 0; struct zone *zone; int ret;
@@@ -201,9 -200,6 +201,9 @@@ * stage helps to keep accounting easier to follow - e.g vmemmaps * belong to the same zone as the memory they backed. */ + if (mem->altmap) + nr_vmemmap_pages = mem->altmap->free; + if (nr_vmemmap_pages) { ret = mhp_init_memmap_on_memory(start_pfn, nr_vmemmap_pages, zone); if (ret) @@@ -234,7 -230,7 +234,7 @@@ static int memory_block_offline(struct { unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr); unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block; - unsigned long nr_vmemmap_pages = mem->nr_vmemmap_pages; + unsigned long nr_vmemmap_pages = 0; int ret;
if (!mem->zone) @@@ -244,9 -240,6 +244,9 @@@ * Unaccount before offlining, such that unpopulated zone and kthreads * can properly be torn down in offline_pages(). */ + if (mem->altmap) + nr_vmemmap_pages = mem->altmap->free; + if (nr_vmemmap_pages) adjust_present_page_count(pfn_to_page(start_pfn), mem->group, -nr_vmemmap_pages); @@@ -497,6 -490,16 +497,16 @@@ static ssize_t auto_online_blocks_store
static DEVICE_ATTR_RW(auto_online_blocks);
+ #ifdef CONFIG_CRASH_HOTPLUG + #include <linux/kexec.h> + static ssize_t crash_hotplug_show(struct device *dev, + struct device_attribute *attr, char *buf) + { + return sysfs_emit(buf, "%d\n", crash_hotplug_memory_support()); + } + static DEVICE_ATTR_RO(crash_hotplug); + #endif + /* * Some architectures will have custom drivers to do this, and * will not need to do it from userspace. The fake hot-add code @@@ -733,7 -736,7 +743,7 @@@ void memory_block_add_nid(struct memory #endif
static int add_memory_block(unsigned long block_id, unsigned long state, - unsigned long nr_vmemmap_pages, + struct vmem_altmap *altmap, struct memory_group *group) { struct memory_block *mem; @@@ -751,7 -754,7 +761,7 @@@ mem->start_section_nr = block_id * sections_per_block; mem->state = state; mem->nid = NUMA_NO_NODE; - mem->nr_vmemmap_pages = nr_vmemmap_pages; + mem->altmap = altmap; INIT_LIST_HEAD(&mem->group_next);
#ifndef CONFIG_NUMA @@@ -790,14 -793,14 +800,14 @@@ static int __init add_boot_memory_block if (section_count == 0) return 0; return add_memory_block(memory_block_id(base_section_nr), - MEM_ONLINE, 0, NULL); + MEM_ONLINE, NULL, NULL); }
static int add_hotplug_memory_block(unsigned long block_id, - unsigned long nr_vmemmap_pages, + struct vmem_altmap *altmap, struct memory_group *group) { - return add_memory_block(block_id, MEM_OFFLINE, nr_vmemmap_pages, group); + return add_memory_block(block_id, MEM_OFFLINE, altmap, group); }
static void remove_memory_block(struct memory_block *memory) @@@ -825,7 -828,7 +835,7 @@@ * Called under device_hotplug_lock. */ int create_memory_block_devices(unsigned long start, unsigned long size, - unsigned long vmemmap_pages, + struct vmem_altmap *altmap, struct memory_group *group) { const unsigned long start_block_id = pfn_to_block_id(PFN_DOWN(start)); @@@ -839,7 -842,7 +849,7 @@@ return -EINVAL;
for (block_id = start_block_id; block_id != end_block_id; block_id++) { - ret = add_hotplug_memory_block(block_id, vmemmap_pages, group); + ret = add_hotplug_memory_block(block_id, altmap, group); if (ret) break; } @@@ -896,6 -899,9 +906,9 @@@ static struct attribute *memory_root_at
&dev_attr_block_size_bytes.attr, &dev_attr_auto_online_blocks.attr, + #ifdef CONFIG_CRASH_HOTPLUG + &dev_attr_crash_hotplug.attr, + #endif NULL };
diff --combined fs/exec.c index 0b9484358a49,dc41180d4e70..6518e33ea813 --- a/fs/exec.c +++ b/fs/exec.c @@@ -701,7 -701,6 +701,7 @@@ static int shift_arg_pages(struct vm_ar if (vma != vma_next(&vmi)) return -EFAULT;
+ vma_iter_prev_range(&vmi); /* * cover the whole range: [new_start, old_end) */ @@@ -1277,8 -1276,8 +1277,8 @@@ int begin_new_exec(struct linux_binprm
/* * Must be called _before_ exec_mmap() as bprm->mm is - * not visible until then. This also enables the update - * to be lockless. + * not visible until then. Doing it here also ensures + * we don't race against replace_mm_exe_file(). */ retval = set_mm_exe_file(bprm->mm, bprm->file); if (retval) diff --combined fs/nilfs2/inode.c index d588c719d743,5c9154c29678..1a8bd5993476 --- a/fs/nilfs2/inode.c +++ b/fs/nilfs2/inode.c @@@ -366,7 -366,7 +366,7 @@@ struct inode *nilfs_new_inode(struct in atomic64_inc(&root->inodes_count); inode_init_owner(&nop_mnt_idmap, inode, dir, mode); inode->i_ino = ino; - inode->i_mtime = inode->i_atime = inode->i_ctime = current_time(inode); + inode->i_mtime = inode->i_atime = inode_set_ctime_current(inode);
if (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)) { err = nilfs_bmap_read(ii->i_bmap, NULL); @@@ -450,10 -450,10 +450,10 @@@ int nilfs_read_inode_common(struct inod set_nlink(inode, le16_to_cpu(raw_inode->i_links_count)); inode->i_size = le64_to_cpu(raw_inode->i_size); inode->i_atime.tv_sec = le64_to_cpu(raw_inode->i_mtime); - inode->i_ctime.tv_sec = le64_to_cpu(raw_inode->i_ctime); + inode_set_ctime(inode, le64_to_cpu(raw_inode->i_ctime), + le32_to_cpu(raw_inode->i_ctime_nsec)); inode->i_mtime.tv_sec = le64_to_cpu(raw_inode->i_mtime); inode->i_atime.tv_nsec = le32_to_cpu(raw_inode->i_mtime_nsec); - inode->i_ctime.tv_nsec = le32_to_cpu(raw_inode->i_ctime_nsec); inode->i_mtime.tv_nsec = le32_to_cpu(raw_inode->i_mtime_nsec); if (nilfs_is_metadata_file_inode(inode) && !S_ISREG(inode->i_mode)) return -EIO; /* this inode is for metadata and corrupted */ @@@ -768,9 -768,9 +768,9 @@@ void nilfs_write_inode_common(struct in raw_inode->i_gid = cpu_to_le32(i_gid_read(inode)); raw_inode->i_links_count = cpu_to_le16(inode->i_nlink); raw_inode->i_size = cpu_to_le64(inode->i_size); - raw_inode->i_ctime = cpu_to_le64(inode->i_ctime.tv_sec); + raw_inode->i_ctime = cpu_to_le64(inode_get_ctime(inode).tv_sec); raw_inode->i_mtime = cpu_to_le64(inode->i_mtime.tv_sec); - raw_inode->i_ctime_nsec = cpu_to_le32(inode->i_ctime.tv_nsec); + raw_inode->i_ctime_nsec = cpu_to_le32(inode_get_ctime(inode).tv_nsec); raw_inode->i_mtime_nsec = cpu_to_le32(inode->i_mtime.tv_nsec); raw_inode->i_blocks = cpu_to_le64(inode->i_blocks);
@@@ -875,7 -875,7 +875,7 @@@ void nilfs_truncate(struct inode *inode
nilfs_truncate_bmap(ii, blkoff);
- inode->i_mtime = inode->i_ctime = current_time(inode); + inode->i_mtime = inode_set_ctime_current(inode); if (IS_SYNC(inode)) nilfs_set_transaction_flag(NILFS_TI_SYNC);
@@@ -1025,7 -1025,7 +1025,7 @@@ int nilfs_load_inode_block(struct inod int err;
spin_lock(&nilfs->ns_inode_lock); - if (ii->i_bh == NULL) { + if (ii->i_bh == NULL || unlikely(!buffer_uptodate(ii->i_bh))) { spin_unlock(&nilfs->ns_inode_lock); err = nilfs_ifile_get_inode_block(ii->i_root->ifile, inode->i_ino, pbh); @@@ -1034,7 -1034,10 +1034,10 @@@ spin_lock(&nilfs->ns_inode_lock); if (ii->i_bh == NULL) ii->i_bh = *pbh; - else { + else if (unlikely(!buffer_uptodate(ii->i_bh))) { + __brelse(ii->i_bh); + ii->i_bh = *pbh; + } else { brelse(*pbh); *pbh = ii->i_bh; } @@@ -1101,17 -1104,9 +1104,17 @@@ int nilfs_set_file_dirty(struct inode *
int __nilfs_mark_inode_dirty(struct inode *inode, int flags) { + struct the_nilfs *nilfs = inode->i_sb->s_fs_info; struct buffer_head *ibh; int err;
+ /* + * Do not dirty inodes after the log writer has been detached + * and its nilfs_root struct has been freed. + */ + if (unlikely(nilfs_purging(nilfs))) + return 0; + err = nilfs_load_inode_block(inode, &ibh); if (unlikely(err)) { nilfs_warn(inode->i_sb, diff --combined fs/ocfs2/journal.c index c19c730c26e2,4e779efe2a4e..e8e7d47265aa --- a/fs/ocfs2/journal.c +++ b/fs/ocfs2/journal.c @@@ -114,9 -114,9 +114,9 @@@ int ocfs2_compute_replay_slots(struct o if (osb->replay_map) return 0;
- replay_map = kzalloc(sizeof(struct ocfs2_replay_map) + - (osb->max_slots * sizeof(char)), GFP_KERNEL); - + replay_map = kzalloc(struct_size(replay_map, rm_replay_slots, + osb->max_slots), + GFP_KERNEL); if (!replay_map) { mlog_errno(-ENOMEM); return -ENOMEM; @@@ -178,16 -178,13 +178,13 @@@ int ocfs2_recovery_init(struct ocfs2_su osb->recovery_thread_task = NULL; init_waitqueue_head(&osb->recovery_event);
- rm = kzalloc(sizeof(struct ocfs2_recovery_map) + - osb->max_slots * sizeof(unsigned int), + rm = kzalloc(struct_size(rm, rm_entries, osb->max_slots), GFP_KERNEL); if (!rm) { mlog_errno(-ENOMEM); return -ENOMEM; }
- rm->rm_entries = (unsigned int *)((char *)rm + - sizeof(struct ocfs2_recovery_map)); osb->recovery_map = rm;
return 0; @@@ -557,7 -554,7 +554,7 @@@ static void ocfs2_abort_trigger(struct (unsigned long)bh, (unsigned long long)bh->b_blocknr);
- ocfs2_error(bh->b_bdev->bd_super, + ocfs2_error(bh->b_assoc_map->host->i_sb, "JBD2 has aborted our journal, ocfs2 cannot continue\n"); }
@@@ -780,14 -777,14 +777,14 @@@ void ocfs2_journal_dirty(handle_t *hand mlog_errno(status); if (!is_handle_aborted(handle)) { journal_t *journal = handle->h_transaction->t_journal; - struct super_block *sb = bh->b_bdev->bd_super;
mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed. " "Aborting transaction and journal.\n"); handle->h_err = status; jbd2_journal_abort_handle(handle); jbd2_journal_abort(journal, status); - ocfs2_abort(sb, "Journal already aborted.\n"); + ocfs2_abort(bh->b_assoc_map->host->i_sb, + "Journal already aborted.\n"); } } } diff --combined fs/ocfs2/namei.c index e4a684d45308,03bccfd183f3..5cd6d7771cea --- a/fs/ocfs2/namei.c +++ b/fs/ocfs2/namei.c @@@ -793,10 -793,10 +793,10 @@@ static int ocfs2_link(struct dentry *ol }
inc_nlink(inode); - inode->i_ctime = current_time(inode); + inode_set_ctime_current(inode); ocfs2_set_links_count(fe, inode->i_nlink); - fe->i_ctime = cpu_to_le64(inode->i_ctime.tv_sec); - fe->i_ctime_nsec = cpu_to_le32(inode->i_ctime.tv_nsec); + fe->i_ctime = cpu_to_le64(inode_get_ctime(inode).tv_sec); + fe->i_ctime_nsec = cpu_to_le32(inode_get_ctime(inode).tv_nsec); ocfs2_journal_dirty(handle, fe_bh);
err = ocfs2_add_entry(handle, dentry, inode, @@@ -995,7 -995,7 +995,7 @@@ static int ocfs2_unlink(struct inode *d ocfs2_set_links_count(fe, inode->i_nlink); ocfs2_journal_dirty(handle, fe_bh);
- dir->i_ctime = dir->i_mtime = current_time(dir); + dir->i_mtime = inode_set_ctime_current(dir); if (S_ISDIR(inode->i_mode)) drop_nlink(dir);
@@@ -1535,9 -1535,13 +1535,13 @@@ static int ocfs2_rename(struct mnt_idma status = ocfs2_add_entry(handle, new_dentry, old_inode, OCFS2_I(old_inode)->ip_blkno, new_dir_bh, &target_insert); + if (status < 0) { + mlog_errno(status); + goto bail; + } }
- old_inode->i_ctime = current_time(old_inode); + inode_set_ctime_current(old_inode); mark_inode_dirty(old_inode);
status = ocfs2_journal_access_di(handle, INODE_CACHE(old_inode), @@@ -1546,8 -1550,8 +1550,8 @@@ if (status >= 0) { old_di = (struct ocfs2_dinode *) old_inode_bh->b_data;
- old_di->i_ctime = cpu_to_le64(old_inode->i_ctime.tv_sec); - old_di->i_ctime_nsec = cpu_to_le32(old_inode->i_ctime.tv_nsec); + old_di->i_ctime = cpu_to_le64(inode_get_ctime(old_inode).tv_sec); + old_di->i_ctime_nsec = cpu_to_le32(inode_get_ctime(old_inode).tv_nsec); ocfs2_journal_dirty(handle, old_inode_bh); } else mlog_errno(status); @@@ -1586,9 -1590,9 +1590,9 @@@
if (new_inode) { drop_nlink(new_inode); - new_inode->i_ctime = current_time(new_inode); + inode_set_ctime_current(new_inode); } - old_dir->i_ctime = old_dir->i_mtime = current_time(old_dir); + old_dir->i_mtime = inode_set_ctime_current(old_dir);
if (update_dot_dot) { status = ocfs2_update_entry(old_inode, handle, @@@ -1610,8 -1614,7 +1614,8 @@@
if (old_dir != new_dir) { /* Keep the same times on both directories.*/ - new_dir->i_ctime = new_dir->i_mtime = old_dir->i_ctime; + new_dir->i_mtime = inode_set_ctime_to_ts(new_dir, + inode_get_ctime(old_dir));
/* * This will also pick up the i_nlink change from the diff --combined fs/proc/base.c index 02e1cf03d94b,483a3edebdd1..ffd54617c354 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@@ -1902,7 -1902,7 +1902,7 @@@ struct inode *proc_pid_make_inode(struc ei = PROC_I(inode); inode->i_mode = mode; inode->i_ino = get_next_ino(); - inode->i_mtime = inode->i_atime = inode->i_ctime = current_time(inode); + inode->i_mtime = inode->i_atime = inode_set_ctime_current(inode); inode->i_op = &proc_def_inode_operations;
/* @@@ -1966,7 -1966,7 +1966,7 @@@ int pid_getattr(struct mnt_idmap *idmap struct proc_fs_info *fs_info = proc_sb_info(inode->i_sb); struct task_struct *task;
- generic_fillattr(&nop_mnt_idmap, inode, stat); + generic_fillattr(&nop_mnt_idmap, request_mask, inode, stat);
stat->uid = GLOBAL_ROOT_UID; stat->gid = GLOBAL_ROOT_GID; @@@ -2817,7 -2817,7 +2817,7 @@@ static int proc_##LSM##_attr_dir_iterat \ static const struct file_operations proc_##LSM##_attr_dir_ops = { \ .read = generic_read_dir, \ - .iterate = proc_##LSM##_attr_dir_iterate, \ + .iterate_shared = proc_##LSM##_attr_dir_iterate, \ .llseek = default_llseek, \ }; \ \ @@@ -3207,7 -3207,6 +3207,7 @@@ static int proc_pid_ksm_stat(struct seq mm = get_task_mm(task); if (mm) { seq_printf(m, "ksm_rmap_items %lu\n", mm->ksm_rmap_items); + seq_printf(m, "ksm_zero_pages %lu\n", mm->ksm_zero_pages); seq_printf(m, "ksm_merging_pages %lu\n", mm->ksm_merging_pages); seq_printf(m, "ksm_process_profit %ld\n", ksm_process_profit(mm)); mmput(mm); @@@ -3584,8 -3583,7 +3584,8 @@@ static int proc_tid_comm_permission(str }
static const struct inode_operations proc_tid_comm_inode_operations = { - .permission = proc_tid_comm_permission, + .setattr = proc_setattr, + .permission = proc_tid_comm_permission, };
/* @@@ -3815,11 -3813,10 +3815,10 @@@ static struct task_struct *first_tid(st /* If we haven't found our starting place yet start * with the leader and walk nr threads forward. */ - pos = task = task->group_leader; - do { + for_each_thread(task, pos) { if (!nr--) goto found; - } while_each_thread(task, pos); + }; fail: pos = NULL; goto out; @@@ -3901,7 -3898,7 +3900,7 @@@ static int proc_task_getattr(struct mnt { struct inode *inode = d_inode(path->dentry); struct task_struct *p = get_proc_task(inode); - generic_fillattr(&nop_mnt_idmap, inode, stat); + generic_fillattr(&nop_mnt_idmap, request_mask, inode, stat);
if (p) { stat->nlink += get_nr_threads(p); diff --combined include/kunit/test.h index d33114097d0d,107c81431634..68ff01aee244 --- a/include/kunit/test.h +++ b/include/kunit/test.h @@@ -12,6 -12,7 +12,7 @@@ #include <kunit/assert.h> #include <kunit/try-catch.h>
+ #include <linux/args.h> #include <linux/compiler.h> #include <linux/container_of.h> #include <linux/err.h> @@@ -63,35 -64,12 +64,35 @@@ enum kunit_status KUNIT_SKIPPED, };
+/* Attribute struct/enum definitions */ + +/* + * Speed Attribute is stored as an enum and separated into categories of + * speed: very_slowm, slow, and normal. These speeds are relative to + * other KUnit tests. + * + * Note: unset speed attribute acts as default of KUNIT_SPEED_NORMAL. + */ +enum kunit_speed { + KUNIT_SPEED_UNSET, + KUNIT_SPEED_VERY_SLOW, + KUNIT_SPEED_SLOW, + KUNIT_SPEED_NORMAL, + KUNIT_SPEED_MAX = KUNIT_SPEED_NORMAL, +}; + +/* Holds attributes for each test case and suite */ +struct kunit_attributes { + enum kunit_speed speed; +}; + /** * struct kunit_case - represents an individual test case. * * @run_case: the function representing the actual test case. * @name: the name of the test case. * @generate_params: the generator function for parameterized tests. + * @attr: the attributes associated with the test * * A test case is a function with the signature, * ``void (*)(struct kunit *)`` @@@ -127,11 -105,9 +128,11 @@@ struct kunit_case void (*run_case)(struct kunit *test); const char *name; const void* (*generate_params)(const void *prev, char *desc); + struct kunit_attributes attr;
/* private: internal use only. */ enum kunit_status status; + char *module_name; char *log; };
@@@ -156,32 -132,7 +157,32 @@@ static inline char *kunit_status_to_ok_ * &struct kunit_case object from it. See the documentation for * &struct kunit_case for an example on how to use it. */ -#define KUNIT_CASE(test_name) { .run_case = test_name, .name = #test_name } +#define KUNIT_CASE(test_name) \ + { .run_case = test_name, .name = #test_name, \ + .module_name = KBUILD_MODNAME} + +/** + * KUNIT_CASE_ATTR - A helper for creating a &struct kunit_case + * with attributes + * + * @test_name: a reference to a test case function. + * @attributes: a reference to a struct kunit_attributes object containing + * test attributes + */ +#define KUNIT_CASE_ATTR(test_name, attributes) \ + { .run_case = test_name, .name = #test_name, \ + .attr = attributes, .module_name = KBUILD_MODNAME} + +/** + * KUNIT_CASE_SLOW - A helper for creating a &struct kunit_case + * with the slow attribute + * + * @test_name: a reference to a test case function. + */ + +#define KUNIT_CASE_SLOW(test_name) \ + { .run_case = test_name, .name = #test_name, \ + .attr.speed = KUNIT_SPEED_SLOW, .module_name = KBUILD_MODNAME}
/** * KUNIT_CASE_PARAM - A helper for creation a parameterized &struct kunit_case @@@ -202,21 -153,7 +203,21 @@@ */ #define KUNIT_CASE_PARAM(test_name, gen_params) \ { .run_case = test_name, .name = #test_name, \ - .generate_params = gen_params } + .generate_params = gen_params, .module_name = KBUILD_MODNAME} + +/** + * KUNIT_CASE_PARAM_ATTR - A helper for creating a parameterized &struct + * kunit_case with attributes + * + * @test_name: a reference to a test case function. + * @gen_params: a reference to a parameter generator function. + * @attributes: a reference to a struct kunit_attributes object containing + * test attributes + */ +#define KUNIT_CASE_PARAM_ATTR(test_name, gen_params, attributes) \ + { .run_case = test_name, .name = #test_name, \ + .generate_params = gen_params, \ + .attr = attributes, .module_name = KBUILD_MODNAME}
/** * struct kunit_suite - describes a related collection of &struct kunit_case @@@ -227,7 -164,6 +228,7 @@@ * @init: called before every test case. * @exit: called after every test case. * @test_cases: a null terminated array of test cases. + * @attr: the attributes associated with the test suite * * A kunit_suite is a collection of related &struct kunit_case s, such that * @init is called before every test case and @exit is called after every @@@ -247,7 -183,6 +248,7 @@@ struct kunit_suite int (*init)(struct kunit *test); void (*exit)(struct kunit *test); struct kunit_case *test_cases; + struct kunit_attributes attr;
/* private: internal use only */ char status_comment[KUNIT_STATUS_COMMENT_SIZE]; @@@ -256,12 -191,6 +257,12 @@@ int suite_init_err; };
+/* Stores an array of suites, end points one past the end */ +struct kunit_suite_set { + struct kunit_suite * const *start; + struct kunit_suite * const *end; +}; + /** * struct kunit - represents a running instance of a test. * @@@ -309,10 -238,6 +310,10 @@@ static inline void kunit_set_failure(st }
bool kunit_enabled(void); +const char *kunit_action(void); +const char *kunit_filter_glob(void); +char *kunit_filter(void); +char *kunit_filter_action(void);
void kunit_init_test(struct kunit *test, const char *name, char *log);
@@@ -323,21 -248,10 +324,21 @@@ size_t kunit_suite_num_test_cases(struc unsigned int kunit_test_case_num(struct kunit_suite *suite, struct kunit_case *test_case);
+struct kunit_suite_set +kunit_filter_suites(const struct kunit_suite_set *suite_set, + const char *filter_glob, + char *filters, + char *filter_action, + int *err); +void kunit_free_suite_set(struct kunit_suite_set suite_set); + int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_suites);
void __kunit_test_suites_exit(struct kunit_suite **suites, int num_suites);
+void kunit_exec_run_tests(struct kunit_suite_set *suite_set, bool builtin); +void kunit_exec_list_tests(struct kunit_suite_set *suite_set, bool include_attr); + #if IS_BUILTIN(CONFIG_KUNIT) int kunit_run_all_tests(void); #else diff --combined init/Kconfig index 5e7d4885d1bf,0ec8d2a98761..6d35728b94b2 --- a/init/Kconfig +++ b/init/Kconfig @@@ -629,7 -629,6 +629,7 @@@ config TASK_IO_ACCOUNTIN
config PSI bool "Pressure stall information tracking" + select KERNFS help Collect metrics that indicate how overcommitted the CPU, memory, and IO capacity are in the system. @@@ -1791,14 -1790,6 +1791,6 @@@ config DEBUG_RSE
If unsure, say N.
- config EMBEDDED - bool "Embedded system" - select EXPERT - help - This option should be enabled if compiling the kernel for - an embedded system so certain expert options are available - for configuration. - config HAVE_PERF_EVENTS bool help @@@ -1928,6 -1919,8 +1920,8 @@@ config BINDGEN_VERSION_TEX config TRACEPOINTS bool
+ source "kernel/Kconfig.kexec" + endmenu # General setup
source "arch/Kconfig" diff --combined kernel/crash_core.c index 693445e1f7f6,7b87db9973a5..03a7932cde0a --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@@ -10,6 -10,9 +10,9 @@@ #include <linux/utsname.h> #include <linux/vmalloc.h> #include <linux/sizes.h> + #include <linux/kexec.h> + #include <linux/memory.h> + #include <linux/cpuhotplug.h>
#include <asm/page.h> #include <asm/sections.h> @@@ -17,6 -20,10 +20,10 @@@ #include <crypto/sha1.h>
#include "kallsyms_internal.h" + #include "kexec_internal.h" + + /* Per cpu memory for storing cpu states in case of system crash. */ + note_buf_t __percpu *crash_notes;
/* vmcoreinfo stuff */ unsigned char *vmcoreinfo_data; @@@ -314,6 -321,187 +321,187 @@@ static int __init parse_crashkernel_dum } early_param("crashkernel", parse_crashkernel_dummy);
+ int crash_prepare_elf64_headers(struct crash_mem *mem, int need_kernel_map, + void **addr, unsigned long *sz) + { + Elf64_Ehdr *ehdr; + Elf64_Phdr *phdr; + unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz; + unsigned char *buf; + unsigned int cpu, i; + unsigned long long notes_addr; + unsigned long mstart, mend; + + /* extra phdr for vmcoreinfo ELF note */ + nr_phdr = nr_cpus + 1; + nr_phdr += mem->nr_ranges; + + /* + * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping + * area (for example, ffffffff80000000 - ffffffffa0000000 on x86_64). + * I think this is required by tools like gdb. So same physical + * memory will be mapped in two ELF headers. One will contain kernel + * text virtual addresses and other will have __va(physical) addresses. + */ + + nr_phdr++; + elf_sz = sizeof(Elf64_Ehdr) + nr_phdr * sizeof(Elf64_Phdr); + elf_sz = ALIGN(elf_sz, ELF_CORE_HEADER_ALIGN); + + buf = vzalloc(elf_sz); + if (!buf) + return -ENOMEM; + + ehdr = (Elf64_Ehdr *)buf; + phdr = (Elf64_Phdr *)(ehdr + 1); + memcpy(ehdr->e_ident, ELFMAG, SELFMAG); + ehdr->e_ident[EI_CLASS] = ELFCLASS64; + ehdr->e_ident[EI_DATA] = ELFDATA2LSB; + ehdr->e_ident[EI_VERSION] = EV_CURRENT; + ehdr->e_ident[EI_OSABI] = ELF_OSABI; + memset(ehdr->e_ident + EI_PAD, 0, EI_NIDENT - EI_PAD); + ehdr->e_type = ET_CORE; + ehdr->e_machine = ELF_ARCH; + ehdr->e_version = EV_CURRENT; + ehdr->e_phoff = sizeof(Elf64_Ehdr); + ehdr->e_ehsize = sizeof(Elf64_Ehdr); + ehdr->e_phentsize = sizeof(Elf64_Phdr); + + /* Prepare one phdr of type PT_NOTE for each possible CPU */ + for_each_possible_cpu(cpu) { + phdr->p_type = PT_NOTE; + notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu)); + phdr->p_offset = phdr->p_paddr = notes_addr; + phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t); + (ehdr->e_phnum)++; + phdr++; + } + + /* Prepare one PT_NOTE header for vmcoreinfo */ + phdr->p_type = PT_NOTE; + phdr->p_offset = phdr->p_paddr = paddr_vmcoreinfo_note(); + phdr->p_filesz = phdr->p_memsz = VMCOREINFO_NOTE_SIZE; + (ehdr->e_phnum)++; + phdr++; + + /* Prepare PT_LOAD type program header for kernel text region */ + if (need_kernel_map) { + phdr->p_type = PT_LOAD; + phdr->p_flags = PF_R|PF_W|PF_X; + phdr->p_vaddr = (unsigned long) _text; + phdr->p_filesz = phdr->p_memsz = _end - _text; + phdr->p_offset = phdr->p_paddr = __pa_symbol(_text); + ehdr->e_phnum++; + phdr++; + } + + /* Go through all the ranges in mem->ranges[] and prepare phdr */ + for (i = 0; i < mem->nr_ranges; i++) { + mstart = mem->ranges[i].start; + mend = mem->ranges[i].end; + + phdr->p_type = PT_LOAD; + phdr->p_flags = PF_R|PF_W|PF_X; + phdr->p_offset = mstart; + + phdr->p_paddr = mstart; + phdr->p_vaddr = (unsigned long) __va(mstart); + phdr->p_filesz = phdr->p_memsz = mend - mstart + 1; + phdr->p_align = 0; + ehdr->e_phnum++; + pr_debug("Crash PT_LOAD ELF header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n", + phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz, + ehdr->e_phnum, phdr->p_offset); + phdr++; + } + + *addr = buf; + *sz = elf_sz; + return 0; + } + + int crash_exclude_mem_range(struct crash_mem *mem, + unsigned long long mstart, unsigned long long mend) + { + int i, j; + unsigned long long start, end, p_start, p_end; + struct range temp_range = {0, 0}; + + for (i = 0; i < mem->nr_ranges; i++) { + start = mem->ranges[i].start; + end = mem->ranges[i].end; + p_start = mstart; + p_end = mend; + + if (mstart > end || mend < start) + continue; + + /* Truncate any area outside of range */ + if (mstart < start) + p_start = start; + if (mend > end) + p_end = end; + + /* Found completely overlapping range */ + if (p_start == start && p_end == end) { + mem->ranges[i].start = 0; + mem->ranges[i].end = 0; + if (i < mem->nr_ranges - 1) { + /* Shift rest of the ranges to left */ + for (j = i; j < mem->nr_ranges - 1; j++) { + mem->ranges[j].start = + mem->ranges[j+1].start; + mem->ranges[j].end = + mem->ranges[j+1].end; + } + + /* + * Continue to check if there are another overlapping ranges + * from the current position because of shifting the above + * mem ranges. + */ + i--; + mem->nr_ranges--; + continue; + } + mem->nr_ranges--; + return 0; + } + + if (p_start > start && p_end < end) { + /* Split original range */ + mem->ranges[i].end = p_start - 1; + temp_range.start = p_end + 1; + temp_range.end = end; + } else if (p_start != start) + mem->ranges[i].end = p_start - 1; + else + mem->ranges[i].start = p_end + 1; + break; + } + + /* If a split happened, add the split to array */ + if (!temp_range.end) + return 0; + + /* Split happened */ + if (i == mem->max_nr_ranges - 1) + return -ENOMEM; + + /* Location where new range should go */ + j = i + 1; + if (j < mem->nr_ranges) { + /* Move over all ranges one slot towards the end */ + for (i = mem->nr_ranges - 1; i >= j; i--) + mem->ranges[i + 1] = mem->ranges[i]; + } + + mem->ranges[j].start = temp_range.start; + mem->ranges[j].end = temp_range.end; + mem->nr_ranges++; + return 0; + } + Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type, void *data, size_t data_len) { @@@ -455,6 -643,8 +643,6 @@@ static int __init crash_save_vmcoreinfo VMCOREINFO_OFFSET(page, lru); VMCOREINFO_OFFSET(page, _mapcount); VMCOREINFO_OFFSET(page, private); - VMCOREINFO_OFFSET(folio, _folio_dtor); - VMCOREINFO_OFFSET(folio, _folio_order); VMCOREINFO_OFFSET(page, compound_head); VMCOREINFO_OFFSET(pglist_data, node_zones); VMCOREINFO_OFFSET(pglist_data, nr_zones); @@@ -488,7 -678,7 +676,7 @@@ #define PAGE_BUDDY_MAPCOUNT_VALUE (~PG_buddy) VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE); #ifdef CONFIG_HUGETLB_PAGE - VMCOREINFO_NUMBER(HUGETLB_PAGE_DTOR); + VMCOREINFO_NUMBER(PG_hugetlb); #define PAGE_OFFLINE_MAPCOUNT_VALUE (~PG_offline) VMCOREINFO_NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE); #endif @@@ -513,3 -703,206 +701,206 @@@ }
subsys_initcall(crash_save_vmcoreinfo_init); + + static int __init crash_notes_memory_init(void) + { + /* Allocate memory for saving cpu registers. */ + size_t size, align; + + /* + * crash_notes could be allocated across 2 vmalloc pages when percpu + * is vmalloc based . vmalloc doesn't guarantee 2 continuous vmalloc + * pages are also on 2 continuous physical pages. In this case the + * 2nd part of crash_notes in 2nd page could be lost since only the + * starting address and size of crash_notes are exported through sysfs. + * Here round up the size of crash_notes to the nearest power of two + * and pass it to __alloc_percpu as align value. This can make sure + * crash_notes is allocated inside one physical page. + */ + size = sizeof(note_buf_t); + align = min(roundup_pow_of_two(sizeof(note_buf_t)), PAGE_SIZE); + + /* + * Break compile if size is bigger than PAGE_SIZE since crash_notes + * definitely will be in 2 pages with that. + */ + BUILD_BUG_ON(size > PAGE_SIZE); + + crash_notes = __alloc_percpu(size, align); + if (!crash_notes) { + pr_warn("Memory allocation for saving cpu register states failed\n"); + return -ENOMEM; + } + return 0; + } + subsys_initcall(crash_notes_memory_init); + + #ifdef CONFIG_CRASH_HOTPLUG + #undef pr_fmt + #define pr_fmt(fmt) "crash hp: " fmt + + /* + * This routine utilized when the crash_hotplug sysfs node is read. + * It reflects the kernel's ability/permission to update the crash + * elfcorehdr directly. + */ + int crash_check_update_elfcorehdr(void) + { + int rc = 0; + + /* Obtain lock while reading crash information */ + if (!kexec_trylock()) { + pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n"); + return 0; + } + if (kexec_crash_image) { + if (kexec_crash_image->file_mode) + rc = 1; + else + rc = kexec_crash_image->update_elfcorehdr; + } + /* Release lock now that update complete */ + kexec_unlock(); + + return rc; + } + + /* + * To accurately reflect hot un/plug changes of cpu and memory resources + * (including onling and offlining of those resources), the elfcorehdr + * (which is passed to the crash kernel via the elfcorehdr= parameter) + * must be updated with the new list of CPUs and memories. + * + * In order to make changes to elfcorehdr, two conditions are needed: + * First, the segment containing the elfcorehdr must be large enough + * to permit a growing number of resources; the elfcorehdr memory size + * is based on NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES. + * Second, purgatory must explicitly exclude the elfcorehdr from the + * list of segments it checks (since the elfcorehdr changes and thus + * would require an update to purgatory itself to update the digest). + */ + static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu) + { + struct kimage *image; + + /* Obtain lock while changing crash information */ + if (!kexec_trylock()) { + pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n"); + return; + } + + /* Check kdump is not loaded */ + if (!kexec_crash_image) + goto out; + + image = kexec_crash_image; + + /* Check that updating elfcorehdr is permitted */ + if (!(image->file_mode || image->update_elfcorehdr)) + goto out; + + if (hp_action == KEXEC_CRASH_HP_ADD_CPU || + hp_action == KEXEC_CRASH_HP_REMOVE_CPU) + pr_debug("hp_action %u, cpu %u\n", hp_action, cpu); + else + pr_debug("hp_action %u\n", hp_action); + + /* + * The elfcorehdr_index is set to -1 when the struct kimage + * is allocated. Find the segment containing the elfcorehdr, + * if not already found. + */ + if (image->elfcorehdr_index < 0) { + unsigned long mem; + unsigned char *ptr; + unsigned int n; + + for (n = 0; n < image->nr_segments; n++) { + mem = image->segment[n].mem; + ptr = kmap_local_page(pfn_to_page(mem >> PAGE_SHIFT)); + if (ptr) { + /* The segment containing elfcorehdr */ + if (memcmp(ptr, ELFMAG, SELFMAG) == 0) + image->elfcorehdr_index = (int)n; + kunmap_local(ptr); + } + } + } + + if (image->elfcorehdr_index < 0) { + pr_err("unable to locate elfcorehdr segment"); + goto out; + } + + /* Needed in order for the segments to be updated */ + arch_kexec_unprotect_crashkres(); + + /* Differentiate between normal load and hotplug update */ + image->hp_action = hp_action; + + /* Now invoke arch-specific update handler */ + arch_crash_handle_hotplug_event(image); + + /* No longer handling a hotplug event */ + image->hp_action = KEXEC_CRASH_HP_NONE; + image->elfcorehdr_updated = true; + + /* Change back to read-only */ + arch_kexec_protect_crashkres(); + + /* Errors in the callback is not a reason to rollback state */ + out: + /* Release lock now that update complete */ + kexec_unlock(); + } + + static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *v) + { + switch (val) { + case MEM_ONLINE: + crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY, + KEXEC_CRASH_HP_INVALID_CPU); + break; + + case MEM_OFFLINE: + crash_handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_MEMORY, + KEXEC_CRASH_HP_INVALID_CPU); + break; + } + return NOTIFY_OK; + } + + static struct notifier_block crash_memhp_nb = { + .notifier_call = crash_memhp_notifier, + .priority = 0 + }; + + static int crash_cpuhp_online(unsigned int cpu) + { + crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_CPU, cpu); + return 0; + } + + static int crash_cpuhp_offline(unsigned int cpu) + { + crash_handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_CPU, cpu); + return 0; + } + + static int __init crash_hotplug_init(void) + { + int result = 0; + + if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG)) + register_memory_notifier(&crash_memhp_nb); + + if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) { + result = cpuhp_setup_state_nocalls(CPUHP_BP_PREPARE_DYN, + "crash/cpuhp", crash_cpuhp_online, crash_cpuhp_offline); + } + + return result; + } + + subsys_initcall(crash_hotplug_init); + #endif diff --combined kernel/fork.c index f81149739eb9,7b8b63fb0438..a9c18d480dc5 --- a/kernel/fork.c +++ b/kernel/fork.c @@@ -985,14 -985,6 +985,14 @@@ void __put_task_struct(struct task_stru } EXPORT_SYMBOL_GPL(__put_task_struct);
+void __put_task_struct_rcu_cb(struct rcu_head *rhp) +{ + struct task_struct *task = container_of(rhp, struct task_struct, rcu); + + __put_task_struct(task); +} +EXPORT_SYMBOL_GPL(__put_task_struct_rcu_cb); + void __init __weak arch_task_cache_init(void) { }
/* @@@ -1404,8 -1396,8 +1404,8 @@@ EXPORT_SYMBOL_GPL(mmput_async) * This changes mm's executable file (shown as symlink /proc/[pid]/exe). * * Main users are mmput() and sys_execve(). Callers prevent concurrent - * invocations: in mmput() nobody alive left, in execve task is single - * threaded. + * invocations: in mmput() nobody alive left, in execve it happens before + * the new mm is made visible to anyone. * * Can only fail if new_exe_file != NULL. */ @@@ -1440,9 -1432,7 +1440,7 @@@ int set_mm_exe_file(struct mm_struct *m /** * replace_mm_exe_file - replace a reference to the mm's executable file * - * This changes mm's executable file (shown as symlink /proc/[pid]/exe), - * dealing with concurrent invocation and without grabbing the mmap lock in - * write mode. + * This changes mm's executable file (shown as symlink /proc/[pid]/exe). * * Main user is sys_prctl(PR_SET_MM_MAP/EXE_FILE). */ @@@ -1472,22 -1462,20 +1470,20 @@@ int replace_mm_exe_file(struct mm_struc return ret; }
- /* set the new file, lockless */ ret = deny_write_access(new_exe_file); if (ret) return -EACCES; get_file(new_exe_file);
- old_exe_file = xchg(&mm->exe_file, new_exe_file); + /* set the new file */ + mmap_write_lock(mm); + old_exe_file = rcu_dereference_raw(mm->exe_file); + rcu_assign_pointer(mm->exe_file, new_exe_file); + mmap_write_unlock(mm); + if (old_exe_file) { - /* - * Don't race with dup_mmap() getting the file and disallowing - * write access while someone might open the file writable. - */ - mmap_read_lock(mm); allow_write_access(old_exe_file); fput(old_exe_file); - mmap_read_unlock(mm); } return 0; } diff --combined scripts/checkpatch.pl index a9841148cde2,6e789dc07420..7d16f863edf1 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@@ -74,6 -74,8 +74,8 @@@ my $git_command ='export LANGUAGE=en_US my $tabsize = 8; my ${CONFIG_} = "CONFIG_";
+ my %maybe_linker_symbol; # for externs in c exceptions, when seen in *vmlinux.lds.h + sub help { my ($exitcode) = @_;
@@@ -3270,7 -3272,7 +3272,7 @@@ sub process # A Fixes:, link or signature tag line $commit_log_possible_stack_dump)) { WARN("COMMIT_LOG_LONG_LINE", - "Possible unwrapped commit description (prefer a maximum 75 chars per line)\n" . $herecurr); + "Prefer a maximum 75 chars per line (possible unwrapped commit description?)\n" . $herecurr); $commit_log_long_line = 1; }
@@@ -6051,6 -6053,9 +6053,9 @@@
# check for line continuations outside of #defines, preprocessor #, and asm
+ } elsif ($realfile =~ m@/vmlinux.lds.h$@) { + $line =~ s/(\w+)/$maybe_linker_symbol{$1}++/ge; + #print "REAL: $realfile\nln: $line\nkeys:", sort keys %maybe_linker_symbol; } else { if ($prevline !~ /^..*\$/ && $line !~ /^+\s*#.*\$/ && # preprocessor @@@ -7119,6 -7124,21 +7124,21 @@@ "arguments for function declarations should follow identifier\n" . $herecurr); }
+ } elsif ($realfile =~ /.c$/ && defined $stat && + $stat =~ /^+extern struct\s+(\w+)\s+(\w+)[];/) + { + my ($st_type, $st_name) = ($1, $2); + + for my $s (keys %maybe_linker_symbol) { + #print "Linker symbol? $st_name : $s\n"; + goto LIKELY_LINKER_SYMBOL + if $st_name =~ /$s/; + } + WARN("AVOID_EXTERNS", + "found a file-scoped extern type:$st_type name:$st_name in .c file\n" + . "is this a linker symbol ?\n" . $herecurr); + LIKELY_LINKER_SYMBOL: + } elsif ($realfile =~ /.c$/ && defined $stat && $stat =~ /^.\s*extern\s+/) { @@@ -7457,30 -7477,6 +7477,30 @@@ } }
+# Complain about RCU Tasks Trace used outside of BPF (and of course, RCU). + our $rcu_trace_funcs = qr{(?x: + rcu_read_lock_trace | + rcu_read_lock_trace_held | + rcu_read_unlock_trace | + call_rcu_tasks_trace | + synchronize_rcu_tasks_trace | + rcu_barrier_tasks_trace | + rcu_request_urgent_qs_task + )}; + our $rcu_trace_paths = qr{(?x: + kernel/bpf/ | + include/linux/bpf | + net/bpf/ | + kernel/rcu/ | + include/linux/rcu + )}; + if ($line =~ /\b($rcu_trace_funcs)\s*(/) { + if ($realfile !~ m{^$rcu_trace_paths}) { + WARN("RCU_TASKS_TRACE", + "use of RCU tasks trace is incorrect outside BPF or core RCU code\n" . $herecurr); + } + } + # check for lockdep_set_novalidate_class if ($line =~ /^.\s*lockdep_set_novalidate_class\s*(/ || $line =~ /__lockdep_no_validate__\s*)/ ) { diff --combined tools/testing/selftests/proc/proc-empty-vm.c index e1052443d070,dfbcb3ce2194..b16c13688b88 --- a/tools/testing/selftests/proc/proc-empty-vm.c +++ b/tools/testing/selftests/proc/proc-empty-vm.c @@@ -1,3 -1,4 +1,4 @@@ + #if defined __amd64__ || defined __i386__ /* * Copyright (c) 2022 Alexey Dobriyan adobriyan@gmail.com * @@@ -37,6 -38,10 +38,10 @@@ #include <sys/wait.h> #include <unistd.h>
+ #ifdef __amd64__ + #define TEST_VSYSCALL + #endif + /* * 0: vsyscall VMA doesn't exist vsyscall=none * 1: vsyscall VMA is --xp vsyscall=xonly @@@ -77,7 -82,7 +82,7 @@@ static const char proc_pid_smaps_vsysca "Swap: 0 kB\n" "SwapPss: 0 kB\n" "Locked: 0 kB\n" -"THPeligible: 0\n" +"THPeligible: 0\n" /* * "ProtectionKey:" field is conditional. It is possible to check it as well, * but I don't have such machine. @@@ -107,7 -112,7 +112,7 @@@ static const char proc_pid_smaps_vsysca "Swap: 0 kB\n" "SwapPss: 0 kB\n" "Locked: 0 kB\n" -"THPeligible: 0\n" +"THPeligible: 0\n" /* * "ProtectionKey:" field is conditional. It is possible to check it as well, * but I'm too tired. @@@ -119,6 -124,7 +124,7 @@@ static void sigaction_SIGSEGV(int _, si _exit(EXIT_FAILURE); }
+ #ifdef TEST_VSYSCALL static void sigaction_SIGSEGV_vsyscall(int _, siginfo_t *__, void *___) { _exit(g_vsyscall); @@@ -170,6 -176,7 +176,7 @@@ static void vsyscall(void exit(1); } } + #endif
static int test_proc_pid_maps(pid_t pid) { @@@ -299,7 -306,9 +306,9 @@@ int main(void { int rv = EXIT_SUCCESS;
+ #ifdef TEST_VSYSCALL vsyscall(); + #endif
switch (g_vsyscall) { case 0: @@@ -346,6 -355,14 +355,14 @@@
#ifdef __amd64__ munmap(NULL, ((size_t)1 << 47) - 4096); + #elif defined __i386__ + { + size_t len; + + for (len = -4096;; len -= 4096) { + munmap(NULL, len); + } + } #else #error "implement 'unmap everything'" #endif @@@ -386,3 -403,9 +403,9 @@@
return rv; } + #else + int main(void) + { + return 4; + } + #endif