> I am using vanilla Linux 6.8.10, and I've just noticed this BUG in my
> dmesg log. I have no idea what triggered it, and especially since I
> have not even mounted any NFS filesystems?!
Hi all,
I have the exact same bug. I'm using the NixOS kernel, but as soon as it
was updated to 6.8.10 my server went into a crash-reboot loop.
The server hosts an NFS daemon and crashes about 10 seconds
after the tty login prompt is displayed.
Downgrading to 6.8.9 fixes the issue.
Regards,
Paul Grandperrin
Hi all,
I am running a dual Xeon machine as my personal virtualization server at home, using
Proxmox VE, and with their latest update 8.2 which brings kernel 6.8.4-2-pve, I am seeing
a serious regression which breaks my setup because it does not boot any more. The last
message I see displayed during boot is: "Timed out for waiting the udev queue being
empty.", and then it hangs indefinitely.
Previous kernel 6.5.13-5-pve worked fine, with the following caveat: I had similar
problems initially with earlier kernels too, so from the very beginning with this machine
using PVE, I had to set the kernel parameter rootdelay=60 in GRUB. With that, everything
was fine: the buses settled, the RAID controller and root device were found, and the
system booted. With the newer 6.8.4 kernel this no longer works, even though I tried
increasing the rootdelay parameter to 120.
I was able to reproduce and bisect this regression also with mainline kernels (also with
stable 6.8.8 and 6.9-rc), so I thought it would be a good idea to report it upstream to
you guys.
This is an older server machine: 2-socket Ivy Bridge Xeon E5-2697 v2 (24C/48T) in an Asus
Z9PE-D16/2L motherboard (Intel C-602A chipset); BIOS patched to the latest available from
Asus. All memory slots are occupied, so 256 GB RAM in total. It also has an Asus ASMB6 iKVM
BMC, which supplies virtual storage devices (see dmesg below) to which ISO images can be
attached via network to boot/install an OS from.
Storage config:
I have two single M4 256 GiB SATA SSD drives attached to internal mainboard SATA ports;
one of them is my root device and PVE installation drive. The other one I use for storing
ISO images. My main VM storage is attached to a battery backed-up Adaptec 5805 SATA/SAS
RAID controller (w/ latest FW build 18948) attached to SATA/SAS enclosure of my Supermicro
server casing, having eight disk drives in total: I have one RAID1 Array, consisting of
two Samsung 1 TiB SATA SSDs for VM root disk images, and one RAID5 Array, consisting of 6
Hitachi 1 TiB HDDs which I use for storing VM data disk images. On both arrays, I use a
LVM thin pool as PVE storage location. When everything boots up, the system is running
just fine and smoothly with ~15 VMs at the same time (and has for years!). Although this
is "only" a homelab server, I love it dearly and use it for the VMs of many private
projects, among them a Windows Server VM running MS SQL Server, and Linux server VMs
running Oracle Database Server (I'm a database guy).
I attach the dmesg output of the previous working kernel 6.5.13-5-pve, my git bisect log
and the output of lspci -v. The last kernel messages I see from the failing kernel
versions are these:
...
[ 5.540424] usb-storage 1-1.3.4:1.0: USB Mass Storage device detected
[ 5.540670] scsi host10: usb-storage 1-1.3.4:1.0
[ 5.947794] scsi 8:0:0:0: CD-ROM AMI Virtual CDROM0 1.00 PQ: 0 ANSI: 0 CCS
[ 6.267830] scsi 9:0:0:0: Direct-Access AMI Virtual Floppy0 1.00 PQ: 0 ANSI: 0 CCS
[ 6.555845] scsi 10:0:0:0: Direct-Access AMI Virtual HDISK0 1.00 PQ: 0 ANSI: 0 CCS
and then the error message "Timed out for waiting the udev queue being empty." and the
system hangs. In case of working kernels, the boot process would continue with this:
...
[ 5.947794] scsi 8:0:0:0: CD-ROM AMI Virtual CDROM0 1.00 PQ: 0 ANSI: 0 CCS
[ 6.267830] scsi 9:0:0:0: Direct-Access AMI Virtual Floppy0 1.00 PQ: 0 ANSI: 0 CCS
[ 6.555845] scsi 10:0:0:0: Direct-Access AMI Virtual HDISK0 1.00 PQ: 0 ANSI: 0 CCS
[ 32.592054] scsi 0:3:1:0: Enclosure ADAPTEC Virtual SGPIO 1 0001 PQ: 0 ANSI: 5
[ 61.536097] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 61.536215] sd 0:0:0:0: [sda] 1998565376 512-byte logical blocks: (1.02 TB/953 GiB)
[ 61.536236] sd 0:0:1:0: Attached scsi generic sg1 type 0
[ 61.536239] sd 0:0:0:0: [sda] Write Protect is off
[ 61.536246] sd 0:0:0:0: [sda] Mode Sense: 12 00 10 08
[ 61.536283] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 61.536340] scsi 0:1:0:0: Attached scsi generic sg2 type 0
[ 61.536383] sd 0:0:1:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
[ 61.536400] sd 0:0:1:0: [sdb] 9762222080 512-byte logical blocks: (5.00 TB/4.54 TiB)
[ 61.536414] sd 0:0:1:0: [sdb] Write Protect is off
[ 61.536418] sd 0:0:1:0: [sdb] Mode Sense: 12 00 10 08
[ 61.536439] sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 61.536455] scsi 0:1:1:0: Attached scsi generic sg3 type 0
[ 61.536616] scsi 0:1:2:0: Attached scsi generic sg4 type 0
[ 61.536750] scsi 0:1:3:0: Attached scsi generic sg5 type 0
[ 61.536840] scsi 0:1:4:0: Attached scsi generic sg6 type 0
[ 61.536930] scsi 0:1:5:0: Attached scsi generic sg7 type 0
[ 61.537027] scsi 0:1:6:0: Attached scsi generic sg8 type 0
[ 61.537122] scsi 0:1:7:0: Attached scsi generic sg9 type 0
[ 61.537248] sd 0:0:1:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
[ 61.537274] scsi 0:3:0:0: Attached scsi generic sg10 type 13
[ 61.537390] scsi 0:3:1:0: Attached scsi generic sg11 type 13
[ 61.537558] scsi 1:0:0:0: Direct-Access ATA M4-CT256M4SSD2 0309 PQ: 0 ANSI: 5
[ 61.537851] sd 1:0:0:0: Attached scsi generic sg12 type 0
[ 61.537919] scsi: waiting for bus probes to complete ...
[ 61.537973] sd 1:0:0:0: [sdc] 500118192 512-byte logical blocks: (256 GB/238 GiB)
[ 61.537986] sd 1:0:0:0: [sdc] Write Protect is off
[ 61.537989] sd 1:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[ 61.538002] sd 1:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 61.538022] sd 1:0:0:0: [sdc] Preferred minimum I/O size 512 bytes
[ 61.538924] sdc: sdc1 sdc2 < sdc5 >
...
so it seems to me the initialization of the Adaptec controller is the culprit.
I have tested and reproduced the regression with mainline kernels according to the
following list (please excuse me if it's too long ;-)
See at the very bottom for first bad commit I found this way. I always built as "make
olddefconfig" using the 6.5.13-5-pve config as starting point.
-------------------------------------------------------------------
Proxmox Virtual Environment (PVE) Kernels
========================================
6.5.13-5-pve WORKS last working PVE (8.1) kernel; 5.15-pve and 6.2-pve work too
6.8.4-2-pve NOPE PVE release 8.2
Mainline Kernels
================
6.9.0-rc6+ NOPE Most recent (2024-05-01)
6.9.0-rc5+ NOPE Most recent (2024-04-27)
6.8.8 NOPE Most recent released (2024-04-29)
6.8.7 NOPE Most recent released (2024-04-27)
6.8.4 NOPE Same version as most recent released PVE 8.2 Kernel
6.5.13 WORKS
My tests, reverts on top of 6.8.8
=================================
6.8.8+ WORKS Revert "Merge tag 'scsi-fixes' of
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi" - This reverts commit
6d20acbf3e3a32d331947dbc3802cf2d1a399e7d, reversing changes made to
fef85269a19d277f23fc5ff08a3c356beeb54cb3
6.8.8+ WORKS Revert "scsi: core: Consult supported VPD page list prior to
fetching page" - This reverts commit b5fc07a5fb56216a49e6c1d0b172d5464d99a89b (this is the
first bad commit of my bisect session, see below, and a single patch as part of the above
merged tag 'scsi-fixes')
Bisecting, starting from 6.9.0-rc5 (bad) and 6.5.13 (good)
==========================================================
root@linus:/usr/src/linux# git checkout master
Already on 'master'
Your branch is up to date with 'origin/master'.
root@linus:/usr/src/linux# git log
commit 9d1ddab261f3e2af7c384dc02238784ce0cf9f98 (HEAD -> master, origin/master, origin/HEAD)
Merge: 71b1543c83d6 77d8aa79ecfb
Author: Linus Torvalds <torvalds(a)linux-foundation.org>
Date: Tue Apr 23 09:37:32 2024 -0700
Merge tag '6.9-rc5-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
root@linus:/usr/src/linux# cp /boot/config-6.5.13-5-pve .config
root@linus:/usr/src/linux# git bisect start
status: waiting for both good and bad commits
root@linus:/usr/src/linux# git bisect bad
status: waiting for good commit(s), bad commit known
root@linus:/usr/src/linux# git bisect good v6.5.13
Bisecting: a merge base must be tested
[2dde18cd1d8fac735875f2e4987f11817cc0bc2c] Linux 6.5
root@linus:/usr/src/linux# make olddefconfig
.config:10571:warning: symbol value 'm' invalid for ANDROID_BINDER_IPC
.config:10572:warning: symbol value 'm' invalid for ANDROID_BINDERFS
#
# configuration written to .config
#
root@linus:/usr/src/linux# make -j 48
=> 6.5.0 (Merge Base) WORKS
root@linus:/usr/src/linux# git bisect good
Bisecting: 32111 revisions left to test after this (roughly 15 steps)
[0f5cc96c367f2e780eb492cc9cab84e3b2ca88da] Merge tag 's390-6.7-3' of
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
root@linus:/usr/src/linux# make -j 48
=> 6.7.0-rc2+ WORKS
root@linus:/usr/src/linux# git bisect good
Bisecting: 16056 revisions left to test after this (roughly 14 steps)
[ee138217c32ccbfa75d5ea6b766158148e98f6fa] Merge tag 'btree-remove-btnum-6.9_2024-02-23'
of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
=> 6.8.0-rc4+ WORKS
root@linus:/usr/src/linux# git bisect good
Bisecting: 8214 revisions left to test after this (roughly 13 steps)
[e5e038b7ae9da96b93974bf072ca1876899a01a3] Merge tag 'fs_for_v6.9-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
=> 6.8.0+ NOPE => does not find root device, does not boot;
message: "BUG: arch topology borken the CPU domain not a subset of the NUMA domain"
message: "Timed out for waiting the udev queue being empty."
root@linus:/usr/src/linux# git bisect bad
Bisecting: 3954 revisions left to test after this (roughly 12 steps)
[f153fbe1ea11939e2514ba4b3b62bbd946e2892c] Merge tag 'erofs-for-6.9-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
=> 6.8.0+ (HEAD detached at f153fbe1ea11) NOPE => same as above
root@linus:/usr/src/linux# git bisect bad
Bisecting: 1945 revisions left to test after this (roughly 11 steps)
[1ddeeb2a058d7b2a58ed9e820396b4ceb715d529] Merge tag 'for-6.9/block-20240310' of
git://git.kernel.dk/linux
=> 6.8.0+ (HEAD detached at 1ddeeb2a058d) NOPE => same as above
root@linus:/usr/src/linux# git bisect bad
Bisecting: 970 revisions left to test after this (roughly 10 steps)
[2652b99e43403dc464f3648483ffb38e48872fe4] ice: virtchnl: stop pretending to support RSS
over AQ or registers
=> 6.8.0-rc6+ (2652b99e4340) NOPE => same
root@linus:/usr/src/linux# git bisect bad
Bisecting: 506 revisions left to test after this (roughly 9 steps)
[efa80dcbb7a3ecc4a1b2f54624c49b5a612f92b3] Merge tag 'trace-v6.8-rc5' of
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
=> 6.8.0-rc5+ (efa80dcbb7a3) WORKS
root@linus:/usr/src/linux# git bisect good
Bisecting: 251 revisions left to test after this (roughly 8 steps)
[c6a597fcc7ad7335a3ecf8f5287a0459f793a257] Merge tag 'loongarch-fixes-6.8-3' of
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
=> 6.8.0-rc5+ (c6a597fcc7ad) WORKS
root@linus:/usr/src/linux# git bisect good
Bisecting: 126 revisions left to test after this (roughly 7 steps)
[cf1182944c7cc9f1c21a8a44e0d29abe12527412] Merge tag 'lsm-pr-20240227' of
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
=> 6.8.0-rc6+ (cf1182944c7c) NOPE
root@linus:/usr/src/linux# git bisect bad
Bisecting: 62 revisions left to test after this (roughly 6 steps)
[4ca0d9894fd517a2f2c0c10d26ebe99ab4396fe3] Merge tag 'erofs-for-6.8-rc6-fixes' of
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
=> 6.8.0-rc5+ (4ca0d9894fd5) NOPE
root@linus:/usr/src/linux# git bisect bad
Bisecting: 36 revisions left to test after this (roughly 5 steps)
[ac389bc0ca56e1a2f92b2a17e58298390a3879a8] Merge tag 'cxl-fixes-6.8-rc6' of
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
=> 6.8.0-rc5+ (ac389bc0ca56) NOPE
root@linus:/usr/src/linux# git bisect bad
Bisecting: 12 revisions left to test after this (roughly 4 steps)
[40de53fd002c6ba087a623722915e8006ed68a02] Merge branch 'for-6.8/cxl-cper' into for-6.8/cxl
=> 6.8.0-rc5+ (40de53fd002c) WORKS
root@linus:/usr/src/linux# git bisect good
Bisecting: 6 revisions left to test after this (roughly 3 steps)
[9ddf190a7df77b77817f955fdb9c2ae9d1c9c9a3] scsi: jazz_esp: Only build if SCSI core is builtin
=> 6.8.0-rc1+ (9ddf190a7df7) NOPE
root@linus:/usr/src/linux# git bisect bad
Bisecting: 2 revisions left to test after this (roughly 2 steps)
[de959094eb2197636f7c803af0943cb9d3b35804] scsi: target: pscsi: Fix bio_put() for error case
=> 6.8.0-rc1+ (de959094eb21) NOPE
root@linus:/usr/src/linux# git bisect bad
Bisecting: 0 revisions left to test after this (roughly 1 step)
[b5fc07a5fb56216a49e6c1d0b172d5464d99a89b] scsi: core: Consult supported VPD page list
prior to fetching page
=> 6.8.0-rc1+ (b5fc07a5fb56) NOPE
root@linus:/usr/src/linux# git bisect bad
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[321da3dc1f3c92a12e3c5da934090d2992a8814c] scsi: sd: usb_storage: uas: Access media prior
to querying device properties
=> 6.8.0-rc1+ (321da3dc1f3c) WORKS
root@linus:/usr/src/linux# git bisect good
b5fc07a5fb56216a49e6c1d0b172d5464d99a89b is the first bad commit
commit b5fc07a5fb56216a49e6c1d0b172d5464d99a89b
Author: Martin K. Petersen <martin.petersen(a)oracle.com>
Date: Wed Feb 14 17:14:11 2024 -0500
scsi: core: Consult supported VPD page list prior to fetching page
Commit c92a6b5d6335 ("scsi: core: Query VPD size before getting full
page") removed the logic which checks whether a VPD page is present on
the supported pages list before asking for the page itself. That was
done because SPC helpfully states "The Supported VPD Pages VPD page
list may or may not include all the VPD pages that are able to be
returned by the device server". Testing had revealed a few devices
that supported some of the 0xBn pages but didn't actually list them in
page 0.
Julian Sikorski bisected a problem with his drive resetting during
discovery to the commit above. As it turns out, this particular drive
firmware will crash if we attempt to fetch page 0xB9.
Various approaches were attempted to work around this. In the end,
reinstating the logic that consults VPD page 0 before fetching any
other page was the path of least resistance. A firmware update for the
devices which originally compelled us to remove the check has since
been released.
Link: https://lore.kernel.org/r/20240214221411.2888112-1-martin.petersen@oracle.c…
Fixes: c92a6b5d6335 ("scsi: core: Query VPD size before getting full page")
Cc: stable(a)vger.kernel.org
Cc: Bart Van Assche <bvanassche(a)acm.org>
Reported-by: Julian Sikorski <belegdol(a)gmail.com>
Tested-by: Julian Sikorski <belegdol(a)gmail.com>
Reviewed-by: Lee Duncan <lee.duncan(a)suse.com>
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
drivers/scsi/scsi.c | 22 ++++++++++++++++++++--
include/scsi/scsi_device.h | 4 ----
2 files changed, 20 insertions(+), 6 deletions(-)
root@linus:/usr/src/linux#
-------------------------------------------------------------------
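For readers who do not have the SCSI code in front of them, the check that commit b5fc07a5fb56 reinstates can be pictured roughly as follows. This is only a minimal user-space sketch, not the kernel implementation: the function name vpd_page_listed() is hypothetical, and it assumes the usual layout of the Supported VPD Pages page (page 0x00), where the list of page codes starts after a 4-byte header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return true if `page` appears in the Supported VPD
 * Pages list. `page0` is the raw page 0x00 response with `len` valid bytes;
 * the supported page codes are assumed to start at byte 4.
 */
bool vpd_page_listed(const uint8_t *page0, size_t len, uint8_t page)
{
    /* Page 0 is the list itself, so there is nothing to look up. */
    if (page == 0x00)
        return true;

    if (page0 == NULL || len < 4)
        return false;

    for (size_t i = 4; i < len; i++) {
        if (page0[i] == page)
            return true;
    }

    /* Not listed: the caller should not try to fetch this page. */
    return false;
}
```

With a check like this in front of every fetch, a device that does not list a page in page 0 is never asked for it, which is the behaviour the commit message describes.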
Best regards,
Peter Schneider
--
Climb the mountain not to plant your flag, but to embrace the challenge,
enjoy the air and behold the view. Climb it so you can see the world,
not so the world can see you. -- David McCullough Jr.
OpenPGP: 0xA3828BD796CCE11A8CADE8866E3A92C92C3FF244
Download: https://www.peters-netzplatz.de/download/pschneider1968_pub.asc
https://keys.mailvelope.com/pks/lookup?op=get&search=pschneider1968@googlem…
https://keys.mailvelope.com/pks/lookup?op=get&search=pschneider1968@gmail.c…
On Fri, May 24, 2024 at 01:07:18AM +0000, Lin Gui (桂林) wrote:
> Dear @Greg KH<mailto:gregkh@linuxfoundation.org>,
>
> Base : kernel-5.15.159
>
> diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
> index a569066..d656964 100644
> --- a/drivers/mmc/core/mmc.c
> +++ b/drivers/mmc/core/mmc.c
> @@ -1800,7 +1800,13 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
> if (err)
> goto free_card;
>
> - } else if (!mmc_card_hs400es(card)) {
> + } else if (mmc_card_hs400es(card)){
> + if (host->ops->execute_hs400_tuning) {
> + err = host->ops->execute_hs400_tuning(host, card);
> + if (err)
> + goto free_card;
> + }
> + } else {
> /* Select the desired bus width optionally */
> err = mmc_select_bus_width(card);
> if (err > 0 && mmc_card_hs(card)) {
>
The patch is corrupted, and sent in html format.
But most importantly, you did not test this to verify it works at all,
which means that you don't really need it?
confused,
greg k-h
From: Jeff Xu <jeffxu(a)google.com>
Add documentation for MFD_NOEXEC_SEAL and MFD_EXEC
Cc: stable(a)vger.kernel.org
Signed-off-by: Jeff Xu <jeffxu(a)google.com>
---
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/mfd_noexec.rst | 90 ++++++++++++++++++++++
2 files changed, 91 insertions(+)
create mode 100644 Documentation/userspace-api/mfd_noexec.rst
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index 5926115ec0ed..8a251d71fa6e 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -32,6 +32,7 @@ Security-related interfaces
seccomp_filter
landlock
lsm
+ mfd_noexec
spec_ctrl
tee
diff --git a/Documentation/userspace-api/mfd_noexec.rst b/Documentation/userspace-api/mfd_noexec.rst
new file mode 100644
index 000000000000..6f11ad86b076
--- /dev/null
+++ b/Documentation/userspace-api/mfd_noexec.rst
@@ -0,0 +1,90 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================
+Introduction of non executable mfd
+==================================
+:Author:
+ Daniel Verkamp <dverkamp(a)chromium.org>
+ Jeff Xu <jeffxu(a)google.com>
+
+:Contributor:
+ Aleksa Sarai <cyphar(a)cyphar.com>
+ Barnabás Pőcze <pobrn(a)protonmail.com>
+ David Rheinsberg <david(a)readahead.eu>
+
+Since Linux introduced the memfd feature, memfd have always had their
+execute bit set, and the memfd_create() syscall doesn't allow setting
+it differently.
+
+However, in a secure by default system, such as ChromeOS, (where all
+executables should come from the rootfs, which is protected by Verified
+boot), this executable nature of memfd opens a door for NoExec bypass
+and enables “confused deputy attack”. E.g, in VRP bug [1]: cros_vm
+process created a memfd to share the content with an external process,
+however the memfd is overwritten and used for executing arbitrary code
+and root escalation. [2] lists more VRP in this kind.
+
+On the other hand, executable memfd has its legit use, runc uses memfd’s
+seal and executable feature to copy the contents of the binary then
+execute them, for such system, we need a solution to differentiate runc's
+use of executable memfds and an attacker's [3].
+
+To address those above.
+ - Let memfd_create() set X bit at creation time.
+ - Let memfd be sealed for modifying X bit when NX is set.
+ - A new pid namespace sysctl: vm.memfd_noexec to help applications to
+ migrating and enforcing non-executable MFD.
+
+User API
+========
+``int memfd_create(const char *name, unsigned int flags)``
+
+``MFD_NOEXEC_SEAL``
+ When MFD_NOEXEC_SEAL bit is set in the ``flags``, memfd is created
+ with NX. F_SEAL_EXEC is set and the memfd can't be modified to
+ add X later.
+ This is the most common case for the application to use memfd.
+
+``MFD_EXEC``
+ When MFD_EXEC bit is set in the ``flags``, memfd is created with X.
+
+Note:
+ ``MFD_NOEXEC_SEAL`` and ``MFD_EXEC`` doesn't change the sealable
+ characteristic of memfd, which is controlled by ``MFD_ALLOW_SEALING``.
+
+
+Sysctl:
+========
+``pid namespaced sysctl vm.memfd_noexec``
+
+The new pid namespaced sysctl vm.memfd_noexec has 3 values:
+
+ - 0: MEMFD_NOEXEC_SCOPE_EXEC
+ memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
+ MFD_EXEC was set.
+
+ - 1: MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL
+ memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
+ MFD_NOEXEC_SEAL was set.
+
+ - 2: MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED
+ memfd_create() without MFD_NOEXEC_SEAL will be rejected.
+
+The sysctl allows finer control of memfd_create for old-software that
+doesn't set the executable bit, for example, a container with
+vm.memfd_noexec=1 means the old-software will create non-executable memfd
+by default while new-software can create executable memfd by setting
+MFD_EXEC.
+
+The value of memfd_noexec is passed to child namespace at creation time,
+in addition, the setting is hierarchical, i.e. during memfd_create,
+we will search from current ns to root ns and use the most restrictive
+setting.
+
+Reference:
+==========
+[1] https://crbug.com/1305267
+
+[2] https://bugs.chromium.org/p/chromium/issues/list?q=type%3Dbug-security%20me…
+
+[3] https://lwn.net/Articles/781013/
--
2.45.1.288.g0e0cd299f1-goog
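The sysctl rules documented in the patch above (three levels, most restrictive setting from the current namespace up to the root) can be modelled in a few lines. This is a user-space sketch under stated assumptions, not the kernel code: struct pidns_model, the MODEL_* constants and both function names are hypothetical, the model applies the default flag first and only then enforces level 2, and it assumes callers never pass both flags at once.

```c
#include <stddef.h>

/* Model-local stand-ins for MFD_NOEXEC_SEAL and MFD_EXEC (hypothetical values). */
enum {
    MODEL_MFD_NOEXEC_SEAL = 0x0008,
    MODEL_MFD_EXEC        = 0x0010
};

/* Hypothetical model of a pid namespace carrying a vm.memfd_noexec value. */
struct pidns_model {
    int memfd_noexec;                 /* 0, 1 or 2 */
    const struct pidns_model *parent; /* NULL for the root namespace */
};

/* Walk from the current namespace to the root, keeping the most restrictive value. */
int effective_memfd_noexec(const struct pidns_model *ns)
{
    int scope = 0;

    for (; ns != NULL; ns = ns->parent) {
        if (ns->memfd_noexec > scope)
            scope = ns->memfd_noexec;
    }
    return scope;
}

/*
 * Return the flags the model would use for memfd_create(), or -1 if the
 * request would be rejected.
 */
int resolve_memfd_flags(const struct pidns_model *ns, unsigned int flags)
{
    int scope = effective_memfd_noexec(ns);

    /* Neither flag given: the sysctl level picks the default. */
    if (!(flags & (MODEL_MFD_EXEC | MODEL_MFD_NOEXEC_SEAL)))
        flags |= (scope == 0) ? MODEL_MFD_EXEC : MODEL_MFD_NOEXEC_SEAL;

    /* Level 2: anything that is not MFD_NOEXEC_SEAL is refused. */
    if (scope == 2 && !(flags & MODEL_MFD_NOEXEC_SEAL))
        return -1;

    return (int)flags;
}
```

In this model a child namespace set to 0 under a parent set to 1 still gets non-executable memfds by default, because the parent's stricter value wins.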
Hello José,
I'm testing on the 6.6 kernel with a "0b95:1790 ASIX Electronics Corp.
AX88179 Gigabit Ethernet" device.
After applying commit 56f78615bcb1 ("net: usb: ax88179_178a: avoid
writing the mac address before first reading"),
the network no longer works after bringing the device down.
After plugging in the device, it generally will work with ifconfig:
$ ifconfig eth0 <ip address>
However, if I then try bringing the device down and back up, it no longer works.
$ ifconfig eth0 down
$ ifconfig eth0 <ip address>
$ ethtool eth0 | grep detected
Link detected: no
The link will continue to report as undetected.
If I revert 56f78615bcb1 the device will work after bringing it down
and back up.
If I build at commit d7a319889498 ("net: usb: ax88179_178a: avoid two
consecutive device resets") and its
parent d7a319889498^ these also work.
Is this something you have seen before with your test devices?
Regards,
Jeff
Hi all,
This series fixes some issues in the bootloader-kernel
interface.
The first two patches fix booting with devicetree; the last two
improve the kernel's tolerance of different bootloader implementations.
Please review.
Thanks
Signed-off-by: Jiaxun Yang <jiaxun.yang(a)flygoat.com>
---
Jiaxun Yang (4):
LoongArch: Fix built-in DTB detection
LoongArch: smp: Add all CPUs enabled by fdt to NUMA node 0
LoongArch: Fix entry point in image header
LoongArch: Clear higher address bits in JUMP_VIRT_ADDR
arch/loongarch/include/asm/stackframe.h | 4 +++-
arch/loongarch/kernel/head.S | 2 +-
arch/loongarch/kernel/setup.c | 6 ++++--
arch/loongarch/kernel/smp.c | 5 ++++-
4 files changed, 12 insertions(+), 5 deletions(-)
---
base-commit: 124cfbcd6d185d4f50be02d5f5afe61578916773
change-id: 20240521-loongarch-booting-fixes-366e13e7ca55
Best regards,
--
Jiaxun Yang <jiaxun.yang(a)flygoat.com>
From: Jorge Ramirez-Ortiz <jorge(a)foundries.io>
commit 67380251e8bbd3302c64fea07f95c31971b91c22 upstream
Requesting a retune before switching to the RPMB partition has been
observed to cause CRC errors on the RPMB reads (-EILSEQ).
Since RPMB reads can not be retried, the clients would be directly
affected by the errors.
This commit disables the retune request prior to switching to the RPMB
partition: mmc_retune_pause() no longer triggers a retune before the
pause period begins.
This was verified with the sdhci-of-arasan driver (ZynqMP) configured
for HS200 using two separate eMMC cards (DG4064 and 064GB2). In both
cases, the error was easy to reproduce triggering every few tenths of
reads.
With this commit, systems that were utilizing OP-TEE to access RPMB
variables will experience an enhanced performance. Specifically, when
OP-TEE is configured to employ RPMB as a secure storage solution, it not
only writes the data but also the secure filesystem within the
partition. As a result, retrieving any variable involves multiple RPMB
reads, typically around five.
For context, on ZynqMP, each retune request consumed approximately
8ms. Consequently, reading any RPMB variable used to take at the very
minimum 40ms.
After dropping the need to retune before switching to the RPMB partition,
this is no longer the case.
Signed-off-by: Jorge Ramirez-Ortiz <jorge(a)foundries.io>
Acked-by: Avri Altman <avri.altman(a)wdc.com>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Link: https://lore.kernel.org/r/20240103112911.2954632-1-jorge@foundries.io
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
Signed-off-by: Florian Fainelli <florian.fainelli(a)broadcom.com>
---
drivers/mmc/core/host.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index 3e94401c0eb3..23d95d2bdf05 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -68,13 +68,12 @@ void mmc_retune_enable(struct mmc_host *host)
/*
* Pause re-tuning for a small set of operations. The pause begins after the
- * next command and after first doing re-tuning.
+ * next command.
*/
void mmc_retune_pause(struct mmc_host *host)
{
if (!host->retune_paused) {
host->retune_paused = 1;
- mmc_retune_needed(host);
mmc_retune_hold(host);
}
}
--
2.34.1
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
When the inode is being dropped from the dentry, the TRACEFS_EVENT_INODE
flag needs to be cleared to prevent a remount from calling
eventfs_remount() on the tracefs_inode private data. There's a race
between when the inode is dropped (and the dentry freed) and when the inode is
actually freed. If a remount happens between the two, the eventfs_inode
could be accessed after it is freed (only the dentry keeps a ref count on
it).
Currently the TRACEFS_EVENT_INODE flag is cleared from the dentry iput()
function. But this is incorrect, as it is possible that the inode has
another reference to it. The flag should only be cleared when the inode is
really being dropped and has no more references. That happens in the
drop_inode callback of the inode, as that gets called when the last
reference of the inode is released.
Remove the tracefs_d_iput() function and move its logic to the more
appropriate tracefs_drop_inode() callback function.
Link: https://lore.kernel.org/linux-trace-kernel/20240523051539.908205106@goodmis…
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Masahiro Yamada <masahiroy(a)kernel.org>
Fixes: baa23a8d4360d ("tracefs: Reset permissions on remount if permissions are options")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
fs/tracefs/inode.c | 33 +++++++++++++++++----------------
1 file changed, 17 insertions(+), 16 deletions(-)
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 9252e0d78ea2..7c29f4afc23d 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -426,10 +426,26 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
return 0;
}
+static int tracefs_drop_inode(struct inode *inode)
+{
+ struct tracefs_inode *ti = get_tracefs(inode);
+
+ /*
+ * This inode is being freed and cannot be used for
+ * eventfs. Clear the flag so that it doesn't call into
+ * eventfs during the remount flag updates. The eventfs_inode
+ * gets freed after an RCU cycle, so the content will still
+ * be safe if the iteration is going on now.
+ */
+ ti->flags &= ~TRACEFS_EVENT_INODE;
+
+ return 1;
+}
+
static const struct super_operations tracefs_super_operations = {
.alloc_inode = tracefs_alloc_inode,
.free_inode = tracefs_free_inode,
- .drop_inode = generic_delete_inode,
+ .drop_inode = tracefs_drop_inode,
.statfs = simple_statfs,
.show_options = tracefs_show_options,
};
@@ -455,22 +471,7 @@ static int tracefs_d_revalidate(struct dentry *dentry, unsigned int flags)
return !(ei && ei->is_freed);
}
-static void tracefs_d_iput(struct dentry *dentry, struct inode *inode)
-{
- struct tracefs_inode *ti = get_tracefs(inode);
-
- /*
- * This inode is being freed and cannot be used for
- * eventfs. Clear the flag so that it doesn't call into
- * eventfs during the remount flag updates. The eventfs_inode
- * gets freed after an RCU cycle, so the content will still
- * be safe if the iteration is going on now.
- */
- ti->flags &= ~TRACEFS_EVENT_INODE;
-}
-
static const struct dentry_operations tracefs_dentry_operations = {
- .d_iput = tracefs_d_iput,
.d_revalidate = tracefs_d_revalidate,
.d_release = tracefs_d_release,
};
--
2.43.0
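The reasoning in the patch above (clear the flag only when the last reference is released, not whenever a dentry lets go of the inode) can be illustrated with a small reference-count model. This is a user-space sketch with hypothetical names (struct inode_model, model_iput(), EVENT_INODE_FLAG standing in for TRACEFS_EVENT_INODE); it does not reproduce the VFS code.

```c
#include <stdbool.h>

#define EVENT_INODE_FLAG 0x1u /* stands in for TRACEFS_EVENT_INODE */

/* Hypothetical, simplified inode: just a reference count and a flag word. */
struct inode_model {
    unsigned int refcount;
    unsigned int flags;
};

/* Plays the role of a drop_inode callback: runs only for the last reference. */
void model_drop_inode(struct inode_model *inode)
{
    inode->flags &= ~EVENT_INODE_FLAG;
}

/* Release one reference; returns true if this was the last one. */
bool model_iput(struct inode_model *inode)
{
    if (inode->refcount == 0)
        return false; /* nothing to release */

    if (--inode->refcount > 0)
        return false; /* other holders remain, so the flag must stay set */

    model_drop_inode(inode);
    return true;
}
```

With two holders, the first release leaves the flag set and only the second one clears it, which is the ordering the commit message argues for.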
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
The change to update the permissions of the eventfs_inode had the
misconception that using the tracefs_inode would find all the
eventfs_inodes that have been updated and reset them on remount.
The problem with this approach is that the eventfs_inodes are freed when
they are no longer used (basically the reason the eventfs system exists).
When they are freed, the updated eventfs_inodes are not reset on a remount
because their tracefs_inodes have been freed.
Instead, since the events directory eventfs_inode always has a
tracefs_inode pointing to it (it is not freed when finished), and the
events directory has a link to all its children, have the
eventfs_remount() function only operate on the events eventfs_inode and
have it descend into its children updating their uid and gids.
Link: https://lore.kernel.org/all/CAK7LNARXgaWw3kH9JgrnH4vK6fr8LDkNKf3wq8NhMWJrVw…
Link: https://lore.kernel.org/linux-trace-kernel/20240523051539.754424703@goodmis…
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Fixes: baa23a8d4360d ("tracefs: Reset permissions on remount if permissions are options")
Reported-by: Masahiro Yamada <masahiroy(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
fs/tracefs/event_inode.c | 44 ++++++++++++++++++++++++++++------------
1 file changed, 31 insertions(+), 13 deletions(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 5dfb1ccd56ea..129d0f54ba62 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -305,27 +305,27 @@ static const struct file_operations eventfs_file_operations = {
.llseek = generic_file_llseek,
};
-/*
- * On a remount of tracefs, if UID or GID options are set, then
- * the mount point inode permissions should be used.
- * Reset the saved permission flags appropriately.
- */
-void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid)
+static void eventfs_set_attrs(struct eventfs_inode *ei, bool update_uid, kuid_t uid,
+ bool update_gid, kgid_t gid, int level)
{
- struct eventfs_inode *ei = ti->private;
+ struct eventfs_inode *ei_child;
- if (!ei)
+ /* Update events/<system>/<event> */
+ if (WARN_ON_ONCE(level > 3))
return;
if (update_uid) {
ei->attr.mode &= ~EVENTFS_SAVE_UID;
- ei->attr.uid = ti->vfs_inode.i_uid;
+ ei->attr.uid = uid;
}
-
if (update_gid) {
ei->attr.mode &= ~EVENTFS_SAVE_GID;
- ei->attr.gid = ti->vfs_inode.i_gid;
+ ei->attr.gid = gid;
+ }
+
+ list_for_each_entry(ei_child, &ei->children, list) {
+ eventfs_set_attrs(ei_child, update_uid, uid, update_gid, gid, level + 1);
}
if (!ei->entry_attrs)
@@ -334,13 +334,31 @@ void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid)
for (int i = 0; i < ei->nr_entries; i++) {
if (update_uid) {
ei->entry_attrs[i].mode &= ~EVENTFS_SAVE_UID;
- ei->entry_attrs[i].uid = ti->vfs_inode.i_uid;
+ ei->entry_attrs[i].uid = uid;
}
if (update_gid) {
ei->entry_attrs[i].mode &= ~EVENTFS_SAVE_GID;
- ei->entry_attrs[i].gid = ti->vfs_inode.i_gid;
+ ei->entry_attrs[i].gid = gid;
}
}
+
+}
+
+/*
+ * On a remount of tracefs, if UID or GID options are set, then
+ * the mount point inode permissions should be used.
+ * Reset the saved permission flags appropriately.
+ */
+void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid)
+{
+ struct eventfs_inode *ei = ti->private;
+
+ /* Only the events directory does the updates */
+ if (!ei || !ei->is_events || ei->is_freed)
+ return;
+
+ eventfs_set_attrs(ei, update_uid, ti->vfs_inode.i_uid,
+ update_gid, ti->vfs_inode.i_gid, 0);
}
/* Return the evenfs_inode of the "events" directory */
--
2.43.0
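The descent described in the patch above (start at the events directory, update it, then recurse into its children with a depth guard) can be sketched in user space as follows. The types and names here (struct ei_model, model_set_attrs(), MAX_DEPTH) are hypothetical simplifications: children are kept in a plain singly linked list instead of the kernel's list_head, and the guard returns silently instead of warning.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 3 /* mirrors the patch's "level > 3" guard */

/* Hypothetical, simplified stand-in for an eventfs_inode. */
struct ei_model {
    unsigned int uid;
    unsigned int gid;
    struct ei_model *first_child;
    struct ei_model *next_sibling;
};

/* Update this node, then descend into every child one level deeper. */
void model_set_attrs(struct ei_model *ei, bool update_uid, unsigned int uid,
                     bool update_gid, unsigned int gid, int level)
{
    if (ei == NULL || level > MAX_DEPTH)
        return;

    if (update_uid)
        ei->uid = uid;
    if (update_gid)
        ei->gid = gid;

    for (struct ei_model *c = ei->first_child; c != NULL; c = c->next_sibling)
        model_set_attrs(c, update_uid, uid, update_gid, gid, level + 1);
}
```

Because the walk starts from the one node that is never freed, it reaches every child that currently exists, regardless of whether a tracefs_inode still points at it.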