August 2020 - Linux-stable-mirror

[PATCH v3] lib/string.c: implement stpcpy

by Nick Desaulniers

LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. `stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This optimization was introduced into clang-12. Implement this so that we don't observe linkage failures due to missing symbol definitions for `stpcpy`. Similar to last year's fire drill with: commit 5f074f3e192f ("lib/string.c: implement a basic bcmp") The kernel is somewhere between a "freestanding" environment (no full libc) and "hosted" environment (many symbols from libc exist with the same type, function signature, and semantics). As H. Peter Anvin notes, there's not really a great way to inform the compiler that you're targeting a freestanding environment but would like to opt-in to some libcall optimizations (see pr/47280 below), rather than opt-out. Arvind notes, -fno-builtin-* behaves slightly differently between GCC and Clang, and Clang is missing many __builtin_* definitions, which I consider a bug in Clang and am working on fixing. Masahiro summarizes the subtle distinction between compilers justly: To prevent transformation from foo() into bar(), there are two ways in Clang to do that; -fno-builtin-foo, and -fno-builtin-bar. There is only one in GCC; -fno-buitin-foo. (Any difference in that behavior in Clang is likely a bug from a missing __builtin_* definition.) Masahiro also notes: We want to disable optimization from foo() to bar(), but we may still benefit from the optimization from foo() into something else. If GCC implements the same transform, we would run into a problem because it is not -fno-builtin-bar, but -fno-builtin-foo that disables that optimization. In this regard, -fno-builtin-foo would be more future-proof than -fno-built-bar, but -fno-builtin-foo is still potentially overkill. We may want to prevent calls from foo() being optimized into calls to bar(), but we still may want other optimization on calls to foo(). It seems that compilers today don't quite provide the fine grain control over which libcall optimizations pseudo-freestanding environments would prefer. Finally, Kees notes that this interface is unsafe, so we should not encourage its use. As such, I've removed the declaration from any header, but it still needs to be exported to avoid linkage errors in modules. Cc: stable(a)vger.kernel.org Link: https://bugs.llvm.org/show_bug.cgi?id=47162 Link: https://bugs.llvm.org/show_bug.cgi?id=47280 Link: https://github.com/ClangBuiltLinux/linux/issues/1126 Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html Link: https://reviews.llvm.org/D85963 Suggested-by: Andy Lavr <andy.lavr(a)gmail.com> Suggested-by: Arvind Sankar <nivedita(a)alum.mit.edu> Suggested-by: Joe Perches <joe(a)perches.com> Suggested-by: Masahiro Yamada <masahiroy(a)kernel.org> Suggested-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk> Reported-by: Sami Tolvanen <samitolvanen(a)google.com> Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com> --- Changes V3: * Drop Sami's Tested by tag; newer patch. * Add EXPORT_SYMBOL as per Andy. * Rewrite commit message, rewrote part of what Masahiro said to be generic in terms of foo() and bar(). * Prefer %NUL-terminated to NULL terminated. NUL is the ASCII character '\0', as per Arvind and Rasmus. Changes V2: * Added Sami's Tested by; though the patch changed implementation, the missing symbol at link time was the problem Sami was observing. * Fix __restrict -> __restrict__ typo as per Joe. * Drop note about restrict from commit message as per Arvind. * Fix NULL -> NUL as per Arvind; NUL is ASCII '\0'. TIL * Fix off by one error as per Arvind; I had another off by one error in my test program that was masking this. lib/string.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/lib/string.c b/lib/string.c index 6012c385fb31..6bd0cf0fb009 100644 --- a/lib/string.c +++ b/lib/string.c @@ -272,6 +272,30 @@ ssize_t strscpy_pad(char *dest, const char *src, size_t count) } EXPORT_SYMBOL(strscpy_pad); +/** + * stpcpy - copy a string from src to dest returning a pointer to the new end + * of dest, including src's %NUL-terminator. May overrun dest. + * @dest: pointer to end of string being copied into. Must be large enough + * to receive copy. + * @src: pointer to the beginning of string being copied from. Must not overlap + * dest. + * + * stpcpy differs from strcpy in a key way: the return value is the new + * %NUL-terminated character. (for strcpy, the return value is a pointer to + * src. This interface is considered unsafe as it doesn't perform bounds + * checking of the inputs. As such it's not recommended for usage. Instead, + * its definition is provided in case the compiler lowers other libcalls to + * stpcpy. + */ +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src); +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src) +{ + while ((*dest++ = *src++) != '\0') + /* nothing */; + return --dest; +} +EXPORT_SYMBOL(stpcpy); + #ifndef __HAVE_ARCH_STRCAT /** * strcat - Append one %NUL-terminated string to another -- 2.28.0.297.g1956fa8f8d-goog

4 years, 10 months

8
22
0 0

[5.4.y] Found 27 commits that might missed

by SeongJae Park

From: SeongJae Park <sjpark(a)amazon.de> Hello, We found below 27 commits in the 'v5.5..linus/master (upstream)' seems fixing or mentioning commits in the 'v5.4..stable/linux-5.4.y (downstream)' but are not merged in the 'downstream' yet. Could you please review if those need to be merged in? A commit is considered as fix of another if the complete 'Fixed:' tag is in the commit message. If the tag is not found but the commit message contains the title or the hash id of the other commit, it is considered mentioning it. So, the 'mentions' might have many false positives, but it could cover the typos (I found such cases before). The commits are grouped as 'fixes cleanly applicable', 'fixes not cleanly applicable (need manual backporting to be applied)', 'mentions cleanly applicable', and 'mentions not cleanly applicable'. Also, the commits in each group are sorted by the commit dates (oldest first). Both the finding of the commits and the writeup of this report is automatically done by a little script[1]. I'm going to run the tool and post this kind of report every couple of weeks or every month. Any comment (e.g., regarding posting period, new features request, bug report, ...) is welcome. Especially, if you find some commits that don't need to be merged in the downstream, please let me know so that I can mark those as unnecessary and don't bother you again. [1] https://github.com/sjp38/stream-track Thanks, SeongJae # v5.5: 4e3112a240ba9986cc3f67a6880da6529a955006 # linus/master: 15bc20c6af4ceee97a1f90b43c0e386643c071b4 # v5.4: 6e815efe19a99a33b16cc720c3d3a727565a4fa1 # stable/linux-5.4.y: 6576d69aac94cd8409636dfa86e0df39facdf0d2 Fixes cleanly applicable ------------------------ 2fb75ceaf71a ("remoteproc: Add missing '\n' in log messages") # commit date: 2020-04-22, author: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr> # fixes 'remoteproc: Fix NULL pointer dereference in rproc_virtio_notify' 1b9ae0c92925 ("wireless: Use linux/stddef.h instead of stddef.h") # commit date: 2020-05-27, author: Hauke Mehrtens <hauke(a)hauke-m.de> # fixes 'wireless: Use offsetof instead of custom macro.' e4b0e41fee94 ("ALSA: usb-audio: Use the new macro for HP Dock rename quirks") # commit date: 2020-06-08, author: Takashi Iwai <tiwai(a)suse.de> # fixes 'ALSA: usb-audio: Add vendor, product and profile name for HP Thunderbolt Dock' efb94790852a ("drm/panel-simple: fix connector type for LogicPD Type28 Display") # commit date: 2020-06-21, author: Adam Ford <aford173(a)gmail.com> # fixes 'drm/panel: simple: Add Logic PD Type 28 display support' 2f57b8d57673 ("dmaengine: imx-sdma: Fix: Remove 'always true' comparison") # commit date: 2020-06-24, author: Fabio Estevam <festevam(a)gmail.com> # fixes 'dmaengine: imx-sdma: Fix the event id check to include RX event for UART6' 10de795a5add ("kprobes: Fix compiler warning for !CONFIG_KPROBES_ON_FTRACE") # commit date: 2020-08-06, author: Muchun Song <songmuchun(a)bytedance.com> # fixes 'kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler' Fixes not cleanly applicable ---------------------------- 3907ccfaec5d ("crypto: atmel-aes - Fix CTR counter overflow when multiple fragments") # commit date: 2019-12-20, author: Tudor Ambarus <tudor.ambarus(a)microchip.com> # fixes 'crypto: atmel-aes - Fix counter overflow in CTR mode' 9210c075cef2 ("nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll()") # commit date: 2020-05-27, author: Dongli Zhang <dongli.zhang(a)oracle.com> # fixes 'nvme/pci: move cqe check after device shutdown' 6e2f83884c09 ("bnxt_en: Fix AER reset logic on 57500 chips.") # commit date: 2020-06-15, author: Michael Chan <michael.chan(a)broadcom.com> # fixes 'bnxt_en: Improve AER slot reset.' 695cf5ab401c ("ALSA: usb-audio: Fix packet size calculation") # commit date: 2020-06-30, author: Alexander Tsoy <alexander(a)tsoy.me> # fixes 'ALSA: usb-audio: Improve frames size computation' 2fb2799a2abb ("net: rmnet: do not allow to add multiple bridge interfaces") # commit date: 2020-07-04, author: Taehee Yoo <ap420073(a)gmail.com> # fixes 'net: rmnet: use upper/lower device infrastructure' Mentions cleanly applicable --------------------------- 32ada3b9e04c ("x86/resctrl: Clean up unused function parameter in mkdir path") # commit date: 2020-01-20, author: Xiaochen Shen <xiaochen.shen(a)intel.com> # mentions 'x86/resctrl: Fix a deadlock due to inaccurate reference' 20f513091caf ("crypto: ccree - remove set but not used variable 'du_size'") # commit date: 2020-02-13, author: YueHaibing <yuehaibing(a)huawei.com> # mentions 'crypto: ccree - fix FDE descriptor sequence' b182a66792fe ("net: ena: remove set but not used variable 'hash_key'") # commit date: 2020-02-17, author: YueHaibing <yuehaibing(a)huawei.com> # mentions 'net: ena: rss: do not allocate key when not supported' cbb5494ebce5 ("Revert "thunderbolt: Prevent crash if non-active NVMem file is read"") # commit date: 2020-04-16, author: Nicholas Johnson <nicholas.johnson-opensource(a)outlook.com.au> # mentions 'thunderbolt: Prevent crash if non-active NVMem file is read' 1a33e10e4a95 ("net: partially revert dynamic lockdep key changes") # commit date: 2020-05-04, author: Cong Wang <xiyou.wangcong(a)gmail.com> # mentions 'bonding: add missing netdev_update_lockdep_key()' 7e579f3a074c ("tools arch x86 uapi: Synch asm/unistd.h with the kernel sources") # commit date: 2020-06-09, author: Arnaldo Carvalho de Melo <acme(a)redhat.com> # mentions 'x86/syscalls: Revert "x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long"' c3dbe541ef77 ("blktrace: Avoid sparse warnings when assigning q->blk_trace") # commit date: 2020-06-17, author: Jan Kara <jack(a)suse.cz> # mentions 'blktrace: Protect q->blk_trace with RCU' 6d548b9e5d56 ("btrfs: fix reclaim_size counter leak after stealing from global reserve") # commit date: 2020-07-02, author: Filipe Manana <fdmanana(a)suse.com> # mentions 'btrfs: improve global reserve stealing logic' Mentions not cleanly applicable ------------------------------- 6dbd54e4154d ("Revert "tty/serial: atmel: fix out of range clock divider handling"") # commit date: 2019-12-18, author: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> # mentions 'tty/serial: atmel: fix out of range clock divider handling' 48d7fb181a91 ("drm/i915: Remove lite restore defines") # commit date: 2020-02-08, author: Mika Kuoppala <mika.kuoppala(a)linux.intel.com> # mentions 'drm/i915/gt: Detect if we miss WaIdleLiteRestore' ff7e06a55676 ("ALSA: pcm: oss: Fix regression by buffer overflow fix (again)") # commit date: 2020-04-03, author: Takashi Iwai <tiwai(a)suse.de> # mentions 'ALSA: pcm: oss: Avoid plugin buffer overflow' ac957e8c5411 ("ALSA: pcm: oss: Place the plugin buffer overflow checks correctly (for 5.7)") # commit date: 2020-04-24, author: Takashi Iwai <tiwai(a)suse.de> # mentions 'ALSA: pcm: oss: Avoid plugin buffer overflow' e7511f560f54 ("bonding: remove useless stats_lock_key") # commit date: 2020-05-04, author: Cong Wang <xiyou.wangcong(a)gmail.com> # mentions 'bonding: fix lockdep warning in bond_get_stats()' 39f23ce07b93 ("sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list") # commit date: 2020-05-19, author: Vincent Guittot <vincent.guittot(a)linaro.org> # mentions 'sched/fair: Fix enqueue_task_fair warning' 8ab3a3812aa9 ("drm/i915/gt: Incrementally check for rewinding") # commit date: 2020-06-16, author: Chris Wilson <chris(a)chris-wilson.co.uk> # mentions 'drm/i915/execlists: Always force a context reload when rewinding RING_TAIL' 5e548b32018d ("btrfs: do not set the full sync flag on the inode during page release") # commit date: 2020-07-27, author: Filipe Manana <fdmanana(a)suse.com> # mentions 'btrfs: fix race between page release and a fast fsync'

4 years, 10 months

8
10
0 0

[PATCH AUTOSEL 5.8 01/42] hwmon: (pmbus/isl68137) remove READ_TEMPERATURE_1 telemetry for RAA228228

by Sasha Levin

From: Grant Peltier <grantpeltier93(a)gmail.com> [ Upstream commit 51fb91ed5a6fa855a74731610cd5435d83d6e17f ] Per the RAA228228 datasheet, READ_TEMPERATURE_1 is not a supported PMBus command. Signed-off-by: Grant Peltier <grantpeltier93(a)gmail.com> Signed-off-by: Guenter Roeck <linux(a)roeck-us.net> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- drivers/hwmon/pmbus/isl68137.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/hwmon/pmbus/isl68137.c b/drivers/hwmon/pmbus/isl68137.c index 0c622711ef7e0..58aa95a3c010c 100644 --- a/drivers/hwmon/pmbus/isl68137.c +++ b/drivers/hwmon/pmbus/isl68137.c @@ -67,6 +67,7 @@ enum variants { raa_dmpvr1_2rail, raa_dmpvr2_1rail, raa_dmpvr2_2rail, + raa_dmpvr2_2rail_nontc, raa_dmpvr2_3rail, raa_dmpvr2_hv, }; @@ -241,6 +242,10 @@ static int isl68137_probe(struct i2c_client *client, info->pages = 1; info->read_word_data = raa_dmpvr2_read_word_data; break; + case raa_dmpvr2_2rail_nontc: + info->func[0] &= ~PMBUS_HAVE_TEMP; + info->func[1] &= ~PMBUS_HAVE_TEMP; + fallthrough; case raa_dmpvr2_2rail: info->pages = 2; info->read_word_data = raa_dmpvr2_read_word_data; @@ -304,7 +309,7 @@ static const struct i2c_device_id raa_dmpvr_id[] = { {"raa228000", raa_dmpvr2_hv}, {"raa228004", raa_dmpvr2_hv}, {"raa228006", raa_dmpvr2_hv}, - {"raa228228", raa_dmpvr2_2rail}, + {"raa228228", raa_dmpvr2_2rail_nontc}, {"raa229001", raa_dmpvr2_2rail}, {"raa229004", raa_dmpvr2_2rail}, {} -- 2.25.1

4 years, 10 months

4
47
0 0

[PATCH] x86/debug: Allow a single level of #DB recursion

by Andy Lutomirski

Trying to clear DR7 around a #DB from usermode malfunctions if we schedule when delivering SIGTRAP. Rather than trying to define a special no-recursion region, just allow a single level of recursion. We do the same thing for NMI, and it hasn't caused any problems yet. Reported-by: Kyle Huey <me(a)kylehuey.com> Debugged-by: Josh Poimboeuf <jpoimboe(a)redhat.com> Fixes: 9f58fdde95c9 ("x86/db: Split out dr6/7 handling") Cc: stable(a)vger.kernel.org Signed-off-by: Andy Lutomirski <luto(a)kernel.org> --- arch/x86/kernel/traps.c | 69 +++++++++++++++++++++++------------------ 1 file changed, 38 insertions(+), 31 deletions(-) diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 1f66d2d1e998..bf852b72f15c 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -729,20 +729,9 @@ static bool is_sysenter_singlestep(struct pt_regs *regs) #endif } -static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7) +static __always_inline unsigned long debug_read_clear_dr6(void) { - /* - * Disable breakpoints during exception handling; recursive exceptions - * are exceedingly 'fun'. - * - * Since this function is NOKPROBE, and that also applies to - * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a - * HW_BREAKPOINT_W on our stack) - * - * Entry text is excluded for HW_BP_X and cpu_entry_area, which - * includes the entry stack is excluded for everything. - */ - *dr7 = local_db_save(); + unsigned long dr6; /* * The Intel SDM says: @@ -755,15 +744,21 @@ static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7) * * Keep it simple: clear DR6 immediately. */ - get_debugreg(*dr6, 6); + get_debugreg(dr6, 6); set_debugreg(0, 6); /* Filter out all the reserved bits which are preset to 1 */ - *dr6 &= ~DR6_RESERVED; + dr6 &= ~DR6_RESERVED; + + return dr6; +} + +static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7) +{ + *dr6 = debug_read_clear_dr6(); } static __always_inline void debug_exit(unsigned long dr7) { - local_db_restore(dr7); } /* @@ -863,6 +858,19 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6, bool user) static __always_inline void exc_debug_kernel(struct pt_regs *regs, unsigned long dr6) { + /* + * Disable breakpoints during exception handling; recursive exceptions + * are exceedingly 'fun'. + * + * Since this function is NOKPROBE, and that also applies to + * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a + * HW_BREAKPOINT_W on our stack) + * + * Entry text is excluded for HW_BP_X and cpu_entry_area, which + * includes the entry stack is excluded for everything. + */ + unsigned long dr7 = local_db_save(); + bool irq_state = idtentry_enter_nmi(regs); instrumentation_begin(); @@ -883,6 +891,8 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs, instrumentation_end(); idtentry_exit_nmi(regs, irq_state); + + local_db_restore(dr7); } static __always_inline void exc_debug_user(struct pt_regs *regs, @@ -894,6 +904,15 @@ static __always_inline void exc_debug_user(struct pt_regs *regs, */ WARN_ON_ONCE(!user_mode(regs)); + /* + * NB: We can't easily clear DR7 here because + * idtentry_exit_to_usermode() can invoke ptrace, schedule, access + * user memory, etc. This means that a recursive #DB is possible. If + * this happens, that #DB will hit exc_debug_kernel() and clear DR7. + * Since we're not on the IST stack right now, everything will be + * fine. + */ + irqentry_enter_from_user_mode(regs); instrumentation_begin(); @@ -907,36 +926,24 @@ static __always_inline void exc_debug_user(struct pt_regs *regs, /* IST stack entry */ DEFINE_IDTENTRY_DEBUG(exc_debug) { - unsigned long dr6, dr7; - - debug_enter(&dr6, &dr7); - exc_debug_kernel(regs, dr6); - debug_exit(dr7); + exc_debug_kernel(regs, debug_read_clear_dr6()); } /* User entry, runs on regular task stack */ DEFINE_IDTENTRY_DEBUG_USER(exc_debug) { - unsigned long dr6, dr7; - - debug_enter(&dr6, &dr7); - exc_debug_user(regs, dr6); - debug_exit(dr7); + exc_debug_user(regs, debug_read_clear_dr6()); } #else /* 32 bit does not have separate entry points. */ DEFINE_IDTENTRY_RAW(exc_debug) { - unsigned long dr6, dr7; - - debug_enter(&dr6, &dr7); + unsigned long dr6 = debug_read_clear_dr6(); if (user_mode(regs)) exc_debug_user(regs, dr6); else exc_debug_kernel(regs, dr6); - - debug_exit(dr7); } #endif -- 2.25.4

4 years, 10 months

3
3
0 0

[PATCH for-4.14] tpm: Unify the mismatching TPM space buffer sizes

by Stefan Berger

From: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com> The size of the buffers for storing context's and sessions can vary from arch to arch as PAGE_SIZE can be anything between 4 kB and 256 kB (the maximum for PPC64). Define a fixed buffer size set to 16 kB. This should be enough for most use with three handles (that is how many we allow at the moment). Parametrize the buffer size while doing this, so that it is easier to revisit this later on if required. Cc: stable(a)vger.kernel.org Reported-by: Stefan Berger <stefanb(a)linux.ibm.com> Fixes: 745b361e989a ("tpm: infrastructure for TPM spaces") Reviewed-by: Jerry Snitselaar <jsnitsel(a)redhat.com> Tested-by: Stefan Berger <stefanb(a)linux.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com> --- drivers/char/tpm/tpm-chip.c | 9 ++------- drivers/char/tpm/tpm.h | 6 +++++- drivers/char/tpm/tpm2-space.c | 26 ++++++++++++++++---------- drivers/char/tpm/tpmrm-dev.c | 2 +- 4 files changed, 24 insertions(+), 19 deletions(-) diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c index dcf5bb1534955..11ec5c2715a9e 100644 --- a/drivers/char/tpm/tpm-chip.c +++ b/drivers/char/tpm/tpm-chip.c @@ -247,13 +247,8 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev, chip->cdev.owner = THIS_MODULE; chip->cdevs.owner = THIS_MODULE; - chip->work_space.context_buf = kzalloc(PAGE_SIZE, GFP_KERNEL); - if (!chip->work_space.context_buf) { - rc = -ENOMEM; - goto out; - } - chip->work_space.session_buf = kzalloc(PAGE_SIZE, GFP_KERNEL); - if (!chip->work_space.session_buf) { + rc = tpm2_init_space(&chip->work_space, TPM2_SPACE_BUFFER_SIZE); + if (rc) { rc = -ENOMEM; goto out; } diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h index d53d12f3df6d6..019fe80fedd83 100644 --- a/drivers/char/tpm/tpm.h +++ b/drivers/char/tpm/tpm.h @@ -174,6 +174,7 @@ struct tpm_space { u8 *context_buf; u32 session_tbl[3]; u8 *session_buf; + u32 buf_size; }; enum tpm_chip_flags { @@ -261,6 +262,9 @@ struct tpm_output_header { #define TPM_TAG_RQU_COMMAND 193 +/* TPM2 specific constants. */ +#define TPM2_SPACE_BUFFER_SIZE 16384 /* 16 kB */ + struct stclear_flags_t { __be16 tag; u8 deactivated; @@ -583,7 +587,7 @@ void tpm2_shutdown(struct tpm_chip *chip, u16 shutdown_type); unsigned long tpm2_calc_ordinal_duration(struct tpm_chip *chip, u32 ordinal); int tpm2_probe(struct tpm_chip *chip); int tpm2_find_cc(struct tpm_chip *chip, u32 cc); -int tpm2_init_space(struct tpm_space *space); +int tpm2_init_space(struct tpm_space *space, unsigned int buf_size); void tpm2_del_space(struct tpm_chip *chip, struct tpm_space *space); int tpm2_prepare_space(struct tpm_chip *chip, struct tpm_space *space, u32 cc, u8 *cmd); diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2-space.c index dabb2ae4e779a..115f0fb32179f 100644 --- a/drivers/char/tpm/tpm2-space.c +++ b/drivers/char/tpm/tpm2-space.c @@ -44,18 +44,21 @@ static void tpm2_flush_sessions(struct tpm_chip *chip, struct tpm_space *space) } } -int tpm2_init_space(struct tpm_space *space) +int tpm2_init_space(struct tpm_space *space, unsigned int buf_size) { - space->context_buf = kzalloc(PAGE_SIZE, GFP_KERNEL); + space->context_buf = kzalloc(buf_size, GFP_KERNEL); if (!space->context_buf) return -ENOMEM; - space->session_buf = kzalloc(PAGE_SIZE, GFP_KERNEL); + space->session_buf = kzalloc(buf_size, GFP_KERNEL); if (space->session_buf == NULL) { kfree(space->context_buf); + /* Prevent caller getting a dangling pointer. */ + space->context_buf = NULL; return -ENOMEM; } + space->buf_size = buf_size; return 0; } @@ -278,8 +281,10 @@ int tpm2_prepare_space(struct tpm_chip *chip, struct tpm_space *space, u32 cc, sizeof(space->context_tbl)); memcpy(&chip->work_space.session_tbl, &space->session_tbl, sizeof(space->session_tbl)); - memcpy(chip->work_space.context_buf, space->context_buf, PAGE_SIZE); - memcpy(chip->work_space.session_buf, space->session_buf, PAGE_SIZE); + memcpy(chip->work_space.context_buf, space->context_buf, + space->buf_size); + memcpy(chip->work_space.session_buf, space->session_buf, + space->buf_size); rc = tpm2_load_space(chip); if (rc) { @@ -459,7 +464,7 @@ static int tpm2_save_space(struct tpm_chip *chip) continue; rc = tpm2_save_context(chip, space->context_tbl[i], - space->context_buf, PAGE_SIZE, + space->context_buf, space->buf_size, &offset); if (rc == -ENOENT) { space->context_tbl[i] = 0; @@ -478,9 +483,8 @@ static int tpm2_save_space(struct tpm_chip *chip) continue; rc = tpm2_save_context(chip, space->session_tbl[i], - space->session_buf, PAGE_SIZE, + space->session_buf, space->buf_size, &offset); - if (rc == -ENOENT) { /* handle error saving session, just forget it */ space->session_tbl[i] = 0; @@ -526,8 +530,10 @@ int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space, sizeof(space->context_tbl)); memcpy(&space->session_tbl, &chip->work_space.session_tbl, sizeof(space->session_tbl)); - memcpy(space->context_buf, chip->work_space.context_buf, PAGE_SIZE); - memcpy(space->session_buf, chip->work_space.session_buf, PAGE_SIZE); + memcpy(space->context_buf, chip->work_space.context_buf, + space->buf_size); + memcpy(space->session_buf, chip->work_space.session_buf, + space->buf_size); return 0; } diff --git a/drivers/char/tpm/tpmrm-dev.c b/drivers/char/tpm/tpmrm-dev.c index 1a0e97a5da5a4..162fb16243d03 100644 --- a/drivers/char/tpm/tpmrm-dev.c +++ b/drivers/char/tpm/tpmrm-dev.c @@ -22,7 +22,7 @@ static int tpmrm_open(struct inode *inode, struct file *file) if (priv == NULL) return -ENOMEM; - rc = tpm2_init_space(&priv->space); + rc = tpm2_init_space(&priv->space, TPM2_SPACE_BUFFER_SIZE); if (rc) { kfree(priv); return -ENOMEM; -- 2.26.2

4 years, 10 months

3
2
0 0

Re: [PATCH] scsi: target: tcmu: fix size in calls to tcmu_flush_dcache_range

by Bodo Stroesser

Hi, I'm adding stable(a)vger.kernel.org Once again, this time really adding stable. On 2020-06-03 04:31, Martin K. Petersen wrote: > On Thu, 28 May 2020 21:31:08 +0200, Bodo Stroesser wrote: > >> 1) If remaining ring space before the end of the ring is >> smaller then the next cmd to write, tcmu writes a padding >> entry which fills the remaining space at the end of the >> ring. >> Then tcmu calls tcmu_flush_dcache_range() with the size >> of struct tcmu_cmd_entry as data length to flush. >> If the space filled by the padding was smaller then >> tcmu_cmd_entry, tcmu_flush_dcache_range() is called for >> an address range reaching behind the end of the vmalloc'ed >> ring. >> tcmu_flush_dcache_range() in a loop calls >> flush_dcache_page(virt_to_page(start)); >> for every page being part of the range. On x86 the line is >> optimized out by the compiler, as flush_dcache_page() is >> empty on x86. >> But I assume the above can cause trouble on other >> architectures that really have a flush_dcache_page(). >> For paddings only the header part of an entry is relevant >> Due to alignment rules the header always fits in the >> remaining space, if padding is needed. >> So tcmu_flush_dcache_range() can safely be called with >> sizeof(entry->hdr) as the length here. >> >> [...] > > Applied to 5.8/scsi-queue, thanks! > > [1/1] scsi: target: tcmu: Fix size in calls to tcmu_flush_dcache_range > https://git.kernel.org/mkp/scsi/c/8c4e0f212398 > The full commit of this patch is: 8c4e0f212398cdd1eb4310a5981d06a723cdd24f This patch is the first of four patches that are necessary to run tcmu on ARM without crash. For details please see https://bugzilla.kernel.org/show_bug.cgi?id=208045 Upsteam commits of patches 2,3, and 4 are: 2: 3c58f737231e "scsi: target: tcmu: Optimize use of flush_dcache_page" 3: 3145550a7f8b "scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM" 4: 5a0c256d96f0 "scsi: target: tcmu: Fix crash on ARM during cmd completion" Since patches 3 and 4 already were accepted for 5.8, 5.4, and 4.19, and I sent a request to add patch 2 about 1 hour ago, please consider adding this patch to 5.4 and 4.19, because without it tcmu on ARM will still crash. Thank you, Bodo

4 years, 10 months

2
3
0 0

[PATCH nvme] nvme: Revert "nvme: Discard workaround for non-conformant devices"

by Dakshaja Uppalapati

This reverts upstream 'commit 530436c45ef2 ("nvme: Discard workaround for non-conformant devices")' Since commit `530436c45ef2` introduced a regression due to which blk_update_request IO error is observed on formatting device, reverting it. Fixes: 530436c45ef2 ("nvme: Discard workaround for non-conformant devices") Cc: stable <stable(a)vger.kernel.org> # 4.19+ Signed-off-by: Dakshaja Uppalapati <dakshaja(a)chelsio.com> --- drivers/nvme/host/core.c | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index f3c037f5a9ba..ec598a86f88e 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -607,14 +607,8 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req, struct nvme_dsm_range *range; struct bio *bio; - /* - * Some devices do not consider the DSM 'Number of Ranges' field when - * determining how much data to DMA. Always allocate memory for maximum - * number of segments to prevent device reading beyond end of buffer. - */ - static const size_t alloc_size = sizeof(*range) * NVME_DSM_MAX_RANGES; - - range = kzalloc(alloc_size, GFP_ATOMIC | __GFP_NOWARN); + range = kmalloc_array(segments, sizeof(*range), + GFP_ATOMIC | __GFP_NOWARN); if (!range) { /* * If we fail allocation our range, fallback to the controller @@ -654,7 +648,7 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req, req->special_vec.bv_page = virt_to_page(range); req->special_vec.bv_offset = offset_in_page(range); - req->special_vec.bv_len = alloc_size; + req->special_vec.bv_len = sizeof(*range) * segments; req->rq_flags |= RQF_SPECIAL_PAYLOAD; return BLK_STS_OK; -- 2.18.0.232.gb7bd9486b.dirty

4 years, 10 months

5
10
0 0

[PATCH for-stable nvmet 0/6] nvme: Fix for blk_update_request IO error.

by Dakshaja Uppalapati

The below error is seen in dmesg, while formatting the disks discovered on host. dmesg: [ 636.733374] blk_update_request: I/O error, dev nvme4n1, sector 0 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0 Patch 6 fixes it and there are 5 other dependent patches that also need to be pulled from upstream to stable, 5.4 and 4.19 branches. Patch 1 dependent patch Patch 2 dependent patch Patch 3 dependent patch Patch 4 dependent patch Patch 5 dependent patch Patch 6 fix patch Thanks, Dakshaja Christoph Hellwig (5): nvmet: Cleanup discovery execute handlers nvmet: Introduce common execute function for get_log_page and identify nvmet: Introduce nvmet_dsm_len() helper nvmet: Remove the data_len field from the nvmet_req struct nvmet: Open code nvmet_req_execute() Sagi Grimberg (1): nvmet: fix dsm failure when payload does not match sgl descriptor drivers/nvme/target/admin-cmd.c | 128 +++++++++++++++++------------- drivers/nvme/target/core.c | 23 ++++-- drivers/nvme/target/discovery.c | 62 +++++++-------- drivers/nvme/target/fabrics-cmd.c | 15 +++- drivers/nvme/target/fc.c | 4 +- drivers/nvme/target/io-cmd-bdev.c | 19 +++-- drivers/nvme/target/io-cmd-file.c | 20 +++-- drivers/nvme/target/loop.c | 2 +- drivers/nvme/target/nvmet.h | 11 ++- drivers/nvme/target/rdma.c | 4 +- drivers/nvme/target/tcp.c | 6 +- 11 files changed, 176 insertions(+), 118 deletions(-) -- 2.18.0.232.gb7bd9486b.dirty

4 years, 10 months

3
11
0 0

[PATCH] iommu/intel: Handle 36b addressing for x86-32

by Chris Wilson

Beware that the address size for x86-32 may exceed unsigned long. [ 0.368971] UBSAN: shift-out-of-bounds in drivers/iommu/intel/iommu.c:128:14 [ 0.369055] shift exponent 36 is too large for 32-bit type 'long unsigned int' If we don't handle the wide addresses, the pages are mismapped and the device read/writes go astray, detected as DMAR faults and leading to device failure. The behaviour changed (from working to broken) in commit fa954e683178 ("iommu/vt-d: Delegate the dma domain to upper layer"), but the error looks older. Fixes: fa954e683178 ("iommu/vt-d: Delegate the dma domain to upper layer") Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk> Cc: James Sewart <jamessewart(a)arista.com> Cc: Lu Baolu <baolu.lu(a)linux.intel.com> Cc: Joerg Roedel <jroedel(a)suse.de> Cc: <stable(a)vger.kernel.org> # v5.3+ --- drivers/iommu/intel/iommu.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 2e9c8c3d0da4..ba78a2e854f9 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -123,29 +123,29 @@ static inline unsigned int level_to_offset_bits(int level) return (level - 1) * LEVEL_STRIDE; } -static inline int pfn_level_offset(unsigned long pfn, int level) +static inline int pfn_level_offset(u64 pfn, int level) { return (pfn >> level_to_offset_bits(level)) & LEVEL_MASK; } -static inline unsigned long level_mask(int level) +static inline u64 level_mask(int level) { - return -1UL << level_to_offset_bits(level); + return -1ULL << level_to_offset_bits(level); } -static inline unsigned long level_size(int level) +static inline u64 level_size(int level) { - return 1UL << level_to_offset_bits(level); + return 1ULL << level_to_offset_bits(level); } -static inline unsigned long align_to_level(unsigned long pfn, int level) +static inline u64 align_to_level(u64 pfn, int level) { return (pfn + level_size(level) - 1) & level_mask(level); } static inline unsigned long lvl_to_nr_pages(unsigned int lvl) { - return 1 << min_t(int, (lvl - 1) * LEVEL_STRIDE, MAX_AGAW_PFN_WIDTH); + return 1UL << min_t(int, (lvl - 1) * LEVEL_STRIDE, MAX_AGAW_PFN_WIDTH); } /* VT-d pages must always be _smaller_ than MM pages. Otherwise things -- 2.20.1

4 years, 10 months

3
3
0 0

Re: [PATCH v2 00/28] The new cgroup slab memory controller

by Pavel Tatashin

Guys, There is a convoluted deadlock that I just root caused, and that is fixed by this work (at least based on my code inspection it appears to be fixed); but the deadlock exists in older and stable kernels, and I am not sure whether to create a separate patch for it, or backport this whole thing. Thread #1: Hot-removes memory device_offline memory_subsys_offline offline_pages __offline_pages mem_hotplug_lock <- write access waits for Thread #3 refcnt for pfn 9e5113 to get to 1 so it can migrate it. Thread #2: ccs killer kthread css_killed_work_fn cgroup_mutex <- Grab this Mutex mem_cgroup_css_offline memcg_offline_kmem.part memcg_deactivate_kmem_caches get_online_mems mem_hotplug_lock <- waits for Thread#1 to get read access Thread #3: crashing userland program do_coredump elf_core_dump get_dump_page() -> get page with pfn#9e5113, and increment refcnt dump_emit __kernel_write __vfs_write new_sync_write pipe_write pipe_wait -> waits for Thread #4 systemd-coredump to read the pipe Thread #4: systemd-coredump ksys_read vfs_read __vfs_read seq_read proc_single_show proc_cgroup_show cgroup_mutex -> waits from Thread #2 for this lock. In Summary: Thread#1 waits for Thread#3 for refcnt, Thread#3 waits for Thread#4 to read pipe. Thread#4 waits for Thread#2 for cgroup_mutex lock; Thread#2 waits for Thread#1 for mem_hotplug_lock rwlock. This work appears to fix this deadlock because cgroup_mutex is not called anymore before mem_hotplug_lock (unless I am missing it), as it removes memcg_deactivate_kmem_caches. Thank you, Pasha On Wed, Jan 29, 2020 at 9:42 PM Roman Gushchin <guro(a)fb.com> wrote: > > On Thu, Jan 30, 2020 at 07:36:26AM +0530, Bharata B Rao wrote: > > On Mon, Jan 27, 2020 at 09:34:25AM -0800, Roman Gushchin wrote: > > > The existing cgroup slab memory controller is based on the idea of > > > replicating slab allocator internals for each memory cgroup. > > > This approach promises a low memory overhead (one pointer per page), > > > and isn't adding too much code on hot allocation and release paths. > > > But is has a very serious flaw: it leads to a low slab utilization. > > > > > > Using a drgn* script I've got an estimation of slab utilization on > > > a number of machines running different production workloads. In most > > > cases it was between 45% and 65%, and the best number I've seen was > > > around 85%. Turning kmem accounting off brings it to high 90s. Also > > > it brings back 30-50% of slab memory. It means that the real price > > > of the existing slab memory controller is way bigger than a pointer > > > per page. > > > > > > The real reason why the existing design leads to a low slab utilization > > > is simple: slab pages are used exclusively by one memory cgroup. > > > If there are only few allocations of certain size made by a cgroup, > > > or if some active objects (e.g. dentries) are left after the cgroup is > > > deleted, or the cgroup contains a single-threaded application which is > > > barely allocating any kernel objects, but does it every time on a new CPU: > > > in all these cases the resulting slab utilization is very low. > > > If kmem accounting is off, the kernel is able to use free space > > > on slab pages for other allocations. > > > > > > Arguably it wasn't an issue back to days when the kmem controller was > > > introduced and was an opt-in feature, which had to be turned on > > > individually for each memory cgroup. But now it's turned on by default > > > on both cgroup v1 and v2. And modern systemd-based systems tend to > > > create a large number of cgroups. > > > > > > This patchset provides a new implementation of the slab memory controller, > > > which aims to reach a much better slab utilization by sharing slab pages > > > between multiple memory cgroups. Below is the short description of the new > > > design (more details in commit messages). > > > > > > Accounting is performed per-object instead of per-page. Slab-related > > > vmstat counters are converted to bytes. Charging is performed on page-basis, > > > with rounding up and remembering leftovers. > > > > > > Memcg ownership data is stored in a per-slab-page vector: for each slab page > > > a vector of corresponding size is allocated. To keep slab memory reparenting > > > working, instead of saving a pointer to the memory cgroup directly an > > > intermediate object is used. It's simply a pointer to a memcg (which can be > > > easily changed to the parent) with a built-in reference counter. This scheme > > > allows to reparent all allocated objects without walking them over and > > > changing memcg pointer to the parent. > > > > > > Instead of creating an individual set of kmem_caches for each memory cgroup, > > > two global sets are used: the root set for non-accounted and root-cgroup > > > allocations and the second set for all other allocations. This allows to > > > simplify the lifetime management of individual kmem_caches: they are > > > destroyed with root counterparts. It allows to remove a good amount of code > > > and make things generally simpler. > > > > > > The patchset* has been tested on a number of different workloads in our > > > production. In all cases it saved significant amount of memory, measured > > > from high hundreds of MBs to single GBs per host. On average, the size > > > of slab memory has been reduced by 35-45%. > > > > Here are some numbers from multiple runs of sysbench and kernel compilation > > with this patchset on a 10 core POWER8 host: > > > > ========================================================================== > > Peak usage of memory.kmem.usage_in_bytes, memory.usage_in_bytes and > > meminfo:Slab for Sysbench oltp_read_write with mysqld running as part > > of a mem cgroup (Sampling every 5s) > > -------------------------------------------------------------------------- > > 5.5.0-rc7-mm1 +slab patch %reduction > > -------------------------------------------------------------------------- > > memory.kmem.usage_in_bytes 15859712 4456448 72 > > memory.usage_in_bytes 337510400 335806464 .5 > > Slab: (kB) 814336 607296 25 > > > > memory.kmem.usage_in_bytes 16187392 4653056 71 > > memory.usage_in_bytes 318832640 300154880 5 > > Slab: (kB) 789888 559744 29 > > -------------------------------------------------------------------------- > > > > > > Peak usage of memory.kmem.usage_in_bytes, memory.usage_in_bytes and > > meminfo:Slab for kernel compilation (make -s -j64) Compilation was > > done from bash that is in a memory cgroup. (Sampling every 5s) > > -------------------------------------------------------------------------- > > 5.5.0-rc7-mm1 +slab patch %reduction > > -------------------------------------------------------------------------- > > memory.kmem.usage_in_bytes 338493440 231931904 31 > > memory.usage_in_bytes 7368015872 6275923968 15 > > Slab: (kB) 1139072 785408 31 > > > > memory.kmem.usage_in_bytes 341835776 236453888 30 > > memory.usage_in_bytes 6540427264 6072893440 7 > > Slab: (kB) 1074304 761280 29 > > > > memory.kmem.usage_in_bytes 340525056 233570304 31 > > memory.usage_in_bytes 6406209536 6177357824 3 > > Slab: (kB) 1244288 739712 40 > > -------------------------------------------------------------------------- > > > > Slab consumption right after boot > > -------------------------------------------------------------------------- > > 5.5.0-rc7-mm1 +slab patch %reduction > > -------------------------------------------------------------------------- > > Slab: (kB) 821888 583424 29 > > ========================================================================== > > > > Summary: > > > > With sysbench and kernel compilation, memory.kmem.usage_in_bytes shows > > around 70% and 30% reduction consistently. > > > > Didn't see consistent reduction of memory.usage_in_bytes with sysbench and > > kernel compilation. > > > > Slab usage (from /proc/meminfo) shows consistent 30% reduction and the > > same is seen right after boot too. > > That's just perfect! > > memory.usage_in_bytes was most likely the same because the freed space > was taken by pagecache. > > Thank you very much for testing! > > Roman

4 years, 10 months

7
20
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror August 2020