The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail:
of_icc_xlate_onecell: invalid index 0
cpu cpu0: error -EINVAL: error finding src node
cpu cpu0: dev_pm_opp_of_find_icc_paths: Unable to get path0: -22
qcom-cpufreq-hw: probe of 18591000.cpufreq failed with error -22
Switch to using the new API where the provider is not registered until
after it has been fully initialised.
Fixes: 5bc9900addaf ("interconnect: qcom: Add OSM L3 interconnect provider support")
Cc: stable(a)vger.kernel.org # 5.7
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/interconnect/qcom/osm-l3.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/drivers/interconnect/qcom/osm-l3.c b/drivers/interconnect/qcom/osm-l3.c
index 5fa171087425..3a1cbfe3e481 100644
--- a/drivers/interconnect/qcom/osm-l3.c
+++ b/drivers/interconnect/qcom/osm-l3.c
@@ -158,8 +158,8 @@ static int qcom_osm_l3_remove(struct platform_device *pdev)
{
struct qcom_osm_l3_icc_provider *qp = platform_get_drvdata(pdev);
+ icc_provider_deregister(&qp->provider);
icc_nodes_remove(&qp->provider);
- icc_provider_del(&qp->provider);
return 0;
}
@@ -245,14 +245,9 @@ static int qcom_osm_l3_probe(struct platform_device *pdev)
provider->set = qcom_osm_l3_set;
provider->aggregate = icc_std_aggregate;
provider->xlate = of_icc_xlate_onecell;
- INIT_LIST_HEAD(&provider->nodes);
provider->data = data;
- ret = icc_provider_add(provider);
- if (ret) {
- dev_err(&pdev->dev, "error adding interconnect provider\n");
- return ret;
- }
+ icc_provider_init(provider);
for (i = 0; i < num_nodes; i++) {
size_t j;
@@ -275,12 +270,15 @@ static int qcom_osm_l3_probe(struct platform_device *pdev)
}
data->num_nodes = num_nodes;
+ ret = icc_provider_register(provider);
+ if (ret)
+ goto err;
+
platform_set_drvdata(pdev, qp);
return 0;
err:
icc_nodes_remove(provider);
- icc_provider_del(provider);
return ret;
}
--
2.39.1
The patch titled
Subject: mm: shrinkers: fix deadlock in shrinker debugfs
has been added to the -mm mm-unstable branch. Its filename is
mm-shrinkers-fix-deadlock-in-shrinker-debugfs.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Qi Zheng <zhengqi.arch(a)bytedance.com>
Subject: mm: shrinkers: fix deadlock in shrinker debugfs
Date: Thu, 2 Feb 2023 18:56:12 +0800
The debugfs_remove_recursive() is invoked by unregister_shrinker(), which
is holding the write lock of shrinker_rwsem. It will waits for the
handler of debugfs file complete. The handler also needs to hold the read
lock of shrinker_rwsem to do something. So it may cause the following
deadlock:
CPU0 CPU1
debugfs_file_get()
shrinker_debugfs_count_show()/shrinker_debugfs_scan_write()
unregister_shrinker()
--> down_write(&shrinker_rwsem);
debugfs_remove_recursive()
// wait for (A)
--> wait_for_completion();
// wait for (B)
--> down_read_killable(&shrinker_rwsem)
debugfs_file_put() -- (A)
up_write() -- (B)
The down_read_killable() can be killed, so that the above deadlock can be
recovered. But it still requires an extra kill action, otherwise it will
block all subsequent shrinker-related operations, so it's better to fix
it.
Link: https://lkml.kernel.org/r/20230202105612.64641-1-zhengqi.arch@bytedance.com
Fixes: 5035ebc644ae ("mm: shrinkers: introduce debugfs interface for memory shrinkers")
Signed-off-by: Qi Zheng <zhengqi.arch(a)bytedance.com>
Cc: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: Kent Overstreet <kent.overstreet(a)gmail.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/include/linux/shrinker.h~mm-shrinkers-fix-deadlock-in-shrinker-debugfs
+++ a/include/linux/shrinker.h
@@ -107,7 +107,7 @@ extern void synchronize_shrinkers(void);
#ifdef CONFIG_SHRINKER_DEBUG
extern int shrinker_debugfs_add(struct shrinker *shrinker);
-extern void shrinker_debugfs_remove(struct shrinker *shrinker);
+extern struct dentry *shrinker_debugfs_remove(struct shrinker *shrinker);
extern int __printf(2, 3) shrinker_debugfs_rename(struct shrinker *shrinker,
const char *fmt, ...);
#else /* CONFIG_SHRINKER_DEBUG */
@@ -115,7 +115,7 @@ static inline int shrinker_debugfs_add(s
{
return 0;
}
-static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
+static inline struct dentry *shrinker_debugfs_remove(struct shrinker *shrinker)
{
}
static inline __printf(2, 3)
--- a/mm/shrinker_debug.c~mm-shrinkers-fix-deadlock-in-shrinker-debugfs
+++ a/mm/shrinker_debug.c
@@ -246,18 +246,21 @@ int shrinker_debugfs_rename(struct shrin
}
EXPORT_SYMBOL(shrinker_debugfs_rename);
-void shrinker_debugfs_remove(struct shrinker *shrinker)
+struct dentry *shrinker_debugfs_remove(struct shrinker *shrinker)
{
+ struct dentry *entry = shrinker->debugfs_entry;
+
lockdep_assert_held(&shrinker_rwsem);
kfree_const(shrinker->name);
shrinker->name = NULL;
- if (!shrinker->debugfs_entry)
- return;
+ if (entry) {
+ ida_free(&shrinker_debugfs_ida, shrinker->debugfs_id);
+ shrinker->debugfs_entry = NULL;
+ }
- debugfs_remove_recursive(shrinker->debugfs_entry);
- ida_free(&shrinker_debugfs_ida, shrinker->debugfs_id);
+ return entry;
}
static int __init shrinker_debugfs_init(void)
--- a/mm/vmscan.c~mm-shrinkers-fix-deadlock-in-shrinker-debugfs
+++ a/mm/vmscan.c
@@ -747,6 +747,8 @@ EXPORT_SYMBOL(register_shrinker);
*/
void unregister_shrinker(struct shrinker *shrinker)
{
+ struct dentry *debugfs_entry;
+
if (!(shrinker->flags & SHRINKER_REGISTERED))
return;
@@ -755,9 +757,11 @@ void unregister_shrinker(struct shrinker
shrinker->flags &= ~SHRINKER_REGISTERED;
if (shrinker->flags & SHRINKER_MEMCG_AWARE)
unregister_memcg_shrinker(shrinker);
- shrinker_debugfs_remove(shrinker);
+ debugfs_entry = shrinker_debugfs_remove(shrinker);
up_write(&shrinker_rwsem);
+ debugfs_remove_recursive(debugfs_entry);
+
kfree(shrinker->nr_deferred);
shrinker->nr_deferred = NULL;
}
_
Patches currently in -mm which might be from zhengqi.arch(a)bytedance.com are
mm-shrinkers-fix-deadlock-in-shrinker-debugfs.patch
After the qmp phy driver was split it looks like 5.15.y stable kernels
aren't getting fixes like commit 7a7d86d14d07 ("phy: qcom-qmp-combo: fix
broken power on") which is tagged for stable 5.10. Trogdor boards use
the qmp phy on 5.15.y kernels, so I backported the fixes I could find
that looked like we may possibly trip over at some point.
USB and DP work on my Trogdor.Lazor board with this set.
Changes from v1 (https://lore.kernel.org/r/20230113005405.3992011-1-swboyd@chromium.org):
* New patch for memleak on probe deferal to avoid compat issues
* Update "fix broken power on" patch for pcie/ufs phy
Johan Hovold (5):
phy: qcom-qmp-combo: disable runtime PM on unbind
phy: qcom-qmp-combo: fix memleak on probe deferral
phy: qcom-qmp-usb: fix memleak on probe deferral
phy: qcom-qmp-combo: fix broken power on
phy: qcom-qmp-combo: fix runtime suspend
drivers/phy/qualcomm/phy-qcom-qmp.c | 97 ++++++++++++++++++-----------
1 file changed, 61 insertions(+), 36 deletions(-)
Cc: Johan Hovold <johan+linaro(a)kernel.org>
Cc: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
Cc: Vinod Koul <vkoul(a)kernel.org>
base-commit: d57287729e229188e7d07ef0117fe927664e08cb
--
https://chromeos.dev
spi_nor_set_erase_type() was used either to set or to mask out an erase
type. When we used it to mask out an erase type a shift-out-of-bounds
was hit:
UBSAN: shift-out-of-bounds in drivers/mtd/spi-nor/core.c:2237:24
shift exponent 4294967295 is too large for 32-bit type 'int'
size_shift and size_mask were not used anyway when erase size was set to
zero, so the functionality was not affected. The setting of the
size_{shift, mask} and of the opcode are unnecessary when the erase size
is zero, as throughout the code just the erase size is considered to
determine whether an erase type is supported or not. Setting the opcode
to 0xFF was wrong too as nobody guarantees that 0xFF is an unused opcode.
Thus condition the setting of these params by the erase size. This will
fix the shift-out-of-bounds too. While here avoid a superfluous
dereference and use 'size' directly.
Fixes: 5390a8df769e ("mtd: spi-nor: add support to non-uniform SFDP SPI NOR flash memories")
Cc: stable(a)vger.kernel.org
Reported-by: Alexander Stein <Alexander.Stein(a)tq-group.com>
Suggested-by: Louis Rannou <lrannou(a)baylibre.com>
Signed-off-by: Tudor Ambarus <tudor.ambarus(a)linaro.org>
---
v1 at https://lore.kernel.org/r/20211106075616.95401-1-tudor.ambarus@microchip.co…
drivers/mtd/spi-nor/core.c | 20 ++++++++++++++++----
drivers/mtd/spi-nor/core.h | 1 +
drivers/mtd/spi-nor/sfdp.c | 4 ++--
3 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index 247d1014879a..9b90d941d87a 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -2019,10 +2019,22 @@ void spi_nor_set_erase_type(struct spi_nor_erase_type *erase, u32 size,
u8 opcode)
{
erase->size = size;
- erase->opcode = opcode;
- /* JEDEC JESD216B Standard imposes erase sizes to be power of 2. */
- erase->size_shift = ffs(erase->size) - 1;
- erase->size_mask = (1 << erase->size_shift) - 1;
+
+ if (size) {
+ erase->opcode = opcode;
+ /* JEDEC JESD216B imposes erase sizes to be power of 2. */
+ erase->size_shift = ffs(size) - 1;
+ erase->size_mask = (1 << erase->size_shift) - 1;
+ }
+}
+
+/**
+ * spi_nor_mask_erase_type() - mask out an SPI NOR erase type
+ * @erase: pointer to a structure that describes a SPI NOR erase type
+ */
+void spi_nor_mask_erase_type(struct spi_nor_erase_type *erase)
+{
+ erase->size = 0;
}
/**
diff --git a/drivers/mtd/spi-nor/core.h b/drivers/mtd/spi-nor/core.h
index f6d012e1f681..25423225c29d 100644
--- a/drivers/mtd/spi-nor/core.h
+++ b/drivers/mtd/spi-nor/core.h
@@ -681,6 +681,7 @@ void spi_nor_set_pp_settings(struct spi_nor_pp_command *pp, u8 opcode,
void spi_nor_set_erase_type(struct spi_nor_erase_type *erase, u32 size,
u8 opcode);
+void spi_nor_mask_erase_type(struct spi_nor_erase_type *erase);
struct spi_nor_erase_region *
spi_nor_region_next(struct spi_nor_erase_region *region);
void spi_nor_init_uniform_erase_map(struct spi_nor_erase_map *map,
diff --git a/drivers/mtd/spi-nor/sfdp.c b/drivers/mtd/spi-nor/sfdp.c
index fd4daf8fa5df..298ab5e53a8c 100644
--- a/drivers/mtd/spi-nor/sfdp.c
+++ b/drivers/mtd/spi-nor/sfdp.c
@@ -875,7 +875,7 @@ static int spi_nor_init_non_uniform_erase_map(struct spi_nor *nor,
*/
for (i = 0; i < SNOR_ERASE_TYPE_MAX; i++)
if (!(regions_erase_type & BIT(erase[i].idx)))
- spi_nor_set_erase_type(&erase[i], 0, 0xFF);
+ spi_nor_mask_erase_type(&erase[i]);
return 0;
}
@@ -1089,7 +1089,7 @@ static int spi_nor_parse_4bait(struct spi_nor *nor,
erase_type[i].opcode = (dwords[SFDP_DWORD(2)] >>
erase_type[i].idx * 8) & 0xFF;
else
- spi_nor_set_erase_type(&erase_type[i], 0u, 0xFF);
+ spi_nor_mask_erase_type(&erase_type[i]);
}
/*
--
2.34.1
netvsc_dma_map() and netvsc_dma_unmap() currently check the cp_partial
flag and adjust the page_count so that pagebuf entries for the RNDIS
portion of the message are skipped when it has already been copied into
a send buffer. But this adjustment has already been made by code in
netvsc_send(). The duplicate adjustment causes some pagebuf entries to
not be mapped. In a normal VM, this doesn't break anything because the
mapping doesn’t change the PFN. But in a Confidential VM,
dma_map_single() does bounce buffering and provides a different PFN.
Failing to do the mapping causes the wrong PFN to be passed to Hyper-V,
and various errors ensue.
Fix this by removing the duplicate adjustment in netvsc_dma_map() and
netvsc_dma_unmap().
Fixes: 846da38de0e8 ("net: netvsc: Add Isolation VM support for netvsc driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael Kelley <mikelley(a)microsoft.com>
---
drivers/net/hyperv/netvsc.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 661bbe6..8acfa40 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -949,9 +949,6 @@ static void netvsc_copy_to_send_buf(struct netvsc_device *net_device,
void netvsc_dma_unmap(struct hv_device *hv_dev,
struct hv_netvsc_packet *packet)
{
- u32 page_count = packet->cp_partial ?
- packet->page_buf_cnt - packet->rmsg_pgcnt :
- packet->page_buf_cnt;
int i;
if (!hv_is_isolation_supported())
@@ -960,7 +957,7 @@ void netvsc_dma_unmap(struct hv_device *hv_dev,
if (!packet->dma_range)
return;
- for (i = 0; i < page_count; i++)
+ for (i = 0; i < packet->page_buf_cnt; i++)
dma_unmap_single(&hv_dev->device, packet->dma_range[i].dma,
packet->dma_range[i].mapping_size,
DMA_TO_DEVICE);
@@ -990,9 +987,7 @@ static int netvsc_dma_map(struct hv_device *hv_dev,
struct hv_netvsc_packet *packet,
struct hv_page_buffer *pb)
{
- u32 page_count = packet->cp_partial ?
- packet->page_buf_cnt - packet->rmsg_pgcnt :
- packet->page_buf_cnt;
+ u32 page_count = packet->page_buf_cnt;
dma_addr_t dma;
int i;
--
1.8.3.1
[Public]
> -----Original Message-----
> From: Sasha Levin <sashal(a)kernel.org>
> Sent: Wednesday, February 1, 2023 10:43
> To: stable-commits(a)vger.kernel.org; Limonciello, Mario
> <Mario.Limonciello(a)amd.com>
> Cc: Mika Westerberg <mika.westerberg(a)linux.intel.com>; Andy Shevchenko
> <andriy.shevchenko(a)linux.intel.com>; Linus Walleij
> <linus.walleij(a)linaro.org>; Bartosz Golaszewski <brgl(a)bgdev.pl>
> Subject: Patch "gpiolib: acpi: Allow ignoring wake capability on pins that aren't
> in _AEI" has been added to the 6.1-stable tree
>
> This is a note to let you know that I've just added the patch titled
>
> gpiolib: acpi: Allow ignoring wake capability on pins that aren't in _AEI
>
> to the 6.1-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-
> queue.git;a=summary
>
> The filename of the patch is:
> gpiolib-acpi-allow-ignoring-wake-capability-on-pins-.patch
> and it can be found in the queue-6.1 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Hi Sasha,
I suggest you also pick up two other fixes to go with this one.
1) this fix which was in the same series:
4cb786180dfb ("gpiolib: acpi: Add a ignore wakeup quirk for Clevo NL5xRU")
2) This fix which is tangentially related (fixes something from the same original
series that exposed the regressions).
d63f11c02b8d ("gpiolib-acpi: Don't set GPIOs for wakeup in S3 mode")
>
>
>
> commit 832a64f6e1557e017da518199dcab0e5fdca2808
> Author: Mario Limonciello <mario.limonciello(a)amd.com>
> Date: Mon Jan 16 13:37:01 2023 -0600
>
> gpiolib: acpi: Allow ignoring wake capability on pins that aren't in _AEI
>
> [ Upstream commit 0e3b175f079247f0d40d2ab695999c309d3a7498 ]
>
> Using the `ignore_wake` quirk or module parameter doesn't work for any
> pin
> that has been specified in the _CRS instead of _AEI.
>
> Extend the `acpi_gpio_irq_is_wake` check to cover both places.
>
> Suggested-by: Raul Rangel <rrangel(a)chromium.org>
> Link: https://gitlab.freedesktop.org/drm/amd/-
> /issues/1722#note_1722335
> Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
> Reviewed-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c
> index a7d2358736fe..27f234637a15 100644
> --- a/drivers/gpio/gpiolib-acpi.c
> +++ b/drivers/gpio/gpiolib-acpi.c
> @@ -361,7 +361,7 @@ static bool acpi_gpio_in_ignore_list(const char
> *ignore_list, const char *contro
> }
>
> static bool acpi_gpio_irq_is_wake(struct device *parent,
> - struct acpi_resource_gpio *agpio)
> + const struct acpi_resource_gpio *agpio)
> {
> unsigned int pin = agpio->pin_table[0];
>
> @@ -754,7 +754,7 @@ static int acpi_populate_gpio_lookup(struct
> acpi_resource *ares, void *data)
> lookup->info.pin_config = agpio->pin_config;
> lookup->info.debounce = agpio->debounce_timeout;
> lookup->info.gpioint = gpioint;
> - lookup->info.wake_capable = agpio->wake_capable ==
> ACPI_WAKE_CAPABLE;
> + lookup->info.wake_capable =
> acpi_gpio_irq_is_wake(&lookup->info.adev->dev, agpio);
>
> /*
> * Polarity and triggering are only specified for GpioInt
The powerclamp cooling device cur_state shows actual idle observed by
package C-state idle counters. But the implementation is not sufficient
for multi package or multi die system. The cur_state value is incorrect.
On these systems, these counters must be read from each package/die and
somehow aggregate them. But there is no good method for aggregation.
It was not a problem when explicit CPU model addition was required to
enable intel powerclamp. In this way certain CPU models could have
been avoided. But with the removal of CPU model check with the
availability of Package C-state counters, the driver is loaded on most
of the recent systems.
For multi package/die systems, just show the actual target idle state,
the system is trying to achieve. In powerclamp this is the user set
state minus one.
Also there is no use of starting a worker thread for polling package
C-state counters and applying any compensation for multiple package
or multiple die systems.
Fixes: b721ca0d1927 ("thermal/powerclamp: remove cpu whitelist")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # 4.14+
---
v2:
Changed: (true == clamping) to (clamping)
Updated commit description for the last paragraph
drivers/thermal/intel/intel_powerclamp.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/drivers/thermal/intel/intel_powerclamp.c b/drivers/thermal/intel/intel_powerclamp.c
index b80e25ec1261..2f4cbfdf26a0 100644
--- a/drivers/thermal/intel/intel_powerclamp.c
+++ b/drivers/thermal/intel/intel_powerclamp.c
@@ -57,6 +57,7 @@
static unsigned int target_mwait;
static struct dentry *debug_dir;
+static bool poll_pkg_cstate_enable;
/* user selected target */
static unsigned int set_target_ratio;
@@ -261,6 +262,9 @@ static unsigned int get_compensation(int ratio)
{
unsigned int comp = 0;
+ if (!poll_pkg_cstate_enable)
+ return 0;
+
/* we only use compensation if all adjacent ones are good */
if (ratio == 1 &&
cal_data[ratio].confidence >= CONFIDENCE_OK &&
@@ -519,7 +523,8 @@ static int start_power_clamp(void)
control_cpu = cpumask_first(cpu_online_mask);
clamping = true;
- schedule_delayed_work(&poll_pkg_cstate_work, 0);
+ if (poll_pkg_cstate_enable)
+ schedule_delayed_work(&poll_pkg_cstate_work, 0);
/* start one kthread worker per online cpu */
for_each_online_cpu(cpu) {
@@ -585,11 +590,15 @@ static int powerclamp_get_max_state(struct thermal_cooling_device *cdev,
static int powerclamp_get_cur_state(struct thermal_cooling_device *cdev,
unsigned long *state)
{
- if (true == clamping)
- *state = pkg_cstate_ratio_cur;
- else
+ if (clamping) {
+ if (poll_pkg_cstate_enable)
+ *state = pkg_cstate_ratio_cur;
+ else
+ *state = set_target_ratio;
+ } else {
/* to save power, do not poll idle ratio while not clamping */
*state = -1; /* indicates invalid state */
+ }
return 0;
}
@@ -712,6 +721,9 @@ static int __init powerclamp_init(void)
goto exit_unregister;
}
+ if (topology_max_packages() == 1 && topology_max_die_per_package() == 1)
+ poll_pkg_cstate_enable = true;
+
cooling_dev = thermal_cooling_device_register("intel_powerclamp", NULL,
&powerclamp_cooling_ops);
if (IS_ERR(cooling_dev)) {
--
2.39.1
From: Yu Kuai <yukuai3(a)huawei.com>
If cfqq is merged with another cfqq(assume that no cfqq is merged to
this cfqq) and then user thread is moved to another cgroup, then
check_blkcg_changed() will set cic->cfqq[1] to NULL. However, this cfqq is
still merged and no one will referece this cfqq, which will cause
following kmemleak:
comm "cgexec", pid 2541110, jiffies 4317578508 (age 5728.064s)
hex dump (first 32 bytes):
03 00 00 00 90 03 00 00 80 8d 0f be 20 80 ff ff ............ ...
a0 16 c6 a1 21 80 ff ff 00 00 00 00 00 00 00 00 ....!...........
backtrace:
[<0000000071d2775e>] kmem_cache_alloc_node+0x13c/0x660
[<00000000e827e6fd>] cfq_get_queue+0x318/0x6f0
[<00000000945249ee>] cfq_set_request+0x724/0xe38
[<00000000af64d5a9>] elv_set_request+0x84/0xd0
[<0000000047344f0d>] __get_request+0x970/0xf90
[<00000000d016bd51>] get_request+0x278/0x970
[<0000000052512d36>] blk_queue_bio+0x278/0xcc0
[<0000000085282101>] generic_make_request+0x3c0/0x778
[<0000000089f69c24>] submit_bio+0xdc/0x408
[<00000000748509ae>] ext4_mpage_readpages+0xc84/0x13b8 [ext4]
[<0000000078bfc705>] ext4_readpages+0x80/0xc0 [ext4]
[<00000000317f2ed7>] read_pages+0xd0/0x610
[<00000000e7ab01d2>] __do_page_cache_readahead+0x36c/0x430
[<0000000071df2285>] ondemand_readahead+0x24c/0x6b0
[<00000000124a824a>] page_cache_sync_readahead+0x188/0x448
[<00000000bca561ae>] generic_file_buffered_read+0x648/0x2b58
unreferenced object 0xffffa028411d4720 (size 240):
Fix the problem by put cooperator in check_blkcg_changed(), so that old
cfqq can be freed.
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
---
block/cfq-iosched.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 2eb87444b157..88bae553ece3 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -3778,6 +3778,7 @@ static void check_blkcg_changed(struct cfq_io_cq *cic, struct bio *bio)
if (cfqq) {
cfq_log_cfqq(cfqd, cfqq, "changed cgroup");
cic_set_cfqq(cic, NULL, true);
+ cfq_put_cooperator(cfqq);
cfq_put_queue(cfqq);
}
--
2.31.1
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail.
Switch to using the new API where the provider is not registered until
after it has been fully initialised.
Fixes: d5ef16ba5fbe ("memory: tegra20: Support interconnect framework")
Cc: stable(a)vger.kernel.org # 5.11
Cc: Dmitry Osipenko <digetx(a)gmail.com>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/memory/tegra/tegra30-emc.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/memory/tegra/tegra30-emc.c b/drivers/memory/tegra/tegra30-emc.c
index 77706e9bc543..c91e9b7e2e01 100644
--- a/drivers/memory/tegra/tegra30-emc.c
+++ b/drivers/memory/tegra/tegra30-emc.c
@@ -1533,15 +1533,13 @@ static int tegra_emc_interconnect_init(struct tegra_emc *emc)
emc->provider.aggregate = soc->icc_ops->aggregate;
emc->provider.xlate_extended = emc_of_icc_xlate_extended;
- err = icc_provider_add(&emc->provider);
- if (err)
- goto err_msg;
+ icc_provider_init(&emc->provider);
/* create External Memory Controller node */
node = icc_node_create(TEGRA_ICC_EMC);
if (IS_ERR(node)) {
err = PTR_ERR(node);
- goto del_provider;
+ goto err_msg;
}
node->name = "External Memory Controller";
@@ -1562,12 +1560,14 @@ static int tegra_emc_interconnect_init(struct tegra_emc *emc)
node->name = "External Memory (DRAM)";
icc_node_add(node, &emc->provider);
+ err = icc_provider_register(&emc->provider);
+ if (err)
+ goto remove_nodes;
+
return 0;
remove_nodes:
icc_nodes_remove(&emc->provider);
-del_provider:
- icc_provider_del(&emc->provider);
err_msg:
dev_err(emc->dev, "failed to initialize ICC: %d\n", err);
--
2.39.1
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail.
Switch to using the new API where the provider is not registered until
after it has been fully initialised.
Fixes: d5ef16ba5fbe ("memory: tegra20: Support interconnect framework")
Cc: stable(a)vger.kernel.org # 5.11
Cc: Dmitry Osipenko <digetx(a)gmail.com>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/memory/tegra/tegra20-emc.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/memory/tegra/tegra20-emc.c b/drivers/memory/tegra/tegra20-emc.c
index bd4e37b6552d..fd595c851a27 100644
--- a/drivers/memory/tegra/tegra20-emc.c
+++ b/drivers/memory/tegra/tegra20-emc.c
@@ -1021,15 +1021,13 @@ static int tegra_emc_interconnect_init(struct tegra_emc *emc)
emc->provider.aggregate = soc->icc_ops->aggregate;
emc->provider.xlate_extended = emc_of_icc_xlate_extended;
- err = icc_provider_add(&emc->provider);
- if (err)
- goto err_msg;
+ icc_provider_init(&emc->provider);
/* create External Memory Controller node */
node = icc_node_create(TEGRA_ICC_EMC);
if (IS_ERR(node)) {
err = PTR_ERR(node);
- goto del_provider;
+ goto err_msg;
}
node->name = "External Memory Controller";
@@ -1050,12 +1048,14 @@ static int tegra_emc_interconnect_init(struct tegra_emc *emc)
node->name = "External Memory (DRAM)";
icc_node_add(node, &emc->provider);
+ err = icc_provider_register(&emc->provider);
+ if (err)
+ goto remove_nodes;
+
return 0;
remove_nodes:
icc_nodes_remove(&emc->provider);
-del_provider:
- icc_provider_del(&emc->provider);
err_msg:
dev_err(emc->dev, "failed to initialize ICC: %d\n", err);
--
2.39.1
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail.
Switch to using the new API where the provider is not registered until
after it has been fully initialised.
Fixes: 380def2d4cf2 ("memory: tegra124: Support interconnect framework")
Cc: stable(a)vger.kernel.org # 5.12
Cc: Dmitry Osipenko <digetx(a)gmail.com>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/memory/tegra/tegra124-emc.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/memory/tegra/tegra124-emc.c b/drivers/memory/tegra/tegra124-emc.c
index 85bc936c02f9..00ed2b6a0d1b 100644
--- a/drivers/memory/tegra/tegra124-emc.c
+++ b/drivers/memory/tegra/tegra124-emc.c
@@ -1351,15 +1351,13 @@ static int tegra_emc_interconnect_init(struct tegra_emc *emc)
emc->provider.aggregate = soc->icc_ops->aggregate;
emc->provider.xlate_extended = emc_of_icc_xlate_extended;
- err = icc_provider_add(&emc->provider);
- if (err)
- goto err_msg;
+ icc_provider_init(&emc->provider);
/* create External Memory Controller node */
node = icc_node_create(TEGRA_ICC_EMC);
if (IS_ERR(node)) {
err = PTR_ERR(node);
- goto del_provider;
+ goto err_msg;
}
node->name = "External Memory Controller";
@@ -1380,12 +1378,14 @@ static int tegra_emc_interconnect_init(struct tegra_emc *emc)
node->name = "External Memory (DRAM)";
icc_node_add(node, &emc->provider);
+ err = icc_provider_register(&emc->provider);
+ if (err)
+ goto remove_nodes;
+
return 0;
remove_nodes:
icc_nodes_remove(&emc->provider);
-del_provider:
- icc_provider_del(&emc->provider);
err_msg:
dev_err(emc->dev, "failed to initialize ICC: %d\n", err);
--
2.39.1
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to trigger a NULL-pointer
deference when either a NULL pointer or not fully initialised node is
returned from exynos_generic_icc_xlate().
Switch to using the new API where the provider is not registered until
after it has been fully initialised.
Fixes: 2f95b9d5cf0b ("interconnect: Add generic interconnect driver for Exynos SoCs")
Cc: stable(a)vger.kernel.org # 5.11
Cc: Sylwester Nawrocki <s.nawrocki(a)samsung.com>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/interconnect/samsung/exynos.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/interconnect/samsung/exynos.c b/drivers/interconnect/samsung/exynos.c
index e70665899482..72e42603823b 100644
--- a/drivers/interconnect/samsung/exynos.c
+++ b/drivers/interconnect/samsung/exynos.c
@@ -98,12 +98,13 @@ static int exynos_generic_icc_remove(struct platform_device *pdev)
struct exynos_icc_priv *priv = platform_get_drvdata(pdev);
struct icc_node *parent_node, *node = priv->node;
+ icc_provider_deregister(&priv->provider);
+
parent_node = exynos_icc_get_parent(priv->dev->parent->of_node);
if (parent_node && !IS_ERR(parent_node))
icc_link_destroy(node, parent_node);
icc_nodes_remove(&priv->provider);
- icc_provider_del(&priv->provider);
return 0;
}
@@ -132,15 +133,11 @@ static int exynos_generic_icc_probe(struct platform_device *pdev)
provider->inter_set = true;
provider->data = priv;
- ret = icc_provider_add(provider);
- if (ret < 0)
- return ret;
+ icc_provider_init(provider);
icc_node = icc_node_create(pdev->id);
- if (IS_ERR(icc_node)) {
- ret = PTR_ERR(icc_node);
- goto err_prov_del;
- }
+ if (IS_ERR(icc_node))
+ return PTR_ERR(icc_node);
priv->node = icc_node;
icc_node->name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%pOFn",
@@ -171,14 +168,17 @@ static int exynos_generic_icc_probe(struct platform_device *pdev)
goto err_pmqos_del;
}
+ ret = icc_provider_register(provider);
+ if (ret < 0)
+ goto err_pmqos_del;
+
return 0;
err_pmqos_del:
dev_pm_qos_remove_request(&priv->qos_req);
err_node_del:
icc_nodes_remove(provider);
-err_prov_del:
- icc_provider_del(provider);
+
return ret;
}
--
2.39.1
This is a note to let you know that I've just added the patch titled
usb: typec: ucsi: Don't attempt to resume the ports before they exist
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From f82060da749c611ed427523b6d1605d87338aac1 Mon Sep 17 00:00:00 2001
From: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Date: Tue, 31 Jan 2023 16:15:18 +0200
Subject: usb: typec: ucsi: Don't attempt to resume the ports before they exist
This will fix null pointer dereference that was caused by
the driver attempting to resume ports that were not yet
registered.
Fixes: e0dced9c7d47 ("usb: typec: ucsi: Resume in separate work")
Cc: <stable(a)vger.kernel.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216697
Signed-off-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://lore.kernel.org/r/20230131141518.78215-1-heikki.krogerus@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/typec/ucsi/ucsi.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index 1292241d581a..1cf8947c6d66 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -1269,6 +1269,9 @@ static int ucsi_init(struct ucsi *ucsi)
con->port = NULL;
}
+ kfree(ucsi->connector);
+ ucsi->connector = NULL;
+
err_reset:
memset(&ucsi->cap, 0, sizeof(ucsi->cap));
ucsi_reset_ppm(ucsi);
@@ -1300,7 +1303,8 @@ static void ucsi_resume_work(struct work_struct *work)
int ucsi_resume(struct ucsi *ucsi)
{
- queue_work(system_long_wq, &ucsi->resume_work);
+ if (ucsi->connector)
+ queue_work(system_long_wq, &ucsi->resume_work);
return 0;
}
EXPORT_SYMBOL_GPL(ucsi_resume);
@@ -1420,6 +1424,9 @@ void ucsi_unregister(struct ucsi *ucsi)
/* Disable notifications */
ucsi->ops->async_write(ucsi, UCSI_CONTROL, &cmd, sizeof(cmd));
+ if (!ucsi->connector)
+ return;
+
for (i = 0; i < ucsi->cap.num_connectors; i++) {
cancel_work_sync(&ucsi->connector[i].work);
ucsi_unregister_partner(&ucsi->connector[i]);
--
2.39.1
From: Oliver Hartkopp <socketcan(a)hartkopp.net>
When wait_event_interruptible() has been interrupted by a signal the
tx.state value might not be ISOTP_IDLE. Force the state machines
into idle state to inhibit the timer handlers to continue working.
Fixes: 866337865f37 ("can: isotp: fix tx state handling for echo tx processing")
Cc: stable(a)vger.kernel.org
Signed-off-by: Oliver Hartkopp <socketcan(a)hartkopp.net>
Link: https://lore.kernel.org/all/20230112192347.1944-1-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
net/can/isotp.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 608f8c24ae46..dae421f6c901 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -1162,6 +1162,10 @@ static int isotp_release(struct socket *sock)
/* wait for complete transmission of current pdu */
wait_event_interruptible(so->wait, so->tx.state == ISOTP_IDLE);
+ /* force state machines to be idle also when a signal occurred */
+ so->tx.state = ISOTP_IDLE;
+ so->rx.state = ISOTP_IDLE;
+
spin_lock(&isotp_notifier_lock);
while (isotp_busy_notifier == so) {
spin_unlock(&isotp_notifier_lock);
--
2.39.1
The following patch is required to fix the ext4 online resize in 5.15.
Baokun Li (1):
ext4: fix bad checksum after online resize
fs/ext4/resize.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--
2.39.1.456.gfc5497dd1b-goog
Commit c653c591789b ("drm/amdgpu: Re-enable DCN for 64-bit powerpc")
introduced this check as a workaround for the driver not building
with toolchains that default to 64-bit long double.
The reason things worked on 128-bit-long-double toolchains and
not otherwise was however largely accidental. The real issue was
that some files containing floating point code were compiled
without -mhard-float, while others were compiled with -mhard-float.
The PowerPC compilers tag object files that use long doubles with
a special ABI tag in order to differentiate 64-bit long double,
IBM long double and IEEE 128-bit long double. When no long double
is used in a source file, the file does not receive the ABI tag.
Since only regular doubles are used in the AMDGPU source, there
is no ABI tag on the soft-float object files with 128-bit-ldbl
compilers, and therefore no error. With 64-bit long double,
the double and long double types are equal, and an ABI tag is
introduced.
Of course, this resulted in the real bug, which was mixing of
hard and soft float object files, getting hidden, which makes
this check technically incorrect.
Since then, work has been done to ensure that all float code is
separately compiled. This was also necessary in order to enable
AArch64 support in the display stack, as AArch64 does not have any
soft-float ABI, and all code that does not explicitly conatain
floats is compiled with -mgeneral-regs-only, which prevents
float-using code from being compiled at all. That means AArch64
support will from now on always safeguard against such cases
happening ever again.
In mainline, this work is now fully done, so this check is fully
redundant and does not do anything except preventing AMDGPU DC
from being built on systems such as those using musl libc. The
last piece of work to enable this was commit c92b7fe0d92a
("drm/amd/display: move remaining FPU code to dml folder")
and this has since been backported to 6.1 stable (in 6.1.7).
Relevant issue: https://gitlab.freedesktop.org/drm/amd/-/issues/2288
Signed-off-by: Daniel Kolesa <daniel(a)octaforge.org>
---
arch/powerpc/Kconfig | 4 ----
drivers/gpu/drm/amd/display/Kconfig | 2 +-
2 files changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b8c4ac56b..267805072 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -289,10 +289,6 @@ config PPC
# Please keep this list sorted alphabetically.
#
-config PPC_LONG_DOUBLE_128
- depends on PPC64 && ALTIVEC
- def_bool $(success,test "$(shell,echo __LONG_DOUBLE_128__ | $(CC) -E -P -)" = 1)
-
config PPC_BARRIER_NOSPEC
bool
default y
diff --git a/drivers/gpu/drm/amd/display/Kconfig b/drivers/gpu/drm/amd/display/Kconfig
index 2efe93f74..94645b6ef 100644
--- a/drivers/gpu/drm/amd/display/Kconfig
+++ b/drivers/gpu/drm/amd/display/Kconfig
@@ -8,7 +8,7 @@ config DRM_AMD_DC
depends on BROKEN || !CC_IS_CLANG || X86_64 || SPARC64 || ARM64
select SND_HDA_COMPONENT if SND_HDA_CORE
# !CC_IS_CLANG: https://github.com/ClangBuiltLinux/linux/issues/1752
- select DRM_AMD_DC_DCN if (X86 || PPC_LONG_DOUBLE_128 || (ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG))
+ select DRM_AMD_DC_DCN if (X86 || PPC64 || (ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG))
help
Choose this option if you want to use the new display engine
support for AMDGPU. This adds required support for Vega and
--
2.34.1
The powerclamp cooling device cur_state shows actual idle observed by
package C-state idle counters. But the implementation is not sufficient
for multi package or multi die system. The cur_state value is incorrect.
On these systems, these counters must be read from each package/die and
somehow aggregate them. But there is no good method for aggregation.
It was not a problem when explicit CPU model addition was required to
enable intel powerclamp. In this way certain CPU models could have
been avoided. But with the removal of CPU model check with the
availability of Package C-state counters, the driver is loaded on most
of the recent systems.
For multi package/die systems, just show the actual target idle state,
the system is trying to achieve. In powerclamp this is the user set
state minus one.
Also there is no use of starting a worker thread for polling package
C-state counters and applying any compensation.
Fixes: b721ca0d1927 ("thermal/powerclamp: remove cpu whitelist")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # 4.14+
---
drivers/thermal/intel/intel_powerclamp.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/drivers/thermal/intel/intel_powerclamp.c b/drivers/thermal/intel/intel_powerclamp.c
index b80e25ec1261..64f082c584b2 100644
--- a/drivers/thermal/intel/intel_powerclamp.c
+++ b/drivers/thermal/intel/intel_powerclamp.c
@@ -57,6 +57,7 @@
static unsigned int target_mwait;
static struct dentry *debug_dir;
+static bool poll_pkg_cstate_enable;
/* user selected target */
static unsigned int set_target_ratio;
@@ -261,6 +262,9 @@ static unsigned int get_compensation(int ratio)
{
unsigned int comp = 0;
+ if (!poll_pkg_cstate_enable)
+ return 0;
+
/* we only use compensation if all adjacent ones are good */
if (ratio == 1 &&
cal_data[ratio].confidence >= CONFIDENCE_OK &&
@@ -519,7 +523,8 @@ static int start_power_clamp(void)
control_cpu = cpumask_first(cpu_online_mask);
clamping = true;
- schedule_delayed_work(&poll_pkg_cstate_work, 0);
+ if (poll_pkg_cstate_enable)
+ schedule_delayed_work(&poll_pkg_cstate_work, 0);
/* start one kthread worker per online cpu */
for_each_online_cpu(cpu) {
@@ -585,11 +590,15 @@ static int powerclamp_get_max_state(struct thermal_cooling_device *cdev,
static int powerclamp_get_cur_state(struct thermal_cooling_device *cdev,
unsigned long *state)
{
- if (true == clamping)
- *state = pkg_cstate_ratio_cur;
- else
+ if (true == clamping) {
+ if (poll_pkg_cstate_enable)
+ *state = pkg_cstate_ratio_cur;
+ else
+ *state = set_target_ratio;
+ } else {
/* to save power, do not poll idle ratio while not clamping */
*state = -1; /* indicates invalid state */
+ }
return 0;
}
@@ -712,6 +721,9 @@ static int __init powerclamp_init(void)
goto exit_unregister;
}
+ if (topology_max_packages() == 1 && topology_max_die_per_package() == 1)
+ poll_pkg_cstate_enable = true;
+
cooling_dev = thermal_cooling_device_register("intel_powerclamp", NULL,
&powerclamp_cooling_ops);
if (IS_ERR(cooling_dev)) {
--
2.39.1
I think it is a security problem to send confidential data in plaintext
over the wire, so we should avoid doing that even if rdma is in use.
We already have a similar check to prevent data integrity problems
for rdma offload.
Modern Windows servers support signed and encrypted rdma offload,
but we don't support this yet...
Changes v2:
- Added missing Cc: list on commit 2/3
Stefan Metzmacher (3):
cifs: introduce cifs_io_parms in smb2_async_writev()
cifs: split out smb3_use_rdma_offload() helper
cifs: don't try to use rdma offload on encrypted connections
fs/cifs/smb2pdu.c | 89 +++++++++++++++++++++++++++++++++++++----------
1 file changed, 70 insertions(+), 19 deletions(-)
--
2.34.1
Samsung Galaxy Book2 Pro 360 (13" 2022 NP930QED-KA1FR) with codec SSID
144d:ca03 requires the same workaround for enabling the speaker amp
like other Samsung models with ALC298 codec.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Guillaume Pinot <texitoi(a)texitoi.eu>
---
I've tested this fix on my laptop with success. I've took "inspiration" from
https://lore.kernel.org/all/20221115170235.18875-1-tiwai@suse.de/
This is my first contribution, so feel free to give me feedbacks!
sound/pci/hda/patch_realtek.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index 6fab7c8fc19a..c4496206c3e7 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -9521,6 +9521,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x144d, 0xc812, "Samsung Notebook Pen S (NT950SBE-X58)", ALC298_FIXUP_SAMSUNG_AMP),
SND_PCI_QUIRK(0x144d, 0xc830, "Samsung Galaxy Book Ion (NT950XCJ-X716A)", ALC298_FIXUP_SAMSUNG_AMP),
SND_PCI_QUIRK(0x144d, 0xc832, "Samsung Galaxy Book Flex Alpha (NP730QCJ)", ALC256_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET),
+ SND_PCI_QUIRK(0x144d, 0xca03, "Samsung Galaxy Book2 Pro 360 (NP930QED)", ALC298_FIXUP_SAMSUNG_AMP),
SND_PCI_QUIRK(0x1458, 0xfa53, "Gigabyte BXBT-2807", ALC283_FIXUP_HEADSET_MIC),
SND_PCI_QUIRK(0x1462, 0xb120, "MSI Cubi MS-B120", ALC283_FIXUP_HEADSET_MIC),
SND_PCI_QUIRK(0x1462, 0xb171, "Cubi N 8GL (MS-B171)", ALC283_FIXUP_HEADSET_MIC),
--
2.30.2
I think it is a security problem to send confidential data in plaintext
over the wire, so we should avoid doing that even if rdma is in use.
We already have a similar check to prevent data integrity problems
for rdma offload.
Modern Windows servers support signed and encrypted rdma offload,
but we don't support this yet...
Stefan Metzmacher (3):
cifs: introduce cifs_io_parms in smb2_async_writev()
cifs: split out smb3_use_rdma_offload() helper
cifs: don't try to use rdma offload on encrypted connections
fs/cifs/smb2pdu.c | 89 +++++++++++++++++++++++++++++++++++++----------
1 file changed, 70 insertions(+), 19 deletions(-)
--
2.34.1
This bug is marked as fixed by commit:
ext4: block range must be validated before use in ext4_mb_clear_bb()
But I can't find it in the tested trees[1] for more than 90 days.
Is it a correct commit? Please update it by replying:
#syz fix: exact-commit-title
Until then the bug is still considered open and new crashes with
the same signature are ignored.
Kernel: Android 5.10
Dashboard link: https://syzkaller.appspot.com/bug?extid=15cd994e273307bf5cfa
---
[1] I expect the commit to be present in:
1. android12-5.10-lts branch of
https://android.googlesource.com/kernel/common
The interconnect framework currently expects that providers are only
removed when there are no users and after all nodes have been removed.
There is currently nothing that guarantees this to be the case and the
framework does not do any reference counting, but refusing to remove the
provider is never correct as that would leave a dangling pointer to a
resource that is about to be released in the global provider list (e.g.
accessible through debugfs).
Replace the current sanity checks with WARN_ON() so that the provider is
always removed.
Fixes: 11f1ceca7031 ("interconnect: Add generic on-chip interconnect API")
Cc: stable(a)vger.kernel.org # 5.1: 680f8666baf6: interconnect: Make icc_provider_del() return void
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/interconnect/core.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index dc61620a0191..43c5c8503ee8 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -1062,18 +1062,8 @@ EXPORT_SYMBOL_GPL(icc_provider_add);
void icc_provider_del(struct icc_provider *provider)
{
mutex_lock(&icc_lock);
- if (provider->users) {
- pr_warn("interconnect provider still has %d users\n",
- provider->users);
- mutex_unlock(&icc_lock);
- return;
- }
-
- if (!list_empty(&provider->nodes)) {
- pr_warn("interconnect provider still has nodes\n");
- mutex_unlock(&icc_lock);
- return;
- }
+ WARN_ON(provider->users);
+ WARN_ON(!list_empty(&provider->nodes));
list_del(&provider->provider_list);
mutex_unlock(&icc_lock);
--
2.39.1
Fix two races in DMA Rx completion and rearm handling.
I know in advance that the first patch will not be backport friendly
because of commit 56dc5074cbec ("serial: 8250_dma: Rearm DMA Rx if more
data is pending") that is not even in 6.1 but I can create the custom
backport on request.
Cc: stable(a)vger.kernel.org
Ilpo Järvinen (2):
serial: 8250_dma: Fix DMA Rx completion race
serial: 8250_dma: Fix DMA Rx rearm race
drivers/tty/serial/8250/8250_dma.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
--
2.30.2
Working as a personal assistant for a businessman who is establishing a new location of his company in your city will pay you $750 a week. You will primarily work from home, and you will receive sufficient training and training materials to improve your ability to deliver services accurately and without stress. Please fill out the google application for if you are interested in working as a personal assistant part-time for $750 weekly docs.google.com/forms/d/e/1FAIpQLSev9NsI0Yej9ewR2JyVf_WxNcQvBrEtWeNrD57fBoJ…
The quilt patch titled
Subject: mm: memcg: fix NULL pointer in mem_cgroup_track_foreign_dirty_slowpath()
has been removed from the -mm tree. Its filename was
mm-memcg-fix-null-pointer-in-mem_cgroup_track_foreign_dirty_slowpath.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Subject: mm: memcg: fix NULL pointer in mem_cgroup_track_foreign_dirty_slowpath()
Date: Sun, 29 Jan 2023 12:09:45 +0800
As commit 18365225f044 ("hwpoison, memcg: forcibly uncharge LRU pages"),
hwpoison will forcibly uncharg a LRU hwpoisoned page, the folio_memcg
could be NULl, then, mem_cgroup_track_foreign_dirty_slowpath() could
occurs a NULL pointer dereference, let's do not record the foreign
writebacks for folio memcg is null in mem_cgroup_track_foreign_dirty() to
fix it.
Link: https://lkml.kernel.org/r/20230129040945.180629-1-wangkefeng.wang@huawei.com
Fixes: 97b27821b485 ("writeback, memcg: Implement foreign dirty flushing")
Signed-off-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Reported-by: Ma Wupeng <mawupeng1(a)huawei.com>
Tested-by: Miko Larsson <mikoxyzzz(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Ma Wupeng <mawupeng1(a)huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/memcontrol.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/include/linux/memcontrol.h~mm-memcg-fix-null-pointer-in-mem_cgroup_track_foreign_dirty_slowpath
+++ a/include/linux/memcontrol.h
@@ -1666,10 +1666,13 @@ void mem_cgroup_track_foreign_dirty_slow
static inline void mem_cgroup_track_foreign_dirty(struct folio *folio,
struct bdi_writeback *wb)
{
+ struct mem_cgroup *memcg;
+
if (mem_cgroup_disabled())
return;
- if (unlikely(&folio_memcg(folio)->css != wb->memcg_css))
+ memcg = folio_memcg(folio);
+ if (unlikely(memcg && &memcg->css != wb->memcg_css))
mem_cgroup_track_foreign_dirty_slowpath(folio, wb);
}
_
Patches currently in -mm which might be from wangkefeng.wang(a)huawei.com are
mm-hwposion-support-recovery-from-ksm_might_need_to_copy.patch
mm-hwposion-support-recovery-from-ksm_might_need_to_copy-v3.patch
mm-madvise-use-vm_normal_folio-in-madvise_free_pte_range.patch
The quilt patch titled
Subject: mm/swapfile: add cond_resched() in get_swap_pages()
has been removed from the -mm tree. Its filename was
mm-swapfile-add-cond_resched-in-get_swap_pages.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Longlong Xia <xialonglong1(a)huawei.com>
Subject: mm/swapfile: add cond_resched() in get_swap_pages()
Date: Sat, 28 Jan 2023 09:47:57 +0000
The softlockup still occurs in get_swap_pages() under memory pressure. 64
CPU cores, 64GB memory, and 28 zram devices, the disksize of each zram
device is 50MB with same priority as si. Use the stress-ng tool to
increase memory pressure, causing the system to oom frequently.
The plist_for_each_entry_safe() loops in get_swap_pages() could reach tens
of thousands of times to find available space (extreme case:
cond_resched() is not called in scan_swap_map_slots()). Let's add
cond_resched() into get_swap_pages() when failed to find available space
to avoid softlockup.
Link: https://lkml.kernel.org/r/20230128094757.1060525-1-xialonglong1@huawei.com
Signed-off-by: Longlong Xia <xialonglong1(a)huawei.com>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Chen Wandun <chenwandun(a)huawei.com>
Cc: Huang Ying <ying.huang(a)intel.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Nanyong Sun <sunnanyong(a)huawei.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swapfile.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/swapfile.c~mm-swapfile-add-cond_resched-in-get_swap_pages
+++ a/mm/swapfile.c
@@ -1100,6 +1100,7 @@ start_over:
goto check_out;
pr_debug("scan_swap_map of si %d failed to find offset\n",
si->type);
+ cond_resched();
spin_lock(&swap_avail_lock);
nextsi:
_
Patches currently in -mm which might be from xialonglong1(a)huawei.com are
mm-swapfile-remove-pr_debug-in-get_swap_pages.patch
The quilt patch titled
Subject: Squashfs: fix handling and sanity checking of xattr_ids count
has been removed from the -mm tree. Its filename was
squashfs-fix-handling-and-sanity-checking-of-xattr_ids-count.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: Squashfs: fix handling and sanity checking of xattr_ids count
Date: Fri, 27 Jan 2023 06:18:42 +0000
A Sysbot [1] corrupted filesystem exposes two flaws in the handling and
sanity checking of the xattr_ids count in the filesystem. Both of these
flaws cause computation overflow due to incorrect typing.
In the corrupted filesystem the xattr_ids value is 4294967071, which
stored in a signed variable becomes the negative number -225.
Flaw 1 (64-bit systems only):
The signed integer xattr_ids variable causes sign extension.
This causes variable overflow in the SQUASHFS_XATTR_*(A) macros. The
variable is first multiplied by sizeof(struct squashfs_xattr_id) where the
type of the sizeof operator is "unsigned long".
On a 64-bit system this is 64-bits in size, and causes the negative number
to be sign extended and widened to 64-bits and then become unsigned. This
produces the very large number 18446744073709548016 or 2^64 - 3600. This
number when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and
divided by SQUASHFS_METADATA_SIZE overflows and produces a length of 0
(stored in len).
Flaw 2 (32-bit systems only):
On a 32-bit system the integer variable is not widened by the unsigned
long type of the sizeof operator (32-bits), and the signedness of the
variable has no effect due it always being treated as unsigned.
The above corrupted xattr_ids value of 4294967071, when multiplied
overflows and produces the number 4294963696 or 2^32 - 3400. This number
when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and divided by
SQUASHFS_METADATA_SIZE overflows again and produces a length of 0.
The effect of the 0 length computation:
In conjunction with the corrupted xattr_ids field, the filesystem also has
a corrupted xattr_table_start value, where it matches the end of
filesystem value of 850.
This causes the following sanity check code to fail because the
incorrectly computed len of 0 matches the incorrect size of the table
reported by the superblock (0 bytes).
len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids);
/*
* The computed size of the index table (len bytes) should exactly
* match the table start and end points
*/
start = table_start + sizeof(*id_table);
end = msblk->bytes_used;
if (len != (end - start))
return ERR_PTR(-EINVAL);
Changing the xattr_ids variable to be "usigned int" fixes the flaw on a
64-bit system. This relies on the fact the computation is widened by the
unsigned long type of the sizeof operator.
Casting the variable to u64 in the above macro fixes this flaw on a 32-bit
system.
It also means 64-bit systems do not implicitly rely on the type of the
sizeof operator to widen the computation.
[1] https://lore.kernel.org/lkml/000000000000cd44f005f1a0f17f@google.com/
Link: https://lkml.kernel.org/r/20230127061842.10965-1-phillip@squashfs.org.uk
Fixes: 506220d2ba21 ("squashfs: add more sanity checks in xattr id lookup")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: <syzbot+082fa4af80a5bb1a9843(a)syzkaller.appspotmail.com>
Cc: Alexey Khoroshilov <khoroshilov(a)ispras.ru>
Cc: Fedor Pchelkin <pchelkin(a)ispras.ru>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/squashfs_fs.h | 2 +-
fs/squashfs/squashfs_fs_sb.h | 2 +-
fs/squashfs/xattr.h | 4 ++--
fs/squashfs/xattr_id.c | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)
--- a/fs/squashfs/squashfs_fs.h~squashfs-fix-handling-and-sanity-checking-of-xattr_ids-count
+++ a/fs/squashfs/squashfs_fs.h
@@ -183,7 +183,7 @@ static inline int squashfs_block_size(__
#define SQUASHFS_ID_BLOCK_BYTES(A) (SQUASHFS_ID_BLOCKS(A) *\
sizeof(u64))
/* xattr id lookup table defines */
-#define SQUASHFS_XATTR_BYTES(A) ((A) * sizeof(struct squashfs_xattr_id))
+#define SQUASHFS_XATTR_BYTES(A) (((u64) (A)) * sizeof(struct squashfs_xattr_id))
#define SQUASHFS_XATTR_BLOCK(A) (SQUASHFS_XATTR_BYTES(A) / \
SQUASHFS_METADATA_SIZE)
--- a/fs/squashfs/squashfs_fs_sb.h~squashfs-fix-handling-and-sanity-checking-of-xattr_ids-count
+++ a/fs/squashfs/squashfs_fs_sb.h
@@ -63,7 +63,7 @@ struct squashfs_sb_info {
long long bytes_used;
unsigned int inodes;
unsigned int fragments;
- int xattr_ids;
+ unsigned int xattr_ids;
unsigned int ids;
bool panic_on_errors;
const struct squashfs_decompressor_thread_ops *thread_ops;
--- a/fs/squashfs/xattr.h~squashfs-fix-handling-and-sanity-checking-of-xattr_ids-count
+++ a/fs/squashfs/xattr.h
@@ -10,12 +10,12 @@
#ifdef CONFIG_SQUASHFS_XATTR
extern __le64 *squashfs_read_xattr_id_table(struct super_block *, u64,
- u64 *, int *);
+ u64 *, unsigned int *);
extern int squashfs_xattr_lookup(struct super_block *, unsigned int, int *,
unsigned int *, unsigned long long *);
#else
static inline __le64 *squashfs_read_xattr_id_table(struct super_block *sb,
- u64 start, u64 *xattr_table_start, int *xattr_ids)
+ u64 start, u64 *xattr_table_start, unsigned int *xattr_ids)
{
struct squashfs_xattr_id_table *id_table;
--- a/fs/squashfs/xattr_id.c~squashfs-fix-handling-and-sanity-checking-of-xattr_ids-count
+++ a/fs/squashfs/xattr_id.c
@@ -56,7 +56,7 @@ int squashfs_xattr_lookup(struct super_b
* Read uncompressed xattr id lookup table indexes from disk into memory
*/
__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 table_start,
- u64 *xattr_table_start, int *xattr_ids)
+ u64 *xattr_table_start, unsigned int *xattr_ids)
{
struct squashfs_sb_info *msblk = sb->s_fs_info;
unsigned int len, indexes;
_
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
The quilt patch titled
Subject: highmem: round down the address passed to kunmap_flush_on_unmap()
has been removed from the -mm tree. Its filename was
highmem-round-down-the-address-passed-to-kunmap_flush_on_unmap.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: highmem: round down the address passed to kunmap_flush_on_unmap()
Date: Thu, 26 Jan 2023 20:07:27 +0000
We already round down the address in kunmap_local_indexed() which is the
other implementation of __kunmap_local(). The only implementation of
kunmap_flush_on_unmap() is PA-RISC which is expecting a page-aligned
address. This may be causing PA-RISC to be flushing the wrong addresses
currently.
Link: https://lkml.kernel.org/r/20230126200727.1680362-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Fixes: 298fa1ad5571 ("highmem: Provide generic variant of kmap_atomic*")
Reviewed-by: Ira Weiny <ira.weiny(a)intel.com>
Cc: "Fabio M. De Francesco" <fmdefrancesco(a)gmail.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Helge Deller <deller(a)gmx.de>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Bagas Sanjaya <bagasdotme(a)gmail.com>
Cc: David Sterba <dsterba(a)suse.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/highmem-internal.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/highmem-internal.h~highmem-round-down-the-address-passed-to-kunmap_flush_on_unmap
+++ a/include/linux/highmem-internal.h
@@ -200,7 +200,7 @@ static inline void *kmap_local_pfn(unsig
static inline void __kunmap_local(const void *addr)
{
#ifdef ARCH_HAS_FLUSH_ON_KUNMAP
- kunmap_flush_on_unmap(addr);
+ kunmap_flush_on_unmap(PTR_ALIGN_DOWN(addr, PAGE_SIZE));
#endif
}
@@ -227,7 +227,7 @@ static inline void *kmap_atomic_pfn(unsi
static inline void __kunmap_atomic(const void *addr)
{
#ifdef ARCH_HAS_FLUSH_ON_KUNMAP
- kunmap_flush_on_unmap(addr);
+ kunmap_flush_on_unmap(PTR_ALIGN_DOWN(addr, PAGE_SIZE));
#endif
pagefault_enable();
if (IS_ENABLED(CONFIG_PREEMPT_RT))
_
Patches currently in -mm which might be from willy(a)infradead.org are
mm-remove-folio_pincount_ptr-and-head_compound_pincount.patch
mm-convert-head_subpages_mapcount-into-folio_nr_pages_mapped.patch
doc-clarify-refcount-section-by-referring-to-folios-pages.patch
mm-convert-total_compound_mapcount-to-folio_total_mapcount.patch
mm-convert-page_remove_rmap-to-use-a-folio-internally.patch
mm-convert-page_add_anon_rmap-to-use-a-folio-internally.patch
mm-convert-page_add_file_rmap-to-use-a-folio-internally.patch
mm-add-folio_add_new_anon_rmap.patch
mm-add-folio_add_new_anon_rmap-fix-2.patch
page_alloc-use-folio-fields-directly.patch
mm-use-a-folio-in-hugepage_add_anon_rmap-and-hugepage_add_new_anon_rmap.patch
mm-use-entire_mapcount-in-__page_dup_rmap.patch
mm-debug-remove-call-to-head_compound_mapcount.patch
hugetlb-remove-uses-of-folio_mapcount_ptr.patch
mm-convert-page_mapcount-to-use-folio_entire_mapcount.patch
mm-remove-head_compound_mapcount-and-_ptr-functions.patch
mm-reimplement-compound_order.patch
mm-reimplement-compound_nr.patch
mm-reimplement-compound_nr-fix.patch
mm-convert-set_compound_page_dtor-and-set_compound_order-to-folios.patch
mm-convert-is_transparent_hugepage-to-use-a-folio.patch
mm-convert-destroy_large_folio-to-use-folio_dtor.patch
hugetlb-remove-uses-of-compound_dtor-and-compound_nr.patch
mm-remove-first-tail-page-members-from-struct-page.patch
doc-correct-struct-folio-kernel-doc.patch
mm-move-page-deferred_list-to-folio-_deferred_list.patch
mm-huge_memory-remove-page_deferred_list.patch
mm-huge_memory-convert-get_deferred_split_queue-to-take-a-folio.patch
mm-convert-deferred_split_huge_page-to-deferred_split_folio.patch
shmem-convert-shmem_write_end-to-use-a-folio.patch
mm-add-vma_alloc_zeroed_movable_folio.patch
mm-convert-do_anonymous_page-to-use-a-folio.patch
mm-convert-wp_page_copy-to-use-folios.patch
mm-use-a-folio-in-copy_pte_range.patch
mm-use-a-folio-in-copy_present_pte.patch
mm-fs-convert-inode_attach_wb-to-take-a-folio.patch
mm-convert-mem_cgroup_css_from_page-to-mem_cgroup_css_from_folio.patch
mm-remove-page_evictable.patch
mm-remove-mlock_vma_page.patch
mm-remove-munlock_vma_page.patch
mm-clean-up-mlock_page-munlock_page-references-in-comments.patch
rmap-add-folio-parameter-to-__page_set_anon_rmap.patch
filemap-convert-filemap_map_pmd-to-take-a-folio.patch
filemap-convert-filemap_range_has_page-to-use-a-folio.patch
readahead-convert-readahead_expand-to-use-a-folio.patch
mm-add-memcpy_from_file_folio.patch
fs-convert-writepage_t-callback-to-pass-a-folio.patch
mpage-convert-__mpage_writepage-to-use-a-folio-more-fully.patch
mpage-convert-__mpage_writepage-to-use-a-folio-more-fully-fix.patch
The quilt patch titled
Subject: migrate: hugetlb: check for hugetlb shared PMD in node migration
has been removed from the -mm tree. Its filename was
migrate-hugetlb-check-for-hugetlb-shared-pmd-in-node-migration.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: migrate: hugetlb: check for hugetlb shared PMD in node migration
Date: Thu, 26 Jan 2023 14:27:21 -0800
migrate_pages/mempolicy semantics state that CAP_SYS_NICE is required to
move pages shared with another process to a different node. page_mapcount
> 1 is being used to determine if a hugetlb page is shared. However, a
hugetlb page will have a mapcount of 1 if mapped by multiple processes via
a shared PMD. As a result, hugetlb pages shared by multiple processes and
mapped with a shared PMD can be moved by a process without CAP_SYS_NICE.
To fix, check for a shared PMD if mapcount is 1. If a shared PMD is found
consider the page shared.
Link: https://lkml.kernel.org/r/20230126222721.222195-3-mike.kravetz@oracle.com
Fixes: e2d8cf405525 ("migrate: add hugepage migration code to migrate_pages()")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: James Houghton <jthoughton(a)google.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: Vishal Moola (Oracle) <vishal.moola(a)gmail.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mempolicy.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/mempolicy.c~migrate-hugetlb-check-for-hugetlb-shared-pmd-in-node-migration
+++ a/mm/mempolicy.c
@@ -600,7 +600,8 @@ static int queue_pages_hugetlb(pte_t *pt
/* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
if (flags & (MPOL_MF_MOVE_ALL) ||
- (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
+ (flags & MPOL_MF_MOVE && page_mapcount(page) == 1 &&
+ !hugetlb_pmd_shared(pte))) {
if (isolate_hugetlb(page, qp->pagelist) &&
(flags & MPOL_MF_STRICT))
/*
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
The quilt patch titled
Subject: mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
has been removed from the -mm tree. Its filename was
mm-hugetlb-proc-check-for-hugetlb-shared-pmd-in-proc-pid-smaps.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
Date: Thu, 26 Jan 2023 14:27:20 -0800
Patch series "Fixes for hugetlb mapcount at most 1 for shared PMDs".
This issue of mapcount in hugetlb pages referenced by shared PMDs was
discussed in [1]. The following two patches address user visible behavior
caused by this issue.
[1] https://lore.kernel.org/linux-mm/Y9BF+OCdWnCSilEu@monkey/
This patch (of 2):
A hugetlb page will have a mapcount of 1 if mapped by multiple processes
via a shared PMD. This is because only the first process increases the
map count, and subsequent processes just add the shared PMD page to their
page table.
page_mapcount is being used to decide if a hugetlb page is shared or
private in /proc/PID/smaps. Pages referenced via a shared PMD were
incorrectly being counted as private.
To fix, check for a shared PMD if mapcount is 1. If a shared PMD is found
count the hugetlb page as shared. A new helper to check for a shared PMD
is added.
[akpm(a)linux-foundation.org: simplification, per David]
[akpm(a)linux-foundation.org: hugetlb.h: include page_ref.h for page_count()]
Link: https://lkml.kernel.org/r/20230126222721.222195-2-mike.kravetz@oracle.com
Fixes: 25ee01a2fca0 ("mm: hugetlb: proc: add hugetlb-related fields to /proc/PID/smaps")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: James Houghton <jthoughton(a)google.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: Vishal Moola (Oracle) <vishal.moola(a)gmail.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/task_mmu.c | 4 +---
include/linux/hugetlb.h | 13 +++++++++++++
2 files changed, 14 insertions(+), 3 deletions(-)
--- a/fs/proc/task_mmu.c~mm-hugetlb-proc-check-for-hugetlb-shared-pmd-in-proc-pid-smaps
+++ a/fs/proc/task_mmu.c
@@ -745,9 +745,7 @@ static int smaps_hugetlb_range(pte_t *pt
page = pfn_swap_entry_to_page(swpent);
}
if (page) {
- int mapcount = page_mapcount(page);
-
- if (mapcount >= 2)
+ if (page_mapcount(page) >= 2 || hugetlb_pmd_shared(pte))
mss->shared_hugetlb += huge_page_size(hstate_vma(vma));
else
mss->private_hugetlb += huge_page_size(hstate_vma(vma));
--- a/include/linux/hugetlb.h~mm-hugetlb-proc-check-for-hugetlb-shared-pmd-in-proc-pid-smaps
+++ a/include/linux/hugetlb.h
@@ -7,6 +7,7 @@
#include <linux/fs.h>
#include <linux/hugetlb_inline.h>
#include <linux/cgroup.h>
+#include <linux/page_ref.h>
#include <linux/list.h>
#include <linux/kref.h>
#include <linux/pgtable.h>
@@ -1187,6 +1188,18 @@ static inline __init void hugetlb_cma_re
}
#endif
+#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
+static inline bool hugetlb_pmd_shared(pte_t *pte)
+{
+ return page_count(virt_to_page(pte)) > 1;
+}
+#else
+static inline bool hugetlb_pmd_shared(pte_t *pte)
+{
+ return false;
+}
+#endif
+
bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr);
#ifndef __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
The quilt patch titled
Subject: mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups
has been removed from the -mm tree. Its filename was
mm-madv_collapse-catch-none-huge-bad-pmd-lookups.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Zach O'Keefe" <zokeefe(a)google.com>
Subject: mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups
Date: Wed, 25 Jan 2023 14:53:58 -0800
In commit 34488399fa08 ("mm/madvise: add file and shmem support to
MADV_COLLAPSE") we make the following change to find_pmd_or_thp_or_none():
- if (!pmd_present(pmde))
- return SCAN_PMD_NULL;
+ if (pmd_none(pmde))
+ return SCAN_PMD_NONE;
This was for-use by MADV_COLLAPSE file/shmem codepaths, where
MADV_COLLAPSE might identify a pte-mapped hugepage, only to have
khugepaged race-in, free the pte table, and clear the pmd. Such codepaths
include:
A) If we find a suitably-aligned compound page of order HPAGE_PMD_ORDER
already in the pagecache.
B) In retract_page_tables(), if we fail to grab mmap_lock for the target
mm/address.
In these cases, collapse_pte_mapped_thp() really does expect a none (not
just !present) pmd, and we want to suitably identify that case separate
from the case where no pmd is found, or it's a bad-pmd (of course, many
things could happen once we drop mmap_lock, and the pmd could plausibly
undergo multiple transitions due to intervening fault, split, etc).
Regardless, the code is prepared install a huge-pmd only when the existing
pmd entry is either a genuine pte-table-mapping-pmd, or the none-pmd.
However, the commit introduces a logical hole; namely, that we've allowed
!none- && !huge- && !bad-pmds to be classified as genuine
pte-table-mapping-pmds. One such example that could leak through are swap
entries. The pmd values aren't checked again before use in
pte_offset_map_lock(), which is expecting nothing less than a genuine
pte-table-mapping-pmd.
We want to put back the !pmd_present() check (below the pmd_none() check),
but need to be careful to deal with subtleties in pmd transitions and
treatments by various arch.
The issue is that __split_huge_pmd_locked() temporarily clears the present
bit (or otherwise marks the entry as invalid), but pmd_present() and
pmd_trans_huge() still need to return true while the pmd is in this
transitory state. For example, x86's pmd_present() also checks the
_PAGE_PSE , riscv's version also checks the _PAGE_LEAF bit, and arm64 also
checks a PMD_PRESENT_INVALID bit.
Covering all 4 cases for x86 (all checks done on the same pmd value):
1) pmd_present() && pmd_trans_huge()
All we actually know here is that the PSE bit is set. Either:
a) We aren't racing with __split_huge_page(), and PRESENT or PROTNONE
is set.
=> huge-pmd
b) We are currently racing with __split_huge_page(). The danger here
is that we proceed as-if we have a huge-pmd, but really we are
looking at a pte-mapping-pmd. So, what is the risk of this
danger?
The only relevant path is:
madvise_collapse() -> collapse_pte_mapped_thp()
Where we might just incorrectly report back "success", when really
the memory isn't pmd-backed. This is fine, since split could
happen immediately after (actually) successful madvise_collapse().
So, it should be safe to just assume huge-pmd here.
2) pmd_present() && !pmd_trans_huge()
Either:
a) PSE not set and either PRESENT or PROTNONE is.
=> pte-table-mapping pmd (or PROT_NONE)
b) devmap. This routine can be called immediately after
unlocking/locking mmap_lock -- or called with no locks held (see
khugepaged_scan_mm_slot()), so previous VMA checks have since been
invalidated.
3) !pmd_present() && pmd_trans_huge()
Not possible.
4) !pmd_present() && !pmd_trans_huge()
Neither PRESENT nor PROTNONE set
=> not present
I've checked all archs that implement pmd_trans_huge() (arm64, riscv,
powerpc, longarch, x86, mips, s390) and this logic roughly translates
(though devmap treatment is unique to x86 and powerpc, and (3) doesn't
necessarily hold in general -- but that doesn't matter since
!pmd_present() always takes failure path).
Also, add a comment above find_pmd_or_thp_or_none() to help future
travelers reason about the validity of the code; namely, the possible
mutations that might happen out from under us, depending on how mmap_lock
is held (if at all).
Link: https://lkml.kernel.org/r/20230125225358.2576151-1-zokeefe@google.com
Fixes: 34488399fa08 ("mm/madvise: add file and shmem support to MADV_COLLAPSE")
Signed-off-by: Zach O'Keefe <zokeefe(a)google.com>
Reported-by: Hugh Dickins <hughd(a)google.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/mm/khugepaged.c~mm-madv_collapse-catch-none-huge-bad-pmd-lookups
+++ a/mm/khugepaged.c
@@ -847,6 +847,10 @@ static int hugepage_vma_revalidate(struc
return SCAN_SUCCEED;
}
+/*
+ * See pmd_trans_unstable() for how the result may change out from
+ * underneath us, even if we hold mmap_lock in read.
+ */
static int find_pmd_or_thp_or_none(struct mm_struct *mm,
unsigned long address,
pmd_t **pmd)
@@ -865,8 +869,12 @@ static int find_pmd_or_thp_or_none(struc
#endif
if (pmd_none(pmde))
return SCAN_PMD_NONE;
+ if (!pmd_present(pmde))
+ return SCAN_PMD_NULL;
if (pmd_trans_huge(pmde))
return SCAN_PMD_MAPPED;
+ if (pmd_devmap(pmde))
+ return SCAN_PMD_NULL;
if (pmd_bad(pmde))
return SCAN_PMD_NULL;
return SCAN_SUCCEED;
_
Patches currently in -mm which might be from zokeefe(a)google.com are
The quilt patch titled
Subject: Revert "mm: kmemleak: alloc gray object for reserved region with direct map"
has been removed from the -mm tree. Its filename was
revert-mm-kmemleak-alloc-gray-object-for-reserved-region-with-direct-map.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Isaac J. Manjarres" <isaacmanjarres(a)google.com>
Subject: Revert "mm: kmemleak: alloc gray object for reserved region with direct map"
Date: Tue, 24 Jan 2023 15:02:54 -0800
This reverts commit 972fa3a7c17c9d60212e32ecc0205dc585b1e769.
Kmemleak operates by periodically scanning memory regions for pointers to
allocated memory blocks to determine if they are leaked or not. However,
reserved memory regions can be used for DMA transactions between a device
and a CPU, and thus, wouldn't contain pointers to allocated memory blocks,
making them inappropriate for kmemleak to scan. Thus, revert this commit.
Link: https://lkml.kernel.org/r/20230124230254.295589-1-isaacmanjarres@google.com
Fixes: 972fa3a7c17c9 ("mm: kmemleak: alloc gray object for reserved region with direct map")
Signed-off-by: Isaac J. Manjarres <isaacmanjarres(a)google.com>
Acked-by: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Calvin Zhang <calvinzhang.cool(a)gmail.com>
Cc: Frank Rowand <frowand.list(a)gmail.com>
Cc: Rob Herring <robh+dt(a)kernel.org>
Cc: Saravana Kannan <saravanak(a)google.com>
Cc: <stable(a)vger.kernel.org> [5.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/of/fdt.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
--- a/drivers/of/fdt.c~revert-mm-kmemleak-alloc-gray-object-for-reserved-region-with-direct-map
+++ a/drivers/of/fdt.c
@@ -26,7 +26,6 @@
#include <linux/serial_core.h>
#include <linux/sysfs.h>
#include <linux/random.h>
-#include <linux/kmemleak.h>
#include <asm/setup.h> /* for COMMAND_LINE_SIZE */
#include <asm/page.h>
@@ -525,12 +524,9 @@ static int __init __reserved_mem_reserve
size = dt_mem_next_cell(dt_root_size_cells, &prop);
if (size &&
- early_init_dt_reserve_memory(base, size, nomap) == 0) {
+ early_init_dt_reserve_memory(base, size, nomap) == 0)
pr_debug("Reserved memory: reserved region for node '%s': base %pa, size %lu MiB\n",
uname, &base, (unsigned long)(size / SZ_1M));
- if (!nomap)
- kmemleak_alloc_phys(base, size, 0);
- }
else
pr_err("Reserved memory: failed to reserve memory for node '%s': base %pa, size %lu MiB\n",
uname, &base, (unsigned long)(size / SZ_1M));
_
Patches currently in -mm which might be from isaacmanjarres(a)google.com are
The quilt patch titled
Subject: mm, mremap: fix mremap() expanding for vma's with vm_ops->close()
has been removed from the -mm tree. Its filename was
mm-mremap-fix-mremap-expanding-for-vmas-with-vm_ops-close.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: mm, mremap: fix mremap() expanding for vma's with vm_ops->close()
Date: Tue, 17 Jan 2023 11:19:39 +0100
Fabian has reported another regression in 6.1 due to ca3d76b0aa80 ("mm:
add merging after mremap resize"). The problem is that vma_merge() can
fail when vma has a vm_ops->close() method, causing is_mergeable_vma()
test to be negative. This was happening for vma mapping a file from
fuse-overlayfs, which does have the method. But when we are simply
expanding the vma, we never remove it due to the "merge" with the added
area, so the test should not prevent the expansion.
As a quick fix, check for such vmas and expand them using vma_adjust()
directly as was done before commit ca3d76b0aa80. For a more robust long
term solution we should try to limit the check for vma_ops->close only to
cases that actually result in vma removal, so that no merge would be
prevented unnecessarily.
[akpm(a)linux-foundation.org: fix indenting whitespace, reflow comment]
Link: https://lkml.kernel.org/r/20230117101939.9753-1-vbabka@suse.cz
Fixes: ca3d76b0aa80 ("mm: add merging after mremap resize")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: Fabian Vogt <fvogt(a)suse.com>
Link: https://bugzilla.suse.com/show_bug.cgi?id=1206359#c35
Tested-by: Fabian Vogt <fvogt(a)suse.com>
Cc: Jakub Mat��na <matenajakub(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mremap.c | 25 +++++++++++++++++++------
1 file changed, 19 insertions(+), 6 deletions(-)
--- a/mm/mremap.c~mm-mremap-fix-mremap-expanding-for-vmas-with-vm_ops-close
+++ a/mm/mremap.c
@@ -1027,16 +1027,29 @@ SYSCALL_DEFINE5(mremap, unsigned long, a
}
/*
- * Function vma_merge() is called on the extension we are adding to
- * the already existing vma, vma_merge() will merge this extension with
- * the already existing vma (expand operation itself) and possibly also
- * with the next vma if it becomes adjacent to the expanded vma and
- * otherwise compatible.
+ * Function vma_merge() is called on the extension we
+ * are adding to the already existing vma, vma_merge()
+ * will merge this extension with the already existing
+ * vma (expand operation itself) and possibly also with
+ * the next vma if it becomes adjacent to the expanded
+ * vma and otherwise compatible.
+ *
+ * However, vma_merge() can currently fail due to
+ * is_mergeable_vma() check for vm_ops->close (see the
+ * comment there). Yet this should not prevent vma
+ * expanding, so perform a simple expand for such vma.
+ * Ideally the check for close op should be only done
+ * when a vma would be actually removed due to a merge.
*/
- vma = vma_merge(mm, vma, extension_start, extension_end,
+ if (!vma->vm_ops || !vma->vm_ops->close) {
+ vma = vma_merge(mm, vma, extension_start, extension_end,
vma->vm_flags, vma->anon_vma, vma->vm_file,
extension_pgoff, vma_policy(vma),
vma->vm_userfaultfd_ctx, anon_vma_name(vma));
+ } else if (vma_adjust(vma, vma->vm_start, addr + new_len,
+ vma->vm_pgoff, NULL)) {
+ vma = NULL;
+ }
if (!vma) {
vm_unacct_memory(pages);
ret = -ENOMEM;
_
Patches currently in -mm which might be from vbabka(a)suse.cz are
revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock.patch
The quilt patch titled
Subject: ia64: fix build error due to switch case label appearing next to declaration
has been removed from the -mm tree. Its filename was
ia64-fix-build-error-due-to-switch-case-label-appearing-next-to-declaration.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: James Morse <james.morse(a)arm.com>
Subject: ia64: fix build error due to switch case label appearing next to declaration
Date: Tue, 17 Jan 2023 15:16:32 +0000
Since commit aa06a9bd8533 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to
report ITC frequency"), gcc 10.1.0 fails to build ia64 with the gnomic:
| ../arch/ia64/kernel/sys_ia64.c: In function 'ia64_clock_getres':
| ../arch/ia64/kernel/sys_ia64.c:189:3: error: a label can only be part of a statement and a declaration is not a statement
| 189 | s64 tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq);
This line appears immediately after a case label in a switch.
Move the declarations out of the case, to the top of the function.
Link: https://lkml.kernel.org/r/20230117151632.393836-1-james.morse@arm.com
Fixes: aa06a9bd8533 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to report ITC frequency")
Signed-off-by: James Morse <james.morse(a)arm.com>
Reviewed-by: Sergei Trofimovich <slyich(a)gmail.com>
Cc: ��meric Maschino <emeric.maschino(a)gmail.com>
Cc: matoro <matoro_mailinglist_kernel(a)matoro.tk>
Cc: John Paul Adrian Glaubitz <glaubitz(a)physik.fu-berlin.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/ia64/kernel/sys_ia64.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/arch/ia64/kernel/sys_ia64.c~ia64-fix-build-error-due-to-switch-case-label-appearing-next-to-declaration
+++ a/arch/ia64/kernel/sys_ia64.c
@@ -170,6 +170,9 @@ ia64_mremap (unsigned long addr, unsigne
asmlinkage long
ia64_clock_getres(const clockid_t which_clock, struct __kernel_timespec __user *tp)
{
+ struct timespec64 rtn_tp;
+ s64 tick_ns;
+
/*
* ia64's clock_gettime() syscall is implemented as a vdso call
* fsys_clock_gettime(). Currently it handles only
@@ -185,8 +188,8 @@ ia64_clock_getres(const clockid_t which_
switch (which_clock) {
case CLOCK_REALTIME:
case CLOCK_MONOTONIC:
- s64 tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq);
- struct timespec64 rtn_tp = ns_to_timespec64(tick_ns);
+ tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq);
+ rtn_tp = ns_to_timespec64(tick_ns);
return put_timespec64(&rtn_tp, tp);
}
_
Patches currently in -mm which might be from james.morse(a)arm.com are
The quilt patch titled
Subject: mm: multi-gen LRU: fix crash during cgroup migration
has been removed from the -mm tree. Its filename was
mm-multi-gen-lru-fix-crash-during-cgroup-migration.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Yu Zhao <yuzhao(a)google.com>
Subject: mm: multi-gen LRU: fix crash during cgroup migration
Date: Sun, 15 Jan 2023 20:44:05 -0700
lru_gen_migrate_mm() assumes lru_gen_add_mm() runs prior to itself. This
isn't true for the following scenario:
CPU 1 CPU 2
clone()
cgroup_can_fork()
cgroup_procs_write()
cgroup_post_fork()
task_lock()
lru_gen_migrate_mm()
task_unlock()
task_lock()
lru_gen_add_mm()
task_unlock()
And when the above happens, kernel crashes because of linked list
corruption (mm_struct->lru_gen.list).
Link: https://lore.kernel.org/r/20230115134651.30028-1-msizanoen@qtmlabs.xyz/
Link: https://lkml.kernel.org/r/20230116034405.2960276-1-yuzhao@google.com
Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
Signed-off-by: Yu Zhao <yuzhao(a)google.com>
Reported-by: msizanoen <msizanoen(a)qtmlabs.xyz>
Tested-by: msizanoen <msizanoen(a)qtmlabs.xyz>
Cc: <stable(a)vger.kernel.org> [6.1+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/vmscan.c~mm-multi-gen-lru-fix-crash-during-cgroup-migration
+++ a/mm/vmscan.c
@@ -3323,13 +3323,16 @@ void lru_gen_migrate_mm(struct mm_struct
if (mem_cgroup_disabled())
return;
+ /* migration can happen before addition */
+ if (!mm->lru_gen.memcg)
+ return;
+
rcu_read_lock();
memcg = mem_cgroup_from_task(task);
rcu_read_unlock();
if (memcg == mm->lru_gen.memcg)
return;
- VM_WARN_ON_ONCE(!mm->lru_gen.memcg);
VM_WARN_ON_ONCE(list_empty(&mm->lru_gen.list));
lru_gen_del_mm(mm);
_
Patches currently in -mm which might be from yuzhao(a)google.com are