The patch titled
Subject: mm/cgroup/reclaim: fix dirty pages throttling on cgroup v1
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-cgroup-reclaim-fix-dirty-pages-throttling-on-cgroup-v1.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and …
[View More]hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Aneesh Kumar K.V" <aneesh.kumar(a)linux.ibm.com>
Subject: mm/cgroup/reclaim: fix dirty pages throttling on cgroup v1
Date: Fri, 18 Nov 2022 12:36:03 +0530
balance_dirty_pages doesn't do the required dirty throttling on cgroupv1.
See commit 9badce000e2c ("cgroup, writeback: don't enable cgroup writeback
on traditional hierarchies"). Instead, the kernel depends on writeback
throttling in shrink_folio_list to achieve the same goal. With large
memory systems, the flusher may not be able to writeback quickly enough
such that we will start finding pages in the shrink_folio_list already in
writeback. Hence for cgroupv1 let's do a reclaim throttle after waking up
the flusher.
The below test which used to fail on a 256GB system completes till the the
file system is full with this change.
root@lp2:/sys/fs/cgroup/memory# mkdir test
root@lp2:/sys/fs/cgroup/memory# cd test/
root@lp2:/sys/fs/cgroup/memory/test# echo 120M > memory.limit_in_bytes
root@lp2:/sys/fs/cgroup/memory/test# echo $$ > tasks
root@lp2:/sys/fs/cgroup/memory/test# dd if=/dev/zero of=/home/kvaneesh/test bs=1M
Killed
Link: https://lkml.kernel.org/r/20221118070603.84081-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Suggested-by: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: zefan li <lizefan.x(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
--- a/mm/vmscan.c~mm-cgroup-reclaim-fix-dirty-pages-throttling-on-cgroup-v1
+++ a/mm/vmscan.c
@@ -2514,8 +2514,20 @@ static unsigned long shrink_inactive_lis
* the flushers simply cannot keep up with the allocation
* rate. Nudge the flusher threads in case they are asleep.
*/
- if (stat.nr_unqueued_dirty == nr_taken)
+ if (stat.nr_unqueued_dirty == nr_taken) {
wakeup_flusher_threads(WB_REASON_VMSCAN);
+ /*
+ * For cgroupv1 dirty throttling is achieved by waking up
+ * the kernel flusher here and later waiting on folios
+ * which are in writeback to finish (see shrink_folio_list()).
+ *
+ * Flusher may not be able to issue writeback quickly
+ * enough for cgroupv1 writeback throttling to work
+ * on a large system.
+ */
+ if (!writeback_throttling_sane(sc))
+ reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
+ }
sc->nr.dirty += stat.nr_dirty;
sc->nr.congested += stat.nr_congested;
_
Patches currently in -mm which might be from aneesh.kumar(a)linux.ibm.com are
mm-cgroup-reclaim-fix-dirty-pages-throttling-on-cgroup-v1.patch
[View Less]
The patch titled
Subject: mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-fix-unexpected-changes-to-failslabfail_page_allocattr.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go …
[View More]and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Qi Zheng <zhengqi.arch(a)bytedance.com>
Subject: mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
Date: Fri, 18 Nov 2022 18:00:11 +0800
When we specify __GFP_NOWARN, we only expect that no warnings will be
issued for current caller. But in the __should_failslab() and
__should_fail_alloc_page(), the local GFP flags alter the global
{failslab|fail_page_alloc}.attr, which is persistent and shared by all
tasks. This is not what we expected, let's fix it.
Link: https://lkml.kernel.org/r/20221118100011.2634-1-zhengqi.arch@bytedance.com
Fixes: 3f913fc5f974 ("mm: fix missing handler for __GFP_NOWARN")
Signed-off-by: Qi Zheng <zhengqi.arch(a)bytedance.com>
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Reviewed-by: Akinobu Mita <akinobu.mita(a)gmail.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: Akinobu Mita <akinobu.mita(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/fault-inject.h | 7 +++++--
lib/fault-inject.c | 14 +++++++++-----
mm/failslab.c | 12 ++++++++++--
mm/page_alloc.c | 7 +++++--
4 files changed, 29 insertions(+), 11 deletions(-)
--- a/include/linux/fault-inject.h~mm-fix-unexpected-changes-to-failslabfail_page_allocattr
+++ a/include/linux/fault-inject.h
@@ -20,7 +20,6 @@ struct fault_attr {
atomic_t space;
unsigned long verbose;
bool task_filter;
- bool no_warn;
unsigned long stacktrace_depth;
unsigned long require_start;
unsigned long require_end;
@@ -32,6 +31,10 @@ struct fault_attr {
struct dentry *dname;
};
+enum fault_flags {
+ FAULT_NOWARN = 1 << 0,
+};
+
#define FAULT_ATTR_INITIALIZER { \
.interval = 1, \
.times = ATOMIC_INIT(1), \
@@ -40,11 +43,11 @@ struct fault_attr {
.ratelimit_state = RATELIMIT_STATE_INIT_DISABLED, \
.verbose = 2, \
.dname = NULL, \
- .no_warn = false, \
}
#define DECLARE_FAULT_ATTR(name) struct fault_attr name = FAULT_ATTR_INITIALIZER
int setup_fault_attr(struct fault_attr *attr, char *str);
+bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags);
bool should_fail(struct fault_attr *attr, ssize_t size);
#ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
--- a/lib/fault-inject.c~mm-fix-unexpected-changes-to-failslabfail_page_allocattr
+++ a/lib/fault-inject.c
@@ -41,9 +41,6 @@ EXPORT_SYMBOL_GPL(setup_fault_attr);
static void fail_dump(struct fault_attr *attr)
{
- if (attr->no_warn)
- return;
-
if (attr->verbose > 0 && __ratelimit(&attr->ratelimit_state)) {
printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n"
"name %pd, interval %lu, probability %lu, "
@@ -103,7 +100,7 @@ static inline bool fail_stacktrace(struc
* http://www.nongnu.org/failmalloc/
*/
-bool should_fail(struct fault_attr *attr, ssize_t size)
+bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags)
{
if (in_task()) {
unsigned int fail_nth = READ_ONCE(current->fail_nth);
@@ -146,13 +143,20 @@ bool should_fail(struct fault_attr *attr
return false;
fail:
- fail_dump(attr);
+ if (!(flags & FAULT_NOWARN))
+ fail_dump(attr);
if (atomic_read(&attr->times) != -1)
atomic_dec_not_zero(&attr->times);
return true;
}
+EXPORT_SYMBOL_GPL(should_fail_ex);
+
+bool should_fail(struct fault_attr *attr, ssize_t size)
+{
+ return should_fail_ex(attr, size, 0);
+}
EXPORT_SYMBOL_GPL(should_fail);
#ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
--- a/mm/failslab.c~mm-fix-unexpected-changes-to-failslabfail_page_allocattr
+++ a/mm/failslab.c
@@ -16,6 +16,8 @@ static struct {
bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
{
+ int flags = 0;
+
/* No fault-injection for bootstrap cache */
if (unlikely(s == kmem_cache))
return false;
@@ -30,10 +32,16 @@ bool __should_failslab(struct kmem_cache
if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
return false;
+ /*
+ * In some cases, it expects to specify __GFP_NOWARN
+ * to avoid printing any information(not just a warning),
+ * thus avoiding deadlocks. See commit 6b9dbedbe349 for
+ * details.
+ */
if (gfpflags & __GFP_NOWARN)
- failslab.attr.no_warn = true;
+ flags |= FAULT_NOWARN;
- return should_fail(&failslab.attr, s->object_size);
+ return should_fail_ex(&failslab.attr, s->object_size, flags);
}
static int __init setup_failslab(char *str)
--- a/mm/page_alloc.c~mm-fix-unexpected-changes-to-failslabfail_page_allocattr
+++ a/mm/page_alloc.c
@@ -3887,6 +3887,8 @@ __setup("fail_page_alloc=", setup_fail_p
static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
{
+ int flags = 0;
+
if (order < fail_page_alloc.min_order)
return false;
if (gfp_mask & __GFP_NOFAIL)
@@ -3897,10 +3899,11 @@ static bool __should_fail_alloc_page(gfp
(gfp_mask & __GFP_DIRECT_RECLAIM))
return false;
+ /* See comment in __should_failslab() */
if (gfp_mask & __GFP_NOWARN)
- fail_page_alloc.attr.no_warn = true;
+ flags |= FAULT_NOWARN;
- return should_fail(&fail_page_alloc.attr, 1 << order);
+ return should_fail_ex(&fail_page_alloc.attr, 1 << order, flags);
}
#ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
_
Patches currently in -mm which might be from zhengqi.arch(a)bytedance.com are
mm-fix-unexpected-changes-to-failslabfail_page_allocattr.patch
[View Less]
There's been a very long running bug that seems to have been neglected for
a while, where amdgpu consistently triggers a KASAN error at start:
BUG: KASAN: global-out-of-bounds in read_indirect_azalia_reg+0x1d4/0x2a0 [amdgpu]
Read of size 4 at addr ffffffffc2274b28 by task modprobe/1889
After digging through amd's rather creative method for accessing registers,
I eventually discovered the problem likely has to do with the fact that on
my dce120 GPU there are supposedly 7 sets of audio …
[View More]registers. But we only
define a register mapping for 6 sets.
So, fix this and fix the KASAN warning finally.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
---
Sending this one separately from the rest of my fixes since:
* It's definitely completely unrelated to the Gitlab 2171 issue
* I'm not sure if this is the correct fix since it's in DC
drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c b/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
index 1b70b78e2fa15..af631085e88c5 100644
--- a/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
@@ -359,7 +359,8 @@ static const struct dce_audio_registers audio_regs[] = {
audio_regs(2),
audio_regs(3),
audio_regs(4),
- audio_regs(5)
+ audio_regs(5),
+ audio_regs(6),
};
#define DCE120_AUD_COMMON_MASK_SH_LIST(mask_sh)\
--
2.37.3
[View Less]
> On Fri, Oct 28, 2022 at 02:51:43PM +0000, Dominic Jones wrote:
> > Updating the machine's kernel from v5.19.x to v6.0.x causes the machine to not
> > successfully boot. The machine boots successfully (and exhibits stable operation)
> > with version v5.19.17 and multiple earlier releases in the 5.19 line. Multiple releases
> > from the 6.0 line (including 6.0.0, 6.0.3, and 6.0.5), with no other changes to the
> > software environment, do not boot. Instead, the …
[View More]machine hangs after loading services
> > but before presenting a display manager; the machine instead shows repetitive hard
> > drive activity at this point and then no apparent activity.
> >
> > ''uname'' output for the machine successfully running v5.19.17 is:
> >
> > Linux [MACHINE_NAME] 5.19.17 #1 SMP PREEMPT_DYNAMIC Mon Oct 24 13:32:29 2022 i686 Intel(R) Atom(TM) CPU N270 @ 1.60GHz GenuineIntel GNU/Linux
> >
> > The machine is an OCZ Neutrino netbook, running a custom OS build largely similar to
> > LFS development. The kernel update uses ''make olddefconfig''.
>
> Can you use 'git bisect' to find the offending change that causes this
> to happen?
Wanted to follow up here. I got up to speed on ''git bisect'' and am currently
running the process. It looks like I have about six iterations to go and I'm
typically building/testing 1-3 kernels per day. I can also verify that the issue
occurs under 6.1rc4 as a side effect of the process.
Dominic Jones
jonesd(a)xmission.com
[View Less]
Hello!
This is an experimental semi-automated report about issues detected by
Coverity from a scan of next-20221118 as part of the linux-next scan project:
https://scan.coverity.com/projects/linux-next-weekly-scan
You're getting this email because you were associated with the identified
lines of code (noted below) that were touched by commits:
Thu Nov 17 00:18:25 2022 -0500
7cce4cd628be ("drm/amdgpu/mst: Stop ignoring error codes and deadlocking")
Coverity reported the following:
*** …
[View More]CID 1527373: Uninitialized variables (UNINIT)
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c:1227 in pre_compute_mst_dsc_configs_for_state()
1221 for (j = 0; j < dc_state->stream_count; j++) {
1222 if (dc_state->streams[j]->link == stream->link)
1223 computed_streams[j] = true;
1224 }
1225 }
1226
vvv CID 1527373: Uninitialized variables (UNINIT)
vvv Using uninitialized value "ret".
1227 return ret;
1228 }
1229
1230 static int find_crtc_index_in_state_by_stream(struct drm_atomic_state *state,
1231 struct dc_stream_state *stream)
1232 {
If this is a false positive, please let us know so we can mark it as
such, or teach the Coverity rules to be smarter. If not, please make
sure fixes get into linux-next. :) For patches fixing this, please
include these lines (but double-check the "Fixes" first):
Reported-by: coverity-bot <keescook+coverity-bot(a)chromium.org>
Addresses-Coverity-ID: 1527373 ("Uninitialized variables")
Fixes: 7cce4cd628be ("drm/amdgpu/mst: Stop ignoring error codes and deadlocking")
If dc_state->stream_count is 0, "ret" is undefined. Perhaps initialize
it as -EINVAL?
Thanks for your attention!
--
Coverity-bot
[View Less]
--
Hello,,
We are privileged and delighted to reach you via email" And we are
urgently waiting to hear from you. and again your number is not
connecting.
Thanks,
Woo Nam
The SPI framework checks for each transfer (with the struct
spi_controller::can_dma callback) whether the driver wants to use DMA
for the transfer. If the driver returns true, the SPI framework will
map the transfer's data to the device, start the actual transfer and
map the data back.
In commit 07e759387788 ("spi: spi-imx: add PIO polling support") the
spi-imx driver's spi_imx_transfer_one() function was extended. If the
estimated duration of a transfer does not exceed a configurable
duration,…
[View More] a polling transfer function is used. This check happens
before checking if the driver decided earlier for a DMA transfer.
If spi_imx_can_dma() decided to use a DMA transfer, and the user
configured a big maximum polling duration, a polling transfer will be
used. The DMA unmap after the transfer destroys the transferred data.
To fix this problem check in spi_imx_transfer_one() if the driver
decided for DMA transfer first, then check the limits for a polling
transfer.
Fixes: 07e759387788 ("spi: spi-imx: add PIO polling support")
Link: https://lore.kernel.org/all/20221111003032.82371-1-festevam@gmail.com
Reported-by: Frieder Schrempf <frieder.schrempf(a)kontron.de>
Reported-by: Fabio Estevam <festevam(a)gmail.com>
Tested-by: Fabio Estevam <festevam(a)gmail.com>
Cc: David Jander <david(a)protonic.nl>
Cc: stable(a)vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/spi/spi-imx.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/spi/spi-imx.c b/drivers/spi/spi-imx.c
index 30d82cc7300b..3240c139f8f3 100644
--- a/drivers/spi/spi-imx.c
+++ b/drivers/spi/spi-imx.c
@@ -1607,6 +1607,13 @@ static int spi_imx_transfer_one(struct spi_controller *controller,
if (spi_imx->slave_mode)
return spi_imx_pio_transfer_slave(spi, transfer);
+ /*
+ * If we decided in spi_imx_can_dma() that we want to do a DMA
+ * transfer, the SPI transfer has already been mapped, so we
+ * have to do the DMA transfer here.
+ */
+ if (spi_imx->usedma)
+ return spi_imx_dma_transfer(spi_imx, transfer);
/*
* Calculate the estimated time in us the transfer runs. Find
* the number of Hz per byte per polling limit.
@@ -1618,9 +1625,6 @@ static int spi_imx_transfer_one(struct spi_controller *controller,
if (transfer->len < byte_limit)
return spi_imx_poll_transfer(spi, transfer);
- if (spi_imx->usedma)
- return spi_imx_dma_transfer(spi_imx, transfer);
-
return spi_imx_pio_transfer(spi, transfer);
}
--
2.35.1
[View Less]