KASAN uses different routines to map shadow for hot-added memory and for
memory obtained during boot. An attempt to offline memory that was onlined
by the normal boot process leads to this:
Trying to vfree() nonexistent vm area (000000005d3b34b9)
WARNING: CPU: 2 PID: 13215 at mm/vmalloc.c:1525 __vunmap+0x147/0x190
Call Trace:
kasan_mem_notifier+0xad/0xb9
notifier_call_chain+0x166/0x260
__blocking_notifier_call_chain+0xdb/0x140
__offline_pages+0x96a/0xb10
memory_subsys_offline+0x76/0xc0
device_offline+0xb8/0x120
store_mem_state+0xfa/0x120
kernfs_fop_write+0x1d5/0x320
__vfs_write+0xd4/0x530
vfs_write+0x105/0x340
SyS_write+0xb0/0x140
Obviously we can't call vfree() to free memory that wasn't allocated via
vmalloc(). Use find_vm_area() to see if we can call vfree().
Unfortunately it's a bit tricky to properly unmap and free shadow allocated
during boot, so we'll have to keep it. If the memory comes online again,
that shadow will be reused.
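The guard can be pictured with a small userspace model. This is only a sketch: tracked_alloc(), find_tracked_area() and release_shadow() are hypothetical names, and a fixed-size table stands in for the vmalloc area list that find_vm_area() consults in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the vmalloc area list. */
#define MAX_AREAS 16

static void *tracked_areas[MAX_AREAS];

/* Allocate a region and remember it, as the hot-add path does for shadow. */
static void *tracked_alloc(size_t size)
{
	void *p;
	size_t i;

	assert(size > 0);
	p = malloc(size);
	if (!p)
		return NULL;

	for (i = 0; i < MAX_AREAS; i++) {
		if (!tracked_areas[i]) {
			tracked_areas[i] = p;
			return p;
		}
	}

	/* No free slot: do not hand out memory we cannot track. */
	free(p);
	return NULL;
}

/* Plays the role of find_vm_area(): the matching slot, or NULL. */
static void **find_tracked_area(const void *addr)
{
	size_t i;

	if (!addr)
		return NULL;

	for (i = 0; i < MAX_AREAS; i++)
		if (tracked_areas[i] == addr)
			return &tracked_areas[i];
	return NULL;
}

/* Free only what tracked_alloc() handed out; returns true if freed. */
static bool release_shadow(void *addr)
{
	void **slot = find_tracked_area(addr);

	if (!slot)
		return false;	/* not ours (e.g. mapped at boot): keep it */

	free(*slot);
	*slot = NULL;
	return true;
}
```

In this model a buffer that never went through tracked_alloc() is left alone, which mirrors keeping the boot-time shadow mapped.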
Fixes: fa69b5989bb0 ("mm/kasan: add support for memory hotplug")
Reported-by: Paul Menzel <pmenzel+linux-kasan-dev(a)molgen.mpg.de>
Signed-off-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: <stable(a)vger.kernel.org>
---
mm/kasan/kasan.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 55 insertions(+), 2 deletions(-)
diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index e13d911251e7..0d9d9d268f32 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -791,6 +791,41 @@ DEFINE_ASAN_SET_SHADOW(f5);
DEFINE_ASAN_SET_SHADOW(f8);
#ifdef CONFIG_MEMORY_HOTPLUG
+static bool shadow_mapped(unsigned long addr)
+{
+ pgd_t *pgd = pgd_offset_k(addr);
+ p4d_t *p4d;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t *pte;
+
+ if (pgd_none(*pgd))
+ return false;
+ p4d = p4d_offset(pgd, addr);
+ if (p4d_none(*p4d))
+ return false;
+ pud = pud_offset(p4d, addr);
+ if (pud_none(*pud))
+ return false;
+
+ /*
+ * We can't use pud_large() or pud_huge(), the first one
+ * is arch-specific, the last one depends on HUGETLB_PAGE.
+ * So let's abuse pud_bad(): if the pud is bad, it has to be
+ * because it's huge.
+ */
+ if (pud_bad(*pud))
+ return true;
+ pmd = pmd_offset(pud, addr);
+ if (pmd_none(*pmd))
+ return false;
+
+ if (pmd_bad(*pmd))
+ return true;
+ pte = pte_offset_kernel(pmd, addr);
+ return !pte_none(*pte);
+}
+
static int __meminit kasan_mem_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
@@ -812,6 +847,14 @@ static int __meminit kasan_mem_notifier(struct notifier_block *nb,
case MEM_GOING_ONLINE: {
void *ret;
+ /*
+ * If shadow is mapped already then it must have been mapped
+ * during boot. This could happen if we are onlining previously
+ * offlined memory.
+ */
+ if (shadow_mapped(shadow_start))
+ return NOTIFY_OK;
+
ret = __vmalloc_node_range(shadow_size, PAGE_SIZE, shadow_start,
shadow_end, GFP_KERNEL,
PAGE_KERNEL, VM_NO_GUARD,
@@ -823,8 +866,18 @@ static int __meminit kasan_mem_notifier(struct notifier_block *nb,
kmemleak_ignore(ret);
return NOTIFY_OK;
}
- case MEM_OFFLINE:
- vfree((void *)shadow_start);
+ case MEM_OFFLINE: {
+ struct vm_struct *vm;
+
+ /*
+ * Only hot-added memory has a vm_area. Freeing shadow
+ * mapped during boot would be tricky, so we'll just
+ * have to keep it.
+ */
+ vm = find_vm_area((void *)shadow_start);
+ if (vm)
+ vfree((void *)shadow_start);
+ }
}
return NOTIFY_OK;
--
2.13.6
From: AMAN DEEP <aman.deep(a)samsung.com>
There is a race condition between finish_unlinks()->finish_urb() and
usb_kill_urb() in the OHCI controller case. finish_urb() calls
spin_unlock(&ohci->lock) before calling usb_hcd_giveback_urb(). If
usb_kill_urb() is called for another endpoint during this window, a new
ED is added to the beginning of ed_rm_list for unlinking, and ed_rm_list
then points to the newly added ED.
When finish_urb() has completed in finish_unlinks() and ed->td_list
becomes empty, as in the code below (in finish_unlinks()):
if (list_empty(&ed->td_list)) {
*last = ed->ed_next;
ed->ed_next = NULL;
} else if (ohci->rh_state == OHCI_RH_RUNNING) {
*last = ed->ed_next;
ed->ed_next = NULL;
ed_schedule(ohci, ed);
}
The *last = ed->ed_next assignment makes ed_rm_list point to ed->ed_next,
so the ED previously added by usb_kill_urb() is left unreferenced by
ed_rm_list. This causes usb_kill_urb() to hang forever, waiting for
finish_unlinks() to remove the added ED from ed_rm_list.
The main reason for the hang in this race condition is that EDs are added
to and removed from the beginning of ed_rm_list during usb_kill_urb(),
and *last is later modified in finish_unlinks().
As suggested by Alan Stern, the solution for proper handling of
ohci->ed_rm_list is to remove the ED from ed_rm_list before finishing
any URBs. Then, at the end, we can add the ED back to the list if necessary.
This properly handles the updated ohci->ed_rm_list in usb_kill_urb().
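The lost-ED scenario can be replayed on a plain singly linked list. The sketch below is a userspace model, not the driver code: struct ed is reduced to its ed_next link, and rm_list_prepend(), rm_list_contains() and racer_survives() are hypothetical helpers. The prepend of the second ED is placed at the point where the real code has dropped ohci->lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down ED: only the removal-list link matters here. */
struct ed {
	struct ed *ed_next;
};

static struct ed *ed_rm_list;	/* head of the removal list */

/* What the usb_kill_urb() path does to the list: prepend at the head. */
static void rm_list_prepend(struct ed *ed)
{
	ed->ed_next = ed_rm_list;
	ed_rm_list = ed;
}

static bool rm_list_contains(const struct ed *ed)
{
	const struct ed *p;

	for (p = ed_rm_list; p; p = p->ed_next)
		if (p == ed)
			return true;
	return false;
}

/*
 * Process the ED at the head of the list while a second ED ("racer") is
 * prepended in the window where the lock would be dropped.
 * Returns true if the racer is still reachable from ed_rm_list afterwards.
 */
static bool racer_survives(bool unlink_first)
{
	struct ed ed = { NULL }, racer = { NULL };
	struct ed **last;
	bool survives;

	ed_rm_list = NULL;
	rm_list_prepend(&ed);
	last = &ed_rm_list;
	assert(*last == &ed);

	if (unlink_first) {
		/* Fixed order: take the ED off the list, then open the window. */
		*last = ed.ed_next;
		ed.ed_next = NULL;
		rm_list_prepend(&racer);
	} else {
		/* Old order: the window opens first, then *last is overwritten. */
		rm_list_prepend(&racer);
		*last = ed.ed_next;	/* head no longer points at racer */
		ed.ed_next = NULL;
	}

	survives = rm_list_contains(&racer);
	ed_rm_list = NULL;	/* do not keep pointers to locals around */
	return survives;
}
```

With the old order the racer drops off the list, which is the state that leaves usb_kill_urb() waiting; with the ED unlinked first, the racer stays at the head.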
Fixes: 977dcfdc6031 ("USB: OHCI: don't lose track of EDs when a controller dies")
Acked-by: Alan Stern <stern(a)rowland.harvard.edu>
CC: <stable(a)vger.kernel.org>
Signed-off-by: Aman Deep <aman.deep(a)samsung.com>
Signed-off-by: Jeffy Chen <jeffy.chen(a)rock-chips.com>
---
Changes in v6:
This is a resend of Aman Deep's v5 patch [0], which solved the hang we
hit [1]. (Thanks Aman :)
The v5 had some formatting issues, so I slightly adjusted the commit message.
[0] https://www.spinics.net/lists/linux-usb/msg129010.html
[1] https://bugs.chromium.org/p/chromium/issues/detail?id=803749
drivers/usb/host/ohci-q.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/drivers/usb/host/ohci-q.c b/drivers/usb/host/ohci-q.c
index b2ec8c399363..4ccb85a67bb3 100644
--- a/drivers/usb/host/ohci-q.c
+++ b/drivers/usb/host/ohci-q.c
@@ -1019,6 +1019,8 @@ static void finish_unlinks(struct ohci_hcd *ohci)
* have modified this list. normally it's just prepending
* entries (which we'd ignore), but paranoia won't hurt.
*/
+ *last = ed->ed_next;
+ ed->ed_next = NULL;
modified = 0;
/* unlink urbs as requested, but rescan the list after
@@ -1077,21 +1079,22 @@ static void finish_unlinks(struct ohci_hcd *ohci)
goto rescan_this;
/*
- * If no TDs are queued, take ED off the ed_rm_list.
+ * If no TDs are queued, ED is now idle.
* Otherwise, if the HC is running, reschedule.
- * If not, leave it on the list for further dequeues.
+ * If the HC isn't running, add ED back to the
+ * start of the list for later processing.
*/
if (list_empty(&ed->td_list)) {
- *last = ed->ed_next;
- ed->ed_next = NULL;
ed->state = ED_IDLE;
list_del(&ed->in_use_list);
} else if (ohci->rh_state == OHCI_RH_RUNNING) {
- *last = ed->ed_next;
- ed->ed_next = NULL;
ed_schedule(ohci, ed);
} else {
- last = &ed->ed_next;
+ ed->ed_next = ohci->ed_rm_list;
+ ohci->ed_rm_list = ed;
+ /* Don't loop on the same ED */
+ if (last == &ohci->ed_rm_list)
+ last = &ed->ed_next;
}
if (modified)
--
2.11.0
Despite the efforts made to correctly read the NDA and CUBC registers,
the order in which the registers are read could sometimes lead to an
inconsistent state.
Re-using the timeline from the comments, the following timing of
register reads could lead to reading NDA with value "@desc2" and
CUBC with value "MAX desc1":
INITD --------                    ------------
             |____________________|
      _______________________  _______________
NDA       @desc2             \/   @desc3
      _______________________/\_______________
      __________  ___________  _______________
CUBC       0    \/ MAX desc1 \/  MAX desc2
      __________/\___________/\_______________
        |  |         |  |
Events:(1)(2)       (3)(4)
(1) check_nda = @desc2
(2) initd = 1
(3) cur_ubc = MAX desc1
(4) cur_nda = @desc2
This is allowed by the condition ((check_nda == cur_nda) && initd),
despite cur_ubc and cur_nda being in the precise state we don't want.
This error leads to incorrect residue computation.
Fix it by swapping the order in which CUBC and INITD are read. This
makes sure that NDA and CUBC are always read together either _before_
INITD goes to 0 or _after_ it is back at 1.
The case where NDA is read before INITD is at 0 and CUBC is read after
INITD is back at 1 will be rejected by check_nda and cur_nda being
different.
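The argument can be checked by brute force on a toy model. The sketch below assumes a discrete timeline loosely following the diagram (the tick values and the DESC/UBC constants are made up for illustration) and tries every strictly increasing set of four read times with both read orders. count_bad_accepts() is a hypothetical helper that counts accepted snapshots pairing NDA "@desc2" with CUBC "MAX desc1".

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the register values named in the diagram. */
enum { DESC2 = 2, DESC3 = 3 };
enum { UBC_ZERO = 0, UBC_MAX1 = 100, UBC_MAX2 = 200 };

/* Assumed timeline: INITD is low during ticks 10..29. */
#define T_END 40

static bool initd_at(int t)
{
	return t < 10 || t >= 30;
}

/* CUBC is reloaded first (tick 12), NDA and CUBC move on at tick 20. */
static int cubc_at(int t)
{
	if (t < 12)
		return UBC_ZERO;
	return t < 20 ? UBC_MAX1 : UBC_MAX2;
}

static int nda_at(int t)
{
	return t < 20 ? DESC2 : DESC3;
}

/*
 * Try every ordered set of four read times and count the snapshots that
 * pass the (check_nda == cur_nda && initd) test while pairing @desc2 with
 * MAX desc1. cubc_before_initd selects the read order after this patch.
 */
static long count_bad_accepts(bool cubc_before_initd)
{
	long bad = 0;
	int t1, t2, t3, t4;

	for (t1 = 0; t1 < T_END; t1++)
	for (t2 = t1 + 1; t2 < T_END; t2++)
	for (t3 = t2 + 1; t3 < T_END; t3++)
	for (t4 = t3 + 1; t4 < T_END; t4++) {
		int check_nda = nda_at(t1);
		int cur_ubc, cur_nda;
		bool initd;

		if (cubc_before_initd) {
			cur_ubc = cubc_at(t2);
			initd = initd_at(t3);
		} else {
			initd = initd_at(t2);
			cur_ubc = cubc_at(t3);
		}
		cur_nda = nda_at(t4);

		if (check_nda == cur_nda && initd &&
		    cur_nda == DESC2 && cur_ubc == UBC_MAX1)
			bad++;
	}
	return bad;
}
```

On this assumed timeline, count_bad_accepts(false) is non-zero while count_bad_accepts(true) is zero. This only illustrates the reasoning above and says nothing about real hardware timing.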
Fixes: 53398f488821 ("dmaengine: at_xdmac: fix residue corruption")
Cc: stable(a)vger.kernel.org
Signed-off-by: Maxime Jayat <maxime.jayat(a)mobile-devices.fr>
---
Hi,
I had a bug where the serial ports on the Atmel SAMA5D2 were sometimes
returning the same data twice, for up to 4096 bytes.
After investigation, I noticed that the ring buffer used in
atmel_serial (in rx dma mode) sometimes had an incorrect "head" value,
which made the ring buffer do a complete extraneous loop of data
pushed to the tty layer.
I tracked it down to the residue of the dma being wrong, and after
more head scratching, I found this bug in the reading of the
registers.
Before fixing this, I was able to reproduce the bug reliably in a few
minutes. With this patch applied, the bug did not reappear after
several hours in testing.
drivers/dma/at_xdmac.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/dma/at_xdmac.c b/drivers/dma/at_xdmac.c
index c00e3923d7d8..94236ec9d410 100644
--- a/drivers/dma/at_xdmac.c
+++ b/drivers/dma/at_xdmac.c
@@ -1471,10 +1471,10 @@ at_xdmac_tx_status(struct dma_chan *chan, dma_cookie_t cookie,
for (retry = 0; retry < AT_XDMAC_RESIDUE_MAX_RETRIES; retry++) {
check_nda = at_xdmac_chan_read(atchan, AT_XDMAC_CNDA) & 0xfffffffc;
rmb();
- initd = !!(at_xdmac_chan_read(atchan, AT_XDMAC_CC) & AT_XDMAC_CC_INITD);
- rmb();
cur_ubc = at_xdmac_chan_read(atchan, AT_XDMAC_CUBC);
rmb();
+ initd = !!(at_xdmac_chan_read(atchan, AT_XDMAC_CC) & AT_XDMAC_CC_INITD);
+ rmb();
cur_nda = at_xdmac_chan_read(atchan, AT_XDMAC_CNDA) & 0xfffffffc;
rmb();
--
2.14.1
When ath9k was switched over to use the mac80211 intermediate queues,
node cleanup started draining the mac80211 queues. However, this call path
is not protected by rcu_read_lock(), as it was previously entirely internal
to the driver, which uses its own locking.
This leads to a possible rcu_dereference() without holding
rcu_read_lock(); but only if a station is cleaned up while having
packets queued on the TXQ. Fix this by adding the rcu_read_lock() to the
caller in ath9k.
Fixes: 50f08edf9809 ("ath9k: Switch to using mac80211 intermediate software queues.")
Cc: stable(a)vger.kernel.org
Reported-by: Ben Greear <greearb(a)candelatech.com>
Signed-off-by: Toke Høiland-Jørgensen <toke(a)toke.dk>
---
drivers/net/wireless/ath/ath9k/xmit.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index 396bf05c6bf6..d8b041f48ca8 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -2892,6 +2892,8 @@ void ath_tx_node_cleanup(struct ath_softc *sc, struct ath_node *an)
struct ath_txq *txq;
int tidno;
+ rcu_read_lock();
+
for (tidno = 0; tidno < IEEE80211_NUM_TIDS; tidno++) {
tid = ath_node_to_tid(an, tidno);
txq = tid->txq;
@@ -2909,6 +2911,8 @@ void ath_tx_node_cleanup(struct ath_softc *sc, struct ath_node *an)
if (!an->sta)
break; /* just one multicast ath_atx_tid */
}
+
+ rcu_read_unlock();
}
#ifdef CONFIG_ATH9K_TX99
--
2.16.0
-Stephen Warren
+Stefan Wahren
On Fri, 09 Feb 2018 09:32:40 +0000
Eric Anholt <eric(a)anholt.net> wrote:
> Boris Brezillon <boris.brezillon(a)bootlin.com> writes:
>
> > On Thu, 08 Feb 2018 15:20:16 +0000
> > Eric Anholt <eric(a)anholt.net> wrote:
> >
> >> Boris Brezillon <boris.brezillon(a)bootlin.com> writes:
> >>
> >> > All bcm2835 PLLs should be gated before their rate can be changed.
> >> > Setting CLK_SET_RATE_GATE will let the core enforce that, but this is
> >> > not enough to make the code work in all situations. Indeed, the
> >> > CLK_SET_RATE_GATE flag prevents a user from changing the rate while
> >> > the clock is enabled, but this check only guarantees there's no Linux
> >> > users. In our case, the clock might have been enabled by the
> >> > bootloader/FW, and, because we have CLK_IGNORE_UNUSED set, Linux never
> >> > disables the PLL. So we have to make sure the PLL is actually disabled
> >> > before changing the rate.
> >> >
> >> > Fixes: 41691b8862e2 ("clk: bcm2835: Add support for programming the audio domain clocks")
> >> > Cc: <stable(a)vger.kernel.org>
> >> > Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
> >> > ---
> >> > drivers/clk/bcm/clk-bcm2835.c | 14 +++++++++++++-
> >> > 1 file changed, 13 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/drivers/clk/bcm/clk-bcm2835.c b/drivers/clk/bcm/clk-bcm2835.c
> >> > index 6c5d4a8e426c..051ce769c109 100644
> >> > --- a/drivers/clk/bcm/clk-bcm2835.c
> >> > +++ b/drivers/clk/bcm/clk-bcm2835.c
> >> > @@ -678,6 +678,18 @@ static int bcm2835_pll_set_rate(struct clk_hw *hw,
> >> > u32 ana[4];
> >> > int i;
> >> >
> >> > + /*
> >> > + * Normally, the CLK_SET_RATE_GATE flag prevents a user from changing
> >> > + * the rate while the clock is enabled, but this check only makes sure
> >> > + * there's no Linux users.
> >> > + * In our case, the clock might have been enabled by the bootloader/FW,
> >> > + * and, since CLK_IGNORE_UNUSED flag is set, Linux never disables it.
> >> > + * So we have to make sure the clk is actually disabled before changing
> >> > + * the rate.
> >> > + */
> >> > + if (bcm2835_pll_is_on(hw))
> >> > + bcm2835_pll_off(hw);
> >> > +
> >>
> >> I'm not sure this improves the situation. If the PLL was on, then
> >> presumably there's a divider using it and a CM clock using that, so
> >> we'll probably end up driving some glitches on them.
> >
> > Hm, yes, but if someone is trying to change the rate of the PLL, and the
> > core doesn't know other clks depend on this PLL (which is the case if
> > we reach this point), we're already in big trouble.
> >
> >>
> >> Does the common clk framework have a way to disable unused clocks from
> >> the leaf clocks up to this root, before the general
> >> disable-unused-clocks path happens late in the boot process?
> >
> > Not that I know of. What do you have in mind?
>
> I was hoping that Stephen Boyd or Mike might have an answer for this
> problem.
Having a generic solution for this sort of issue is definitely the
way to go, but I think this temporary hack is needed to make HDMI/SDTV
work properly. If we don't have it and the FW configures and enables
PLLH with a rate that is different from the one the HDMI or SDTV
encoder tries to set, we're screwed, because I doubt the CPRMAN block
allows you to change the rate of the PLL when it's not gated. Which
means the new rate is not applied and the clk user has no way of
knowing that, which in turn means the display output is likely to not
work properly the first time it's enabled.
Of course, this all goes away the second time the HDMI/SDTV encoder is
enabled, because then clk_disable_unprepare() is called which has the
effect of disabling the PLL.
--
Boris Brezillon, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
http://bootlin.com