Freeing a hugetlb page and releasing base pages back to the underlying
allocator such as buddy or cma is performed in two steps:
- remove_hugetlb_folio() is called to remove the folio from hugetlb
lists, take a ref on the page and remove the hugetlb destructor. This
must all be done under the hugetlb lock. After this call, the page
can be treated as a normal compound page or a collection of base
size pages.
- update_and_free_hugetlb_folio() is called to allocate vmemmap if
needed, and the free routine of the underlying allocator is called
on the resulting page. We cannot hold the hugetlb lock here.
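In code, the flow looks roughly like the sketch below. This is for
illustration only and is not part of the patch: the argument lists are
simplified (the in-tree callers also pass surplus-accounting and
atomic-context flags) and the wrapper name is made up for the example.

	/* Illustrative sketch of the two-step teardown, as it would sit in mm/hugetlb.c. */
	static void sketch_free_hugetlb_folio(struct hstate *h, struct folio *folio)
	{
		/* Step 1: under hugetlb_lock, unlink from hugetlb lists and take a ref. */
		spin_lock_irq(&hugetlb_lock);
		remove_hugetlb_folio(h, folio, false);
		spin_unlock_irq(&hugetlb_lock);

		/*
		 * Step 2: outside the lock, re-allocate vmemmap if the folio was
		 * HVO-optimized, then hand the base pages back to buddy/CMA.
		 */
		update_and_free_hugetlb_folio(h, folio, false);
	}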
One issue with this scheme is that a memory error could occur between
these two steps. In this case, the memory error handling code treats
the old hugetlb page as a normal compound page or collection of base
pages. It will then try to SetPageHWPoison(page) on the page with an
error. If the page with the error is a tail page without vmemmap, a
write error will occur when trying to set the flag.
Address this issue by modifying remove_hugetlb_folio() and
update_and_free_hugetlb_folio() such that the hugetlb destructor is not
cleared until after allocating vmemmap. Since clearing the destructor
requires holding the hugetlb lock, the clearing is done in
remove_hugetlb_folio() if the vmemmap is present. This saves a
lock/unlock cycle. Otherwise, the destructor is cleared in
update_and_free_hugetlb_folio() after allocating vmemmap.
Note that this will leave hugetlb pages in a state where they are marked
free (by a hugetlb-specific page flag) and have a ref count. This is not
a normal state. The only code that would notice is the memory error
code, and it is set up to retry in such a case.
A subsequent patch will create a routine to do bulk processing of
vmemmap allocation. This will eliminate a lock/unlock cycle for each
hugetlb page in the case where we are freeing a large number of pages.
Fixes: ad2fa3717b74 ("mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
---
mm/hugetlb.c | 90 ++++++++++++++++++++++++++++++++++++++--------------
1 file changed, 66 insertions(+), 24 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 64a3239b6407..4a910121a647 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1579,9 +1579,37 @@ static inline void destroy_compound_gigantic_folio(struct folio *folio,
unsigned int order) { }
#endif
+static inline void __clear_hugetlb_destructor(struct hstate *h,
+ struct folio *folio)
+{
+ lockdep_assert_held(&hugetlb_lock);
+
+ /*
+ * Very subtle
+ *
+ * For non-gigantic pages set the destructor to the normal compound
+ * page dtor. This is needed in case someone takes an additional
+ * temporary ref to the page, and freeing is delayed until they drop
+ * their reference.
+ *
+ * For gigantic pages set the destructor to the null dtor. This
+ * destructor will never be called. Before freeing the gigantic
+ * page destroy_compound_gigantic_folio will turn the folio into a
+ * simple group of pages. After this the destructor does not
+ * apply.
+ *
+ */
+ if (hstate_is_gigantic(h))
+ folio_set_compound_dtor(folio, NULL_COMPOUND_DTOR);
+ else
+ folio_set_compound_dtor(folio, COMPOUND_PAGE_DTOR);
+}
+
/*
- * Remove hugetlb folio from lists, and update dtor so that the folio appears
- * as just a compound page.
+ * Remove hugetlb folio from lists.
+ * If vmemmap exists for the folio, update dtor so that the folio appears
+ * as just a compound page. Otherwise, wait until after allocating vmemmap
+ * to update dtor.
*
* A reference is held on the folio, except in the case of demote.
*
@@ -1612,31 +1640,19 @@ static void __remove_hugetlb_folio(struct hstate *h, struct folio *folio,
}
/*
- * Very subtle
- *
- * For non-gigantic pages set the destructor to the normal compound
- * page dtor. This is needed in case someone takes an additional
- * temporary ref to the page, and freeing is delayed until they drop
- * their reference.
- *
- * For gigantic pages set the destructor to the null dtor. This
- * destructor will never be called. Before freeing the gigantic
- * page destroy_compound_gigantic_folio will turn the folio into a
- * simple group of pages. After this the destructor does not
- * apply.
- *
- * This handles the case where more than one ref is held when and
- * after update_and_free_hugetlb_folio is called.
- *
- * In the case of demote we do not ref count the page as it will soon
- * be turned into a page of smaller size.
+ * We can only clear the hugetlb destructor after allocating vmemmap
+ * pages. Otherwise, someone (memory error handling) may try to write
+ * to tail struct pages.
+ */
+ if (!folio_test_hugetlb_vmemmap_optimized(folio))
+ __clear_hugetlb_destructor(h, folio);
+
+ /*
+ * In the case of demote we do not ref count the page as it will soon
+ * be turned into a page of smaller size.
*/
if (!demote)
folio_ref_unfreeze(folio, 1);
- if (hstate_is_gigantic(h))
- folio_set_compound_dtor(folio, NULL_COMPOUND_DTOR);
- else
- folio_set_compound_dtor(folio, COMPOUND_PAGE_DTOR);
h->nr_huge_pages--;
h->nr_huge_pages_node[nid]--;
@@ -1728,6 +1744,19 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
return;
}
+ /*
+ * If needed, clear hugetlb destructor under the hugetlb lock.
+ * This must be done AFTER allocating vmemmap pages in case there is an
+ * attempt to write to tail struct pages as in memory poison.
+ * It must be done BEFORE PageHWPoison handling so that any subsequent
+ * memory errors poison individual pages instead of head.
+ */
+ if (folio_test_hugetlb(folio)) {
+ spin_lock_irq(&hugetlb_lock);
+ __clear_hugetlb_destructor(h, folio);
+ spin_unlock_irq(&hugetlb_lock);
+ }
+
/*
* Move PageHWPoison flag from head page to the raw error pages,
* which makes any healthy subpages reusable.
@@ -3604,6 +3633,19 @@ static int demote_free_hugetlb_folio(struct hstate *h, struct folio *folio)
return rc;
}
+ /*
+ * The hugetlb destructor could still be set for this folio if vmemmap
+ * was actually allocated above. The ref count on all pages is 0.
+ * Therefore, nobody should attempt access. However, before destroying
+ * compound page below, clear the destructor. Unfortunately, this
+ * requires a lock/unlock cycle.
+ */
+ if (folio_test_hugetlb(folio)) {
+ spin_lock_irq(&hugetlb_lock);
+ __clear_hugetlb_destructor(h, folio);
+ spin_unlock_irq(&hugetlb_lock);
+ }
+
/*
* Use destroy_compound_hugetlb_folio_for_demote for all huge page
* sizes as it will not ref count folios.
--
2.41.0
This is a note to let you know that I've just added the patch titled
usb: chipidea: imx: improve logic if samsung,picophy-* parameter is 0
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From 36668515d56bf73f06765c71e08c8f7465f1e5c4 Mon Sep 17 00:00:00 2001
From: Xu Yang <xu.yang_2(a)nxp.com>
Date: Tue, 27 Jun 2023 19:21:24 +0800
Subject: usb: chipidea: imx: improve logic if samsung,picophy-* parameter is 0
In the current driver, the value of the tuning parameter will not take
effect if samsung,picophy-* is assigned as 0. Because 0 is also a valid
value according to the description of the USB_PHY_CFG1 register, improve
the logic so that it takes effect.
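The resulting pattern is roughly the sketch below. This is illustration
only: the struct and function names are made up and trimmed to the two
tuning fields. The key point is that of_property_read_u32() returns 0
only when the property is present and valid, so a missing property can be
recorded as -1 while an explicit 0 from DT is kept.

	#include <linux/of.h>
	#include <linux/types.h>

	struct picophy_tuning_sketch {
		int emp_curr_control;		/* -1 = not specified, skip tuning */
		int dc_vol_level_adjust;	/* -1 = not specified, skip tuning */
	};

	static void read_picophy_tuning(struct device_node *np,
					struct picophy_tuning_sketch *data)
	{
		u32 val;

		data->emp_curr_control = -1;
		if (!of_property_read_u32(np, "samsung,picophy-pre-emp-curr-control", &val))
			data->emp_curr_control = val;

		data->dc_vol_level_adjust = -1;
		if (!of_property_read_u32(np, "samsung,picophy-dc-vol-level-adjust", &val))
			data->dc_vol_level_adjust = val;
	}

The write path can then apply a value whenever it is >= 0 and within the
field mask, as the usbmisc_imx7d_init() hunk below does.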
Fixes: 58a3cefb3840 ("usb: chipidea: imx: add two samsung picophy parameters tuning implementation")
cc: <stable(a)vger.kernel.org>
Signed-off-by: Xu Yang <xu.yang_2(a)nxp.com>
Acked-by: Peter Chen <peter.chen(a)kernel.org>
Link: https://lore.kernel.org/r/20230627112126.1882666-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/chipidea/ci_hdrc_imx.c | 10 ++++++----
drivers/usb/chipidea/usbmisc_imx.c | 6 ++++--
2 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/usb/chipidea/ci_hdrc_imx.c b/drivers/usb/chipidea/ci_hdrc_imx.c
index aa2aebed8e2d..c251a4f34879 100644
--- a/drivers/usb/chipidea/ci_hdrc_imx.c
+++ b/drivers/usb/chipidea/ci_hdrc_imx.c
@@ -176,10 +176,12 @@ static struct imx_usbmisc_data *usbmisc_get_init_data(struct device *dev)
if (of_usb_get_phy_mode(np) == USBPHY_INTERFACE_MODE_ULPI)
data->ulpi = 1;
- of_property_read_u32(np, "samsung,picophy-pre-emp-curr-control",
- &data->emp_curr_control);
- of_property_read_u32(np, "samsung,picophy-dc-vol-level-adjust",
- &data->dc_vol_level_adjust);
+ if (of_property_read_u32(np, "samsung,picophy-pre-emp-curr-control",
+ &data->emp_curr_control))
+ data->emp_curr_control = -1;
+ if (of_property_read_u32(np, "samsung,picophy-dc-vol-level-adjust",
+ &data->dc_vol_level_adjust))
+ data->dc_vol_level_adjust = -1;
return data;
}
diff --git a/drivers/usb/chipidea/usbmisc_imx.c b/drivers/usb/chipidea/usbmisc_imx.c
index e8a712e5abad..c4165f061e42 100644
--- a/drivers/usb/chipidea/usbmisc_imx.c
+++ b/drivers/usb/chipidea/usbmisc_imx.c
@@ -660,13 +660,15 @@ static int usbmisc_imx7d_init(struct imx_usbmisc_data *data)
usbmisc->base + MX7D_USBNC_USB_CTRL2);
/* PHY tuning for signal quality */
reg = readl(usbmisc->base + MX7D_USB_OTG_PHY_CFG1);
- if (data->emp_curr_control && data->emp_curr_control <=
+ if (data->emp_curr_control >= 0 &&
+ data->emp_curr_control <=
(TXPREEMPAMPTUNE0_MASK >> TXPREEMPAMPTUNE0_BIT)) {
reg &= ~TXPREEMPAMPTUNE0_MASK;
reg |= (data->emp_curr_control << TXPREEMPAMPTUNE0_BIT);
}
- if (data->dc_vol_level_adjust && data->dc_vol_level_adjust <=
+ if (data->dc_vol_level_adjust >= 0 &&
+ data->dc_vol_level_adjust <=
(TXVREFTUNE0_MASK >> TXVREFTUNE0_BIT)) {
reg &= ~TXVREFTUNE0_MASK;
reg |= (data->dc_vol_level_adjust << TXVREFTUNE0_BIT);
--
2.41.0
The qca8xxx switch supports two ways to write reg values: a slow way
using mdio and a fast way that sends specially crafted mgmt packets to
read/write regs.
The fast way can carry up to 32 bytes of data, as eth packets are used
to send/receive.
This works correctly for almost the entire regmap of the switch, but
running some kernel selftests for dsa drivers exposed a curious and
interesting hw defect/limitation.
For some specific regs, bulk write does not work and only part of the
requested regs get written, leaving half the data unwritten. This was
especially hard to track down due to the total strangeness of the
problem and the specific regs where it occurs.
It occurs in the specific regs of the ATU table, where multiple regs
need to be written to compose the entire entry.
It was discovered that with a bulk write of 12 bytes on
QCA8K_REG_ATU_DATA0 only QCA8K_REG_ATU_DATA0 and QCA8K_REG_ATU_DATA2
were written, but QCA8K_REG_ATU_DATA1 was always zero.
Tcpdump was used to make sure the specially crafted packet was correct
and this was confirmed.
The problem was hard to track because the missing QCA8K_REG_ATU_DATA1
still produced a seemingly plausible entry: the first bytes of the mac
address are set in QCA8K_REG_ATU_DATA0 and the entry type is set in
QCA8K_REG_ATU_DATA2.
Funnily enough, writing starting at QCA8K_REG_ATU_DATA1 results in the
same problem: QCA8K_REG_ATU_DATA2 is left empty while
QCA8K_REG_ATU_DATA1 and QCA8K_REG_ATU_FUNC are written correctly.
One speculation is that there is some kind of internal indirection when
accessing these regs and they cannot all be accessed together, since the
ATU is really a table mapped somewhere in the switch SRAM.
Even more funny is the fact that every other reg was tested with all
kinds of combinations and none are affected by this problem. Read
operations were also tested and always worked, so reads are not affected
either.
The problem is not present if we limit writes to a single reg at a time.
To handle this hardware defect, enable use_single_write so that the bulk
API correctly splits the write into multiple separate operations,
effectively reverting to non-bulk writes.
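For reference, a trimmed sketch of the relevant regmap_config fields
(table, locking and cache fields from the real config are omitted; the
values here follow the driver's existing config). With use_single_write
set, the regmap core splits regmap_bulk_write()/regmap_raw_write() into
one register-sized write per bus access:

	#include <linux/regmap.h>

	static const struct regmap_config qca8k_regmap_sketch = {
		.reg_bits	= 16,
		.val_bits	= 32,
		.reg_stride	= 4,
		.max_raw_read	= 32,		/* mgmt eth can still read 8 regs at a time */
		.use_single_write = true,	/* split bulk writes into per-reg writes */
	};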
Cc: Mark Brown <broonie(a)kernel.org>
Fixes: c766e077d927 ("net: dsa: qca8k: convert to regmap read/write API")
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
drivers/net/dsa/qca/qca8k-8xxx.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/dsa/qca/qca8k-8xxx.c b/drivers/net/dsa/qca/qca8k-8xxx.c
index dee7b6579916..ae088a4df794 100644
--- a/drivers/net/dsa/qca/qca8k-8xxx.c
+++ b/drivers/net/dsa/qca/qca8k-8xxx.c
@@ -576,8 +576,11 @@ static struct regmap_config qca8k_regmap_config = {
.rd_table = &qca8k_readable_table,
.disable_locking = true, /* Locking is handled by qca8k read/write */
.cache_type = REGCACHE_NONE, /* Explicitly disable CACHE */
- .max_raw_read = 32, /* mgmt eth can read/write up to 8 registers at time */
- .max_raw_write = 32,
+ .max_raw_read = 32, /* mgmt eth can read up to 8 registers at time */
+ /* ATU regs suffer from a bug where some data are not correctly
+ * written. Disable bulk write to correctly write ATU entry.
+ */
+ .use_single_write = true,
};
static int
--
2.40.1