Depending on ABI "long long" type of a particular 32-bit CPU
might be aligned by either word (32-bits) or double word (64-bits).
Make sure "data" is really 64-bit aligned for any 32-bit CPU.
At least for 32-bit ARC cores ABI requires "long long" types
to be aligned by normal 32-bit word. This makes "data" field aligned to
12 bytes. Which is still OK as long as we use 32-bit data only.
But once we want to use native atomic64_t type (i.e. when we use special
instructions LLOCKD/SCONDD for accessing 64-bit data) we easily hit
misaligned access exception.
That's because even on CPUs capable of non-aligned data access LL/SC
instructions require strict alignment.
Signed-off-by: Alexey Brodkin <abrodkin(a)synopsys.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
---
Changes v1 -> v2:
* Reworded commit message
* Inserted comment right in source [Thomas]
drivers/base/devres.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/base/devres.c b/drivers/base/devres.c
index f98a097e73f2..466fa59c866a 100644
--- a/drivers/base/devres.c
+++ b/drivers/base/devres.c
@@ -24,8 +24,12 @@ struct devres_node {
struct devres {
struct devres_node node;
- /* -- 3 pointers */
- unsigned long long data[]; /* guarantee ull alignment */
+ /*
+ * Depending on ABI "long long" type of a particular 32-bit CPU
+ * might be aligned by either word (32-bits) or double word (64-bits).
+ * Make sure "data" is really 64-bit aligned for any 32-bit CPU.
+ */
+ unsigned long long data[] __aligned(sizeof(unsigned long long));
};
struct devres_group {
--
2.17.1
On Tegra30 Cardhu the PCA9546 I2C mux is not ACK'ing I2C commands on
resume from suspend (which is caused by the reset signal for the I2C
mux not being configured correctl). However, this NACK is causing the
Tegra30 to hang on resuming from suspend which is not expected as we
detect NACKs and handle them. The hang observed appears to occur when
resetting the I2C controller to recover from the NACK.
Commit 77821b4678f9 ("i2c: tegra: proper handling of error cases") added
additional error handling for some error cases including NACK, however,
it appears that this change conflicts with an early fix by commit
f70893d08338 ("i2c: tegra: Add delay before resetting the controller
after NACK"). After commit 77821b4678f9 was made we now disable 'packet
mode' before the delay from commit f70893d08338 happens. Testing shows
that moving the delay to before disabling 'packet mode' fixes the hang
observed on Tegra30. The delay was added to give the I2C controller
chance to send a stop condition and so it makes sense to move this to
before we disable packet mode. Please note that packet mode is always
enabled for Tegra.
Fixes: 77821b4678f9 ("i2c: tegra: proper handling of error cases")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jon Hunter <jonathanh(a)nvidia.com>
---
drivers/i2c/busses/i2c-tegra.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
index 5fccd1f1bca8..797def5319f1 100644
--- a/drivers/i2c/busses/i2c-tegra.c
+++ b/drivers/i2c/busses/i2c-tegra.c
@@ -545,6 +545,14 @@ static int tegra_i2c_disable_packet_mode(struct tegra_i2c_dev *i2c_dev)
{
u32 cnfg;
+ /*
+ * NACK interrupt is generated before the I2C controller generates
+ * the STOP condition on the bus. So wait for 2 clock periods
+ * before disabling the controller so that the STOP condition has
+ * been delivered properly.
+ */
+ udelay(DIV_ROUND_UP(2 * 1000000, i2c_dev->bus_clk_rate));
+
cnfg = i2c_readl(i2c_dev, I2C_CNFG);
if (cnfg & I2C_CNFG_PACKET_MODE_EN)
i2c_writel(i2c_dev, cnfg & ~I2C_CNFG_PACKET_MODE_EN, I2C_CNFG);
@@ -706,15 +714,6 @@ static int tegra_i2c_xfer_msg(struct tegra_i2c_dev *i2c_dev,
if (likely(i2c_dev->msg_err == I2C_ERR_NONE))
return 0;
- /*
- * NACK interrupt is generated before the I2C controller generates
- * the STOP condition on the bus. So wait for 2 clock periods
- * before resetting the controller so that the STOP condition has
- * been delivered properly.
- */
- if (i2c_dev->msg_err == I2C_ERR_NO_ACK)
- udelay(DIV_ROUND_UP(2 * 1000000, i2c_dev->bus_clk_rate));
-
tegra_i2c_init(i2c_dev);
if (i2c_dev->msg_err == I2C_ERR_NO_ACK) {
if (msg->flags & I2C_M_IGNORE_NAK)
--
1.9.1
My wife and I won the Euro Millions Lottery of £53 Million British Pounds and we have voluntarily decided to donate 1,000,000EUR(One Million Euros) to 5 individuals randomly as part of our own charity project.
To verify our lottery winnings,please see our interview by visiting the web page below:
telegraph.co.uk/news/newstopics/howaboutthat/11511467/Lincolnshire-couple-win-53m-on-EuroMillions.html
Your email address was among the emails which were submitted to us by the Google Inc. as a web user; if you have received our email,kindly send us the below details so that we can transfer your 1,000,000.00 EUR(One Million Euros) to you in your own country.
Full Names:
Mobile No:
Age:
Occupation:
Country:
Send your response to: rmaxwell81(a)yahoo.com
Best Regards,
Richard & Angela Maxwell
Switching the CPU from the L2 or L3 frequencies (300 and 200 Mhz
respectively) to L0 frequency (1.2 Ghz) requires a significant amount
of time to let VDD stabilize to the appropriate voltage. This amount of
time is large enough that it cannot be covered by the hardware
countdown register. Due to this, the CPU might start operating at L0
before the voltage is stabilized, leading to CPU stalls.
To work around this problem, we prevent switching directly from the
L2/L3 frequencies to the L0 frequency, and instead switch to the L1
frequency in-between. The sequence therefore becomes:
1. First switch from L2/L3(200/300MHz) to L1(600MHZ)
2. Sleep 20ms for stabling VDD voltage
3. Then switch from L1(600MHZ) to L0(1200Mhz).
It is based on the work done by Ken Ma <make(a)marvell.com>
Cc: stable(a)vger.kernel.org
Fixes: 2089dc33ea0e ("clk: mvebu: armada-37xx-periph: add DVFS support for cpu clocks")
Signed-off-by: Gregory CLEMENT <gregory.clement(a)bootlin.com>
---
drivers/clk/mvebu/armada-37xx-periph.c | 38 ++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/drivers/clk/mvebu/armada-37xx-periph.c b/drivers/clk/mvebu/armada-37xx-periph.c
index 6860bd5a37c5..44e4e27eddad 100644
--- a/drivers/clk/mvebu/armada-37xx-periph.c
+++ b/drivers/clk/mvebu/armada-37xx-periph.c
@@ -35,6 +35,7 @@
#define CLK_SEL 0x10
#define CLK_DIS 0x14
+#define ARMADA_37XX_DVFS_LOAD_1 1
#define LOAD_LEVEL_NR 4
#define ARMADA_37XX_NB_L0L1 0x18
@@ -507,6 +508,40 @@ static long clk_pm_cpu_round_rate(struct clk_hw *hw, unsigned long rate,
return -EINVAL;
}
+/*
+ * Switching the CPU from the L2 or L3 frequencies (300 and 200 Mhz
+ * respectively) to L0 frequency (1.2 Ghz) requires a significant
+ * amount of time to let VDD stabilize to the appropriate
+ * voltage. This amount of time is large enough that it cannot be
+ * covered by the hardware countdown register. Due to this, the CPU
+ * might start operating at L0 before the voltage is stabilized,
+ * leading to CPU stalls.
+ *
+ * To work around this problem, we prevent switching directly from the
+ * L2/L3 frequencies to the L0 frequency, and instead switch to the L1
+ * frequency in-between. The sequence therefore becomes:
+ * 1. First switch from L2/L3(200/300MHz) to L1(600MHZ)
+ * 2. Sleep 20ms for stabling VDD voltage
+ * 3. Then switch from L1(600MHZ) to L0(1200Mhz).
+ */
+static void clk_pm_cpu_set_rate_wa(unsigned long rate, struct regmap *base)
+{
+ unsigned int cur_level;
+
+ if (rate != 1200 * 1000 * 1000)
+ return;
+
+ regmap_read(base, ARMADA_37XX_NB_CPU_LOAD, &cur_level);
+ cur_level &= ARMADA_37XX_NB_CPU_LOAD_MASK;
+ if (cur_level <= ARMADA_37XX_DVFS_LOAD_1)
+ return;
+
+ regmap_update_bits(base, ARMADA_37XX_NB_CPU_LOAD,
+ ARMADA_37XX_NB_CPU_LOAD_MASK,
+ ARMADA_37XX_DVFS_LOAD_1);
+ msleep(20);
+}
+
static int clk_pm_cpu_set_rate(struct clk_hw *hw, unsigned long rate,
unsigned long parent_rate)
{
@@ -537,6 +572,9 @@ static int clk_pm_cpu_set_rate(struct clk_hw *hw, unsigned long rate,
*/
reg = ARMADA_37XX_NB_CPU_LOAD;
mask = ARMADA_37XX_NB_CPU_LOAD_MASK;
+
+ clk_pm_cpu_set_rate_wa(rate, base);
+
regmap_update_bits(base, reg, mask, load_level);
return rate;
--
2.17.1
On all versions of Tegra30 Cardhu, the reset signal to the NXP PCA9546
I2C mux is connected to the Tegra GPIO BB0. Currently, this pin on the
Tegra is not configured as a GPIO but as a special-function IO (SFIO)
that is multiplexing the pin to an I2S controller. On exiting system
suspend, I2C commands sent to the PCA9546 are failing because there is
no ACK. Although it is not possible to see exactly what is happening
to the reset during suspend, by ensuring it is configured as a GPIO
and driven high, to de-assert the reset, the failures are no longer
seen.
Please note that this GPIO is also used to drive the reset signal
going to the camera connector on the board. However, given that there
is no camera support currently for Cardhu, this should not have any
impact.
Fixes: 40431d16ff11 ("ARM: tegra: enable PCA9546 on Cardhu")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jon Hunter <jonathanh(a)nvidia.com>
---
arch/arm/boot/dts/tegra30-cardhu.dtsi | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm/boot/dts/tegra30-cardhu.dtsi b/arch/arm/boot/dts/tegra30-cardhu.dtsi
index 92a9740c533f..3b1db7b9ec50 100644
--- a/arch/arm/boot/dts/tegra30-cardhu.dtsi
+++ b/arch/arm/boot/dts/tegra30-cardhu.dtsi
@@ -206,6 +206,7 @@
#address-cells = <1>;
#size-cells = <0>;
reg = <0x70>;
+ reset-gpio = <&gpio TEGRA_GPIO(BB, 0) GPIO_ACTIVE_LOW>;
};
};
--
1.9.1
The commit 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI
when logbuf_lock is available") brought back the possible deadlocks
in printk() and NMI.
This is rework of the proposed fix, see
https://lkml.kernel.org/r/20180606111557.xzs6l3lkvg7lq3ts@pathway.suse.cz
I realized that we could rather easily move the check to vprintk_func()
and still avoid any race. I believe that this is a win-win solution.
Changes against v1:
+ Move the check from vprintk_emit() to vprintk_func()
+ More straightforward commit message
+ Fix build with CONFIG_PRINTK_NMI disabled
Petr Mladek (3):
printk: Split the code for storing a message into the log buffer
printk: Create helper function to queue deferred console handling
printk/nmi: Prevent deadlock when accessing the main log buffer in NMI
include/linux/printk.h | 4 ++++
kernel/printk/internal.h | 9 ++++++-
kernel/printk/printk.c | 57 +++++++++++++++++++++++++++-----------------
kernel/printk/printk_safe.c | 58 +++++++++++++++++++++++++++++----------------
kernel/trace/trace.c | 4 +++-
lib/nmi_backtrace.c | 3 ---
6 files changed, 87 insertions(+), 48 deletions(-)
--
2.13.7
Commit ac75a041048b ("HID: i2c-hid: fix size check and type usage")
started writing messages when the ret_size is <= 2 from i2c_master_recv.
However, my device i2c-DLL07D1 returns 2 for a short period of time
(~0.5s) after I stop moving the pointing stick or touchpad. It varies,
but you get ~50 messages each time which spams the log hard.
[ 95.925055] i2c_hid i2c-DLL07D1:01: i2c_hid_get_input: incomplete report (83/2)
This has also been observed with a i2c-ALP0017.
[ 1781.266353] i2c_hid i2c-ALP0017:00: i2c_hid_get_input: incomplete report (30/2)
Only print the message when ret_size is totally invalid and less than 2
to cut down on the log spam.
Reported-by: John Smith <john-s-84(a)gmx.net>
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason Andryuk <jandryuk(a)gmail.com>
---
John Smith originally reported this, but his post did not include a git
formatted patch nor a Signed-off-by.
https://www.spinics.net/lists/linux-input/msg56171.htmlhttps://patchwork.kernel.org/patch/10374383/
When ret_size is 2, hid_input_report is passed 0 and returns early. ret_size
== 2 seems to be a header size saying there is no content. Should
i2c_hid_get_input just return early in that case?
Also, should this condition be noted to stop an interrupt from firing to
avoid the ~50 bogus messages?
drivers/hid/i2c-hid/i2c-hid.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hid/i2c-hid/i2c-hid.c b/drivers/hid/i2c-hid/i2c-hid.c
index c1652bb7bd15..eae0cb3ddec6 100644
--- a/drivers/hid/i2c-hid/i2c-hid.c
+++ b/drivers/hid/i2c-hid/i2c-hid.c
@@ -484,7 +484,7 @@ static void i2c_hid_get_input(struct i2c_hid *ihid)
return;
}
- if ((ret_size > size) || (ret_size <= 2)) {
+ if ((ret_size > size) || (ret_size < 2)) {
dev_err(&ihid->client->dev, "%s: incomplete report (%d/%d)\n",
__func__, size, ret_size);
return;
--
2.17.1
From: Andy Lutomirski <luto(a)kernel.org>
commit c592b57347069abfc0dcad3b3a302cf882602597 upstream
This removes all the obvious code paths that depend on lazy FPU mode.
It shouldn't change the generated code at all.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Rik van Riel <riel(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas(a)oracle.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: pbonzini(a)redhat.com
Link: http://lkml.kernel.org/r/1475627678-20788-5-git-send-email-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Daniel Sangorrin <daniel.sangorrin(a)toshiba.co.jp>
---
I backported this patch to remove the final fpu lazy mode deadcode
from stable 4.9.y. I only had to solve a small conflict caused by
the fact that commit b22cbe404a9c ("x86/fpu: Fix invalid FPU ptrace state after execve()")
had been applied before in stable (it was supposed to come after)
After applying this patch please do the following to remove more deadcode:
- cherry-pick: 3913cc350757 ("x86/fpu: Remove struct fpu::counter")
- revert: f09a7b0eead7 ("perf: sync up x86/.../cpufeatures.h")
- cherry-pick: e63650840e8b ("x86/fpu: Finish excising 'eagerfpu'")
arch/x86/crypto/crc32c-intel_glue.c | 17 ++++-------------
arch/x86/include/asm/fpu/internal.h | 34 +---------------------------------
arch/x86/kernel/fpu/core.c | 36 ++++--------------------------------
arch/x86/kernel/fpu/signal.c | 8 +++-----
arch/x86/kernel/fpu/xstate.c | 9 ---------
arch/x86/kvm/cpuid.c | 4 +---
arch/x86/kvm/x86.c | 10 ----------
7 files changed, 13 insertions(+), 105 deletions(-)
diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index dd19584..5773e11 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -48,21 +48,13 @@
#ifdef CONFIG_X86_64
/*
* use carryless multiply version of crc32c when buffer
- * size is >= 512 (when eager fpu is enabled) or
- * >= 1024 (when eager fpu is disabled) to account
+ * size is >= 512 to account
* for fpu state save/restore overhead.
*/
-#define CRC32C_PCL_BREAKEVEN_EAGERFPU 512
-#define CRC32C_PCL_BREAKEVEN_NOEAGERFPU 1024
+#define CRC32C_PCL_BREAKEVEN 512
asmlinkage unsigned int crc_pcl(const u8 *buffer, int len,
unsigned int crc_init);
-static int crc32c_pcl_breakeven = CRC32C_PCL_BREAKEVEN_EAGERFPU;
-#define set_pcl_breakeven_point() \
-do { \
- if (!use_eager_fpu()) \
- crc32c_pcl_breakeven = CRC32C_PCL_BREAKEVEN_NOEAGERFPU; \
-} while (0)
#endif /* CONFIG_X86_64 */
static u32 crc32c_intel_le_hw_byte(u32 crc, unsigned char const *data, size_t length)
@@ -185,7 +177,7 @@ static int crc32c_pcl_intel_update(struct shash_desc *desc, const u8 *data,
* use faster PCL version if datasize is large enough to
* overcome kernel fpu state save/restore overhead
*/
- if (len >= crc32c_pcl_breakeven && irq_fpu_usable()) {
+ if (len >= CRC32C_PCL_BREAKEVEN && irq_fpu_usable()) {
kernel_fpu_begin();
*crcp = crc_pcl(data, len, *crcp);
kernel_fpu_end();
@@ -197,7 +189,7 @@ static int crc32c_pcl_intel_update(struct shash_desc *desc, const u8 *data,
static int __crc32c_pcl_intel_finup(u32 *crcp, const u8 *data, unsigned int len,
u8 *out)
{
- if (len >= crc32c_pcl_breakeven && irq_fpu_usable()) {
+ if (len >= CRC32C_PCL_BREAKEVEN && irq_fpu_usable()) {
kernel_fpu_begin();
*(__le32 *)out = ~cpu_to_le32(crc_pcl(data, len, *crcp));
kernel_fpu_end();
@@ -257,7 +249,6 @@ static int __init crc32c_intel_mod_init(void)
alg.update = crc32c_pcl_intel_update;
alg.finup = crc32c_pcl_intel_finup;
alg.digest = crc32c_pcl_intel_digest;
- set_pcl_breakeven_point();
}
#endif
return crypto_register_shash(&alg);
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 8852e3a..7801d32 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -60,11 +60,6 @@ extern u64 fpu__get_supported_xfeatures_mask(void);
/*
* FPU related CPU feature flag helper routines:
*/
-static __always_inline __pure bool use_eager_fpu(void)
-{
- return true;
-}
-
static __always_inline __pure bool use_xsaveopt(void)
{
return static_cpu_has(X86_FEATURE_XSAVEOPT);
@@ -501,24 +496,6 @@ static inline int fpu_want_lazy_restore(struct fpu *fpu, unsigned int cpu)
}
-/*
- * Wrap lazy FPU TS handling in a 'hw fpregs activation/deactivation'
- * idiom, which is then paired with the sw-flag (fpregs_active) later on:
- */
-
-static inline void __fpregs_activate_hw(void)
-{
- if (!use_eager_fpu())
- clts();
-}
-
-static inline void __fpregs_deactivate_hw(void)
-{
- if (!use_eager_fpu())
- stts();
-}
-
-/* Must be paired with an 'stts' (fpregs_deactivate_hw()) after! */
static inline void __fpregs_deactivate(struct fpu *fpu)
{
WARN_ON_FPU(!fpu->fpregs_active);
@@ -528,7 +505,6 @@ static inline void __fpregs_deactivate(struct fpu *fpu)
trace_x86_fpu_regs_deactivated(fpu);
}
-/* Must be paired with a 'clts' (fpregs_activate_hw()) before! */
static inline void __fpregs_activate(struct fpu *fpu)
{
WARN_ON_FPU(fpu->fpregs_active);
@@ -554,22 +530,17 @@ static inline int fpregs_active(void)
}
/*
- * Encapsulate the CR0.TS handling together with the
- * software flag.
- *
* These generally need preemption protection to work,
* do try to avoid using these on their own.
*/
static inline void fpregs_activate(struct fpu *fpu)
{
- __fpregs_activate_hw();
__fpregs_activate(fpu);
}
static inline void fpregs_deactivate(struct fpu *fpu)
{
__fpregs_deactivate(fpu);
- __fpregs_deactivate_hw();
}
/*
@@ -596,8 +567,7 @@ switch_fpu_prepare(struct fpu *old_fpu, struct fpu *new_fpu, int cpu)
* or if the past 5 consecutive context-switches used math.
*/
fpu.preload = static_cpu_has(X86_FEATURE_FPU) &&
- new_fpu->fpstate_active &&
- (use_eager_fpu() || new_fpu->counter > 5);
+ new_fpu->fpstate_active;
if (old_fpu->fpregs_active) {
if (!copy_fpregs_to_fpstate(old_fpu))
@@ -615,8 +585,6 @@ switch_fpu_prepare(struct fpu *old_fpu, struct fpu *new_fpu, int cpu)
__fpregs_activate(new_fpu);
trace_x86_fpu_regs_activated(new_fpu);
prefetch(&new_fpu->state);
- } else {
- __fpregs_deactivate_hw();
}
} else {
old_fpu->counter = 0;
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 96d80df..2dc1927 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -58,27 +58,9 @@ static bool kernel_fpu_disabled(void)
return this_cpu_read(in_kernel_fpu);
}
-/*
- * Were we in an interrupt that interrupted kernel mode?
- *
- * On others, we can do a kernel_fpu_begin/end() pair *ONLY* if that
- * pair does nothing at all: the thread must not have fpu (so
- * that we don't try to save the FPU state), and TS must
- * be set (so that the clts/stts pair does nothing that is
- * visible in the interrupted kernel thread).
- *
- * Except for the eagerfpu case when we return true; in the likely case
- * the thread has FPU but we are not going to set/clear TS.
- */
static bool interrupted_kernel_fpu_idle(void)
{
- if (kernel_fpu_disabled())
- return false;
-
- if (use_eager_fpu())
- return true;
-
- return !current->thread.fpu.fpregs_active && (read_cr0() & X86_CR0_TS);
+ return !kernel_fpu_disabled();
}
/*
@@ -126,7 +108,6 @@ void __kernel_fpu_begin(void)
copy_fpregs_to_fpstate(fpu);
} else {
this_cpu_write(fpu_fpregs_owner_ctx, NULL);
- __fpregs_activate_hw();
}
}
EXPORT_SYMBOL(__kernel_fpu_begin);
@@ -137,8 +118,6 @@ void __kernel_fpu_end(void)
if (fpu->fpregs_active)
copy_kernel_to_fpregs(&fpu->state);
- else
- __fpregs_deactivate_hw();
kernel_fpu_enable();
}
@@ -200,10 +179,7 @@ void fpu__save(struct fpu *fpu)
trace_x86_fpu_before_save(fpu);
if (fpu->fpregs_active) {
if (!copy_fpregs_to_fpstate(fpu)) {
- if (use_eager_fpu())
- copy_kernel_to_fpregs(&fpu->state);
- else
- fpregs_deactivate(fpu);
+ copy_kernel_to_fpregs(&fpu->state);
}
}
trace_x86_fpu_after_save(fpu);
@@ -261,8 +237,7 @@ int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
* Don't let 'init optimized' areas of the XSAVE area
* leak into the child task:
*/
- if (use_eager_fpu())
- memset(&dst_fpu->state.xsave, 0, fpu_kernel_xstate_size);
+ memset(&dst_fpu->state.xsave, 0, fpu_kernel_xstate_size);
/*
* Save current FPU registers directly into the child
@@ -284,10 +259,7 @@ int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
memcpy(&src_fpu->state, &dst_fpu->state,
fpu_kernel_xstate_size);
- if (use_eager_fpu())
- copy_kernel_to_fpregs(&src_fpu->state);
- else
- fpregs_deactivate(src_fpu);
+ copy_kernel_to_fpregs(&src_fpu->state);
}
preempt_enable();
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 3ec0d2d..3a93186 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -344,11 +344,9 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
}
fpu->fpstate_active = 1;
- if (use_eager_fpu()) {
- preempt_disable();
- fpu__restore(fpu);
- preempt_enable();
- }
+ preempt_disable();
+ fpu__restore(fpu);
+ preempt_enable();
return err;
} else {
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index abfbb61b..e9d7f46 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -890,15 +890,6 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
*/
if (!boot_cpu_has(X86_FEATURE_OSPKE))
return -EINVAL;
- /*
- * For most XSAVE components, this would be an arduous task:
- * brining fpstate up to date with fpregs, updating fpstate,
- * then re-populating fpregs. But, for components that are
- * never lazily managed, we can just access the fpregs
- * directly. PKRU is never managed lazily, so we can just
- * manipulate it directly. Make sure it stays that way.
- */
- WARN_ON_ONCE(!use_eager_fpu());
/* Set the bits we need in PKRU: */
if (init_val & PKEY_DISABLE_ACCESS)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 7e5119c..c17d389 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -16,7 +16,6 @@
#include <linux/export.h>
#include <linux/vmalloc.h>
#include <linux/uaccess.h>
-#include <asm/fpu/internal.h> /* For use_eager_fpu. Ugh! */
#include <asm/user.h>
#include <asm/fpu/xstate.h>
#include "cpuid.h"
@@ -114,8 +113,7 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
if (best && (best->eax & (F(XSAVES) | F(XSAVEC))))
best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
- if (use_eager_fpu())
- kvm_x86_ops->fpu_activate(vcpu);
+ kvm_x86_ops->fpu_activate(vcpu);
/*
* The existing code assumes virtual address is 48-bit in the canonical
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5ca23af..0754dae 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7513,16 +7513,6 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
copy_fpregs_to_fpstate(&vcpu->arch.guest_fpu);
__kernel_fpu_end();
++vcpu->stat.fpu_reload;
- /*
- * If using eager FPU mode, or if the guest is a frequent user
- * of the FPU, just leave the FPU active for next time.
- * Every 255 times fpu_counter rolls over to 0; a guest that uses
- * the FPU in bursts will revert to loading it on demand.
- */
- if (!use_eager_fpu()) {
- if (++vcpu->fpu_counter < 5)
- kvm_make_request(KVM_REQ_DEACTIVATE_FPU, vcpu);
- }
trace_kvm_fpu(0);
}
--
2.1.4
Depending on ABI "long long" type of a particular 32-bit CPU
might be aligned by either word (32-bits) or double word (64-bits).
Make sure "data" is really 64-bit aligned for any 32-bit CPU.
At least for 32-bit ARC cores ABI requires "long long" types
to be aligned by normal 32-bit word. This makes "data" field aligned to
12 bytes. Which is still OK as long as we use 32-bit data only.
But once we want to use native atomic64_t type (i.e. when we use special
instructions LLOCKD/SCONDD for accessing 64-bit data) we easily hit
misaligned access exception.
That's because even on CPUs capable of non-aligned data access LL/SC
instructions require strict alignment.
Signed-off-by: Alexey Brodkin <abrodkin(a)synopsys.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
---
Changes v1 -> v2:
* Reworded commit message
* Inserted comment right in source [Thomas]
drivers/base/devres.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/base/devres.c b/drivers/base/devres.c
index f98a097e73f2..466fa59c866a 100644
--- a/drivers/base/devres.c
+++ b/drivers/base/devres.c
@@ -24,8 +24,12 @@ struct devres_node {
struct devres {
struct devres_node node;
- /* -- 3 pointers */
- unsigned long long data[]; /* guarantee ull alignment */
+ /*
+ * Depending on ABI "long long" type of a particular 32-bit CPU
+ * might be aligned by either word (32-bits) or double word (64-bits).
+ * Make sure "data" is really 64-bit aligned for any 32-bit CPU.
+ */
+ unsigned long long data[] __aligned(sizeof(unsigned long long));
};
struct devres_group {
--
2.17.1
On 06/18/2018 10:13 AM, Greg Kroah-Hartman wrote:
> 4.16-stable review patch. If anyone has any objections, please let me know.
So I was wondering, why backport such a considerable number of
*selftests* to stable, given the stable policy? Surely selftests don't
affect the kernel itself breaking for users?
Thanks, Vlastimil
In commit bc73905abf770192 ("[SCSI] lpfc 8.3.16: SLI Additions, updates,
and code cleanup"), lpfc_memcpy_to_slim() have switched memcpy_toio() to
__write32_copy() in order to prevent unaligned 64 bit copy. Recently, we
found that lpfc_memcpy_from_slim() have similar issues, so let it switch
memcpy_fromio() to __read32_copy().
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
drivers/scsi/lpfc/lpfc_compat.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_compat.h b/drivers/scsi/lpfc/lpfc_compat.h
index 6b32b0a..47d4fad 100644
--- a/drivers/scsi/lpfc/lpfc_compat.h
+++ b/drivers/scsi/lpfc/lpfc_compat.h
@@ -91,8 +91,8 @@ lpfc_memcpy_to_slim( void __iomem *dest, void *src, unsigned int bytes)
static inline void
lpfc_memcpy_from_slim( void *dest, void __iomem *src, unsigned int bytes)
{
- /* actually returns 1 byte past dest */
- memcpy_fromio( dest, src, bytes);
+ /* convert bytes in argument list to word count for copy function */
+ __ioread32_copy(dest, src, bytes / sizeof(uint32_t));
}
#endif /* __BIG_ENDIAN */
--
2.7.0
Every time I tried to upgrade my laptop from 3.10.x to 4.x I faced an
issue by which the fan would run at full speed upon resume. Bisecting
it showed me the issue was introduced in 3.17 by commit 821d6f0359b0
(ACPI / sleep: Do not save NVS for new machines to accelerate S3). This
code only affects machines built starting as of 2012, but this Asus
1025C laptop was made in 2012 and apparently needs the NVS data to be
saved, otherwise the CPU's thermal state is not properly reported on
resume and the fan runs at full speed upon resume.
Here's a very simple way to check if such a machine is affected :
# cat /sys/class/thermal/thermal_zone0/temp
55000
( now suspend, wait one second and resume )
# cat /sys/class/thermal/thermal_zone0/temp
0
(and after ~15 seconds the fan starts to spin)
Let's apply the same quirk as commit cbc00c13 (ACPI: save NVS memory
for Lenovo G50-45) and reuse the function it provides. Note that this
commit was already backported to 4.9.x but not 4.4.x.
Cc: <stable(a)vger.kernel.org> # 3.17+: requires cbc00c13
Signed-off-by: Willy Tarreau <w(a)1wt.eu>
---
drivers/acpi/sleep.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
index 974e584..af54d7b 100644
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -338,6 +338,14 @@ static const struct dmi_system_id acpisleep_dmi_table[] __initconst = {
DMI_MATCH(DMI_PRODUCT_NAME, "K54HR"),
},
},
+ {
+ .callback = init_nvs_save_s3,
+ .ident = "Asus 1025C",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "1025C"),
+ },
+ },
/*
* https://bugzilla.kernel.org/show_bug.cgi?id=189431
* Lenovo G50-45 is a platform later than 2012, but needs nvs memory
--
2.8.0.rc2.1.gbe9624a
This patch restores the suspend and resume hooks that the old driver used
to have. Apart from stopping and starting the clocks, the resume callback
also nullifies the selected_chip pointer, so the next command that is issued
will re-select the chip and thereby restore the timing registers.
Factor out some code from marvell_nfc_init() into a new function
marvell_nfc_reset() and also call it at resume time to reset some registers
that don't retain their contents during low-power mode.
Without this patch, a PXA3xx based system would cough up an error similar to
the one below after resume.
[ 44.660162] marvell-nfc 43100000.nand-controller: Timeout waiting for RB signal
[ 44.671492] ubi0 error: ubi_io_write: error -110 while writing 2048 bytes to PEB 102:38912, written 0 bytes
[ 44.682887] CPU: 0 PID: 1417 Comm: remote-control Not tainted 4.18.0-rc2+ #344
[ 44.691197] Hardware name: Marvell PXA3xx (Device Tree Support)
[ 44.697111] Backtrace:
[ 44.699593] [<c0106458>] (dump_backtrace) from [<c0106718>] (show_stack+0x18/0x1c)
[ 44.708931] r7:00000800 r6:00009800 r5:00000066 r4:c6139000
[ 44.715833] [<c0106700>] (show_stack) from [<c0678a60>] (dump_stack+0x20/0x28)
[ 44.724206] [<c0678a40>] (dump_stack) from [<c0456cbc>] (ubi_io_write+0x3d4/0x630)
[ 44.732925] [<c04568e8>] (ubi_io_write) from [<c0454428>] (ubi_eba_write_leb+0x690/0x6fc)
...
Signed-off-by: Daniel Mack <daniel(a)zonque.org>
Fixes: 02f26ecf8c77 ("mtd: nand: add reworked Marvell NAND controller driver")
Cc: stable(a)vger.kernel.org
---
drivers/mtd/nand/raw/marvell_nand.c | 73 ++++++++++++++++++++++++-----
1 file changed, 62 insertions(+), 11 deletions(-)
diff --git a/drivers/mtd/nand/raw/marvell_nand.c b/drivers/mtd/nand/raw/marvell_nand.c
index 00d9f29bbdb6..03ee016d7516 100644
--- a/drivers/mtd/nand/raw/marvell_nand.c
+++ b/drivers/mtd/nand/raw/marvell_nand.c
@@ -2662,6 +2662,21 @@ static int marvell_nfc_init_dma(struct marvell_nfc *nfc)
return 0;
}
+static void marvell_nfc_reset(struct marvell_nfc *nfc)
+{
+ /*
+ * ECC operations and interruptions are only enabled when specifically
+ * needed. ECC shall not be activated in the early stages (fails probe).
+ * Arbiter flag, even if marked as "reserved", must be set (empirical).
+ * SPARE_EN bit must always be set or ECC bytes will not be at the same
+ * offset in the read page and this will fail the protection.
+ */
+ writel_relaxed(NDCR_ALL_INT | NDCR_ND_ARB_EN | NDCR_SPARE_EN |
+ NDCR_RD_ID_CNT(NFCV1_READID_LEN), nfc->regs + NDCR);
+ writel_relaxed(0xFFFFFFFF, nfc->regs + NDSR);
+ writel_relaxed(0, nfc->regs + NDECCCTRL);
+}
+
static int marvell_nfc_init(struct marvell_nfc *nfc)
{
struct device_node *np = nfc->dev->of_node;
@@ -2700,17 +2715,7 @@ static int marvell_nfc_init(struct marvell_nfc *nfc)
if (!nfc->caps->is_nfcv2)
marvell_nfc_init_dma(nfc);
- /*
- * ECC operations and interruptions are only enabled when specifically
- * needed. ECC shall not be activated in the early stages (fails probe).
- * Arbiter flag, even if marked as "reserved", must be set (empirical).
- * SPARE_EN bit must always be set or ECC bytes will not be at the same
- * offset in the read page and this will fail the protection.
- */
- writel_relaxed(NDCR_ALL_INT | NDCR_ND_ARB_EN | NDCR_SPARE_EN |
- NDCR_RD_ID_CNT(NFCV1_READID_LEN), nfc->regs + NDCR);
- writel_relaxed(0xFFFFFFFF, nfc->regs + NDSR);
- writel_relaxed(0, nfc->regs + NDECCCTRL);
+ marvell_nfc_reset(nfc);
return 0;
}
@@ -2825,6 +2830,51 @@ static int marvell_nfc_remove(struct platform_device *pdev)
return 0;
}
+static int __maybe_unused marvell_nfc_suspend(struct device *dev)
+{
+ struct marvell_nfc *nfc = dev_get_drvdata(dev);
+ struct marvell_nand_chip *chip;
+
+ list_for_each_entry(chip, &nfc->chips, node)
+ marvell_nfc_wait_ndrun(&chip->chip);
+
+ clk_disable_unprepare(nfc->reg_clk);
+ clk_disable_unprepare(nfc->core_clk);
+
+ return 0;
+}
+
+static int __maybe_unused marvell_nfc_resume(struct device *dev)
+{
+ struct marvell_nfc *nfc = dev_get_drvdata(dev);
+ int ret;
+
+ ret = clk_prepare_enable(nfc->core_clk);
+ if (ret < 0)
+ return ret;
+
+ if (!IS_ERR(nfc->reg_clk)) {
+ ret = clk_prepare_enable(nfc->reg_clk);
+ if (ret < 0)
+ return ret;
+ }
+
+ /*
+ * Reset nfc->selected_chip so the next command will cause the timing
+ * registers to be restored in marvell_nfc_select_chip().
+ */
+ nfc->selected_chip = NULL;
+
+ /* Reset registers that have lost their contents */
+ marvell_nfc_reset(nfc);
+
+ return 0;
+}
+
+static const struct dev_pm_ops marvell_nfc_pm_ops = {
+ SET_SYSTEM_SLEEP_PM_OPS(marvell_nfc_suspend, marvell_nfc_resume)
+};
+
static const struct marvell_nfc_caps marvell_armada_8k_nfc_caps = {
.max_cs_nb = 4,
.max_rb_nb = 2,
@@ -2909,6 +2959,7 @@ static struct platform_driver marvell_nfc_driver = {
.driver = {
.name = "marvell-nfc",
.of_match_table = marvell_nfc_of_ids,
+ .pm = &marvell_nfc_pm_ops,
},
.id_table = marvell_nfc_platform_ids,
.probe = marvell_nfc_probe,
--
2.17.1
This is the start of the stable review cycle for the 4.17.5 release.
There are 46 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun Jul 8 05:45:10 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.17.5-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.17.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.17.5-rc1
Sean Nyekjaer <sean.nyekjaer(a)prevas.dk>
ARM: dts: imx6q: Use correct SDMA script for SPI5 core
Andrey Ryabinin <aryabinin(a)virtuozzo.com>
x86/mm: Don't free P4D table when it is folded at runtime
Neil Armstrong <narmstrong(a)baylibre.com>
ARM64: dts: meson-gxl-s905x-p212: Add phy-supply for usb0
Taehee Yoo <ap420073(a)gmail.com>
netfilter: nf_tables: use WARN_ON_ONCE instead of BUG_ON in nft_do_chain()
Florian Westphal <fw(a)strlen.de>
netfilter: xt_connmark: fix list corruption on rmmod
Vincent Bernat <vincent(a)bernat.im>
netfilter: ip6t_rpfilter: provide input interface for route lookup
Kenneth Graunke <kenneth(a)whitecape.org>
drm/i915: Enable provoking vertex fix on Gen9 systems.
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915: Turn off g4x DP port in .post_disable()
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915: Disallow interlaced modes on g4x DP outputs
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915: Fix PIPESTAT irq ack on i965/g4x
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915: Allow DBLSCAN user modes with eDP/LVDS/DSI
Shirish S <shirish.s(a)amd.com>
drm/amd/display: release spinlock before committing updates to stream
Lyude Paul <lyude(a)redhat.com>
drm/amdgpu: Count disabled CRTCs in commit tail earlier
Michel Dänzer <michel.daenzer(a)amd.com>
drm/amdgpu: GPU vs CPU page size fixes in amdgpu_vm_bo_split_mapping
Michel Dänzer <michel.daenzer(a)amd.com>
drm/amdgpu: Update pin_size values before unpinning BO
Michel Dänzer <michel.daenzer(a)amd.com>
drm/amdgpu: Make amdgpu_vram_mgr_bo_invisible_size always accurate
Michel Dänzer <michel.daenzer(a)amd.com>
drm/amdgpu: Refactor amdgpu_vram_mgr_bo_invisible_size helper
Michel Dänzer <michel.daenzer(a)amd.com>
drm/amdgpu: Use kvmalloc_array for allocating VRAM manager nodes array
Harry Wentland <harry.wentland(a)amd.com>
drm/amdgpu: Don't default to DC support for Kaveri and older
Paul Kocialkowski <paul.kocialkowski(a)bootlin.com>
Revert "drm/sun4i: Handle DRM_BUS_FLAG_PIXDATA_*EDGE"
Stefan Agner <stefan(a)agner.ch>
drm/atmel-hlcdc: check stride values in the first plane
Jeremy Cline <jcline(a)redhat.com>
drm/qxl: Call qxl_bo_unref outside atomic context
Lyude Paul <lyude(a)redhat.com>
drm/i915/dp: Send DPCD ON for MST before phy_up
Mikita Lipski <mikita.lipski(a)amd.com>
drm/amd/display: Clear connector's edid pointer
Oliver O'Halloran <oohall(a)gmail.com>
drm/sti: Depend on OF rather than selecting it
Junwei Zhang <Jerry.Zhang(a)amd.com>
drm/amdgpu: fix clear_all and replace handling in the VM (v2)
Lyude Paul <lyude(a)redhat.com>
drm/amdgpu: Grab/put runtime PM references in atomic_commit_tail()
Huang Rui <ray.huang(a)amd.com>
drm/amdgpu: fix the missed vcn fw version report
Rex Zhu <Rex.Zhu(a)amd.com>
drm/amdgpu: Add APU support in vi_set_vce_clocks
Rex Zhu <Rex.Zhu(a)amd.com>
drm/amdgpu: Add APU support in vi_set_uvd_clocks
Alexander Potapenko <glider(a)google.com>
vt: prevent leaking uninitialized data to userspace via /dev/vcs*
Johan Hovold <johan(a)kernel.org>
serdev: fix memleak on module unload
Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
serial: 8250_pci: Remove stalled entries in blacklist
Leonard Crestez <leonard.crestez(a)nxp.com>
iio: mma8452: Fix ignoring MMA8452_INT_DRDY
Laura Abbott <labbott(a)redhat.com>
staging: android: ion: Return an ERR_PTR in ion_map_kernel
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
n_tty: Access echo_* variables carefully.
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
n_tty: Fix stall at n_tty_receive_char_special().
Zhengjun Xing <zhengjun.xing(a)linux.intel.com>
xhci: Fix kernel oops in trace_xhci_free_virt_device
Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
usb: typec: ucsi: Fix for incorrect status data issue
Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
usb: typec: ucsi: acpi: Workaround for cache mode issue
Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
acpi: Add helper for deactivating memory region
Peter Chen <peter.chen(a)nxp.com>
usb: typec: tcpm: fix logbuffer index is wrong if _tcpm_log is re-entered
William Wu <william.wu(a)rock-chips.com>
usb: dwc2: fix the incorrect bitmaps for the ports of multi_tt hub
Karoly Pados <pados(a)pados.hu>
USB: serial: cp210x: add Silicon Labs IDs for Windows Update
Johan Hovold <johan(a)kernel.org>
USB: serial: cp210x: add CESINEL device ids
Houston Yaroschoff <hstn(a)4ever3.net>
usb: cdc_acm: Add quirk for Uniden UBC125 scanner
-------------
Diffstat:
Makefile | 4 +-
arch/arm/boot/dts/imx6q.dtsi | 2 +-
.../boot/dts/amlogic/meson-gxl-s905x-p212.dtsi | 7 ++
arch/x86/include/asm/pgalloc.h | 3 +
drivers/acpi/osl.c | 72 ++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 24 +++----
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 14 ++--
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 39 ++++++++++-
drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 4 +-
drivers/gpu/drm/amd/amdgpu/vi.c | 77 +++++++++++++++++-----
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 22 +++++--
drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c | 2 +-
drivers/gpu/drm/i915/i915_irq.c | 12 +++-
drivers/gpu/drm/i915/i915_reg.h | 5 ++
drivers/gpu/drm/i915/intel_crt.c | 20 ++++++
drivers/gpu/drm/i915/intel_ddi.c | 8 ++-
drivers/gpu/drm/i915/intel_display.c | 16 ++++-
drivers/gpu/drm/i915/intel_dp.c | 34 +++++-----
drivers/gpu/drm/i915/intel_dp_mst.c | 14 +++-
drivers/gpu/drm/i915/intel_dsi.c | 6 ++
drivers/gpu/drm/i915/intel_dvo.c | 6 ++
drivers/gpu/drm/i915/intel_hdmi.c | 6 ++
drivers/gpu/drm/i915/intel_lrc.c | 12 +++-
drivers/gpu/drm/i915/intel_lvds.c | 5 ++
drivers/gpu/drm/i915/intel_sdvo.c | 6 ++
drivers/gpu/drm/i915/intel_tv.c | 12 +++-
drivers/gpu/drm/qxl/qxl_display.c | 7 +-
drivers/gpu/drm/sti/Kconfig | 3 +-
drivers/gpu/drm/sun4i/sun4i_tcon.c | 25 -------
drivers/iio/accel/mma8452.c | 2 +-
drivers/staging/android/ion/ion_heap.c | 2 +-
drivers/tty/n_tty.c | 55 +++++++++-------
drivers/tty/serdev/core.c | 1 +
drivers/tty/serial/8250/8250_pci.c | 2 -
drivers/tty/vt/vt.c | 4 +-
drivers/usb/class/cdc-acm.c | 3 +
drivers/usb/dwc2/hcd_queue.c | 2 +-
drivers/usb/host/xhci-mem.c | 4 +-
drivers/usb/host/xhci-trace.h | 36 ++++++++--
drivers/usb/serial/cp210x.c | 14 ++++
drivers/usb/typec/tcpm.c | 7 +-
drivers/usb/typec/ucsi/ucsi.c | 13 ++++
drivers/usb/typec/ucsi/ucsi_acpi.c | 5 ++
include/linux/acpi.h | 3 +
net/ipv6/netfilter/ip6t_rpfilter.c | 2 +
net/netfilter/nf_tables_core.c | 3 +-
net/netfilter/xt_connmark.c | 2 +-
50 files changed, 489 insertions(+), 150 deletions(-)
Hi,
On Thu, Jul 5, 2018 at 7:31 AM, Antti Seppälä <a.seppala(a)gmail.com> wrote:
> The commit 3bc04e28a030 ("usb: dwc2: host: Get aligned DMA in a more
> supported way") introduced a common way to align DMA allocations.
> The code in the commit aligns the struct dma_aligned_buffer but the
> actual DMA address pointed by data[0] gets aligned to an offset from
> the allocated boundary by the kmalloc_ptr and the old_xfer_buffer
> pointers.
>
> This is against the recommendation in Documentation/DMA-API.txt which
> states:
>
> Therefore, it is recommended that driver writers who don't take
> special care to determine the cache line size at run time only map
> virtual regions that begin and end on page boundaries (which are
> guaranteed also to be cache line boundaries).
>
> The effect of this is that architectures with non-coherent DMA caches
> may run into memory corruption or kernel crashes with Unhandled
> kernel unaligned accesses exceptions.
>
> Fix the alignment by positioning the DMA area in front of the allocation
> and use memory at the end of the area for storing the orginal
> transfer_buffer pointer. This may have the added benefit of increased
> performance as the DMA area is now fully aligned on all architectures.
>
> Tested with Lantiq xRX200 (MIPS) and RPi Model B Rev 2 (ARM).
>
> Fixes: 3bc04e28a030 ("usb: dwc2: host: Get aligned DMA in a more
> supported way")
>
> Signed-off-by: Antti Seppälä <a.seppala(a)gmail.com>
> ---
> drivers/usb/dwc2/hcd.c | 44 +++++++++++++++++++++++---------------------
> 1 file changed, 23 insertions(+), 21 deletions(-)
Thanks for tracking this down and sorry for the original regression.
Seems like a good fix. With this fix, I'd be curious of your
observations on how dwc2 performs (both performance and compatibility
under stress) with the newest driver compared to whatever you were
using before.
Also: you're using the dwc2_set_ltq_params() parameters? Have you
checked if removing the "max_transfer_size" limit boosts your
performance?
Cc: stable(a)vger.kernel.org
Reviewed-by: Douglas Anderson <dianders(a)chromium.org>