Greg,
Can you please include the following patch to 4.14.y stable branch
to fix the intel_idle probe and avoid mis-lead info to user by blaming
unsupported processor model:
a4c447533a18 ("intel_idle: Graceful probe failure when MWAIT is disabled")
This is a self-contained and cherry-pick'able patch in linus master branch.
Thanks,
Hope this time is better communicated.
--
All the best,
Eduardo Valentin
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From bd3599a0e142cd73edd3b6801068ac3f48ac771a Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Thu, 12 Jul 2018 01:36:43 +0100
Subject: [PATCH] Btrfs: fix file data corruption after cloning a range and
fsync
When we clone a range into a file we can end up dropping existing
extent maps (or trimming them) and replacing them with new ones if the
range to be cloned overlaps with a range in the destination inode.
When that happens we add the new extent maps to the list of modified
extents in the inode's extent map tree, so that a "fast" fsync (the flag
BTRFS_INODE_NEEDS_FULL_SYNC not set in the inode) will see the extent maps
and log corresponding extent items. However, at the end of range cloning
operation we do truncate all the pages in the affected range (in order to
ensure future reads will not get stale data). Sometimes this truncation
will release the corresponding extent maps besides the pages from the page
cache. If this happens, then a "fast" fsync operation will miss logging
some extent items, because it relies exclusively on the extent maps being
present in the inode's extent tree, leading to data loss/corruption if
the fsync ends up using the same transaction used by the clone operation
(that transaction was not committed in the meanwhile). An extent map is
released through the callback btrfs_invalidatepage(), which gets called by
truncate_inode_pages_range(), and it calls __btrfs_releasepage(). The
later ends up calling try_release_extent_mapping() which will release the
extent map if some conditions are met, like the file size being greater
than 16Mb, gfp flags allow blocking and the range not being locked (which
is the case during the clone operation) nor being the extent map flagged
as pinned (also the case for cloning).
The following example, turned into a test for fstests, reproduces the
issue:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "pwrite -S 0x18 9000K 6908K" /mnt/foo
$ xfs_io -f -c "pwrite -S 0x20 2572K 156K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
# reflink destination offset corresponds to the size of file bar,
# 2728Kb minus 4Kb.
$ xfs_io -c ""reflink ${SCRATCH_MNT}/foo 0 2724K 15908K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
$ md5sum /mnt/bar
95a95813a8c2abc9aa75a6c2914a077e /mnt/bar
<power fail>
$ mount /dev/sdb /mnt
$ md5sum /mnt/bar
207fd8d0b161be8a84b945f0df8d5f8d /mnt/bar
# digest should be 95a95813a8c2abc9aa75a6c2914a077e like before the
# power failure
In the above example, the destination offset of the clone operation
corresponds to the size of the "bar" file minus 4Kb. So during the clone
operation, the extent map covering the range from 2572Kb to 2728Kb gets
trimmed so that it ends at offset 2724Kb, and a new extent map covering
the range from 2724Kb to 11724Kb is created. So at the end of the clone
operation when we ask to truncate the pages in the range from 2724Kb to
2724Kb + 15908Kb, the page invalidation callback ends up removing the new
extent map (through try_release_extent_mapping()) when the page at offset
2724Kb is passed to that callback.
Fix this by setting the bit BTRFS_INODE_NEEDS_FULL_SYNC whenever an extent
map is removed at try_release_extent_mapping(), forcing the next fsync to
search for modified extents in the fs/subvolume tree instead of relying on
the presence of extent maps in memory. This way we can continue doing a
"fast" fsync if the destination range of a clone operation does not
overlap with an existing range or if any of the criteria necessary to
remove an extent map at try_release_extent_mapping() is not met (file
size not bigger then 16Mb or gfp flags do not allow blocking).
CC: stable(a)vger.kernel.org # 3.16+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1aa91d57404a..8fd86e29085f 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4241,8 +4241,9 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
struct extent_map *em;
u64 start = page_offset(page);
u64 end = start + PAGE_SIZE - 1;
- struct extent_io_tree *tree = &BTRFS_I(page->mapping->host)->io_tree;
- struct extent_map_tree *map = &BTRFS_I(page->mapping->host)->extent_tree;
+ struct btrfs_inode *btrfs_inode = BTRFS_I(page->mapping->host);
+ struct extent_io_tree *tree = &btrfs_inode->io_tree;
+ struct extent_map_tree *map = &btrfs_inode->extent_tree;
if (gfpflags_allow_blocking(mask) &&
page->mapping->host->i_size > SZ_16M) {
@@ -4265,6 +4266,8 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
extent_map_end(em) - 1,
EXTENT_LOCKED | EXTENT_WRITEBACK,
0, NULL)) {
+ set_bit(BTRFS_INODE_NEEDS_FULL_SYNC,
+ &btrfs_inode->runtime_flags);
remove_extent_mapping(map, em);
/* once for the rb tree */
free_extent_map(em);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9f9e3e0d4dd3338b3f3dde080789f71901e1e4ff Mon Sep 17 00:00:00 2001
From: Esben Haabendal <eha(a)deif.com>
Date: Mon, 9 Jul 2018 11:43:01 +0200
Subject: [PATCH] i2c: imx: Fix reinit_completion() use
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Make sure to call reinit_completion() before dma is started to avoid race
condition where reinit_completion() is called after complete() and before
wait_for_completion_timeout().
Signed-off-by: Esben Haabendal <eha(a)deif.com>
Fixes: ce1a78840ff7 ("i2c: imx: add DMA support for freescale i2c driver")
Reviewed-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Signed-off-by: Wolfram Sang <wsa(a)the-dreams.de>
Cc: stable(a)kernel.org
diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
index 0207e194f84b..39cfd98c7b23 100644
--- a/drivers/i2c/busses/i2c-imx.c
+++ b/drivers/i2c/busses/i2c-imx.c
@@ -368,6 +368,7 @@ static int i2c_imx_dma_xfer(struct imx_i2c_struct *i2c_imx,
goto err_desc;
}
+ reinit_completion(&dma->cmd_complete);
txdesc->callback = i2c_imx_dma_callback;
txdesc->callback_param = i2c_imx;
if (dma_submit_error(dmaengine_submit(txdesc))) {
@@ -622,7 +623,6 @@ static int i2c_imx_dma_write(struct imx_i2c_struct *i2c_imx,
* The first byte must be transmitted by the CPU.
*/
imx_i2c_write_reg(i2c_8bit_addr_from_msg(msgs), i2c_imx, IMX_I2C_I2DR);
- reinit_completion(&i2c_imx->dma->cmd_complete);
time_left = wait_for_completion_timeout(
&i2c_imx->dma->cmd_complete,
msecs_to_jiffies(DMA_TIMEOUT));
@@ -681,7 +681,6 @@ static int i2c_imx_dma_read(struct imx_i2c_struct *i2c_imx,
if (result)
return result;
- reinit_completion(&i2c_imx->dma->cmd_complete);
time_left = wait_for_completion_timeout(
&i2c_imx->dma->cmd_complete,
msecs_to_jiffies(DMA_TIMEOUT));
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 73c8d8945505acdcbae137c2e00a1232e0be709f Mon Sep 17 00:00:00 2001
From: Masami Hiramatsu <mhiramat(a)kernel.org>
Date: Sat, 14 Jul 2018 01:28:15 +0900
Subject: [PATCH] ring_buffer: tracing: Inherit the tracing setting to next
ring buffer
Maintain the tracing on/off setting of the ring_buffer when switching
to the trace buffer snapshot.
Taking a snapshot is done by swapping the backup ring buffer
(max_tr_buffer). But since the tracing on/off setting is defined
by the ring buffer, when swapping it, the tracing on/off setting
can also be changed. This causes a strange result like below:
/sys/kernel/debug/tracing # cat tracing_on
1
/sys/kernel/debug/tracing # echo 0 > tracing_on
/sys/kernel/debug/tracing # cat tracing_on
0
/sys/kernel/debug/tracing # echo 1 > snapshot
/sys/kernel/debug/tracing # cat tracing_on
1
/sys/kernel/debug/tracing # echo 1 > snapshot
/sys/kernel/debug/tracing # cat tracing_on
0
We don't touch tracing_on, but snapshot changes tracing_on
setting each time. This is an anomaly, because user doesn't know
that each "ring_buffer" stores its own tracing-enable state and
the snapshot is done by swapping ring buffers.
Link: http://lkml.kernel.org/r/153149929558.11274.11730609978254724394.stgit@devb…
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Tom Zanussi <tom.zanussi(a)linux.intel.com>
Cc: Hiraku Toyooka <hiraku.toyooka(a)cybertrust.co.jp>
Cc: stable(a)vger.kernel.org
Fixes: debdd57f5145 ("tracing: Make a snapshot feature available from userspace")
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
[ Updated commit log and comment in the code ]
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index b72ebdff0b77..003d09ab308d 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -165,6 +165,7 @@ void ring_buffer_record_enable(struct ring_buffer *buffer);
void ring_buffer_record_off(struct ring_buffer *buffer);
void ring_buffer_record_on(struct ring_buffer *buffer);
int ring_buffer_record_is_on(struct ring_buffer *buffer);
+int ring_buffer_record_is_set_on(struct ring_buffer *buffer);
void ring_buffer_record_disable_cpu(struct ring_buffer *buffer, int cpu);
void ring_buffer_record_enable_cpu(struct ring_buffer *buffer, int cpu);
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 6a46af21765c..0b0b688ea166 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3226,6 +3226,22 @@ int ring_buffer_record_is_on(struct ring_buffer *buffer)
return !atomic_read(&buffer->record_disabled);
}
+/**
+ * ring_buffer_record_is_set_on - return true if the ring buffer is set writable
+ * @buffer: The ring buffer to see if write is set enabled
+ *
+ * Returns true if the ring buffer is set writable by ring_buffer_record_on().
+ * Note that this does NOT mean it is in a writable state.
+ *
+ * It may return true when the ring buffer has been disabled by
+ * ring_buffer_record_disable(), as that is a temporary disabling of
+ * the ring buffer.
+ */
+int ring_buffer_record_is_set_on(struct ring_buffer *buffer)
+{
+ return !(atomic_read(&buffer->record_disabled) & RB_BUFFER_OFF);
+}
+
/**
* ring_buffer_record_disable_cpu - stop all writes into the cpu_buffer
* @buffer: The ring buffer to stop writes to.
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 87cf25171fb8..823687997b01 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1373,6 +1373,12 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
arch_spin_lock(&tr->max_lock);
+ /* Inherit the recordable setting from trace_buffer */
+ if (ring_buffer_record_is_set_on(tr->trace_buffer.buffer))
+ ring_buffer_record_on(tr->max_buffer.buffer);
+ else
+ ring_buffer_record_off(tr->max_buffer.buffer);
+
swap(tr->trace_buffer.buffer, tr->max_buffer.buffer);
__update_max_tr(tr, tsk, cpu);
Hi Greg,
Upstream commit a0040c014594 ("ACPI / PCI: Bail early in
acpi_pci_add_bus() if there is no ACPI handle ").
This patch fixes a NULL pointer access when ACPI_HANDLE() is NULL.
acpi_evaluate_dsm() fails as follows while there is no DSM in the tables.
ACPI: \: failed to evaluate _DSM (0x1001)
Sinan
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 44de022c4382541cebdd6de4465d1f4f465ff1dd Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso(a)mit.edu>
Date: Sun, 8 Jul 2018 19:35:02 -0400
Subject: [PATCH] ext4: fix false negatives *and* false positives in
ext4_check_descriptors()
Ext4_check_descriptors() was getting called before s_gdb_count was
initialized. So for file systems w/o the meta_bg feature, allocation
bitmaps could overlap the block group descriptors and ext4 wouldn't
notice.
For file systems with the meta_bg feature enabled, there was a
fencepost error which would cause the ext4_check_descriptors() to
incorrectly believe that the block allocation bitmap overlaps with the
block group descriptor blocks, and it would reject the mount.
Fix both of these problems.
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: stable(a)vger.kernel.org
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ba2396a7bd04..eff5c983e067 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2342,7 +2342,7 @@ static int ext4_check_descriptors(struct super_block *sb,
struct ext4_sb_info *sbi = EXT4_SB(sb);
ext4_fsblk_t first_block = le32_to_cpu(sbi->s_es->s_first_data_block);
ext4_fsblk_t last_block;
- ext4_fsblk_t last_bg_block = sb_block + ext4_bg_num_gdb(sb, 0) + 1;
+ ext4_fsblk_t last_bg_block = sb_block + ext4_bg_num_gdb(sb, 0);
ext4_fsblk_t block_bitmap;
ext4_fsblk_t inode_bitmap;
ext4_fsblk_t inode_table;
@@ -4085,14 +4085,13 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
goto failed_mount2;
}
}
+ sbi->s_gdb_count = db_count;
if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) {
ext4_msg(sb, KERN_ERR, "group descriptors corrupted!");
ret = -EFSCORRUPTED;
goto failed_mount2;
}
- sbi->s_gdb_count = db_count;
-
timer_setup(&sbi->s_err_report, print_daily_error_info, 0);
/* Register extent status tree shrinker */
Fix trivial conflict that caused the original commit to bounce from 4.4,
4.9, and 4.14.
-- >8 --
commit 44de022c4382541cebdd6de4465d1f4f465ff1dd upstream.
Ext4_check_descriptors() was getting called before s_gdb_count was
initialized. So for file systems w/o the meta_bg feature, allocation
bitmaps could overlap the block group descriptors and ext4 wouldn't
notice.
For file systems with the meta_bg feature enabled, there was a
fencepost error which would cause the ext4_check_descriptors() to
incorrectly believe that the block allocation bitmap overlaps with the
block group descriptor blocks, and it would reject the mount.
Fix both of these problems.
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: stable(a)vger.kernel.org
Signed-off-by: Benjamin Gilbert <bgilbert(a)redhat.com>
---
fs/ext4/super.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index fc32a67a7a19..a500f55d421a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2301,7 +2301,7 @@ static int ext4_check_descriptors(struct super_block *sb,
struct ext4_sb_info *sbi = EXT4_SB(sb);
ext4_fsblk_t first_block = le32_to_cpu(sbi->s_es->s_first_data_block);
ext4_fsblk_t last_block;
- ext4_fsblk_t last_bg_block = sb_block + ext4_bg_num_gdb(sb, 0) + 1;
+ ext4_fsblk_t last_bg_block = sb_block + ext4_bg_num_gdb(sb, 0);
ext4_fsblk_t block_bitmap;
ext4_fsblk_t inode_bitmap;
ext4_fsblk_t inode_table;
@@ -4044,13 +4044,13 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
goto failed_mount2;
}
}
+ sbi->s_gdb_count = db_count;
if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) {
ext4_msg(sb, KERN_ERR, "group descriptors corrupted!");
ret = -EFSCORRUPTED;
goto failed_mount2;
}
- sbi->s_gdb_count = db_count;
get_random_bytes(&sbi->s_next_generation, sizeof(u32));
spin_lock_init(&sbi->s_next_gen_lock);
--
2.17.1
Did you receive my email yesterday?
Just want to check if you have needs for photo editing for our studio?
We normally edit 300 images within 12-24 hours.
We take care of different kinds of e-commerce photos, jewelry images, and
portrait model images.
Including cutting out and clipping path and others, we also give retouching
for your images.
You may drop us a photo if you want to check our quality, we will provide
testing.
Thanks,
Roy
Since commit 63347db0affa ("ACPI / scan: Use acpi_bus_get_status() to
initialize ACPI_TYPE_DEVICE devs") the status field of normal acpi_devices
gets set to 0 by acpi_bus_type_and_status() and filled with its actual
value later when acpi_add_single_object() calls acpi_bus_get_status().
This means that status is 0 when acpi_device_enumeration_by_parent()
gets called which makes the acpi_match_device_ids() call in
acpi_is_indirect_io_slave() always return -ENOENT.
This commit fixes this by making acpi_bus_type_and_status() set status to
ACPI_STA_DEFAULT until acpi_add_single_object() calls acpi_bus_get_status()
Fixes: dfda4492322e ("ACPI / scan: Do not enumerate Indirect IO host children")
Cc: stable(a)vger.kernel.org
Cc: John Garry <john.garry(a)huawei.com>
Cc: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: Dann Frazier <dann.frazier(a)canonical.com>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/acpi/scan.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 970dd87d347c..6799d00dd790 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -1612,7 +1612,8 @@ static int acpi_add_single_object(struct acpi_device **child,
* Note this must be done before the get power-/wakeup_dev-flags calls.
*/
if (type == ACPI_BUS_TYPE_DEVICE)
- acpi_bus_get_status(device);
+ if (acpi_bus_get_status(device) < 0)
+ acpi_set_device_status(device, 0);
acpi_bus_get_power_flags(device);
acpi_bus_get_wakeup_device_flags(device);
@@ -1690,7 +1691,7 @@ static int acpi_bus_type_and_status(acpi_handle handle, int *type,
* acpi_add_single_object updates this once we've an acpi_device
* so that acpi_bus_get_status' quirk handling can be used.
*/
- *sta = 0;
+ *sta = ACPI_STA_DEFAULT;
break;
case ACPI_TYPE_PROCESSOR:
*type = ACPI_BUS_TYPE_PROCESSOR;
--
2.18.0
Commit 3ac6d8c787b8 ("x86/entry/64: Clear registers for
exceptions/interrupts, to reduce speculation attack surface") unintendedly
broke Xen PV virtual machines by clearing the %rbx register at the end of
xen_failsafe_callback before the latter jumps to error_exit.
error_exit expects the %rbx register to be a flag indicating whether
there should be a return to kernel mode.
This commit makes sure that the %rbx register is not cleared by
the PUSH_AND_CLEAR_REGS macro, when the macro in question is instantiated
by xen_failsafe_callback, to avoid the issue.
The impact of this bug includes (and most likely is not limited to) kernel
BUGs when wine is run in Xen PV virtual machines. One example is included
below. This example depicts the bug when wine is used to run TeamViewer
version 13's portable version under a Xen PV virtual machine using an
up-to-date version of Qubes OS R3.2.
[...........] kernel tried to execute NX-protected page - exploit attempt? (uid: 1000)
[ +0.000021] BUG: unable to handle kernel paging request at ffffffff82012480
[ +0.000020] IP: init_task+0x0/0x2ac0
[ +0.000009] PGD 200e067 P4D 200e067 PUD 200f067 PMD 2d5e067 PTE 8010000002012067
[ +0.000019] Oops: 0011 [#1] SMP DEBUG_PAGEALLOC NOPTI
[ +0.000015] Modules linked in: fuse bluetooth ecdh_generic rfkill ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c intel_rapl x86_pkg_temp_thermal crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel xen_netfront intel_rapl_perf pcspkr binfmt_misc dummy_hcd udc_core xen_gntdev xen_gntalloc u2mfn(O) xen_blkback xenfs xen_evtchn xen_privcmd xen_blkfront
[ +0.000091] CPU: 0 PID: 1350 Comm: TeamViewer.exe Tainted: G O 4.14.20-11.d10d0bb86d #15
[ +0.000018] task: ffff8800042fda00 task.stack: ffffc900006f8000
[ +0.000015] RIP: e030:init_task+0x0/0x2ac0
[ +0.000009] RSP: e02b:ffffffff82003df8 EFLAGS: 00010002
[ +0.000012] RAX: 0000000000000040 RBX: 0000000000000004 RCX: 0000000000000000
[ +0.000015] RDX: 0000000000000000 RSI: 0000000000000202 RDI: ffffffff810011aa
[ +0.000015] RBP: 000000000033eb88 R08: 0000000000000000 R09: 0000000000000000
[ +0.000015] R10: 0000000000000000 R11: ffff88000d772600 R12: 0000000000000000
[ +0.000015] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ +0.000027] FS: 000000007ffd8000(0063) GS:ffff88000f400000(006b) knlGS:0000000000000000
[ +0.000016] CS: e033 DS: 002b ES: 002b CR0: 0000000080050033
[ +0.000014] CR2: ffffffff82012480 CR3: 0000000004880000 CR4: 0000000000042660
[ +0.000025] Call Trace:
[ +0.000007] Code: ff ff ff d0 5a 1e 81 ff ff ff ff 00 00 00 00 00 00 00 00 e0 d7 04 82 ff ff ff ff e8 98 c0 0d 00 88 ff ff 01 80 00 00 00 00 00 00 <00> 00 00 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ +0.000068] RIP: init_task+0x0/0x2ac0 RSP: ffffffff82003df8
[ +0.000011] CR2: ffffffff82012480
[ +0.000010] ---[ end trace 7fb0075d4247b2a2 ]---
The commit that introduces this bug was found with git-bisect in conjunction with
some scripts and Qubes OS's cross-VM RPCs for automation, and the correction was
prepared after a manual inspection of the changes introduced by the commit in
question.
To the best of my recollection, this issue is also reproducible in an Ubuntu
18.04 LTS dom0 instance with the version of the Xen hypervisor in Ubuntu's
package repositories.
Signed-off-by: M. Vefa Bicakci <m.v.b(a)runbox.com>
Cc: Dominik Brodowski <linux(a)dominikbrodowski.net>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: xen-devel(a)lists.xenproject.org
Cc: x86(a)kernel.org
Cc: stable(a)vger.kernel.org # for v4.14 and up
Fixes: 3ac6d8c787b8 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
---
arch/x86/entry/calling.h | 4 +++-
arch/x86/entry/entry_64.S | 2 +-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 20d0885b00fb..71795767da94 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -97,7 +97,7 @@ For 32-bit we have the following conventions - kernel is built with
#define SIZEOF_PTREGS 21*8
-.macro PUSH_AND_CLEAR_REGS rdx=%rdx rax=%rax save_ret=0
+.macro PUSH_AND_CLEAR_REGS rdx=%rdx rax=%rax save_ret=0 clear_rbx=1
/*
* Push registers and sanitize registers of values that a
* speculation attack might otherwise want to exploit. The
@@ -127,7 +127,9 @@ For 32-bit we have the following conventions - kernel is built with
pushq %r11 /* pt_regs->r11 */
xorl %r11d, %r11d /* nospec r11*/
pushq %rbx /* pt_regs->rbx */
+ .if \clear_rbx
xorl %ebx, %ebx /* nospec rbx*/
+ .endif
pushq %rbp /* pt_regs->rbp */
xorl %ebp, %ebp /* nospec rbp*/
pushq %r12 /* pt_regs->r12 */
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index c7449f377a77..96e8ff34129e 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1129,7 +1129,7 @@ ENTRY(xen_failsafe_callback)
addq $0x30, %rsp
UNWIND_HINT_IRET_REGS
pushq $-1 /* orig_ax = -1 => not a system call */
- PUSH_AND_CLEAR_REGS
+ PUSH_AND_CLEAR_REGS clear_rbx=0
ENCODE_FRAME_POINTER
jmp error_exit
END(xen_failsafe_callback)
--
2.17.1