- Linux-stable-mirror - lists.linaro.org

FAILED: patch "[PATCH] drm/amdgpu: fix KV harvesting" failed to apply to 4.16-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From a97fc4e4524cda15e0176ee8853038f6b3920a9a Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher(a)amd.com>
Date: Thu, 1 Mar 2018 11:05:31 -0500
Subject: [PATCH] drm/amdgpu: fix KV harvesting
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Always set the graphics values to the max for the
asic type. E.g., some 1 RB chips are actually 1 RB chips,
others are actually harvested 2 RB chips.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=99353
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index 972d421caada..e13d9d83767b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -4358,34 +4358,8 @@ static void gfx_v7_0_gpu_early_init(struct amdgpu_device *adev)
 	case CHIP_KAVERI:
 		adev->gfx.config.max_shader_engines = 1;
 		adev->gfx.config.max_tile_pipes = 4;
-		if ((adev->pdev->device == 0x1304) ||
-		    (adev->pdev->device == 0x1305) ||
-		    (adev->pdev->device == 0x130C) ||
-		    (adev->pdev->device == 0x130F) ||
-		    (adev->pdev->device == 0x1310) ||
-		    (adev->pdev->device == 0x1311) ||
-		    (adev->pdev->device == 0x131C)) {
-			adev->gfx.config.max_cu_per_sh = 8;
-			adev->gfx.config.max_backends_per_se = 2;
-		} else if ((adev->pdev->device == 0x1309) ||
-			   (adev->pdev->device == 0x130A) ||
-			   (adev->pdev->device == 0x130D) ||
-			   (adev->pdev->device == 0x1313) ||
-			   (adev->pdev->device == 0x131D)) {
-			adev->gfx.config.max_cu_per_sh = 6;
-			adev->gfx.config.max_backends_per_se = 2;
-		} else if ((adev->pdev->device == 0x1306) ||
-			   (adev->pdev->device == 0x1307) ||
-			   (adev->pdev->device == 0x130B) ||
-			   (adev->pdev->device == 0x130E) ||
-			   (adev->pdev->device == 0x1315) ||
-			   (adev->pdev->device == 0x131B)) {
-			adev->gfx.config.max_cu_per_sh = 4;
-			adev->gfx.config.max_backends_per_se = 1;
-		} else {
-			adev->gfx.config.max_cu_per_sh = 3;
-			adev->gfx.config.max_backends_per_se = 1;
-		}
+		adev->gfx.config.max_cu_per_sh = 8;
+		adev->gfx.config.max_backends_per_se = 2;
 		adev->gfx.config.max_sh_per_se = 1;
 		adev->gfx.config.max_texture_channel_caches = 4;
 		adev->gfx.config.max_gprs = 256;
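
The reasoning in the commit message can be illustrated outside the driver. The following is a minimal sketch, not amdgpu code: it assumes a hypothetical per-chip fuse value (`disabled_rb_mask`) that reports which render backends were harvested. The software limits stay at the maximum for the ASIC type, and the usable units are derived by masking. Only `max_shader_engines` and `max_backends_per_se` echo names from the patch; every other name is invented for illustration.

```c
#include <stdint.h>

/* Hypothetical, simplified view of the per-ASIC-type limits. */
struct gfx_limits {
	unsigned int max_shader_engines;
	unsigned int max_backends_per_se;
};

/* Build a mask with the low `bits` bits set. */
static uint32_t low_bits_mask(unsigned int bits)
{
	return bits >= 32 ? UINT32_MAX : ((UINT32_C(1) << bits) - 1);
}

/*
 * Bitmap of usable render backends: everything the ASIC type could have,
 * minus whatever the (assumed) fuse mask reports as disabled.
 */
static uint32_t active_rb_bitmap(const struct gfx_limits *lim,
				 uint32_t disabled_rb_mask)
{
	uint32_t all = low_bits_mask(lim->max_shader_engines *
				     lim->max_backends_per_se);

	return ~disabled_rb_mask & all;
}

/* Count the set bits in a bitmap. */
static unsigned int count_bits(uint32_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}
```

With limits of one shader engine and two backends per engine, a disabled mask of 0x0 leaves two active backends and a mask of 0x2 leaves one. The same limits then describe both a full part and a harvested one, so no table keyed on PCI device ID is needed.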

7 years, 4 months


ext4 patches for 4.4.y

by Nathan Chancellor

Hi Greg and Ted,

I've been looking at ext4 history the past couple of days seeing if the patches that you attempted to apply from 4.17-rc1 were relevant and I noticed a couple from previous 4.x versions that seem like they should be applied here, as they are clean picks and tagged for stable:

74dae4278546 ("ext4: fix crashes in dioread_nolock mode")
c755e251357a ("ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea()")

I've been running them for the day and not had any issues (though I didn't have any before). If there are any objections, let me know!

Thanks!
Nathan

7 years, 4 months


[PATCH v3 3.18.y 0/3] 4.17-rc1 stable tagged ext4 patches

by Harsh Shandilya

These are all the ext4 patches that were tagged for -stable and failed to apply to 3.18.y.

Patch e40ff2138985 ("ext4: force revalidation of directory pointer after seekdir(2)") was Cc'd to stable as well, but it requires commit ae5e165d855d ("fs: new API for handling inode->i_version"), which is neither a stable candidate nor under 100 lines, so I've skipped e40ff2138985. If somebody can suggest a backport of the commit which doesn't require ae5e165d855d, I'll be glad.

Theodore Ts'o (3):
  ext4: add validity checks for bitmap block numbers
  ext4: fail ext4_iget for root directory if unallocated
  ext4: don't allow r/w mounts if metadata blocks overlap the superblock

 fs/ext4/balloc.c | 16 ++++++++++++++--
 fs/ext4/ialloc.c |  8 +++++++-
 fs/ext4/inode.c  |  6 ++++++
 fs/ext4/super.c  |  6 ++++++
 4 files changed, 33 insertions(+), 3 deletions(-)

--
2.15.0.2308.g658a28aa74af

7 years, 4 months


Please apply commit cf0d53ba4947 and 523184972b28 to v4.4.y, v4.9.y, v4.14.y

by Sinan Kaya

Hi Greg,

Upstream commit cf0d53ba4947 ("vfio/pci: Virtualize Maximum Read Request Size") and commit 523184972b28 ("vfio/pci: Virtualize Maximum Payload Size") fix nasty PCIe virtualization issues on platforms that support a Maximum Payload Size bigger than 128 bytes.

The issue shows up when a device is assigned to a guest machine as a passthrough device. The guest configures the MPS/MRRS settings to values that are incompatible with the parent bridge device. This causes PCIe transaction timeouts and AER errors to be spilled in the host kernel.

Please apply commits cf0d53ba4947 and 523184972b28 to all affected releases to fix the resulting regression.

Thanks,
Sinan

--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
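
For readers unfamiliar with the two settings, here is a small sketch of how MPS and MRRS are encoded in the PCIe Device Control register: Max_Payload_Size occupies bits 7:5, Max_Read_Request_Size occupies bits 14:12, and each 3-bit field encodes a size of 128 bytes shifted left by the field value. The helper names and the `mps_exceeds_parent()` check are hypothetical and only illustrate one way a guest-chosen value could be incompatible with the parent bridge. They are not taken from the vfio commits.

```c
#include <stdint.h>

#define DEVCTL_PAYLOAD_MASK	0x00e0u	/* Max_Payload_Size, bits 7:5 */
#define DEVCTL_PAYLOAD_SHIFT	5
#define DEVCTL_READRQ_MASK	0x7000u	/* Max_Read_Request_Size, bits 14:12 */
#define DEVCTL_READRQ_SHIFT	12

/* Decode the Max_Payload_Size field into bytes (128 << field). */
static unsigned int devctl_mps_bytes(uint16_t devctl)
{
	return 128u << ((devctl & DEVCTL_PAYLOAD_MASK) >> DEVCTL_PAYLOAD_SHIFT);
}

/* Decode the Max_Read_Request_Size field into bytes (128 << field). */
static unsigned int devctl_mrrs_bytes(uint16_t devctl)
{
	return 128u << ((devctl & DEVCTL_READRQ_MASK) >> DEVCTL_READRQ_SHIFT);
}

/*
 * Hypothetical check: non-zero when the device would send payloads larger
 * than the parent bridge is configured to accept.
 */
static int mps_exceeds_parent(uint16_t dev_devctl, uint16_t parent_devctl)
{
	return devctl_mps_bytes(dev_devctl) > devctl_mps_bytes(parent_devctl);
}
```

A register value of 0x2020, for example, decodes to an MPS of 256 bytes and an MRRS of 512 bytes. A device programmed that way sits above a bridge left at the 128-byte default, which is the kind of mismatch the report describes.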

7 years, 4 months


FAILED: patch "[PATCH] ALSA: pcm: Avoid potential races between OSS ioctls and" failed to apply to 4.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 02a5d6925cd34c3b774bdb8eefb057c40a30e870 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Thu, 22 Mar 2018 18:10:14 +0100
Subject: [PATCH] ALSA: pcm: Avoid potential races between OSS ioctls and
 read/write

Although we apply the params_lock mutex to the whole read and write
operations as well as snd_pcm_oss_change_params(), we may still face
some races.

First off, the params_lock is taken inside the read and write loop.
This is intentional for avoiding the too long locking, but it allows
the in-between parameter change, which might lead to invalid
pointers. We check the readiness of the stream and set up via
snd_pcm_oss_make_ready() at the beginning of read and write, but it's
called only once, by assuming that it remains ready in the rest.

Second, many ioctls that may change the actual parameters
(i.e. setting runtime->oss.params=1) aren't protected, hence they can
be processed in a half-baked state.

This patch is an attempt to plug these holes. The stream readiness
check is moved inside the read/write inner loop, so that the stream is
always set up in a proper state before further processing. Also, each
ioctl that may change the parameter is wrapped with the params_lock
for avoiding the races.

The issues were triggered by syzkaller in a few different scenarios,
particularly the one below appearing as GPF in loopback_pos_update.

Reported-by: syzbot+c4227aec125487ec3efa(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>

diff --git a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c
index 02298c9c6020..f8bfee8022e0 100644
--- a/sound/core/oss/pcm_oss.c
+++ b/sound/core/oss/pcm_oss.c
@@ -823,8 +823,8 @@ static int choose_rate(struct snd_pcm_substream *substream,
 	return snd_pcm_hw_param_near(substream, params, SNDRV_PCM_HW_PARAM_RATE, best_rate, NULL);
 }
 
-static int snd_pcm_oss_change_params(struct snd_pcm_substream *substream,
-				     bool trylock)
+/* call with params_lock held */
+static int snd_pcm_oss_change_params_locked(struct snd_pcm_substream *substream)
 {
 	struct snd_pcm_runtime *runtime = substream->runtime;
 	struct snd_pcm_hw_params *params, *sparams;
@@ -838,11 +838,8 @@ static int snd_pcm_oss_change_params(struct snd_pcm_substream *substream,
 	const struct snd_mask *sformat_mask;
 	struct snd_mask mask;
 
-	if (trylock) {
-		if (!(mutex_trylock(&runtime->oss.params_lock)))
-			return -EAGAIN;
-	} else if (mutex_lock_interruptible(&runtime->oss.params_lock))
-		return -ERESTARTSYS;
+	if (!runtime->oss.params)
+		return 0;
 	sw_params = kzalloc(sizeof(*sw_params), GFP_KERNEL);
 	params = kmalloc(sizeof(*params), GFP_KERNEL);
 	sparams = kmalloc(sizeof(*sparams), GFP_KERNEL);
@@ -1068,6 +1065,23 @@ static int snd_pcm_oss_change_params(struct snd_pcm_substream *substream,
 	kfree(sw_params);
 	kfree(params);
 	kfree(sparams);
+	return err;
+}
+
+/* this one takes the lock by itself */
+static int snd_pcm_oss_change_params(struct snd_pcm_substream *substream,
+				     bool trylock)
+{
+	struct snd_pcm_runtime *runtime = substream->runtime;
+	int err;
+
+	if (trylock) {
+		if (!(mutex_trylock(&runtime->oss.params_lock)))
+			return -EAGAIN;
+	} else if (mutex_lock_interruptible(&runtime->oss.params_lock))
+		return -ERESTARTSYS;
+
+	err = snd_pcm_oss_change_params_locked(substream);
 	mutex_unlock(&runtime->oss.params_lock);
 	return err;
 }
@@ -1096,11 +1110,14 @@ static int snd_pcm_oss_get_active_substream(struct snd_pcm_oss_file *pcm_oss_fil
 	return 0;
 }
 
+/* call with params_lock held */
 static int snd_pcm_oss_prepare(struct snd_pcm_substream *substream)
 {
 	int err;
 	struct snd_pcm_runtime *runtime = substream->runtime;
 
+	if (!runtime->oss.prepare)
+		return 0;
 	err = snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_PREPARE, NULL);
 	if (err < 0) {
 		pcm_dbg(substream->pcm,
@@ -1120,14 +1137,35 @@ static int snd_pcm_oss_make_ready(struct snd_pcm_substream *substream)
 	struct snd_pcm_runtime *runtime;
 	int err;
 
-	if (substream == NULL)
-		return 0;
 	runtime = substream->runtime;
 	if (runtime->oss.params) {
 		err = snd_pcm_oss_change_params(substream, false);
 		if (err < 0)
 			return err;
 	}
+	if (runtime->oss.prepare) {
+		if (mutex_lock_interruptible(&runtime->oss.params_lock))
+			return -ERESTARTSYS;
+		err = snd_pcm_oss_prepare(substream);
+		mutex_unlock(&runtime->oss.params_lock);
+		if (err < 0)
+			return err;
+	}
+	return 0;
+}
+
+/* call with params_lock held */
+static int snd_pcm_oss_make_ready_locked(struct snd_pcm_substream *substream)
+{
+	struct snd_pcm_runtime *runtime;
+	int err;
+
+	runtime = substream->runtime;
+	if (runtime->oss.params) {
+		err = snd_pcm_oss_change_params_locked(substream);
+		if (err < 0)
+			return err;
+	}
 	if (runtime->oss.prepare) {
 		err = snd_pcm_oss_prepare(substream);
 		if (err < 0)
@@ -1332,13 +1370,14 @@ static ssize_t snd_pcm_oss_write1(struct snd_pcm_substream *substream, const cha
 	if (atomic_read(&substream->mmap_count))
 		return -ENXIO;
 
-	if ((tmp = snd_pcm_oss_make_ready(substream)) < 0)
-		return tmp;
 	while (bytes > 0) {
 		if (mutex_lock_interruptible(&runtime->oss.params_lock)) {
 			tmp = -ERESTARTSYS;
 			break;
 		}
+		tmp = snd_pcm_oss_make_ready_locked(substream);
+		if (tmp < 0)
+			goto err;
 		if (bytes < runtime->oss.period_bytes || runtime->oss.buffer_used > 0) {
 			tmp = bytes;
 			if (tmp + runtime->oss.buffer_used > runtime->oss.period_bytes)
@@ -1439,13 +1478,14 @@ static ssize_t snd_pcm_oss_read1(struct snd_pcm_substream *substream, char __use
 	if (atomic_read(&substream->mmap_count))
 		return -ENXIO;
 
-	if ((tmp = snd_pcm_oss_make_ready(substream)) < 0)
-		return tmp;
 	while (bytes > 0) {
 		if (mutex_lock_interruptible(&runtime->oss.params_lock)) {
 			tmp = -ERESTARTSYS;
 			break;
 		}
+		tmp = snd_pcm_oss_make_ready_locked(substream);
+		if (tmp < 0)
+			goto err;
 		if (bytes < runtime->oss.period_bytes || runtime->oss.buffer_used > 0) {
 			if (runtime->oss.buffer_used == 0) {
 				tmp = snd_pcm_oss_read2(substream, runtime->oss.buffer, runtime->oss.period_bytes, 1);
@@ -1501,10 +1541,12 @@ static int snd_pcm_oss_reset(struct snd_pcm_oss_file *pcm_oss_file)
 			continue;
 		runtime = substream->runtime;
 		snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_DROP, NULL);
+		mutex_lock(&runtime->oss.params_lock);
 		runtime->oss.prepare = 1;
 		runtime->oss.buffer_used = 0;
 		runtime->oss.prev_hw_ptr_period = 0;
 		runtime->oss.period_ptr = 0;
+		mutex_unlock(&runtime->oss.params_lock);
 	}
 	return 0;
 }
@@ -1590,9 +1632,10 @@ static int snd_pcm_oss_sync(struct snd_pcm_oss_file *pcm_oss_file)
 			goto __direct;
 		if ((err = snd_pcm_oss_make_ready(substream)) < 0)
 			return err;
+		if (mutex_lock_interruptible(&runtime->oss.params_lock))
+			return -ERESTARTSYS;
 		format = snd_pcm_oss_format_from(runtime->oss.format);
 		width = snd_pcm_format_physical_width(format);
-		mutex_lock(&runtime->oss.params_lock);
 		if (runtime->oss.buffer_used > 0) {
 #ifdef OSS_DEBUG
 			pcm_dbg(substream->pcm, "sync: buffer_used\n");
@@ -1643,7 +1686,9 @@ static int snd_pcm_oss_sync(struct snd_pcm_oss_file *pcm_oss_file)
 		substream->f_flags = saved_f_flags;
 		if (err < 0)
 			return err;
+		mutex_lock(&runtime->oss.params_lock);
 		runtime->oss.prepare = 1;
+		mutex_unlock(&runtime->oss.params_lock);
 	}
 
 	substream = pcm_oss_file->streams[SNDRV_PCM_STREAM_CAPTURE];
@@ -1654,8 +1699,10 @@ static int snd_pcm_oss_sync(struct snd_pcm_oss_file *pcm_oss_file)
 		err = snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_DROP, NULL);
 		if (err < 0)
 			return err;
+		mutex_lock(&runtime->oss.params_lock);
 		runtime->oss.buffer_used = 0;
 		runtime->oss.prepare = 1;
+		mutex_unlock(&runtime->oss.params_lock);
 	}
 	return 0;
 }
@@ -1674,10 +1721,13 @@ static int snd_pcm_oss_set_rate(struct snd_pcm_oss_file *pcm_oss_file, int rate)
 			rate = 1000;
 		else if (rate > 192000)
 			rate = 192000;
+		if (mutex_lock_interruptible(&runtime->oss.params_lock))
+			return -ERESTARTSYS;
 		if (runtime->oss.rate != rate) {
 			runtime->oss.params = 1;
 			runtime->oss.rate = rate;
 		}
+		mutex_unlock(&runtime->oss.params_lock);
 	}
 	return snd_pcm_oss_get_rate(pcm_oss_file);
 }
@@ -1705,10 +1755,13 @@ static int snd_pcm_oss_set_channels(struct snd_pcm_oss_file *pcm_oss_file, unsig
 		if (substream == NULL)
 			continue;
 		runtime = substream->runtime;
+		if (mutex_lock_interruptible(&runtime->oss.params_lock))
+			return -ERESTARTSYS;
 		if (runtime->oss.channels != channels) {
 			runtime->oss.params = 1;
 			runtime->oss.channels = channels;
 		}
+		mutex_unlock(&runtime->oss.params_lock);
 	}
 	return snd_pcm_oss_get_channels(pcm_oss_file);
 }
@@ -1794,10 +1847,13 @@ static int snd_pcm_oss_set_format(struct snd_pcm_oss_file *pcm_oss_file, int for
 			if (substream == NULL)
 				continue;
 			runtime = substream->runtime;
+			if (mutex_lock_interruptible(&runtime->oss.params_lock))
+				return -ERESTARTSYS;
 			if (runtime->oss.format != format) {
 				runtime->oss.params = 1;
 				runtime->oss.format = format;
 			}
+			mutex_unlock(&runtime->oss.params_lock);
 		}
 	}
 	return snd_pcm_oss_get_format(pcm_oss_file);
@@ -1817,8 +1873,6 @@ static int snd_pcm_oss_set_subdivide1(struct snd_pcm_substream *substream, int s
 {
 	struct snd_pcm_runtime *runtime;
 
-	if (substream == NULL)
-		return 0;
 	runtime = substream->runtime;
 	if (subdivide == 0) {
 		subdivide = runtime->oss.subdivision;
@@ -1842,9 +1896,16 @@ static int snd_pcm_oss_set_subdivide(struct snd_pcm_oss_file *pcm_oss_file, int
 
 	for (idx = 1; idx >= 0; --idx) {
 		struct snd_pcm_substream *substream = pcm_oss_file->streams[idx];
+		struct snd_pcm_runtime *runtime;
+
 		if (substream == NULL)
 			continue;
-		if ((err = snd_pcm_oss_set_subdivide1(substream, subdivide)) < 0)
+		runtime = substream->runtime;
+		if (mutex_lock_interruptible(&runtime->oss.params_lock))
+			return -ERESTARTSYS;
+		err = snd_pcm_oss_set_subdivide1(substream, subdivide);
+		mutex_unlock(&runtime->oss.params_lock);
+		if (err < 0)
 			return err;
 	}
 	return err;
@@ -1854,8 +1915,6 @@ static int snd_pcm_oss_set_fragment1(struct snd_pcm_substream *substream, unsign
 {
 	struct snd_pcm_runtime *runtime;
 
-	if (substream == NULL)
-		return 0;
 	runtime = substream->runtime;
 	if (runtime->oss.subdivision || runtime->oss.fragshift)
 		return -EINVAL;
@@ -1875,9 +1934,16 @@ static int snd_pcm_oss_set_fragment(struct snd_pcm_oss_file *pcm_oss_file, unsig
 
 	for (idx = 1; idx >= 0; --idx) {
 		struct snd_pcm_substream *substream = pcm_oss_file->streams[idx];
+		struct snd_pcm_runtime *runtime;
+
 		if (substream == NULL)
 			continue;
-		if ((err = snd_pcm_oss_set_fragment1(substream, val)) < 0)
+		runtime = substream->runtime;
+		if (mutex_lock_interruptible(&runtime->oss.params_lock))
+			return -ERESTARTSYS;
+		err = snd_pcm_oss_set_fragment1(substream, val);
+		mutex_unlock(&runtime->oss.params_lock);
+		if (err < 0)
 			return err;
 	}
 	return err;
@@ -1961,6 +2027,9 @@ static int snd_pcm_oss_set_trigger(struct snd_pcm_oss_file *pcm_oss_file, int tr
 	}
 	if (psubstream) {
 		runtime = psubstream->runtime;
+		cmd = 0;
+		if (mutex_lock_interruptible(&runtime->oss.params_lock))
+			return -ERESTARTSYS;
 		if (trigger & PCM_ENABLE_OUTPUT) {
 			if (runtime->oss.trigger)
 				goto _skip1;
@@ -1978,13 +2047,19 @@ static int snd_pcm_oss_set_trigger(struct snd_pcm_oss_file *pcm_oss_file, int tr
 			cmd = SNDRV_PCM_IOCTL_DROP;
 			runtime->oss.prepare = 1;
 		}
-		err = snd_pcm_kernel_ioctl(psubstream, cmd, NULL);
-		if (err < 0)
-			return err;
-	}
  _skip1:
+		mutex_unlock(&runtime->oss.params_lock);
+		if (cmd) {
+			err = snd_pcm_kernel_ioctl(psubstream, cmd, NULL);
+			if (err < 0)
+				return err;
+		}
+	}
 	if (csubstream) {
 		runtime = csubstream->runtime;
+		cmd = 0;
+		if (mutex_lock_interruptible(&runtime->oss.params_lock))
+			return -ERESTARTSYS;
 		if (trigger & PCM_ENABLE_INPUT) {
 			if (runtime->oss.trigger)
 				goto _skip2;
@@ -1999,11 +2074,14 @@ static int snd_pcm_oss_set_trigger(struct snd_pcm_oss_file *pcm_oss_file, int tr
 			cmd = SNDRV_PCM_IOCTL_DROP;
 			runtime->oss.prepare = 1;
 		}
-		err = snd_pcm_kernel_ioctl(csubstream, cmd, NULL);
-		if (err < 0)
-			return err;
-	}
  _skip2:
+		mutex_unlock(&runtime->oss.params_lock);
+		if (cmd) {
+			err = snd_pcm_kernel_ioctl(csubstream, cmd, NULL);
+			if (err < 0)
+				return err;
+		}
+	}
 	return 0;
 }
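
The locking pattern described in the commit message can be shown in userspace. The sketch below is an assumption-laden analogue, not ALSA code. It uses a POSIX mutex in place of the kernel mutex, and `struct stream`, `stream_make_ready_locked()`, `stream_set_rate()` and `stream_write()` are hypothetical names. It shows a `_locked` helper that expects the caller to hold the lock, a setter that takes the lock itself, and a write loop that re-checks readiness on every iteration because the lock is dropped between chunks.

```c
#include <pthread.h>

struct stream {
	pthread_mutex_t params_lock;
	int params_dirty;	/* a parameter changed; setup must be redone */
	int rate;		/* requested rate */
	int applied_rate;	/* rate the stream is currently set up for */
	int setups;		/* how many times setup actually ran */
};

/* Call with params_lock held. */
static int stream_make_ready_locked(struct stream *s)
{
	if (!s->params_dirty)
		return 0;
	s->applied_rate = s->rate;
	s->params_dirty = 0;
	s->setups++;
	return 0;
}

/* Takes the lock itself, so it cannot interleave with a chunk in progress. */
static void stream_set_rate(struct stream *s, int rate)
{
	pthread_mutex_lock(&s->params_lock);
	if (s->rate != rate) {
		s->rate = rate;
		s->params_dirty = 1;
	}
	pthread_mutex_unlock(&s->params_lock);
}

/* Process `chunks` units of work; returns the number processed or an error. */
static int stream_write(struct stream *s, int chunks)
{
	int done = 0;

	while (done < chunks) {
		int err;

		pthread_mutex_lock(&s->params_lock);
		/* Re-checked every iteration: the lock was released in between. */
		err = stream_make_ready_locked(s);
		if (err < 0) {
			pthread_mutex_unlock(&s->params_lock);
			return err;
		}
		/* One period of I/O would use s->applied_rate here. */
		done++;
		pthread_mutex_unlock(&s->params_lock);
	}
	return done;
}
```

If the readiness check ran only once before the loop, a `stream_set_rate()` call between two chunks would leave `applied_rate` stale for the rest of the write. With the check inside the loop, the next chunk picks the change up.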

7 years, 4 months


[PATCH 0/4] 4.17-rc1 stable tagged ext4 patches for 3.18.y

by Harsh Shandilya

These are all the ext4 patches that were tagged for -stable and failed to apply to 3.18.y.

Side note: Patch e15dc99dbb9c ("ALSA: pcm: Fix endless loop for XRUN recovery in OSS emulation") which was tagged for -stable is not required on 3.18.y so I have skipped the backport.

Theodore Ts'o (4):
  ext4: add validity checks for bitmap block numbers
  ext4: fail ext4_iget for root directory if unallocated
  ext4: don't allow r/w mounts if metadata blocks overlap the superblock
  ext4: force revalidation of directory pointer after seekdir(2)

 fs/ext4/balloc.c | 16 ++++++++++++++--
 fs/ext4/dir.c    |  8 +++++---
 fs/ext4/ialloc.c |  7 +++++++
 fs/ext4/inode.c  |  6 ++++++
 fs/ext4/super.c  |  6 ++++++
 5 files changed, 38 insertions(+), 5 deletions(-)

--
2.15.0.2308.g658a28aa74af

7 years, 4 months


[PATCH v6] blk-mq: Avoid that a completion can be ignored for BLK_EH_RESET_TIMER

by Bart Van Assche

The blk-mq timeout handling code ignores completions that occur after
blk_mq_check_expired() has been called and before blk_mq_rq_timed_out()
has reset rq->aborted_gstate. If a block driver timeout handler always
returns BLK_EH_RESET_TIMER then the result will be that the request
never terminates.

Fix this race as follows:
- Use the deadline instead of the request generation to detect whether
  or not a request timer fired after reinitialization of a request.
- Store the request state in the lowest two bits of the deadline instead
  of the lowest two bits of 'gstate'.
- Rename MQ_RQ_STATE_MASK into RQ_STATE_MASK and change it from an
  enumeration member into a #define such that its type can be changed
  into unsigned long. That allows to write & ~RQ_STATE_MASK instead of
  ~(unsigned long)RQ_STATE_MASK.
- Remove all request member variables that became superfluous due to
  this change: gstate, gstate_seq and aborted_gstate_sync.
- Remove the request state information that became superfluous due to
  this patch, namely RQF_MQ_TIMEOUT_EXPIRED.
- Remove the code that became superfluous due to this change, namely
  the RCU lock and unlock statements in blk_mq_complete_request() and
  also the synchronize_rcu() call in the timeout handler.

Signed-off-by: Bart Van Assche <bart.vanassche(a)wdc.com>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Ming Lei <ming.lei(a)redhat.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Israel Rukshin <israelr(a)mellanox.com>
Cc: Max Gurtovoy <maxg(a)mellanox.com>
Cc: <stable(a)vger.kernel.org> # v4.16
---

Changes compared to v5:
- Restored the synchronize_rcu() call between marking a request for timeout
  handling and the actual timeout handling to avoid that timeout handling
  starts while .queue_rq() is still in progress if the timeout is very short.
- Only use cmpxchg() if another context could attempt to change the request
  state concurrently. Use WRITE_ONCE() otherwise.

Changes compared to v4:
- Addressed multiple review comments from Christoph. The most important are
  that atomic_long_cmpxchg() has been changed into cmpxchg() and also that
  there is now a nice and clean split between the legacy and blk-mq versions
  of blk_add_timer().
- Changed the patch name and modified the patch description because there is
  disagreement about whether or not the v4.16 blk-mq core can complete a
  single request twice. Kept the "Cc: stable" tag because of
  https://bugzilla.kernel.org/show_bug.cgi?id=199077.

Changes compared to v3 (see also https://www.mail-archive.com/linux-block@vger.kernel.org/msg20073.html):
- Removed the spinlock again that was introduced to protect the request state.
  v4 uses atomic_long_cmpxchg() instead.
- Split __deadline into two variables - one for the legacy block layer and one
  for blk-mq.

Changes compared to v2 (https://www.mail-archive.com/linux-block@vger.kernel.org/msg18338.html):
- Rebased and retested on top of kernel v4.16.

Changes compared to v1 (https://www.mail-archive.com/linux-block@vger.kernel.org/msg18089.html):
- Removed the gstate and aborted_gstate members of struct request and used
  the __deadline member to encode both the generation and state information.
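
The central idea in the description above, keeping the request state in the lowest two bits of the deadline word and switching state with a compare-and-exchange, can be sketched in portable C11. This is an illustrative userspace analogue and not the patch's code. It uses `<stdatomic.h>` instead of the kernel's cmpxchg(), it always uses a compare-exchange (the patch uses WRITE_ONCE() for transitions that cannot race), and `struct request`, `pack()`, `rq_state()`, `rq_deadline()` and `rq_change_state()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RQ_STATE_MASK 0x3UL	/* lowest two bits hold the state */

enum rq_state { RQ_IDLE = 0, RQ_IN_FLIGHT = 1, RQ_COMPLETE = 2 };

struct request {
	_Atomic unsigned long deadline_and_state;
};

/* Combine a deadline (low two bits discarded) with a state. */
static unsigned long pack(unsigned long deadline, enum rq_state state)
{
	return (deadline & ~RQ_STATE_MASK) | (unsigned long)state;
}

static enum rq_state rq_state(struct request *rq)
{
	return (enum rq_state)(atomic_load(&rq->deadline_and_state) &
			       RQ_STATE_MASK);
}

static unsigned long rq_deadline(struct request *rq)
{
	return atomic_load(&rq->deadline_and_state) & ~RQ_STATE_MASK;
}

/*
 * Move the request from old_state to new_state, keeping the deadline.
 * Returns true only for the single caller whose exchange succeeded.
 */
static bool rq_change_state(struct request *rq, enum rq_state old_state,
			    enum rq_state new_state)
{
	unsigned long old_val = atomic_load(&rq->deadline_and_state);

	while ((old_val & RQ_STATE_MASK) == (unsigned long)old_state) {
		unsigned long new_val = pack(old_val, new_state);

		if (atomic_compare_exchange_weak(&rq->deadline_and_state,
						 &old_val, new_val))
			return true;
		/* A failed exchange reloaded old_val; re-check the state. */
	}
	return false;
}
```

Because the state and the deadline share one word, a completion path and a timeout path that both attempt the in-flight to complete transition cannot both succeed, which is the property the description relies on. The cost in this sketch is that the deadline loses its two lowest bits of resolution.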
 block/blk-core.c       |   6 --
 block/blk-mq-debugfs.c |   1 -
 block/blk-mq.c         | 158 ++++++++++---------------------------------------
 block/blk-mq.h         |  85 +++++++++++++++++---------
 block/blk-timeout.c    |  89 ++++++++++++++++------------
 block/blk.h            |  13 ++--
 include/linux/blkdev.h |  29 +++------
 7 files changed, 154 insertions(+), 227 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index de90ecab61cd..730a8e3be7ce 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -200,12 +200,6 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
 	rq->start_time = jiffies;
 	set_start_time_ns(rq);
 	rq->part = NULL;
-	seqcount_init(&rq->gstate_seq);
-	u64_stats_init(&rq->aborted_gstate_sync);
-	/*
-	 * See comment of blk_mq_init_request
-	 */
-	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
 }
 EXPORT_SYMBOL(blk_rq_init);
 
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index adb8d6f00098..529383841b3b 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -346,7 +346,6 @@ static const char *const rqf_name[] = {
 	RQF_NAME(STATS),
 	RQF_NAME(SPECIAL_PAYLOAD),
 	RQF_NAME(ZONE_WRITE_LOCKED),
-	RQF_NAME(MQ_TIMEOUT_EXPIRED),
 	RQF_NAME(MQ_POLL_SLEPT),
 };
 #undef RQF_NAME
diff --git a/block/blk-mq.c b/block/blk-mq.c
index bb7f59d319fa..6f20845827f4 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -481,7 +481,8 @@ void blk_mq_free_request(struct request *rq)
 	if (blk_rq_rl(rq))
 		blk_put_rl(blk_rq_rl(rq));
 
-	blk_mq_rq_update_state(rq, MQ_RQ_IDLE);
+	if (!blk_mq_change_rq_state(rq, blk_mq_rq_state(rq), MQ_RQ_IDLE))
+		WARN_ON_ONCE(true);
 	if (rq->tag != -1)
 		blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
 	if (sched_tag != -1)
@@ -527,8 +528,7 @@ static void __blk_mq_complete_request(struct request *rq)
 	bool shared = false;
 	int cpu;
 
-	WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IN_FLIGHT);
-	blk_mq_rq_update_state(rq, MQ_RQ_COMPLETE);
+	WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_COMPLETE);
 
 	if (rq->internal_tag != -1)
 		blk_mq_sched_completed_request(rq);
@@ -577,36 +577,6 @@ static void hctx_lock(struct blk_mq_hw_ctx *hctx, int *srcu_idx)
 		*srcu_idx = srcu_read_lock(hctx->srcu);
 }
 
-static void blk_mq_rq_update_aborted_gstate(struct request *rq, u64 gstate)
-{
-	unsigned long flags;
-
-	/*
-	 * blk_mq_rq_aborted_gstate() is used from the completion path and
-	 * can thus be called from irq context. u64_stats_fetch in the
-	 * middle of update on the same CPU leads to lockup. Disable irq
-	 * while updating.
-	 */
-	local_irq_save(flags);
-	u64_stats_update_begin(&rq->aborted_gstate_sync);
-	rq->aborted_gstate = gstate;
-	u64_stats_update_end(&rq->aborted_gstate_sync);
-	local_irq_restore(flags);
-}
-
-static u64 blk_mq_rq_aborted_gstate(struct request *rq)
-{
-	unsigned int start;
-	u64 aborted_gstate;
-
-	do {
-		start = u64_stats_fetch_begin(&rq->aborted_gstate_sync);
-		aborted_gstate = rq->aborted_gstate;
-	} while (u64_stats_fetch_retry(&rq->aborted_gstate_sync, start));
-
-	return aborted_gstate;
-}
-
 /**
  * blk_mq_complete_request - end I/O on a request
  * @rq: the request being processed
@@ -618,27 +588,12 @@ static u64 blk_mq_rq_aborted_gstate(struct request *rq)
 void blk_mq_complete_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
-	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu);
-	int srcu_idx;
 
 	if (unlikely(blk_should_fake_timeout(q)))
 		return;
 
-	/*
-	 * If @rq->aborted_gstate equals the current instance, timeout is
-	 * claiming @rq and we lost. This is synchronized through
-	 * hctx_lock(). See blk_mq_timeout_work() for details.
-	 *
-	 * Completion path never blocks and we can directly use RCU here
-	 * instead of hctx_lock() which can be either RCU or SRCU.
-	 * However, that would complicate paths which want to synchronize
-	 * against us. Let stay in sync with the issue path so that
-	 * hctx_lock() covers both issue and completion paths.
-	 */
-	hctx_lock(hctx, &srcu_idx);
-	if (blk_mq_rq_aborted_gstate(rq) != rq->gstate)
+	if (blk_mq_change_rq_state(rq, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE))
 		__blk_mq_complete_request(rq);
-	hctx_unlock(hctx, srcu_idx);
 }
 EXPORT_SYMBOL(blk_mq_complete_request);
 
@@ -662,27 +617,7 @@ void blk_mq_start_request(struct request *rq)
 		wbt_issue(q->rq_wb, &rq->issue_stat);
 	}
 
-	WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IDLE);
-
-	/*
-	 * Mark @rq in-flight which also advances the generation number,
-	 * and register for timeout. Protect with a seqcount to allow the
-	 * timeout path to read both @rq->gstate and @rq->deadline
-	 * coherently.
-	 *
-	 * This is the only place where a request is marked in-flight. If
-	 * the timeout path reads an in-flight @rq->gstate, the
-	 * @rq->deadline it reads together under @rq->gstate_seq is
-	 * guaranteed to be the matching one.
-	 */
-	preempt_disable();
-	write_seqcount_begin(&rq->gstate_seq);
-
-	blk_mq_rq_update_state(rq, MQ_RQ_IN_FLIGHT);
-	blk_add_timer(rq);
-
-	write_seqcount_end(&rq->gstate_seq);
-	preempt_enable();
+	blk_mq_add_timer(rq, MQ_RQ_IDLE, MQ_RQ_IN_FLIGHT);
 
 	if (q->dma_drain_size && blk_rq_bytes(rq)) {
 		/*
@@ -695,22 +630,19 @@ void blk_mq_start_request(struct request *rq)
 }
 EXPORT_SYMBOL(blk_mq_start_request);
 
-/*
- * When we reach here because queue is busy, it's safe to change the state
- * to IDLE without checking @rq->aborted_gstate because we should still be
- * holding the RCU read lock and thus protected against timeout.
- */
 static void __blk_mq_requeue_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
+	enum mq_rq_state old_state = blk_mq_rq_state(rq);
 
 	blk_mq_put_driver_tag(rq);
 
 	trace_block_rq_requeue(q, rq);
 	wbt_requeue(q->rq_wb, &rq->issue_stat);
 
-	if (blk_mq_rq_state(rq) != MQ_RQ_IDLE) {
-		blk_mq_rq_update_state(rq, MQ_RQ_IDLE);
+	if (old_state != MQ_RQ_IDLE) {
+		if (!blk_mq_change_rq_state(rq, old_state, MQ_RQ_IDLE))
+			WARN_ON_ONCE(true);
 		if (q->dma_drain_size && blk_rq_bytes(rq))
 			rq->nr_phys_segments--;
 	}
@@ -819,8 +751,6 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 	const struct blk_mq_ops *ops = req->q->mq_ops;
 	enum blk_eh_timer_return ret = BLK_EH_RESET_TIMER;
 
-	req->rq_flags |= RQF_MQ_TIMEOUT_EXPIRED;
-
 	if (ops->timeout)
 		ret = ops->timeout(req, reserved);
 
@@ -829,13 +759,7 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 		__blk_mq_complete_request(req);
 		break;
 	case BLK_EH_RESET_TIMER:
-		/*
-		 * As nothing prevents from completion happening while
-		 * ->aborted_gstate is set, this may lead to ignored
-		 * completions and further spurious timeouts.
-		 */
-		blk_mq_rq_update_aborted_gstate(req, 0);
-		blk_add_timer(req);
+		blk_mq_add_timer(req, MQ_RQ_COMPLETE, MQ_RQ_IN_FLIGHT);
 		break;
 	case BLK_EH_NOT_HANDLED:
 		break;
@@ -849,48 +773,35 @@ static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 		struct request *rq, void *priv, bool reserved)
 {
 	struct blk_mq_timeout_data *data = priv;
-	unsigned long gstate, deadline;
-	int start;
-
-	might_sleep();
+	unsigned long __deadline = READ_ONCE(rq->__deadline);
+	unsigned long deadline = __deadline & ~RQ_STATE_MASK;
+	enum mq_rq_state rq_state = __deadline & RQ_STATE_MASK;
 
-	if (rq->rq_flags & RQF_MQ_TIMEOUT_EXPIRED)
-		return;
-
-	/* read coherent snapshots of @rq->state_gen and @rq->deadline */
-	while (true) {
-		start = read_seqcount_begin(&rq->gstate_seq);
-		gstate = READ_ONCE(rq->gstate);
-		deadline = blk_rq_deadline(rq);
-		if (!read_seqcount_retry(&rq->gstate_seq, start))
-			break;
-		cond_resched();
-	}
-
-	/* if in-flight && overdue, mark for abortion */
-	if ((gstate & MQ_RQ_STATE_MASK) == MQ_RQ_IN_FLIGHT &&
-	    time_after_eq(jiffies, deadline)) {
-		blk_mq_rq_update_aborted_gstate(rq, gstate);
+	rq->aborted_gstate = __deadline ^ (1ULL << 63);
+	if (time_after_eq(jiffies, deadline) && rq_state == MQ_RQ_IN_FLIGHT) {
+		rq->aborted_gstate = __deadline;
 		data->nr_expired++;
 		hctx->nr_expired++;
 	} else if (!data->next_set || time_after(data->next, deadline)) {
 		data->next = deadline;
 		data->next_set = 1;
 	}
+
 }
 
 static void blk_mq_terminate_expired(struct blk_mq_hw_ctx *hctx,
 		struct request *rq, void *priv, bool reserved)
 {
+	unsigned long old_val = rq->aborted_gstate;
+	unsigned long new_val = (rq->aborted_gstate & ~RQ_STATE_MASK) |
+		MQ_RQ_COMPLETE;
+
 	/*
-	 * We marked @rq->aborted_gstate and waited for RCU. If there were
-	 * completions that we lost to, they would have finished and
-	 * updated @rq->gstate by now; otherwise, the completion path is
-	 * now guaranteed to see @rq->aborted_gstate and yield. If
-	 * @rq->aborted_gstate still matches @rq->gstate, @rq is ours.
+	 * We marked @rq->aborted_gstate and waited for ongoing .queue_rq()
+	 * calls. If rq->__deadline has not changed that means that it is
+	 * now safe to change the request state and to handle the timeout.
 	 */
-	if (!(rq->rq_flags & RQF_MQ_TIMEOUT_EXPIRED) &&
-	    READ_ONCE(rq->gstate) == rq->aborted_gstate)
+	if (cmpxchg(&rq->__deadline, old_val, new_val) == old_val)
 		blk_mq_rq_timed_out(rq, reserved);
 }
 
@@ -929,10 +840,10 @@ static void blk_mq_timeout_work(struct work_struct *work)
 		bool has_rcu = false;
 
 		/*
-		 * Wait till everyone sees ->aborted_gstate. The
-		 * sequential waits for SRCUs aren't ideal. If this ever
-		 * becomes a problem, we can add per-hw_ctx rcu_head and
-		 * wait in parallel.
+		 * For very short timeouts it can happen that
+		 * blk_mq_check_expired() modifies the state of a request
+		 * while .queue_rq() is still in progress. Hence wait until
+		 * these .queue_rq() calls have finished.
 		 */
 		queue_for_each_hw_ctx(q, hctx, i) {
 			if (!hctx->nr_expired)
@@ -948,7 +859,7 @@ static void blk_mq_timeout_work(struct work_struct *work)
 		if (has_rcu)
 			synchronize_rcu();
 
-		/* terminate the ones we won */
+		/* Terminate the requests marked by blk_mq_check_expired(). */
 		blk_mq_queue_tag_busy_iter(q, blk_mq_terminate_expired, NULL);
 	}
 
@@ -2060,15 +1971,6 @@ static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
 			return ret;
 	}
 
-	seqcount_init(&rq->gstate_seq);
-	u64_stats_init(&rq->aborted_gstate_sync);
-	/*
-	 * start gstate with gen 1 instead of 0, otherwise it will be equal
-	 * to aborted_gstate, and be identified timed out by
-	 * blk_mq_terminate_expired.
-	 */
-	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
-
 	return 0;
 }
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 88c558f71819..66efc8a3988b 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -27,18 +27,11 @@ struct blk_mq_ctx {
 	struct kobject		kobj;
 } ____cacheline_aligned_in_smp;
 
-/*
- * Bits for request->gstate. The lower two bits carry MQ_RQ_* state value
- * and the upper bits the generation number.
- */
+/* Lowest two bits of request->__deadline. */
 enum mq_rq_state {
 	MQ_RQ_IDLE		= 0,
 	MQ_RQ_IN_FLIGHT		= 1,
 	MQ_RQ_COMPLETE		= 2,
-
-	MQ_RQ_STATE_BITS	= 2,
-	MQ_RQ_STATE_MASK	= (1 << MQ_RQ_STATE_BITS) - 1,
-	MQ_RQ_GEN_INC		= 1 << MQ_RQ_STATE_BITS,
 };
 
 void blk_mq_freeze_queue(struct request_queue *q);
@@ -100,37 +93,73 @@ extern void blk_mq_hctx_kobj_init(struct blk_mq_hw_ctx *hctx);
 
 void blk_mq_release(struct request_queue *q);
 
+/*
+ * If the state of request @rq equals @old_state, update deadline and request
+ * state atomically to @time and @new_state. blk-mq only. cmpxchg() is only
+ * used if there could be a concurrent update attempt from another context.
+ */
+static inline bool blk_mq_rq_set_deadline(struct request *rq,
+					  unsigned long new_time,
+					  enum mq_rq_state old_state,
+					  enum mq_rq_state new_state)
+{
+	unsigned long old_val, new_val;
+
+	if (old_state != MQ_RQ_IN_FLIGHT) {
+		old_val = READ_ONCE(rq->__deadline);
+		if ((old_val & RQ_STATE_MASK) != old_state)
+			return false;
+		new_val = (new_time & ~RQ_STATE_MASK) |
+			  (new_state & RQ_STATE_MASK);
+		WRITE_ONCE(rq->__deadline, new_val);
+		return true;
+	}
+
+	do {
+		old_val = READ_ONCE(rq->__deadline);
+		if ((old_val & RQ_STATE_MASK) != old_state)
+			return false;
+		new_val = (new_time & ~RQ_STATE_MASK) |
+			  (new_state & RQ_STATE_MASK);
+	} while (cmpxchg(&rq->__deadline, old_val, new_val) != old_val);
+
+	return true;
+}
+
 /**
  * blk_mq_rq_state() - read the current MQ_RQ_* state of a request
  * @rq: target request.
  */
-static inline int blk_mq_rq_state(struct request *rq)
+static inline enum mq_rq_state blk_mq_rq_state(struct request *rq)
 {
-	return READ_ONCE(rq->gstate) & MQ_RQ_STATE_MASK;
+	return READ_ONCE(rq->__deadline) & RQ_STATE_MASK;
 }
 
 /**
- * blk_mq_rq_update_state() - set the current MQ_RQ_* state of a request
- * @rq: target request.
- * @state: new state to set.
+ * blk_mq_change_rq_state - atomically test and set request state
+ * @rq: Request pointer.
+ * @old_state: Old request state.
+ * @new_state: New request state.
  *
- * Set @rq's state to @state. The caller is responsible for ensuring that
- * there are no other updaters. A request can transition into IN_FLIGHT
- * only from IDLE and doing so increments the generation number.
+ * Returns %true if and only if the old state was @old and if the state has
+ * been changed into @new.
  */
-static inline void blk_mq_rq_update_state(struct request *rq,
-					  enum mq_rq_state state)
+static inline bool blk_mq_change_rq_state(struct request *rq,
+					  enum mq_rq_state old_state,
+					  enum mq_rq_state new_state)
 {
-	u64 old_val = READ_ONCE(rq->gstate);
-	u64 new_val = (old_val & ~MQ_RQ_STATE_MASK) | state;
-
-	if (state == MQ_RQ_IN_FLIGHT) {
-		WARN_ON_ONCE((old_val & MQ_RQ_STATE_MASK) != MQ_RQ_IDLE);
-		new_val += MQ_RQ_GEN_INC;
-	}
-
-	/* avoid exposing interim values */
-	WRITE_ONCE(rq->gstate, new_val);
+	unsigned long old_val = (READ_ONCE(rq->__deadline) & ~RQ_STATE_MASK) |
+		old_state;
+	unsigned long new_val = (old_val & ~RQ_STATE_MASK) | new_state;
+
+	/*
+	 * For transitions from state in-flight to another state cmpxchg() must
+	 * be used. For other state transitions it is safe to use WRITE_ONCE().
+	 */
+	if (old_state == MQ_RQ_IN_FLIGHT)
+		return cmpxchg(&rq->__deadline, old_val, new_val) == old_val;
+	WRITE_ONCE(rq->__deadline, new_val);
+	return true;
 }
 
 static inline struct blk_mq_ctx *__blk_mq_get_ctx(struct request_queue *q,
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 50a191720055..e98da6db7d4b 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -165,8 +165,9 @@ void blk_abort_request(struct request *req)
 		 * immediately and that scan sees the new timeout value.
 		 * No need for fancy synchronizations.
*/ - blk_rq_set_deadline(req, jiffies); - kblockd_schedule_work(&req->q->timeout_work); + if (blk_mq_rq_set_deadline(req, jiffies, MQ_RQ_IN_FLIGHT, + MQ_RQ_IN_FLIGHT)) + kblockd_schedule_work(&req->q->timeout_work); } else { if (blk_mark_rq_complete(req)) return; @@ -187,52 +188,17 @@ unsigned long blk_rq_timeout(unsigned long timeout) return timeout; } -/** - * blk_add_timer - Start timeout timer for a single request - * @req: request that is about to start running. - * - * Notes: - * Each request has its own timer, and as it is added to the queue, we - * set up the timer. When the request completes, we cancel the timer. - */ -void blk_add_timer(struct request *req) +static void __blk_add_timer(struct request *req) { struct request_queue *q = req->q; unsigned long expiry; - if (!q->mq_ops) - lockdep_assert_held(q->queue_lock); - - /* blk-mq has its own handler, so we don't need ->rq_timed_out_fn */ - if (!q->mq_ops && !q->rq_timed_out_fn) - return; - - BUG_ON(!list_empty(&req->timeout_list)); - - /* - * Some LLDs, like scsi, peek at the timeout to prevent a - * command from being retried forever. - */ - if (!req->timeout) - req->timeout = q->rq_timeout; - - blk_rq_set_deadline(req, jiffies + req->timeout); - req->rq_flags &= ~RQF_MQ_TIMEOUT_EXPIRED; - - /* - * Only the non-mq case needs to add the request to a protected list. - * For the mq case we simply scan the tag map. - */ - if (!q->mq_ops) - list_add_tail(&req->timeout_list, &req->q->timeout_list); - /* * If the timer isn't already pending or this timeout is earlier * than an existing one, modify the timer. Round up to next nearest * second. 
*/ expiry = blk_rq_timeout(round_jiffies_up(blk_rq_deadline(req))); - if (!timer_pending(&q->timeout) || time_before(expiry, q->timeout.expires)) { unsigned long diff = q->timeout.expires - expiry; @@ -247,5 +213,52 @@ void blk_add_timer(struct request *req) if (!timer_pending(&q->timeout) || (diff >= HZ / 2)) mod_timer(&q->timeout, expiry); } +} + +/** + * blk_add_timer - Start timeout timer for a single request + * @req: request that is about to start running. + * + * Notes: + * Each request has its own timer, and as it is added to the queue, we + * set up the timer. When the request completes, we cancel the timer. + */ +void blk_add_timer(struct request *req) +{ + struct request_queue *q = req->q; + + lockdep_assert_held(q->queue_lock); + if (!q->rq_timed_out_fn) + return; + if (!req->timeout) + req->timeout = q->rq_timeout; + + blk_rq_set_deadline(req, jiffies + req->timeout); + list_add_tail(&req->timeout_list, &req->q->timeout_list); + + return __blk_add_timer(req); +} + +/** + * blk_mq_add_timer - set the deadline for a single request + * @req: request for which to set the deadline. + * @old: current request state. + * @new: new request state. + * + * Sets the deadline of a request if and only if it has state @old and + * at the same time changes the request state from @old into @new. The caller + * must guarantee that the request state won't be modified while this function + * is in progress. 
+ */ +void blk_mq_add_timer(struct request *req, enum mq_rq_state old, + enum mq_rq_state new) +{ + struct request_queue *q = req->q; + + if (!req->timeout) + req->timeout = q->rq_timeout; + if (!blk_mq_rq_set_deadline(req, jiffies + req->timeout, old, new)) + WARN_ON_ONCE(true); + return __blk_add_timer(req); } diff --git a/block/blk.h b/block/blk.h index b034fd2460c4..7cd64f533a46 100644 --- a/block/blk.h +++ b/block/blk.h @@ -170,6 +170,8 @@ static inline bool bio_integrity_endio(struct bio *bio) void blk_timeout_work(struct work_struct *work); unsigned long blk_rq_timeout(unsigned long timeout); void blk_add_timer(struct request *req); +void blk_mq_add_timer(struct request *req, enum mq_rq_state old, + enum mq_rq_state new); void blk_delete_timer(struct request *); @@ -308,18 +310,19 @@ static inline void req_set_nomerge(struct request_queue *q, struct request *req) } /* - * Steal a bit from this field for legacy IO path atomic IO marking. Note that - * setting the deadline clears the bottom bit, potentially clearing the - * completed bit. The user has to be OK with this (current ones are fine). + * Steal two bits from this field. The legacy IO path uses the lowest bit for + * atomic IO marking. Note that setting the deadline clears the bottom bit, + * potentially clearing the completed bit. The current legacy block layer is + * fine with that. Must be called with the request queue lock held. 
*/ static inline void blk_rq_set_deadline(struct request *rq, unsigned long time) { - rq->__deadline = time & ~0x1UL; + rq->__deadline = time & ~RQ_STATE_MASK; } static inline unsigned long blk_rq_deadline(struct request *rq) { - return rq->__deadline & ~0x1UL; + return rq->__deadline & ~RQ_STATE_MASK; } /* diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index b7681f3ee793..51cd69f14537 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -27,8 +27,6 @@ #include <linux/percpu-refcount.h> #include <linux/scatterlist.h> #include <linux/blkzoned.h> -#include <linux/seqlock.h> -#include <linux/u64_stats_sync.h> struct module; struct scsi_ioctl_command; @@ -125,10 +123,8 @@ typedef __u32 __bitwise req_flags_t; #define RQF_SPECIAL_PAYLOAD ((__force req_flags_t)(1 << 18)) /* The per-zone write lock is held for this request */ #define RQF_ZONE_WRITE_LOCKED ((__force req_flags_t)(1 << 19)) -/* timeout is expired */ -#define RQF_MQ_TIMEOUT_EXPIRED ((__force req_flags_t)(1 << 20)) /* already slept for hybrid poll */ -#define RQF_MQ_POLL_SLEPT ((__force req_flags_t)(1 << 21)) +#define RQF_MQ_POLL_SLEPT ((__force req_flags_t)(1 << 20)) /* flags that prevent us from merging requests: */ #define RQF_NOMERGE_FLAGS \ @@ -225,28 +221,19 @@ struct request { unsigned int extra_len; /* length of alignment and padding */ - /* - * On blk-mq, the lower bits of ->gstate (generation number and - * state) carry the MQ_RQ_* state value and the upper bits the - * generation number which is monotonically incremented and used to - * distinguish the reuse instances. - * - * ->gstate_seq allows updates to ->gstate and other fields - * (currently ->deadline) during request start to be read - * atomically from the timeout path, so that it can operate on a - * coherent set of information. - */ - seqcount_t gstate_seq; - u64 gstate; - /* * ->aborted_gstate is used by the timeout to claim a specific * recycle instance of this request. See blk_mq_timeout_work().
*/ - struct u64_stats_sync aborted_gstate_sync; u64 aborted_gstate; - /* access through blk_rq_set_deadline, blk_rq_deadline */ + /* + * Access through blk_rq_deadline() and blk_rq_set_deadline(), + * blk_mark_rq_complete(), blk_clear_rq_complete() and + * blk_rq_is_complete() for legacy queues or blk_mq_rq_state() for + * blk-mq queues. + */ +#define RQ_STATE_MASK 0x3UL unsigned long __deadline; struct list_head timeout_list; -- 2.16.3
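The central idea of the patch above, storing the request state in the two low bits of the deadline word so that both can be changed by one compare-and-exchange, can be sketched in userspace. This is only an illustration under stated assumptions: C11 atomics stand in for the kernel's READ_ONCE() and cmpxchg(), and the names pack(), deadline_of(), state_of() and set_deadline() are hypothetical, not kernel functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Mirrors RQ_STATE_MASK from the patch: the two low bits hold the state. */
#define STATE_MASK 0x3UL

/* Same values as enum mq_rq_state in the patch; the names are illustrative. */
enum rq_state { RQ_IDLE = 0, RQ_IN_FLIGHT = 1, RQ_COMPLETE = 2 };

/* Combine a deadline and a state; the deadline loses its two low bits. */
static unsigned long pack(unsigned long deadline, enum rq_state state)
{
	assert(((unsigned long)state & ~STATE_MASK) == 0);
	return (deadline & ~STATE_MASK) | (unsigned long)state;
}

static unsigned long deadline_of(unsigned long word)
{
	return word & ~STATE_MASK;
}

static enum rq_state state_of(unsigned long word)
{
	return (enum rq_state)(word & STATE_MASK);
}

/*
 * Set a new deadline and state in one atomic step, but only if the
 * current state equals old_state. Returns false otherwise.
 */
static bool set_deadline(_Atomic unsigned long *word, unsigned long new_time,
			 enum rq_state old_state, enum rq_state new_state)
{
	unsigned long old_val = atomic_load(word);

	do {
		/* Another context changed the state first: give up. */
		if (state_of(old_val) != old_state)
			return false;
		/* On failure the CAS reloads old_val and the loop re-checks. */
	} while (!atomic_compare_exchange_weak(word, &old_val,
					       pack(new_time, new_state)));
	return true;
}
```

Because the deadline and the state share one word, a timeout handler and a completion handler that race on the same request cannot both win: whichever compare-and-exchange runs second sees a changed word and fails.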

7 years, 4 months



[PATCH v2] target: Fix Fortify_panic kernel exception

by Bryant G. Ly

The bug is in the memcmp() call, which would only be valid if the length
passed in were guaranteed to be 1. The second pointer passed in refers to
a single zero byte, which can be smaller than cmd->data_length, so
memcmp() reads past it and triggers a fortify_panic.

The fix is to use memchr_inv() instead to check whether the buffer
contains any non-zero byte. This avoids the out-of-bounds read that is
the reason for the fortify_panic.

The bug was found by running a block backstore via LIO.

[ 496.212958] Call Trace:
[ 496.212960] [c0000007e58e3800] [c000000000cbbefc] fortify_panic+0x24/0x38 (unreliable)
[ 496.212965] [c0000007e58e3860] [d00000000f150c28] iblock_execute_write_same+0x3b8/0x3c0 [target_core_iblock]
[ 496.212976] [c0000007e58e3910] [d000000006c737d4] __target_execute_cmd+0x54/0x150 [target_core_mod]
[ 496.212982] [c0000007e58e3940] [d000000006d32ce4] ibmvscsis_write_pending+0x74/0xe0 [ibmvscsis]
[ 496.212991] [c0000007e58e39b0] [d000000006c74fc8] transport_generic_new_cmd+0x318/0x370 [target_core_mod]
[ 496.213001] [c0000007e58e3a30] [d000000006c75084] transport_handle_cdb_direct+0x64/0xd0 [target_core_mod]
[ 496.213011] [c0000007e58e3aa0] [d000000006c75298] target_submit_cmd_map_sgls+0x1a8/0x320 [target_core_mod]
[ 496.213021] [c0000007e58e3b30] [d000000006c75458] target_submit_cmd+0x48/0x60 [target_core_mod]
[ 496.213026] [c0000007e58e3bd0] [d000000006d34c20] ibmvscsis_scheduler+0x370/0x600 [ibmvscsis]
[ 496.213031] [c0000007e58e3c90] [c00000000013135c] process_one_work+0x1ec/0x580
[ 496.213035] [c0000007e58e3d20] [c000000000131798] worker_thread+0xa8/0x600
[ 496.213039] [c0000007e58e3dc0] [c00000000013a468] kthread+0x168/0x1b0
[ 496.213044] [c0000007e58e3e30] [c00000000000b528] ret_from_kernel_thread+0x5c/0xb4

Fixes: 2237498f0b5c ("target/iblock: Convert WRITE_SAME to blkdev_issue_zeroout")
Signed-off-by: Bryant G. Ly <bryantly(a)linux.vnet.ibm.com>
Reviewed-by: Steven Royer <seroyer(a)linux.vnet.ibm.com>
Tested-by: Taylor Jakobson <tjakobs(a)us.ibm.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Nicholas Bellinger <nab(a)linux-iscsi.org>
Cc: <stable(a)vger.kernel.org>
---
 drivers/target/target_core_iblock.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 07c814c..6042901 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -427,8 +427,8 @@ iblock_execute_zero_out(struct block_device *bdev, struct se_cmd *cmd)
 {
 	struct se_device *dev = cmd->se_dev;
 	struct scatterlist *sg = &cmd->t_data_sg[0];
-	unsigned char *buf, zero = 0x00, *p = &zero;
-	int rc, ret;
+	unsigned char *buf, *not_zero;
+	int ret;

 	buf = kmap(sg_page(sg)) + sg->offset;
 	if (!buf)
@@ -437,10 +437,10 @@ iblock_execute_zero_out(struct block_device *bdev, struct se_cmd *cmd)
 	 * Fall back to block_execute_write_same() slow-path if
 	 * incoming WRITE_SAME payload does not contain zeros.
 	 */
-	rc = memcmp(buf, p, cmd->data_length);
+	not_zero = memchr_inv(buf, 0x00, cmd->data_length);
 	kunmap(sg_page(sg));

-	if (rc)
+	if (not_zero)
 		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;

 	ret = blkdev_issue_zeroout(bdev,
--
2.7.2
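The difference between the two checks in the patch above can be sketched in userspace C. memcmp(buf, p, len) compares len bytes of both arguments, so pointing p at a single zero byte is only valid for len == 1; scanning the payload for the first byte that differs from zero never reads outside the payload. The helper first_not_byte() below is a hypothetical, simplified stand-in for the kernel's memchr_inv(), and payload_is_zero() is likewise an illustrative name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return a pointer to the first byte in buf[0..len) that differs from c,
 * or NULL if every byte equals c. Simplified stand-in for memchr_inv().
 */
static const void *first_not_byte(const void *buf, int c, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i < len; i++) {
		if (p[i] != (unsigned char)c)
			return &p[i];
	}
	return NULL;
}

/* Non-zero if the whole payload is zero-filled. Only buf is ever read. */
static int payload_is_zero(const void *buf, size_t len)
{
	return first_not_byte(buf, 0x00, len) == NULL;
}
```

Only the payload buffer itself is read, so its length is the only bound that has to be right.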

7 years, 4 months


[patch 15/15] mm/filemap.c: fix NULL pointer in page_cache_tree_insert()

by akpm＠linux-foundation.org

From: Matthew Wilcox <mawilcox(a)microsoft.com>
Subject: mm/filemap.c: fix NULL pointer in page_cache_tree_insert()

f2fs specifies the __GFP_ZERO flag for allocating some of its pages.
Unfortunately, the page cache also uses the mapping's GFP flags for
allocating radix tree nodes. It always masked off the __GFP_HIGHMEM
flag, and masks off __GFP_ZERO in some paths, but not all. That causes
radix tree nodes to be allocated with a NULL list_head, which causes
backtraces like:

[<ffffff80086f4de0>] __list_del_entry+0x30/0xd0
[<ffffff8008362018>] list_lru_del+0xac/0x1ac
[<ffffff800830f04c>] page_cache_tree_insert+0xd8/0x110

The __GFP_DMA and __GFP_DMA32 flags would also be able to sneak through
if they are ever used. Fix them all by using GFP_RECLAIM_MASK at the
innermost location, and remove it from earlier in the callchain.

Link: http://lkml.kernel.org/r/20180411060320.14458-2-willy@infradead.org
Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check")
Signed-off-by: Matthew Wilcox <mawilcox(a)microsoft.com>
Reported-by: Chris Fries <cfries(a)google.com>
Debugged-by: Minchan Kim <minchan(a)kernel.org>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/filemap.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff -puN mm/filemap.c~fix-null-pointer-in-page_cache_tree_insert mm/filemap.c
--- a/mm/filemap.c~fix-null-pointer-in-page_cache_tree_insert
+++ a/mm/filemap.c
@@ -786,7 +786,7 @@ int replace_page_cache_page(struct page
 	VM_BUG_ON_PAGE(!PageLocked(new), new);
 	VM_BUG_ON_PAGE(new->mapping, new);

-	error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
+	error = radix_tree_preload(gfp_mask & GFP_RECLAIM_MASK);
 	if (!error) {
 		struct address_space *mapping = old->mapping;
 		void (*freepage)(struct page *);
@@ -842,7 +842,7 @@ static int __add_to_page_cache_locked(st
 			return error;
 	}

-	error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM);
+	error = radix_tree_maybe_preload(gfp_mask & GFP_RECLAIM_MASK);
 	if (error) {
 		if (!huge)
 			mem_cgroup_cancel_charge(page, memcg, false);
@@ -1585,8 +1585,7 @@ no_page:
 		if (fgp_flags & FGP_ACCESSED)
 			__SetPageReferenced(page);

-		err = add_to_page_cache_lru(page, mapping, offset,
-				gfp_mask & GFP_RECLAIM_MASK);
+		err = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
 		if (unlikely(err)) {
 			put_page(page);
 			page = NULL;
@@ -2387,7 +2386,7 @@ static int page_cache_read(struct file *
 		if (!page)
 			return -ENOMEM;

-		ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask & GFP_KERNEL);
+		ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
 		if (ret == 0)
 			ret = mapping->a_ops->readpage(file, page);
 		else if (ret == -EEXIST)
_
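The reason the patch above switches from "& ~__GFP_HIGHMEM" to "& GFP_RECLAIM_MASK" is the difference between a deny-list and an allow-list: clearing one known-bad flag lets every other unexpected flag through, while keeping only known-good flags does not. A minimal sketch follows, assuming made-up bit values; the F_* constants, RECLAIM_MASK and both helper names are hypothetical and do not match the kernel's real GFP definitions.

```c
#include <assert.h>

/* Hypothetical bit values for illustration; not the kernel's real ones. */
#define F_DMA      0x01u
#define F_HIGHMEM  0x02u
#define F_DMA32    0x04u
#define F_ZERO     0x08u
#define F_IO       0x10u
#define F_FS       0x20u
#define F_RECLAIM  0x40u

/* Allow-list: flags that only influence reclaim behaviour. */
#define RECLAIM_MASK (F_IO | F_FS | F_RECLAIM)

/* The allow-list must not contain any zone or zeroing flag. */
static_assert((RECLAIM_MASK & (F_DMA | F_HIGHMEM | F_DMA32 | F_ZERO)) == 0,
	      "allow-list leaks a zone or zeroing flag");

/* Deny-list masking: strips one flag, everything else passes. */
static unsigned int mask_deny_highmem(unsigned int gfp)
{
	return gfp & ~F_HIGHMEM;
}

/* Allow-list masking: only reclaim-related flags survive. */
static unsigned int mask_allow_reclaim(unsigned int gfp)
{
	return gfp & RECLAIM_MASK;
}
```

With the deny-list, a caller that passes the zeroing flag still reaches the node allocator with it set; with the allow-list it cannot.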

7 years, 4 months


[patch 09/15] autofs: mount point create should honour passed in mode

by akpm＠linux-foundation.org

From: Ian Kent <raven(a)themaw.net>
Subject: autofs: mount point create should honour passed in mode

The autofs file system mkdir inode operation blindly sets the created
directory mode to S_IFDIR | 0555, ignoring the passed in mode, which can
cause selinux dac_override denials.

But the function also checks if the caller is the daemon (as no-one else
should be able to do anything here) so there's no point in not honouring
the passed in mode, allowing the daemon to set appropriate mode when
required.

Link: http://lkml.kernel.org/r/152361593601.8051.14014139124905996173.stgit@pluto…
Signed-off-by: Ian Kent <raven(a)themaw.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 fs/autofs4/root.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN fs/autofs4/root.c~autofs-mount-point-create-should-honour-passed-in-mode fs/autofs4/root.c
--- a/fs/autofs4/root.c~autofs-mount-point-create-should-honour-passed-in-mode
+++ a/fs/autofs4/root.c
@@ -749,7 +749,7 @@ static int autofs4_dir_mkdir(struct inod

 	autofs4_del_active(dentry);

-	inode = autofs4_get_inode(dir->i_sb, S_IFDIR | 0555);
+	inode = autofs4_get_inode(dir->i_sb, S_IFDIR | mode);
 	if (!inode)
 		return -ENOMEM;
 	d_add(dentry, inode);
_

7 years, 4 months


[patch 06/15] rapidio: fix rio_dma_transfer error handling

by akpm＠linux-foundation.org

From: Ioan Nicu <ioan.nicu.ext(a)nokia.com>
Subject: rapidio: fix rio_dma_transfer error handling

Some of the mport_dma_req structure members were initialized late inside
the do_dma_request() function, just before submitting the request to the
dma engine. But we have some error branches before that. In case of such
an error, the code would return on the error path and trigger the calling
of dma_req_free() with a req structure which is not completely
initialized. This causes a NULL pointer dereference in dma_req_free().

This patch fixes these error branches by making sure that all necessary
mport_dma_req structure members are initialized in rio_dma_transfer()
immediately after the request structure gets allocated.

Link: http://lkml.kernel.org/r/20180412150605.GA31409@nokia.com
Fixes: bbd876adb8c72 ("rapidio: use a reference count for struct mport_dma_req")
Signed-off-by: Ioan Nicu <ioan.nicu.ext(a)nokia.com>
Tested-by: Alexander Sverdlin <alexander.sverdlin(a)nokia.com>
Acked-by: Alexandre Bounine <alex.bou9(a)gmail.com>
Cc: Barry Wood <barry.wood(a)idt.com>
Cc: Matt Porter <mporter(a)kernel.crashing.org>
Cc: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: Frank Kunz <frank.kunz(a)nokia.com>
Cc: <stable(a)vger.kernel.org> [4.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 drivers/rapidio/devices/rio_mport_cdev.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff -puN drivers/rapidio/devices/rio_mport_cdev.c~rapidio-fix-rio_dma_transfer-error-handling drivers/rapidio/devices/rio_mport_cdev.c
--- a/drivers/rapidio/devices/rio_mport_cdev.c~rapidio-fix-rio_dma_transfer-error-handling
+++ a/drivers/rapidio/devices/rio_mport_cdev.c
@@ -740,10 +740,7 @@ static int do_dma_request(struct mport_d
 	tx->callback = dma_xfer_callback;
 	tx->callback_param = req;

-	req->dmach = chan;
-	req->sync = sync;
 	req->status = DMA_IN_PROGRESS;
-	init_completion(&req->req_comp);
 	kref_get(&req->refcount);

 	cookie = dmaengine_submit(tx);
@@ -831,13 +828,20 @@ rio_dma_transfer(struct file *filp, u32
 	if (!req)
 		return -ENOMEM;

-	kref_init(&req->refcount);
-
 	ret = get_dma_channel(priv);
 	if (ret) {
 		kfree(req);
 		return ret;
 	}
+	chan = priv->dmach;
+
+	kref_init(&req->refcount);
+	init_completion(&req->req_comp);
+	req->dir = dir;
+	req->filp = filp;
+	req->priv = priv;
+	req->dmach = chan;
+	req->sync = sync;

 	/*
 	 * If parameter loc_addr != NULL, we are transferring data from/to
@@ -925,11 +929,6 @@ rio_dma_transfer(struct file *filp, u32
 			xfer->offset, xfer->length);
 	}

-	req->dir = dir;
-	req->filp = filp;
-	req->priv = priv;
-	chan = priv->dmach;
-
 	nents = dma_map_sg(chan->device->dev,
 			   req->sgt.sgl, req->sgt.nents, dir);
 	if (nents == 0) {
_
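The pattern behind the fix above is general: every field that the release function dereferences should be set immediately after allocation, before the first branch that can fail. The sketch below is a userspace illustration only; struct channel, struct dma_req, req_free() and transfer() are hypothetical names that loosely mirror the patch and are not the driver's real types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical channel with a simple user count. */
struct channel {
	int users;
};

/* Hypothetical request; req_free() dereferences chan. */
struct dma_req {
	struct channel *chan;
	int refcount;
};

/* Release path: relies on req->chan having been initialised. */
static void req_free(struct dma_req *req)
{
	assert(req->chan != NULL);
	req->chan->users--;
	free(req);
}

/*
 * Allocate a request and fill in everything the release path needs
 * before any step that can fail. fail_early simulates an error branch
 * that is reached before the request would be submitted.
 */
static int transfer(struct channel *chan, bool fail_early)
{
	struct dma_req *req = calloc(1, sizeof(*req));

	if (!req)
		return -1;

	chan->users++;
	req->chan = chan;	/* set up front, not just before submit */
	req->refcount = 1;

	if (fail_early) {
		req_free(req);	/* safe: req->chan is already valid */
		return -2;
	}

	/* ... the request would be submitted here ... */
	req_free(req);
	return 0;
}
```

If the assignment to req->chan were moved below the fail_early branch, as in the code before the patch, the error path would hand req_free() a request whose chan is still NULL.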

7 years, 4 months


[patch 04/15] writeback: safer lock nesting

by akpm＠linux-foundation.org

From: Greg Thelen <gthelen(a)google.com>
Subject: writeback: safer lock nesting

lock_page_memcg()/unlock_page_memcg() use spin_lock_irqsave/restore() if
the page's memcg is undergoing move accounting, which occurs when a
process leaves its memcg for a new one that has
memory.move_charge_at_immigrate set.

unlocked_inode_to_wb_begin,end() use spin_lock_irq/spin_unlock_irq() if
the given inode is switching writeback domains. Switches occur when
enough writes are issued from a new domain.

This existing pattern is thus suspicious:

    lock_page_memcg(page);
    unlocked_inode_to_wb_begin(inode, &locked);
    ...
    unlocked_inode_to_wb_end(inode, locked);
    unlock_page_memcg(page);

If an inode switch and a process memcg migration are both in flight, then
unlocked_inode_to_wb_end() will unconditionally enable interrupts while
still holding the lock_page_memcg() irq spinlock. This suggests the
possibility of deadlock if an interrupt occurs before
unlock_page_memcg().

    truncate
    __cancel_dirty_page
    lock_page_memcg
    unlocked_inode_to_wb_begin
    unlocked_inode_to_wb_end
    <interrupts mistakenly enabled>
                                    <interrupt>
                                    end_page_writeback
                                    test_clear_page_writeback
                                    lock_page_memcg
                                    <deadlock>
    unlock_page_memcg

Due to configuration limitations this deadlock is not currently possible
because we don't mix cgroup writeback (a cgroupv2 feature) and
memory.move_charge_at_immigrate (a cgroupv1 feature).

If the kernel is hacked to always claim inode switching and memcg
moving_account, then this script triggers lockup in less than a minute:

  cd /mnt/cgroup/memory
  mkdir a b
  echo 1 > a/memory.move_charge_at_immigrate
  echo 1 > b/memory.move_charge_at_immigrate
  (
    echo $BASHPID > a/cgroup.procs
    while true; do
      dd if=/dev/zero of=/mnt/big bs=1M count=256
    done
  ) &
  while true; do
    sync
  done &
  sleep 1h &
  SLEEP=$!
  while true; do
    echo $SLEEP > a/cgroup.procs
    echo $SLEEP > b/cgroup.procs
  done

The deadlock does not seem possible, so it's debatable if there's any reason to modify the kernel.
I suggest we should to prevent future surprises. And Wang Long said "this deadlock occurs three times in our environment", so there's more reason to apply this, even to stable. Stable 4.4 has minor conflicts applying this patch. For a clean 4.4 patch see "[PATCH for-4.4] writeback: safer lock nesting" https://lkml.org/lkml/2018/4/11/146 Wang Long said "this deadlock occurs three times in our environment" [gthelen(a)google.com: v4] Link: http://lkml.kernel.org/r/20180411084653.254724-1-gthelen@google.com [akpm(a)linux-foundation.org: comment tweaks, struct initialization simplification] Change-Id: Ibb773e8045852978f6207074491d262f1b3fb613 Link: http://lkml.kernel.org/r/20180410005908.167976-1-gthelen@google.com Fixes: 682aa8e1a6a1 ("writeback: implement unlocked_inode_to_wb transaction and use it for stat updates") Signed-off-by: Greg Thelen <gthelen(a)google.com> Reported-by: Wang Long <wanglong19(a)meituan.com> Acked-by: Wang Long <wanglong19(a)meituan.com> Acked-by: Michal Hocko <mhocko(a)suse.com> Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org> Cc: Johannes Weiner <hannes(a)cmpxchg.org> Cc: Tejun Heo <tj(a)kernel.org> Cc: Nicholas Piggin <npiggin(a)gmail.com> Cc: <stable(a)vger.kernel.org> [v4.2+] Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/fs-writeback.c | 7 +++--- include/linux/backing-dev-defs.h | 5 ++++ include/linux/backing-dev.h | 30 +++++++++++++++-------------- mm/page-writeback.c | 18 ++++++++--------- 4 files changed, 34 insertions(+), 26 deletions(-) diff -puN fs/fs-writeback.c~writeback-safer-lock-nesting fs/fs-writeback.c --- a/fs/fs-writeback.c~writeback-safer-lock-nesting +++ a/fs/fs-writeback.c @@ -745,11 +745,12 @@ int inode_congested(struct inode *inode, */ if (inode && inode_to_wb_is_valid(inode)) { struct bdi_writeback *wb; - bool locked, congested; + struct wb_lock_cookie lock_cookie = {}; + bool congested; - wb = unlocked_inode_to_wb_begin(inode, &locked); + wb = unlocked_inode_to_wb_begin(inode, 
&lock_cookie); congested = wb_congested(wb, cong_bits); - unlocked_inode_to_wb_end(inode, locked); + unlocked_inode_to_wb_end(inode, &lock_cookie); return congested; } diff -puN include/linux/backing-dev-defs.h~writeback-safer-lock-nesting include/linux/backing-dev-defs.h --- a/include/linux/backing-dev-defs.h~writeback-safer-lock-nesting +++ a/include/linux/backing-dev-defs.h @@ -223,6 +223,11 @@ static inline void set_bdi_congested(str set_wb_congested(bdi->wb.congested, sync); } +struct wb_lock_cookie { + bool locked; + unsigned long flags; +}; + #ifdef CONFIG_CGROUP_WRITEBACK /** diff -puN include/linux/backing-dev.h~writeback-safer-lock-nesting include/linux/backing-dev.h --- a/include/linux/backing-dev.h~writeback-safer-lock-nesting +++ a/include/linux/backing-dev.h @@ -347,7 +347,7 @@ static inline struct bdi_writeback *inod /** * unlocked_inode_to_wb_begin - begin unlocked inode wb access transaction * @inode: target inode - * @lockedp: temp bool output param, to be passed to the end function + * @cookie: output param, to be passed to the end function * * The caller wants to access the wb associated with @inode but isn't * holding inode->i_lock, the i_pages lock or wb->list_lock. This @@ -355,12 +355,12 @@ static inline struct bdi_writeback *inod * association doesn't change until the transaction is finished with * unlocked_inode_to_wb_end(). * - * The caller must call unlocked_inode_to_wb_end() with *@lockdep - * afterwards and can't sleep during transaction. IRQ may or may not be - * disabled on return. + * The caller must call unlocked_inode_to_wb_end() with *@cookie afterwards and + * can't sleep during the transaction. IRQs may or may not be disabled on + * return. 
*/ static inline struct bdi_writeback * -unlocked_inode_to_wb_begin(struct inode *inode, bool *lockedp) +unlocked_inode_to_wb_begin(struct inode *inode, struct wb_lock_cookie *cookie) { rcu_read_lock(); @@ -368,10 +368,10 @@ unlocked_inode_to_wb_begin(struct inode * Paired with store_release in inode_switch_wb_work_fn() and * ensures that we see the new wb if we see cleared I_WB_SWITCH. */ - *lockedp = smp_load_acquire(&inode->i_state) & I_WB_SWITCH; + cookie->locked = smp_load_acquire(&inode->i_state) & I_WB_SWITCH; - if (unlikely(*lockedp)) - xa_lock_irq(&inode->i_mapping->i_pages); + if (unlikely(cookie->locked)) + xa_lock_irqsave(&inode->i_mapping->i_pages, cookie->flags); /* * Protected by either !I_WB_SWITCH + rcu_read_lock() or the i_pages @@ -383,12 +383,13 @@ unlocked_inode_to_wb_begin(struct inode /** * unlocked_inode_to_wb_end - end inode wb access transaction * @inode: target inode - * @locked: *@lockedp from unlocked_inode_to_wb_begin() + * @cookie: @cookie from unlocked_inode_to_wb_begin() */ -static inline void unlocked_inode_to_wb_end(struct inode *inode, bool locked) +static inline void unlocked_inode_to_wb_end(struct inode *inode, + struct wb_lock_cookie *cookie) { - if (unlikely(locked)) - xa_unlock_irq(&inode->i_mapping->i_pages); + if (unlikely(cookie->locked)) + xa_unlock_irqrestore(&inode->i_mapping->i_pages, cookie->flags); rcu_read_unlock(); } @@ -435,12 +436,13 @@ static inline struct bdi_writeback *inod } static inline struct bdi_writeback * -unlocked_inode_to_wb_begin(struct inode *inode, bool *lockedp) +unlocked_inode_to_wb_begin(struct inode *inode, struct wb_lock_cookie *cookie) { return inode_to_wb(inode); } -static inline void unlocked_inode_to_wb_end(struct inode *inode, bool locked) +static inline void unlocked_inode_to_wb_end(struct inode *inode, + struct wb_lock_cookie *cookie) { } diff -puN mm/page-writeback.c~writeback-safer-lock-nesting mm/page-writeback.c --- a/mm/page-writeback.c~writeback-safer-lock-nesting +++ 
a/mm/page-writeback.c @@ -2502,13 +2502,13 @@ void account_page_redirty(struct page *p if (mapping && mapping_cap_account_dirty(mapping)) { struct inode *inode = mapping->host; struct bdi_writeback *wb; - bool locked; + struct wb_lock_cookie cookie = {}; - wb = unlocked_inode_to_wb_begin(inode, &locked); + wb = unlocked_inode_to_wb_begin(inode, &cookie); current->nr_dirtied--; dec_node_page_state(page, NR_DIRTIED); dec_wb_stat(wb, WB_DIRTIED); - unlocked_inode_to_wb_end(inode, locked); + unlocked_inode_to_wb_end(inode, &cookie); } } EXPORT_SYMBOL(account_page_redirty); @@ -2614,15 +2614,15 @@ void __cancel_dirty_page(struct page *pa if (mapping_cap_account_dirty(mapping)) { struct inode *inode = mapping->host; struct bdi_writeback *wb; - bool locked; + struct wb_lock_cookie cookie = {}; lock_page_memcg(page); - wb = unlocked_inode_to_wb_begin(inode, &locked); + wb = unlocked_inode_to_wb_begin(inode, &cookie); if (TestClearPageDirty(page)) account_page_cleaned(page, mapping, wb); - unlocked_inode_to_wb_end(inode, locked); + unlocked_inode_to_wb_end(inode, &cookie); unlock_page_memcg(page); } else { ClearPageDirty(page); @@ -2654,7 +2654,7 @@ int clear_page_dirty_for_io(struct page if (mapping && mapping_cap_account_dirty(mapping)) { struct inode *inode = mapping->host; struct bdi_writeback *wb; - bool locked; + struct wb_lock_cookie cookie = {}; /* * Yes, Virginia, this is indeed insane. @@ -2691,14 +2691,14 @@ int clear_page_dirty_for_io(struct page * always locked coming in here, so we get the desired * exclusion. */ - wb = unlocked_inode_to_wb_begin(inode, &locked); + wb = unlocked_inode_to_wb_begin(inode, &cookie); if (TestClearPageDirty(page)) { dec_lruvec_page_state(page, NR_FILE_DIRTY); dec_zone_page_state(page, NR_ZONE_WRITE_PENDING); dec_wb_stat(wb, WB_RECLAIMABLE); ret = 1; } - unlocked_inode_to_wb_end(inode, locked); + unlocked_inode_to_wb_end(inode, &cookie); return ret; } return TestClearPageDirty(page); _
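The nesting problem fixed by the patch above can be illustrated with a small userspace simulation. It assumes a single simulated interrupt-enable flag; lock_irq()/unlock_irq() play the role of spin_lock_irq()/spin_unlock_irq(), lock_irqsave()/unlock_irqrestore() play the role of the irqsave variants, and struct lock_cookie is a hypothetical analogue of the patch's struct wb_lock_cookie. None of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated interrupt-enable state of one CPU; purely illustrative. */
static bool irqs_enabled = true;

/* Analogue of spin_lock_irq(): disables interrupts unconditionally. */
static void lock_irq(void)
{
	irqs_enabled = false;
}

/* Analogue of spin_unlock_irq(): re-enables interrupts unconditionally. */
static void unlock_irq(void)
{
	irqs_enabled = true;
}

/* Analogue of struct wb_lock_cookie from the patch. */
struct lock_cookie {
	bool locked;
	bool saved_enabled;	/* stands in for the saved flags word */
};

/* Analogue of spin_lock_irqsave(): remember the state, then disable. */
static void lock_irqsave(struct lock_cookie *c)
{
	c->locked = true;
	c->saved_enabled = irqs_enabled;
	irqs_enabled = false;
}

/* Analogue of spin_unlock_irqrestore(): put back whatever was saved. */
static void unlock_irqrestore(struct lock_cookie *c)
{
	assert(c->locked);
	irqs_enabled = c->saved_enabled;
	c->locked = false;
}
```

Taking lock_irqsave() as the outer lock and lock_irq()/unlock_irq() as the inner pair leaves interrupts enabled while the outer lock is still held, which is the window the commit message describes. Using the save/restore pair for the inner lock as well keeps interrupts disabled until the outer unlock.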

7 years, 4 months


Re: kernel panics with 4.14.X versions

by Guillaume Morin

linux-kernel@vger.kernel.org,jack@suse.com,decui@microsoft.com
Bcc:
Subject: Re: kernel panics with 4.14.X versions
Reply-To:
In-Reply-To: <47d114b6-cf57-152a-32ad-07a541b05198(a)gmail.com>

Fwiw, there have already been reports of similar soft lockups in
fsnotify() on 4.14: https://lkml.org/lkml/2018/3/2/1038

We have also noticed similar soft lockups with 4.14.22 here.

On 16 Apr 13:54, Pavlos Parissis wrote:
>
> Hi all,
>
> We have observed kernel panics on several master kubernetes clusters, where we run
> kubernetes API services and not application workloads.
>
> Those clusters use kernel version 4.14.14 and 4.14.32, but we switched everything
> to kernel version 4.14.32 as a way to address the issue.
>
> We have HP and Dell hardware on those clusters, and network cards are also different,
> we have bnx2x and mlx5_core in use.
>
> We also run kernel version 4.14.32 on different types of workloads, software load
> balancing using HAProxy, and we don't have any crashes there.
>
> Since the crash happens on different hardware, we think it could be a kernel issue,
> but we aren't sure about it. Thus, I am contacting kernel people in order to get some
> hint, which can help us to figure out what causes this.
>
> In our kubernetes clusters, we have instructed the kernel to panic upon soft lockup,
> we use 'kernel.softlockup_panic=1', 'kernel.hung_task_panic=1' and 'kernel.watchdog_thresh=10'.
> Thus, we see the stack traces. Today, we have disabled this, later I will explain why.
>
> I believe we have two distinct types of panics: one is triggered upon soft lockup, and another one
> where the call trace is about the scheduler ("sched: Unexpected reschedule of offline CPU#8!").
>
>
> Let me walk you through the kernel panics and some observations.
>
> The following series of stack traces happens when one CPU (CPU 24) is stuck for ~22 seconds.
> watchdog_thresh is set to 10 and as far as I remember the softlockup threshold is (2 * watchdog_thresh),
> so it makes sense to see the kernel crashing after ~20 seconds.
>
> After the stack trace, we have the output of sar for CPU#24 and we see that just before the
> crash CPU utilization for system level went to 100%. Now let's move to another panic.
>
> [373782.361064] watchdog: BUG: soft lockup - CPU#24 stuck for 22s! [kube-apiserver:24261]
> [373782.378225] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag
> inet_diag unix_diag cfg80211 rfkill dell_rbu 8021q garp mrp xfs libcrc32c loop x86_pkg_temp_thermal
> intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
> pcbc aesni_intel vfat fat crypto_simd glue_helper cryptd intel_cstate intel_rapl_perf iTCO_wdt ses
> iTCO_vendor_support mxm_wmi ipmi_si dcdbas enclosure mei_me pcspkr ipmi_devintf lpc_ich sg mei
> ipmi_msghandler mfd_core shpchp wmi acpi_power_meter netconsole nfsd auth_rpcgss nfs_acl lockd grace
> sunrpc ip_tables ext4 mbcache jbd2 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
> fb_sys_fops sd_mod ttm crc32c_intel ahci libahci mlx5_core drm mlxfw mpt3sas ptp libata raid_class
> pps_core scsi_transport_sas
> [373782.516807] dm_mirror dm_region_hash dm_log dm_mod dax
> [373782.531739] CPU: 24 PID: 24261 Comm: kube-apiserver Not tainted 4.14.32-1.el7.x86_64 #1
> [373782.549848] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.4.3 01/17/2017
> [373782.567486] task: ffff882f66d28000 task.stack: ffffc9002120c000
> [373782.583441] RIP: 0010:fsnotify+0x197/0x510
> [373782.597319] RSP: 0018:ffffc9002120fdb8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
> [373782.615308] RAX: 0000000000000000 RBX: ffff882f9ec65c20 RCX: 0000000000000002
> [373782.632950] RDX: 0000000000028700 RSI: 0000000000000002 RDI: ffffffff8269a4e0
> [373782.650616] RBP: ffffc9002120fe98 R08: 0000000000000000 R09: 0000000000000000
> [373782.668287] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [373782.685918] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [373782.703302] FS: 000000c42009f090(0000) GS:ffff882fbf900000(0000) knlGS:0000000000000000
> [373782.721887] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [373782.737741] CR2: 00007f82b6539244 CR3: 0000002f3de2a005 CR4: 00000000003606e0
> [373782.755247] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [373782.772722] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [373782.790043] Call Trace:
> [373782.802041] vfs_write+0x151/0x1b0
> [373782.815081] ? 
syscall_trace_enter+0x1cd/0x2b0 > [373782.829175] SyS_write+0x55/0xc0 > [373782.841870] do_syscall_64+0x79/0x1b0 > [373782.855073] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [373782.869807] RIP: 0033:0x483084 > [373782.882293] RSP: 002b:000000c4387e57f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > [373782.899997] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000483084 > [373782.917177] RDX: 00000000000002b3 RSI: 000000c42e27d800 RDI: 000000000000014b > [373782.934268] RBP: 000000c4387e5840 R08: 0000000000000000 R09: 0000000000000000 > [373782.951297] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > [373782.968208] R13: 00000000000000f2 R14: 0000000000000032 R15: 0000000000000002 > [373782.985003] Code: 0f 84 f6 02 00 00 48 8b 45 a0 4d 85 d2 48 8b 00 48 89 45 a8 48 89 45 a0 0f 85 > ef 02 00 00 48 8b 45 b0 48 89 45 98 48 83 7d a0 00 <0f> 95 c0 48 83 7d 98 00 0f 95 c2 89 d1 08 c1 0f > 84 fc 02 00 00 > [373783.024208] Kernel panic - not syncing: softlockup: hung tasks > [373783.039881] CPU: 24 PID: 24261 Comm: kube-apiserver Tainted: G L > 4.14.32-1.el7.x86_64 #1 > [373783.059497] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.4.3 01/17/2017 > [373783.077206] Call Trace: > [373783.089115] <IRQ> > [373783.100422] dump_stack+0x63/0x88 > [373783.113081] panic+0xe8/0x258 > [373783.125109] watchdog_timer_fn+0x21a/0x230 > [373783.138546] ? 
watchdog+0x30/0x30 > [373783.150870] __hrtimer_run_queues+0xe7/0x230 > [373783.164081] hrtimer_interrupt+0xa8/0x1a0 > [373783.176703] smp_apic_timer_interrupt+0x6b/0x140 > [373783.189788] apic_timer_interrupt+0x8e/0xa0 > [373783.202198] </IRQ> > [373783.211900] RIP: 0010:fsnotify+0x197/0x510 > [373783.223746] RSP: 0018:ffffc9002120fdb8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10 > [373783.239434] RAX: 0000000000000000 RBX: ffff882f9ec65c20 RCX: 0000000000000002 > [373783.254599] RDX: 0000000000028700 RSI: 0000000000000002 RDI: ffffffff8269a4e0 > [373783.269673] RBP: ffffc9002120fe98 R08: 0000000000000000 R09: 0000000000000000 > [373783.284629] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > [373783.299460] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [373783.314200] ? fsnotify+0x4bb/0x510 > [373783.324757] vfs_write+0x151/0x1b0 > [373783.335115] ? syscall_trace_enter+0x1cd/0x2b0 > [373783.346617] SyS_write+0x55/0xc0 > [373783.356735] do_syscall_64+0x79/0x1b0 > [373783.367330] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [373783.379606] RIP: 0033:0x483084 > [373783.389540] RSP: 002b:000000c4387e57f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > [373783.404657] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000483084 > [373783.419294] RDX: 00000000000002b3 RSI: 000000c42e27d800 RDI: 000000000000014b > [373783.433922] RBP: 000000c4387e5840 R08: 0000000000000000 R09: 0000000000000000 > [373783.448565] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > [373783.463128] R13: 00000000000000f2 R14: 0000000000000032 R15: 0000000000000002 > [373783.477744] Kernel Offset: disabled > [373783.492343] ---[ end Kernel panic - not syncing: softlockup: hung tasks > [373783.506452] ------------[ cut here ]------------ > [373783.518376] WARNING: CPU: 24 PID: 24261 at kernel/sched/core.c:1179 set_task_cpu+0x197/0x1a0 > [373783.534730] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > 
inet_diag unix_diag cfg80211 rfkill dell_rbu 8021q garp mrp xfs libcrc32c loop x86_pkg_temp_thermal > intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel > pcbc aesni_intel vfat fat crypto_simd glue_helper cryptd intel_cstate intel_rapl_perf iTCO_wdt ses > iTCO_vendor_support mxm_wmi ipmi_si dcdbas enclosure mei_me pcspkr ipmi_devintf lpc_ich sg mei > ipmi_msghandler mfd_core shpchp wmi acpi_power_meter netconsole nfsd auth_rpcgss nfs_acl lockd grace > sunrpc ip_tables ext4 mbcache jbd2 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt > fb_sys_fops sd_mod ttm crc32c_intel ahci libahci mlx5_core drm mlxfw mpt3sas ptp libata raid_class > pps_core scsi_transport_sas > [373783.667938] dm_mirror dm_region_hash dm_log dm_mod dax > [373783.682082] CPU: 24 PID: 24261 Comm: kube-apiserver Tainted: G L > 4.14.32-1.el7.x86_64 #1 > [373783.700753] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.4.3 01/17/2017 > [373783.717501] task: ffff882f66d28000 task.stack: ffffc9002120c000 > [373783.732386] RIP: 0010:set_task_cpu+0x197/0x1a0 > [373783.745458] RSP: 0018:ffff882fbf903b88 EFLAGS: 00010046 > [373783.759432] RAX: 0000000000000200 RBX: ffff885fb3cb45c0 RCX: 0000000000000001 > [373783.775692] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff885fb3cb45c0 > [373783.791999] RBP: ffff882fbf903ba8 R08: 0000000000000000 R09: 0000000000000000 > [373783.808362] R10: 0000000000000000 R11: 0000000000000000 R12: ffff885fb3cb516c > [373783.824785] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000022ac0 > [373783.841196] FS: 000000c42009f090(0000) GS:ffff882fbf900000(0000) knlGS:0000000000000000 > [373783.858761] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [373783.873710] CR2: 00007f82b6539244 CR3: 0000002f3de2a005 CR4: 00000000003606e0 > [373783.890304] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [373783.906951] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > 
[373783.923503] Call Trace: > [373783.934742] <IRQ> > [373783.945346] try_to_wake_up+0x16c/0x480 > [373783.957961] default_wake_function+0x12/0x20 > [373783.971086] autoremove_wake_function+0x16/0x60 > [373783.984483] __wake_up_common+0x8f/0x160 > [373783.997154] __wake_up_common_lock+0x7e/0xc0 > [373784.010293] __wake_up+0x13/0x20 > [373784.022125] wake_up_klogd_work_func+0x40/0x60 > [373784.035365] irq_work_run_list+0x53/0x80 > [373784.048042] irq_work_run+0x2c/0x30 > [373784.060132] flush_smp_call_function_queue+0x88/0x110 > [373784.074076] generic_smp_call_function_single_interrupt+0x13/0x30 > [373784.089312] smp_call_function_single_interrupt+0x3a/0xe0 > [373784.103788] call_function_single_interrupt+0x8e/0xa0 > [373784.117820] RIP: 0010:panic+0x206/0x258 > [373784.130402] RSP: 0018:ffff882fbf903e80 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff04 > [373784.147325] RAX: 000000000000003b RBX: 0000000000000000 RCX: 0000000000000006 > [373784.163842] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff882fbf9169d0 > [373784.180394] RBP: ffff882fbf903ef0 R08: 0000000000000001 R09: 00000000000006b1 > [373784.197041] R10: 0000000000000001 R11: 0000000000000002 R12: ffffffff81e6be9f > [373784.213609] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ee6b2800 > [373784.230077] watchdog_timer_fn+0x21a/0x230 > [373784.243095] ? 
watchdog+0x30/0x30 > [373784.255113] __hrtimer_run_queues+0xe7/0x230 > [373784.267974] hrtimer_interrupt+0xa8/0x1a0 > [373784.280195] smp_apic_timer_interrupt+0x6b/0x140 > [373784.292919] apic_timer_interrupt+0x8e/0xa0 > [373784.304979] </IRQ> > [373784.314365] RIP: 0010:fsnotify+0x197/0x510 > [373784.325739] RSP: 0018:ffffc9002120fdb8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10 > [373784.340979] RAX: 0000000000000000 RBX: ffff882f9ec65c20 RCX: 0000000000000002 > [373784.355767] RDX: 0000000000028700 RSI: 0000000000000002 RDI: ffffffff8269a4e0 > [373784.370474] RBP: ffffc9002120fe98 R08: 0000000000000000 R09: 0000000000000000 > [373784.385000] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > [373784.399438] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [373784.413725] ? fsnotify+0x4bb/0x510 > [373784.423875] vfs_write+0x151/0x1b0 > [373784.433861] ? syscall_trace_enter+0x1cd/0x2b0 > [373784.444973] SyS_write+0x55/0xc0 > [373784.454738] do_syscall_64+0x79/0x1b0 > [373784.464901] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [373784.476731] RIP: 0033:0x483084 > [373784.486201] RSP: 002b:000000c4387e57f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > [373784.500878] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000483084 > [373784.515015] RDX: 00000000000002b3 RSI: 000000c42e27d800 RDI: 000000000000014b > [373784.529155] RBP: 000000c4387e5840 R08: 0000000000000000 R09: 0000000000000000 > [373784.543400] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > [373784.557490] R13: 00000000000000f2 R14: 0000000000000032 R15: 0000000000000002 > [373784.571578] Code: ff 80 8b ac 08 00 00 04 e9 20 ff ff ff 0f 0b e9 b9 fe ff ff f7 83 84 00 00 00 > fd ff ff ff 0f 84 c3 fe ff ff 0f 0b e9 bc fe ff ff <0f> 0b e9 cb fe ff ff 66 90 0f 1f 44 00 00 55 48 > 89 e5 41 56 49 > [373784.605527] ---[ end trace d3faf76bdc3ca403 ]--- > [373784.617188] sched: Unexpected reschedule of offline CPU#0! 
> [373784.629856] ------------[ cut here ]------------ > [373784.641694] WARNING: CPU: 24 PID: 24261 at arch/x86/kernel/smp.c:128 > native_smp_send_reschedule+0x42/0x50 > [373784.659370] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > inet_diag unix_diag cfg80211 rfkill dell_rbu 8021q garp mrp xfs libcrc32c loop x86_pkg_temp_thermal > intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel > pcbc aesni_intel vfat fat crypto_simd glue_helper cryptd intel_cstate intel_rapl_perf iTCO_wdt ses > iTCO_vendor_support mxm_wmi ipmi_si dcdbas enclosure mei_me pcspkr ipmi_devintf lpc_ich sg mei > ipmi_msghandler mfd_core shpchp wmi acpi_power_meter netconsole nfsd auth_rpcgss nfs_acl lockd grace > sunrpc ip_tables ext4 mbcache jbd2 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt > fb_sys_fops sd_mod ttm crc32c_intel ahci libahci mlx5_core drm mlxfw mpt3sas ptp libata raid_class > pps_core scsi_transport_sas > [373784.793557] dm_mirror dm_region_hash dm_log dm_mod dax > [373784.807848] CPU: 24 PID: 24261 Comm: kube-apiserver Tainted: G W L > 4.14.32-1.el7.x86_64 #1 > [373784.826743] Hardware name: Dell Inc. 
PowerEdge R630/02C2CP, BIOS 2.4.3 01/17/2017 > [373784.843685] task: ffff882f66d28000 task.stack: ffffc9002120c000 > [373784.858935] RIP: 0010:native_smp_send_reschedule+0x42/0x50 > [373784.873706] RSP: 0018:ffff882fbf903b10 EFLAGS: 00010046 > [373784.888200] RAX: 000000000000002e RBX: 0000000000000000 RCX: 0000000000000006 > [373784.904979] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff882fbf9169d0 > [373784.921626] RBP: ffff882fbf903b10 R08: 0000000000000001 R09: 00000000000006f8 > [373784.938313] R10: 0000000000000001 R11: 0000000000000000 R12: ffff882fbf622ac0 > [373784.955106] R13: ffff885fb3cb45c0 R14: ffff882fbf903bc8 R15: ffff882fbf622ac0 > [373784.971891] FS: 000000c42009f090(0000) GS:ffff882fbf900000(0000) knlGS:0000000000000000 > [373784.989852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [373785.005204] CR2: 00007f82b6539244 CR3: 0000002f3de2a005 CR4: 00000000003606e0 > [373785.022197] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [373785.039227] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [373785.056132] Call Trace: > [373785.067623] <IRQ> > [373785.078506] resched_curr+0xae/0xd0 > [373785.091051] check_preempt_curr+0x79/0xa0 > [373785.104217] ttwu_do_wakeup+0x1e/0x160 > [373785.117171] ttwu_do_activate+0x7a/0x90 > [373785.130058] try_to_wake_up+0x1e7/0x480 > [373785.142959] default_wake_function+0x12/0x20 > [373785.156411] autoremove_wake_function+0x16/0x60 > [373785.170119] __wake_up_common+0x8f/0x160 > [373785.183152] __wake_up_common_lock+0x7e/0xc0 > [373785.196508] __wake_up+0x13/0x20 > [373785.208612] wake_up_klogd_work_func+0x40/0x60 > [373785.222065] irq_work_run_list+0x53/0x80 > [373785.234885] irq_work_run+0x2c/0x30 > [373785.247071] flush_smp_call_function_queue+0x88/0x110 > [373785.261146] generic_smp_call_function_single_interrupt+0x13/0x30 > [373785.276556] smp_call_function_single_interrupt+0x3a/0xe0 > [373785.291300] call_function_single_interrupt+0x8e/0xa0 > [373785.305485] 
RIP: 0010:panic+0x206/0x258 > [373785.318154] RSP: 0018:ffff882fbf903e80 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff04 > [373785.335001] RAX: 000000000000003b RBX: 0000000000000000 RCX: 0000000000000006 > [373785.351418] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff882fbf9169d0 > [373785.367776] RBP: ffff882fbf903ef0 R08: 0000000000000001 R09: 00000000000006b1 > [373785.383990] R10: 0000000000000001 R11: 0000000000000002 R12: ffffffff81e6be9f > [373785.400019] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ee6b2800 > [373785.415792] watchdog_timer_fn+0x21a/0x230 > [373785.427910] ? watchdog+0x30/0x30 > [373785.438891] __hrtimer_run_queues+0xe7/0x230 > [373785.450736] hrtimer_interrupt+0xa8/0x1a0 > [373785.462037] smp_apic_timer_interrupt+0x6b/0x140 > [373785.473814] apic_timer_interrupt+0x8e/0xa0 > [373785.485054] </IRQ> > [373785.493740] RIP: 0010:fsnotify+0x197/0x510 > [373785.504592] RSP: 0018:ffffc9002120fdb8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10 > [373785.519343] RAX: 0000000000000000 RBX: ffff882f9ec65c20 RCX: 0000000000000002 > [373785.533627] RDX: 0000000000028700 RSI: 0000000000000002 RDI: ffffffff8269a4e0 > [373785.547934] RBP: ffffc9002120fe98 R08: 0000000000000000 R09: 0000000000000000 > [373785.562192] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > [373785.576431] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [373785.590592] ? fsnotify+0x4bb/0x510 > [373785.600647] vfs_write+0x151/0x1b0 > [373785.610507] ? 
syscall_trace_enter+0x1cd/0x2b0 > [373785.621459] SyS_write+0x55/0xc0 > [373785.630952] do_syscall_64+0x79/0x1b0 > [373785.640818] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [373785.652319] RIP: 0033:0x483084 > [373785.661599] RSP: 002b:000000c4387e57f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > [373785.676059] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000483084 > [373785.690181] RDX: 00000000000002b3 RSI: 000000c42e27d800 RDI: 000000000000014b > [373785.704317] RBP: 000000c4387e5840 R08: 0000000000000000 R09: 0000000000000000 > [373785.718448] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > [373785.732562] R13: 00000000000000f2 R14: 0000000000000032 R15: 0000000000000002 > [373785.746624] Code: c0 74 1a 48 8b 05 7f 44 ec 00 be fd 00 00 00 48 8b 80 a0 00 00 00 e8 ae 1a 9b > 00 5d c3 89 fe 48 c7 c7 b8 26 e5 81 e8 21 45 09 00 <0f> 0b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f > 44 00 00 55 48 > [373785.780531] ---[ end trace d3faf76bdc3ca404 ]--- > [373785.792207] sched: Unexpected reschedule of offline CPU#42! 
> [373785.804993] ------------[ cut here ]------------ > [373785.816775] WARNING: CPU: 24 PID: 24261 at arch/x86/kernel/smp.c:128 > native_smp_send_reschedule+0x42/0x50 > [373785.834478] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > inet_diag unix_diag cfg80211 rfkill dell_rbu 8021q garp mrp xfs libcrc32c loop x86_pkg_temp_thermal > intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel > pcbc aesni_intel vfat fat crypto_simd glue_helper cryptd intel_cstate intel_rapl_perf iTCO_wdt ses > iTCO_vendor_support mxm_wmi ipmi_si dcdbas enclosure mei_me pcspkr ipmi_devintf lpc_ich sg mei > ipmi_msghandler mfd_core shpchp wmi acpi_power_meter netconsole nfsd auth_rpcgss nfs_acl lockd grace > sunrpc ip_tables ext4 mbcache jbd2 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt > fb_sys_fops sd_mod ttm crc32c_intel ahci libahci mlx5_core drm mlxfw mpt3sas ptp libata raid_class > pps_core scsi_transport_sas > [373785.968794] dm_mirror dm_region_hash dm_log dm_mod dax > [373785.983020] CPU: 24 PID: 24261 Comm: kube-apiserver Tainted: G W L > 4.14.32-1.el7.x86_64 #1 > [373786.001870] Hardware name: Dell Inc. 
PowerEdge R630/02C2CP, BIOS 2.4.3 01/17/2017 > [373786.018790] task: ffff882f66d28000 task.stack: ffffc9002120c000 > [373786.034031] RIP: 0010:native_smp_send_reschedule+0x42/0x50 > [373786.048836] RSP: 0018:ffff882fbf9039e0 EFLAGS: 00010046 > [373786.063302] RAX: 000000000000002f RBX: 000000000000002a RCX: 0000000000000006 > [373786.080012] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff882fbf9169d0 > [373786.096647] RBP: ffff882fbf9039e0 R08: 0000000000000001 R09: 0000000000000743 > [373786.113328] R10: 0000000000000001 R11: 0000000000000000 R12: ffff882fbfb62ac0 > [373786.130019] R13: ffff882fb3f61740 R14: ffff882fbf903a98 R15: ffff882fbfb62ac0 > [373786.146724] FS: 000000c42009f090(0000) GS:ffff882fbf900000(0000) knlGS:0000000000000000 > [373786.164613] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [373786.179892] CR2: 00007f82b6539244 CR3: 0000002f3de2a005 CR4: 00000000003606e0 > [373786.196879] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [373786.213858] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [373786.230669] Call Trace: > [373786.242081] <IRQ> > [373786.252989] resched_curr+0xae/0xd0 > [373786.265510] check_preempt_curr+0x79/0xa0 > [373786.278628] ttwu_do_wakeup+0x1e/0x160 > [373786.291544] ttwu_do_activate+0x7a/0x90 > [373786.304508] try_to_wake_up+0x1e7/0x480 > [373786.317475] ? check_preempt_curr+0x79/0xa0 > [373786.330755] default_wake_function+0x12/0x20 > [373786.344077] __wake_up_common+0x8f/0x160 > [373786.357105] __wake_up_locked+0x16/0x20 > [373786.369982] complete+0x42/0x60 > [373786.381975] mlx5_cmd_comp_handler+0x28f/0x4b0 [mlx5_core] > [373786.396534] mlx5_eq_int+0x1ae/0x550 [mlx5_core] > [373786.410080] ? 
__wake_up_common+0x8f/0x160 > [373786.423054] __handle_irq_event_percpu+0x42/0x1a0 > [373786.436719] handle_irq_event_percpu+0x32/0x80 > [373786.450184] handle_irq_event+0x3b/0x60 > [373786.462935] handle_edge_irq+0x95/0x1a0 > [373786.475441] handle_irq+0xb5/0x140 > [373786.487323] ? irq_work_run+0x2c/0x30 > [373786.499336] ? flush_smp_call_function_queue+0x88/0x110 > [373786.513191] do_IRQ+0x48/0xe0 > [373786.524434] common_interrupt+0x8e/0x8e > [373786.536517] RIP: 0010:panic+0x206/0x258 > [373786.548351] RSP: 0018:ffff882fbf903e80 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff7e > [373786.564290] RAX: 000000000000003b RBX: 0000000000000000 RCX: 0000000000000006 > [373786.579556] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff882fbf9169d0 > [373786.594559] RBP: ffff882fbf903ef0 R08: 0000000000000001 R09: 00000000000006b1 > [373786.609374] R10: 0000000000000001 R11: 0000000000000002 R12: ffffffff81e6be9f > [373786.623990] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ee6b2800 > [373786.638331] watchdog_timer_fn+0x21a/0x230 > [373786.649202] ? watchdog+0x30/0x30 > [373786.659024] __hrtimer_run_queues+0xe7/0x230 > [373786.669762] hrtimer_interrupt+0xa8/0x1a0 > [373786.680120] smp_apic_timer_interrupt+0x6b/0x140 > [373786.691100] apic_timer_interrupt+0x8e/0xa0 > [373786.701618] </IRQ> > [373786.709633] RIP: 0010:fsnotify+0x197/0x510 > [373786.719960] RSP: 0018:ffffc9002120fdb8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10 > [373786.734322] RAX: 0000000000000000 RBX: ffff882f9ec65c20 RCX: 0000000000000002 > [373786.748258] RDX: 0000000000028700 RSI: 0000000000000002 RDI: ffffffff8269a4e0 > [373786.762175] RBP: ffffc9002120fe98 R08: 0000000000000000 R09: 0000000000000000 > [373786.776003] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > [373786.789766] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [373786.803354] ? fsnotify+0x4bb/0x510 > [373786.812823] vfs_write+0x151/0x1b0 > [373786.822215] ? 
syscall_trace_enter+0x1cd/0x2b0 > [373786.832724] SyS_write+0x55/0xc0 > [373786.841898] do_syscall_64+0x79/0x1b0 > [373786.851586] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [373786.862893] RIP: 0033:0x483084 > [373786.871921] RSP: 002b:000000c4387e57f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > [373786.886319] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000483084 > [373786.900279] RDX: 00000000000002b3 RSI: 000000c42e27d800 RDI: 000000000000014b > [373786.914247] RBP: 000000c4387e5840 R08: 0000000000000000 R09: 0000000000000000 > [373786.928229] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > [373786.942195] R13: 00000000000000f2 R14: 0000000000000032 R15: 0000000000000002 > [373786.956171] Code: c0 74 1a 48 8b 05 7f 44 ec 00 be fd 00 00 00 48 8b 80 a0 00 00 00 e8 ae 1a 9b > 00 5d c3 89 fe 48 c7 c7 b8 26 e5 81 e8 21 45 09 00 <0f> 0b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f > 44 00 00 55 48 > [373786.989819] ---[ end trace d3faf76bdc3ca405 ]--- > [373787.001313] sched: Unexpected reschedule of offline CPU#36! 
> [373787.013940] ------------[ cut here ]------------ > [373787.025482] WARNING: CPU: 24 PID: 24261 at arch/x86/kernel/smp.c:128 > native_smp_send_reschedule+0x42/0x50 > [373787.042884] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > inet_diag unix_diag cfg80211 rfkill dell_rbu 8021q garp mrp xfs libcrc32c loop x86_pkg_temp_thermal > intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel > pcbc aesni_intel vfat fat crypto_simd glue_helper cryptd intel_cstate intel_rapl_perf iTCO_wdt ses > iTCO_vendor_support mxm_wmi ipmi_si dcdbas enclosure mei_me pcspkr ipmi_devintf lpc_ich sg mei > ipmi_msghandler mfd_core shpchp wmi acpi_power_meter netconsole nfsd auth_rpcgss nfs_acl lockd grace > sunrpc ip_tables ext4 mbcache jbd2 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt > fb_sys_fops sd_mod ttm crc32c_intel ahci libahci mlx5_core drm mlxfw mpt3sas ptp libata raid_class > pps_core scsi_transport_sas > [373787.175654] dm_mirror dm_region_hash dm_log dm_mod dax > [373787.189862] CPU: 24 PID: 24261 Comm: kube-apiserver Tainted: G W L > 4.14.32-1.el7.x86_64 #1 > [373787.208727] Hardware name: Dell Inc. 
PowerEdge R630/02C2CP, BIOS 2.4.3 01/17/2017 > [373787.225686] task: ffff882f66d28000 task.stack: ffffc9002120c000 > [373787.240916] RIP: 0010:native_smp_send_reschedule+0x42/0x50 > [373787.255668] RSP: 0018:ffff882fbf9039e0 EFLAGS: 00010046 > [373787.270138] RAX: 000000000000002f RBX: 0000000000000024 RCX: 0000000000000006 > [373787.286911] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff882fbf9169d0 > [373787.303602] RBP: ffff882fbf9039e0 R08: 0000000000000001 R09: 0000000000000793 > [373787.320314] R10: 0000000000000001 R11: 0000000000000000 R12: ffff882fbfaa2ac0 > [373787.337037] R13: ffff882fb78bdd00 R14: ffff882fbf903a98 R15: ffff882fbfaa2ac0 > [373787.353793] FS: 000000c42009f090(0000) GS:ffff882fbf900000(0000) knlGS:0000000000000000 > [373787.371708] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [373787.387114] CR2: 00007f82b6539244 CR3: 0000002f3de2a005 CR4: 00000000003606e0 > [373787.404143] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [373787.421146] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [373787.438016] Call Trace: > [373787.449503] <IRQ> > [373787.460353] resched_curr+0xae/0xd0 > [373787.472913] check_preempt_curr+0x79/0xa0 > [373787.486064] ttwu_do_wakeup+0x1e/0x160 > [373787.499014] ttwu_do_activate+0x7a/0x90 > [373787.511930] try_to_wake_up+0x1e7/0x480 > [373787.524803] ? check_preempt_curr+0x79/0xa0 > [373787.538097] default_wake_function+0x12/0x20 > [373787.551463] __wake_up_common+0x8f/0x160 > [373787.564411] __wake_up_locked+0x16/0x20 > [373787.577191] complete+0x42/0x60 > [373787.589104] mlx5_cmd_comp_handler+0x28f/0x4b0 [mlx5_core] > [373787.603704] mlx5_eq_int+0x1ae/0x550 [mlx5_core] > [373787.617258] ? 
__wake_up_common+0x8f/0x160 > [373787.630170] __handle_irq_event_percpu+0x42/0x1a0 > [373787.643819] handle_irq_event_percpu+0x32/0x80 > [373787.657224] handle_irq_event+0x3b/0x60 > [373787.670045] handle_edge_irq+0x95/0x1a0 > [373787.682656] handle_irq+0xb5/0x140 > [373787.694520] ? irq_work_run+0x2c/0x30 > [373787.706546] ? flush_smp_call_function_queue+0x88/0x110 > [373787.720372] do_IRQ+0x48/0xe0 > [373787.731599] common_interrupt+0x8e/0x8e > [373787.743630] RIP: 0010:panic+0x206/0x258 > [373787.755405] RSP: 0018:ffff882fbf903e80 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff7e > [373787.771355] RAX: 000000000000003b RBX: 0000000000000000 RCX: 0000000000000006 > [373787.786634] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff882fbf9169d0 > [373787.801646] RBP: ffff882fbf903ef0 R08: 0000000000000001 R09: 00000000000006b1 > [373787.816462] R10: 0000000000000001 R11: 0000000000000002 R12: ffffffff81e6be9f > [373787.831010] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ee6b2800 > [373787.845323] watchdog_timer_fn+0x21a/0x230 > [373787.856160] ? watchdog+0x30/0x30 > [373787.866021] __hrtimer_run_queues+0xe7/0x230 > [373787.876785] hrtimer_interrupt+0xa8/0x1a0 > [373787.887167] smp_apic_timer_interrupt+0x6b/0x140 > [373787.898177] apic_timer_interrupt+0x8e/0xa0 > [373787.908668] </IRQ> > [373787.916761] RIP: 0010:fsnotify+0x197/0x510 > [373787.927091] RSP: 0018:ffffc9002120fdb8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10 > [373787.941434] RAX: 0000000000000000 RBX: ffff882f9ec65c20 RCX: 0000000000000002 > [373787.955328] RDX: 0000000000028700 RSI: 0000000000000002 RDI: ffffffff8269a4e0 > [373787.969286] RBP: ffffc9002120fe98 R08: 0000000000000000 R09: 0000000000000000 > [373787.983117] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > [373787.996820] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [373788.010389] ? fsnotify+0x4bb/0x510 > [373788.019908] vfs_write+0x151/0x1b0 > [373788.029296] ? 
syscall_trace_enter+0x1cd/0x2b0 > [373788.039801] SyS_write+0x55/0xc0 > [373788.048985] do_syscall_64+0x79/0x1b0 > [373788.058645] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [373788.069978] RIP: 0033:0x483084 > [373788.079028] RSP: 002b:000000c4387e57f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > [373788.093401] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000483084 > [373788.107361] RDX: 00000000000002b3 RSI: 000000c42e27d800 RDI: 000000000000014b > [373788.121337] RBP: 000000c4387e5840 R08: 0000000000000000 R09: 0000000000000000 > [373788.135346] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > [373788.149304] R13: 00000000000000f2 R14: 0000000000000032 R15: 0000000000000002 > [373788.163236] Code: c0 74 1a 48 8b 05 7f 44 ec 00 be fd 00 00 00 48 8b 80 a0 00 00 00 e8 ae 1a 9b > 00 5d c3 89 fe 48 c7 c7 b8 26 e5 81 e8 21 45 09 00 <0f> 0b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f > 44 00 00 55 48 > [373788.196867] ---[ end trace d3faf76bdc3ca406 ]--- > > ------[ sar -f ./sa15 -s 20:16:00 -P 24 ]----------- > Linux 4.14.32-1.el7.x86_64 (foobar) 04/15/2018 _x86_64_ (56 CPU) > > 08:16:00 PM CPU %user %nice %system %iowait %steal %idle > 08:16:01 PM 24 0.00 0.00 0.00 0.00 0.00 100.00 > 08:16:02 PM 24 0.00 0.00 0.00 0.00 0.00 100.00 > 08:16:03 PM 24 0.99 0.00 0.00 0.00 0.00 99.01 > 08:16:04 PM 24 0.00 0.00 0.00 0.00 0.00 100.00 > 08:16:05 PM 24 1.00 0.00 0.00 0.00 0.00 99.00 > 08:16:06 PM 24 3.00 0.00 0.00 0.00 0.00 97.00 > 08:16:07 PM 24 2.00 0.00 0.00 0.00 0.00 98.00 > 08:16:08 PM 24 1.00 0.00 1.00 0.00 0.00 98.00 > 08:16:09 PM 24 0.99 0.00 0.00 0.00 0.00 99.01 > 08:16:10 PM 24 0.00 0.00 0.00 0.00 0.00 100.00 > 08:16:11 PM 24 1.00 0.00 0.00 0.00 0.00 99.00 > 08:16:12 PM 24 0.00 0.00 0.00 0.00 0.00 100.00 > 08:16:13 PM 24 1.01 0.00 0.00 0.00 0.00 98.99 > 08:16:14 PM 24 0.00 0.00 0.00 0.00 0.00 100.00 > 08:16:15 PM 24 0.00 0.00 0.00 0.00 0.00 100.00 > 08:16:16 PM 24 0.00 0.00 0.00 0.00 0.00 100.00 > 08:16:17 PM 24 0.00 0.00 0.00 0.00 0.00 
100.00 > 08:16:18 PM 24 0.00 0.00 0.99 0.00 0.00 99.01 > 08:16:19 PM 24 0.00 0.00 0.00 0.00 0.00 100.00 > 08:16:20 PM 24 0.00 0.00 0.00 0.00 0.00 100.00 > 08:16:21 PM 24 1.00 0.00 0.00 0.00 0.00 99.00 > 08:16:22 PM 24 0.00 0.00 0.00 0.00 0.00 100.00 > 08:16:23 PM 24 1.00 0.00 17.00 0.00 0.00 82.00 > 08:16:24 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:25 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:26 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:27 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:28 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:29 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:30 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:31 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:32 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:33 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:34 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:35 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:36 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:37 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:38 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:39 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:40 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:41 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > 08:16:42 PM 24 0.00 0.00 100.00 0.00 0.00 0.00 > ------[ sar -f ./sa15 -s 20:16:00 -P 24 ]----------- > > > > > The following panic is from a different server and we see the same symptom, kernel panics > due to a soft lockup and CPU#21 has 100% utilization for system level. In this panic we see > a timeout from the network driver for queuing packets, I believe this is the symptom and not > the cause, as a server with Mellanox driver had a similar soft lockup. 
> > > > 391838.033960] NETDEV WATCHDOG: eth0 (bnx2x): transmit queue 2 timed out > [391838.065545] ------------[ cut here ]------------ > [391838.088431] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:320 dev_watchdog+0x22b/0x230 > [391838.128800] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > inet_diag unix_diag cfg80211 rfkill 8021q garp mrp xfs loop vfat fat x86_pkg_temp_thermal > intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel > pcbc aesni_intel crypto_simd glue_helper cryptd intel_cstate iTCO_wdt iTCO_vendor_support > intel_rapl_perf sg hpilo hpwdt ipmi_si pcspkr lpc_ich ioatdma ipmi_devintf dca mfd_core i2c_i801 > shpchp wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 > i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm sd_mod bnx2x mdio drm > libcrc32c crc32c_intel hpsa ptp scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod dax > [391838.456941] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.32-1.el7.x86_64 #1 > [391838.491589] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/25/2017 > [391838.524202] task: ffffffff82012480 task.stack: ffffffff82000000 > [391838.553322] RIP: 0010:dev_watchdog+0x22b/0x230 > [391838.575252] RSP: 0018:ffff88103fa03e60 EFLAGS: 00010246 > [391838.601054] RAX: 0000000000000039 RBX: 0000000000000002 RCX: 0000000000000000 > [391838.636022] RDX: 0000000000000000 RSI: ffff88103fa169d8 RDI: ffff88103fa169d8 > [391838.671651] RBP: ffff88103fa03e90 R08: 0000000000000000 R09: 00000000000004df > [391838.707021] R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffff881036674000 > [391838.758515] R13: 000000000000005b R14: ffff88103667f100 R15: 0000000000000000 > [391838.810815] FS: 0000000000000000(0000) GS:ffff88103fa00000(0000) knlGS:0000000000000000 > [391838.867323] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [391838.912602] CR2: 00007f912eb7fff0 CR3: 
000000000200a006 CR4: 00000000003606f0 > [391838.964401] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [391839.016170] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [391839.067361] Call Trace: > [391839.096085] <IRQ> > [391839.122674] ? dev_deactivate_queue.constprop.30+0x60/0x60 > [391839.166424] call_timer_fn+0x37/0x140 > [391839.201029] run_timer_softirq+0x1eb/0x450 > [391839.238196] ? timerqueue_add+0x59/0x90 > [391839.273260] ? ktime_get+0x3e/0xa0 > [391839.306253] __do_softirq+0xd2/0x27c > [391839.340016] irq_exit+0xd9/0xf0 > [391839.371464] smp_apic_timer_interrupt+0x75/0x140 > [391839.410012] apic_timer_interrupt+0x8e/0xa0 > [391839.446764] </IRQ> > [391839.472682] RIP: 0010:cpuidle_enter_state+0xdd/0x2b0 > [391839.512914] RSP: 0018:ffffffff82003e00 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 > [391839.565090] RAX: ffff88103fa22ac0 RBX: ffffe8f000200000 RCX: 000000000000001f > [391839.615998] RDX: 0000000000000000 RSI: fff936788221f82b RDI: 0000000000000000 > [391839.666639] RBP: ffffffff82003e38 R08: 000000000000034d R09: 00000000ffffffff > [391839.717691] R10: 000000000000037a R11: 0000000000000008 R12: 0000000000000004 > [391839.768401] R13: 0000000000000000 R14: ffffffff8216d980 R15: 0001645fe6c35649 > [391839.819280] cpuidle_enter+0x17/0x20 > [391839.852911] call_cpuidle+0x23/0x40 > [391839.885828] do_idle+0x172/0x1e0 > [391839.916662] cpu_startup_entry+0x73/0x80 > [391839.950559] rest_init+0xaa/0xb0 > [391839.981142] start_kernel+0x4b7/0x4d8 > [391840.013407] ? 
set_init_arg+0x5a/0x5a > [391840.045237] x86_64_start_reservations+0x2a/0x2c > [391840.081722] x86_64_start_kernel+0x72/0x75 > [391840.114722] secondary_startup_64+0xa5/0xb0 > [391840.149320] Code: 60 04 00 00 eb 89 4c 89 e7 c6 05 77 bb b2 00 01 e8 6b 38 fd ff 89 d9 48 89 c2 > 4c 89 e6 48 c7 c7 98 6a ef 81 31 c0 e8 b8 52 a2 ff <0f> 0b eb b9 90 0f 1f 44 00 00 55 48 89 e5 41 57 > 49 89 d7 41 56 > [391840.265586] ---[ end trace c661065d595325a9 ]--- > [391842.302965] bnx2x: [bnx2x_clean_tx_queue:1205(eth0)]timeout waiting for queue[2]: > txdata->tx_pkt_prod(11525) != txdata->tx_pkt_cons(11500) > [391844.388943] bnx2x: [bnx2x_clean_tx_queue:1205(eth0)]timeout waiting for queue[2]: > txdata->tx_pkt_prod(11525) != txdata->tx_pkt_cons(11500) > [391850.094964] watchdog: BUG: soft lockup - CPU#21 stuck for 22s! [kube-apiserver:60495] > [391850.146079] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > inet_diag unix_diag cfg80211 rfkill 8021q garp mrp xfs loop vfat fat x86_pkg_temp_thermal > intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel > pcbc aesni_intel crypto_simd glue_helper cryptd intel_cstate iTCO_wdt iTCO_vendor_support > intel_rapl_perf sg hpilo hpwdt ipmi_si pcspkr lpc_ich ioatdma ipmi_devintf dca mfd_core i2c_i801 > shpchp wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 > i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm sd_mod bnx2x mdio drm > libcrc32c crc32c_intel hpsa ptp scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod dax > [391850.573524] CPU: 21 PID: 60495 Comm: kube-apiserver Tainted: G W > 4.14.32-1.el7.x86_64 #1 > [391850.634311] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/25/2017 > [391850.682799] task: ffff881022172e80 task.stack: ffffc9000b874000 > [391850.727891] RIP: 0010:fsnotify+0x218/0x510 > [391850.763842] RSP: 0018:ffffc9000b877db8 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffff10 > [391850.820076] RAX: ffff882001c77a98 RBX: ffff882001c77a70 RCX: 0000000000000002 > [391850.873470] RDX: 0000000000028400 RSI: 0000000000000002 RDI: ffffffff8269a4e0 > [391850.925414] RBP: ffffc9000b877e98 R08: 0000000000000000 R09: 0000000000000000 > [391850.976777] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > [391851.028138] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [391851.079135] FS: 000000c42be02090(0000) GS:ffff88103fd40000(0000) knlGS:0000000000000000 > [391851.135142] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [391851.180107] CR2: 00007f5c3c0690c0 CR3: 0000000fc47c4004 CR4: 00000000003606e0 > [391851.231704] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [391851.283258] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [391851.335898] Call Trace: > [391851.367161] vfs_write+0x151/0x1b0 > [391851.401673] ? syscall_trace_enter+0x1cd/0x2b0 > [391851.440900] SyS_write+0x55/0xc0 > [391851.474214] do_syscall_64+0x79/0x1b0 > [391851.510034] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [391851.551320] RIP: 0033:0x483084 > [391851.583001] RSP: 002b:000000c43197d7f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > [391851.636289] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000483084 > [391851.688719] RDX: 00000000000002a9 RSI: 000000c424283c00 RDI: 0000000000000040 > [391851.740825] RBP: 000000c43197d840 R08: 0000000000000000 R09: 0000000000000000 > [391851.792257] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > [391851.843292] R13: 00000000000000f2 R14: 0000000000000032 R15: 0000000000000002 > [391851.896703] Code: 0f 85 08 02 00 00 48 85 db 41 0f 94 c4 4d 85 ed 0f 94 c1 84 c9 0f 85 ef 02 00 > 00 8b 4d 90 85 c9 74 26 48 85 db 74 0d f6 43 44 01 <75> 07 c7 43 40 00 00 00 00 4d 85 ed 74 0f 41 f6 > 45 44 01 75 08 > [391852.022198] Kernel panic - not syncing: softlockup: hung tasks > [391852.068204] CPU: 21 PID: 60495 Comm: 
kube-apiserver Tainted: G W L > 4.14.32-1.el7.x86_64 #1 > [391852.130544] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/25/2017 > [391852.180598] Call Trace: > [391852.210411] <IRQ> > [391852.237477] dump_stack+0x63/0x88 > [391852.270360] panic+0xe8/0x258 > [391852.301307] watchdog_timer_fn+0x21a/0x230 > [391852.337395] ? watchdog+0x30/0x30 > [391852.368943] __hrtimer_run_queues+0xe7/0x230 > [391852.405003] hrtimer_interrupt+0xa8/0x1a0 > [391852.439190] smp_apic_timer_interrupt+0x6b/0x140 > [391852.476151] apic_timer_interrupt+0x8e/0xa0 > [391852.511089] </IRQ> > [391852.535014] RIP: 0010:fsnotify+0x218/0x510 > [391852.568048] RSP: 0018:ffffc9000b877db8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 > [391852.617533] RAX: ffff882001c77a98 RBX: ffff882001c77a70 RCX: 0000000000000002 > [391852.664520] RDX: 0000000000028400 RSI: 0000000000000002 RDI: ffffffff8269a4e0 > [391852.711835] RBP: ffffc9000b877e98 R08: 0000000000000000 R09: 0000000000000000 > [391852.758813] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > [391852.805527] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [391852.851877] ? fsnotify+0x4bb/0x510 > [391852.880665] vfs_write+0x151/0x1b0 > [391852.909135] ? 
syscall_trace_enter+0x1cd/0x2b0 > [391852.942798] SyS_write+0x55/0xc0 > [391852.969978] do_syscall_64+0x79/0x1b0 > [391852.999194] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [391853.035095] RIP: 0033:0x483084 > [391853.061289] RSP: 002b:000000c43197d7f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > [391853.109641] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000483084 > [391853.155956] RDX: 00000000000002a9 RSI: 000000c424283c00 RDI: 0000000000000040 > [391853.202552] RBP: 000000c43197d840 R08: 0000000000000000 R09: 0000000000000000 > [391853.248842] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > [391853.295051] R13: 00000000000000f2 R14: 0000000000000032 R15: 0000000000000002 > [391853.341016] Kernel Offset: disabled > [391853.375061] ---[ end Kernel panic - not syncing: softlockup: hung tasks > [391853.419102] sched: Unexpected reschedule of offline CPU#0! > [391853.457084] ------------[ cut here ]------------ > [391853.491472] WARNING: CPU: 21 PID: 60495 at arch/x86/kernel/smp.c:128 > native_smp_send_reschedule+0x42/0x50 > [391853.549474] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > inet_diag unix_diag cfg80211 rfkill 8021q garp mrp xfs loop vfat fat x86_pkg_temp_thermal > intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel > pcbc aesni_intel crypto_simd glue_helper cryptd intel_cstate iTCO_wdt iTCO_vendor_support > intel_rapl_perf sg hpilo hpwdt ipmi_si pcspkr lpc_ich ioatdma ipmi_devintf dca mfd_core i2c_i801 > shpchp wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 > i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm sd_mod bnx2x mdio drm > libcrc32c crc32c_intel hpsa ptp scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod dax > [391853.967080] CPU: 21 PID: 60495 Comm: kube-apiserver Tainted: G W L > 4.14.32-1.el7.x86_64 #1 > [391854.026457] Hardware name: HP 
ProLiant BL460c Gen9, BIOS I36 10/25/2017 > [391854.073417] task: ffff881022172e80 task.stack: ffffc9000b874000 > [391854.116927] RIP: 0010:native_smp_send_reschedule+0x42/0x50 > [391854.158063] RSP: 0018:ffff88103fd43b10 EFLAGS: 00010046 > [391854.197408] RAX: 000000000000002e RBX: 0000000000000000 RCX: 0000000000000000 > [391854.246409] RDX: 0000000000000000 RSI: ffff88103fd569d8 RDI: ffff88103fd569d8 > [391854.295777] RBP: ffff88103fd43b10 R08: 0000000000000000 R09: 0000000000000556 > [391854.345373] R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffff88103fa22ac0 > [391854.395334] R13: ffff880f8be48000 R14: ffff88103fd43bc8 R15: ffff88103fa22ac0 > [391854.444983] FS: 000000c42be02090(0000) GS:ffff88103fd40000(0000) knlGS:0000000000000000 > [391854.498575] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [391854.541675] CR2: 00007f5c3c0690c0 CR3: 0000000fc47c4004 CR4: 00000000003606e0 > [391854.591999] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [391854.642263] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [391854.692678] Call Trace: > [391854.719793] <IRQ> > [391854.744771] resched_curr+0xae/0xd0 > [391854.776585] check_preempt_curr+0x79/0xa0 > [391854.811170] ttwu_do_wakeup+0x1e/0x160 > [391854.844514] ttwu_do_activate+0x7a/0x90 > [391854.877774] try_to_wake_up+0x1e7/0x480 > [391854.910892] default_wake_function+0x12/0x20 > [391854.946665] autoremove_wake_function+0x16/0x60 > [391854.984069] __wake_up_common+0x8f/0x160 > [391855.018321] __wake_up_common_lock+0x7e/0xc0 > [391855.053398] __wake_up+0x13/0x20 > [391855.083708] wake_up_klogd_work_func+0x40/0x60 > [391855.119905] irq_work_run_list+0x53/0x80 > [391855.153377] irq_work_run+0x2c/0x30 > [391855.184508] flush_smp_call_function_queue+0x88/0x110 > [391855.223509] generic_smp_call_function_single_interrupt+0x13/0x30 > [391855.267592] smp_call_function_single_interrupt+0x3a/0xe0 > [391855.308323] call_function_single_interrupt+0x8e/0xa0 > [391855.347202] RIP: 
0010:panic+0x206/0x258 > [391855.380345] RSP: 0018:ffff88103fd43e80 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff04 > [391855.431894] RAX: 000000000000003b RBX: 0000000000000000 RCX: 0000000000000006 > [391855.481301] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff88103fd569d0 > [391855.530810] RBP: ffff88103fd43ef0 R08: 0000000000000000 R09: 0000000000000555 > [391855.579985] R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffffffff81e6be9f > [391855.629525] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ee6b2800 > [391855.677925] watchdog_timer_fn+0x21a/0x230 > [391855.711211] ? watchdog+0x30/0x30 > [391855.740236] __hrtimer_run_queues+0xe7/0x230 > [391855.773231] hrtimer_interrupt+0xa8/0x1a0 > [391855.804713] smp_apic_timer_interrupt+0x6b/0x140 > [391855.838740] apic_timer_interrupt+0x8e/0xa0 > [391855.870671] </IRQ> > [391855.892208] RIP: 0010:fsnotify+0x218/0x510 > [391855.922974] RSP: 0018:ffffc9000b877db8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 > [391855.970885] RAX: ffff882001c77a98 RBX: ffff882001c77a70 RCX: 0000000000000002 > [391856.016803] RDX: 0000000000028400 RSI: 0000000000000002 RDI: ffffffff8269a4e0 > [391856.062423] RBP: ffffc9000b877e98 R08: 0000000000000000 R09: 0000000000000000 > [391856.108153] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > [391856.153683] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [391856.200197] ? fsnotify+0x4bb/0x510 > [391856.228102] vfs_write+0x151/0x1b0 > [391856.256421] ? 
syscall_trace_enter+0x1cd/0x2b0 > [391856.288496] SyS_write+0x55/0xc0 > [391856.314643] do_syscall_64+0x79/0x1b0 > [391856.342704] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [391856.377545] RIP: 0033:0x483084 > [391856.402822] RSP: 002b:000000c43197d7f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > [391856.449735] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000483084 > [391856.494804] RDX: 00000000000002a9 RSI: 000000c424283c00 RDI: 0000000000000040 > [391856.540308] RBP: 000000c43197d840 R08: 0000000000000000 R09: 0000000000000000 > [391856.585743] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > [391856.630940] R13: 00000000000000f2 R14: 0000000000000032 R15: 0000000000000002 > [391856.676366] Code: c0 74 1a 48 8b 05 7f 44 ec 00 be fd 00 00 00 48 8b 80 a0 00 00 00 e8 ae 1a 9b > 00 5d c3 89 fe 48 c7 c7 b8 26 e5 81 e8 21 45 09 00 <0f> 0b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f > 44 00 00 55 48 > [391856.792915] ---[ end trace c661065d595325aa ]--- > [391856.826793] ------------[ cut here ]------------ > [391856.860523] WARNING: CPU: 21 PID: 60495 at kernel/sched/core.c:1179 set_task_cpu+0x197/0x1a0 > [391856.913620] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > inet_diag unix_diag cfg80211 rfkill 8021q garp mrp xfs loop vfat fat x86_pkg_temp_thermal > intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel > pcbc aesni_intel crypto_simd glue_helper cryptd intel_cstate iTCO_wdt iTCO_vendor_support > intel_rapl_perf sg hpilo hpwdt ipmi_si pcspkr lpc_ich ioatdma ipmi_devintf dca mfd_core i2c_i801 > shpchp wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 > i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm sd_mod bnx2x mdio drm > libcrc32c crc32c_intel hpsa ptp scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod dax > [391857.333766] CPU: 21 PID: 60495 Comm: kube-apiserver 
Tainted: G W L > 4.14.32-1.el7.x86_64 #1 > [391857.393681] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/25/2017 > [391857.440546] task: ffff881022172e80 task.stack: ffffc9000b874000 > [391857.484076] RIP: 0010:set_task_cpu+0x197/0x1a0 > [391857.520542] RSP: 0018:ffff88103fd43ae8 EFLAGS: 00010046 > [391857.560948] RAX: 0000000000000200 RBX: ffff881038cb45c0 RCX: 0000000000000001 > [391857.610782] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff881038cb45c0 > [391857.660456] RBP: ffff88103fd43b08 R08: 0000000000000008 R09: 0000000000000000 > [391857.710401] R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffff881038cb516c > [391857.760003] R13: 0000000000000008 R14: 0000000000000008 R15: 0000000000022ac0 > [391857.809282] FS: 000000c42be02090(0000) GS:ffff88103fd40000(0000) knlGS:0000000000000000 > [391857.863581] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [391857.906806] CR2: 00007f5c3c0690c0 CR3: 0000000fc47c4004 CR4: 00000000003606e0 > [391857.956620] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [391858.007011] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [391858.057596] Call Trace: > [391858.085525] <IRQ> > [391858.110876] try_to_wake_up+0x16c/0x480 > [391858.145085] ? 
resched_curr+0xae/0xd0 > [391858.178173] default_wake_function+0x12/0x20 > [391858.214468] __wake_up_common+0x8f/0x160 > [391858.248941] __wake_up_locked+0x16/0x20 > [391858.283175] ep_poll_callback+0xd0/0x300 > [391858.316965] __wake_up_common+0x8f/0x160 > [391858.351271] __wake_up_common_lock+0x7e/0xc0 > [391858.387289] __wake_up+0x13/0x20 > [391858.417695] wake_up_klogd_work_func+0x40/0x60 > [391858.454575] irq_work_run_list+0x53/0x80 > [391858.488737] irq_work_run+0x2c/0x30 > [391858.520329] flush_smp_call_function_queue+0x88/0x110 > [391858.559946] generic_smp_call_function_single_interrupt+0x13/0x30 > [391858.603988] smp_call_function_single_interrupt+0x3a/0xe0 > [391858.645713] call_function_single_interrupt+0x8e/0xa0 > [391858.685706] RIP: 0010:panic+0x206/0x258 > [391858.720431] RSP: 0018:ffff88103fd43e80 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff04 > [391858.772695] RAX: 000000000000003b RBX: 0000000000000000 RCX: 0000000000000006 > [391858.822759] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff88103fd569d0 > [391858.872167] RBP: ffff88103fd43ef0 R08: 0000000000000000 R09: 0000000000000555 > [391858.921420] R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffffffff81e6be9f > [391858.971071] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ee6b2800 > [391859.020677] watchdog_timer_fn+0x21a/0x230 > [391859.054291] ? 
watchdog+0x30/0x30 > [391859.083991] __hrtimer_run_queues+0xe7/0x230 > [391859.118087] hrtimer_interrupt+0xa8/0x1a0 > [391859.150361] smp_apic_timer_interrupt+0x6b/0x140 > [391859.185167] apic_timer_interrupt+0x8e/0xa0 > [391859.217429] </IRQ> > [391859.239165] RIP: 0010:fsnotify+0x218/0x510 > [391859.269961] RSP: 0018:ffffc9000b877db8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 > [391859.317370] RAX: ffff882001c77a98 RBX: ffff882001c77a70 RCX: 0000000000000002 > [391859.363263] RDX: 0000000000028400 RSI: 0000000000000002 RDI: ffffffff8269a4e0 > [391859.409279] RBP: ffffc9000b877e98 R08: 0000000000000000 R09: 0000000000000000 > [391859.455080] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > [391859.500518] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [391859.546063] ? fsnotify+0x4bb/0x510 > [391859.574081] vfs_write+0x151/0x1b0 > [391859.601468] ? syscall_trace_enter+0x1cd/0x2b0 > [391859.634055] SyS_write+0x55/0xc0 > [391859.660517] do_syscall_64+0x79/0x1b0 > [391859.688919] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [391859.723536] RIP: 0033:0x483084 > [391859.748891] RSP: 002b:000000c43197d7f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > [391859.796455] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000483084 > [391859.841781] RDX: 00000000000002a9 RSI: 000000c424283c00 RDI: 0000000000000040 > [391859.887303] RBP: 000000c43197d840 R08: 0000000000000000 R09: 0000000000000000 > [391859.932494] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > [391859.977838] R13: 00000000000000f2 R14: 0000000000000032 R15: 0000000000000002 > [391860.023361] Code: ff 80 8b ac 08 00 00 04 e9 20 ff ff ff 0f 0b e9 b9 fe ff ff f7 83 84 00 00 00 > fd ff ff ff 0f 84 c3 fe ff ff 0f 0b e9 bc fe ff ff <0f> 0b e9 cb fe ff ff 66 90 0f 1f 44 00 00 55 48 > 89 e5 41 56 49 > [391860.138078] ---[ end trace c661065d595325ab ]--- > [391860.172166] sched: Unexpected reschedule of offline CPU#8! 
> [391860.210690] ------------[ cut here ]------------ > [391860.244671] WARNING: CPU: 21 PID: 60495 at arch/x86/kernel/smp.c:128 > native_smp_send_reschedule+0x42/0x50 > [391860.303820] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > inet_diag unix_diag cfg80211 rfkill 8021q garp mrp xfs loop vfat fat x86_pkg_temp_thermal > intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel > pcbc aesni_intel crypto_simd glue_helper cryptd intel_cstate iTCO_wdt iTCO_vendor_support > intel_rapl_perf sg hpilo hpwdt ipmi_si pcspkr lpc_ich ioatdma ipmi_devintf dca mfd_core i2c_i801 > shpchp wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 > i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm sd_mod bnx2x mdio drm > libcrc32c crc32c_intel hpsa ptp scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod dax > [391860.726277] CPU: 21 PID: 60495 Comm: kube-apiserver Tainted: G W L > 4.14.32-1.el7.x86_64 #1 > [391860.786402] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/25/2017 > [391860.834206] task: ffff881022172e80 task.stack: ffffc9000b874000 > [391860.878669] RIP: 0010:native_smp_send_reschedule+0x42/0x50 > [391860.920832] RSP: 0018:ffff88103fd43b08 EFLAGS: 00010046 > [391860.961851] RAX: 000000000000002e RBX: ffff881038cb45c0 RCX: 0000000000000006 > [391861.012094] RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff88103fd569d0 > [391861.062447] RBP: ffff88103fd43b08 R08: 0000000000000000 R09: 00000000000005e8 > [391861.112691] R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffff881038cb516c > [391861.163322] R13: 0000000000000004 R14: 0000000000000046 R15: 0000000000022ac0 > [391861.213440] FS: 000000c42be02090(0000) GS:ffff88103fd40000(0000) knlGS:0000000000000000 > [391861.268665] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [391861.311928] CR2: 00007f5c3c0690c0 CR3: 0000000fc47c4004 CR4: 00000000003606e0 > 
[391861.362717] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [391861.414065] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [391861.464505] Call Trace: > [391861.492319] <IRQ> > [391861.517992] try_to_wake_up+0x405/0x480 > [391861.551956] default_wake_function+0x12/0x20 > [391861.588252] __wake_up_common+0x8f/0x160 > [391861.622982] __wake_up_locked+0x16/0x20 > [391861.657272] ep_poll_callback+0xd0/0x300 > [391861.691535] __wake_up_common+0x8f/0x160 > [391861.726097] __wake_up_common_lock+0x7e/0xc0 > [391861.762240] __wake_up+0x13/0x20 > [391861.793096] wake_up_klogd_work_func+0x40/0x60 > [391861.830133] irq_work_run_list+0x53/0x80 > [391861.864538] irq_work_run+0x2c/0x30 > [391861.896744] flush_smp_call_function_queue+0x88/0x110 > [391861.936872] generic_smp_call_function_single_interrupt+0x13/0x30 > [391861.981074] smp_call_function_single_interrupt+0x3a/0xe0 > [391862.022733] call_function_single_interrupt+0x8e/0xa0 > [391862.062300] RIP: 0010:panic+0x206/0x258 > [391862.096123] RSP: 0018:ffff88103fd43e80 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff04 > [391862.148335] RAX: 000000000000003b RBX: 0000000000000000 RCX: 0000000000000006 > [391862.197879] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff88103fd569d0 > [391862.247474] RBP: ffff88103fd43ef0 R08: 0000000000000000 R09: 0000000000000555 > [391862.296985] R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffffffff81e6be9f > [391862.346312] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ee6b2800 > [391862.395985] watchdog_timer_fn+0x21a/0x230 > [391862.430116] ? 
watchdog+0x30/0x30 > [391862.460248] __hrtimer_run_queues+0xe7/0x230 > [391862.494845] hrtimer_interrupt+0xa8/0x1a0 > [391862.527650] smp_apic_timer_interrupt+0x6b/0x140 > [391862.563130] apic_timer_interrupt+0x8e/0xa0 > [391862.596032] </IRQ> > [391862.618884] RIP: 0010:fsnotify+0x218/0x510 > [391862.650285] RSP: 0018:ffffc9000b877db8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 > [391862.698849] RAX: ffff882001c77a98 RBX: ffff882001c77a70 RCX: 0000000000000002 > [391862.744636] RDX: 0000000000028400 RSI: 0000000000000002 RDI: ffffffff8269a4e0 > [391862.791246] RBP: ffffc9000b877e98 R08: 0000000000000000 R09: 0000000000000000 > [391862.837248] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > [391862.883324] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [391862.928937] ? fsnotify+0x4bb/0x510 > [391862.957183] vfs_write+0x151/0x1b0 > [391862.984840] ? syscall_trace_enter+0x1cd/0x2b0 > [391863.017128] SyS_write+0x55/0xc0 > [391863.043812] do_syscall_64+0x79/0x1b0 > [391863.072403] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [391863.107687] RIP: 0033:0x483084 > [391863.133412] RSP: 002b:000000c43197d7f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > [391863.180683] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000483084 > [391863.226639] RDX: 00000000000002a9 RSI: 000000c424283c00 RDI: 0000000000000040 > [391863.272308] RBP: 000000c43197d840 R08: 0000000000000000 R09: 0000000000000000 > [391863.317590] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > [391863.363056] R13: 00000000000000f2 R14: 0000000000000032 R15: 0000000000000002 > [391863.409871] Code: c0 74 1a 48 8b 05 7f 44 ec 00 be fd 00 00 00 48 8b 80 a0 00 00 00 e8 ae 1a 9b > 00 5d c3 89 fe 48 c7 c7 b8 26 e5 81 e8 21 45 09 00 <0f> 0b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f > 44 00 00 55 48 > [391863.522945] ---[ end trace c661065d595325ac ]--- > > > ----[ sar -f ./sa16 -s 04:25:50 -e 05:00:00 -P 21 ]---- > Linux 4.14.32-1.el7.x86_64 (foobar) 
04/16/2018 _x86_64_ (32 CPU)
>
> 04:25:50 AM CPU %user %nice %system %iowait %steal %idle
> 04:25:51 AM 21 0.00 0.00 0.00 0.00 0.00 100.00
> 04:25:52 AM 21 1.00 0.00 1.00 0.00 0.00 98.00
> 04:25:53 AM 21 0.00 0.00 0.00 0.00 0.00 100.00
> 04:25:54 AM 21 1.00 0.00 0.00 0.00 0.00 99.00
> 04:25:55 AM 21 0.00 0.00 70.71 0.00 0.00 29.29
> 04:25:56 AM 21 0.00 0.00 100.00 0.00 0.00 0.00
> 04:25:57 AM 21 0.00 0.00 100.00 0.00 0.00 0.00
> 04:25:58 AM 21 0.00 0.00 100.00 0.00 0.00 0.00
> 04:25:59 AM 21 0.00 0.00 100.00 0.00 0.00 0.00
> 04:26:00 AM 21 0.00 0.00 100.00 0.00 0.00 0.00
> 04:26:01 AM 21 0.00 0.00 100.00 0.00 0.00 0.00
> 04:26:02 AM 21 0.00 0.00 100.00 0.00 0.00 0.00
> 04:26:03 AM 21 0.00 0.00 100.00 0.00 0.00 0.00
> ----[ sar -f ./sa16 -s 04:25:50 -e 05:00:00 -P 21 ]----
>
> The fact that one CPU is spinning at 100% utilization in all of the above crashes is a good thing,
> as we can use it as a starting point for our investigation. We just need to find out which
> component (kernel, hardware, network driver, or userland application) makes a single CPU get stuck.
> Thus, we disabled the trigger that panics the kernel when a soft lockup occurs, and we hope
> to identify the process.
>
> The following panic is of the second type I mentioned, where we don't
> observe soft lockups and CPU utilization is close to zero before the crash.
>
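As an aside on the soft-lockup cases above: the moment a CPU becomes stuck can be picked out of the quoted sar samples mechanically. The following is only a minimal sketch, not part of the original report. The function name `saturation_start`, the 99% threshold, and the three-sample run length are all assumptions chosen for illustration, and it assumes the input covers a single CPU (as with `sar -P 21`) in the 12-hour column layout shown above.

```python
import re

# Matches one per-CPU sar sample, e.g.
# "04:25:55 AM 21 0.00 0.00 70.71 0.00 0.00 29.29"
# Columns after the CPU number: %user %nice %system %iowait %steal %idle
SAMPLE = re.compile(
    r"(\d{2}:\d{2}:\d{2} [AP]M)\s+(\d+)\s+"
    r"([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)"
)


def saturation_start(sar_text, threshold=99.0, min_run=3):
    """Return (timestamp, cpu) of the first sample that begins a run of at
    least `min_run` consecutive samples with %system >= threshold, else None.

    Hypothetical helper: assumes all samples belong to the same CPU.
    """
    run = []
    for line in sar_text.splitlines():
        m = SAMPLE.search(line)
        if not m:
            continue  # column header, separator, or quote-only line
        stamp, cpu, system = m.group(1), int(m.group(2)), float(m.group(5))
        if system >= threshold:
            run.append((stamp, cpu))
            if len(run) >= min_run:
                return run[0]
        else:
            run = []  # the run was interrupted; start over
    return None


# A few samples copied from the CPU 21 excerpt quoted above.
sample = """\
> 04:25:54 AM 21 1.00 0.00 0.00 0.00 0.00 99.00
> 04:25:55 AM 21 0.00 0.00 70.71 0.00 0.00 29.29
> 04:25:56 AM 21 0.00 0.00 100.00 0.00 0.00 0.00
> 04:25:57 AM 21 0.00 0.00 100.00 0.00 0.00 0.00
> 04:25:58 AM 21 0.00 0.00 100.00 0.00 0.00 0.00
"""
print(saturation_start(sample))  # ('04:25:56 AM', 21)
```

Requiring a short run rather than a single sample avoids flagging a one-second spike. The returned timestamp is only a starting point for correlating with other logs from the same second; it does not identify the process by itself.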
> > [123379.816452] perf: interrupt took too long (6243 > 6231), lowering > kernel.perf_event_max_sample_rate to 32000 > [295349.255065] general protection fault: 0000 [#1] SMP PTI > [295349.281440] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > inet_diag unix_diag cfg80211 rfkill 8021q garp mrp xfs x86_pkg_temp_thermal intel_powerclamp loop > coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel > crypto_simd glue_helper cryptd iTCO_wdt ipmi_si iTCO_vendor_support intel_cstate intel_rapl_perf > lpc_ich sg hpilo hpwdt ioatdma pcspkr ipmi_devintf i2c_i801 dca shpchp mfd_core wmi ipmi_msghandler > nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 i2c_algo_bit drm_kms_helper > syscopyarea sysfillrect sysimgblt sd_mod fb_sys_fops ttm bnx2x mdio libcrc32c crc32c_intel serio_raw > hpsa ptp drm scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod dax > [295349.615070] CPU: 26 PID: 1384 Comm: thread.rb:70 Not tainted 4.14.32-1.el7.x86_64 #1 > [295349.654011] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/25/2017 > [295349.686931] task: ffff882035430000 task.stack: ffffc90007bb4000 > [295349.716421] RIP: 0010:prefetch_freepointer.isra.63+0x11/0x20 > [295349.744812] RSP: 0018:ffffc90007bb7e08 EFLAGS: 00010202 > [295349.771654] RAX: 0000000000000000 RBX: 6236612d38373234 RCX: 00000000000199bb > [295349.807690] RDX: 00000000000199ba RSI: 6236612d38373234 RDI: ffff88203ec259a0 > [295349.843664] RBP: ffffc90007bb7e08 R08: 0000000000028060 R09: ffffffff82051cc0 > [295349.879868] R10: 0000000000002000 R11: 0000000000000040 R12: 00000000014000c0 > [295349.916097] R13: ffff88203ec25980 R14: ffff88203ec25980 R15: ffff882000000000 > [295349.951868] FS: 00007f3f439f9700(0000) GS:ffff88203f480000(0000) knlGS:0000000000000000 > [295349.993039] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [295350.021664] CR2: 000000c43069c000 CR3: 000000203943e001 CR4: 
00000000003606e0 > [295350.057534] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [295350.093663] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [295350.129254] Call Trace: > [295350.141644] kmem_cache_alloc+0x9c/0x1b0 > [295350.161581] ? fsnotify_add_mark_locked+0x153/0x320 > [295350.186330] fsnotify_add_mark_locked+0x153/0x320 > [295350.210023] SyS_inotify_add_watch+0x2d5/0x350 > [295350.232414] do_syscall_64+0x79/0x1b0 > [295350.250528] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [295350.275482] RIP: 0033:0x7f3f53f409b7 > [295350.293330] RSP: 002b:00007f3f439f70c8 EFLAGS: 00000202 ORIG_RAX: 00000000000000fe > [295350.330889] RAX: ffffffffffffffda RBX: 00007f3f2c232fc0 RCX: 00007f3f53f409b7 > [295350.365971] RDX: 0000000022000fc6 RSI: 0000000002eaba50 RDI: 0000000000000018 > [295350.400949] RBP: 0000000002677d20 R08: 000000005ad2a563 R09: 0000000009caa9a8 > [295350.436090] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000002677d20 > [295350.471552] R13: 000000000000fd02 R14: 000000000005dc08 R15: 00000000000081a4 > [295350.507348] Code: 31 d2 e8 b3 ea ff ff 5b 41 5c 5d c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 > 0f 1f 44 00 00 55 48 85 f6 48 89 e5 74 0a 48 63 07 <48> 8b 04 06 0f 18 08 5d c3 66 0f 1f 44 00 00 0f > 1f 44 00 00 48 > [295350.601490] RIP: prefetch_freepointer.isra.63+0x11/0x20 RSP: ffffc90007bb7e08 > [295350.637891] ---[ end trace 97f09d2dbcdbfe07 ]--- > [295350.666426] Kernel panic - not syncing: Fatal exception > [295350.692470] Kernel Offset: disabled > [295350.715267] ---[ end Kernel panic - not syncing: Fatal exception > [295350.745027] ------------[ cut here ]------------ > [295350.767882] WARNING: CPU: 26 PID: 1384 at kernel/sched/core.c:1179 set_task_cpu+0x197/0x1a0 > [295350.809229] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > inet_diag unix_diag cfg80211 rfkill 8021q garp mrp xfs x86_pkg_temp_thermal intel_powerclamp loop > coretemp kvm_intel kvm irqbypass 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel > crypto_simd glue_helper cryptd iTCO_wdt ipmi_si iTCO_vendor_support intel_cstate intel_rapl_perf > lpc_ich sg hpilo hpwdt ioatdma pcspkr ipmi_devintf i2c_i801 dca shpchp mfd_core wmi ipmi_msghandler > nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 i2c_algo_bit drm_kms_helper > syscopyarea sysfillrect sysimgblt sd_mod fb_sys_fops ttm bnx2x mdio libcrc32c crc32c_intel serio_raw > hpsa ptp drm scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod dax > [295351.141701] CPU: 26 PID: 1384 Comm: thread.rb:70 Tainted: G D 4.14.32-1.el7.x86_64 #1 > [295351.186528] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/25/2017 > [295351.219763] task: ffff882035430000 task.stack: ffffc90007bb4000 > [295351.249425] RIP: 0010:set_task_cpu+0x197/0x1a0 > [295351.272046] RSP: 0018:ffff88203f483cd8 EFLAGS: 00010046 > [295351.298021] RAX: 0000000000000200 RBX: ffff880fc6730000 RCX: 0000000000000001 > [295351.333003] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff880fc6730000 > [295351.368440] RBP: ffff88203f483cf8 R08: 0000000000000008 R09: 0000000000000000 > [295351.404295] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880fc6730bac > [295351.440065] R13: 0000000000000008 R14: 0000000000000008 R15: 0000000000022ac0 > [295351.475936] FS: 00007f3f439f9700(0000) GS:ffff88203f480000(0000) knlGS:0000000000000000 > [295351.516850] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [295351.545941] CR2: 000000c43069c000 CR3: 000000203943e001 CR4: 00000000003606e0 > [295351.581551] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [295351.616790] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [295351.652332] Call Trace: > [295351.664980] <IRQ> > [295351.675389] try_to_wake_up+0x16c/0x480 > [295351.694771] default_wake_function+0x12/0x20 > [295351.716287] autoremove_wake_function+0x16/0x60 > [295351.738731] __wake_up_common+0x8f/0x160 > 
[295351.758434] __wake_up_common_lock+0x7e/0xc0 > [295351.780379] __wake_up+0x13/0x20 > [295351.796700] wake_up_klogd_work_func+0x40/0x60 > [295351.818797] irq_work_run_list+0x53/0x80 > [295351.838265] ? tick_sched_do_timer+0x70/0x70 > [295351.859777] irq_work_tick+0x40/0x50 > [295351.877976] update_process_times+0x42/0x60 > [295351.899104] tick_sched_handle+0x2d/0x60 > [295351.919406] tick_sched_timer+0x39/0x70 > [295351.938722] __hrtimer_run_queues+0xe7/0x230 > [295351.960148] hrtimer_interrupt+0xa8/0x1a0 > [295351.979989] smp_apic_timer_interrupt+0x6b/0x140 > [295352.003308] apic_timer_interrupt+0x8e/0xa0 > [295352.024371] </IRQ> > [295352.035497] RIP: 0010:panic+0x206/0x258 > [295352.055056] RSP: 0018:ffffc90007bb7c58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 > [295352.092974] RAX: 0000000000000034 RBX: 0000000000000200 RCX: 0000000000000006 > [295352.129345] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff88203f4969d0 > [295352.164888] RBP: ffffc90007bb7cc8 R08: 0000000000000000 R09: 00000000000004bf > [295352.200268] R10: ffffffff8140e7c0 R11: 00000000000004be R12: ffffffff81e4b096 > [295352.236368] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [295352.272653] ? vgacon_invert_region+0x80/0x80 > [295352.294690] ? 
panic+0x1ff/0x258 > [295352.311125] oops_end+0xba/0xd0 > [295352.327275] die+0x42/0x50 > [295352.341034] do_general_protection+0xd2/0x160 > [295352.362771] general_protection+0x25/0x50 > [295352.382624] RIP: 0010:prefetch_freepointer.isra.63+0x11/0x20 > [295352.410365] RSP: 0018:ffffc90007bb7e08 EFLAGS: 00010202 > [295352.435958] RAX: 0000000000000000 RBX: 6236612d38373234 RCX: 00000000000199bb > [295352.471228] RDX: 00000000000199ba RSI: 6236612d38373234 RDI: ffff88203ec259a0 > [295352.506333] RBP: ffffc90007bb7e08 R08: 0000000000028060 R09: ffffffff82051cc0 > [295352.541869] R10: 0000000000002000 R11: 0000000000000040 R12: 00000000014000c0 > [295352.577452] R13: ffff88203ec25980 R14: ffff88203ec25980 R15: ffff882000000000 > [295352.613390] ? idr_alloc_cmn+0x98/0xe0 > [295352.633360] kmem_cache_alloc+0x9c/0x1b0 > [295352.653132] ? fsnotify_add_mark_locked+0x153/0x320 > [295352.677495] fsnotify_add_mark_locked+0x153/0x320 > [295352.700960] SyS_inotify_add_watch+0x2d5/0x350 > [295352.723337] do_syscall_64+0x79/0x1b0 > [295352.741929] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [295352.767022] RIP: 0033:0x7f3f53f409b7 > [295352.785431] RSP: 002b:00007f3f439f70c8 EFLAGS: 00000202 ORIG_RAX: 00000000000000fe > [295352.823469] RAX: ffffffffffffffda RBX: 00007f3f2c232fc0 RCX: 00007f3f53f409b7 > [295352.859222] RDX: 0000000022000fc6 RSI: 0000000002eaba50 RDI: 0000000000000018 > [295352.901958] RBP: 0000000002677d20 R08: 000000005ad2a563 R09: 0000000009caa9a8 > [295352.937907] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000002677d20 > [295352.974108] R13: 000000000000fd02 R14: 000000000005dc08 R15: 00000000000081a4 > [295353.010354] Code: ff 80 8b ac 08 00 00 04 e9 20 ff ff ff 0f 0b e9 b9 fe ff ff f7 83 84 00 00 00 > fd ff ff ff 0f 84 c3 fe ff ff 0f 0b e9 bc fe ff ff <0f> 0b e9 cb fe ff ff 66 90 0f 1f 44 00 00 55 48 > 89 e5 41 56 49 > [295353.103228] ---[ end trace 97f09d2dbcdbfe08 ]--- > [295353.126793] sched: Unexpected reschedule of offline CPU#8! 
> [295353.154571] ------------[ cut here ]------------ > [295353.178193] WARNING: CPU: 26 PID: 1384 at arch/x86/kernel/smp.c:128 > native_smp_send_reschedule+0x42/0x50 > [295353.225115] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > inet_diag unix_diag cfg80211 rfkill 8021q garp mrp xfs x86_pkg_temp_thermal intel_powerclamp loop > coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel > crypto_simd glue_helper cryptd iTCO_wdt ipmi_si iTCO_vendor_support intel_cstate intel_rapl_perf > lpc_ich sg hpilo hpwdt ioatdma pcspkr ipmi_devintf i2c_i801 dca shpchp mfd_core wmi ipmi_msghandler > nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 i2c_algo_bit drm_kms_helper > syscopyarea sysfillrect sysimgblt sd_mod fb_sys_fops ttm bnx2x mdio libcrc32c crc32c_intel serio_raw > hpsa ptp drm scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod dax > [295353.554858] CPU: 26 PID: 1384 Comm: thread.rb:70 Tainted: G D W 4.14.32-1.el7.x86_64 #1 > [295353.600673] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/25/2017 > [295353.634304] task: ffff882035430000 task.stack: ffffc90007bb4000 > [295353.664086] RIP: 0010:native_smp_send_reschedule+0x42/0x50 > [295353.691429] RSP: 0018:ffff88203f483c60 EFLAGS: 00010046 > [295353.717211] RAX: 000000000000002e RBX: 0000000000000008 RCX: 0000000000000006 > [295353.753162] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff88203f4969d0 > [295353.789028] RBP: ffff88203f483c60 R08: 0000000000000000 R09: 000000000000050a > [295353.824901] R10: ffffffff8140e7c0 R11: 0000000000000509 R12: ffff88203f222ac0 > [295353.860780] R13: ffff880fc6730000 R14: ffff88203f483d18 R15: ffff88203f222ac0 > [295353.897041] FS: 00007f3f439f9700(0000) GS:ffff88203f480000(0000) knlGS:0000000000000000 > [295353.937015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [295353.965230] CR2: 000000c43069c000 CR3: 000000203943e001 CR4: 00000000003606e0 > 
[295354.001263] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [295354.037348] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [295354.073079] Call Trace: > [295354.085676] <IRQ> > [295354.096271] resched_curr+0xae/0xd0 > [295354.114398] check_preempt_curr+0x79/0xa0 > [295354.134774] ttwu_do_wakeup+0x1e/0x160 > [295354.153738] ttwu_do_activate+0x7a/0x90 > [295354.173017] try_to_wake_up+0x1e7/0x480 > [295354.192199] default_wake_function+0x12/0x20 > [295354.213726] autoremove_wake_function+0x16/0x60 > [295354.236555] __wake_up_common+0x8f/0x160 > [295354.256636] __wake_up_common_lock+0x7e/0xc0 > [295354.278570] __wake_up+0x13/0x20 > [295354.295265] wake_up_klogd_work_func+0x40/0x60 > [295354.317984] irq_work_run_list+0x53/0x80 > [295354.337965] ? tick_sched_do_timer+0x70/0x70 > [295354.359264] irq_work_tick+0x40/0x50 > [295354.377736] update_process_times+0x42/0x60 > [295354.399024] tick_sched_handle+0x2d/0x60 > [295354.418996] tick_sched_timer+0x39/0x70 > [295354.438406] __hrtimer_run_queues+0xe7/0x230 > [295354.459586] hrtimer_interrupt+0xa8/0x1a0 > [295354.479258] smp_apic_timer_interrupt+0x6b/0x140 > [295354.502194] apic_timer_interrupt+0x8e/0xa0 > [295354.523081] </IRQ> > [295354.533789] RIP: 0010:panic+0x206/0x258 > [295354.553565] RSP: 0018:ffffc90007bb7c58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 > [295354.590890] RAX: 0000000000000034 RBX: 0000000000000200 RCX: 0000000000000006 > [295354.626876] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff88203f4969d0 > [295354.662703] RBP: ffffc90007bb7cc8 R08: 0000000000000000 R09: 00000000000004bf > [295354.698251] R10: ffffffff8140e7c0 R11: 00000000000004be R12: ffffffff81e4b096 > [295354.733758] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [295354.769850] ? vgacon_invert_region+0x80/0x80 > [295354.791724] ? 
panic+0x1ff/0x258 > [295354.808021] oops_end+0xba/0xd0 > [295354.823809] die+0x42/0x50 > [295354.837948] do_general_protection+0xd2/0x160 > [295354.859636] general_protection+0x25/0x50 > [295354.880150] RIP: 0010:prefetch_freepointer.isra.63+0x11/0x20 > [295354.908869] RSP: 0018:ffffc90007bb7e08 EFLAGS: 00010202 > [295354.935002] RAX: 0000000000000000 RBX: 6236612d38373234 RCX: 00000000000199bb > [295354.970812] RDX: 00000000000199ba RSI: 6236612d38373234 RDI: ffff88203ec259a0 > [295355.006560] RBP: ffffc90007bb7e08 R08: 0000000000028060 R09: ffffffff82051cc0 > [295355.042849] R10: 0000000000002000 R11: 0000000000000040 R12: 00000000014000c0 > [295355.077849] R13: ffff88203ec25980 R14: ffff88203ec25980 R15: ffff882000000000 > [295355.113175] ? idr_alloc_cmn+0x98/0xe0 > [295355.132128] kmem_cache_alloc+0x9c/0x1b0 > [295355.151819] ? fsnotify_add_mark_locked+0x153/0x320 > [295355.176264] fsnotify_add_mark_locked+0x153/0x320 > [295355.199925] SyS_inotify_add_watch+0x2d5/0x350 > [295355.222164] do_syscall_64+0x79/0x1b0 > [295355.240555] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [295355.266353] RIP: 0033:0x7f3f53f409b7 > [295355.284573] RSP: 002b:00007f3f439f70c8 EFLAGS: 00000202 ORIG_RAX: 00000000000000fe > [295355.322272] RAX: ffffffffffffffda RBX: 00007f3f2c232fc0 RCX: 00007f3f53f409b7 > [295355.357920] RDX: 0000000022000fc6 RSI: 0000000002eaba50 RDI: 0000000000000018 > [295355.393626] RBP: 0000000002677d20 R08: 000000005ad2a563 R09: 0000000009caa9a8 > [295355.429391] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000002677d20 > [295355.464726] R13: 000000000000fd02 R14: 000000000005dc08 R15: 00000000000081a4 > [295355.500091] Code: c0 74 1a 48 8b 05 7f 44 ec 00 be fd 00 00 00 48 8b 80 a0 00 00 00 e8 ae 1a 9b > 00 5d c3 89 fe 48 c7 c7 b8 26 e5 81 e8 21 45 09 00 <0f> 0b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f > 44 00 00 55 48 > [295355.592809] ---[ end trace 97f09d2dbcdbfe09 ]--- > [295355.616249] sched: Unexpected reschedule of offline CPU#0! 
> [295355.642901] ------------[ cut here ]------------ > [295355.666243] WARNING: CPU: 26 PID: 1384 at arch/x86/kernel/smp.c:128 > native_smp_send_reschedule+0x42/0x50 > [295355.713782] Modules linked in: binfmt_misc sctp_diag sctp dccp_diag dccp tcp_diag udp_diag > inet_diag unix_diag cfg80211 rfkill 8021q garp mrp xfs x86_pkg_temp_thermal intel_powerclamp loop > coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel > crypto_simd glue_helper cryptd iTCO_wdt ipmi_si iTCO_vendor_support intel_cstate intel_rapl_perf > lpc_ich sg hpilo hpwdt ioatdma pcspkr ipmi_devintf i2c_i801 dca shpchp mfd_core wmi ipmi_msghandler > nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 i2c_algo_bit drm_kms_helper > syscopyarea sysfillrect sysimgblt sd_mod fb_sys_fops ttm bnx2x mdio libcrc32c crc32c_intel serio_raw > hpsa ptp drm scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod dax > [295356.048067] CPU: 26 PID: 1384 Comm: thread.rb:70 Tainted: G D W 4.14.32-1.el7.x86_64 #1 > [295356.094292] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/25/2017 > [295356.127304] task: ffff882035430000 task.stack: ffffc90007bb4000 > [295356.157937] RIP: 0010:native_smp_send_reschedule+0x42/0x50 > [295356.186118] RSP: 0018:ffff88203f483c58 EFLAGS: 00010046 > [295356.212721] RAX: 000000000000002e RBX: ffff8810391945c0 RCX: 0000000000000006 > [295356.247928] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff88203f4969d0 > [295356.284320] RBP: ffff88203f483c58 R08: 0000000000000000 R09: 0000000000000559 > [295356.320685] R10: ffffffff8140e7c0 R11: 0000000000000558 R12: ffff88103919516c > [295356.356635] R13: 0000000000000004 R14: 0000000000000046 R15: 0000000000022ac0 > [295356.392135] FS: 00007f3f439f9700(0000) GS:ffff88203f480000(0000) knlGS:0000000000000000 > [295356.432737] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [295356.461522] CR2: 000000c43069c000 CR3: 000000203943e001 CR4: 00000000003606e0 > 
[295356.497800] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [295356.533485] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [295356.569205] Call Trace: > [295356.581694] <IRQ> > [295356.591921] try_to_wake_up+0x405/0x480 > [295356.611188] default_wake_function+0x12/0x20 > [295356.632564] __wake_up_common+0x8f/0x160 > [295356.652486] __wake_up_locked+0x16/0x20 > [295356.671808] ep_poll_callback+0xd0/0x300 > [295356.691565] __wake_up_common+0x8f/0x160 > [295356.711684] __wake_up_common_lock+0x7e/0xc0 > [295356.733447] __wake_up+0x13/0x20 > [295356.749916] wake_up_klogd_work_func+0x40/0x60 > [295356.772512] irq_work_run_list+0x53/0x80 > [295356.792701] ? tick_sched_do_timer+0x70/0x70 > [295356.821294] irq_work_tick+0x40/0x50 > [295356.839929] update_process_times+0x42/0x60 > [295356.860941] tick_sched_handle+0x2d/0x60 > [295356.881072] tick_sched_timer+0x39/0x70 > [295356.900787] __hrtimer_run_queues+0xe7/0x230 > [295356.922396] hrtimer_interrupt+0xa8/0x1a0 > [295356.942760] smp_apic_timer_interrupt+0x6b/0x140 > [295356.966377] apic_timer_interrupt+0x8e/0xa0 > [295356.987700] </IRQ> > [295356.998764] RIP: 0010:panic+0x206/0x258 > [295357.018139] RSP: 0018:ffffc90007bb7c58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 > [295357.055880] RAX: 0000000000000034 RBX: 0000000000000200 RCX: 0000000000000006 > [295357.092139] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff88203f4969d0 > [295357.127348] RBP: ffffc90007bb7cc8 R08: 0000000000000000 R09: 00000000000004bf > [295357.163530] R10: ffffffff8140e7c0 R11: 00000000000004be R12: ffffffff81e4b096 > [295357.200334] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [295357.236063] ? vgacon_invert_region+0x80/0x80 > [295357.257667] ? 
panic+0x1ff/0x258 > [295357.274076] oops_end+0xba/0xd0 > [295357.290155] die+0x42/0x50 > [295357.303914] do_general_protection+0xd2/0x160 > [295357.326145] general_protection+0x25/0x50 > [295357.346126] RIP: 0010:prefetch_freepointer.isra.63+0x11/0x20 > [295357.374233] RSP: 0018:ffffc90007bb7e08 EFLAGS: 00010202 > [295357.400584] RAX: 0000000000000000 RBX: 6236612d38373234 RCX: 00000000000199bb > [295357.436122] RDX: 00000000000199ba RSI: 6236612d38373234 RDI: ffff88203ec259a0 > [295357.471905] RBP: ffffc90007bb7e08 R08: 0000000000028060 R09: ffffffff82051cc0 > [295357.508220] R10: 0000000000002000 R11: 0000000000000040 R12: 00000000014000c0 > [295357.544201] R13: ffff88203ec25980 R14: ffff88203ec25980 R15: ffff882000000000 > [295357.580063] ? idr_alloc_cmn+0x98/0xe0 > [295357.598651] kmem_cache_alloc+0x9c/0x1b0 > [295357.617905] ? fsnotify_add_mark_locked+0x153/0x320 > [295357.641988] fsnotify_add_mark_locked+0x153/0x320 > [295357.665286] SyS_inotify_add_watch+0x2d5/0x350 > [295357.687722] do_syscall_64+0x79/0x1b0 > [295357.706171] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [295357.731499] RIP: 0033:0x7f3f53f409b7 > [295357.749414] RSP: 002b:00007f3f439f70c8 EFLAGS: 00000202 ORIG_RAX: 00000000000000fe > [295357.787490] RAX: ffffffffffffffda RBX: 00007f3f2c232fc0 RCX: 00007f3f53f409b7 > [295357.823420] RDX: 0000000022000fc6 RSI: 0000000002eaba50 RDI: 0000000000000018 > [295357.859615] RBP: 0000000002677d20 R08: 000000005ad2a563 R09: 0000000009caa9a8 > [295357.895120] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000002677d20 > [295357.931829] R13: 000000000000fd02 R14: 000000000005dc08 R15: 00000000000081a4 > [295357.967565] Code: c0 74 1a 48 8b 05 7f 44 ec 00 be fd 00 00 00 48 8b 80 a0 00 00 00 e8 ae 1a 9b > 00 5d c3 89 fe 48 c7 c7 b8 26 e5 81 e8 21 45 09 00 <0f> 0b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f > 44 00 00 55 48 > [295358.060705] ---[ end trace 97f09d2dbcdbfe0a ]--- > > > ---[ sar -f ./sa15 -s 01:05:00 -e 02:00:00 -P 26 ]--- > Linux 
4.14.32-1.el7.x86_64 (foomar) 04/15/2018 _x86_64_ (32 CPU)
>
> 01:05:00 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
> 01:05:01 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:02 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:03 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:04 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:05 AM      26      0.99      0.00      0.99      0.00      0.00     98.02
> 01:05:06 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:07 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:08 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:09 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:10 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:11 AM      26      0.99      0.00      0.00      0.00      0.00     99.01
> 01:05:12 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:13 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:14 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:15 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:16 AM      26      2.00      0.00      1.00      0.00      0.00     97.00
> 01:05:17 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> 01:05:18 AM      26      0.00      0.00      0.00      0.00      0.00    100.00
> ---[ sar -f ./sa15 -s 01:05:00 -e 02:00:00 -P 26 ]---
>
>
> Any ideas would be very much appreciated.
>
> Cheers,
> Pavlos Parissis
>

-- 
Guillaume Morin <guillaume(a)morinfr.org>

[RHEL7.6 PATCH 35/37] Drivers: hv: vmbus: do not mark HV_PCIE as perf_device

by Mohammed Gamal

The pci-hyperv driver's channel callback hv_pci_onchannelcallback() is not
really a hot path, so we don't need to mark it as a perf_device, meaning
with this patch all HV_PCIE channels' target_cpu will be CPU0.

Signed-off-by: Dexuan Cui <decui(a)microsoft.com>
Cc: stable(a)vger.kernel.org
Cc: Stephen Hemminger <sthemmin(a)microsoft.com>
Cc: K. Y. Srinivasan <kys(a)microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
(cherry picked from commit 238064f13d057390a8c5e1a6a80f4f0a0ec46499)
Signed-off-by: Mohammed Gamal <mgamal(a)redhat.com>
---
 drivers/hv/channel_mgmt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index c6d9d19..ecc2bd2 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -71,7 +71,7 @@ static const struct vmbus_device vmbus_devs[] = {
 	/* PCIE */
 	{ .dev_type = HV_PCIE,
 	  HV_PCIE_GUID,
-	  .perf_device = true,
+	  .perf_device = false,
 	},

 	/* Synthetic Frame Buffer */
-- 
1.8.3.1

FAILED: patch "[PATCH] pwm: mediatek: Improve precision in rate calculation" failed to apply to 4.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From 04c0a4e00dc11fedc0b0a8593adcf0f4310505d4 Mon Sep 17 00:00:00 2001
From: Sean Wang <sean.wang(a)mediatek.com>
Date: Fri, 2 Mar 2018 16:49:14 +0800
Subject: [PATCH] pwm: mediatek: Improve precision in rate calculation

Add a way that turning resolution from in nanosecond into in picosecond
to improve noticeably almost 4.5% precision.

It's necessary to hold the new resolution with type u64 and thus related
operations on u64 are applied instead in those rate calculations.

And the patch has a dependency on [1].

[1] http://lists.infradead.org/pipermail/linux-mediatek/2018-March/012225.html

Cc: stable(a)vger.kernel.org
Fixes: caf065f8fd58 ("pwm: Add MediaTek PWM support")
Signed-off-by: Sean Wang <sean.wang(a)mediatek.com>
Signed-off-by: Thierry Reding <thierry.reding(a)gmail.com>

diff --git a/drivers/pwm/pwm-mediatek.c b/drivers/pwm/pwm-mediatek.c
index 502c366c7d7c..328c124773b2 100644
--- a/drivers/pwm/pwm-mediatek.c
+++ b/drivers/pwm/pwm-mediatek.c
@@ -135,19 +135,25 @@ static int mtk_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
 {
 	struct mtk_pwm_chip *pc = to_mtk_pwm_chip(chip);
 	struct clk *clk = pc->clks[MTK_CLK_PWM1 + pwm->hwpwm];
-	u32 resolution, clkdiv = 0, reg_width = PWMDWIDTH,
+	u32 clkdiv = 0, cnt_period, cnt_duty, reg_width = PWMDWIDTH,
 	    reg_thres = PWMTHRES;
+	u64 resolution;
 	int ret;

 	ret = mtk_pwm_clk_enable(chip, pwm);
 	if (ret < 0)
 		return ret;

-	resolution = NSEC_PER_SEC / clk_get_rate(clk);
+	/* Using resolution in picosecond gets accuracy higher */
+	resolution = (u64)NSEC_PER_SEC * 1000;
+	do_div(resolution, clk_get_rate(clk));

-	while (period_ns / resolution > 8191) {
+	cnt_period = DIV_ROUND_CLOSEST_ULL((u64)period_ns * 1000, resolution);
+	while (cnt_period > 8191) {
 		resolution *= 2;
 		clkdiv++;
+		cnt_period = DIV_ROUND_CLOSEST_ULL((u64)period_ns * 1000,
+						   resolution);
 	}

 	if (clkdiv > PWM_CLK_DIV_MAX) {
@@ -165,9 +171,10 @@ static int mtk_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
 		reg_thres = PWM45THRES_FIXUP;
 	}

+	cnt_duty = DIV_ROUND_CLOSEST_ULL((u64)duty_ns * 1000, resolution);
 	mtk_pwm_writel(pc, pwm->hwpwm, PWMCON, BIT(15) | clkdiv);
-	mtk_pwm_writel(pc, pwm->hwpwm, reg_width, period_ns / resolution);
-	mtk_pwm_writel(pc, pwm->hwpwm, reg_thres, duty_ns / resolution);
+	mtk_pwm_writel(pc, pwm->hwpwm, reg_width, cnt_period);
+	mtk_pwm_writel(pc, pwm->hwpwm, reg_thres, cnt_duty);

 	mtk_pwm_clk_disable(chip, pwm);
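
For readers who want to see the rounding effect outside the kernel, here is
a minimal userspace sketch of the two calculations in the patch above. It is
not the driver code: the function and struct names are made up for
illustration, div_round_closest() only stands in for the kernel's
DIV_ROUND_CLOSEST_ULL(), the 66 MHz clock used in the note below is an
assumed value that does not come from the patch, and the clkdiv >
PWM_CLK_DIV_MAX check is left out.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define PWM_CNT_MAX  8191ULL	/* largest period count accepted by the loop in the patch */

/* Hypothetical result type, for illustration only. */
struct pwm_cnt {
	uint32_t cnt_period;	/* value that would go into the width register */
	uint32_t clkdiv;	/* number of times the resolution was doubled */
};

/* Unsigned divide rounding to nearest; stands in for DIV_ROUND_CLOSEST_ULL(). */
uint64_t div_round_closest(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d / 2) / d;
}

/* Pre-patch arithmetic: resolution kept in whole nanoseconds per tick. */
struct pwm_cnt cnt_with_ns_resolution(uint32_t period_ns, uint64_t clk_hz)
{
	uint64_t resolution;
	struct pwm_cnt out = { 0, 0 };

	assert(clk_hz > 0 && clk_hz <= NSEC_PER_SEC);
	resolution = NSEC_PER_SEC / clk_hz;	/* truncates the fractional nanoseconds */
	while (period_ns / resolution > PWM_CNT_MAX) {
		resolution *= 2;
		out.clkdiv++;
	}
	out.cnt_period = (uint32_t)(period_ns / resolution);
	return out;
}

/* Post-patch arithmetic: resolution kept in picoseconds per tick, in 64 bits. */
struct pwm_cnt cnt_with_ps_resolution(uint32_t period_ns, uint64_t clk_hz)
{
	uint64_t resolution, cnt;
	struct pwm_cnt out = { 0, 0 };

	assert(clk_hz > 0 && clk_hz <= NSEC_PER_SEC);
	resolution = NSEC_PER_SEC * 1000 / clk_hz;
	cnt = div_round_closest((uint64_t)period_ns * 1000, resolution);
	while (cnt > PWM_CNT_MAX) {
		resolution *= 2;
		out.clkdiv++;
		cnt = div_round_closest((uint64_t)period_ns * 1000, resolution);
	}
	out.cnt_period = (uint32_t)cnt;
	return out;
}
```

With the assumed 66 MHz clock and a 100000 ns period, the exact tick count
is 6600. The nanosecond version truncates the tick length to 15 ns and
returns 6666, about 1% off, while the picosecond version returns 6600. The
error depends on how far the real tick length is from a whole number of
nanoseconds, so other clock rates will give different figures.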

FAILED: patch "[PATCH] pwm: mediatek: Fix up PWM4 and PWM5 malfunction on MT7623" failed to apply to 4.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From 360cc036563db27881ce08049f69138438f2ddd0 Mon Sep 17 00:00:00 2001
From: Sean Wang <sean.wang(a)mediatek.com>
Date: Thu, 1 Mar 2018 16:19:12 +0800
Subject: [PATCH] pwm: mediatek: Fix up PWM4 and PWM5 malfunction on MT7623

Since the offset for both registers, PWMDWIDTH and PWMTHRES, used to
control PWM4 or PWM5 are distinct from the other PWMs, whose wrong
programming on PWM hardware causes waveform cannot be output as expected.
Thus, the patch adds the extra condition for fixing up the weird case to
let PWM4 or PWM5 able to work on MT7623.

v1 -> v2: use pwm45_fixup naming instead of pwm45_quirk
v2 -> v3: add more tags for Reviewed-by, Fixes, and Cc stable

Cc: stable(a)vger.kernel.org
Fixes: caf065f8fd58 ("pwm: Add MediaTek PWM support")
Signed-off-by: Sean Wang <sean.wang(a)mediatek.com>
Reviewed-by: Matthias Brugger <matthias.bgg(a)gmail.com>
Cc: Zhi Mao <zhi.mao(a)mediatek.com>
Cc: John Crispin <john(a)phrozen.org>
Cc: Matthias Brugger <matthias.bgg(a)gmail.com>
Signed-off-by: Thierry Reding <thierry.reding(a)gmail.com>

diff --git a/drivers/pwm/pwm-mediatek.c b/drivers/pwm/pwm-mediatek.c
index f5d97e0ad52b..796baea7e8fe 100644
--- a/drivers/pwm/pwm-mediatek.c
+++ b/drivers/pwm/pwm-mediatek.c
@@ -29,7 +29,9 @@
 #define PWMGDUR			0x0c
 #define PWMWAVENUM		0x28
 #define PWMDWIDTH		0x2c
+#define PWM45DWIDTH_FIXUP	0x30
 #define PWMTHRES		0x30
+#define PWM45THRES_FIXUP	0x34

 #define PWM_CLK_DIV_MAX		7

@@ -54,6 +56,7 @@ static const char * const mtk_pwm_clk_name[MTK_CLK_MAX] = {

 struct mtk_pwm_platform_data {
 	unsigned int num_pwms;
+	bool pwm45_fixup;
 };

 /**
@@ -66,6 +69,7 @@ struct mtk_pwm_chip {
 	struct pwm_chip chip;
 	void __iomem *regs;
 	struct clk *clks[MTK_CLK_MAX];
+	const struct mtk_pwm_platform_data *soc;
 };

 static const unsigned int mtk_pwm_reg_offset[] = {
@@ -131,7 +135,8 @@ static int mtk_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
 {
 	struct mtk_pwm_chip *pc = to_mtk_pwm_chip(chip);
 	struct clk *clk = pc->clks[MTK_CLK_PWM1 + pwm->hwpwm];
-	u32 resolution, clkdiv = 0;
+	u32 resolution, clkdiv = 0, reg_width = PWMDWIDTH,
+	    reg_thres = PWMTHRES;
 	int ret;

 	ret = mtk_pwm_clk_enable(chip, pwm);
@@ -151,9 +156,18 @@ static int mtk_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
 		return -EINVAL;
 	}

+	if (pc->soc->pwm45_fixup && pwm->hwpwm > 2) {
+		/*
+		 * PWM[4,5] has distinct offset for PWMDWIDTH and PWMTHRES
+		 * from the other PWMs on MT7623.
+		 */
+		reg_width = PWM45DWIDTH_FIXUP;
+		reg_thres = PWM45THRES_FIXUP;
+	}
+
 	mtk_pwm_writel(pc, pwm->hwpwm, PWMCON, BIT(15) | clkdiv);
-	mtk_pwm_writel(pc, pwm->hwpwm, PWMDWIDTH, period_ns / resolution);
-	mtk_pwm_writel(pc, pwm->hwpwm, PWMTHRES, duty_ns / resolution);
+	mtk_pwm_writel(pc, pwm->hwpwm, reg_width, period_ns / resolution);
+	mtk_pwm_writel(pc, pwm->hwpwm, reg_thres, duty_ns / resolution);

 	mtk_pwm_clk_disable(chip, pwm);

@@ -211,6 +225,7 @@ static int mtk_pwm_probe(struct platform_device *pdev)
 	data = of_device_get_match_data(&pdev->dev);
 	if (data == NULL)
 		return -EINVAL;
+	pc->soc = data;

 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	pc->regs = devm_ioremap_resource(&pdev->dev, res);
@@ -251,14 +266,17 @@ static int mtk_pwm_remove(struct platform_device *pdev)

 static const struct mtk_pwm_platform_data mt2712_pwm_data = {
 	.num_pwms = 8,
+	.pwm45_fixup = false,
 };

 static const struct mtk_pwm_platform_data mt7622_pwm_data = {
 	.num_pwms = 6,
+	.pwm45_fixup = false,
 };

 static const struct mtk_pwm_platform_data mt7623_pwm_data = {
 	.num_pwms = 5,
+	.pwm45_fixup = true,
 };

 static const struct of_device_id mtk_pwm_of_match[] = {
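
The register selection added by the patch above can be modelled in a few
lines of standalone C. This is only a sketch: the four offsets are copied
from the diff, but struct pwm_regs and pick_pwm_regs() are hypothetical
names that do not exist in the driver, and the sketch assumes hwpwm is a
zero-based channel index, so that "hwpwm > 2" selects the channels the
commit message calls PWM4 and PWM5.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Register offsets as they appear in the diff. */
#define PWMDWIDTH		0x2c
#define PWM45DWIDTH_FIXUP	0x30
#define PWMTHRES		0x30
#define PWM45THRES_FIXUP	0x34

/* Hypothetical pair of offsets for one channel; not a driver type. */
struct pwm_regs {
	uint32_t width;	/* offset of the period (width) register */
	uint32_t thres;	/* offset of the duty (threshold) register */
};

/*
 * Choose the width and threshold offsets for a channel.  The default pair
 * is used unless the SoC needs the fixup and the channel index is above 2.
 */
struct pwm_regs pick_pwm_regs(bool pwm45_fixup, unsigned int num_pwms,
			      unsigned int hwpwm)
{
	struct pwm_regs r = { PWMDWIDTH, PWMTHRES };

	assert(hwpwm < num_pwms);
	if (pwm45_fixup && hwpwm > 2) {
		r.width = PWM45DWIDTH_FIXUP;
		r.thres = PWM45THRES_FIXUP;
	}
	return r;
}
```

Note that PWM45DWIDTH_FIXUP has the same value as the ordinary PWMTHRES
offset (0x30). On the fixup channels, writing the period through the
default PWMDWIDTH offset would therefore miss the width register, which
fits the malfunction described in the commit message.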

FAILED: patch "[PATCH] pwm: rcar: Fix a condition to prevent mismatch value setting" failed to apply to 4.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From 6225f9c64b40bc8a22503e9cda70f55d7a9dd3c6 Mon Sep 17 00:00:00 2001
From: Ryo Kodama <ryo.kodama.vz(a)renesas.com>
Date: Fri, 9 Mar 2018 20:24:21 +0900
Subject: [PATCH] pwm: rcar: Fix a condition to prevent mismatch value setting
 to duty

This patch fixes an issue that is possible to set mismatch value to duty
for R-Car PWM if we input the following commands:

 # cd /sys/class/pwm/<pwmchip>/
 # echo 0 > export
 # cd pwm0
 # echo 30 > period
 # echo 30 > duty_cycle
 # echo 0 > duty_cycle
 # cat duty_cycle
 0
 # echo 1 > enable
 --> Then, the actual duty_cycle is 30, not 0.

So, this patch adds a condition into rcar_pwm_config() to fix this issue.

Signed-off-by: Ryo Kodama <ryo.kodama.vz(a)renesas.com>
[shimoda: revise the commit log and add Fixes and Cc tags]
Fixes: ed6c1476bf7f ("pwm: Add support for R-Car PWM Timer")
Cc: Cc: <stable(a)vger.kernel.org> # v4.4+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Signed-off-by: Thierry Reding <thierry.reding(a)gmail.com>

diff --git a/drivers/pwm/pwm-rcar.c b/drivers/pwm/pwm-rcar.c
index 1c85ecc9e7ac..0fcf94ffad32 100644
--- a/drivers/pwm/pwm-rcar.c
+++ b/drivers/pwm/pwm-rcar.c
@@ -156,8 +156,12 @@ static int rcar_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
 	if (div < 0)
 		return div;

-	/* Let the core driver set pwm->period if disabled and duty_ns == 0 */
-	if (!pwm_is_enabled(pwm) && !duty_ns)
+	/*
+	 * Let the core driver set pwm->period if disabled and duty_ns == 0.
+	 * But, this driver should prevent to set the new duty_ns if current
+	 * duty_cycle is not set
+	 */
+	if (!pwm_is_enabled(pwm) && !duty_ns && !pwm->state.duty_cycle)
 		return 0;

 	rcar_pwm_update(rp, RCAR_PWMCR_SYNC, RCAR_PWMCR_SYNC, RCAR_PWMCR);
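
To make the changed early-return condition easier to follow, the two
versions of the test can be written as small predicates. This is a
simplified model and not driver code: skip_config_old() and
skip_config_new() are hypothetical names, "enabled" stands for
pwm_is_enabled(pwm), and "cur_duty" stands for pwm->state.duty_cycle, taken
here to be the duty cycle that was configured previously.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Pre-patch test: skip programming the hardware whenever the PWM is
 * disabled and the requested duty is 0.
 */
bool skip_config_old(bool enabled, int duty_ns)
{
	assert(duty_ns >= 0);
	return !enabled && duty_ns == 0;
}

/*
 * Post-patch test: additionally require that no non-zero duty was
 * configured before, so a request to go back to 0 is not dropped.
 */
bool skip_config_new(bool enabled, int duty_ns, unsigned int cur_duty)
{
	assert(duty_ns >= 0);
	return !enabled && duty_ns == 0 && cur_duty == 0;
}
```

In the sequence from the commit message, the PWM is disabled, a duty of 30
has already been written, and a duty of 0 is then requested. The old
predicate returns true for that case, so the hardware would keep 30; the
new one returns false, so the write of 0 goes ahead.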

FAILED: patch "[PATCH] drm/amd/display: check for ipp before calling cursor" failed to apply to 4.16-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From 5d447f09b8d8346c64f4c952a67c61f7ce88d3c1 Mon Sep 17 00:00:00 2001
From: Shirish S <shirish.s(a)amd.com>
Date: Wed, 21 Feb 2018 16:10:33 +0530
Subject: [PATCH] drm/amd/display: check for ipp before calling cursor
 operations

Currently all cursor related functions are made to all
pipes that are attached to a particular stream.
This is not applicable to pipes that do not have cursor plane
initialised like underlay.
Hence this patch allows cursor related operations on a pipe
only if ipp in available on that particular pipe.

The check is added to set_cursor_position & set_cursor_attribute.

Signed-off-by: Shirish S <shirish.s(a)amd.com>
Reviewed-by: Harry Wentland <harry.wentland(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
index 87a193ac2883..cd5819789d76 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
@@ -198,7 +198,8 @@ bool dc_stream_set_cursor_attributes(
 	for (i = 0; i < MAX_PIPES; i++) {
 		struct pipe_ctx *pipe_ctx = &res_ctx->pipe_ctx[i];

-		if (pipe_ctx->stream != stream || (!pipe_ctx->plane_res.xfm && !pipe_ctx->plane_res.dpp))
+		if (pipe_ctx->stream != stream || (!pipe_ctx->plane_res.xfm &&
+		    !pipe_ctx->plane_res.dpp) || !pipe_ctx->plane_res.ipp)
 			continue;
 		if (pipe_ctx->top_pipe && pipe_ctx->plane_state != pipe_ctx->top_pipe->plane_state)
 			continue;
@@ -237,7 +238,8 @@ bool dc_stream_set_cursor_position(
 		if (pipe_ctx->stream != stream ||
 				(!pipe_ctx->plane_res.mi && !pipe_ctx->plane_res.hubp) ||
 				!pipe_ctx->plane_state ||
-				(!pipe_ctx->plane_res.xfm && !pipe_ctx->plane_res.dpp))
+				(!pipe_ctx->plane_res.xfm && !pipe_ctx->plane_res.dpp) ||
+				!pipe_ctx->plane_res.ipp)
 			continue;

 		core_dc->hwss.set_cursor_position(pipe_ctx);

FAILED: patch "[PATCH] drm/amd/display: Default HDMI6G support to true. Log VBIOS" failed to apply to 4.16-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From ea74e15fb547483f9f86088443f2d3c9f518de8b Mon Sep 17 00:00:00 2001
From: Harry Wentland <harry.wentland(a)amd.com>
Date: Tue, 20 Feb 2018 13:36:23 -0500
Subject: [PATCH] drm/amd/display: Default HDMI6G support to true. Log VBIOS
 table error.

There have been many reports of Ellesmere and Baffin systems not being
able to drive HDMI 4k60 due to the fact that we check the HDMI_6GB_EN
bit from VBIOS table. Windows seems to not have this issue.

On some systems we fail to the encoder cap info from VBIOS. In that case
we should default to enabling HDMI6G support.

This was tested by dwagner on
https://bugs.freedesktop.org/show_bug.cgi?id=102820

Signed-off-by: Harry Wentland <harry.wentland(a)amd.com>
Reviewed-by: Roman Li <Roman.Li(a)amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng(a)amd.com>
Acked-by: Harry Wentland <harry.wentland(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c
index f0d63ac7724a..81776e4797ed 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c
@@ -678,6 +678,7 @@ void dce110_link_encoder_construct(
 {
 	struct bp_encoder_cap_info bp_cap_info = {0};
 	const struct dc_vbios_funcs *bp_funcs = init_data->ctx->dc_bios->funcs;
+	enum bp_result result = BP_RESULT_OK;

 	enc110->base.funcs = &dce110_lnk_enc_funcs;
 	enc110->base.ctx = init_data->ctx;
@@ -752,15 +753,24 @@ void dce110_link_encoder_construct(
 		enc110->base.preferred_engine = ENGINE_ID_UNKNOWN;
 	}

+	/* default to one to mirror Windows behavior */
+	enc110->base.features.flags.bits.HDMI_6GB_EN = 1;
+
+	result = bp_funcs->get_encoder_cap_info(enc110->base.ctx->dc_bios,
+						enc110->base.id, &bp_cap_info);
+
 	/* Override features with DCE-specific values */
-	if (BP_RESULT_OK == bp_funcs->get_encoder_cap_info(
-			enc110->base.ctx->dc_bios, enc110->base.id,
-			&bp_cap_info)) {
+	if (BP_RESULT_OK == result) {
 		enc110->base.features.flags.bits.IS_HBR2_CAPABLE =
 				bp_cap_info.DP_HBR2_EN;
 		enc110->base.features.flags.bits.IS_HBR3_CAPABLE =
 				bp_cap_info.DP_HBR3_EN;
 		enc110->base.features.flags.bits.HDMI_6GB_EN = bp_cap_info.HDMI_6GB_EN;
+	} else {
+		dm_logger_write(enc110->base.ctx->logger, LOG_WARNING,
+				"%s: Failed to get encoder_cap_info from VBIOS with error code %d!\n",
+				__func__,
+				result);
 	}
 }
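
The control flow the patch above introduces (set a permissive default,
query the VBIOS, override only on success, warn otherwise) can be sketched
in standalone C as follows. None of the names below are from the amdgpu
driver: enum query_result, resolve_hdmi_6gb() and the stderr message are
simplified stand-ins for bp_result, the constructor logic and
dm_logger_write().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the VBIOS query result. */
enum query_result { QUERY_OK, QUERY_FAILED };

/*
 * Decide whether HDMI 6G is treated as supported.
 * vbios_bit is only meaningful when the query succeeded.
 */
bool resolve_hdmi_6gb(enum query_result res, bool vbios_bit)
{
	bool hdmi_6gb_en = true;	/* default when the table cannot be read */

	assert(res == QUERY_OK || res == QUERY_FAILED);
	if (res == QUERY_OK)
		hdmi_6gb_en = vbios_bit;	/* the table value wins when available */
	else
		fprintf(stderr, "%s: encoder cap query failed (%d), keeping default\n",
			__func__, (int)res);
	return hdmi_6gb_en;
}
```

As in the diff, a successful query still overrides the default, so a VBIOS
that reports the bit as 0 keeps HDMI 6G disabled. Only the failure path
changes behavior.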

FAILED: patch "[PATCH] nfit: skip region registration for incomplete control regions" failed to apply to 4.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From 0731de476a37c33485af82d64041c9d193208df8 Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Wed, 21 Mar 2018 21:22:34 -0700
Subject: [PATCH] nfit: skip region registration for incomplete control regions

Per the ACPI specification the only functional purpose for a DIMM
Control Region to be mapped into the system physical address space, from
an OSPM perspective, is to support block-apertures. However, there are
some BIOSen that publish DIMM Control Region SPA entries for pre-boot
environment consumption. Undo the kernel policy of generating disabled
'ndblk' regions when this configuration is detected.

Cc: <stable(a)vger.kernel.org>
Fixes: 1f7df6f88b92 ("libnvdimm, nfit: regions (block-data-window...)")
Reviewed-by: Toshi Kani <toshi.kani(a)hpe.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 39ad06143e78..4530d89044db 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2578,7 +2578,7 @@ static int acpi_nfit_init_mapping(struct acpi_nfit_desc *acpi_desc,
 	struct acpi_nfit_system_address *spa = nfit_spa->spa;
 	struct nd_blk_region_desc *ndbr_desc;
 	struct nfit_mem *nfit_mem;
-	int blk_valid = 0, rc;
+	int rc;

 	if (!nvdimm) {
 		dev_err(acpi_desc->dev, "spa%d dimm: %#x not found\n",
@@ -2598,15 +2598,14 @@ static int acpi_nfit_init_mapping(struct acpi_nfit_desc *acpi_desc,
 		if (!nfit_mem || !nfit_mem->bdw) {
 			dev_dbg(acpi_desc->dev, "spa%d %s missing bdw\n",
 					spa->range_index, nvdimm_name(nvdimm));
-		} else {
-			mapping->size = nfit_mem->bdw->capacity;
-			mapping->start = nfit_mem->bdw->start_address;
-			ndr_desc->num_lanes = nfit_mem->bdw->windows;
-			blk_valid = 1;
+			break;
 		}

+		mapping->size = nfit_mem->bdw->capacity;
+		mapping->start = nfit_mem->bdw->start_address;
+		ndr_desc->num_lanes = nfit_mem->bdw->windows;
 		ndr_desc->mapping = mapping;
-		ndr_desc->num_mappings = blk_valid;
+		ndr_desc->num_mappings = 1;
 		ndbr_desc = to_blk_region_desc(ndr_desc);
 		ndbr_desc->enable = acpi_nfit_blk_region_enable;
 		ndbr_desc->do_io = acpi_desc->blk_do_io;


FAILED: patch "[PATCH] nfit: skip region registration for incomplete control regions" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From 0731de476a37c33485af82d64041c9d193208df8 Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Wed, 21 Mar 2018 21:22:34 -0700
Subject: [PATCH] nfit: skip region registration for incomplete control regions

Per the ACPI specification the only functional purpose for a DIMM
Control Region to be mapped into the system physical address space, from
an OSPM perspective, is to support block-apertures. However, there are
some BIOSen that publish DIMM Control Region SPA entries for pre-boot
environment consumption. Undo the kernel policy of generating disabled
'ndblk' regions when this configuration is detected.

Cc: <stable(a)vger.kernel.org>
Fixes: 1f7df6f88b92 ("libnvdimm, nfit: regions (block-data-window...)")
Reviewed-by: Toshi Kani <toshi.kani(a)hpe.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 39ad06143e78..4530d89044db 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2578,7 +2578,7 @@ static int acpi_nfit_init_mapping(struct acpi_nfit_desc *acpi_desc,
 	struct acpi_nfit_system_address *spa = nfit_spa->spa;
 	struct nd_blk_region_desc *ndbr_desc;
 	struct nfit_mem *nfit_mem;
-	int blk_valid = 0, rc;
+	int rc;
 
 	if (!nvdimm) {
 		dev_err(acpi_desc->dev, "spa%d dimm: %#x not found\n",
@@ -2598,15 +2598,14 @@ static int acpi_nfit_init_mapping(struct acpi_nfit_desc *acpi_desc,
 		if (!nfit_mem || !nfit_mem->bdw) {
 			dev_dbg(acpi_desc->dev, "spa%d %s missing bdw\n",
 					spa->range_index, nvdimm_name(nvdimm));
-		} else {
-			mapping->size = nfit_mem->bdw->capacity;
-			mapping->start = nfit_mem->bdw->start_address;
-			ndr_desc->num_lanes = nfit_mem->bdw->windows;
-			blk_valid = 1;
+			break;
 		}
 
+		mapping->size = nfit_mem->bdw->capacity;
+		mapping->start = nfit_mem->bdw->start_address;
+		ndr_desc->num_lanes = nfit_mem->bdw->windows;
 		ndr_desc->mapping = mapping;
-		ndr_desc->num_mappings = blk_valid;
+		ndr_desc->num_mappings = 1;
 		ndbr_desc = to_blk_region_desc(ndr_desc);
 		ndbr_desc->enable = acpi_nfit_blk_region_enable;
 		ndbr_desc->do_io = acpi_desc->blk_do_io;


FAILED: patch "[PATCH] libnvdimm, dimm: fix dpa reservation vs uninitialized label" failed to apply to 4.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From c31898c8c711f2bbbcaebe802a55827e288d875a Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Fri, 6 Apr 2018 11:25:38 -0700
Subject: [PATCH] libnvdimm, dimm: fix dpa reservation vs uninitialized label
 area

At initialization time the 'dimm' driver caches a copy of the memory
device's label area and reserves address space for each of the
namespaces defined.

However, as can be seen below, the reservation occurs even when the
index blocks are invalid:

 nvdimm nmem0: nvdimm_init_config_data: len: 131072 rc: 0
 nvdimm nmem0: config data size: 131072
 nvdimm nmem0: __nd_label_validate: nsindex0 labelsize 1 invalid
 nvdimm nmem0: __nd_label_validate: nsindex1 labelsize 1 invalid
 nvdimm nmem0: : pmem-6025e505: 0x1000000000 @ 0xf50000000 reserve <-- bad

Gate dpa reservation on the presence of valid index blocks.

Cc: <stable(a)vger.kernel.org>
Fixes: 4a826c83db4e ("libnvdimm: namespace indices: read and validate")
Reported-by: Krzysztof Rusocki <krzysztof.rusocki(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>

diff --git a/drivers/nvdimm/dimm.c b/drivers/nvdimm/dimm.c
index f8913b8124b6..233907889f96 100644
--- a/drivers/nvdimm/dimm.c
+++ b/drivers/nvdimm/dimm.c
@@ -67,9 +67,11 @@ static int nvdimm_probe(struct device *dev)
 	ndd->ns_next = nd_label_next_nsindex(ndd->ns_current);
 	nd_label_copy(ndd, to_next_namespace_index(ndd),
 			to_current_namespace_index(ndd));
-	rc = nd_label_reserve_dpa(ndd);
-	if (ndd->ns_current >= 0)
-		nvdimm_set_aliasing(dev);
+	if (ndd->ns_current >= 0) {
+		rc = nd_label_reserve_dpa(ndd);
+		if (rc == 0)
+			nvdimm_set_aliasing(dev);
+	}
 	nvdimm_clear_locked(dev);
 	nvdimm_bus_unlock(dev);
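
For anyone preparing a backport, the control-flow change in the patch above can be modelled in user space. This is only a minimal sketch: struct fake_dimm, fake_reserve_dpa() and probe_labels() are hypothetical stand-ins rather than kernel symbols, and returning 0 when the reservation is skipped is an assumption made for the sketch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state touched by the patch. */
struct fake_dimm {
	int ns_current;   /* < 0 means no valid index block was found */
	int reserve_rc;   /* result the fake reservation should report */
	bool reserved;    /* set when the fake reservation succeeds */
	bool aliasing;    /* set when aliasing would be enabled */
};

/* Stand-in for nd_label_reserve_dpa(): records success, returns reserve_rc. */
static int fake_reserve_dpa(struct fake_dimm *d)
{
	if (d->reserve_rc == 0)
		d->reserved = true;
	return d->reserve_rc;
}

/*
 * Mirrors the patched ordering: reserve only when an index block is
 * valid, and enable aliasing only when the reservation succeeded.
 */
static int probe_labels(struct fake_dimm *d)
{
	int rc = 0; /* assumption: skipping the reservation is not an error */

	if (d->ns_current >= 0) {
		rc = fake_reserve_dpa(d);
		if (rc == 0)
			d->aliasing = true;
	}
	return rc;
}
```

With ns_current set to -1 nothing is reserved and aliasing stays off, which is the case the log excerpt in the commit message flags as "bad" before the fix.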


FAILED: patch "[PATCH] libnvdimm, dimm: fix dpa reservation vs uninitialized label" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From c31898c8c711f2bbbcaebe802a55827e288d875a Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Fri, 6 Apr 2018 11:25:38 -0700
Subject: [PATCH] libnvdimm, dimm: fix dpa reservation vs uninitialized label
 area

At initialization time the 'dimm' driver caches a copy of the memory
device's label area and reserves address space for each of the
namespaces defined.

However, as can be seen below, the reservation occurs even when the
index blocks are invalid:

 nvdimm nmem0: nvdimm_init_config_data: len: 131072 rc: 0
 nvdimm nmem0: config data size: 131072
 nvdimm nmem0: __nd_label_validate: nsindex0 labelsize 1 invalid
 nvdimm nmem0: __nd_label_validate: nsindex1 labelsize 1 invalid
 nvdimm nmem0: : pmem-6025e505: 0x1000000000 @ 0xf50000000 reserve <-- bad

Gate dpa reservation on the presence of valid index blocks.

Cc: <stable(a)vger.kernel.org>
Fixes: 4a826c83db4e ("libnvdimm: namespace indices: read and validate")
Reported-by: Krzysztof Rusocki <krzysztof.rusocki(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>

diff --git a/drivers/nvdimm/dimm.c b/drivers/nvdimm/dimm.c
index f8913b8124b6..233907889f96 100644
--- a/drivers/nvdimm/dimm.c
+++ b/drivers/nvdimm/dimm.c
@@ -67,9 +67,11 @@ static int nvdimm_probe(struct device *dev)
 	ndd->ns_next = nd_label_next_nsindex(ndd->ns_current);
 	nd_label_copy(ndd, to_next_namespace_index(ndd),
 			to_current_namespace_index(ndd));
-	rc = nd_label_reserve_dpa(ndd);
-	if (ndd->ns_current >= 0)
-		nvdimm_set_aliasing(dev);
+	if (ndd->ns_current >= 0) {
+		rc = nd_label_reserve_dpa(ndd);
+		if (rc == 0)
+			nvdimm_set_aliasing(dev);
+	}
 	nvdimm_clear_locked(dev);
 	nvdimm_bus_unlock(dev);


FAILED: patch "[PATCH] tpm: add retry logic" failed to apply to 4.16-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From e2fb992d82c626c43ed0566e07c410e56a087af3 Mon Sep 17 00:00:00 2001
From: James Bottomley <James.Bottomley(a)HansenPartnership.com>
Date: Wed, 21 Mar 2018 11:43:48 -0700
Subject: [PATCH] tpm: add retry logic
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

TPM2 can return TPM2_RC_RETRY to any command and when it does we get
unexpected failures inside the kernel that surprise users (this is
mostly observed in the trusted key handling code). The UEFI 2.6 spec
has advice on how to handle this:

    The firmware SHALL not return TPM2_RC_RETRY prior to the completion
    of the call to ExitBootServices().

    Implementer’s Note: the implementation of this function should check
    the return value in the TPM response and, if it is TPM2_RC_RETRY,
    resend the command. The implementation may abort if a sufficient
    number of retries has been done.

So we follow that advice in our tpm_transmit() code using
TPM2_DURATION_SHORT as the initial wait duration and
TPM2_DURATION_LONG as the maximum wait time. This should fix all the
in-kernel use cases and also means that user space TSS implementations
don't have to have their own retry handling.

Signed-off-by: James Bottomley <James.Bottomley(a)HansenPartnership.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index 22288ff70a0b..d5379a79274c 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -399,21 +399,10 @@ static void tpm_relinquish_locality(struct tpm_chip *chip)
 	chip->locality = -1;
 }
 
-/**
- * tpm_transmit - Internal kernel interface to transmit TPM commands.
- *
- * @chip: TPM chip to use
- * @space: tpm space
- * @buf: TPM command buffer
- * @bufsiz: length of the TPM command buffer
- * @flags: tpm transmit flags - bitmap
- *
- * Return:
- *     0 when the operation is successful.
- *     A negative number for system errors (errno).
- */
-ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
-		     u8 *buf, size_t bufsiz, unsigned int flags)
+static ssize_t tpm_try_transmit(struct tpm_chip *chip,
+				struct tpm_space *space,
+				u8 *buf, size_t bufsiz,
+				unsigned int flags)
 {
 	struct tpm_output_header *header = (void *)buf;
 	int rc;
@@ -544,6 +533,62 @@ ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
 	return rc ? rc : len;
 }
 
+/**
+ * tpm_transmit - Internal kernel interface to transmit TPM commands.
+ *
+ * @chip: TPM chip to use
+ * @space: tpm space
+ * @buf: TPM command buffer
+ * @bufsiz: length of the TPM command buffer
+ * @flags: tpm transmit flags - bitmap
+ *
+ * A wrapper around tpm_try_transmit that handles TPM2_RC_RETRY
+ * returns from the TPM and retransmits the command after a delay up
+ * to a maximum wait of TPM2_DURATION_LONG.
+ *
+ * Note: TPM1 never returns TPM2_RC_RETRY so the retry logic is TPM2
+ * only
+ *
+ * Return:
+ *     the length of the return when the operation is successful.
+ *     A negative number for system errors (errno).
+ */
+ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
+		     u8 *buf, size_t bufsiz, unsigned int flags)
+{
+	struct tpm_output_header *header = (struct tpm_output_header *)buf;
+	/* space for header and handles */
+	u8 save[TPM_HEADER_SIZE + 3*sizeof(u32)];
+	unsigned int delay_msec = TPM2_DURATION_SHORT;
+	u32 rc = 0;
+	ssize_t ret;
+	const size_t save_size = min(space ? sizeof(save) : TPM_HEADER_SIZE,
+				     bufsiz);
+
+	/*
+	 * Subtlety here: if we have a space, the handles will be
+	 * transformed, so when we restore the header we also have to
+	 * restore the handles.
+	 */
+	memcpy(save, buf, save_size);
+
+	for (;;) {
+		ret = tpm_try_transmit(chip, space, buf, bufsiz, flags);
+		if (ret < 0)
+			break;
+		rc = be32_to_cpu(header->return_code);
+		if (rc != TPM2_RC_RETRY)
+			break;
+		delay_msec *= 2;
+		if (delay_msec > TPM2_DURATION_LONG) {
+			dev_err(&chip->dev, "TPM is in retry loop\n");
+			break;
+		}
+		tpm_msleep(delay_msec);
+		memcpy(buf, save, save_size);
+	}
+	return ret;
+}
 /**
  * tpm_transmit_cmd - send a tpm command to the device
  *    The function extracts tpm out header return code
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index ab3bcdd4d328..67656a97793a 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -115,6 +115,7 @@ enum tpm2_return_codes {
 	TPM2_RC_COMMAND_CODE = 0x0143,
 	TPM2_RC_TESTING = 0x090A, /* RC_WARN */
 	TPM2_RC_REFERENCE_H0 = 0x0910,
+	TPM2_RC_RETRY = 0x0922,
 };
 
 enum tpm2_algorithms {
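
The backoff in tpm_transmit() above can be tried outside the kernel with a small model. This is a sketch under stated assumptions, not the driver code: struct fake_tpm, fake_try_transmit() and transmit_with_retry() are hypothetical names, the two duration constants are assumed values chosen for illustration, and the sleep is replaced by a running total so the behaviour is easy to inspect. The header save and restore done by the real function is left out.

```c
/* RC_RETRY matches the value added to tpm.h by the patch. */
enum { RC_SUCCESS = 0, RC_RETRY = 0x0922 };

/* Assumed stand-ins for TPM2_DURATION_SHORT and TPM2_DURATION_LONG (msec). */
enum { DURATION_SHORT_MS = 20, DURATION_LONG_MS = 2000 };

/* Hypothetical device stub that answers RC_RETRY a fixed number of times. */
struct fake_tpm {
	int retries_left; /* how many more RC_RETRY answers to give */
	int calls;        /* how many transmit attempts were made */
};

static unsigned int fake_try_transmit(struct fake_tpm *t)
{
	t->calls++;
	if (t->retries_left > 0) {
		t->retries_left--;
		return RC_RETRY;
	}
	return RC_SUCCESS;
}

/*
 * Same loop shape as the patch: double the delay after every RC_RETRY
 * and give up once the doubled delay exceeds the long duration.
 * *slept_ms accumulates the time the real code would spend sleeping.
 */
static unsigned int transmit_with_retry(struct fake_tpm *t,
					unsigned int *slept_ms)
{
	unsigned int delay_ms = DURATION_SHORT_MS;
	unsigned int rc;

	*slept_ms = 0;
	for (;;) {
		rc = fake_try_transmit(t);
		if (rc != RC_RETRY)
			break;
		delay_ms *= 2;
		if (delay_ms > DURATION_LONG_MS)
			break; /* the patch logs "TPM is in retry loop" here */
		*slept_ms += delay_ms; /* stand-in for tpm_msleep(delay_ms) */
	}
	return rc;
}
```

Note that the delay is doubled before the first sleep, so with these assumed constants the first wait is twice the short duration, and a device that never stops answering RC_RETRY is abandoned after a bounded number of attempts with RC_RETRY still in the response.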

