From: Leo Li <sunpeng.li(a)amd.com>
[Why]
It seems HPD interrupts are enabled by default for all connectors, even
if the hpd source isn't valid. An eDP for example, does not have a valid
hpd source (but does have a valid hpdrx source; see construct_phy()).
Thus, eDPs should have their hpd interrupt disabled.
In the past, this wasn't really an issue. Although the driver gets
interrupted, then acks by writing to hw registers, there weren't any
subscribed handlers that did anything meaningful (see
register_hpd_handlers()).
But things changed with the introduction of IPS. s2idle requires that
the driver allows IPS for DMUB fw to put hw to sleep. Since register
access requires hw to be awake, the driver will block IPS entry to do
so. And no IPS means no hw sleep during s2idle.
This was the observation on DCN35 systems with an eDP. During suspend,
the eDP toggled its hpd pin as part of the panel power down sequence.
The driver was then interrupted, and acked by writing to registers,
blocking IPS entry.
[How]
Since DC marks eDP connections as having invalid hpd sources (see
construct_phy()), DM should disable them at the hw level. Do so in
amdgpu_dm_hpd_init() by disabling all hpd ints first, then selectively
enabling ones for connectors that have valid hpd sources.
Signed-off-by: Leo Li <sunpeng.li(a)amd.com>
Cc: stable(a)vger.kernel.org
---
.../drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c | 64 +++++++++++++------
1 file changed, 45 insertions(+), 19 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c
index c4a7fd453e5fc..a215234151ac3 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c
@@ -894,8 +894,16 @@ void amdgpu_dm_hpd_init(struct amdgpu_device *adev)
struct drm_device *dev = adev_to_drm(adev);
struct drm_connector *connector;
struct drm_connector_list_iter iter;
+ int irq_type;
int i;
+ /* First, clear all hpd and hpdrx interrupts */
+ for (i = DC_IRQ_SOURCE_HPD1; i <= DC_IRQ_SOURCE_HPD6RX; i++) {
+ if (!dc_interrupt_set(adev->dm.dc, i, false))
+ drm_err(dev, "Failed to clear hpd(rx) source=%d on init\n",
+ i);
+ }
+
drm_connector_list_iter_begin(dev, &iter);
drm_for_each_connector_iter(connector, &iter) {
struct amdgpu_dm_connector *amdgpu_dm_connector;
@@ -908,10 +916,31 @@ void amdgpu_dm_hpd_init(struct amdgpu_device *adev)
dc_link = amdgpu_dm_connector->dc_link;
+ /*
+ * Get a base driver irq reference for hpd ints for the lifetime
+ * of dm. Note that only hpd interrupt types are registered with
+ * base driver; hpd_rx types aren't. IOW, amdgpu_irq_get/put on
+ * hpd_rx isn't available. DM currently controls hpd_rx
+ * explicitly with dc_interrupt_set()
+ */
if (dc_link->irq_source_hpd != DC_IRQ_SOURCE_INVALID) {
- dc_interrupt_set(adev->dm.dc,
- dc_link->irq_source_hpd,
- true);
+ irq_type = dc_link->irq_source_hpd - DC_IRQ_SOURCE_HPD1;
+ /*
+ * TODO: There's a mismatch between mode_info.num_hpd
+ * and what bios reports as the # of connectors with hpd
+ * sources. Since the # of hpd source types registered
+ * with base driver == mode_info.num_hpd, we have to
+ * fallback to dc_interrupt_set for the remaining types.
+ */
+ if (irq_type < adev->mode_info.num_hpd) {
+ if (amdgpu_irq_get(adev, &adev->hpd_irq, irq_type))
+ drm_err(dev, "DM_IRQ: Failed get HPD for source=%d)!\n",
+ dc_link->irq_source_hpd);
+ } else {
+ dc_interrupt_set(adev->dm.dc,
+ dc_link->irq_source_hpd,
+ true);
+ }
}
if (dc_link->irq_source_hpd_rx != DC_IRQ_SOURCE_INVALID) {
@@ -921,12 +950,6 @@ void amdgpu_dm_hpd_init(struct amdgpu_device *adev)
}
}
drm_connector_list_iter_end(&iter);
-
- /* Update reference counts for HPDs */
- for (i = DC_IRQ_SOURCE_HPD1; i <= adev->mode_info.num_hpd; i++) {
- if (amdgpu_irq_get(adev, &adev->hpd_irq, i - DC_IRQ_SOURCE_HPD1))
- drm_err(dev, "DM_IRQ: Failed get HPD for source=%d)!\n", i);
- }
}
/**
@@ -942,7 +965,7 @@ void amdgpu_dm_hpd_fini(struct amdgpu_device *adev)
struct drm_device *dev = adev_to_drm(adev);
struct drm_connector *connector;
struct drm_connector_list_iter iter;
- int i;
+ int irq_type;
drm_connector_list_iter_begin(dev, &iter);
drm_for_each_connector_iter(connector, &iter) {
@@ -956,9 +979,18 @@ void amdgpu_dm_hpd_fini(struct amdgpu_device *adev)
dc_link = amdgpu_dm_connector->dc_link;
if (dc_link->irq_source_hpd != DC_IRQ_SOURCE_INVALID) {
- dc_interrupt_set(adev->dm.dc,
- dc_link->irq_source_hpd,
- false);
+ irq_type = dc_link->irq_source_hpd - DC_IRQ_SOURCE_HPD1;
+
+ /* TODO: See same TODO in amdgpu_dm_hpd_init() */
+ if (irq_type < adev->mode_info.num_hpd) {
+ if (amdgpu_irq_put(adev, &adev->hpd_irq, irq_type))
+ drm_err(dev, "DM_IRQ: Failed put HPD for source=%d!\n",
+ dc_link->irq_source_hpd);
+ } else {
+ dc_interrupt_set(adev->dm.dc,
+ dc_link->irq_source_hpd,
+ false);
+ }
}
if (dc_link->irq_source_hpd_rx != DC_IRQ_SOURCE_INVALID) {
@@ -968,10 +1000,4 @@ void amdgpu_dm_hpd_fini(struct amdgpu_device *adev)
}
}
drm_connector_list_iter_end(&iter);
-
- /* Update reference counts for HPDs */
- for (i = DC_IRQ_SOURCE_HPD1; i <= adev->mode_info.num_hpd; i++) {
- if (amdgpu_irq_put(adev, &adev->hpd_irq, i - DC_IRQ_SOURCE_HPD1))
- drm_err(dev, "DM_IRQ: Failed put HPD for source=%d!\n", i);
- }
}
--
2.48.1
The patch below does not apply to the 6.13-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.13.y
git checkout FETCH_HEAD
git cherry-pick -x d584198a6fe4c51f4aa88ad72f258f8961a0f11c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025021802-scrimmage-oppressor-8e61@gregkh' --subject-prefix 'PATCH 6.13.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d584198a6fe4c51f4aa88ad72f258f8961a0f11c Mon Sep 17 00:00:00 2001
From: Lancelot SIX <lancelot.six(a)amd.com>
Date: Tue, 28 Jan 2025 19:16:49 +0000
Subject: [PATCH] drm/amdkfd: Ensure consistent barrier state saved in gfx12
trap handler
It is possible for some waves in a workgroup to finish their save
sequence before the group leader has had time to capture the workgroup
barrier state. When this happens, having those waves exit do impact the
barrier state. As a consequence, the state captured by the group leader
is invalid, and is eventually incorrectly restored.
This patch proposes to have all waves in a workgroup wait for each other
at the end of their save sequence (just before calling s_endpgm_saved).
Signed-off-by: Lancelot SIX <lancelot.six(a)amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org # 6.12.x
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index 984f0e705078..651660958e5b 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -4121,7 +4121,8 @@ static const uint32_t cwsr_trap_gfx12_hex[] = {
0x0000ffff, 0x8bfe7e7e,
0x8bea6a6a, 0xb97af804,
0xbe804ec2, 0xbf94fffe,
- 0xbe804a6c, 0xbfb10000,
+ 0xbe804a6c, 0xbe804ec2,
+ 0xbf94fffe, 0xbfb10000,
0xbf9f0000, 0xbf9f0000,
0xbf9f0000, 0xbf9f0000,
0xbf9f0000, 0x00000000,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
index 1740e98c6719..7b9d36e5fa43 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
@@ -1049,6 +1049,10 @@ L_SKIP_BARRIER_RESTORE:
s_rfe_b64 s_restore_pc_lo //Return to the main shader program and resume execution
L_END_PGM:
+ // Make sure that no wave of the workgroup can exit the trap handler
+ // before the workgroup barrier state is saved.
+ s_barrier_signal -2
+ s_barrier_wait -2
s_endpgm_saved
end
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x d584198a6fe4c51f4aa88ad72f258f8961a0f11c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025021803-fondness-ship-dc72@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d584198a6fe4c51f4aa88ad72f258f8961a0f11c Mon Sep 17 00:00:00 2001
From: Lancelot SIX <lancelot.six(a)amd.com>
Date: Tue, 28 Jan 2025 19:16:49 +0000
Subject: [PATCH] drm/amdkfd: Ensure consistent barrier state saved in gfx12
trap handler
It is possible for some waves in a workgroup to finish their save
sequence before the group leader has had time to capture the workgroup
barrier state. When this happens, having those waves exit do impact the
barrier state. As a consequence, the state captured by the group leader
is invalid, and is eventually incorrectly restored.
This patch proposes to have all waves in a workgroup wait for each other
at the end of their save sequence (just before calling s_endpgm_saved).
Signed-off-by: Lancelot SIX <lancelot.six(a)amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org # 6.12.x
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index 984f0e705078..651660958e5b 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -4121,7 +4121,8 @@ static const uint32_t cwsr_trap_gfx12_hex[] = {
0x0000ffff, 0x8bfe7e7e,
0x8bea6a6a, 0xb97af804,
0xbe804ec2, 0xbf94fffe,
- 0xbe804a6c, 0xbfb10000,
+ 0xbe804a6c, 0xbe804ec2,
+ 0xbf94fffe, 0xbfb10000,
0xbf9f0000, 0xbf9f0000,
0xbf9f0000, 0xbf9f0000,
0xbf9f0000, 0x00000000,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
index 1740e98c6719..7b9d36e5fa43 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
@@ -1049,6 +1049,10 @@ L_SKIP_BARRIER_RESTORE:
s_rfe_b64 s_restore_pc_lo //Return to the main shader program and resume execution
L_END_PGM:
+ // Make sure that no wave of the workgroup can exit the trap handler
+ // before the workgroup barrier state is saved.
+ s_barrier_signal -2
+ s_barrier_wait -2
s_endpgm_saved
end
Variable allocated by charlcd_alloc() should be released
by charlcd_free(). The following patch changed kfree() to
charlcd_free() to fix an API misuse.
Fixes: 718e05ed92ec ("auxdisplay: Introduce hd44780_common.[ch]")
Cc: stable(a)vger.kernel.org
Signed-off-by: Haoxiang Li <haoxiang_li2024(a)163.com>
---
Changes in v2:
- Merge the two patches into one.
- Modify the patch description.
Sorry Geert, I didn't see your reply until after I sent the
second patch. I've merged the two patches into one, hoping
to make your work a bit easier! Thanks a lot!
---
drivers/auxdisplay/hd44780.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/auxdisplay/hd44780.c b/drivers/auxdisplay/hd44780.c
index 0526f0d90a79..9d0ae9c02e9b 100644
--- a/drivers/auxdisplay/hd44780.c
+++ b/drivers/auxdisplay/hd44780.c
@@ -313,7 +313,7 @@ static int hd44780_probe(struct platform_device *pdev)
fail3:
kfree(hd);
fail2:
- kfree(lcd);
+ charlcd_free(lcd);
fail1:
kfree(hdc);
return ret;
@@ -328,7 +328,7 @@ static void hd44780_remove(struct platform_device *pdev)
kfree(hdc->hd44780);
kfree(lcd->drvdata);
- kfree(lcd);
+ charlcd_free(lcd);
}
static const struct of_device_id hd44780_of_match[] = {
--
2.25.1
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x efebe42d95fbba91dca6e3e32cb9e0612eb56de5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025022413-skedaddle-gloss-62e4@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From efebe42d95fbba91dca6e3e32cb9e0612eb56de5 Mon Sep 17 00:00:00 2001
From: Long Li <leo.lilong(a)huawei.com>
Date: Sat, 11 Jan 2025 15:05:44 +0800
Subject: [PATCH] xfs: fix mount hang during primary superblock recovery
failure
When mounting an image containing a log with sb modifications that require
log replay, the mount process hang all the time and stack as follows:
[root@localhost ~]# cat /proc/557/stack
[<0>] xfs_buftarg_wait+0x31/0x70
[<0>] xfs_buftarg_drain+0x54/0x350
[<0>] xfs_mountfs+0x66e/0xe80
[<0>] xfs_fs_fill_super+0x7f1/0xec0
[<0>] get_tree_bdev_flags+0x186/0x280
[<0>] get_tree_bdev+0x18/0x30
[<0>] xfs_fs_get_tree+0x1d/0x30
[<0>] vfs_get_tree+0x2d/0x110
[<0>] path_mount+0xb59/0xfc0
[<0>] do_mount+0x92/0xc0
[<0>] __x64_sys_mount+0xc2/0x160
[<0>] x64_sys_call+0x2de4/0x45c0
[<0>] do_syscall_64+0xa7/0x240
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
During log recovery, while updating the in-memory superblock from the
primary SB buffer, if an error is encountered, such as superblock
corruption occurs or some other reasons, we will proceed to out_release
and release the xfs_buf. However, this is insufficient because the
xfs_buf's log item has already been initialized and the xfs_buf is held
by the buffer log item as follows, the xfs_buf will not be released,
causing the mount thread to hang.
xlog_recover_do_primary_sb_buffer
xlog_recover_do_reg_buffer
xlog_recover_validate_buf_type
xfs_buf_item_init(bp, mp)
The solution is straightforward, we simply need to allow it to be
handled by the normal buffer write process. The filesystem will be
shutdown before the submission of buffer_list in xlog_do_recovery_pass(),
ensuring the correct release of the xfs_buf as follows:
xlog_do_recovery_pass
error = xlog_recover_process
xlog_recover_process_data
xlog_recover_process_ophdr
xlog_recovery_process_trans
...
xlog_recover_buf_commit_pass2
error = xlog_recover_do_primary_sb_buffer
//Encounter error and return
if (error)
goto out_writebuf
...
out_writebuf:
xfs_buf_delwri_queue(bp, buffer_list) //add bp to list
return error
...
if (!list_empty(&buffer_list))
if (error)
xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR); //shutdown first
xfs_buf_delwri_submit(&buffer_list); //submit buffers in list
__xfs_buf_submit
if (bp->b_mount->m_log && xlog_is_shutdown(bp->b_mount->m_log))
xfs_buf_ioend_fail(bp) //release bp correctly
Fixes: 6a18765b54e2 ("xfs: update the file system geometry after recoverying superblock buffers")
Cc: stable(a)vger.kernel.org # v6.12
Signed-off-by: Long Li <leo.lilong(a)huawei.com>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Carlos Maiolino <cem(a)kernel.org>
diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
index b05d5b81f642..05a2f6927c12 100644
--- a/fs/xfs/xfs_buf_item_recover.c
+++ b/fs/xfs/xfs_buf_item_recover.c
@@ -1087,7 +1087,7 @@ xlog_recover_buf_commit_pass2(
error = xlog_recover_do_primary_sb_buffer(mp, item, bp, buf_f,
current_lsn);
if (error)
- goto out_release;
+ goto out_writebuf;
/* Update the rt superblock if we have one. */
if (xfs_has_rtsb(mp) && mp->m_rtsb_bp) {
@@ -1104,6 +1104,15 @@ xlog_recover_buf_commit_pass2(
xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
}
+ /*
+ * Buffer held by buf log item during 'normal' buffer recovery must
+ * be committed through buffer I/O submission path to ensure proper
+ * release. When error occurs during sb buffer recovery, log shutdown
+ * will be done before submitting buffer list so that buffers can be
+ * released correctly through ioend failure path.
+ */
+out_writebuf:
+
/*
* Perform delayed write on the buffer. Asynchronous writes will be
* slower when taking into account all the buffers to be flushed.
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x de390657b5d6f7deb9d1d36aaf45f02ba51ec9dc
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025022447-prelaunch-remnant-9fee@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de390657b5d6f7deb9d1d36aaf45f02ba51ec9dc Mon Sep 17 00:00:00 2001
From: Nick Child <nnac123(a)linux.ibm.com>
Date: Tue, 1 Oct 2024 11:32:00 -0500
Subject: [PATCH] ibmvnic: Inspect header requirements before using scrq direct
Previously, the TX header requirement for standard frames was ignored.
This requirement is a bitstring sent from the VIOS which maps to the
type of header information needed during TX. If no header information,
is needed then send subcrq direct can be used (which can be more
performant).
This bitstring was previously ignored for standard packets (AKA non LSO,
non CSO) due to the belief that the bitstring was over-cautionary. It
turns out that there are some configurations where the backing device
does need header information for transmission of standard packets. If
the information is not supplied then this causes continuous "Adapter
error" transport events. Therefore, this bitstring should be respected
and observed before considering the use of send subcrq direct.
Fixes: 74839f7a8268 ("ibmvnic: Introduce send sub-crq direct")
Signed-off-by: Nick Child <nnac123(a)linux.ibm.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Link: https://patch.msgid.link/20241001163200.1802522-2-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 87e693a81433..97425c06e1ed 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2472,9 +2472,11 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
/* if we are going to send_subcrq_direct this then we need to
* update the checksum before copying the data into ltb. Essentially
* these packets force disable CSO so that we can guarantee that
- * FW does not need header info and we can send direct.
+ * FW does not need header info and we can send direct. Also, vnic
+ * server must be able to xmit standard packets without header data
*/
- if (!skb_is_gso(skb) && !ind_bufp->index && !netdev_xmit_more()) {
+ if (*hdrs == 0 && !skb_is_gso(skb) &&
+ !ind_bufp->index && !netdev_xmit_more()) {
use_scrq_send_direct = true;
if (skb->ip_summed == CHECKSUM_PARTIAL &&
skb_checksum_help(skb))
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x de390657b5d6f7deb9d1d36aaf45f02ba51ec9dc
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025022446-sanding-rover-58d9@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de390657b5d6f7deb9d1d36aaf45f02ba51ec9dc Mon Sep 17 00:00:00 2001
From: Nick Child <nnac123(a)linux.ibm.com>
Date: Tue, 1 Oct 2024 11:32:00 -0500
Subject: [PATCH] ibmvnic: Inspect header requirements before using scrq direct
Previously, the TX header requirement for standard frames was ignored.
This requirement is a bitstring sent from the VIOS which maps to the
type of header information needed during TX. If no header information,
is needed then send subcrq direct can be used (which can be more
performant).
This bitstring was previously ignored for standard packets (AKA non LSO,
non CSO) due to the belief that the bitstring was over-cautionary. It
turns out that there are some configurations where the backing device
does need header information for transmission of standard packets. If
the information is not supplied then this causes continuous "Adapter
error" transport events. Therefore, this bitstring should be respected
and observed before considering the use of send subcrq direct.
Fixes: 74839f7a8268 ("ibmvnic: Introduce send sub-crq direct")
Signed-off-by: Nick Child <nnac123(a)linux.ibm.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Link: https://patch.msgid.link/20241001163200.1802522-2-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 87e693a81433..97425c06e1ed 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2472,9 +2472,11 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
/* if we are going to send_subcrq_direct this then we need to
* update the checksum before copying the data into ltb. Essentially
* these packets force disable CSO so that we can guarantee that
- * FW does not need header info and we can send direct.
+ * FW does not need header info and we can send direct. Also, vnic
+ * server must be able to xmit standard packets without header data
*/
- if (!skb_is_gso(skb) && !ind_bufp->index && !netdev_xmit_more()) {
+ if (*hdrs == 0 && !skb_is_gso(skb) &&
+ !ind_bufp->index && !netdev_xmit_more()) {
use_scrq_send_direct = true;
if (skb->ip_summed == CHECKSUM_PARTIAL &&
skb_checksum_help(skb))