The patch below was submitted to be applied to the 5.19-stable tree.
I fail to see how this patch meets the stable kernel rules as found at Documentation/process/stable-kernel-rules.rst.
I could be totally wrong, and if so, please respond to stable@vger.kernel.org and let me know why this patch should be applied. Otherwise, it is now dropped from my patch queues, never to be seen again.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b1f707146923335849fb70237eec27d4d1ae7d62 Mon Sep 17 00:00:00 2001 From: Arun Easi aeasi@marvell.com Date: Tue, 12 Jul 2022 22:20:39 -0700 Subject: [PATCH] scsi: qla2xxx: Fix response queue handler reading stale packets
On some platforms, the current logic of relying on finding new packet solely based on signature pattern can lead to driver reading stale packets. Though this is a bug in those platforms, reduce such exposures by limiting reading packets until the IN pointer.
Two module parameters are introduced:
ql2xrspq_follow_inptr:
When set, on newer adapters that has queue pointer shadowing, look for response packets only until response queue in pointer.
When reset, response packets are read based on a signature pattern logic (old way).
ql2xrspq_follow_inptr_legacy:
Like ql2xrspq_follow_inptr, but for those adapters where there is no queue pointer shadowing.
Link: https://lore.kernel.org/r/20220713052045.10683-5-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Arun Easi aeasi@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h index 3674b35196b0..96147ca40126 100644 --- a/drivers/scsi/qla2xxx/qla_gbl.h +++ b/drivers/scsi/qla2xxx/qla_gbl.h @@ -193,6 +193,8 @@ extern int ql2xsecenable; extern int ql2xenforce_iocb_limit; extern int ql2xabts_wait_nvme; extern u32 ql2xnvme_queues; +extern int ql2xrspq_follow_inptr; +extern int ql2xrspq_follow_inptr_legacy;
extern int qla2x00_loop_reset(scsi_qla_host_t *); extern void qla2x00_abort_all_cmds(scsi_qla_host_t *, int); diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index 4fa24d318f14..35b425c446b9 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -3780,6 +3780,8 @@ void qla24xx_process_response_queue(struct scsi_qla_host *vha, struct qla_hw_data *ha = vha->hw; struct purex_entry_24xx *purex_entry; struct purex_item *pure_item; + u16 rsp_in = 0; + int follow_inptr, is_shadow_hba;
if (!ha->flags.fw_started) return; @@ -3789,7 +3791,25 @@ void qla24xx_process_response_queue(struct scsi_qla_host *vha, qla_cpu_update(rsp->qpair, smp_processor_id()); }
- while (rsp->ring_ptr->signature != RESPONSE_PROCESSED) { +#define __update_rsp_in(_update, _is_shadow_hba, _rsp, _rsp_in) \ + do { \ + if (_update) { \ + _rsp_in = _is_shadow_hba ? *(_rsp)->in_ptr : \ + rd_reg_dword_relaxed((_rsp)->rsp_q_in); \ + } \ + } while (0) + + is_shadow_hba = IS_SHADOW_REG_CAPABLE(ha); + follow_inptr = is_shadow_hba ? ql2xrspq_follow_inptr : + ql2xrspq_follow_inptr_legacy; + + __update_rsp_in(follow_inptr, is_shadow_hba, rsp, rsp_in); + + while ((likely(follow_inptr && + rsp->ring_index != rsp_in && + rsp->ring_ptr->signature != RESPONSE_PROCESSED)) || + (!follow_inptr && + rsp->ring_ptr->signature != RESPONSE_PROCESSED)) { pkt = (struct sts_entry_24xx *)rsp->ring_ptr;
rsp->ring_index++; @@ -3902,6 +3922,8 @@ void qla24xx_process_response_queue(struct scsi_qla_host *vha, } pure_item = qla27xx_copy_fpin_pkt(vha, (void **)&pkt, &rsp); + __update_rsp_in(follow_inptr, is_shadow_hba, + rsp, rsp_in); if (!pure_item) break; qla24xx_queue_purex_item(vha, pure_item, diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 1c7fb6484db2..0bd0fd1042df 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -338,6 +338,16 @@ module_param(ql2xdelay_before_pci_error_handling, uint, 0644); MODULE_PARM_DESC(ql2xdelay_before_pci_error_handling, "Number of seconds delayed before qla begin PCI error self-handling (default: 5).\n");
+int ql2xrspq_follow_inptr = 1; +module_param(ql2xrspq_follow_inptr, int, 0644); +MODULE_PARM_DESC(ql2xrspq_follow_inptr, + "Follow RSP IN pointer for RSP updates for HBAs 27xx and newer (default: 1)."); + +int ql2xrspq_follow_inptr_legacy = 1; +module_param(ql2xrspq_follow_inptr_legacy, int, 0644); +MODULE_PARM_DESC(ql2xrspq_follow_inptr_legacy, + "Follow RSP IN pointer for RSP updates for HBAs older than 27XX. (default: 1)."); + static void qla2x00_clear_drv_active(struct qla_hw_data *); static void qla2x00_free_device(scsi_qla_host_t *); static int qla2xxx_map_queues(struct Scsi_Host *shost);
On Sat, Aug 13, 2022 at 03:30:37PM +0200, gregkh@linuxfoundation.org wrote:
The patch below was submitted to be applied to the 5.19-stable tree.
I fail to see how this patch meets the stable kernel rules as found at Documentation/process/stable-kernel-rules.rst.
I could be totally wrong, and if so, please respond to stable@vger.kernel.org and let me know why this patch should be applied. Otherwise, it is now dropped from my patch queues, never to be seen again.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b1f707146923335849fb70237eec27d4d1ae7d62 Mon Sep 17 00:00:00 2001
From: Arun Easi aeasi@marvell.com Date: Tue, 12 Jul 2022 22:20:39 -0700 Subject: [PATCH] scsi: qla2xxx: Fix response queue handler reading stale packets
On some platforms, the current logic of relying on finding new packet solely based on signature pattern can lead to driver reading stale packets. Though this is a bug in those platforms, reduce such exposures by limiting reading packets until the IN pointer.
Two module parameters are introduced:
ql2xrspq_follow_inptr:
When set, on newer adapters that has queue pointer shadowing, look for response packets only until response queue in pointer. When reset, response packets are read based on a signature pattern logic (old way).
ql2xrspq_follow_inptr_legacy:
Like ql2xrspq_follow_inptr, but for those adapters where there is no queue pointer shadowing.
On a meta-note, this patch seems VERY wrong. You are adding a driver-wide module option for a device-specific action. That should be a device-specific functionality, not a module.
Again, as I say many times, this isn't the 1990's, please never add new module parameters. Use the other interfaces we have in the kernel to configure individual devices properly, module parameters are not to be used for that at all for anything new.
So, can you revert this commit and do it properly please?
thanks,
greg k-h
Hi Greg,
On Sat, 13 Aug 2022, 6:46am, Greg KH wrote:
On Sat, Aug 13, 2022 at 03:30:37PM +0200, gregkh@linuxfoundation.org wrote:
The patch below was submitted to be applied to the 5.19-stable tree.
I fail to see how this patch meets the stable kernel rules as found at Documentation/process/stable-kernel-rules.rst.
I could be totally wrong, and if so, please respond to stable@vger.kernel.org and let me know why this patch should be applied. Otherwise, it is now dropped from my patch queues, never to be seen again.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b1f707146923335849fb70237eec27d4d1ae7d62 Mon Sep 17 00:00:00 2001
From: Arun Easi aeasi@marvell.com Date: Tue, 12 Jul 2022 22:20:39 -0700 Subject: [PATCH] scsi: qla2xxx: Fix response queue handler reading stale packets
On some platforms, the current logic of relying on finding new packet solely based on signature pattern can lead to driver reading stale packets. Though this is a bug in those platforms, reduce such exposures by limiting reading packets until the IN pointer.
Two module parameters are introduced:
ql2xrspq_follow_inptr:
When set, on newer adapters that has queue pointer shadowing, look for response packets only until response queue in pointer. When reset, response packets are read based on a signature pattern logic (old way).
ql2xrspq_follow_inptr_legacy:
Like ql2xrspq_follow_inptr, but for those adapters where there is no queue pointer shadowing.
On a meta-note, this patch seems VERY wrong. You are adding a driver-wide module option for a device-specific action. That should be a device-specific functionality, not a module.
Again, as I say many times, this isn't the 1990's, please never add new module parameters. Use the other interfaces we have in the kernel to configure individual devices properly, module parameters are not to be used for that at all for anything new.
So, can you revert this commit and do it properly please?
The reason I chose module parameter way was because of these primarily:
1 As this is a platform specific issue, this behavior does not require a device granular level tuning. Either all the said adapters turns the behavior on or off. 2 Module parameters allowed persistence without complexity: Since this adapter is also used in booting on some environments, module parameter allowed the configurability as well as simplicity.
If the above approach is discouraged, the alternatives that comes to my mind are:
1 Add a sysfs node 2 Add a debugfs node 3 /proc/sys/kernel ? (but that is not per adapter specific) 4 Add a udev script to watch for PCI adapter addition and set/reset 1, 2 or 3
If udev route is taken, I'd have to come up with a configuration file to save tunable state, which could be a bit cumbersome and needs documentation and be different (in terms of script location/submitting) distro to distro.
Are there other options that you have in mind? Please do tell, if so.
Regards, -Arun
On Mon, Aug 15, 2022 at 03:35:09PM -0700, Arun Easi wrote:
Hi Greg,
On Sat, 13 Aug 2022, 6:46am, Greg KH wrote:
On Sat, Aug 13, 2022 at 03:30:37PM +0200, gregkh@linuxfoundation.org wrote:
The patch below was submitted to be applied to the 5.19-stable tree.
I fail to see how this patch meets the stable kernel rules as found at Documentation/process/stable-kernel-rules.rst.
I could be totally wrong, and if so, please respond to stable@vger.kernel.org and let me know why this patch should be applied. Otherwise, it is now dropped from my patch queues, never to be seen again.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b1f707146923335849fb70237eec27d4d1ae7d62 Mon Sep 17 00:00:00 2001
From: Arun Easi aeasi@marvell.com Date: Tue, 12 Jul 2022 22:20:39 -0700 Subject: [PATCH] scsi: qla2xxx: Fix response queue handler reading stale packets
On some platforms, the current logic of relying on finding new packet solely based on signature pattern can lead to driver reading stale packets. Though this is a bug in those platforms, reduce such exposures by limiting reading packets until the IN pointer.
Two module parameters are introduced:
ql2xrspq_follow_inptr:
When set, on newer adapters that has queue pointer shadowing, look for response packets only until response queue in pointer. When reset, response packets are read based on a signature pattern logic (old way).
ql2xrspq_follow_inptr_legacy:
Like ql2xrspq_follow_inptr, but for those adapters where there is no queue pointer shadowing.
On a meta-note, this patch seems VERY wrong. You are adding a driver-wide module option for a device-specific action. That should be a device-specific functionality, not a module.
Again, as I say many times, this isn't the 1990's, please never add new module parameters. Use the other interfaces we have in the kernel to configure individual devices properly, module parameters are not to be used for that at all for anything new.
So, can you revert this commit and do it properly please?
The reason I chose module parameter way was because of these primarily:
1 As this is a platform specific issue, this behavior does not require a device granular level tuning. Either all the said adapters turns the behavior on or off.
What is going to allow a user to know to do this or not? Why is this needed at all, and why doesn't the driver automatically know what to do either based on the device id, or the platform, or the workload?
Forcing a user to pick something for "tuning" like this is horrible and messy and almost impossible to support properly over time. Just do it in the driver automatically and then there's no need for any options at all.
2 Module parameters allowed persistence without complexity: Since this adapter is also used in booting on some environments, module parameter allowed the configurability as well as simplicity.
Just because it is easy does not mean it is correct. It is a device-specific option being applied to ALL devices in the system, which feels wrong. If it is correct, then just do it automatically in the driver based on platform information.
If the above approach is discouraged, the alternatives that comes to my mind are:
1 Add a sysfs node
Sure!
2 Add a debugfs node
Only if this really is only for debugging as that's what debugfs is for. You can never rely on debugfs to be present or accessable for anything that the system relies on for functionality.
3 /proc/sys/kernel ? (but that is not per adapter specific)
Ick, no.
4 Add a udev script to watch for PCI adapter addition and set/reset 1, 2 or 3
Your udev script will tie into the sysfs node.
If udev route is taken, I'd have to come up with a configuration file to save tunable state, which could be a bit cumbersome and needs documentation and be different (in terms of script location/submitting) distro to distro.
How is a module parameter saving any state anywhere? Your sysfs rule should be identical to the rule that causes you to write the module parameter file out to the device.
And then that logic, again, really should just be in the driver itself with no option needed at all.
Again, resist the option to force a user to do anything, that's messy and painful and not what a kernel should do if at all possible.
thanks,
greg k-h
On Tue, 16 Aug 2022, 2:30am, Greg KH wrote:
On Mon, Aug 15, 2022 at 03:35:09PM -0700, Arun Easi wrote:
Hi Greg,
On Sat, 13 Aug 2022, 6:46am, Greg KH wrote:
On Sat, Aug 13, 2022 at 03:30:37PM +0200, gregkh@linuxfoundation.org wrote:
The patch below was submitted to be applied to the 5.19-stable tree.
I fail to see how this patch meets the stable kernel rules as found at Documentation/process/stable-kernel-rules.rst.
I could be totally wrong, and if so, please respond to stable@vger.kernel.org and let me know why this patch should be applied. Otherwise, it is now dropped from my patch queues, never to be seen again.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b1f707146923335849fb70237eec27d4d1ae7d62 Mon Sep 17 00:00:00 2001
From: Arun Easi aeasi@marvell.com Date: Tue, 12 Jul 2022 22:20:39 -0700 Subject: [PATCH] scsi: qla2xxx: Fix response queue handler reading stale packets
On some platforms, the current logic of relying on finding new packet solely based on signature pattern can lead to driver reading stale packets. Though this is a bug in those platforms, reduce such exposures by limiting reading packets until the IN pointer.
Two module parameters are introduced:
ql2xrspq_follow_inptr:
When set, on newer adapters that has queue pointer shadowing, look for response packets only until response queue in pointer. When reset, response packets are read based on a signature pattern logic (old way).
ql2xrspq_follow_inptr_legacy:
Like ql2xrspq_follow_inptr, but for those adapters where there is no queue pointer shadowing.
On a meta-note, this patch seems VERY wrong. You are adding a driver-wide module option for a device-specific action. That should be a device-specific functionality, not a module.
Again, as I say many times, this isn't the 1990's, please never add new module parameters. Use the other interfaces we have in the kernel to configure individual devices properly, module parameters are not to be used for that at all for anything new.
So, can you revert this commit and do it properly please?
The reason I chose module parameter way was because of these primarily:
1 As this is a platform specific issue, this behavior does not require a device granular level tuning. Either all the said adapters turns the behavior on or off.
What is going to allow a user to know to do this or not? Why is this needed at all, and why doesn't the driver automatically know what to do either based on the device id, or the platform, or the workload?
The change is to err on the side of caution and make the logic a bit conservative at the same time providing an option for those platforms or architectures where the issue is not applicable, but the logic is causing a reduction in performance.
Forcing a user to pick something for "tuning" like this is horrible and messy and almost impossible to support properly over time.
The option is intended for slightly advanced users, platform or os vendors etc. When it comes to an end user, I agree it is challenging to know if a change from default is needed or not.
Just do it in the driver automatically and then there's no need for any options at all.
The platform bug exhibited as driver accessing stale memory, so it is tough to automatically tune the value automatically.
2 Module parameters allowed persistence without complexity: Since this adapter is also used in booting on some environments, module parameter allowed the configurability as well as simplicity.
Just because it is easy does not mean it is correct. It is a device-specific option being applied to ALL devices in the system, which feels wrong. If it is correct, then just do it automatically in the driver based on platform information.
If the above approach is discouraged, the alternatives that comes to my mind are:
1 Add a sysfs node
Sure!
2 Add a debugfs node
Only if this really is only for debugging as that's what debugfs is for. You can never rely on debugfs to be present or accessable for anything that the system relies on for functionality.
3 /proc/sys/kernel ? (but that is not per adapter specific)
Ick, no.
4 Add a udev script to watch for PCI adapter addition and set/reset 1, 2 or 3
Your udev script will tie into the sysfs node.
If udev route is taken, I'd have to come up with a configuration file to save tunable state, which could be a bit cumbersome and needs documentation and be different (in terms of script location/submitting) distro to distro.
How is a module parameter saving any state anywhere?
Since module parameter could be configured via modprobe.conf/equivalent, a user could set the value of his choice in that file and have the value persisted.
Regards, -Arun
Your sysfs rule should be identical to the rule that causes you to write the module parameter file out to the device.
And then that logic, again, really should just be in the driver itself with no option needed at all.
Again, resist the option to force a user to do anything, that's messy and painful and not what a kernel should do if at all possible.
On Tue, Aug 16, 2022 at 11:17:39AM -0700, Arun Easi wrote:
On Tue, 16 Aug 2022, 2:30am, Greg KH wrote:
On Mon, Aug 15, 2022 at 03:35:09PM -0700, Arun Easi wrote:
Hi Greg,
On Sat, 13 Aug 2022, 6:46am, Greg KH wrote:
On Sat, Aug 13, 2022 at 03:30:37PM +0200, gregkh@linuxfoundation.org wrote:
The patch below was submitted to be applied to the 5.19-stable tree.
I fail to see how this patch meets the stable kernel rules as found at Documentation/process/stable-kernel-rules.rst.
I could be totally wrong, and if so, please respond to stable@vger.kernel.org and let me know why this patch should be applied. Otherwise, it is now dropped from my patch queues, never to be seen again.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b1f707146923335849fb70237eec27d4d1ae7d62 Mon Sep 17 00:00:00 2001
From: Arun Easi aeasi@marvell.com Date: Tue, 12 Jul 2022 22:20:39 -0700 Subject: [PATCH] scsi: qla2xxx: Fix response queue handler reading stale packets
On some platforms, the current logic of relying on finding new packet solely based on signature pattern can lead to driver reading stale packets. Though this is a bug in those platforms, reduce such exposures by limiting reading packets until the IN pointer.
Two module parameters are introduced:
ql2xrspq_follow_inptr:
When set, on newer adapters that has queue pointer shadowing, look for response packets only until response queue in pointer. When reset, response packets are read based on a signature pattern logic (old way).
ql2xrspq_follow_inptr_legacy:
Like ql2xrspq_follow_inptr, but for those adapters where there is no queue pointer shadowing.
On a meta-note, this patch seems VERY wrong. You are adding a driver-wide module option for a device-specific action. That should be a device-specific functionality, not a module.
Again, as I say many times, this isn't the 1990's, please never add new module parameters. Use the other interfaces we have in the kernel to configure individual devices properly, module parameters are not to be used for that at all for anything new.
So, can you revert this commit and do it properly please?
The reason I chose module parameter way was because of these primarily:
1 As this is a platform specific issue, this behavior does not require a device granular level tuning. Either all the said adapters turns the behavior on or off.
What is going to allow a user to know to do this or not? Why is this needed at all, and why doesn't the driver automatically know what to do either based on the device id, or the platform, or the workload?
The change is to err on the side of caution and make the logic a bit conservative at the same time providing an option for those platforms or architectures where the issue is not applicable, but the logic is causing a reduction in performance.
So this is a "enable the driver to work in a broken way" option?
Forcing a user to pick something for "tuning" like this is horrible and messy and almost impossible to support properly over time.
The option is intended for slightly advanced users, platform or os vendors etc. When it comes to an end user, I agree it is challenging to know if a change from default is needed or not.
That's not ok, as a driver writer you need to make it "always work", don't force the user to choose "safe vs. unsafe" options, that's passing the blame.
Just do it in the driver automatically and then there's no need for any options at all.
The platform bug exhibited as driver accessing stale memory, so it is tough to automatically tune the value automatically.
That sounds like a real bug that you need to fix. Please revert this change and just fix the issue to always work properly. To have an option that allows a driver to work in a broken way is not acceptable.
If udev route is taken, I'd have to come up with a configuration file to save tunable state, which could be a bit cumbersome and needs documentation and be different (in terms of script location/submitting) distro to distro.
How is a module parameter saving any state anywhere?
Since module parameter could be configured via modprobe.conf/equivalent, a user could set the value of his choice in that file and have the value persisted.
Same for a udev rule, so this is not an issue.
Do you want me to send a patch that reverts this, or will you?
thanks,
greg k-h
On Tue, 16 Aug 2022, 11:20pm, Greg KH wrote:
On Tue, Aug 16, 2022 at 11:17:39AM -0700, Arun Easi wrote:
On Tue, 16 Aug 2022, 2:30am, Greg KH wrote:
On Mon, Aug 15, 2022 at 03:35:09PM -0700, Arun Easi wrote:
Hi Greg,
On Sat, 13 Aug 2022, 6:46am, Greg KH wrote:
On Sat, Aug 13, 2022 at 03:30:37PM +0200, gregkh@linuxfoundation.org wrote:
The patch below was submitted to be applied to the 5.19-stable tree.
I fail to see how this patch meets the stable kernel rules as found at Documentation/process/stable-kernel-rules.rst.
I could be totally wrong, and if so, please respond to stable@vger.kernel.org and let me know why this patch should be applied. Otherwise, it is now dropped from my patch queues, never to be seen again.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b1f707146923335849fb70237eec27d4d1ae7d62 Mon Sep 17 00:00:00 2001 From: Arun Easi aeasi@marvell.com Date: Tue, 12 Jul 2022 22:20:39 -0700 Subject: [PATCH] scsi: qla2xxx: Fix response queue handler reading stale packets
On some platforms, the current logic of relying on finding new packet solely based on signature pattern can lead to driver reading stale packets. Though this is a bug in those platforms, reduce such exposures by limiting reading packets until the IN pointer.
Two module parameters are introduced:
ql2xrspq_follow_inptr:
When set, on newer adapters that has queue pointer shadowing, look for response packets only until response queue in pointer. When reset, response packets are read based on a signature pattern logic (old way).
ql2xrspq_follow_inptr_legacy:
Like ql2xrspq_follow_inptr, but for those adapters where there is no queue pointer shadowing.
On a meta-note, this patch seems VERY wrong. You are adding a driver-wide module option for a device-specific action. That should be a device-specific functionality, not a module.
Again, as I say many times, this isn't the 1990's, please never add new module parameters. Use the other interfaces we have in the kernel to configure individual devices properly, module parameters are not to be used for that at all for anything new.
So, can you revert this commit and do it properly please?
The reason I chose module parameter way was because of these primarily:
1 As this is a platform specific issue, this behavior does not require a device granular level tuning. Either all the said adapters turns the behavior on or off.
What is going to allow a user to know to do this or not? Why is this needed at all, and why doesn't the driver automatically know what to do either based on the device id, or the platform, or the workload?
The change is to err on the side of caution and make the logic a bit conservative at the same time providing an option for those platforms or architectures where the issue is not applicable, but the logic is causing a reduction in performance.
So this is a "enable the driver to work in a broken way" option?
Forcing a user to pick something for "tuning" like this is horrible and messy and almost impossible to support properly over time.
The option is intended for slightly advanced users, platform or os vendors etc. When it comes to an end user, I agree it is challenging to know if a change from default is needed or not.
That's not ok, as a driver writer you need to make it "always work", don't force the user to choose "safe vs. unsafe" options, that's passing the blame.
Just do it in the driver automatically and then there's no need for any options at all.
The platform bug exhibited as driver accessing stale memory, so it is tough to automatically tune the value automatically.
That sounds like a real bug that you need to fix.
Actually, this is a platform bug that got exposed with the driver behavior.
Please revert this change and just fix the issue to always work properly. To have an option that allows a driver to work in a broken way is not acceptable.
If udev route is taken, I'd have to come up with a configuration file to save tunable state, which could be a bit cumbersome and needs documentation and be different (in terms of script location/submitting) distro to distro.
How is a module parameter saving any state anywhere?
Since module parameter could be configured via modprobe.conf/equivalent, a user could set the value of his choice in that file and have the value persisted.
Same for a udev rule, so this is not an issue.
Do you want me to send a patch that reverts this, or will you?
Ok, hint taken, will send a revert followed by the patch with modparam removed.
Regards, -Arun
On Wed, 17 Aug 2022, 5:35pm, Arun Easi wrote:
On Tue, 16 Aug 2022, 11:20pm, Greg KH wrote:
On Tue, Aug 16, 2022 at 11:17:39AM -0700, Arun Easi wrote:
On Tue, 16 Aug 2022, 2:30am, Greg KH wrote:
On Mon, Aug 15, 2022 at 03:35:09PM -0700, Arun Easi wrote:
Hi Greg,
On Sat, 13 Aug 2022, 6:46am, Greg KH wrote:
On Sat, Aug 13, 2022 at 03:30:37PM +0200, gregkh@linuxfoundation.org wrote: > The patch below was submitted to be applied to the 5.19-stable tree. > > I fail to see how this patch meets the stable kernel rules as found at > Documentation/process/stable-kernel-rules.rst. > > I could be totally wrong, and if so, please respond to > stable@vger.kernel.org and let me know why this patch should be > applied. Otherwise, it is now dropped from my patch queues, never to be > seen again. > > thanks, > > greg k-h > > ------------------ original commit in Linus's tree ------------------ > > >From b1f707146923335849fb70237eec27d4d1ae7d62 Mon Sep 17 00:00:00 2001 > From: Arun Easi aeasi@marvell.com > Date: Tue, 12 Jul 2022 22:20:39 -0700 > Subject: [PATCH] scsi: qla2xxx: Fix response queue handler reading stale > packets > > On some platforms, the current logic of relying on finding new packet > solely based on signature pattern can lead to driver reading stale > packets. Though this is a bug in those platforms, reduce such exposures by > limiting reading packets until the IN pointer. > > Two module parameters are introduced: > > ql2xrspq_follow_inptr: > > When set, on newer adapters that has queue pointer shadowing, look for > response packets only until response queue in pointer. > > When reset, response packets are read based on a signature pattern > logic (old way). > > ql2xrspq_follow_inptr_legacy: > > Like ql2xrspq_follow_inptr, but for those adapters where there is no > queue pointer shadowing.
On a meta-note, this patch seems VERY wrong. You are adding a driver-wide module option for a device-specific action. That should be a device-specific functionality, not a module.
Again, as I say many times, this isn't the 1990's, please never add new module parameters. Use the other interfaces we have in the kernel to configure individual devices properly, module parameters are not to be used for that at all for anything new.
So, can you revert this commit and do it properly please?
The reason I chose module parameter way was because of these primarily:
1 As this is a platform specific issue, this behavior does not require a device granular level tuning. Either all the said adapters turns the behavior on or off.
What is going to allow a user to know to do this or not? Why is this needed at all, and why doesn't the driver automatically know what to do either based on the device id, or the platform, or the workload?
The change is to err on the side of caution and make the logic a bit conservative at the same time providing an option for those platforms or architectures where the issue is not applicable, but the logic is causing a reduction in performance.
So this is a "enable the driver to work in a broken way" option?
Forcing a user to pick something for "tuning" like this is horrible and messy and almost impossible to support properly over time.
The option is intended for slightly advanced users, platform or os vendors etc. When it comes to an end user, I agree it is challenging to know if a change from default is needed or not.
That's not ok, as a driver writer you need to make it "always work", don't force the user to choose "safe vs. unsafe" options, that's passing the blame.
Just do it in the driver automatically and then there's no need for any options at all.
The platform bug exhibited as driver accessing stale memory, so it is tough to automatically tune the value automatically.
That sounds like a real bug that you need to fix.
Actually, this is a platform bug that got exposed with the driver behavior.
Please revert this change and just fix the issue to always work properly. To have an option that allows a driver to work in a broken way is not acceptable.
If udev route is taken, I'd have to come up with a configuration file to save tunable state, which could be a bit cumbersome and needs documentation and be different (in terms of script location/submitting) distro to distro.
How is a module parameter saving any state anywhere?
Since module parameter could be configured via modprobe.conf/equivalent, a user could set the value of his choice in that file and have the value persisted.
Same for a udev rule, so this is not an issue.
Do you want me to send a patch that reverts this, or will you?
Ok, hint taken, will send a revert followed by the patch with modparam removed.
Greg, would you be alright if the module parameter be converted into a #define? That way, an advanced user, who knows what he is doing, still has the option, and the chances of a regular user accidentally turning on the module parameter incorrectly on the buggy platform avoided as well.
Regards, -Arun
On Wed, Aug 17, 2022 at 05:44:48PM -0700, Arun Easi wrote:
On Wed, 17 Aug 2022, 5:35pm, Arun Easi wrote:
On Tue, 16 Aug 2022, 11:20pm, Greg KH wrote:
On Tue, Aug 16, 2022 at 11:17:39AM -0700, Arun Easi wrote:
On Tue, 16 Aug 2022, 2:30am, Greg KH wrote:
On Mon, Aug 15, 2022 at 03:35:09PM -0700, Arun Easi wrote:
Hi Greg,
On Sat, 13 Aug 2022, 6:46am, Greg KH wrote:
> > ---------------------------------------------------------------------- > On Sat, Aug 13, 2022 at 03:30:37PM +0200, gregkh@linuxfoundation.org wrote: > > The patch below was submitted to be applied to the 5.19-stable tree. > > > > I fail to see how this patch meets the stable kernel rules as found at > > Documentation/process/stable-kernel-rules.rst. > > > > I could be totally wrong, and if so, please respond to > > stable@vger.kernel.org and let me know why this patch should be > > applied. Otherwise, it is now dropped from my patch queues, never to be > > seen again. > > > > thanks, > > > > greg k-h > > > > ------------------ original commit in Linus's tree ------------------ > > > > >From b1f707146923335849fb70237eec27d4d1ae7d62 Mon Sep 17 00:00:00 2001 > > From: Arun Easi aeasi@marvell.com > > Date: Tue, 12 Jul 2022 22:20:39 -0700 > > Subject: [PATCH] scsi: qla2xxx: Fix response queue handler reading stale > > packets > > > > On some platforms, the current logic of relying on finding new packet > > solely based on signature pattern can lead to driver reading stale > > packets. Though this is a bug in those platforms, reduce such exposures by > > limiting reading packets until the IN pointer. > > > > Two module parameters are introduced: > > > > ql2xrspq_follow_inptr: > > > > When set, on newer adapters that has queue pointer shadowing, look for > > response packets only until response queue in pointer. > > > > When reset, response packets are read based on a signature pattern > > logic (old way). > > > > ql2xrspq_follow_inptr_legacy: > > > > Like ql2xrspq_follow_inptr, but for those adapters where there is no > > queue pointer shadowing. > > On a meta-note, this patch seems VERY wrong. You are adding a > driver-wide module option for a device-specific action. That should be > a device-specific functionality, not a module. > > Again, as I say many times, this isn't the 1990's, please never add new > module parameters. Use the other interfaces we have in the kernel to > configure individual devices properly, module parameters are not to be > used for that at all for anything new. > > So, can you revert this commit and do it properly please? >
The reason I chose module parameter way was because of these primarily:
1 As this is a platform specific issue, this behavior does not require a device granular level tuning. Either all the said adapters turns the behavior on or off.
What is going to allow a user to know to do this or not? Why is this needed at all, and why doesn't the driver automatically know what to do either based on the device id, or the platform, or the workload?
The change is to err on the side of caution and make the logic a bit conservative at the same time providing an option for those platforms or architectures where the issue is not applicable, but the logic is causing a reduction in performance.
So this is a "enable the driver to work in a broken way" option?
Forcing a user to pick something for "tuning" like this is horrible and messy and almost impossible to support properly over time.
The option is intended for slightly advanced users, platform or os vendors etc. When it comes to an end user, I agree it is challenging to know if a change from default is needed or not.
That's not ok, as a driver writer you need to make it "always work", don't force the user to choose "safe vs. unsafe" options, that's passing the blame.
Just do it in the driver automatically and then there's no need for any options at all.
The platform bug exhibited as driver accessing stale memory, so it is tough to automatically tune the value automatically.
That sounds like a real bug that you need to fix.
Actually, this is a platform bug that got exposed with the driver behavior.
What are you meaning by "platform"? The kernel? Or your hardware device?
Please revert this change and just fix the issue to always work properly. To have an option that allows a driver to work in a broken way is not acceptable.
If udev route is taken, I'd have to come up with a configuration file to save tunable state, which could be a bit cumbersome and needs documentation and be different (in terms of script location/submitting) distro to distro.
How is a module parameter saving any state anywhere?
Since module parameter could be configured via modprobe.conf/equivalent, a user could set the value of his choice in that file and have the value persisted.
Same for a udev rule, so this is not an issue.
Do you want me to send a patch that reverts this, or will you?
Ok, hint taken, will send a revert followed by the patch with modparam removed.
Greg, would you be alright if the module parameter be converted into a #define? That way, an advanced user, who knows what he is doing, still has the option, and the chances of a regular user accidentally turning on the module parameter incorrectly on the buggy platform avoided as well.
No one will ever turn on such a thing if it is a #define, as no "enterprise" customer will ever build a custom kernel (it voids their warranty) and for the rest of the world that uses stock kernel.org kernels, they would never want to change the code to "enable an option that is known to break things."
Just fix the driver to work properly, don't try to paper over it with "change this line of code to go faster but it might break!" options.
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org