On Wed, Jul 6, 2022 at 12:21 PM Greg KH gregkh@linuxfoundation.org wrote:
On Wed, Jul 06, 2022 at 12:01:19PM +0200, Varad Gautam wrote:
On Wed, Jul 6, 2022 at 11:21 AM Greg KH gregkh@linuxfoundation.org wrote:
On Wed, Jul 06, 2022 at 04:51:59PM +0800, Zhang Rui wrote:
On Wed, 2022-07-06 at 09:16 +0200, Varad Gautam wrote:
On Wed, Jul 6, 2022 at 8:45 AM Greg KH gregkh@linuxfoundation.org wrote:
On Tue, Jul 05, 2022 at 11:02:50PM +0200, Varad Gautam wrote:
> On Tue, Jul 5, 2022 at 6:18 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Jul 05, 2022 at 03:00:02PM +0000, Varad Gautam wrote:
> > > Check that a user-provided thermal state is within the maximum
> > > thermal states supported by a given driver before attempting to
> > > apply it. This prevents a subsequent OOB access in
> > > thermal_cooling_device_stats_update() while performing
> > > state-transition accounting on drivers that do not have this check
> > > in their set_cur_state() handle.
> > >
> > > Signed-off-by: Varad Gautam varadgautam@google.com
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  drivers/thermal/thermal_sysfs.c | 12 +++++++++++-
> > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
> > > index 1c4aac8464a7..0c6b0223b133 100644
> > > --- a/drivers/thermal/thermal_sysfs.c
> > > +++ b/drivers/thermal/thermal_sysfs.c
> > > @@ -607,7 +607,7 @@ cur_state_store(struct device *dev, struct device_attribute *attr,
> > >  		const char *buf, size_t count)
> > >  {
> > >  	struct thermal_cooling_device *cdev = to_cooling_device(dev);
> > > -	unsigned long state;
> > > +	unsigned long state, max_state;
> > >  	int result;
> > >
> > >  	if (sscanf(buf, "%ld\n", &state) != 1)
> > > @@ -618,10 +618,20 @@ cur_state_store(struct device *dev, struct device_attribute *attr,
> > >
> > >  	mutex_lock(&cdev->lock);
> > >
> > > +	result = cdev->ops->get_max_state(cdev, &max_state);
> > > +	if (result)
> > > +		goto unlock;
> > > +
> > > +	if (state > max_state) {
> > > +		result = -EINVAL;
> > > +		goto unlock;
> > > +	}
> > > +
> > >  	result = cdev->ops->set_cur_state(cdev, state);
> >
> > Why doesn't set_cur_state() check the max state before setting it? Why
> > are the callers forced to always check it before? That feels wrong...
>
> The problem lies in thermal_cooling_device_stats_update(), not
> set_cur_state().
>
> If ->set_cur_state() doesn't error out on invalid state,
> thermal_cooling_device_stats_update() does a:
>
> stats->trans_table[stats->state * stats->max_states + new_state]++;
>
> stats->trans_table reserves space depending on max_states, but we'd end up
> reading/writing outside it. cur_state_store() can prevent this regardless of
> the driver's ->set_cur_state() implementation.
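To make the index arithmetic in the quoted expression concrete, here is a minimal userspace sketch. The `demo_` names are hypothetical, and the sketch assumes the table holds max_states * max_states counters with max_states equal to the highest valid state plus one; it is not the kernel's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the stats bookkeeping discussed in the thread.
 * Assumption: the transition table has max_states * max_states entries. */
struct demo_stats {
	unsigned long state;      /* current state, selects the row */
	unsigned long max_states; /* number of valid states (max state + 1) */
};

/* Flat index used for a transition from stats->state to new_state. */
static size_t demo_trans_index(const struct demo_stats *stats,
			       unsigned long new_state)
{
	return stats->state * stats->max_states + new_state;
}

/* True if that index still falls inside the max_states * max_states table.
 * Note that an in-bounds index can still be wrong: a too-large new_state
 * in a non-final row lands in a later row. */
static bool demo_index_in_bounds(const struct demo_stats *stats,
				 unsigned long new_state)
{
	return demo_trans_index(stats, new_state) <
	       stats->max_states * stats->max_states;
}
```

With max_states = 4 and state = 3, a new_state of 4 gives index 3 * 4 + 4 = 16, which is one past the end of a 16-entry table.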
Why wouldn't cur_state_store() check for an out-of-bounds condition by calling get_max_state() and then return an error if it is invalid, preventing thermal_cooling_device_stats_update() from ever being called?
That's what this patch does: it adds the out-of-bounds check.
No, I think Greg's question is why cdev->ops->set_cur_state() returns 0 when setting a cooling state that exceeds the maximum cooling state?
Yes, that is what I am asking; it should not allow a state to be exceeded.
Indeed, it is up to the driver to return !0 from cdev->ops->set_cur_state() when setting state > max - and not doing so is a driver bug.
But a buggy driver should not lead to cur_state_store() performing an OOB access.
Agreed, which is why the code that does the access should check before it does so. Right now you are relying on the sysfs code to do so, which seems very wrong.
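As an illustration of checking at the point of access, a rough userspace sketch could look like the following. The `checked_` names, the int return value and the simplified struct are assumptions made for this example; the real thermal_cooling_device_stats_update() is not reproduced here.

```c
#include <errno.h>

/* Hypothetical, simplified model of the per-device transition stats. */
struct checked_stats {
	unsigned long state;       /* current state, selects the row */
	unsigned long max_states;  /* number of valid states */
	unsigned int *trans_table; /* max_states * max_states counters */
};

/* Validate new_state right before it is used as an index, so that no
 * caller can trigger an out-of-bounds write into trans_table. */
static int checked_stats_update(struct checked_stats *stats,
				unsigned long new_state)
{
	if (new_state >= stats->max_states)
		return -EINVAL;

	if (new_state == stats->state)
		return 0; /* no transition to account for */

	stats->trans_table[stats->state * stats->max_states + new_state]++;
	stats->state = new_state;
	return 0;
}
```

Placing the check here protects every caller of the update function, not only the sysfs store path.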
I see the point.
The OOB access happens in thermal_cooling_device_stats_update().
By placing the check in cur_state_store(), I'm trying to ensure two things for a buggy driver:
1. The driver's cdev->ops->set_cur_state() doesn't get called if the new state is > max state. This is to prevent the driver from storing the new (invalid) state internally. If the driver didn't realise/reject an invalid state, chances are it will try to propagate it internally and take actions according to that, which can have side effects on system stability.
2. The kernel doesn't do an OOB access in thermal_cooling_device_stats_update().
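The first point can be sketched in userspace with a deliberately buggy driver that accepts any state. Everything prefixed `demo_` or `buggy_` is hypothetical and only models the order of checks from the patch; locking, sysfs parsing and stats accounting are left out.

```c
#include <errno.h>
#include <stddef.h>

struct demo_cdev;

/* Hypothetical model of the two driver callbacks named in the thread. */
struct demo_cdev_ops {
	int (*get_max_state)(struct demo_cdev *cdev, unsigned long *max);
	int (*set_cur_state)(struct demo_cdev *cdev, unsigned long state);
};

struct demo_cdev {
	const struct demo_cdev_ops *ops;
	unsigned long max_state;
	unsigned long cur_state;
	unsigned int set_calls; /* how many times the driver callback ran */
};

static int buggy_get_max_state(struct demo_cdev *cdev, unsigned long *max)
{
	*max = cdev->max_state;
	return 0;
}

/* Buggy driver: stores whatever it is given and reports success. */
static int buggy_set_cur_state(struct demo_cdev *cdev, unsigned long state)
{
	cdev->set_calls++;
	cdev->cur_state = state;
	return 0;
}

static const struct demo_cdev_ops buggy_ops = {
	.get_max_state = buggy_get_max_state,
	.set_cur_state = buggy_set_cur_state,
};

/* Same order as in the patch: query the maximum, reject out-of-range
 * input, and only then hand the state to the driver. */
static int demo_cur_state_store(struct demo_cdev *cdev, unsigned long state)
{
	unsigned long max_state;
	int result;

	result = cdev->ops->get_max_state(cdev, &max_state);
	if (result)
		return result;

	if (state > max_state)
		return -EINVAL;

	return cdev->ops->set_cur_state(cdev, state);
}
```

In this model, an out-of-range request returns -EINVAL without the buggy callback ever running, so the driver never records the invalid state.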
thanks,
greg k-h