On 9/17/21 2:31 AM, Daniel Lezcano wrote:
On 07/09/2021 21:01, Subbaraman Narayanamurthy wrote:
of_parse_thermal_zones() parses the thermal-zones node and registers a thermal_zone device for each subnode. However, if a thermal zone is consuming a thermal sensor and that thermal sensor device hasn't probed yet, an attempt to set trip_point_*_temp for that thermal zone device can cause a NULL pointer dereference. Fix it.
console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp ... Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 ... Call trace: of_thermal_set_trip_temp+0x40/0xc4 trip_point_temp_store+0xc0/0x1dc dev_attr_store+0x38/0x88 sysfs_kf_write+0x64/0xc0 kernfs_fop_write_iter+0x108/0x1d0 vfs_write+0x2f4/0x368 ksys_write+0x7c/0xec __arm64_sys_write+0x20/0x30 el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc do_el0_svc+0x28/0xa0 el0_svc+0x14/0x24 el0_sync_handler+0x88/0xec el0_sync+0x1c0/0x200
While at it, fix the possible NULL pointer dereference in other functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(), of_thermal_get_trend().
Cc: stable@vger.kernel.org Suggested-by: David Collins quic_collinsd@quicinc.com Signed-off-by: Subbaraman Narayanamurthy quic_subbaram@quicinc.com
Changes for v2:
- Added checks in of_thermal_get_temp(), of_thermal_set_emul_temp(), of_thermal_get_trend().
drivers/thermal/thermal_of.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c index 6379f26..9233f7e 100644 --- a/drivers/thermal/thermal_of.c +++ b/drivers/thermal/thermal_of.c @@ -89,7 +89,7 @@ static int of_thermal_get_temp(struct thermal_zone_device *tz, { struct __thermal_zone *data = tz->devdata;
- if (!data->ops->get_temp)
- if (!data->ops || !data->ops->get_temp)
comment (1)
AFAICT, if data->ops != NULL then data->ops->get_temp is also != NULL because of the code allocating/freeing the ops structure.
The tests can be replaced by (!data->ops), no ?
Thanks Daniel for reviewing the patch.
I agree that even if a sensor module is unregistered, that would call "thermal_zone_of_sensor_unregister" which would eventually set NULL on get_temp() and get_trend() and "tzd->ops" as well.
However, of_thermal_get_temp() is trying to call "data->ops->get_temp" which comes from a sensor driver when it registers. There is no guarantee that it would be non-NULL right?
Thinking of which, I think having both checks would be valid.
return -EINVAL;
return data->ops->get_temp(data->sensor_data, temp); @@ -186,6 +186,9 @@ static int of_thermal_set_emul_temp(struct thermal_zone_device *tz, { struct __thermal_zone *data = tz->devdata;
- if (!data->ops || !data->ops->set_emul_temp)
return -EINVAL;
comment (2)
The test looks pointless here (I mean both of them).
If of_thermal_set_emul_temp() is called it is because the callback was set in thermal_zone_of_add_sensor().
This one does:
tz->ops = ops;
and if (ops->set_emul_temp) tzd->ops->set_emul_temp = of_thermal_set_emul_temp;
If I'm not wrong if we are called, then data->ops && data->ops->set_emul_temp is always true, right ?
I've not exercised this condition yet. However, the original problem we've observed was when thermal HAL was trying to set trip thresholds on a thermal zone device for which the sensor device is not probed yet. This had happened randomly because of vendor modules taking time to be loaded and probed. I don't know if there would be any userspace entity that can try to set emulated temperature for a thermal zone even before a sensor device is probed.
Without a sensor driver probed, "tz->ops" would not have a valid pointer right? So, I think checking for "data->ops" should be good.
Another possibility is, a sensor might not have "set_emul_temp" callback. So checking for "ops->set_emul_temp" should be still valid.
return data->ops->set_emul_temp(data->sensor_data, temp); } @@ -194,7 +197,7 @@ static int of_thermal_get_trend(struct thermal_zone_device *tz, int trip, { struct __thermal_zone *data = tz->devdata;
- if (!data->ops->get_trend)
- if (!data->ops || !data->ops->get_trend) return -EINVAL;
Same comment as (1)
of_thermal_get_trend() is trying to call "data->ops->get_trend" which comes from a sensor driver when it registers. From what I can see, there are lot of drivers which don't pass "get_trend" in their ops. So there is no guarantee that it would be non-NULL right?
Thinking of which, I think having both checks would be valid.
return data->ops->get_trend(data->sensor_data, trip, trend); @@ -301,7 +304,7 @@ static int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip, if (trip >= data->ntrips || trip < 0) return -EDOM;
- if (data->ops->set_trip_temp) {
- if (data->ops && data->ops->set_trip_temp) {
Same comment as (2)
Without a sensor driver probed, "tz->ops" would not have a valid pointer right? So, I think checking for "data->ops" should be good. Another possibility is, a sensor device might not have "set_trip_temp" callback but just "set_trips".
So checking for "data->ops->set_trip_temp" might be still valid.
int ret;
ret = data->ops->set_trip_temp(data->sensor_data, trip, temp);