When of_find_net_device_by_node() successfully acquires a reference to a network device but the subsequent call to dsa_port_parse_cpu() fails, dsa_port_parse_of() returns without releasing the reference count on the network device.
of_find_net_device_by_node() increments the reference count of the returned structure, which should be balanced with a corresponding put_device() when the reference is no longer needed.
Found by code review.
Cc: stable@vger.kernel.org Fixes: deff710703d8 ("net: dsa: Allow default tag protocol to be overridden from DT") Signed-off-by: Ma Ke make24@iscas.ac.cn --- Changes in v2: - simplified the patch as suggestions; - modified the Fixes tag as suggestions. --- net/dsa/dsa.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index a20efabe778f..31b409a47491 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -1247,6 +1247,7 @@ static int dsa_port_parse_of(struct dsa_port *dp, struct device_node *dn) struct device_node *ethernet = of_parse_phandle(dn, "ethernet", 0); const char *name = of_get_property(dn, "label", NULL); bool link = of_property_read_bool(dn, "link"); + int err = 0;
dp->dn = dn;
@@ -1260,7 +1261,11 @@ static int dsa_port_parse_of(struct dsa_port *dp, struct device_node *dn) return -EPROBE_DEFER;
user_protocol = of_get_property(dn, "dsa-tag-protocol", NULL); - return dsa_port_parse_cpu(dp, conduit, user_protocol); + err = dsa_port_parse_cpu(dp, conduit, user_protocol); + if (err) + put_device(conduit); + + return err; }
if (link)
Hi,
On 12/14/25 14:12, Ma Ke wrote:
When of_find_net_device_by_node() successfully acquires a reference to
Your subject is missing the () of dsa_port_parse_of()
a network device but the subsequent call to dsa_port_parse_cpu() fails, dsa_port_parse_of() returns without releasing the reference count on the network device.
of_find_net_device_by_node() increments the reference count of the returned structure, which should be balanced with a corresponding put_device() when the reference is no longer needed.
Found by code review.
I agree with the reference not being properly released on failure, but I don't think this fix is complete.
I was trying to figure out where the put_device() would happen in the success case (or on removal), and I failed to find it.
Also if the (indirect) top caller of dsa_port_parse_of(), dsa_switch_probe(), fails at a later place the reference won't be released either.
The only explicit put_device() that happens is in dsa_dev_to_net_device(), which seems to convert a device reference to a netdev reference via dev_hold().
But the only caller of that, dsa_port_parse() immediately calls dev_put() on it, essentially dropping all references, and then continuing using it.
dsa_switch_shutdown() talks about dropping references taken via netdev_upper_dev_link(), but AFAICT this happens only after dsa_port_parse{,_of}() setup the conduit, so it looks like there could be a window without any reference held onto the conduit.
So AFAICT the current state is:
dsa_port_parse_of() keeps the device reference. dsa_port_parse() drops the device reference, and shortly has a dev_hold(), but it does not extend beyond the function.
Therefore if my analysis is correct (which it may very well not be), the correct fix(es) here could be:
dsa_port_parse{,_of}() should keep a reference via e.g. dev_hold() on success to the conduit.
Or maybe they should unconditionally drop if *after* calling dsa_port_parse_cpu(), and dsa_port_parse_cpu() should take one when assigning dsa_port::conduit.
Regardless, the end result should be that there is a reference on the conduit stored in dsa_port::conduit.
dsa_switch_release_ports() should drop the references, as this seems to be called in all error paths of dsa_port_parse{,of} as well by dsa_switch_remove().
And maybe dsa_switch_shutdown() then also needs to drop the reference? Though it may need to then retake the reference on resume, and I don't know where that exactly should happen. Maybe it should also lookup the conduit(s) again to be correct.
But here I'm more doing educated guesses then actually knowing what's correct.
The alternative/quick "fix" would be to just drop the reference unconditionally, which would align the behaviour to that of dsa_port_parse(). Not sure if it should mirror the dev_hold() / dev_put() spiel as well.
Not that I think this would be the correct behaviour though.
Sorry for the lengthy review/train of thought.
Best regards, Jonas
Cc: stable@vger.kernel.org Fixes: deff710703d8 ("net: dsa: Allow default tag protocol to be overridden from DT") Signed-off-by: Ma Ke make24@iscas.ac.cn
Changes in v2:
- simplified the patch as suggestions;
- modified the Fixes tag as suggestions.
net/dsa/dsa.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index a20efabe778f..31b409a47491 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -1247,6 +1247,7 @@ static int dsa_port_parse_of(struct dsa_port *dp, struct device_node *dn) struct device_node *ethernet = of_parse_phandle(dn, "ethernet", 0); const char *name = of_get_property(dn, "label", NULL); bool link = of_property_read_bool(dn, "link");
- int err = 0;
dp->dn = dn; @@ -1260,7 +1261,11 @@ static int dsa_port_parse_of(struct dsa_port *dp, struct device_node *dn) return -EPROBE_DEFER; user_protocol = of_get_property(dn, "dsa-tag-protocol", NULL);
return dsa_port_parse_cpu(dp, conduit, user_protocol);
err = dsa_port_parse_cpu(dp, conduit, user_protocol);if (err)put_device(conduit); }return err;if (link)
Hi Jonas, Ma Ke,
On Sun, Dec 14, 2025 at 05:02:33PM +0100, Jonas Gorski wrote:
Hi,
On 12/14/25 14:12, Ma Ke wrote:
When of_find_net_device_by_node() successfully acquires a reference to
Your subject is missing the () of dsa_port_parse_of()
a network device but the subsequent call to dsa_port_parse_cpu() fails, dsa_port_parse_of() returns without releasing the reference count on the network device.
of_find_net_device_by_node() increments the reference count of the returned structure, which should be balanced with a corresponding put_device() when the reference is no longer needed.
Found by code review.
I agree with the reference not being properly released on failure, but I don't think this fix is complete.
I was trying to figure out where the put_device() would happen in the success case (or on removal), and I failed to find it.
Also if the (indirect) top caller of dsa_port_parse_of(), dsa_switch_probe(), fails at a later place the reference won't be released either.
The only explicit put_device() that happens is in dsa_dev_to_net_device(), which seems to convert a device reference to a netdev reference via dev_hold().
But the only caller of that, dsa_port_parse() immediately calls dev_put() on it, essentially dropping all references, and then continuing using it.
dsa_switch_shutdown() talks about dropping references taken via netdev_upper_dev_link(), but AFAICT this happens only after dsa_port_parse{,_of}() setup the conduit, so it looks like there could be a window without any reference held onto the conduit.
So AFAICT the current state is:
dsa_port_parse_of() keeps the device reference. dsa_port_parse() drops the device reference, and shortly has a dev_hold(), but it does not extend beyond the function.
Therefore if my analysis is correct (which it may very well not be), the correct fix(es) here could be:
dsa_port_parse{,_of}() should keep a reference via e.g. dev_hold() on success to the conduit.
Or maybe they should unconditionally drop if *after* calling dsa_port_parse_cpu(), and dsa_port_parse_cpu() should take one when assigning dsa_port::conduit.
Regardless, the end result should be that there is a reference on the conduit stored in dsa_port::conduit.
dsa_switch_release_ports() should drop the references, as this seems to be called in all error paths of dsa_port_parse{,of} as well by dsa_switch_remove().
And maybe dsa_switch_shutdown() then also needs to drop the reference? Though it may need to then retake the reference on resume, and I don't know where that exactly should happen. Maybe it should also lookup the conduit(s) again to be correct.
But here I'm more doing educated guesses then actually knowing what's correct.
The alternative/quick "fix" would be to just drop the reference unconditionally, which would align the behaviour to that of dsa_port_parse(). Not sure if it should mirror the dev_hold() / dev_put() spiel as well.
Not that I think this would be the correct behaviour though.
Sorry for the lengthy review/train of thought.
Best regards, Jonas
Thank you for your thoughts on this topic. Indeed there is a problem, for which I managed to find a few hours today to investigate. I was going to just submit a patch directly and refer Ma Ke to it directly, but since you started looking into the situation as well, I just thought I'd reply "please standby". It's currently undergoing testing.
On Sun, Dec 14, 2025 at 5:10 PM Vladimir Oltean olteanv@gmail.com wrote:
Hi Jonas, Ma Ke,
On Sun, Dec 14, 2025 at 05:02:33PM +0100, Jonas Gorski wrote:
Hi,
On 12/14/25 14:12, Ma Ke wrote:
When of_find_net_device_by_node() successfully acquires a reference to
Your subject is missing the () of dsa_port_parse_of()
a network device but the subsequent call to dsa_port_parse_cpu() fails, dsa_port_parse_of() returns without releasing the reference count on the network device.
of_find_net_device_by_node() increments the reference count of the returned structure, which should be balanced with a corresponding put_device() when the reference is no longer needed.
Found by code review.
I agree with the reference not being properly released on failure, but I don't think this fix is complete.
I was trying to figure out where the put_device() would happen in the success case (or on removal), and I failed to find it.
Also if the (indirect) top caller of dsa_port_parse_of(), dsa_switch_probe(), fails at a later place the reference won't be released either.
The only explicit put_device() that happens is in dsa_dev_to_net_device(), which seems to convert a device reference to a netdev reference via dev_hold().
But the only caller of that, dsa_port_parse() immediately calls dev_put() on it, essentially dropping all references, and then continuing using it.
dsa_switch_shutdown() talks about dropping references taken via netdev_upper_dev_link(), but AFAICT this happens only after dsa_port_parse{,_of}() setup the conduit, so it looks like there could be a window without any reference held onto the conduit.
So AFAICT the current state is:
dsa_port_parse_of() keeps the device reference. dsa_port_parse() drops the device reference, and shortly has a dev_hold(), but it does not extend beyond the function.
Therefore if my analysis is correct (which it may very well not be), the correct fix(es) here could be:
dsa_port_parse{,_of}() should keep a reference via e.g. dev_hold() on success to the conduit.
Or maybe they should unconditionally drop if *after* calling dsa_port_parse_cpu(), and dsa_port_parse_cpu() should take one when assigning dsa_port::conduit.
Regardless, the end result should be that there is a reference on the conduit stored in dsa_port::conduit.
dsa_switch_release_ports() should drop the references, as this seems to be called in all error paths of dsa_port_parse{,of} as well by dsa_switch_remove().
And maybe dsa_switch_shutdown() then also needs to drop the reference? Though it may need to then retake the reference on resume, and I don't know where that exactly should happen. Maybe it should also lookup the conduit(s) again to be correct.
But here I'm more doing educated guesses then actually knowing what's correct.
The alternative/quick "fix" would be to just drop the reference unconditionally, which would align the behaviour to that of dsa_port_parse(). Not sure if it should mirror the dev_hold() / dev_put() spiel as well.
Not that I think this would be the correct behaviour though.
Sorry for the lengthy review/train of thought.
Best regards, Jonas
Thank you for your thoughts on this topic. Indeed there is a problem, for which I managed to find a few hours today to investigate. I was going to just submit a patch directly and refer Ma Ke to it directly, but since you started looking into the situation as well, I just thought I'd reply "please standby". It's currently undergoing testing.
A patch already, that's even better! I'll gladly stand by :)
Best regards, Jonas
On Sun, Dec 14, 2025 at 05:14:04PM +0100, Jonas Gorski wrote:
On Sun, Dec 14, 2025 at 5:10 PM Vladimir Oltean olteanv@gmail.com wrote:
Hi Jonas, Ma Ke,
On Sun, Dec 14, 2025 at 05:02:33PM +0100, Jonas Gorski wrote:
Hi,
On 12/14/25 14:12, Ma Ke wrote:
When of_find_net_device_by_node() successfully acquires a reference to
Your subject is missing the () of dsa_port_parse_of()
a network device but the subsequent call to dsa_port_parse_cpu() fails, dsa_port_parse_of() returns without releasing the reference count on the network device.
of_find_net_device_by_node() increments the reference count of the returned structure, which should be balanced with a corresponding put_device() when the reference is no longer needed.
Found by code review.
I agree with the reference not being properly released on failure, but I don't think this fix is complete.
I was trying to figure out where the put_device() would happen in the success case (or on removal), and I failed to find it.
Also if the (indirect) top caller of dsa_port_parse_of(), dsa_switch_probe(), fails at a later place the reference won't be released either.
The only explicit put_device() that happens is in dsa_dev_to_net_device(), which seems to convert a device reference to a netdev reference via dev_hold().
But the only caller of that, dsa_port_parse() immediately calls dev_put() on it, essentially dropping all references, and then continuing using it.
dsa_switch_shutdown() talks about dropping references taken via netdev_upper_dev_link(), but AFAICT this happens only after dsa_port_parse{,_of}() setup the conduit, so it looks like there could be a window without any reference held onto the conduit.
So AFAICT the current state is:
dsa_port_parse_of() keeps the device reference. dsa_port_parse() drops the device reference, and shortly has a dev_hold(), but it does not extend beyond the function.
Therefore if my analysis is correct (which it may very well not be), the correct fix(es) here could be:
dsa_port_parse{,_of}() should keep a reference via e.g. dev_hold() on success to the conduit.
Or maybe they should unconditionally drop if *after* calling dsa_port_parse_cpu(), and dsa_port_parse_cpu() should take one when assigning dsa_port::conduit.
Regardless, the end result should be that there is a reference on the conduit stored in dsa_port::conduit.
dsa_switch_release_ports() should drop the references, as this seems to be called in all error paths of dsa_port_parse{,of} as well by dsa_switch_remove().
And maybe dsa_switch_shutdown() then also needs to drop the reference? Though it may need to then retake the reference on resume, and I don't know where that exactly should happen. Maybe it should also lookup the conduit(s) again to be correct.
But here I'm more doing educated guesses then actually knowing what's correct.
The alternative/quick "fix" would be to just drop the reference unconditionally, which would align the behaviour to that of dsa_port_parse(). Not sure if it should mirror the dev_hold() / dev_put() spiel as well.
Not that I think this would be the correct behaviour though.
Sorry for the lengthy review/train of thought.
Best regards, Jonas
Thank you for your thoughts on this topic. Indeed there is a problem, for which I managed to find a few hours today to investigate. I was going to just submit a patch directly and refer Ma Ke to it directly, but since you started looking into the situation as well, I just thought I'd reply "please standby". It's currently undergoing testing.
A patch already, that's even better! I'll gladly stand by :)
Best regards, Jonas
Patch location (for tracking purposes): https://lore.kernel.org/netdev/20251214182449.3900190-1-vladimir.oltean@nxp....
Ma Ke's patch unfortunately does not compile even in v2, so:
pw-bot: cr
(although "changes requested" is a figure of speech here, in the sense that I don't expect a follow-up patch from Ma Ke)
Hi Ma,
kernel test robot noticed the following build errors:
[auto build test ERROR on net-next/main] [also build test ERROR on net/main linus/master v6.19-rc1 next-20251219] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ma-Ke/net-dsa-Fix-error-handl... base: net-next/main patch link: https://lore.kernel.org/r/20251214131204.4684-1-make24%40iscas.ac.cn patch subject: [PATCH v2] net: dsa: Fix error handling in dsa_port_parse_of config: s390-allmodconfig (https://download.01.org/0day-ci/archive/20251220/202512202210.PkOvVaCv-lkp@i...) compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251220/202512202210.PkOvVaCv-lkp@i...)
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot lkp@intel.com | Closes: https://lore.kernel.org/oe-kbuild-all/202512202210.PkOvVaCv-lkp@intel.com/
All errors (new ones prefixed by >>):
net/dsa/dsa.c:1266:15: error: incompatible pointer types passing 'struct net_device *' to parameter of type 'struct device *' [-Werror,-Wincompatible-pointer-types]
1266 | put_device(conduit); | ^~~~~~~ include/linux/device.h:1181:32: note: passing argument to parameter 'dev' here 1181 | void put_device(struct device *dev); | ^ 1 error generated.
vim +1266 net/dsa/dsa.c
1244 1245 static int dsa_port_parse_of(struct dsa_port *dp, struct device_node *dn) 1246 { 1247 struct device_node *ethernet = of_parse_phandle(dn, "ethernet", 0); 1248 const char *name = of_get_property(dn, "label", NULL); 1249 bool link = of_property_read_bool(dn, "link"); 1250 int err = 0; 1251 1252 dp->dn = dn; 1253 1254 if (ethernet) { 1255 struct net_device *conduit; 1256 const char *user_protocol; 1257 1258 conduit = of_find_net_device_by_node(ethernet); 1259 of_node_put(ethernet); 1260 if (!conduit) 1261 return -EPROBE_DEFER; 1262 1263 user_protocol = of_get_property(dn, "dsa-tag-protocol", NULL); 1264 err = dsa_port_parse_cpu(dp, conduit, user_protocol); 1265 if (err)
1266 put_device(conduit);
1267 1268 return err; 1269 } 1270 1271 if (link) 1272 return dsa_port_parse_dsa(dp); 1273 1274 return dsa_port_parse_user(dp, name); 1275 } 1276
Hi Ma,
kernel test robot noticed the following build errors:
[auto build test ERROR on net-next/main] [also build test ERROR on net/main linus/master v6.19-rc1 next-20251219] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ma-Ke/net-dsa-Fix-error-handl... base: net-next/main patch link: https://lore.kernel.org/r/20251214131204.4684-1-make24%40iscas.ac.cn patch subject: [PATCH v2] net: dsa: Fix error handling in dsa_port_parse_of config: riscv-allmodconfig (https://download.01.org/0day-ci/archive/20251221/202512210004.b6WzJwXn-lkp@i...) compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project b324c9f4fa112d61a553bf489b5f4f7ceea05ea8) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251221/202512210004.b6WzJwXn-lkp@i...)
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot lkp@intel.com | Closes: https://lore.kernel.org/oe-kbuild-all/202512210004.b6WzJwXn-lkp@intel.com/
All errors (new ones prefixed by >>):
net/dsa/dsa.c:1266:15: error: incompatible pointer types passing 'struct net_device *' to parameter of type 'struct device *' [-Wincompatible-pointer-types]
1266 | put_device(conduit); | ^~~~~~~ include/linux/device.h:1181:32: note: passing argument to parameter 'dev' here 1181 | void put_device(struct device *dev); | ^ 1 error generated.
vim +1266 net/dsa/dsa.c
1244 1245 static int dsa_port_parse_of(struct dsa_port *dp, struct device_node *dn) 1246 { 1247 struct device_node *ethernet = of_parse_phandle(dn, "ethernet", 0); 1248 const char *name = of_get_property(dn, "label", NULL); 1249 bool link = of_property_read_bool(dn, "link"); 1250 int err = 0; 1251 1252 dp->dn = dn; 1253 1254 if (ethernet) { 1255 struct net_device *conduit; 1256 const char *user_protocol; 1257 1258 conduit = of_find_net_device_by_node(ethernet); 1259 of_node_put(ethernet); 1260 if (!conduit) 1261 return -EPROBE_DEFER; 1262 1263 user_protocol = of_get_property(dn, "dsa-tag-protocol", NULL); 1264 err = dsa_port_parse_cpu(dp, conduit, user_protocol); 1265 if (err)
1266 put_device(conduit);
1267 1268 return err; 1269 } 1270 1271 if (link) 1272 return dsa_port_parse_dsa(dp); 1273 1274 return dsa_port_parse_user(dp, name); 1275 } 1276
linux-stable-mirror@lists.linaro.org