You probably already know how to reproduce it, but in case it helps, I still have the packet captures. Let me know if you'd like them, and how best to send them.
It would be best if you could provide a reproducer using iproute2: configure a dummy device using ip-link, install the multipath route using ip-route, configure the neighbour table using ip-neigh, and then perform route queries using "ip route get ..." to show the problem. We can then use it as the basis for a new test case in tools/testing/selftests/net/fib_tests.sh.
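For reference, a rough sketch of what such a reproducer might look like, following the steps above. Everything specific here is an assumption for illustration: the namespace name, the 2001:db8::/32 documentation addresses, the MAC address and the number of queries are placeholders, and the run/setup/query/cleanup helpers are hypothetical. Dummy devices come up with NOARP set, so the sketch turns ARP/ND back on; whether the kernel accepts "nud incomplete" from user space for this purpose is also an assumption ("nud failed" or a veth pair may be needed instead). By default it only prints the commands; set DRY_RUN=0 and run as root to execute them.

```shell
#!/bin/sh
# Sketch of a reproducer; every name and address below is a placeholder.
NS=${NS:-mpath-repro}
DEV=${DEV:-dummy0}
GW1=${GW1:-2001:db8:1::2}   # next hop we mark REACHABLE
GW2=${GW2:-2001:db8:1::3}   # next hop we try to mark INCOMPLETE (assumption)
DRY_RUN=${DRY_RUN:-1}       # 1: only print the commands; 0: execute them (needs root)

# Print the command in dry-run mode, otherwise execute it.
run() {
	if [ "$DRY_RUN" = 1 ]; then
		echo "$*"
	else
		"$@"
	fi
}

setup() {
	run ip netns add "$NS"
	run ip -n "$NS" link add "$DEV" type dummy
	# dummy devices are created with NOARP; re-enable neighbour discovery
	run ip -n "$NS" link set "$DEV" arp on
	run ip -n "$NS" link set "$DEV" up
	run ip -n "$NS" -6 addr add 2001:db8:1::1/64 dev "$DEV" nodad
	# one coalesced default multipath route with two next hops
	run ip -n "$NS" -6 route add default \
		nexthop via "$GW1" dev "$DEV" \
		nexthop via "$GW2" dev "$DEV"
	run ip -n "$NS" -6 neigh replace "$GW1" lladdr 00:11:22:33:44:55 nud reachable dev "$DEV"
	run ip -n "$NS" -6 neigh replace "$GW2" nud incomplete dev "$DEV"
}

# Vary the destination so the multipath hash has a chance to pick either next hop.
query() {
	for i in 1 2 3 4 5 6 7 8; do
		run ip -n "$NS" -6 route get "2001:db8:2::$i"
	done
}

cleanup() {
	run ip netns del "$NS"
}

setup
query
cleanup
```

If the problem reproduces, some of the "ip route get" results would be expected to show the second next hop even though only the first one is REACHABLE.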
I'll try to do that next week.
BTW, do you have CONFIG_IPV6_ROUTER_PREF=y in your config?
Yes.
$ gunzip -c /proc/config.gz | grep ROUTER_PREF
CONFIG_IPV6_ROUTER_PREF=y
As such, it still seems appropriate (to me) that this be implemented in the legacy API as well as ensuring it works with the NH API.
As I understand it, you currently get different results because the kernel installs two default routes, whereas user space can only create one default multipath route.
Yes, that's the end result of an underlying problem.
Perhaps more to the point, the fact that a coalesced multipath route whose next hop is INCOMPLETE is selected when a REACHABLE alternative exists is what prevents us from using coalesced multipath routes. This seems like a bug, since it violates RFC 4861, section 6.3.6, bullet 1.
Imagine adding a 2nd router to an IPv6 network for added resiliency, but when one becomes unreachable, some network flows keep choosing the unreachable router. This is what is happening with ECMP routes. It doesn't happen with multiple default routes.
To reiterate my earlier comments: this doesn't happen all of the time. It seems I have a 50/50 chance of the INCOMPLETE route being selected.
Before adding a new uAPI I want to understand the source of the difference and see if we can improve / fix the current multipath code so that the two behave the same. If we can get them to behave the same then I don't think user space will care about two default routes versus one default multipath route.
Exactly, I totally support that approach.
Regards, Matt.