On Wed, Mar 09, 2022 at 01:50:07PM +0100, Dietmar Eggemann wrote:
On 08/03/2022 18:49, Darren Hart wrote:
On Tue, Mar 08, 2022 at 05:03:07PM +0100, Dietmar Eggemann wrote:
On 08/03/2022 12:04, Vincent Guittot wrote:
On Tue, 8 Mar 2022 at 11:30, Will Deacon <will@kernel.org> wrote:
[...]
IMHO, if core_mask weight is 1, MC will be removed/degenerated anyway.
This is what I get on my Ampere Altra (I guess I don't have the ACPI changes which would lead to a CLS sched domain):
# cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
DIE
NUMA

root@oss-altra01:~# zcat /proc/config.gz | grep SCHED_CLUSTER
CONFIG_SCHED_CLUSTER=y
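As a rough illustration of the degeneration mentioned above, here is a minimal Python model, not kernel code. It assumes a topology level is dropped when its span holds a single CPU, or when its span equals that of the level below it (the real check also compares domain flags, which this sketch ignores). The function name `prune_levels` and the CPU counts are made up for the example.

```python
def prune_levels(levels):
    """Return the names of the levels that survive pruning.

    levels: list of (name, span) pairs for one CPU, ordered from the
    lowest level upwards; each span is a set of CPU ids.
    """
    kept = []
    for name, span in levels:
        if len(span) == 1:
            # A domain spanning one CPU has nothing to balance.
            continue
        if kept and kept[-1][1] == span:
            # Same span as the level below: redundant in this model.
            continue
        kept.append((name, span))
    return [name for name, _ in kept]


# Hypothetical cpu0 view: no shared cpu-side LLC, so MC spans one CPU.
cpu0_levels = [
    ("MC", {0}),
    ("DIE", set(range(80))),
    ("NUMA", set(range(160))),
]
print(prune_levels(cpu0_levels))  # ['DIE', 'NUMA']
```

Under these assumptions only DIE and NUMA remain, which is the shape of the debugfs output above.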
I'd like to follow up on this. Would you share your dmidecode BIOS Information section?
# dmidecode -t 0
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.2.0 present.

Handle 0x0000, DMI type 0, 26 bytes
BIOS Information
        Vendor: Ampere(TM)
        Version: 0.9.20200724
        Release Date: 2020/07/24
        ROM Size: 7680 kB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                Boot from CD is supported
                Selectable boot is supported
                ACPI is supported
                UEFI is supported
        BIOS Revision: 5.15
        Firmware Revision: 0.6
Thank you, I'm following internally and will get with you.
Which kernel version?
v5.17-rc5
[...]
I would not say that I'm happy, because this solution skews the core cpu mask in order to abuse the scheduler so that it removes a wrong but useless level when it builds its domains. But this works, so as long as the maintainers are happy, I'm fine.
I did explore the other options and they added considerably more complexity without much benefit in my view. I prefer this option which maintains the cpu_topology as described by the platform, and maps it into something that suits the current scheduler abstraction. I agree there is more work to be done here and intend to continue with it.
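A minimal Python sketch of the kind of mapping discussed here, not the actual patch. It assumes the tweak widens the core mask to cover at least the CPU's cluster siblings whenever the core mask would otherwise be a subset of them, so that the MC level is never narrower than the cluster level below it. `sched_core_mask` and the example masks are hypothetical; the platform-described masks are left untouched.

```python
def sched_core_mask(core_mask, cluster_siblings):
    """Return the mask handed to the scheduler for the MC level.

    Both arguments are sets of CPU ids as described by the platform;
    neither is modified.
    """
    if core_mask <= cluster_siblings:
        # MC would be narrower than (or equal to) the cluster level,
        # so use the cluster siblings instead.
        return set(cluster_siblings)
    return set(core_mask)


# Hypothetical CPU 0: shares a cluster with CPU 1, has no shared LLC.
platform_core_mask = {0}
platform_cluster_siblings = {0, 1}

mc_span = sched_core_mask(platform_core_mask, platform_cluster_siblings)
print(sorted(mc_span))  # [0, 1]
```

In this model the MC span ends up equal to the cluster span, so the scheduler can treat one of the two levels as redundant, while the topology reported by the platform stays as described.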
I do not have any better idea than this tweak here either in case the platform can't provide a cleaner setup.
I'd argue the platform is describing itself accurately in ACPI PPTT terms. The topology doesn't fit nicely within the kernel abstractions today. This is an area where I hope to continue to improve things going forward.
I see. And I assume lying about SCU/LLC boundaries in ACPI is not an option since it messes up /sys/devices/system/cpu/cpu0/cache/index*/.
[...]
I'm not aware of a way to accurately describe the SCU topology in the PPTT, and the risk we run with lying about LLC topology is that the lie has to be comprehended by all OSes and must not conflict with other lies people may ask for. In general, I think it is preferable and more maintainable to describe the topology as accurately and honestly as we can within the existing platform mechanisms (PPTT, HMAT, etc.) and work on the higher-level abstractions to accommodate a broader set of topologies as they emerge (as well as working to more fully describe the topology with new platform-level mechanisms as needed).
As I mentioned, I intend to continue looking into how to improve the current abstractions. For now, it sounds like we have agreement that this patch can be merged to address the BUG?
Thanks all,