linaro-toolchain December 2022

linaro-toolchain@lists.linaro.org

4 participants
8 discussions

by Thiago Jung Bauermann

Hello,

# [GNU-767] Support changing SVE vector length in remote debugging

- Identified a regression on systems that do not support SVE. Debugged it and
  now working on a fix.

--
Thiago

3 years, 3 months

[ACTIVITY] Report for week #51

by Thiago Jung Bauermann

Hello,

# [GNU-767] Support changing SVE vector length in remote debugging

- Finished implementing the new approach of sending new XML target
  descriptions through the wire.
- Fixed a couple of minor regressions I introduced and rebased the code on
  the current main branch.
- Now preparing the patches for submitting upstream.

# Community participation

- Reviewed mailing list patch “[PATCH] [AArch64] Enable pointer
  authentication support for aarch64 bare metal/kernel mode addresses”.

--
Thiago

3 years, 3 months

[ACTIVITY] week ending Dec. 25 2022

by Alex Bennée

Project Stratos
===============

- started reviewing [PATCH v9 0/8] KVM: mm: fd-based approach for supporting KVM
  Message-Id: <20221025151344.3784230-1-chao.p.peng(a)linux.intel.com>
- trying to assess if user-space facing solution for memory sharing
- writing up a [proposal for an API]

[proposal for an API]
<https://docs.google.com/document/d/18ijlX2Lguejyo3BV8Tri5Y_d1SmozocBIk2ajgq…>

vhost-device maintainer effort ([UM-196])

- debugged regression in virtio-vsock and QEMU
- should have some error message patches to post
- QEMU 7.2 shipped with stubs for virtio-gpio and virtio-i2c

[UM-196] <https://linaro.atlassian.net/browse/UM-196>

Single Binary ([QEMU-487])
==========================

- posted [PATCH for 8.0 v5 00/20] use MemTxAttrs to avoid current_cpu in hw/
  Message-Id: <20221111182535.64844-1-alex.bennee(a)linaro.org>

[QEMU-487] <https://linaro.atlassian.net/browse/QEMU-487>

Plugin register access ([QEMU-495])
===================================

- While experimenting with [the register API] ran into issues integrating
  to gdbstub
- started a [re-factor] to make the process less painful
- posted [PATCH v1 00/10] split user and system code in gdbstub
  Message-Id: <20221216112206.3171578-1-alex.bennee(a)linaro.org>

[QEMU-495] <https://linaro.atlassian.net/browse/QEMU-495>

[the register API]
<https://github.com/stsquad/qemu/tree/introspection/registers>

[re-factor] <https://github.com/stsquad/qemu/tree/gdbstub/next>

QEMU Upstream Work ([UM-2])
===========================

- posted [PULL 0/6] testing updates
  Message-Id: <20221221144019.2149905-1-alex.bennee(a)linaro.org>
- posted [PATCH 00/11] gitdm metadata updates
  Message-Id: <20221219121914.851488-1-alex.bennee(a)linaro.org>

[UM-2] <https://linaro.atlassian.net/browse/UM-2>

Completed Reviews [5/5]
=======================

[QEMU][PATCH v2 00/11] Introduce xenpv machine for arm architecture
Message-Id: <20221202030003.11441-1-vikram.garhwal(a)amd.com>

[PATCH] configure: Fix check-tcg not executing any tests
Message-Id: <20221207082309.9966-1-quic_mthiyaga(a)quicinc.com>

[PATCH v4 00/27] tcg misc patches
Message-Id: <20221213212541.1820840-1-richard.henderson(a)linaro.org>

[PATCH v3 0/8] accel/tcg: Rewrite user-only vma tracking
Message-Id: <20221209051914.398215-1-richard.henderson(a)linaro.org>

[PATCH-for-8.0 0/5] accel/tcg: Restrict page_collection structure to system TB maintainance
Message-Id: <20221209093649.43738-1-philmd(a)linaro.org>

Absences
========

Christmas holidays - merry Christmas!

Current Review Queue
====================

TODO [RFC PATCH v6] virtio-video: Add virtio video device specification
Message-Id: <20221208072325.2259940-1-acourbot(a)chromium.org>

TODO [RFC PATCH kvmtool v1 00/32] Add support for restricted guest memory in kvmtool
Message-Id: <20221202174417.1310826-1-tabba(a)google.com>

TODO [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM
Message-Id: <20221202061347.1070246-1-chao.p.peng(a)linux.intel.com>

--
Alex Bennée
Virtualisation Tech Lead @ Linaro

3 years, 3 months

[ACTIVITY] report week ending 20 Dec

by Peter Maydell

Progress:
 * UM-2 [QEMU upstream maintainership]
   - I'm now back on merging duty for the 8.0 release cycle, so some time
     spent on pull request processing
   - Sent pull requests with accumulated arm and reset-refactoring patches
     from the freeze period
   - Trying to cut down my code review backlog before the holidays

-- PMM

3 years, 3 months

[ACTIVITY] Report for week #50

by Thiago Jung Bauermann

# [GNU-767] Support changing SVE vector length in remote debugging

- Patches to gdbserver to support changing the SVE vector length:
  About halfway through implementing the new approach of sending new XML
  target descriptions through the wire.

# Misc

- Experimented with using a GDB wrapper to run the testsuite. Came up with a
  small patch that fixes the tests that fail when using the wrapper.

--
Thiago

3 years, 3 months

[TCWG CI] Failure after basepoints/gcc-13-4618-g17ae956c0fa: AArch64: Support new tbranch optab.

by ci_notify＠linaro.org

Failure after basepoints/gcc-13-4618-g17ae956c0fa: AArch64: Support new tbranch optab.:

Results changed to
-10
# build_abe binutils:
-9
# build_abe stage1:
-5
# build_abe qemu:
-2
# linux_n_obj:
7572
# First few build errors in logs:
# 00:10:43 drivers/gpu/drm/v3d/v3d_perfmon.c:57:1: internal compiler error: in decompose, at rtl.h:2288
# 00:10:44 make[5]: *** [scripts/Makefile.build:250: drivers/gpu/drm/v3d/v3d_perfmon.o] Error 1
# 00:10:53 make[4]: *** [scripts/Makefile.build:502: drivers/gpu/drm/v3d] Error 2
# 00:13:52 drivers/media/mc/mc-device.c:198:1: internal compiler error: in decompose, at rtl.h:2288
# 00:13:53 make[4]: *** [scripts/Makefile.build:250: drivers/media/mc/mc-device.o] Error 1
# 00:13:57 make[3]: *** [scripts/Makefile.build:502: drivers/media/mc] Error 2
# 00:15:27 make[2]: *** [scripts/Makefile.build:502: drivers/media] Error 2
# 00:17:50 make[3]: *** [scripts/Makefile.build:502: drivers/gpu/drm] Error 2
# 00:17:50 make[2]: *** [scripts/Makefile.build:502: drivers/gpu] Error 2
# 00:17:50 make[1]: *** [scripts/Makefile.build:502: drivers] Error 2

from
-10
# build_abe binutils:
-9
# build_abe stage1:
-5
# build_abe qemu:
-2
# linux_n_obj:
8625
# linux build successful:
all
# linux boot successful:
boot

THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.

For latest status see comments in https://linaro.atlassian.net/browse/GNU-680 .

Status of basepoints/gcc-13-4618-g17ae956c0fa commit for tcwg_kernel:

commit 17ae956c0fa6baac3d22764019d5dd5ebf5c2b11
Author: Tamar Christina <tamar.christina(a)arm.com>
Date:   Mon Dec 12 15:18:56 2022 +0000

    AArch64: Support new tbranch optab.

    This implements the new tbranch optab for AArch64.

    we cannot emit one big RTL for the final instruction immediately.
    The reason that all comparisons in the AArch64 backend expand to separate CC
    compares, and separate testing of the operands is for ifcvt.

    The separate CC compare is needed so ifcvt can produce csel, cset etc from the
    compares. Unlike say combine, ifcvt can not do recog on a parallel with a
    clobber. Should we emit the instruction directly then ifcvt will not be able
    to say, make a csel, because we have no patterns which handle zero_extract and
    compare. (unlike combine ifcvt cannot transform the extract into an AND).

    While you could provide various patterns for this (and I did try) you end up
    with broken patterns because you can't add the clobber to the CC register. If
    you do, ifcvt recog fails.

    i.e.

    int
    f1 (int x)
    {
      if (x & 1)
        return 1;
      return x;
    }

    We lose csel here.

    Secondly the reason the compare with an explicit CC mode is needed is so that
    ifcvt can transform the operation into a version that doesn't require the flags
    to be set. But it only does so if it know the explicit usage of the CC reg.

    For instance

    int
    foo (int a, int b)
    {
      return ((a & (1 << 25)) ? 5 : 4);
    }

    Doesn't require a comparison, the optimal form is:

    foo(int, int):
            ubfx    x0, x0, 25, 1
            add     w0, w0, 4
            ret

    and no compare is actually needed. If you represent the instruction using an
    ANDS instead of a zero_extract then you get close, but you end up with an ands
    followed by an add, which is a slower operation.

    gcc/ChangeLog:

            * config/aarch64/aarch64.md (*tb<optab><mode>1): Rename to...
            (*tb<optab><ALLI:mode><GPI:mode>1): ... this.
            (tbranch_<code><mode>4): New.
            * config/aarch64/iterators.md(ZEROM, zerom): New.

    gcc/testsuite/ChangeLog:

            * gcc.target/aarch64/tbz_1.c: New test.

* gnu-master-aarch64-mainline-defconfig
** Failure after basepoints/gcc-13-4618-g17ae956c0fa: AArch64: Support new tbranch optab.:
** https://ci.linaro.org/job/tcwg_kernel-gnu-build-gnu-master-aarch64-mainline…

Bad build: https://ci.linaro.org/job/tcwg_kernel-gnu-build-gnu-master-aarch64-mainline…
Good build: https://ci.linaro.org/job/tcwg_kernel-gnu-build-gnu-master-aarch64-mainline…

Reproduce current build:
<cut>
mkdir -p investigate-gcc-17ae956c0fa6baac3d22764019d5dd5ebf5c2b11
cd investigate-gcc-17ae956c0fa6baac3d22764019d5dd5ebf5c2b11

# Fetch scripts
git clone https://git.linaro.org/toolchain/jenkins-scripts

# Fetch manifests for bad and good builds
mkdir -p bad/artifacts good/artifacts
curl -o bad/artifacts/manifest.sh https://ci.linaro.org/job/tcwg_kernel-gnu-build-gnu-master-aarch64-mainline… --fail
curl -o good/artifacts/manifest.sh https://ci.linaro.org/job/tcwg_kernel-gnu-build-gnu-master-aarch64-mainline… --fail

# Reproduce bad build
(cd bad; ../jenkins-scripts/tcwg_kernel-build.sh ^^ true %%rr[top_artifacts] artifacts)
# Reproduce good build
(cd good; ../jenkins-scripts/tcwg_kernel-build.sh ^^ true %%rr[top_artifacts] artifacts)
</cut>

Full commit (up to 1000 lines):
<cut>
commit 17ae956c0fa6baac3d22764019d5dd5ebf5c2b11
Author: Tamar Christina <tamar.christina(a)arm.com>
Date:   Mon Dec 12 15:18:56 2022 +0000

    AArch64: Support new tbranch optab.

    This implements the new tbranch optab for AArch64.

    we cannot emit one big RTL for the final instruction immediately.
    The reason that all comparisons in the AArch64 backend expand to separate CC
    compares, and separate testing of the operands is for ifcvt.

    The separate CC compare is needed so ifcvt can produce csel, cset etc from the
    compares. Unlike say combine, ifcvt can not do recog on a parallel with a
    clobber. Should we emit the instruction directly then ifcvt will not be able
    to say, make a csel, because we have no patterns which handle zero_extract and
    compare. (unlike combine ifcvt cannot transform the extract into an AND).

    While you could provide various patterns for this (and I did try) you end up
    with broken patterns because you can't add the clobber to the CC register. If
    you do, ifcvt recog fails.

    i.e.

    int
    f1 (int x)
    {
      if (x & 1)
        return 1;
      return x;
    }

    We lose csel here.

    Secondly the reason the compare with an explicit CC mode is needed is so that
    ifcvt can transform the operation into a version that doesn't require the flags
    to be set. But it only does so if it know the explicit usage of the CC reg.

    For instance

    int
    foo (int a, int b)
    {
      return ((a & (1 << 25)) ? 5 : 4);
    }

    Doesn't require a comparison, the optimal form is:

    foo(int, int):
            ubfx    x0, x0, 25, 1
            add     w0, w0, 4
            ret

    and no compare is actually needed. If you represent the instruction using an
    ANDS instead of a zero_extract then you get close, but you end up with an ands
    followed by an add, which is a slower operation.

    gcc/ChangeLog:

            * config/aarch64/aarch64.md (*tb<optab><mode>1): Rename to...
            (*tb<optab><ALLI:mode><GPI:mode>1): ... this.
            (tbranch_<code><mode>4): New.
            * config/aarch64/iterators.md(ZEROM, zerom): New.

    gcc/testsuite/ChangeLog:

            * gcc.target/aarch64/tbz_1.c: New test.
---
 gcc/config/aarch64/aarch64.md            | 33 ++++++++---
 gcc/config/aarch64/iterators.md          |  2 +
 gcc/testsuite/gcc.target/aarch64/tbz_1.c | 95 ++++++++++++++++++++++++++++++++
 3 files changed, 122 insertions(+), 8 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 896b6a8ac79..d749c98eef6 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -947,12 +947,29 @@
             (const_int 1)))]
 )
 
-(define_insn "*tb<optab><mode>1"
+(define_expand "tbranch_<code><mode>3"
   [(set (pc) (if_then_else
-              (EQL (zero_extract:DI (match_operand:GPI 0 "register_operand" "r")
-                                    (const_int 1)
-                                    (match_operand 1
-                                      "aarch64_simd_shift_imm_<mode>" "n"))
+              (EQL (match_operand:ALLI 0 "register_operand")
+                   (match_operand 1 "aarch64_simd_shift_imm_<mode>"))
+              (label_ref (match_operand 2 ""))
+              (pc)))]
+  ""
+{
+  rtx bitvalue = gen_reg_rtx (<ZEROM>mode);
+  rtx reg = gen_lowpart (<ZEROM>mode, operands[0]);
+  rtx val = GEN_INT (1UL << UINTVAL (operands[1]));
+  emit_insn (gen_and<zerom>3 (bitvalue, reg, val));
+  operands[1] = const0_rtx;
+  operands[0] = aarch64_gen_compare_reg (<CODE>, bitvalue,
+                                         operands[1]);
+})
+
+(define_insn "*tb<optab><ALLI:mode><GPI:mode>1"
+  [(set (pc) (if_then_else
+              (EQL (zero_extract:GPI (match_operand:ALLI 0 "register_operand" "r")
+                                     (const_int 1)
+                                     (match_operand 1
+                                       "aarch64_simd_shift_imm_<ALLI:mode>" "n"))
                    (const_int 0))
              (label_ref (match_operand 2 "" ""))
              (pc)))
@@ -963,15 +980,15 @@
       {
         if (get_attr_far_branch (insn) == 1)
           return aarch64_gen_far_branch (operands, 2, "Ltb",
-                                         "<inv_tb>\\t%<w>0, %1, ");
+                                         "<inv_tb>\\t%<ALLI:w>0, %1, ");
         else
           {
             operands[1] = GEN_INT (HOST_WIDE_INT_1U << UINTVAL (operands[1]));
-            return "tst\t%<w>0, %1\;<bcond>\t%l2";
+            return "tst\t%<ALLI:w>0, %1\;<bcond>\t%l2";
           }
       }
     else
-      return "<tbz>\t%<w>0, %1, %l2";
+      return "<tbz>\t%<ALLI:w>0, %1, %l2";
   }
   [(set_attr "type" "branch")
    (set (attr "length")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index d10cf93572e..a521dbde1ec 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -1107,6 +1107,8 @@
 ;; Give the number of bits in the mode
 (define_mode_attr sizen [(QI "8") (HI "16") (SI "32") (DI "64")])
 
+(define_mode_attr ZEROM [(QI "SI") (HI "SI") (SI "SI") (DI "DI")])
+(define_mode_attr zerom [(QI "si") (HI "si") (SI "si") (DI "di")])
 ;; Give the ordinal of the MSB in the mode
 (define_mode_attr sizem1 [(QI "#7") (HI "#15") (SI "#31") (DI "#63")
diff --git a/gcc/testsuite/gcc.target/aarch64/tbz_1.c b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
new file mode 100644
index 00000000000..39deb58e278
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
@@ -0,0 +1,95 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -std=c99 -fno-unwind-tables -fno-asynchronous-unwind-tables" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+#include <stdbool.h>
+
+void h(void);
+
+/*
+** g1:
+**      tbnz    w[0-9]+, #?0, .L([0-9]+)
+**      ret
+**      ...
+*/
+void g1(bool x)
+{
+  if (__builtin_expect (x, 0))
+    h ();
+}
+
+/*
+** g2:
+**      tbz     w[0-9]+, #?0, .L([0-9]+)
+**      b       h
+**      ...
+*/
+void g2(bool x)
+{
+  if (__builtin_expect (x, 1))
+    h ();
+}
+
+/*
+** g3_ge:
+**      tbnz    w[0-9]+, #?31, .L[0-9]+
+**      b       h
+**      ...
+*/
+void g3_ge(int x)
+{
+  if (__builtin_expect (x >= 0, 1))
+    h ();
+}
+
+/*
+** g3_gt:
+**      cmp     w[0-9]+, 0
+**      ble     .L[0-9]+
+**      b       h
+**      ...
+*/
+void g3_gt(int x)
+{
+  if (__builtin_expect (x > 0, 1))
+    h ();
+}
+
+/*
+** g3_lt:
+**      tbz     w[0-9]+, #?31, .L[0-9]+
+**      b       h
+**      ...
+*/
+void g3_lt(int x)
+{
+  if (__builtin_expect (x < 0, 1))
+    h ();
+}
+
+/*
+** g3_le:
+**      cmp     w[0-9]+, 0
+**      bgt     .L[0-9]+
+**      b       h
+**      ...
+*/
+void g3_le(int x)
+{
+  if (__builtin_expect (x <= 0, 1))
+    h ();
+}
+
+/*
+** g5:
+**      mov     w[0-9]+, 65279
+**      tst     w[0-9]+, w[0-9]+
+**      beq     .L[0-9]+
+**      b       h
+**      ...
+*/
+void g5(int x)
+{
+  if (__builtin_expect (x & 0xfeff, 1))
+    h ();
+}
</cut>
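
The claim in the commit message that `foo` needs no comparison can be checked in plain C. The sketch below is not part of the commit or of the CI report; the names `foo_select`, `foo_extract_add` and `forms_agree` are made up for illustration, and the unused second parameter of the original `foo` is dropped. It only shows that selecting 5 or 4 on bit 25 gives the same value as extracting that bit as 0 or 1 and adding 4, which is the arithmetic the quoted `ubfx` + `add` sequence performs. It says nothing about what any particular compiler actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Conditional form, as written in the commit message (second argument dropped). */
static int foo_select(int a)
{
    return (a & (1 << 25)) ? 5 : 4;
}

/* Flag-free form: extract bit 25 as 0 or 1 (the ubfx step), then add 4. */
static int foo_extract_add(int a)
{
    return (int)(((uint32_t)a >> 25) & 1u) + 4;
}

/* Returns 1 if both forms agree on a few hand-picked inputs, 0 otherwise. */
static int forms_agree(void)
{
    static const int samples[] = {
        0, 1, 1 << 25, (1 << 25) - 1, (1 << 25) + 1, -1, INT32_MIN, INT32_MAX
    };
    unsigned i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (foo_select(samples[i]) != foo_extract_add(samples[i]))
            return 0;
    }
    return 1;
}
```

Calling `forms_agree()` from a small driver is enough to exercise both forms. The sample list is arbitrary and could be replaced by an exhaustive loop over all 32-bit values.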

3 years, 3 months

[ACTIVITY] Report for week #49

by Thiago Jung Bauermann

# [GNU-767] Support changing SVE vector length in remote debugging

- Patches to gdbserver to support changing the SVE vector length:
  There was an upstream discussion about whether changing the implementation
  from relying on expedited registers to relying on sending target
  descriptions over the wire was a better approach. Simon Marchi detailed his
  idea on how to do that and it does seem better.
- Started implementing Simon's approach of sending target descriptions over
  the wire for each thread.

# Misc

- Sent and later committed a couple of patches¹ fixing whitespace issues in a
  Python script that generates a GDB source file.

--
Thiago

¹ https://inbox.sourceware.org/gdb-patches/20221202192200.405379-1-thiago.bau…

3 years, 4 months

[ACTIVITY] Report for week #48

by Thiago Jung Bauermann

Hello,

# [GNU-767] Support changing SVE vector length in remote debugging

- v2 of the gdbserver patches to support changing the SVE vector length was
  quickly reviewed by both Luis and Simon Marchi. I applied their review
  suggestions and I'm now working on fixing a bug with multi-threaded programs
  that they spotted.
- Submitted a couple of small patches¹ fixing tab vs spaces issues in the
  gdbarch.py script that generates some source code in GDB.

# Misc

- Fixed problem in the tcwg-dev/start.sh script where asking docker to expose
  /dev/kvm to a dev container on a host which doesn't have KVM support causes
  docker to error out (reported by David Spickett). Sent Gerrit change request
  “42669: tcwg-dev: Add heuristic to check for KVM support on the host” to fix
  it. David reviewed and merged it. Thanks!

--
Thiago

¹ https://inbox.sourceware.org/gdb-patches/20221202192200.405379-1-thiago.bau…

3 years, 4 months
