Successfully identified regression in *linux* in CI configuration tcwg_kernel/llvm-release-aarch64-next-allnoconfig. So far, this commit has regressed CI configurations:
- tcwg_kernel/llvm-release-aarch64-next-allnoconfig
Culprit:
<cut>
commit 8633ef82f101c040427b57d4df7b706261420b94
Author: Javier Martinez Canillas <javierm(a)redhat.com>
Date: Fri Jun 25 15:13:59 2021 +0200
drivers/firmware: consolidate EFI framebuffer setup for all arches
The register_gop_device() function registers an "efi-framebuffer" platform
device to match against the efifb driver, to have an early framebuffer for
EFI platforms.
But there is already support to do exactly the same by the Generic System
Framebuffers (sysfb) driver. This used to be only for X86 but it has been
moved to drivers/firmware and could be reused by other architectures.
Also, besides supporting registering an "efi-framebuffer", this driver can
register a "simple-framebuffer" allowing to use the siple{fb,drm} drivers
on non-X86 EFI platforms. For example, on aarch64 these drivers can only
be used with DT and doesn't have code to register a "simple-frambuffer"
platform device when booting with EFI.
For these reasons, let's remove the register_gop_device() duplicated code
and instead move the platform specific logic that's there to sysfb driver.
Signed-off-by: Javier Martinez Canillas <javierm(a)redhat.com>
Acked-by: Borislav Petkov <bp(a)suse.de>
Acked-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210625131359.1804394-1-javi…
</cut>
Results regressed to (for first_bad == 8633ef82f101c040427b57d4df7b706261420b94)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_llvm:
-5
# build_abe qemu:
-2
# linux_n_obj:
600
# First few build errors in logs:
# 00:00:38 ld.lld: error: undefined symbol: screen_info
# 00:00:38 make: *** [vmlinux] Error 1
from (for last_good == d391c58271072d0b0fad93c82018d495b2633448)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_llvm:
-5
# build_abe qemu:
-2
# linux_n_obj:
601
# linux build successful:
all
# linux boot successful:
boot
Artifacts of last_good build: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-aarch64-next…
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-aarch64-next…
Build top page/logs: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-aarch64-next…
Configuration details:
rr[linux_git]="https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git#ff11764…"
Reproduce builds:
<cut>
mkdir investigate-linux-8633ef82f101c040427b57d4df7b706261420b94
cd investigate-linux-8633ef82f101c040427b57d4df7b706261420b94
git clone https://git.linaro.org/toolchain/jenkins-scripts
mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-aarch64-next… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-aarch64-next… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-aarch64-next… --fail
chmod +x artifacts/test.sh
# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_kernel-build.sh @@ artifacts/manifests/build-baseline.sh
# Save baseline build state (which is then restored in artifacts/test.sh)
mkdir -p ./bisect
rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /linux/ ./ ./bisect/baseline/
cd linux
# Reproduce first_bad build
git checkout --detach 8633ef82f101c040427b57d4df7b706261420b94
../artifacts/test.sh
# Reproduce last_good build
git checkout --detach d391c58271072d0b0fad93c82018d495b2633448
../artifacts/test.sh
cd ..
</cut>
History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…
Artifacts: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-aarch64-next…
Build log: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-aarch64-next…
Full commit (up to 1000 lines):
<cut>
commit 8633ef82f101c040427b57d4df7b706261420b94
Author: Javier Martinez Canillas <javierm(a)redhat.com>
Date: Fri Jun 25 15:13:59 2021 +0200
drivers/firmware: consolidate EFI framebuffer setup for all arches
The register_gop_device() function registers an "efi-framebuffer" platform
device to match against the efifb driver, to have an early framebuffer for
EFI platforms.
But there is already support to do exactly the same by the Generic System
Framebuffers (sysfb) driver. This used to be only for X86 but it has been
moved to drivers/firmware and could be reused by other architectures.
Also, besides supporting registering an "efi-framebuffer", this driver can
register a "simple-framebuffer" allowing to use the siple{fb,drm} drivers
on non-X86 EFI platforms. For example, on aarch64 these drivers can only
be used with DT and doesn't have code to register a "simple-frambuffer"
platform device when booting with EFI.
For these reasons, let's remove the register_gop_device() duplicated code
and instead move the platform specific logic that's there to sysfb driver.
Signed-off-by: Javier Martinez Canillas <javierm(a)redhat.com>
Acked-by: Borislav Petkov <bp(a)suse.de>
Acked-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210625131359.1804394-1-javi…
---
arch/arm/include/asm/efi.h | 5 +--
arch/arm64/include/asm/efi.h | 5 +--
arch/riscv/include/asm/efi.h | 5 +--
drivers/firmware/Kconfig | 8 ++--
drivers/firmware/Makefile | 2 +-
drivers/firmware/efi/efi-init.c | 90 ---------------------------------------
drivers/firmware/efi/sysfb_efi.c | 76 ++++++++++++++++++++++++++++++++-
drivers/firmware/sysfb.c | 35 ++++++++++-----
drivers/firmware/sysfb_simplefb.c | 31 ++++++++++----
drivers/gpu/drm/tiny/Kconfig | 4 +-
include/linux/sysfb.h | 26 +++++------
11 files changed, 143 insertions(+), 144 deletions(-)
diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h
index 9de7ab2ce05d..a6f3b179e8a9 100644
--- a/arch/arm/include/asm/efi.h
+++ b/arch/arm/include/asm/efi.h
@@ -17,6 +17,7 @@
#ifdef CONFIG_EFI
void efi_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md);
int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
@@ -52,10 +53,6 @@ void efi_virtmap_unload(void);
struct screen_info *alloc_screen_info(void);
void free_screen_info(struct screen_info *si);
-static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt)
-{
-}
-
/*
* A reasonable upper bound for the uncompressed kernel size is 32 MBytes,
* so we will reserve that amount of memory. We have no easy way to tell what
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 3578aba9c608..42d673a011c8 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -14,6 +14,7 @@
#ifdef CONFIG_EFI
extern void efi_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
#else
#define efi_init()
#endif
@@ -85,10 +86,6 @@ static inline void free_screen_info(struct screen_info *si)
{
}
-static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt)
-{
-}
-
#define EFI_ALLOC_ALIGN SZ_64K
/*
diff --git a/arch/riscv/include/asm/efi.h b/arch/riscv/include/asm/efi.h
index 6d98cd999680..7a8f0d45b13a 100644
--- a/arch/riscv/include/asm/efi.h
+++ b/arch/riscv/include/asm/efi.h
@@ -13,6 +13,7 @@
#ifdef CONFIG_EFI
extern void efi_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
#else
#define efi_init()
#endif
@@ -39,10 +40,6 @@ static inline void free_screen_info(struct screen_info *si)
{
}
-static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt)
-{
-}
-
void efi_virtmap_load(void);
void efi_virtmap_unload(void);
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 71f3d97f0c39..af6719cc576b 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -254,9 +254,9 @@ config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
config SYSFB
bool
default y
- depends on X86 || COMPILE_TEST
+ depends on X86 || ARM || ARM64 || RISCV || COMPILE_TEST
-config X86_SYSFB
+config SYSFB_SIMPLEFB
bool "Mark VGA/VBE/EFI FB as generic system framebuffer"
depends on SYSFB
help
@@ -264,10 +264,10 @@ config X86_SYSFB
bootloader or kernel can show basic video-output during boot for
user-guidance and debugging. Historically, x86 used the VESA BIOS
Extensions and EFI-framebuffers for this, which are mostly limited
- to x86.
+ to x86 BIOS or EFI systems.
This option, if enabled, marks VGA/VBE/EFI framebuffers as generic
framebuffers so the new generic system-framebuffer drivers can be
- used on x86. If the framebuffer is not compatible with the generic
+ used instead. If the framebuffer is not compatible with the generic
modes, it is advertised as fallback platform framebuffer so legacy
drivers like efifb, vesafb and uvesafb can pick it up.
If this option is not selected, all system framebuffers are always
diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
index ad78f78ffa8d..6ac637e422b9 100644
--- a/drivers/firmware/Makefile
+++ b/drivers/firmware/Makefile
@@ -19,7 +19,7 @@ obj-$(CONFIG_RASPBERRYPI_FIRMWARE) += raspberrypi.o
obj-$(CONFIG_FW_CFG_SYSFS) += qemu_fw_cfg.o
obj-$(CONFIG_QCOM_SCM) += qcom_scm.o qcom_scm-smc.o qcom_scm-legacy.o
obj-$(CONFIG_SYSFB) += sysfb.o
-obj-$(CONFIG_X86_SYSFB) += sysfb_simplefb.o
+obj-$(CONFIG_SYSFB_SIMPLEFB) += sysfb_simplefb.o
obj-$(CONFIG_TI_SCI_PROTOCOL) += ti_sci.o
obj-$(CONFIG_TRUSTED_FOUNDATIONS) += trusted_foundations.o
obj-$(CONFIG_TURRIS_MOX_RWTM) += turris-mox-rwtm.o
diff --git a/drivers/firmware/efi/efi-init.c b/drivers/firmware/efi/efi-init.c
index a552a08a1741..b19ce1a83f91 100644
--- a/drivers/firmware/efi/efi-init.c
+++ b/drivers/firmware/efi/efi-init.c
@@ -275,93 +275,3 @@ void __init efi_init(void)
}
#endif
}
-
-static bool efifb_overlaps_pci_range(const struct of_pci_range *range)
-{
- u64 fb_base = screen_info.lfb_base;
-
- if (screen_info.capabilities & VIDEO_CAPABILITY_64BIT_BASE)
- fb_base |= (u64)(unsigned long)screen_info.ext_lfb_base << 32;
-
- return fb_base >= range->cpu_addr &&
- fb_base < (range->cpu_addr + range->size);
-}
-
-static struct device_node *find_pci_overlap_node(void)
-{
- struct device_node *np;
-
- for_each_node_by_type(np, "pci") {
- struct of_pci_range_parser parser;
- struct of_pci_range range;
- int err;
-
- err = of_pci_range_parser_init(&parser, np);
- if (err) {
- pr_warn("of_pci_range_parser_init() failed: %d\n", err);
- continue;
- }
-
- for_each_of_pci_range(&parser, &range)
- if (efifb_overlaps_pci_range(&range))
- return np;
- }
- return NULL;
-}
-
-/*
- * If the efifb framebuffer is backed by a PCI graphics controller, we have
- * to ensure that this relation is expressed using a device link when
- * running in DT mode, or the probe order may be reversed, resulting in a
- * resource reservation conflict on the memory window that the efifb
- * framebuffer steals from the PCIe host bridge.
- */
-static int efifb_add_links(struct fwnode_handle *fwnode)
-{
- struct device_node *sup_np;
-
- sup_np = find_pci_overlap_node();
-
- /*
- * If there's no PCI graphics controller backing the efifb, we are
- * done here.
- */
- if (!sup_np)
- return 0;
-
- fwnode_link_add(fwnode, of_fwnode_handle(sup_np));
- of_node_put(sup_np);
-
- return 0;
-}
-
-static const struct fwnode_operations efifb_fwnode_ops = {
- .add_links = efifb_add_links,
-};
-
-static struct fwnode_handle efifb_fwnode;
-
-static int __init register_gop_device(void)
-{
- struct platform_device *pd;
- int err;
-
- if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI)
- return 0;
-
- pd = platform_device_alloc("efi-framebuffer", 0);
- if (!pd)
- return -ENOMEM;
-
- if (IS_ENABLED(CONFIG_PCI)) {
- fwnode_init(&efifb_fwnode, &efifb_fwnode_ops);
- pd->dev.fwnode = &efifb_fwnode;
- }
-
- err = platform_device_add_data(pd, &screen_info, sizeof(screen_info));
- if (err)
- return err;
-
- return platform_device_add(pd);
-}
-subsys_initcall(register_gop_device);
diff --git a/drivers/firmware/efi/sysfb_efi.c b/drivers/firmware/efi/sysfb_efi.c
index 9f035b15501c..f51865e1b876 100644
--- a/drivers/firmware/efi/sysfb_efi.c
+++ b/drivers/firmware/efi/sysfb_efi.c
@@ -1,6 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
- * Generic System Framebuffers on x86
+ * Generic System Framebuffers
* Copyright (c) 2012-2013 David Herrmann <dh.herrmann(a)gmail.com>
*
* EFI Quirks Copyright (c) 2006 Edgar Hucek <gimli(a)dark-green.com>
@@ -19,7 +19,9 @@
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/mm.h>
+#include <linux/of_address.h>
#include <linux/pci.h>
+#include <linux/platform_device.h>
#include <linux/screen_info.h>
#include <linux/sysfb.h>
#include <video/vga.h>
@@ -267,7 +269,72 @@ static const struct dmi_system_id efifb_dmi_swap_width_height[] __initconst = {
{},
};
-__init void sysfb_apply_efi_quirks(void)
+static bool efifb_overlaps_pci_range(const struct of_pci_range *range)
+{
+ u64 fb_base = screen_info.lfb_base;
+
+ if (screen_info.capabilities & VIDEO_CAPABILITY_64BIT_BASE)
+ fb_base |= (u64)(unsigned long)screen_info.ext_lfb_base << 32;
+
+ return fb_base >= range->cpu_addr &&
+ fb_base < (range->cpu_addr + range->size);
+}
+
+static struct device_node *find_pci_overlap_node(void)
+{
+ struct device_node *np;
+
+ for_each_node_by_type(np, "pci") {
+ struct of_pci_range_parser parser;
+ struct of_pci_range range;
+ int err;
+
+ err = of_pci_range_parser_init(&parser, np);
+ if (err) {
+ pr_warn("of_pci_range_parser_init() failed: %d\n", err);
+ continue;
+ }
+
+ for_each_of_pci_range(&parser, &range)
+ if (efifb_overlaps_pci_range(&range))
+ return np;
+ }
+ return NULL;
+}
+
+/*
+ * If the efifb framebuffer is backed by a PCI graphics controller, we have
+ * to ensure that this relation is expressed using a device link when
+ * running in DT mode, or the probe order may be reversed, resulting in a
+ * resource reservation conflict on the memory window that the efifb
+ * framebuffer steals from the PCIe host bridge.
+ */
+static int efifb_add_links(struct fwnode_handle *fwnode)
+{
+ struct device_node *sup_np;
+
+ sup_np = find_pci_overlap_node();
+
+ /*
+ * If there's no PCI graphics controller backing the efifb, we are
+ * done here.
+ */
+ if (!sup_np)
+ return 0;
+
+ fwnode_link_add(fwnode, of_fwnode_handle(sup_np));
+ of_node_put(sup_np);
+
+ return 0;
+}
+
+static const struct fwnode_operations efifb_fwnode_ops = {
+ .add_links = efifb_add_links,
+};
+
+static struct fwnode_handle efifb_fwnode;
+
+__init void sysfb_apply_efi_quirks(struct platform_device *pd)
{
if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI ||
!(screen_info.capabilities & VIDEO_CAPABILITY_SKIP_QUIRKS))
@@ -281,4 +348,9 @@ __init void sysfb_apply_efi_quirks(void)
screen_info.lfb_height = temp;
screen_info.lfb_linelength = 4 * screen_info.lfb_width;
}
+
+ if (screen_info.orig_video_isVGA == VIDEO_TYPE_EFI && IS_ENABLED(CONFIG_PCI)) {
+ fwnode_init(&efifb_fwnode, &efifb_fwnode_ops);
+ pd->dev.fwnode = &efifb_fwnode;
+ }
}
diff --git a/drivers/firmware/sysfb.c b/drivers/firmware/sysfb.c
index 1337515963d5..2bfbb05f7d89 100644
--- a/drivers/firmware/sysfb.c
+++ b/drivers/firmware/sysfb.c
@@ -1,11 +1,11 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
- * Generic System Framebuffers on x86
+ * Generic System Framebuffers
* Copyright (c) 2012-2013 David Herrmann <dh.herrmann(a)gmail.com>
*/
/*
- * Simple-Framebuffer support for x86 systems
+ * Simple-Framebuffer support
* Create a platform-device for any available boot framebuffer. The
* simple-framebuffer platform device is already available on DT systems, so
* this module parses the global "screen_info" object and creates a suitable
@@ -16,12 +16,12 @@
* to pick these devices up without messing with simple-framebuffer drivers.
* The global "screen_info" is still valid at all times.
*
- * If CONFIG_X86_SYSFB is not selected, we never register "simple-framebuffer"
+ * If CONFIG_SYSFB_SIMPLEFB is not selected, never register "simple-framebuffer"
* platform devices, but only use legacy framebuffer devices for
* backwards compatibility.
*
* TODO: We set the dev_id field of all platform-devices to 0. This allows
- * other x86 OF/DT parsers to create such devices, too. However, they must
+ * other OF/DT parsers to create such devices, too. However, they must
* start at offset 1 for this to work.
*/
@@ -43,12 +43,10 @@ static __init int sysfb_init(void)
bool compatible;
int ret;
- sysfb_apply_efi_quirks();
-
/* try to create a simple-framebuffer device */
- compatible = parse_mode(si, &mode);
+ compatible = sysfb_parse_mode(si, &mode);
if (compatible) {
- ret = create_simplefb(si, &mode);
+ ret = sysfb_create_simplefb(si, &mode);
if (!ret)
return 0;
}
@@ -61,9 +59,24 @@ static __init int sysfb_init(void)
else
name = "platform-framebuffer";
- pd = platform_device_register_resndata(NULL, name, 0,
- NULL, 0, si, sizeof(*si));
- return PTR_ERR_OR_ZERO(pd);
+ pd = platform_device_alloc(name, 0);
+ if (!pd)
+ return -ENOMEM;
+
+ sysfb_apply_efi_quirks(pd);
+
+ ret = platform_device_add_data(pd, si, sizeof(*si));
+ if (ret)
+ goto err;
+
+ ret = platform_device_add(pd);
+ if (ret)
+ goto err;
+
+ return 0;
+err:
+ platform_device_put(pd);
+ return ret;
}
/* must execute after PCI subsystem for EFI quirks */
diff --git a/drivers/firmware/sysfb_simplefb.c b/drivers/firmware/sysfb_simplefb.c
index df892444ea17..b86761904949 100644
--- a/drivers/firmware/sysfb_simplefb.c
+++ b/drivers/firmware/sysfb_simplefb.c
@@ -1,6 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
- * Generic System Framebuffers on x86
+ * Generic System Framebuffers
* Copyright (c) 2012-2013 David Herrmann <dh.herrmann(a)gmail.com>
*/
@@ -23,9 +23,9 @@
static const char simplefb_resname[] = "BOOTFB";
static const struct simplefb_format formats[] = SIMPLEFB_FORMATS;
-/* try parsing x86 screen_info into a simple-framebuffer mode struct */
-__init bool parse_mode(const struct screen_info *si,
- struct simplefb_platform_data *mode)
+/* try parsing screen_info into a simple-framebuffer mode struct */
+__init bool sysfb_parse_mode(const struct screen_info *si,
+ struct simplefb_platform_data *mode)
{
const struct simplefb_format *f;
__u8 type;
@@ -57,13 +57,14 @@ __init bool parse_mode(const struct screen_info *si,
return false;
}
-__init int create_simplefb(const struct screen_info *si,
- const struct simplefb_platform_data *mode)
+__init int sysfb_create_simplefb(const struct screen_info *si,
+ const struct simplefb_platform_data *mode)
{
struct platform_device *pd;
struct resource res;
u64 base, size;
u32 length;
+ int ret;
/*
* If the 64BIT_BASE capability is set, ext_lfb_base will contain the
@@ -105,7 +106,19 @@ __init int create_simplefb(const struct screen_info *si,
if (res.end <= res.start)
return -EINVAL;
- pd = platform_device_register_resndata(NULL, "simple-framebuffer", 0,
- &res, 1, mode, sizeof(*mode));
- return PTR_ERR_OR_ZERO(pd);
+ pd = platform_device_alloc("simple-framebuffer", 0);
+ if (!pd)
+ return -ENOMEM;
+
+ sysfb_apply_efi_quirks(pd);
+
+ ret = platform_device_add_resources(pd, &res, 1);
+ if (ret)
+ return ret;
+
+ ret = platform_device_add_data(pd, mode, sizeof(*mode));
+ if (ret)
+ return ret;
+
+ return platform_device_add(pd);
}
diff --git a/drivers/gpu/drm/tiny/Kconfig b/drivers/gpu/drm/tiny/Kconfig
index 5593128eeff9..d31be274a2bd 100644
--- a/drivers/gpu/drm/tiny/Kconfig
+++ b/drivers/gpu/drm/tiny/Kconfig
@@ -64,8 +64,8 @@ config DRM_SIMPLEDRM
buffer, size, and display format must be provided via device tree,
UEFI, VESA, etc.
- On x86 and compatible, you should also select CONFIG_X86_SYSFB to
- use UEFI and VESA framebuffers.
+ On x86 BIOS or UEFI systems, you should also select SYSFB_SIMPLEFB
+ to use UEFI and VESA framebuffers.
config TINYDRM_HX8357D
tristate "DRM support for HX8357D display panels"
diff --git a/include/linux/sysfb.h b/include/linux/sysfb.h
index 3e5355769dc3..b0dcfa26d07b 100644
--- a/include/linux/sysfb.h
+++ b/include/linux/sysfb.h
@@ -58,37 +58,37 @@ struct efifb_dmi_info {
#ifdef CONFIG_EFI
extern struct efifb_dmi_info efifb_dmi_list[];
-void sysfb_apply_efi_quirks(void);
+void sysfb_apply_efi_quirks(struct platform_device *pd);
#else /* CONFIG_EFI */
-static inline void sysfb_apply_efi_quirks(void)
+static inline void sysfb_apply_efi_quirks(struct platform_device *pd)
{
}
#endif /* CONFIG_EFI */
-#ifdef CONFIG_X86_SYSFB
+#ifdef CONFIG_SYSFB_SIMPLEFB
-bool parse_mode(const struct screen_info *si,
- struct simplefb_platform_data *mode);
-int create_simplefb(const struct screen_info *si,
- const struct simplefb_platform_data *mode);
+bool sysfb_parse_mode(const struct screen_info *si,
+ struct simplefb_platform_data *mode);
+int sysfb_create_simplefb(const struct screen_info *si,
+ const struct simplefb_platform_data *mode);
-#else /* CONFIG_X86_SYSFB */
+#else /* CONFIG_SYSFB_SIMPLE */
-static inline bool parse_mode(const struct screen_info *si,
- struct simplefb_platform_data *mode)
+static inline bool sysfb_parse_mode(const struct screen_info *si,
+ struct simplefb_platform_data *mode)
{
return false;
}
-static inline int create_simplefb(const struct screen_info *si,
- const struct simplefb_platform_data *mode)
+static inline int sysfb_create_simplefb(const struct screen_info *si,
+ const struct simplefb_platform_data *mode)
{
return -EINVAL;
}
-#endif /* CONFIG_X86_SYSFB */
+#endif /* CONFIG_SYSFB_SIMPLE */
#endif /* _LINUX_SYSFB_H */
</cut>
Successfully identified regression in *gcc* in CI configuration tcwg_gnu_native_build/master-aarch64. So far, this commit has regressed CI configurations:
- tcwg_gnu_native_build/master-aarch64
Culprit:
<cut>
commit cad36f38576a6a781e3c62ab061c68f5b8dab13a
Author: Roger Sayle <roger(a)nextmovesoftware.com>
Date: Tue Aug 31 11:45:07 2021 +0100
Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))).
SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial subreg
is correctly zero-extended or sign-extended in the parent register. For
example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0) indicates that the
byte x is zero extended in reg:SI 23, which is useful for optimization.
An example is that zero extending the above QImode value to HImode can
simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0).
This patch addresses the oversight/missed optimization opportunity that
the new HImode subreg above should retain its SUBREG_PROMOTED_VAR_P
annotation as its value is guaranteed to be correctly extended in the
SImode parent. The code below to preserve SUBREG_PROMOTED_VAR_P is already
present in the middle-end (e.g. simplify-rtx.c:7232-7242) but missing
from one or two (precisely three) places that (accidentally) strip it.
Whilst there I also added another optimization. If we need to extend
the above QImode value beyond the SImode register holding it, say to
DImode, we can eliminate the SUBREG and simply extend from the SImode
register to DImode.
2021-08-31 Roger Sayle <roger(a)nextmovesoftware.com>
gcc/ChangeLog
* expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when
creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
subreg.
* simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]:
Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider)
partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate
SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical.
[ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when
creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
subreg. Generate ZERO_EXTEND of the SUBREG_REG when a subreg
would be paradoxical.
</cut>
Results regressed to (for first_bad == cad36f38576a6a781e3c62ab061c68f5b8dab13a)
# reset_artifacts:
-10
# true:
0
# build_abe binutils:
1
# First few build errors in logs:
# 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-1.h:127:36: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-1.h:127:36: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-1.h:127:36: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:05:59 make[2]: *** [/home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/shared-object.mk:14: trunctfhf2.o] Error 1
from (for last_good == 0960d937d9bee3c831d0b64a9c828c263a58ff89)
# reset_artifacts:
-10
# true:
0
# build_abe binutils:
1
# build_abe gcc:
2
# build_abe linux:
4
# build_abe glibc:
5
# build_abe gdb:
6
Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/art…
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/art…
Build top page/logs: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/
Configuration details:
Reproduce builds:
<cut>
mkdir investigate-gcc-cad36f38576a6a781e3c62ab061c68f5b8dab13a
cd investigate-gcc-cad36f38576a6a781e3c62ab061c68f5b8dab13a
git clone https://git.linaro.org/toolchain/jenkins-scripts
mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/art… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/art… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/art… --fail
chmod +x artifacts/test.sh
# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh
# Save baseline build state (which is then restored in artifacts/test.sh)
mkdir -p ./bisect
rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/
cd gcc
# Reproduce first_bad build
git checkout --detach cad36f38576a6a781e3c62ab061c68f5b8dab13a
../artifacts/test.sh
# Reproduce last_good build
git checkout --detach 0960d937d9bee3c831d0b64a9c828c263a58ff89
../artifacts/test.sh
cd ..
</cut>
History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…
Artifacts: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/art…
Build log: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/con…
Full commit (up to 1000 lines):
<cut>
commit cad36f38576a6a781e3c62ab061c68f5b8dab13a
Author: Roger Sayle <roger(a)nextmovesoftware.com>
Date: Tue Aug 31 11:45:07 2021 +0100
Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))).
SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial subreg
is correctly zero-extended or sign-extended in the parent register. For
example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0) indicates that the
byte x is zero extended in reg:SI 23, which is useful for optimization.
An example is that zero extending the above QImode value to HImode can
simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0).
This patch addresses the oversight/missed optimization opportunity that
the new HImode subreg above should retain its SUBREG_PROMOTED_VAR_P
annotation as its value is guaranteed to be correctly extended in the
SImode parent. The code below to preserve SUBREG_PROMOTED_VAR_P is already
present in the middle-end (e.g. simplify-rtx.c:7232-7242) but missing
from one or two (precisely three) places that (accidentally) strip it.
Whilst there I also added another optimization. If we need to extend
the above QImode value beyond the SImode register holding it, say to
DImode, we can eliminate the SUBREG and simply extend from the SImode
register to DImode.
2021-08-31 Roger Sayle <roger(a)nextmovesoftware.com>
gcc/ChangeLog
* expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when
creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
subreg.
* simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]:
Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider)
partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate
SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical.
[ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when
creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
subreg. Generate ZERO_EXTEND of the SUBREG_REG when a subreg
would be paradoxical.
---
gcc/expr.c | 19 ++++++++++++++++++-
gcc/simplify-rtx.c | 52 ++++++++++++++++++++++++++++++++++++++++++----------
2 files changed, 60 insertions(+), 11 deletions(-)
diff --git a/gcc/expr.c b/gcc/expr.c
index 096c0315ecc..5dd98a9bccc 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -688,7 +688,24 @@ convert_modes (machine_mode mode, machine_mode oldmode, rtx x, int unsignedp)
&& (GET_MODE_PRECISION (subreg_promoted_mode (x))
>= GET_MODE_PRECISION (int_mode))
&& SUBREG_CHECK_PROMOTED_SIGN (x, unsignedp))
- x = gen_lowpart (int_mode, SUBREG_REG (x));
+ {
+ scalar_int_mode int_orig_mode;
+ machine_mode orig_mode = GET_MODE (x);
+ x = gen_lowpart (int_mode, SUBREG_REG (x));
+
+ /* Preserve SUBREG_PROMOTED_VAR_P if the new mode is wider than
+ the original mode, but narrower than the inner mode. */
+ if (GET_CODE (x) == SUBREG
+ && GET_MODE_PRECISION (subreg_promoted_mode (x))
+ > GET_MODE_PRECISION (int_mode)
+ && is_a <scalar_int_mode> (orig_mode, &int_orig_mode)
+ && GET_MODE_PRECISION (int_mode)
+ > GET_MODE_PRECISION (int_orig_mode))
+ {
+ SUBREG_PROMOTED_VAR_P (x) = 1;
+ SUBREG_PROMOTED_SET (x, unsignedp);
+ }
+ }
if (GET_MODE (x) != VOIDmode)
oldmode = GET_MODE (x);
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index e431e0c19d7..ebad5cb5a79 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -1512,12 +1512,28 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
target mode is the same as the variable's promotion. */
if (GET_CODE (op) == SUBREG
&& SUBREG_PROMOTED_VAR_P (op)
- && SUBREG_PROMOTED_SIGNED_P (op)
- && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (op))))
+ && SUBREG_PROMOTED_SIGNED_P (op))
{
- temp = rtl_hooks.gen_lowpart_no_emit (mode, SUBREG_REG (op));
- if (temp)
- return temp;
+ rtx subreg = SUBREG_REG (op);
+ machine_mode subreg_mode = GET_MODE (subreg);
+ if (!paradoxical_subreg_p (mode, subreg_mode))
+ {
+ temp = rtl_hooks.gen_lowpart_no_emit (mode, subreg);
+ if (temp)
+ {
+ /* Preserve SUBREG_PROMOTED_VAR_P. */
+ if (partial_subreg_p (temp))
+ {
+ SUBREG_PROMOTED_VAR_P (temp) = 1;
+ SUBREG_PROMOTED_SET (temp, 1);
+ }
+ return temp;
+ }
+ }
+ else
+ /* Sign-extending a sign-extended subreg. */
+ return simplify_gen_unary (SIGN_EXTEND, mode,
+ subreg, subreg_mode);
}
/* (sign_extend:M (sign_extend:N <X>)) is (sign_extend:M <X>).
@@ -1631,12 +1647,28 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
target mode is the same as the variable's promotion. */
if (GET_CODE (op) == SUBREG
&& SUBREG_PROMOTED_VAR_P (op)
- && SUBREG_PROMOTED_UNSIGNED_P (op)
- && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (op))))
+ && SUBREG_PROMOTED_UNSIGNED_P (op))
{
- temp = rtl_hooks.gen_lowpart_no_emit (mode, SUBREG_REG (op));
- if (temp)
- return temp;
+ rtx subreg = SUBREG_REG (op);
+ machine_mode subreg_mode = GET_MODE (subreg);
+ if (!paradoxical_subreg_p (mode, subreg_mode))
+ {
+ temp = rtl_hooks.gen_lowpart_no_emit (mode, subreg);
+ if (temp)
+ {
+ /* Preserve SUBREG_PROMOTED_VAR_P. */
+ if (partial_subreg_p (temp))
+ {
+ SUBREG_PROMOTED_VAR_P (temp) = 1;
+ SUBREG_PROMOTED_SET (temp, 0);
+ }
+ return temp;
+ }
+ }
+ else
+ /* Zero-extending a zero-extended subreg. */
+ return simplify_gen_unary (ZERO_EXTEND, mode,
+ subreg, subreg_mode);
}
/* Extending a widening multiplication should be canonicalized to
</cut>
Successfully identified regression in *gcc* in CI configuration tcwg_gnu_cross_build/master-aarch64. So far, this commit has regressed CI configurations:
- tcwg_gnu_cross_build/master-aarch64
Culprit:
<cut>
commit cad36f38576a6a781e3c62ab061c68f5b8dab13a
Author: Roger Sayle <roger(a)nextmovesoftware.com>
Date: Tue Aug 31 11:45:07 2021 +0100
Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))).
SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial subreg
is correctly zero-extended or sign-extended in the parent register. For
example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0) indicates that the
byte x is zero extended in reg:SI 23, which is useful for optimization.
An example is that zero extending the above QImode value to HImode can
simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0).
This patch addresses the oversight/missed optimization opportunity that
the new HImode subreg above should retain its SUBREG_PROMOTED_VAR_P
annotation as its value is guaranteed to be correctly extended in the
SImode parent. The code below to preserve SUBREG_PROMOTED_VAR_P is already
present in the middle-end (e.g. simplify-rtx.c:7232-7242) but missing
from one or two (precisely three) places that (accidentally) strip it.
Whilst there I also added another optimization. If we need to extend
the above QImode value beyond the SImode register holding it, say to
DImode, we can eliminate the SUBREG and simply extend from the SImode
register to DImode.
2021-08-31 Roger Sayle <roger(a)nextmovesoftware.com>
gcc/ChangeLog
* expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when
creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
subreg.
* simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]:
Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider)
partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate
SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical.
[ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when
creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
subreg. Generate ZERO_EXTEND of the SUBREG_REG when a subreg
would be paradoxical.
</cut>
Results regressed to (for first_bad == cad36f38576a6a781e3c62ab061c68f5b8dab13a)
# reset_artifacts:
-10
# true:
0
# build_abe binutils:
1
# First few build errors in logs:
# 00:04:40 cc1: error: no include path in which to search for stdc-predef.h
# 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-1.h:127:36: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-1.h:127:36: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-1.h:127:36: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
# 00:04:48 make[2]: *** [/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/static-object.mk:17: floatsitf.o] Error 1
# 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132
from (for last_good == 0960d937d9bee3c831d0b64a9c828c263a58ff89)
# reset_artifacts:
-10
# true:
0
# build_abe binutils:
1
# build_abe stage1:
2
# build_abe linux:
3
# build_abe glibc:
4
# build_abe stage2:
5
# build_abe gdb:
6
# build_abe qemu:
7
Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/arti…
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/arti…
Build top page/logs: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/
Configuration details:
Reproduce builds:
<cut>
mkdir investigate-gcc-cad36f38576a6a781e3c62ab061c68f5b8dab13a
cd investigate-gcc-cad36f38576a6a781e3c62ab061c68f5b8dab13a
git clone https://git.linaro.org/toolchain/jenkins-scripts
mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/arti… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/arti… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/arti… --fail
chmod +x artifacts/test.sh
# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh
# Save baseline build state (which is then restored in artifacts/test.sh)
mkdir -p ./bisect
rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/
cd gcc
# Reproduce first_bad build
git checkout --detach cad36f38576a6a781e3c62ab061c68f5b8dab13a
../artifacts/test.sh
# Reproduce last_good build
git checkout --detach 0960d937d9bee3c831d0b64a9c828c263a58ff89
../artifacts/test.sh
cd ..
</cut>
History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…
Artifacts: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/arti…
Build log: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/cons…
Full commit (up to 1000 lines):
<cut>
commit cad36f38576a6a781e3c62ab061c68f5b8dab13a
Author: Roger Sayle <roger(a)nextmovesoftware.com>
Date: Tue Aug 31 11:45:07 2021 +0100
Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))).
SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial subreg
is correctly zero-extended or sign-extended in the parent register. For
example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0) indicates that the
byte x is zero extended in reg:SI 23, which is useful for optimization.
An example is that zero extending the above QImode value to HImode can
simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0).
This patch addresses the oversight/missed optimization opportunity that
the new HImode subreg above should retain its SUBREG_PROMOTED_VAR_P
annotation as its value is guaranteed to be correctly extended in the
SImode parent. The code below to preserve SUBREG_PROMOTED_VAR_P is already
present in the middle-end (e.g. simplify-rtx.c:7232-7242) but missing
from one or two (precisely three) places that (accidentally) strip it.
Whilst there I also added another optimization. If we need to extend
the above QImode value beyond the SImode register holding it, say to
DImode, we can eliminate the SUBREG and simply extend from the SImode
register to DImode.
2021-08-31 Roger Sayle <roger(a)nextmovesoftware.com>
gcc/ChangeLog
* expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when
creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
subreg.
* simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]:
Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider)
partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate
SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical.
[ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when
creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
subreg. Generate ZERO_EXTEND of the SUBREG_REG when a subreg
would be paradoxical.
---
gcc/expr.c | 19 ++++++++++++++++++-
gcc/simplify-rtx.c | 52 ++++++++++++++++++++++++++++++++++++++++++----------
2 files changed, 60 insertions(+), 11 deletions(-)
diff --git a/gcc/expr.c b/gcc/expr.c
index 096c0315ecc..5dd98a9bccc 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -688,7 +688,24 @@ convert_modes (machine_mode mode, machine_mode oldmode, rtx x, int unsignedp)
&& (GET_MODE_PRECISION (subreg_promoted_mode (x))
>= GET_MODE_PRECISION (int_mode))
&& SUBREG_CHECK_PROMOTED_SIGN (x, unsignedp))
- x = gen_lowpart (int_mode, SUBREG_REG (x));
+ {
+ scalar_int_mode int_orig_mode;
+ machine_mode orig_mode = GET_MODE (x);
+ x = gen_lowpart (int_mode, SUBREG_REG (x));
+
+ /* Preserve SUBREG_PROMOTED_VAR_P if the new mode is wider than
+ the original mode, but narrower than the inner mode. */
+ if (GET_CODE (x) == SUBREG
+ && GET_MODE_PRECISION (subreg_promoted_mode (x))
+ > GET_MODE_PRECISION (int_mode)
+ && is_a <scalar_int_mode> (orig_mode, &int_orig_mode)
+ && GET_MODE_PRECISION (int_mode)
+ > GET_MODE_PRECISION (int_orig_mode))
+ {
+ SUBREG_PROMOTED_VAR_P (x) = 1;
+ SUBREG_PROMOTED_SET (x, unsignedp);
+ }
+ }
if (GET_MODE (x) != VOIDmode)
oldmode = GET_MODE (x);
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index e431e0c19d7..ebad5cb5a79 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -1512,12 +1512,28 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
target mode is the same as the variable's promotion. */
if (GET_CODE (op) == SUBREG
&& SUBREG_PROMOTED_VAR_P (op)
- && SUBREG_PROMOTED_SIGNED_P (op)
- && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (op))))
+ && SUBREG_PROMOTED_SIGNED_P (op))
{
- temp = rtl_hooks.gen_lowpart_no_emit (mode, SUBREG_REG (op));
- if (temp)
- return temp;
+ rtx subreg = SUBREG_REG (op);
+ machine_mode subreg_mode = GET_MODE (subreg);
+ if (!paradoxical_subreg_p (mode, subreg_mode))
+ {
+ temp = rtl_hooks.gen_lowpart_no_emit (mode, subreg);
+ if (temp)
+ {
+ /* Preserve SUBREG_PROMOTED_VAR_P. */
+ if (partial_subreg_p (temp))
+ {
+ SUBREG_PROMOTED_VAR_P (temp) = 1;
+ SUBREG_PROMOTED_SET (temp, 1);
+ }
+ return temp;
+ }
+ }
+ else
+ /* Sign-extending a sign-extended subreg. */
+ return simplify_gen_unary (SIGN_EXTEND, mode,
+ subreg, subreg_mode);
}
/* (sign_extend:M (sign_extend:N <X>)) is (sign_extend:M <X>).
@@ -1631,12 +1647,28 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
target mode is the same as the variable's promotion. */
if (GET_CODE (op) == SUBREG
&& SUBREG_PROMOTED_VAR_P (op)
- && SUBREG_PROMOTED_UNSIGNED_P (op)
- && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (op))))
+ && SUBREG_PROMOTED_UNSIGNED_P (op))
{
- temp = rtl_hooks.gen_lowpart_no_emit (mode, SUBREG_REG (op));
- if (temp)
- return temp;
+ rtx subreg = SUBREG_REG (op);
+ machine_mode subreg_mode = GET_MODE (subreg);
+ if (!paradoxical_subreg_p (mode, subreg_mode))
+ {
+ temp = rtl_hooks.gen_lowpart_no_emit (mode, subreg);
+ if (temp)
+ {
+ /* Preserve SUBREG_PROMOTED_VAR_P. */
+ if (partial_subreg_p (temp))
+ {
+ SUBREG_PROMOTED_VAR_P (temp) = 1;
+ SUBREG_PROMOTED_SET (temp, 0);
+ }
+ return temp;
+ }
+ }
+ else
+ /* Zero-extending a zero-extended subreg. */
+ return simplify_gen_unary (ZERO_EXTEND, mode,
+ subreg, subreg_mode);
}
/* Extending a widening multiplication should be canonicalized to
</cut>
Successfully identified regression in *gdb* in CI configuration tcwg_gnu_native_build/master-arm. So far, this commit has regressed CI configurations:
- tcwg_gnu_native_build/master-arm
Culprit:
<cut>
commit 282aa4f7d292eb4bc213d028465a3b96f5af2f22
Author: Tom Tromey <tom(a)tromey.com>
Date: Sat Aug 28 13:16:50 2021 -0600
Add some parallel_for_each tests
Tom de Vries noticed that a patch in the DWARF scanner rewrite series
caused a regression in parallel_for_each -- it started crashing in the
case where the number of threads is 0 (there was an unchecked use of
"n-1" that was used to size an array).
He also pointed out that there were no tests of parallel_for_each.
This adds a few tests of parallel_for_each, primarily testing that
different settings for the number of threads will work. This test
catches the bug that he found in that series.
</cut>
Results regressed to (for first_bad == 282aa4f7d292eb4bc213d028465a3b96f5af2f22)
# reset_artifacts:
-10
# true:
0
# build_abe binutils:
1
# build_abe gcc:
2
# build_abe linux:
4
# build_abe glibc:
5
# First few build errors in logs:
# 00:03:45 ../../../../../../gdb/gdb/unittests/parallel-for-selftests.c:53:30: error: use of deleted function ‘std::atomic<int>::atomic(const std::atomic<int>&)’
# 00:03:45 make[1]: *** [unittests/parallel-for-selftests.o] Error 1
# 00:03:46 make: *** [all-gdb] Error 2
from (for last_good == ee8b88452c1cb1be97199942aee7a76bbca210ee)
# reset_artifacts:
-10
# true:
0
# build_abe binutils:
1
# build_abe gcc:
2
# build_abe linux:
4
# build_abe glibc:
5
# build_abe gdb:
6
Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/artifac…
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/artifac…
Build top page/logs: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/
Configuration details:
Reproduce builds:
<cut>
mkdir investigate-gdb-282aa4f7d292eb4bc213d028465a3b96f5af2f22
cd investigate-gdb-282aa4f7d292eb4bc213d028465a3b96f5af2f22
git clone https://git.linaro.org/toolchain/jenkins-scripts
mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/artifac… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/artifac… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/artifac… --fail
chmod +x artifacts/test.sh
# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh
# Save baseline build state (which is then restored in artifacts/test.sh)
mkdir -p ./bisect
rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gdb/ ./ ./bisect/baseline/
cd gdb
# Reproduce first_bad build
git checkout --detach 282aa4f7d292eb4bc213d028465a3b96f5af2f22
../artifacts/test.sh
# Reproduce last_good build
git checkout --detach ee8b88452c1cb1be97199942aee7a76bbca210ee
../artifacts/test.sh
cd ..
</cut>
History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…
Artifacts: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/artifac…
Build log: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/console…
Full commit (up to 1000 lines):
<cut>
commit 282aa4f7d292eb4bc213d028465a3b96f5af2f22
Author: Tom Tromey <tom(a)tromey.com>
Date: Sat Aug 28 13:16:50 2021 -0600
Add some parallel_for_each tests
Tom de Vries noticed that a patch in the DWARF scanner rewrite series
caused a regression in parallel_for_each -- it started crashing in the
case where the number of threads is 0 (there was an unchecked use of
"n-1" that was used to size an array).
He also pointed out that there were no tests of parallel_for_each.
This adds a few tests of parallel_for_each, primarily testing that
different settings for the number of threads will work. This test
catches the bug that he found in that series.
---
gdb/Makefile.in | 1 +
gdb/unittests/parallel-for-selftests.c | 86 ++++++++++++++++++++++++++++++++++
2 files changed, 87 insertions(+)
diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index 73a1bf83c85..320d3326a81 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -456,6 +456,7 @@ SELFTESTS_SRCS = \
unittests/offset-type-selftests.c \
unittests/observable-selftests.c \
unittests/optional-selftests.c \
+ unittests/parallel-for-selftests.c \
unittests/parse-connection-spec-selftests.c \
unittests/ptid-selftests.c \
unittests/main-thread-selftests.c \
diff --git a/gdb/unittests/parallel-for-selftests.c b/gdb/unittests/parallel-for-selftests.c
new file mode 100644
index 00000000000..7f61b709fa7
--- /dev/null
+++ b/gdb/unittests/parallel-for-selftests.c
@@ -0,0 +1,86 @@
+/* Self tests for parallel_for_each
+
+ Copyright (C) 2021 Free Software Foundation, Inc.
+
+ This file is part of GDB.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#include "defs.h"
+#include "gdbsupport/selftest.h"
+#include "gdbsupport/parallel-for.h"
+#include "gdbsupport/thread-pool.h"
+
+#if CXX_STD_THREAD
+
+namespace selftests {
+namespace parallel_for {
+
+struct save_restore_n_threads
+{
+ save_restore_n_threads ()
+ : n_threads (gdb::thread_pool::g_thread_pool->thread_count ())
+ {
+ }
+
+ ~save_restore_n_threads ()
+ {
+ gdb::thread_pool::g_thread_pool->set_thread_count (n_threads);
+ }
+
+ int n_threads;
+};
+
+static void
+test (int n_threads)
+{
+ save_restore_n_threads saver;
+ gdb::thread_pool::g_thread_pool->set_thread_count (n_threads);
+
+#define NUMBER 10000
+
+ std::atomic<int> counter = 0;
+ gdb::parallel_for_each (0, NUMBER,
+ [&] (int start, int end)
+ {
+ counter += end - start;
+ });
+
+ SELF_CHECK (counter == NUMBER);
+
+#undef NUMBER
+}
+
+static void
+test_n_threads ()
+{
+ test (0);
+ test (1);
+ test (3);
+}
+
+}
+}
+
+#endif /* CXX_STD_THREAD */
+
+void _initialize_parallel_for_selftests ();
+void
+_initialize_parallel_for_selftests ()
+{
+#ifdef CXX_STD_THREAD
+ selftests::register_test ("parallel_for",
+ selftests::parallel_for::test_n_threads);
+#endif /* CXX_STD_THREAD */
+}
</cut>
Successfully identified regression in *gcc* in CI configuration tcwg_gnu_cross_build/master-arm. So far, this commit has regressed CI configurations:
- tcwg_gnu_cross_build/master-arm
Culprit:
<cut>
commit caf81d3b57501b1f58dcd9b1ef9d7b4bc76f4ab1
Author: Sebastian Huber <sebastian.huber(a)embedded-brains.de>
Date: Tue Aug 17 09:53:43 2021 +0200
Use __builtin_trap() for abort() if inhibit_libc
abort() is used in gcc_assert() and gcc_unreachable() which is used by target
libraries such as libgcov.a. This patch changes the abort() definition under
certain conditions. If inhibit_libc is defined and abort is not already
defined, then abort() is defined to __builtin_trap().
The inhibit_libc define is usually defined if GCC is built for targets running
in embedded systems which may optionally use a C standard library. If
inhibit_libc is defined, then there may be still a full featured abort()
available. abort() is a heavy weight function which depends on signals and
file streams. For statically linked applications, this means that a dependency
on gcc_assert() pulls in the support for signals and file streams. This could
prevent using gcov to test low end targets for example. Using __builtin_trap()
avoids these dependencies if the target implements a "trap" instruction. The
application or operating system could use a trap handler to react to failed GCC
runtime checks which caused a trap.
gcc/
* tsystem.h (abort): Define abort() if inhibit_libc is defined and it
is not already defined.
</cut>
Results regressed to (for first_bad == caf81d3b57501b1f58dcd9b1ef9d7b4bc76f4ab1)
# reset_artifacts:
-10
# true:
0
# build_abe binutils:
1
# First few build errors in logs:
# 00:01:44 cc1: error: no include path in which to search for stdc-predef.h
# 00:02:05 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/unwind-arm-common.inc:55:24: error: macro passed 1 arguments, but takes just 0
# 00:02:05 make[2]: *** [/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/static-object.mk:17: unwind-arm.o] Error 1
# 00:02:06 make[1]: *** [Makefile:12484: all-target-libgcc] Error 2
# 00:02:06 make: *** [Makefile:953: all] Error 2
from (for last_good == d7e56b084d0b230ae5ee280f569d679fa0f09f4d)
# reset_artifacts:
-10
# true:
0
# build_abe binutils:
1
# build_abe stage1:
2
# build_abe linux:
3
# build_abe glibc:
4
# build_abe stage2:
5
# build_abe gdb:
6
# build_abe qemu:
7
Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/artifact…
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/artifact…
Build top page/logs: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/
Configuration details:
Reproduce builds:
<cut>
mkdir investigate-gcc-caf81d3b57501b1f58dcd9b1ef9d7b4bc76f4ab1
cd investigate-gcc-caf81d3b57501b1f58dcd9b1ef9d7b4bc76f4ab1
git clone https://git.linaro.org/toolchain/jenkins-scripts
mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/artifact… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/artifact… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/artifact… --fail
chmod +x artifacts/test.sh
# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh
# Save baseline build state (which is then restored in artifacts/test.sh)
mkdir -p ./bisect
rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/
cd gcc
# Reproduce first_bad build
git checkout --detach caf81d3b57501b1f58dcd9b1ef9d7b4bc76f4ab1
../artifacts/test.sh
# Reproduce last_good build
git checkout --detach d7e56b084d0b230ae5ee280f569d679fa0f09f4d
../artifacts/test.sh
cd ..
</cut>
History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…
Artifacts: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/artifact…
Build log: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/consoleT…
Full commit (up to 1000 lines):
<cut>
commit caf81d3b57501b1f58dcd9b1ef9d7b4bc76f4ab1
Author: Sebastian Huber <sebastian.huber(a)embedded-brains.de>
Date: Tue Aug 17 09:53:43 2021 +0200
Use __builtin_trap() for abort() if inhibit_libc
abort() is used in gcc_assert() and gcc_unreachable() which is used by target
libraries such as libgcov.a. This patch changes the abort() definition under
certain conditions. If inhibit_libc is defined and abort is not already
defined, then abort() is defined to __builtin_trap().
The inhibit_libc define is usually defined if GCC is built for targets running
in embedded systems which may optionally use a C standard library. If
inhibit_libc is defined, then there may be still a full featured abort()
available. abort() is a heavy weight function which depends on signals and
file streams. For statically linked applications, this means that a dependency
on gcc_assert() pulls in the support for signals and file streams. This could
prevent using gcov to test low end targets for example. Using __builtin_trap()
avoids these dependencies if the target implements a "trap" instruction. The
application or operating system could use a trap handler to react to failed GCC
runtime checks which caused a trap.
gcc/
* tsystem.h (abort): Define abort() if inhibit_libc is defined and it
is not already defined.
---
gcc/tsystem.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/tsystem.h b/gcc/tsystem.h
index e1e6a96a4f4..5c72c69ff3e 100644
--- a/gcc/tsystem.h
+++ b/gcc/tsystem.h
@@ -59,7 +59,7 @@ extern int atexit (void (*)(void));
#endif
#ifndef abort
-extern void abort (void) __attribute__ ((__noreturn__));
+#define abort() __builtin_trap ()
#endif
#ifndef strlen
</cut>
Progress (short week, 3 days):
* UM-2 [QEMU upstream maintainership]
+ QEMU 6.1.0 has now been released
+ Sent out the first arm pullreq for the 6.2 cycle, including
another slice of the MVE patches
+ tried to work through some of the codereview backlog
-- PMM
Successfully identified regression in *gcc* in CI configuration tcwg_gcc_bootstrap/master-arm-bootstrap_debug. So far, this commit has regressed CI configurations:
- tcwg_gcc_bootstrap/master-arm-bootstrap_debug
Culprit:
<cut>
commit 1d244020246cb155e4de62ca3b302b920a1f513f
Author: Roger Sayle <roger(a)nextmovesoftware.com>
Date: Mon Aug 23 12:37:04 2021 +0100
Fold sign of LSHIFT_EXPR to eliminate no-op conversions.
This short patch teaches fold that it is "safe" to change the sign
of a left shift, to reduce the number of type conversions in gimple.
As an example:
unsigned int foo(unsigned int i) {
return (int)i << 8;
}
is currently optimized to:
unsigned int foo (unsigned int i)
{
int i.0_1;
int _2;
unsigned int _4;
<bb 2> [local count: 1073741824]:
i.0_1 = (int) i_3(D);
_2 = i.0_1 << 8;
_4 = (unsigned int) _2;
return _4;
}
with this patch, this now becomes:
unsigned int foo (unsigned int i)
{
unsigned int _2;
<bb 2> [local count: 1073741824]:
_2 = i_1(D) << 8;
return _2;
}
which generates exactly the same assembly language. Aside from the
reduced memory usage, the real benefit is that no-op conversions tend
to interfere with many folding optimizations. For example,
unsigned int bar(unsigned char i) {
return (i ^ (i<<16)) | (i<<8);
}
currently gets (tangled in conversions and) optimized to:
unsigned int bar (unsigned char i)
{
unsigned int _1;
unsigned int _2;
int _3;
int _4;
unsigned int _6;
unsigned int _8;
<bb 2> [local count: 1073741824]:
_1 = (unsigned int) i_5(D);
_2 = _1 * 65537;
_3 = (int) i_5(D);
_4 = _3 << 8;
_8 = (unsigned int) _4;
_6 = _2 | _8;
return _6;
}
but with this patch, bar now optimizes down to:
unsigned int bar(unsigned char i)
{
unsigned int _1;
unsigned int _4;
<bb 2> [local count: 1073741824]:
_1 = (unsigned int) i_3(D);
_4 = _1 * 65793;
return _4;
}
2021-08-23 Roger Sayle <roger(a)nextmovesoftware.com>
gcc/ChangeLog
* match.pd (shift transformations): Change the sign of an
LSHIFT_EXPR if it reduces the number of explicit conversions.
gcc/testsuite/ChangeLog
* gcc.dg/fold-convlshift-1.c: New test case.
* gcc.dg/fold-convlshift-2.c: New test case.
</cut>
Results regressed to (for first_bad == 1d244020246cb155e4de62ca3b302b920a1f513f)
# reset_artifacts:
-10
# true:
0
# build_abe binutils:
1
# First few build errors in logs:
# 00:06:26 make[3]: [armv8l-unknown-linux-gnueabihf/bits/largefile-config.h] Error 1 (ignored)
# 00:25:39 make[3]: [armv8l-unknown-linux-gnueabihf/bits/largefile-config.h] Error 1 (ignored)
# 00:29:38 /home/tcwg-buildslave/workspace/tcwg_gnu_8/abe/snapshots/gcc.git~master/gcc/bitmap.h:357:13: error: type mismatch in ‘lshift_expr’
# 00:29:38 /home/tcwg-buildslave/workspace/tcwg_gnu_8/abe/snapshots/gcc.git~master/gcc/bitmap.h:357:13: internal compiler error: ‘verify_gimple’ failed
# 00:29:38 make[3]: *** [bitmap.o] Error 1
# 00:34:06 make[2]: *** [all-stage3-gcc] Error 2
# 00:34:06 make[1]: *** [stage3-bubble] Error 2
# 00:34:07 make: *** [all] Error 2
from (for last_good == b320edc0c29c838b0090c3c9be14187d132f73f2)
# reset_artifacts:
-10
# true:
0
# build_abe binutils:
1
# build_abe bootstrap_debug:
2
Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de…
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de…
Build top page/logs: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de…
Configuration details:
Reproduce builds:
<cut>
mkdir investigate-gcc-1d244020246cb155e4de62ca3b302b920a1f513f
cd investigate-gcc-1d244020246cb155e4de62ca3b302b920a1f513f
git clone https://git.linaro.org/toolchain/jenkins-scripts
mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… --fail
chmod +x artifacts/test.sh
# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh
# Save baseline build state (which is then restored in artifacts/test.sh)
mkdir -p ./bisect
rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/
cd gcc
# Reproduce first_bad build
git checkout --detach 1d244020246cb155e4de62ca3b302b920a1f513f
../artifacts/test.sh
# Reproduce last_good build
git checkout --detach b320edc0c29c838b0090c3c9be14187d132f73f2
../artifacts/test.sh
cd ..
</cut>
History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…
Artifacts: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de…
Build log: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de…
Full commit (up to 1000 lines):
<cut>
commit 1d244020246cb155e4de62ca3b302b920a1f513f
Author: Roger Sayle <roger(a)nextmovesoftware.com>
Date: Mon Aug 23 12:37:04 2021 +0100
Fold sign of LSHIFT_EXPR to eliminate no-op conversions.
This short patch teaches fold that it is "safe" to change the sign
of a left shift, to reduce the number of type conversions in gimple.
As an example:
unsigned int foo(unsigned int i) {
return (int)i << 8;
}
is currently optimized to:
unsigned int foo (unsigned int i)
{
int i.0_1;
int _2;
unsigned int _4;
<bb 2> [local count: 1073741824]:
i.0_1 = (int) i_3(D);
_2 = i.0_1 << 8;
_4 = (unsigned int) _2;
return _4;
}
with this patch, this now becomes:
unsigned int foo (unsigned int i)
{
unsigned int _2;
<bb 2> [local count: 1073741824]:
_2 = i_1(D) << 8;
return _2;
}
which generates exactly the same assembly language. Aside from the
reduced memory usage, the real benefit is that no-op conversions tend
to interfere with many folding optimizations. For example,
unsigned int bar(unsigned char i) {
return (i ^ (i<<16)) | (i<<8);
}
currently gets (tangled in conversions and) optimized to:
unsigned int bar (unsigned char i)
{
unsigned int _1;
unsigned int _2;
int _3;
int _4;
unsigned int _6;
unsigned int _8;
<bb 2> [local count: 1073741824]:
_1 = (unsigned int) i_5(D);
_2 = _1 * 65537;
_3 = (int) i_5(D);
_4 = _3 << 8;
_8 = (unsigned int) _4;
_6 = _2 | _8;
return _6;
}
but with this patch, bar now optimizes down to:
unsigned int bar(unsigned char i)
{
unsigned int _1;
unsigned int _4;
<bb 2> [local count: 1073741824]:
_1 = (unsigned int) i_3(D);
_4 = _1 * 65793;
return _4;
}
2021-08-23 Roger Sayle <roger(a)nextmovesoftware.com>
gcc/ChangeLog
* match.pd (shift transformations): Change the sign of an
LSHIFT_EXPR if it reduces the number of explicit conversions.
gcc/testsuite/ChangeLog
* gcc.dg/fold-convlshift-1.c: New test case.
* gcc.dg/fold-convlshift-2.c: New test case.
---
gcc/match.pd | 9 +++++++++
gcc/testsuite/gcc.dg/fold-convlshift-1.c | 20 ++++++++++++++++++++
gcc/testsuite/gcc.dg/fold-convlshift-2.c | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/gcc/match.pd b/gcc/match.pd
index 0fcfd0ea62c..978a1b0172e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3385,6 +3385,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(if (integer_zerop (@2) || integer_all_onesp (@2))
(cmp @0 @2)))))
+/* Both signed and unsigned lshift produce the same result, so use
+ the form that minimizes the number of conversions. */
+(simplify
+ (convert (lshift:s@0 (convert:s@1 @2) INTEGER_CST@3))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
+ && INTEGRAL_TYPE_P (TREE_TYPE (@2))
+ && TYPE_PRECISION (TREE_TYPE (@2)) <= TYPE_PRECISION (type))
+ (lshift (convert @2) @3)))
+
/* Simplifications of conversions. */
/* Basic strip-useless-type-conversions / strip_nops. */
diff --git a/gcc/testsuite/gcc.dg/fold-convlshift-1.c b/gcc/testsuite/gcc.dg/fold-convlshift-1.c
new file mode 100644
index 00000000000..b6f57f81e72
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-convlshift-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+unsigned int foo(unsigned int i)
+{
+ int t1 = i;
+ int t2 = t1 << 8;
+ return t2;
+}
+
+int bar(int i)
+{
+ unsigned int t1 = i;
+ unsigned int t2 = t1 << 8;
+ return t2;
+}
+
+/* { dg-final { scan-tree-dump-not "\\(int\\)" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "\\(unsigned int\\)" "optimized" } } */
+
diff --git a/gcc/testsuite/gcc.dg/fold-convlshift-2.c b/gcc/testsuite/gcc.dg/fold-convlshift-2.c
new file mode 100644
index 00000000000..f21358c4584
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-convlshift-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+unsigned int foo(unsigned char c)
+{
+ int t1 = c;
+ int t2 = t1 << 8;
+ return t2;
+}
+
+int bar(unsigned char c)
+{
+ unsigned int t1 = c;
+ unsigned int t2 = t1 << 8;
+ return t2;
+}
+
+/* { dg-final { scan-tree-dump-times "\\(int\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\(unsigned int\\)" 1 "optimized" } } */
+
</cut>