Hi,
On 26 August 2017 at 18:10, Pinski, Andrew Andrew.Pinski@cavium.com wrote:
However there might be pushback from upstream maintainers as this makes the structure bigger by adding a field. This could have implications for memory usage of the compiler.
I looked into the structure, adding this field is not going to make the structure bigger for either ILP32 or LP64 targets. If you want, you use bit-fields; there is one bool already there which means you can fit 8 bits in the same area as currently taken up by that one.
Yes. I should have checked the mem_attrs structure. This does have at least a byte left unlike some other tightly packed structures (gimple and some tree structures in gcc).
Thanks, Kugan
Alternatively, we maybe able to get this info from dwarf info when we compile with -g ?
I doubt you can. He wants to know if an instruction is a spill location. The location of a variable might be recorded in -g (if it was an user variable) but not that does present the data for all temps being spilled.
I think the patch is actually a good one in general just needs some cleanup.
As for these comments:
For example, GCC calls `output_asm_insn' directly from the `define_insn' definition in the aarch64.md file without an insn object(`output_asm_insn' calls `output_asm_operand_names'). This occurs in "*cb<optab><mode>1" and "*aarch64_fcvt<su_optab>GPF:modeGPI:mode2_mult".
Spills in GCC will always be via the mov* patterns (they are special). Now really *aarch64_fcvt<su_optab>GPF:modeGPI:mode2_mult should be fixed for a different reason; it does unneeded work. The fix would be something like (untested): { operands[2] = GEN_INT (aarch64_fpconst_pow_of_2 (operands[2])); return "fcvtz<su>\t%GPI:w0, %GPF:s1, %2"; }
Thanks, Andrew
-----Original Message----- From: linaro-toolchain [mailto:linaro-toolchain-bounces@lists.linaro.org] On Behalf Of Kugan Vivekanandarajah Sent: Saturday, August 26, 2017 12:40 AM To: Renato Golin renato.golin@linaro.org Cc: Jim Wilson jim.wilson@linaro.org; hpc-sig-devel@linaro.org; Linaro Toolchain linaro-toolchain@lists.linaro.org Subject: Re: [hpc-sig-devel] GCC extensions for `hcqc'
Hi,
On 26 August 2017 at 04:04, Renato Golin renato.golin@linaro.org wrote:
+linaro-toolchain, hoping to get more eyes into it.
cheers, --renato
On 25 August 2017 at 17:59, Masaki Arai masaki.arai@linaro.org wrote:
Hi,
I extended GCC 7.1(or GCC 7.2) for `hcqc'. I would be grateful if you could give me a comment about whether this extension is acceptable and whether this extension should be pushed upstream.
I think this is a useful info. However there might be pushback from upstream maintainers as this makes the structure bigger by adding a field. This could have implications for memory usage of the compiler. Alternatively, we maybe able to get this info from dwarf info when we compile with -g ? Jim may have some input here (cc ing him).
Thanks, Kugan
The extended GCC's output using the option ` -fverbose-asm' is as follows:
ldr w0, [x29,48] // tmp433, j(8-byte Folded Spill) ^^^^^^^^^^^^^^^^^^^ This code shows that this instruction accesses a memory area for spill codes. I made the following changes to GCC 7.1(or GCC 7.2). The related files are under `hcqc/patch/gcc-7.1.0-add'.
(1) rtl.h
I added flag information to `struct mem_attrs' that means whether it is a spill memory area or not.
- /* True if the MEM is for spill. */
- bool for_spill_p;
Also, I added an access macro for this additional field.
- /* For a MEM rtx, true if its MEM is for spill. */ #define
- MEM_FOR_SPILL_P(RTX) (get_mem_attrs (RTX)->for_spill_p)
(2) emit-rtl.c
I added a code to turn on flags for spill memory area in function `set_mem_attrs_for_spill'.
- attrs.for_spill_p = true;
(3) final.c
I added code to print that information in function `output_asm_operand_names' if the memory is a spill memory area,
if (MEM_P (op) && MEM_FOR_SPILL_P (op))
{
HOST_WIDE_INT size = MEM_SIZE (op);
fprintf (asm_out_file, " (" HOST_WIDE_INT_PRINT_DEC "-byte
- Folded
Spill)", size);
}
The above changes are implemented similarly as Clang/LLVM. Unfortunately, it is difficult for GCC to print the above "(?-byte Folded Spill)" for memory access instructions only in the same manner as Clang/LLVM. The reason is that GCC executes the above `output_asm_operand_names' even in situations where any instruction object(insn) does not exist when outputting assembly code. For example, GCC calls `output_asm_insn' directly from the `define_insn' definition in the aarch64.md file without an insn object(`output_asm_insn' calls `output_asm_operand_names'). This occurs in "*cb<optab><mode>1" and "*aarch64_fcvt<su_optab>GPF:modeGPI:mode2_mult".
From this fact, `hcqc' extracts and accumulates memory access instructions from the assembly code with the comment "(?-byte Folded Spill)".
The above extensions are commonly available on almost any architecture. Also, these extensions do not affect the execution of the resulting assembly code since additional outputs are only in comments.
Best regards,
Masaki Arai
linaro-toolchain mailing list linaro-toolchain@lists.linaro.org https://lists.linaro.org/mailman/listinfo/linaro-toolchain
linaro-toolchain mailing list linaro-toolchain@lists.linaro.org https://lists.linaro.org/mailman/listinfo/linaro-toolchain
Hi,
Thank you very much for your quick check and reply.
Kugan Vivekanandarajah kugan.vivekanandarajah@linaro.org writes:
I looked into the structure, adding this field is not going to make the
s=
tructure bigger for either ILP32 or LP64 targets. If you want, you use
bit=
-fields; there is one bool already there which means you can fit 8 bits
in =
the same area as currently taken up by that one.
Yes. I should have checked the mem_attrs structure. This does have at least a byte left unlike some other tightly packed structures (gimple and some tree structures in gcc).
Even though memory usage does not increase, I understand the policy of wanting to make the data structure simple.
Another way to implement this feature is to use the `addrspace' field in `struct mem_attrs' without adding any fields. I think that this implementation may be more decent. However, since this field holds information specific to the target machine, changing this will affect many files.
Alternatively, we maybe able to get this info from dwarf info when we
co= mpile with -g ?
I doubt you can. He wants to know if an instruction is a spill
location.=
The location of a variable might be recorded in -g (if it was an user
var=
iable) but not that does present the data for all temps being spilled.
I think the patch is actually a good one in general just needs some
clean=
up.
I was not thinking about implementation using DWARF. About 2013, I have created a tool to extract information from DWARF data in binary files generated by GCC. At that time, there were some shortages in the DWARF information, and as a result, it did not go very well.
Best regards, -- -------------------------------------- Masaki Arai
Hi,
On 29 August 2017 at 04:54, Masaki Arai masaki.arai@linaro.org wrote:
Hi,
Thank you very much for your quick check and reply.
Kugan Vivekanandarajah kugan.vivekanandarajah@linaro.org writes:
I looked into the structure, adding this field is not going to make the
s=
tructure bigger for either ILP32 or LP64 targets. If you want, you use
bit=
-fields; there is one bool already there which means you can fit 8 bits
in =
the same area as currently taken up by that one.
Yes. I should have checked the mem_attrs structure. This does have at least a byte left unlike some other tightly packed structures (gimple and some tree structures in gcc).
Even though memory usage does not increase, I understand the policy of wanting to make the data structure simple.
Another way to implement this feature is to use the `addrspace' field in `struct mem_attrs' without adding any fields. I think that this implementation may be more decent. However, since this field holds information specific to the target machine, changing this will affect many files.
I think your current approach is reasonable. It is better to discuss your alternate approach upstream. I would suggest you post your patch upstream and initiate a discussion there. Please refer to https://gcc.gnu.org/contribute.html (submitting patches part) if you already haven't.
Alternatively, we maybe able to get this info from dwarf info when we
co= mpile with -g ?
I doubt you can. He wants to know if an instruction is a spill
location.=
The location of a variable might be recorded in -g (if it was an user
var=
iable) but not that does present the data for all temps being spilled.
I think the patch is actually a good one in general just needs some
clean=
up.
I was not thinking about implementation using DWARF. About 2013, I have created a tool to extract information from DWARF data in binary files generated by GCC. At that time, there were some shortages in the DWARF information, and as a result, it did not go very well.
I am not an expert in DWARF but I can understand that this can be a problem.
Thanks, Kugan
Best regards,
Masaki Arai _______________________________________________ linaro-toolchain mailing list linaro-toolchain@lists.linaro.org https://lists.linaro.org/mailman/listinfo/linaro-toolchain
Hi,
Kugan Vivekanandarajah kugan.vivekanandarajah@linaro.org writes:
Even though memory usage does not increase, I understand the policy of wanting to make the data structure simple.
Another way to implement this feature is to use the `addrspace' field in `struct mem_attrs' without adding any fields. I think that this implementation may be more decent. However, since this field holds information specific to the target machine, changing this will affect many files.
I think your current approach is reasonable. It is better to discuss your alternate approach upstream. I would suggest you post your patch upstream and initiate a discussion there. Please refer to https://gcc.gnu.org/contribute.html (submitting patches part) if you already haven't.
Thank you very much for your comments and advice. I checked the URL. If there is no objection, I will consider proposing this function(and alternative appraoch) on the GCC mailing listgcc@gcc.gnu.org (after improving print strings and actual patch files).
Best regards, -- -------------------------------------- Masaki Arai
On 29 August 2017 at 10:15, Masaki Arai masaki.arai@linaro.org wrote:
Thank you very much for your comments and advice. I checked the URL. If there is no objection, I will consider proposing this function(and alternative appraoch) on the GCC mailing listgcc@gcc.gnu.org (after improving print strings and actual patch files).
I think that'll be the best way forward, thanks!
The list to send the patches is gcc-patches@, according to this page: https://gcc.gnu.org/lists.html
Also, make sure you have enough tests. Usually, you can find tests for similar features and replicate them for yours.
cheers, --renato
Hi,
Renato Golin renato.golin@linaro.org writes:
On 29 August 2017 at 10:15, Masaki Arai masaki.arai@linaro.org wrote:
If there is no objection, I will consider proposing this function(and alternative appraoch) on the GCC mailing listgcc@gcc.gnu.org (after improving print strings and actual patch files).
I think that'll be the best way forward, thanks!
The list to send the patches is gcc-patches@, according to this page: https://gcc.gnu.org/lists.html
Also, make sure you have enough tests. Usually, you can find tests for similar features and replicate them for yours.
Thank you very much for your advice.
There are many examples in the archive of the mailing list(gcc and gcc-patches), so I will prepare a proposal using them as a sample. In addition to modifying some source codes, I think that it is necessary to check about test programs related to `-fverbose-asm' under `gcc/testsuite'.
Best regards, -- -------------------------------------- Masaki Arai
On 29 August 2017 at 17:36, Masaki Arai masaki.arai@linaro.org wrote:
Hi,
Renato Golin renato.golin@linaro.org writes:
On 29 August 2017 at 10:15, Masaki Arai masaki.arai@linaro.org wrote:
If there is no objection, I will consider proposing this function(and alternative appraoch) on the GCC mailing listgcc@gcc.gnu.org (after improving print strings and actual patch files).
I think that'll be the best way forward, thanks!
The list to send the patches is gcc-patches@, according to this page: https://gcc.gnu.org/lists.html
Also, make sure you have enough tests. Usually, you can find tests for similar features and replicate them for yours.
Thank you very much for your advice.
There are many examples in the archive of the mailing list(gcc and gcc-patches), so I will prepare a proposal using them as a sample. In addition to modifying some source codes, I think that it is necessary to check about test programs related to `-fverbose-asm' under `gcc/testsuite'.
Yes, GCC tests are located under gcc/testsuite. There you can grep for verbose-asm, and see if there is something helpful.
Thanks,
Christophe
Best regards,
Masaki Arai _______________________________________________ linaro-toolchain mailing list linaro-toolchain@lists.linaro.org https://lists.linaro.org/mailman/listinfo/linaro-toolchain
Sorry for the delayed response.
Masaki Arai masaki.arai@linaro.org writes:
Hi,
Thank you very much for your quick check and reply.
Kugan Vivekanandarajah kugan.vivekanandarajah@linaro.org writes:
I looked into the structure, adding this field is not going to make the
s=
tructure bigger for either ILP32 or LP64 targets. If you want, you use
bit=
-fields; there is one bool already there which means you can fit 8 bits
in =
the same area as currently taken up by that one.
Yes. I should have checked the mem_attrs structure. This does have at least a byte left unlike some other tightly packed structures (gimple and some tree structures in gcc).
Even though memory usage does not increase, I understand the policy of wanting to make the data structure simple.
FWIW, I think adding a field for this should be fine. It's very much in the spirit of ORIGINAL_REGNO, which also exists to record the effects of register allocation and reloading.
Perhaps one question is how the flag should interact with mem_attrs_eq_p, but I suppose in practice, if everything else about two mem_attrs is the same, the spill flag should be too, and so there should be no need to check the spill flag explicitly.
An alternative to adding a new flag might be to check:
MEM_EXPR (mem) == get_spill_slot_decl (false)
But this would bake in the assumption that everything we want to mark as a spill slot will use set_mem_attrs_for_spill, whereas the flag would allow other MEMs to be marked as spill slots too.
Thanks, Richard
Hi,
Thank you very much, everyone, who gave me useful advice. I might conclude that extension of GCC for HCQC is completely unnecessary.
Richard Sandiford richard.sandiford@linaro.org writes:
An alternative to adding a new flag might be to check:
MEM_EXPR (mem) == get_spill_slot_decl (false)
I think that this method is far better than introducing the flag `for_spill_p'! For this approach, the additional code can only be the following 6 lines in final.c.
+ + if (MEM_P (op) && MEM_EXPR (op) && MEM_EXPR (op) == get_spill_slot_decl (false)) + { + HOST_WIDE_INT size = MEM_SIZE (op); + fprintf (asm_out_file, " (" HOST_WIDE_INT_PRINT_DEC "-byte spill slot)", size); + }
I implemented this, and it seems to work fine both on AArch64
ldr x0, [x29, 192] // tmp1241, %sfp (8-byte spill slot) str w0, [x29, 204] // ivtmp_805, %sfp (4-byte spill slot)
and on x86_64
movsd 8(%rsp), %xmm1 # %sfp (8-byte spill slot), prephitmp_811 movsd %xmm0, 16(%rsp) #, %sfp (8-byte spill slot)
The important point here is that "spill slot" is always displayed with "%sfp" attached. This is because the object returned by `get_spill_slot_decl' always points to the following decl tree `d'.
d = build_decl (DECL_SOURCE_LOCATION (current_function_decl), VAR_DECL, get_identifier ("%sfp"), void_type_node);
This means that "%sfp" is "spill slot" on all target machines. I checked the source code of GCC, and it seems that only this part of code uses "%sfp". This "%sfp" does not express data size, but HCQC can get data size information from assembly codes in the same line with "%sfp". Therefore, the conclusion is that it is enough for HCQC to detect "%sfp" without any extension of GCC.
But this would bake in the assumption that everything we want to mark as a spill slot will use set_mem_attrs_for_spill, whereas the flag would allow other MEMs to be marked as spill slots too.
I guess that the effect of the flag can also be realized with the following code:
MEM_EXPR (mem) = get_spill_slot_decl (false);
instead of
MEM_ATTRS (mem).for_spill_p = true;
The only case where `for_spill_p' seems to be useful is where some optimizations combine a memory area V on the stack for a local variable and a memory area S for the spill code together and make it the V only using the result of live range analysis for both memory areas. In that case, the code
MEM_ATTRS (V).for_spill_p = true;
holds the original memory expression with this additional flag. However, it may be a matter of opinion whether this memory region should be considered spill areas or not.
Best regards, -- -------------------------------------- Masaki Arai
linaro-toolchain@lists.linaro.org