Hello Michael,
We do have more and more instances of the following issues turning up in the kernel requiring toolchain assistance to solve the problem properly. Could you or someone from your team follow this up please?
---------- Forwarded message ----------
Date: Tue, 1 Feb 2011 12:16:48 +0000
From: Dave Martin dave.martin@linaro.org
To: binutils@sourceware.org
Cc: linaro-toolchain linaro-toolchain@lists.linaro.org
Subject: Generating ancilliary sections with gas
Hi all,
Every now and again I come across a situation where it would be really useful to be able to query the assembler state during assembly: for example, to query and do something based on the current section name. This makes it possible to write generic macros to do certain things which otherwise require manual maintenance, or complex and fragile external preprocessing.
Below, I give a real-world example of the problem, and sketch out a possible solution.
What do people think of this approach? Does anyone have any better ideas on how to solve this?
Cheers ---Dave
EXAMPLE
An example is the generation of custom ancillary sections. Suppose you want to write macros which record fixup information. Currently, there's no way to put each fixup in an appropriately named section automatically within gas. Tellingly, gas has had to grow the ability to do this internally at least for ARM, since the exception handling information in .ARM.ex{idx,tab}* must go in sections with names based on the associated section name. However, this ancillary section generation support is neither flexible nor exposed to the user.
By putting fixups in sections whose names are based on the name of the section they refer to, selective link-time discard of the fixups (and hence the code referenced by the fixups) will work; otherwise it doesn't. This would help avoid a situation where we have to keep dead code in the kernel because custom fixups are applied to it: at run-time, the code gets fixed up, then is thrown away. The fixups can't be selectively discarded because they are all in the same section: we seem to have no good way to separate them out into separate sections appropriately.
For context, see: http://www.spinics.net/lists/arm-kernel/msg112268.html
PROPOSAL
To solve the problem of generating custom ancillary sections during assembly, here's a simple proposal: introducing a new kind of macro argument can make aspects of the assembler state available to macros in a flexible way, with only minimal implementation required.
Basically, the macro qualifier field could be used to identify arguments which are filled in by the assembler with information about the assembly state, rather than being filled in by the invoker of the macro: e.g.:
.macro mymacro name:req, flags, secname:current_section
        /* ... */
        .pushsection "\secname\name", "\flags"
        /* ... */
        .popsection
.endm
/* ... */
mymacro .ancillary, "a"
During expansion, \name and \flags are expanded as normal. But \secname is substituted instead with the current section name, so the macro expansion would look like this:
/* ... */
.pushsection ".text.ancillary", "a"
/* ... */
.popsection
Without the special :current_section argument, it doesn't appear possible to implement a macro such as mymacro in a generic way.
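To make the substitution rule concrete, here is a minimal stand-alone sketch in C of how a :current_section formal could be resolved during expansion. It is only a model for illustration, not gas code: struct formal, enum formal_kind and expand_body are hypothetical names, and the real expansion logic in gas/macro.c is considerably more involved.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a macro formal; not the gas formal_entry type. */
enum formal_kind { KIND_NORMAL, KIND_CURRENT_SECTION };

struct formal {
    const char *name;       /* parameter name without the leading backslash */
    const char *actual;     /* value supplied by the invoker (may be "") */
    enum formal_kind kind;
};

/* Expand every \name in body into out (capacity outsz).
   A KIND_CURRENT_SECTION formal ignores its actual value and yields
   current_section instead.  Returns 0 on success, -1 if out is too small. */
int expand_body(const char *body, const struct formal *formals,
                size_t nformals, const char *current_section,
                char *out, size_t outsz)
{
    size_t o = 0;

    if (outsz == 0)
        return -1;

    while (*body) {
        const char *text = NULL;
        size_t len = 1;

        if (*body == '\\') {
            /* Measure the identifier that follows the backslash. */
            size_t idlen = 0;
            while (isalnum((unsigned char)body[1 + idlen])
                   || body[1 + idlen] == '_')
                idlen++;

            for (size_t i = 0; i < nformals && idlen > 0; i++) {
                if (strlen(formals[i].name) == idlen
                    && strncmp(formals[i].name, body + 1, idlen) == 0) {
                    text = formals[i].kind == KIND_CURRENT_SECTION
                               ? current_section
                               : formals[i].actual;
                    body += 1 + idlen;   /* skip the backslash and the name */
                    break;
                }
            }
        }

        if (text == NULL) {
            /* Ordinary character, or an unknown \name: copy it through. */
            text = body;
            body++;
        } else {
            len = strlen(text);
        }

        if (o + len >= outsz)
            return -1;
        memcpy(out + o, text, len);
        o += len;
    }

    out[o] = '\0';
    return 0;
}
```

With formals name=".ancillary", flags="a" and a KIND_CURRENT_SECTION formal called secname, expanding the .pushsection line of mymacro while the current section is ".text" produces the .pushsection ".text.ancillary", "a" line shown above, whatever the invoker passed for secname.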
This surely isn't the only way to achieve the goal, and it's probably not the best way, but it does have some desirable features.
Principally, while a new pseudo-op(s) could have been defined to append text to the current section name, etc., allowing the current section name to appear as a macro parameter avoids prejudicing the way the text is used. So there should never be a need to introduce additional pseudo-ops to do things with the current section name: with this patch, the user can always implement their own macro to do the desired thing. This gets the desired behaviour and maximum flexibility, while keeping the implementation in gas very simple.
Also, using the macro expansion system in this way allows the caller a free choice of macro parameter names, and so pretty much guarantees that existing code won't get broken by the change.
Because my hack is currently simplistic, it has shortcomings: in particular, it's not desirable to parse an argument from the invocation line at all to fill a :current_section argument. Currently, an argument is read in if present, but its value is ignored and the current section name pasted in at macro expansion time instead. However, that should be straightforward to fix with a bit more code.
Of course, there's no reason only to expose the current section name in this way. Any aspect of the assembler state (current subsection, current section flags, current instruction set, current macro mode, etc.) could be made available in a similar way.
USAGE EXAMPLE AND PATCH
Note that the specific implementation described here is intended to be illustrative, rather than complete or final.
binutils$ cat <<EOF >tst.s
.macro push_ancillary_section name:req, flags, csec:current_section
        .pushsection "\name\csec", "\flags"
.endm

.macro register_fixup
        _register_fixup 100\@
.endm

.macro _register_fixup label:req
\label :
        push_ancillary_section .fixup, "a"
        .long \label\()b
        .popsection
.endm

.long 1
register_fixup
.long 2

.data
.long 3
register_fixup
.long 4
.long 5
register_fixup
.long 6
EOF
binutils$ gas/as-new -ahlms -o tst.o tst.s
ARM GAS  tst.s                  page 1

   1                .macro push_ancillary_section name:req, flags, csec:current_section
   2                        .pushsection "\name\csec", "\flags"
   3                .endm
   4
   5                .macro register_fixup
   6                        _register_fixup 100\@
   7                .endm
   8
   9                .macro _register_fixup label:req
  10                \label :
  11                        push_ancillary_section .fixup, "a"
  12                        .long \label\()b
  13                        .popsection
  14                .endm
  15
  16 0000 01000000  .long 1
  17                register_fixup
  17                > _register_fixup 1000
  17                >> 1000:
  17                >> push_ancillary_section .fixup,"a"
  17                >>> .pushsection ".fixup.text","a"
  17 0000 04000000  >> .long 1000b
  17                >> .popsection
  18 0004 02000000  .long 2
  19
  20                .data
  21 0000 03000000  .long 3
  22                register_fixup
  22                > _register_fixup 1003
  22                >> 1003:
  22                >> push_ancillary_section .fixup,"a"
  22                >>> .pushsection ".fixup.data","a"
  22 0000 04000000  >> .long 1003b
  22                >> .popsection
  23 0004 04000000  .long 4
  24 0008 05000000  .long 5
  25                register_fixup
  25                > _register_fixup 1006
  25                >> 1006:
  25                >> push_ancillary_section .fixup,"a"
  25                >>> .pushsection ".fixup.data","a"
  25 0004 0C000000  >> .long 1006b
  25                >> .popsection
  26 000c 06000000  .long 6
ARM GAS  tst.s                  page 2
NO DEFINED SYMBOLS
NO UNDEFINED SYMBOLS
binutils$ arm-linux-gnueabi-objdump -rs tst.o
tst.o: file format elf32-littlearm
RELOCATION RECORDS FOR [.fixup.text]:
OFFSET   TYPE              VALUE
00000000 R_ARM_ABS32       .text

RELOCATION RECORDS FOR [.fixup.data]:
OFFSET   TYPE              VALUE
00000000 R_ARM_ABS32       .data
00000004 R_ARM_ABS32       .data

Contents of section .text:
 0000 01000000 02000000                    ........
Contents of section .data:
 0000 03000000 04000000 05000000 06000000  ................
Contents of section .fixup.text:
 0000 04000000                             ....
Contents of section .fixup.data:
 0000 04000000 0c000000                    ........
Contents of section .ARM.attributes:
 0000 41150000 00616561 62690001 0b000000  A....aeabi......
 0010 08010901 2c01                        ....,.
diff --git a/gas/macro.c b/gas/macro.c
index e392883..95c4de1 100644
--- a/gas/macro.c
+++ b/gas/macro.c
@@ -516,6 +516,8 @@ do_formals (macro_entry *macro, int idx, sb *in)
             formal->type = FORMAL_REQUIRED;
           else if (strcmp (qual.ptr, "vararg") == 0)
             formal->type = FORMAL_VARARG;
+          else if (strcmp (qual.ptr, "current_section") == 0)
+            formal->type = FORMAL_CURRENT_SECTION;
           else
             as_bad_where (macro->file,
                           macro->line,
@@ -540,6 +542,15 @@ do_formals (macro_entry *macro, int idx, sb *in)
                            name,
                            macro->name);
             }
+          else if (formal->type == FORMAL_CURRENT_SECTION)
+            {
+              sb_reset (&formal->def);
+              as_warn_where (macro->file,
+                             macro->line,
+                             _("Pointless default value for current_section parameter `%s' in macro `%s'"),
+                             name,
+                             macro->name);
+            }
         }
 
       /* Add to macro's hash table. */
@@ -734,7 +745,11 @@ sub_actual (int start, sb *in, sb *t, struct hash_control *formal_hash,
   ptr = (formal_entry *) hash_find (formal_hash, sb_terminate (t));
   if (ptr)
     {
-      if (ptr->actual.len)
+      if (ptr->type == FORMAL_CURRENT_SECTION)
+        {
+          sb_add_string (out, segment_name (now_seg));
+        }
+      else if (ptr->actual.len)
         {
           sb_add_sb (out, &ptr->actual);
         }
diff --git a/gas/macro.h b/gas/macro.h
index edc1b6b..ea6cabb 100644
--- a/gas/macro.h
+++ b/gas/macro.h
@@ -38,7 +38,8 @@ enum formal_type
 {
   FORMAL_OPTIONAL,
   FORMAL_REQUIRED,
-  FORMAL_VARARG
+  FORMAL_VARARG,
+  FORMAL_CURRENT_SECTION,
 };
 
 /* Describe the formal arguments to a macro. */
_______________________________________________ linaro-toolchain mailing list linaro-toolchain@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-toolchain
On Mon, Jun 13, 2011 at 10:03 PM, Nicolas Pitre nicolas.pitre@linaro.org wrote:
Hello Michael,
We do have more and more instances of the following issues turning up in the kernel requiring toolchain assistance to solve the problem properly. Could you or someone from your team follow this up please?
Just a bit of extra context here:
The problem my patch intends to solve is just one instance of a general class of problems: in general, it's not possible to query, save or restore aspects of the assembler state during assembly.
gas supports a few special-purpose state management features such as .pushsection/.popsection, but for most aspects of the assembler state there are no such facilities.
Things we'd like to achieve are:
 * Creating sections with names based on that of the current section
 * Temporarily overriding .arch (for example, to build in CPU-specific optimisations in a file built for a more generic architecture).
Hacks like the one I suggest could provide a way to solve problems such as these. However, I don't know much about gas internals, so there may be a much cleaner way of achieving these things than the method I propose.
Cheers ---Dave
On Tue, Jun 14, 2011 at 9:03 AM, Nicolas Pitre nicolas.pitre@linaro.org wrote:
Hello Michael,
We do have more and more instances of the following issues turning up in the kernel requiring toolchain assistance to solve the problem properly. Could you or someone from your team follow this up please?
Hi Nicholas. Sorry for the delay. We'll talk about this at today's meeting and I'll follow up to this thread.
-- Michael
On Mon, Jun 20, 2011 at 4:15 PM, Michael Hope michael.hope@linaro.org wrote:
Hi Nicholas. Sorry for the delay. We'll talk about this at today's meeting and I'll follow up to this thread.
Hi Nicholas. We had a talk about it at the meeting (unfortunately) two weeks ago. The original request can be split into two bits: finding the current section name, and changing the assembler state at runtime.
The latter is possible but a fair chunk of work. Per-section and per-symbol attributes are possible but not implemented. One idea would be to add a new directive such as '.permitted push neon' to allow NEON instructions from here on, and '.permitted pop' to restore the previous state.
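As a rough illustration of the push/pop idea, the following C sketch models a small stack of permitted feature sets. Nothing here exists in gas: the '.permitted' directive is only a suggestion in this thread, and struct permitted_state, the FEAT_* flags and the assumption that a push adds to the currently allowed set are all hypothetical.

```c
#include <stddef.h>

#define PERMITTED_DEPTH 8

/* Hypothetical feature flags; gas tracks CPU features differently. */
enum { FEAT_VFP = 1u << 0, FEAT_NEON = 1u << 1 };

struct permitted_state {
    unsigned current;                  /* features allowed right now */
    unsigned saved[PERMITTED_DEPTH];   /* states saved by earlier pushes */
    size_t depth;                      /* number of saved states */
};

/* Model of ".permitted push <feature>": remember the current set,
   then allow the extra feature as well.  Returns -1 if the stack is full. */
int permitted_push(struct permitted_state *s, unsigned extra)
{
    if (s->depth == PERMITTED_DEPTH)
        return -1;
    s->saved[s->depth++] = s->current;
    s->current |= extra;
    return 0;
}

/* Model of ".permitted pop": restore the most recently saved set.
   Returns -1 if there is nothing to pop. */
int permitted_pop(struct permitted_state *s)
{
    if (s->depth == 0)
        return -1;
    s->current = s->saved[--s->depth];
    return 0;
}
```

Saving the whole set on each push, rather than only the bit being added, means a pop restores exactly what was in force before, even if the pushed feature was already permitted.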
The section name work is trickier. I don't like Dave's suggestion of adding a new formal argument type which expands to the caller's section name. It works, but isn't really a formal type like 'required' or 'vararg'. An alternative is to add a new macro-local variable such as _sectionname.
Do you have a workaround? How much smaller would the kernel be with support for fetching the current section name?
The minutes are here: https://wiki.linaro.org/WorkingGroups/ToolChain/Meetings/2011-06-20
-- Michael
On Mon, 4 Jul 2011, Michael Hope wrote:
Hi Nicholas. We had a talk about it at the meeting (unfortunately) two weeks ago. The original request can be split into two bits: finding the current section name, and changing the assembler state at runtime.
The latter is possible but a fair chunk of work. Per-section and per-symbol attributes are possible but not implemented. One idea would be to add a new directive such as '.permitted push neon' to allow NEON instructions from here on, and '.permitted pop' to restore the previous state.
Sure, however that won't solve the problem at hand. From reading the minutes I have the impression two issues might be mixed up together.
The section name work is trickier. I don't like Dave's suggestion of adding a new formal argument type which expands to the caller's section name. It works, but isn't really a formal type like 'required' or 'vararg'. An alternative is to add a new macro-local variable such as _sectionname.
That would certainly be perfectly fine.
Do you have a workaround? How much smaller would the kernel be with support for fetching the current section name?
We do have a workaround, which consists of pulling all the referenced code into the kernel and discarding it at run time instead of simply discarding it at link time. But this feels really awkward because the toolchain is really unhelpful here. The size cost to the kernel is not that huge, considering that we're talking about code marked as __exit, i.e. module removal cleanup code, which is obviously not needed when modules are linked into the kernel. But as more techniques for patching the kernel at run time appear, more of that previously link-time discarded code has to be moved to the __init section, which could be seen as a regression.
Nicolas
On Mon, Jul 4, 2011 at 2:45 AM, Nicolas Pitre nicolas.pitre@linaro.org wrote:
On Mon, 4 Jul 2011, Michael Hope wrote:
Hi Nicholas. We had a talk about it at the meeting (unfortunately) two weeks ago. The original request can be split into two bits: finding the current section name, and changing the assembler state at runtime.
The latter is possible but a fair chunk of work. Per-section and per-symbol attributes are possible but not implemented. One idea would be to add a new directive such as '.permitted push neon' to allow NEON instructions from here on, and '.permitted pop' to restore the previous state.
Sure, however that won't solve the problem at hand. From reading the minutes I have the impression two issues might be mixed up together.
The section name work is trickier. I don't like Dave's suggestion of adding a new formal argument type which expands to the caller's section
Well, it was a complete hack :)
name. It works, but isn't really a formal type like 'required' or 'vararg'. An alternative is to add a new macro-local variable such as _sectionname.
That would certainly be perfectly fine.
Are we worried about polluting the macro parameter namespace? One intentional feature of my suggestion is that it cannot affect the behaviour of any existing code, albeit by a rather cumbersome means.
Another thing that concerns me is the suggestion of various different special-case tweaks to implement various instances of a general area of functionality.
In principle, it could be more efficient and robust to address the general case, i.e. how to save, restore and query an appropriate subset of the assembler state at run-time.
(Interestingly, troff -- which bears a surprisingly uncanny resemblance to an assembler -- seems to have better generic capabilities in this area despite its ancient lineage. Its environment and state variable concepts seem close to what seems to be needed here: environments allow clean restoration of state after temporary changes, and variables are available reflecting many aspects of the program's state at run-time. I'm not saying it's easy to flip a switch and integrate such capabilities into gas, but it's a more instructive comparison than you might initially think.)
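To sketch what such a general facility might look like, here is a small C model of an environment that can be saved, restored and queried by variable name. It is only loosely inspired by the troff comparison and is not based on gas internals: struct asm_env, struct asm_state, the variable names "section" and "arch", and all three functions are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define ENV_DEPTH 4

/* Hypothetical snapshot of the state a macro might want to query. */
struct asm_env {
    const char *section;   /* e.g. ".text" */
    const char *arch;      /* e.g. "armv6" */
};

struct asm_state {
    struct asm_env cur;                /* state currently in force */
    struct asm_env stack[ENV_DEPTH];   /* saved environments */
    size_t depth;
};

/* Save the whole environment so temporary changes can be undone later. */
int env_push(struct asm_state *s)
{
    if (s->depth == ENV_DEPTH)
        return -1;
    s->stack[s->depth++] = s->cur;
    return 0;
}

/* Restore the most recently saved environment. */
int env_pop(struct asm_state *s)
{
    if (s->depth == 0)
        return -1;
    s->cur = s->stack[--s->depth];
    return 0;
}

/* Read-only query by variable name; returns NULL for unknown names. */
const char *env_query(const struct asm_state *s, const char *var)
{
    if (strcmp(var, "section") == 0)
        return s->cur.section;
    if (strcmp(var, "arch") == 0)
        return s->cur.arch;
    return NULL;
}
```

In this model both use cases from the thread become instances of one mechanism: a macro reads "section" to build an ancillary section name, and a temporary .arch override is a push, a change to cur.arch, and a pop.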
Do you have a workaround? How much smaller would the kernel be with support for fetching the current section name?
We do have a workaround, which consists of pulling all the referenced code into the kernel and discarding it at run time instead of simply discarding it at link time. But this feels really awkward because the toolchain is really unhelpful here. The size cost to the kernel is not that huge, considering that we're talking about code marked as __exit, i.e. module removal cleanup code, which is obviously not needed when modules are linked into the kernel. But as more techniques for patching the kernel at run time appear, more of that previously link-time discarded code has to be moved to the __init section, which could be seen as a regression.
I guess my main concern at this point is that this generates fragmentary mess in the kernel _source_, effectively breaking the kernel's own linking rules to deal with these special cases. Once there is mess, maintenance tends to make the mess worse over time even if no functionality is added. It would be much nicer if we could avoid the mess in the first place...
Cheers ---Dave