On Wed, Nov 23, 2011 at 04:00:58PM +1300, Michael Hope wrote:
On Tue, Nov 22, 2011 at 12:10 AM, Dave Martin dave.martin@linaro.org wrote:
Defining a macro seems to eat up about half a megabyte of memory, due to the excessively large size of the macro arguments hash table (large enough for 65537 entries by default).
Hi Dave. Just bikeshedding responses from me, but that's better than nothing.
Thanks for the comments
As a result, gas memory consumption becomes rather huge when a modestly large number (hundreds) of macros are defined.
In fact, it appears to make little sense to force the size of the macro argument hash tables to be the same as the size of the other hash tables, since the typical number of macro arguments is very small compared with the number of opcodes, macros or symbols, and because the number of macros is not limited to a small constant.
This patch uses a smaller default size for the macro argument hash tables (43), which should be enough for virtually everyone. hash_new_sized () is exported from hash.c in order to allow hash tables of multiple sizes to be created.
What's the largest number of arguments you've seen in real code? Does recursion or nesting of macros ever blow out this number?
The power of the gas macro processor is not widely appreciated, so far as I can tell. Except for experimental things I've written myself, anything beyond, say, 20 arguments, or any use of recursion, appears to be exceptionally rare. The largest number of arguments I can find in the Linux kernel is 16. I can't find a single example of a recursive macro (though this is tricky to search for, so I may have missed some).
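For a sense of scale on the argument counts, the sort of macro you actually see day to day looks something like this (a contrived example of my own in ARM syntax, not taken from the kernel):

    @ Clear and then set a bit-field in a register: four arguments is
    @ already on the large side for hand-written code.
    .macro  set_field reg, mask, shift, value
            bic     \reg, \reg, #(\mask << \shift)
            orr     \reg, \reg, #(\value << \shift)
    .endm

            set_field r0, 0xf, 8, 0x3   @ bic r0, r0, #0xf00; orr r0, r0, #0x300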
Parameterised definition of macros from within other macros (which is where a lot of the real power, and the scope for having really large numbers of arguments, comes from) is something I've never seen an example of in the wild.
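Just to make concrete what I mean, here's a made-up sketch (ARM syntax again, and based on my reading of the substitution rules rather than anything seen in real code): an outer macro defines a family of inner macros, and the inner macro's formal survives because it isn't an argument of the outer one:

    @ Hypothetical example: generate an "increment by <step>" macro for
    @ each step size of interest.  \name and \step are substituted when
    @ def_inc is expanded; \reg is not a formal of def_inc, so it passes
    @ through untouched and becomes a formal of the generated macro.
    .macro  def_inc name, step
    .macro  \name reg
            add     \reg, \reg, #\step
    .endm
    .endm

            def_inc inc4, 4     @ defines a macro called inc4
            inc4    r0          @ expands to: add r0, r0, #4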
The fact that it's impossible to split a statement over multiple lines (?) also serves as a rather effective deterrent against having very large numbers of arguments.
Auto-generated source, or macro recursion, could result in the definition of large numbers of macros, and/or macros with essentially unlimited numbers of arguments. But such usage is far, far away from the common case, so far as I can tell.
Note that proliferation of macro arguments is not implied by the use of recursion. Most sane uses of recursion that I can think of only use/generate macros with a fairly conventional number of arguments.
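For instance (another made-up sketch, not something found in the wild), a recursive macro typically only carries a couple of formals, however deep the recursion goes:

    @ Emit \count copies of the word \val.  Two formals, no matter how
    @ many times the macro ends up expanding itself.
    .macro  fill_words val, count
    .if     \count
            .long   \val
            fill_words \val, (\count-1)
    .endif
    .endm

            fill_words 0xdeadbeef, 8    @ emits eight words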
A new argument, --macro-args, is added to allow this number to be customised.
I don't like this. The default should be high enough that no one ever needs the argument.
Well, yes -- I sort of agree, although the same argument applies to --hash-size. You need a really huge compilation unit for that actually to be useful, and because most people seem not to use gcc -pipe, gas is usually passed a file whose size could provide the basis for a good guess about the appropriate size for the hash tables. gcc -combine could also produce huge compilation units -- that's a more plausible way to end up with one -- but I haven't seen much use of it either.
Now you mention it, though: since we only have to examine a single line of assembler source to count the arguments for a macro being defined, we may simply be able to create a hash table of the correct size rather easily.
That would certainly be better than having a command-line switch, but we still have the problem of choosing a good hash-table size. Unfortunately, there is no exposed "create a hash table of a sensible size close to this" function. libiberty would seem the obvious place for that, but every part of binutils appears to roll its own hash table implementation.
These numbers are deliberately conservative -- they could perhaps be reduced further.
Caveats:
* I'm not a hash table expert, so the exact size values I've chosen may not be optimal.
Does it matter? Is the hash table checked often, or is it just used as a convenient name/value container?
Lookups will be done in the hash every time an argument is substituted during macro expansion.
So, both -- but if you make heavy use of macros, performance could be significantly impacted if the choice of hash table size is poor.
For most common usage, I suspect it isn't going to matter much, though, so long as the hash table size isn't something pathologically bad like a power of 2.
Really, the knowledge about good choice of hash table sizes is not local to this problem -- it should exist in one place and get reused wherever appropriate. Do you have any views on where that should go?
Cheers
---Dave