If you don't understand what NP means, read my statement again. Also, I provided URLs which may explain this issue.
I'm not a compiler writer. In Finland it is very rare that one gets such a job. At Nokia it might be possible, but whatever. You simply don't seem to understand. That is not my problem.
In general, I'm just interested in implementing a modern, aggressively optimizing (and re-targetable) compiler, and I chose Pascal from the
70's. It has pretty simple syntax and semantics. Also, because of my background in the demo scene, I find it interesting to create a compiler that would finally schedule instructions for the MC68060 (gcc doesn't do that, and neither does vbcc).
Well, that is the common way to do that. But if I want to keep the "middle-end" free of target specific details?
Ok. I have to confess that I have never been on that freq. On the Amiga, OTOH, everything was synched to the 50Hz screen. But that is another issue again, which isn't really related to this group.
So does the compiler when it decides to turn your switch-case
statement into a table? I don't quite understand what you mean by increasing the source size. Normally, when one applies the State pattern, the source gets quite small. This pattern can be implemented in C, even though it is a PITA.
Now, you are being a troll. As it has been stated _several_ times in this thread (which was created by a troll), you need to do this only with very bad compilers.
A hint: Static Single Assignment (SSA)
The first sentence is quite nice. I have seen (or been forced) to refactor C++ code which did not use the STL. Instead, some wise guy did his own implementation of string and hashtable. Which is nice.
Indeed: The usual approach is to turn a sparse tree into a hard coded binary search for the upper part of the tree, and then a linear search for the lower part. The point where you switch depends on the cost of mispredicted branches. In the binary tree part, about half the branches will mispredict (not counting the branch-eq leg of the common compare / branch-gt / branch-eq construction), in the sequential part, only one (the one that's actually taken) will mispredict. Each extra layer of the binary search adds a compare, and half each of a predicted and mispredicted branch, but halves the number of compares and correctly predicted branches in the sequential part.
And of course switches with both sparse and dense sections often generate code that does computed branches in the dense parts of the tree.
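The hybrid scheme described above can be written out by hand in source form. This is a sketch only; the case values {3, 17, 40, 99, 250, 1000} and return values are made up for illustration:

```cpp
#include <cassert>

// Hand-written equivalent of the binary-then-linear lowering described
// above: one binary-search layer splits the case set, and each half is
// then searched linearly with compare/branch pairs.
int dispatch(int key) {
    if (key >= 99) {
        // linear search over the upper half
        if (key == 99)   return 4;
        if (key == 250)  return 5;
        if (key == 1000) return 6;
    } else {
        // linear search over the lower half
        if (key == 3)  return 1;
        if (key == 17) return 2;
        if (key == 40) return 3;
    }
    return 0;  // default case
}
```

Each extra binary layer would add one compare up front while halving the length of the linear runs, which is exactly the trade-off against misprediction cost mentioned above.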
No, but I'm planning to be more famous so people stop asking that question.
The word contiguous doesn't imply linearly incrementing, it means no gaps. Compilers always sort case values so the order is irrelevant. If your case labels are not dense (ie. have large gaps) then you're unlikely to get a branch table.
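For a dense case set, the branch table amounts to a single range check followed by an indexed load. A sketch in source form, with made-up case values 10..14 (note they are contiguous, with no gaps):

```cpp
#include <cassert>

// What a branch table reduces to: bias the key to zero, do one unsigned
// compare that covers both bounds at once, then index into a table
// instead of walking a chain of compares.
int classify(int key) {
    static const int table[] = {100, 101, 102, 103, 104}; // results for 10..14
    unsigned idx = static_cast<unsigned>(key - 10);       // bias to zero
    if (idx < 5)               // out-of-range keys wrap to large values
        return table[idx];
    return -1;                 // default case
}
```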
Now you're trolling. SSA has absolutely nothing to do with switches... FYI, few compilers use SSA today and the ones that do only use it in a limited set of optimizations.
C++ and STL is a different matter altogether. I would not advocate STL in programs that need to be efficient, use as little memory as possible and be small (ie. most embedded systems and many big desktop applications). It's great for fast development or if you simply don't care about memory use or performance.
STL often uses too much resources even for desktop systems, so many performance critical applications avoid it. I've seen cases where STL string concatenation slowed down command-line parsing so much that command-line parsing it became the most performance critical part of a compiler...
Many performance critical programs use their own memory allocation strategies as new/malloc are not good enough. So in specific situations it is a good idea to create your own implementation - but you'd have to have a really good reason and do significantly better than existing solutions.
The code generated for i/4 goes like this (for 32-bit int):
res = (i + ((unsigned)(i >> 31) >> 30)) >> 2;
ASR cannot be used on its own for division by powers of 2, as it rounds towards minus infinity rather than towards zero, so an explicit rounding step is done to adjust the result when it is negative.
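The quoted sequence can be checked directly. A sketch, assuming arithmetic right shift of negative values (implementation-defined before C++20, but arithmetic on all mainstream compilers):

```cpp
#include <cassert>
#include <cstdint>

// i >> 31 is all-ones for negative i; reinterpreted as unsigned and
// shifted right by 30 it yields 3 (the divisor minus one), which is
// added before the shift so negative results round toward zero,
// matching C's division semantics.
int32_t div4(int32_t i) {
    return (i + static_cast<int32_t>(static_cast<uint32_t>(i >> 31) >> 30)) >> 2;
}
```

For example, a plain shift gives -7 >> 2 == -2 (rounding toward minus infinity), while div4(-7) == -1, agreeing with -7/4.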
I came across the following about writing efficient C for ARM processors:
1) Unsigned types are faster for division.
2) Do not use implicit/explicit narrowing casts, as they cost extra cycles.
3) Do not use char and short types for function arguments or return values. Use the int type even if the range of the parameter is smaller, so the compiler does not have to perform unnecessary casts.
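Point 3 can be illustrated in source form. A sketch with invented function names; the idea is that a narrow parameter or return type can force the compiler to emit extra extend instructions (UXTB/SXTB/UXTH/SXTH on ARM) to re-normalize the value:

```cpp
#include <cassert>

// Narrow types: the result must be re-narrowed after every operation,
// and the callee may need to sign/zero-extend the incoming arguments.
short sum_narrow(short a, short b) {
    return static_cast<short>(a + b);
}

// int-width types: values stay in full registers, no extend needed.
int sum_wide(int a, int b) {
    return a + b;
}
```

Both return the same value for in-range inputs; the difference shows up only in the generated code.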
comp.compilers is not easy to post to so many of us don't bother.
As I said, you query the backend. The intermediate code is usually very close to the final code, so you can ask questions like: "how many cycles/bytes is this compare going to take?". A good compiler asks the backend similar questions all the time, so that most target-specific decisions are made well before the IR enters the backend.
However, in this case it is good enough to ask at how many case labels one should switch from binary to linear search for the remaining cases.
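A sketch of the kind of query interface described above. All names and cycle counts here are invented for illustration; the point is only that the middle-end asks the backend for costs and bases its lowering decision on the answer:

```cpp
#include <cassert>

// Hypothetical backend cost interface the middle-end can query.
struct Backend {
    virtual ~Backend() = default;
    virtual int compareCycles() const = 0;        // cost of one compare+branch
    virtual int tableDispatchCycles() const = 0;  // cost of an indirect jump
};

// Target-independent decision: a linear chain mispredicts roughly half
// its compares on average, so prefer a jump table once the expected
// compare cost exceeds the fixed dispatch cost.
bool useJumpTable(const Backend& be, int numCases) {
    return numCases * be.compareCycles() / 2 > be.tableDispatchCycles();
}

// A made-up target with made-up timings, not real MC68060 numbers.
struct Mc68060Backend : Backend {
    int compareCycles() const override { return 2; }
    int tableDispatchCycles() const override { return 9; }
};
```

This keeps the middle-end free of target-specific details, as asked earlier in the thread: only the Backend implementation knows the actual timings.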
I wouldn't consider intermediate code to be "close to the target". Quite often it is a "three-address code" in SSA form. Your other statements apply, of course.
Actually he is right. NP-complete problems can be solved in reasonable time as long as the problem size stays small. They only become intractable as the size grows.
The fact is that compilers and humans can produce optimal code for small functions (have you heard of superoptimizers?). It only becomes impractical for large programs.
If you are interested in such a job and don't mind going abroad then I know several UK companies who are keen to hire.
Most targets use 3-address instructions or a subset thereof, so it is a good match. In a compiler I worked on the intermediate code could represent almost every feature of the target, including some of the more complex features like conditional execution and addressing modes. So even for CISC targets you'd be close to final code.
Actually the quality and performance of STL varies greatly between implementations. Some functions have algorithmic complexity guarantees, but even that doesn't necessarily result in good performance. If it guarantees an operation in O(log(N)) time but you need it in O(1) time you're stuck.
Just look at the size() operation - it takes O(N) time. Are you saying that it is not possible to do this in O(1) in a custom implementation?
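The point about size() can be sketched with a thin wrapper. This assumes the container in question is a linked list whose size() traverses the list (as std::list was allowed to do before C++11); the wrapper is illustrative only, not a full container:

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// A custom container can keep a running element count and answer
// size() in O(1), regardless of what the underlying list guarantees.
template <typename T>
class CountedList {
    std::list<T> items_;
    std::size_t count_ = 0;
public:
    void push_back(const T& v) { items_.push_back(v); ++count_; }
    void pop_front()           { items_.pop_front(); --count_; }
    std::size_t size() const   { return count_; }  // O(1), no traversal
};
```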
No, just the usual ~5 arguments. The compiler slowed down by about a factor of 2, and the startup time was noticeably slower. The command-line options were stored in a string hash (key->value) and needed various string concatenations to calculate the key and return the result. The STL version was a literal translation from C into STL C++. The slowdown was caused by doing lots of lookups during compilation based on the assumption that they were fast operations - which was true for the C implementation but not for STL.
At the time nobody would believe STL could be that slow, but I compared STL strings with C strings, and you do indeed get a 30-50 times slowdown when doing strcat and similar operations. The total slowdown of the lookup was initially around 400 times and after a week of hard work the author got it down to only 200 times slower... That was the point I decided to ban the use of STL even in non-performance critical code.
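The key-construction pattern described above can be sketched. The function names and key format here are invented; the point is that std::string concatenation allocates on every lookup, while a fixed stack buffer does not:

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

// Building a lookup key by concatenation: typically several heap
// allocations per call, which adds up when lookups are frequent.
std::string key_with_string(const char* group, const char* option) {
    return std::string(group) + "." + option;
}

// The C-style equivalent: format into a caller-supplied stack buffer,
// no heap traffic at all.
void key_with_buffer(char* out, std::size_t n,
                     const char* group, const char* option) {
    std::snprintf(out, n, "%s.%s", group, option);
}
```

No timing claims are made here; whether the gap matches the numbers above depends on the STL implementation and allocator in use.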