Converting Asm to C: Metrics?

G'day All,

I'm trying to estimate the effort involved in a project converting some well-structured and commented (not mine, needless to say) 8051 assembler to C. As a first estimate, naturally, I'm performing the task on a small part of the code and extrapolating. Does anyone have (or know of) any metrics from a similar project, as a sanity check? It doesn't need to be 8051 to C; any non-automated assembler-to-HLL conversion stats would be useful.
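
For what it's worth, the extrapolation itself is just a rate calculation; here's a trivial sketch in C with purely made-up placeholder numbers (none of these are real figures from our code base):

    #include <stdio.h>

    int main(void)
    {
        /* Placeholder numbers only -- substitute your own measurements. */
        double sample_ksloc = 2.0;   /* size of the converted sample, in KSLOC */
        double sample_hours = 45.0;  /* effort the sample conversion took */
        double total_ksloc  = 30.0;  /* size of the full asm code base */

        double rate = sample_hours / sample_ksloc;      /* hours per KSLOC */
        printf("rate: %.1f h/KSLOC, extrapolated total: %.0f hours\n",
               rate, rate * total_ksloc);
        return 0;
    }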

Thanks, Alf

Reply to
Alf Katz

Hello Alf,

One man's "well documented code" is another man's "just spaghetti code".

Whenever I am asked to convert assembly to C, I bid for a redesign with documentation up front.

The way most assembly for micros is written is horrid at best.

Then.....

"But we just want to add a little extra code to ...."

This is where it will fall on its face.

Good Luck, you will need it.

donald

Reply to
Donald

I've done this sort of thing many times. I usually wind up deriving a functional spec, and maybe an architecture, from the existing assembler and re-writing clean code in C.

Even with ideal assembler, the idioms are different. With less than ideal assembler, the idioms are just plain wrong ;). I can't give you metrics; some things are hard, some things are easy.
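
To make the idiom gap concrete, here's a made-up example (not from any real code base): the asm clears a buffer with the classic DJNZ count-down loop and flags completion with a bit in bit-addressable RAM, while the idiomatic C rewrite just states the intent.

    /* Hypothetical 8051 asm being replaced (shown as a comment):
     *
     *          MOV  R0, #buf       ; pointer in R0
     *          MOV  R7, #16        ; count in R7
     *   loop:  MOV  @R0, #0
     *          INC  R0
     *          DJNZ R7, loop       ; decrement-and-branch idiom
     *          SETB buf_empty      ; flag lives in bit-addressable RAM
     *
     * A word-for-word translation would keep the count-down and the
     * pointer juggling; the idiomatic C version says what was meant:
     */
    #include <string.h>

    #define BUF_LEN 16

    static unsigned char buf[BUF_LEN];
    static unsigned char buf_empty;   /* was a bit flag in bit-addressable RAM */

    static void clear_buf(void)
    {
        memset(buf, 0, sizeof buf);
        buf_empty = 1;
    }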

(FWIW, I often spot a bunch of bugs in the assembler in doing such conversion exercises. But I can generally extract what the coder *intended* to do, rather than what actually happens. IOW, it's not a bad way of doing a code review and a sanity check.)

Steve


Reply to
Steve at fivetrees

This can be very open-ended ( but you probably already know that ... )

Is this code moving to another platform, or staying on the C51 and getting a C makeover, to make new features easier to control?

If it is staying on the C51, you can quote in stages, and move the proven/stable asm to libraries where C can call it (i.e. don't rewrite what you don't have to...).
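
As a rough sketch of what that boundary can look like (all names hypothetical; the register conventions mentioned are Keil C51's, so check your own compiler's manual):

    /* New C code calling a proven asm routine kept in a library.
     * With Keil C51, the asm module would export the underscore-prefixed
     * symbol _crc8_update, take its char arguments in R7 and R5, and
     * return the result in R7 (assumption: Keil conventions; verify for
     * your toolchain). */

    unsigned char crc8_update(unsigned char crc, unsigned char c); /* in crc8.a51 */

    unsigned char frame_crc(const unsigned char *buf, unsigned char len)
    {
        unsigned char crc = 0;
        unsigned char i;

        for (i = 0; i < len; i++)
            crc = crc8_update(crc, buf[i]);  /* C calls straight into the old asm */
        return crc;
    }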

You will also be able to have testable sign-off stages, as this step should be an operational clone of the existing system.

Then, you can add all the new features in C...

-jg

Reply to
Jim Granville

Hello ,

You can collect the following metrics:

a) Compare the size of the code generated from the C with the size of the existing ASM code. Set a limit: say, the code generated from the C should be less than 1.5 to 2 times the size of the ASM code. (For example, if the current ASM image is 40 KB, a 1.5x ceiling means rejecting a C build much over 60 KB.)

b) Do a code coverage and profiling analysis for both the ASM and the C code.

May I know the exact reason for this conversion?

Best Regards, Vivekanandan M

Reply to
Vivekanandan M

I have found the metrics interesting in the cases where we have done this. Evolved code is rarely that clean, and compilers generally do a considerably better job at local variable allocation and placement than assembler programmers. Don't be surprised if the code generated from C results in shorter, faster applications with lower RAM requirements.

Of course I am biased.

w..

Reply to
Walter Banks

Fat chance

Maybe on a C-friendly machine, but not much chance on an 8051.

I'd be amazed if it does.

Agreed.

Reply to
cbarn24050

Was the assembly code originally written by someone who is also a skilled C programmer?

I'd guess somewhat longer than it would take to write the C program from scratch (given good documentation), because you have to extract the design, evaluate it, probably modify aspects of it, and then implement it.

Best regards, Spehro Pefhany

--
"it's the network..."                          "The Journey is the reward"
speff@interlog.com             Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog  Info for designers:  http://www.speff.com
Reply to
Spehro Pefhany

hehe, yes you are.

I've never had the experience of a C compiler reducing my code and data footprint or improving on the execution time. But then, I've only had one truly comparable experience, where I was paid to port an assembly-coded, full-up application that I'd also written, as exactly as I could manage, into C. The size expanded dramatically on all points, and I was seriously trying to write good, maintainable C and accurately reflect the details of operation. I'm not ignorant of library issues, nor of numerical methods, and I feel I applied a good degree of expertise in writing the C code.

Even in the case of small routines, where I'm forced to apply the model required by C for interfacing purposes (frames, stack-unwind support if appropriate, register preservation, etc.), I have yet to find any C compiler able to improve on execution -or- space.

Jon

Reply to
Jonathan Kirwan

Start from the right end by defining what the product does (I know this is obvious :-), perhaps as some sort of functional spec. Then map the existing code: the major functional blocks, the relationships between modules in terms of how data gets passed around the system, how many code banks are involved, and the relationships between const data, code and common-area dependencies, etc. If the code is well documented already, you have a head start, but the docs and code comments may not match current reality. You could spend weeks or even months trying to work out what the old code is doing. Then you need to check for bugs and subtle 'intended' side effects, etc. You have to assume that all the old code is suspect.

One major obstacle to analysing older systems is that there can be global data all over the place, and if the code has been heavily modified over the years, it can be a nightmare to determine which global gets used where and when. You will also have to modify, or write wrappers for, the asm modules that you decide to keep, to conform to the calling conventions of the C compiler.
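
For instance, a wrapper for a kept asm module might look like this (all names invented for illustration): the legacy routine communicates through globals, so the wrapper marshals arguments in and results out, giving the new C code a clean interface.

    /* Hypothetical legacy interface: the old asm reads/writes fixed globals. */
    extern volatile unsigned char adc_channel;  /* global the asm reads */
    extern volatile unsigned char adc_result;   /* global the asm writes */
    extern void adc_read_raw(void);             /* proven asm routine, unchanged */

    /* Clean C-facing wrapper around the unmodified assembler. */
    unsigned char adc_read(unsigned char channel)
    {
        adc_channel = channel;  /* marshal the argument into the expected global */
        adc_read_raw();         /* run the legacy code */
        return adc_result;      /* marshal the result back out */
    }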

IME, reuse of a worn-out code base ends up as a dog's dinner. It's usually quicker to rewrite the whole lot: sharp pencil, nice new clean code base, well structured, etc. You just have to convince the management :-)...

Chris

--
Greenfield Designs Ltd
-----------------------------------------------------------
Embedded Systems & Electronics: Research Design Development
Oxford. England. (44) 1865 750 681

Reply to
Chris Quayle

Completely agree. Nicely put.

But - the old codebase is useful in terms of analysing:
- What was intended
- The workarounds employed to get it to work as intended

Steve


Reply to
Steve at fivetrees

The old code is the only thing that is not suspect; it's everything else that's suspect (comments, low-level requirements, customer spec, tests, etc.). The only thing you can be sure accurately reflects the desired product performance and purpose is the old program code: it's the only thing that was guaranteed to be maintained, however poor the implementation.

Reply to
steve

Yepp, you're right Steve (not at fivetrees). In this case the code is the spec. It has been debugged over 10 years, is very stable, and is the basis of the company's existence. The code in fact defines what the product does; the code is the documentation, although the original coder's brain is available for consultation. The way to work out what the code is doing is to rewrite it in something intelligible, and we may as well make that the original version of the next-generation product.

Cheers, Alf

Reply to
Alf Katz

We have a product that has outgrown the 8051. Even the 33MHz Dallas single-cycle 8051s are no longer fast enough to do the job. Both program and data requirements have outgrown the current 512kB X/P memory-mapping scheme. I am examining and comparing two major solutions. One is building a better 8051 inside an FPGA, with a 3:1 improvement in speed and heaps of hardware speed-ups to critical tasks. The other is migrating to a faster processor (e.g. an ARM, but the actual processor is pretty irrelevant once we get to C). The latter has numerous other advantages, not least of which is the maintainability and expandability promised by the conversion to C.

The reason I was interested in metrics for the conversion process is that the major difference in the development cost of the two approaches is the need to recode to use the faster processor.

Cheers, Alf

Reply to
Alf Katz

Thanks, all, for your interesting replies. Unfortunately, no one seems to have collected metrics for the *effort* involved in translating assembler to C, which will be the main determinant of whether we proceed with this project. I have performed the task on a small (

Reply to
Alf Katz

"[..]

[..] One is building a better 8051 inside an FPGA with a 3:1 improvement in speed and heaps of hardware speed ups to critical tasks. [..] [..]"

8051s implemented in FPGAs are commercially available.

Reply to
Colin Paul Gloster

Yes, I deal with this situation all the time: the customer can't even tell you precisely what the product requirements are; only the legacy code can.

Reply to
steve

The core is (even the one Dallas bought), and I've found them to be a good starting point. They tend to be *too* 8051-compatible, without taking advantage of what the extra pins and other resources of the FPGA offer: stuff like a true Harvard architecture (separate P and X mem) with non-muxed address and data busses, single-write context switches, and the other hardware speed-ups alluded to. That's what makes the C conversion to a cheaper, faster MCU the riskier approach, and consequently why I'm trying to get a handle on the effort (man-hours/KSLOC) involved in the migration before committing to one approach or the other.

Cheers, Alf

Reply to
Alf Katz

Hold it a moment, please. Just so we're 101% perfectly clear about this: you've outgrown a 512 KiB super-8051 system writing all the code in *assembler*? I'll hand it to you, you must have had a team of brave people working on such a monster. I've maintained a 96 KiB code-size super-8051 project done entirely in assembly, and considered that to be sitting on the fence, leaning dangerously towards "mission impossible" territory.

I'd be careful with that expected speedup. Just ask yourself: if it were realistic to render that IP core Dallas bought at a 100 MHz clock frequency, what kept Dallas from doing that in their chips (keeping in mind they're going ASIC, so they should be faster, not slower, than an FPGA)? Would DalSemi really shoot itself in the foot just like that?

There is, theoretically at least, a third option, albeit a *thoroughly* nasty one: an 8051 machine-code interpreter, running on whatever CPU you can find that is fast enough to pull off that stunt.
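
Just to show the shape of the beast (a bare-bones sketch, decoding only two of the 256 opcodes; a real interpreter also needs the SFRs, the bit space, interrupts and cycle counting):

    #include <stdint.h>

    static uint8_t  code_mem[65536];  /* 8051 program memory image */
    static uint8_t  iram[256];        /* internal RAM */
    static uint16_t pc;
    static uint8_t  acc;

    static void step(void)            /* fetch-decode-execute one instruction */
    {
        uint8_t op = code_mem[pc++];
        switch (op) {
        case 0x74:                    /* MOV A,#imm */
            acc = code_mem[pc++];
            break;
        case 0xF5:                    /* MOV direct,A */
            iram[code_mem[pc++]] = acc;
            break;
        /* ... remaining opcodes, SFR decoding, interrupts, timing ... */
        default:
            break;                    /* unimplemented opcode: ignore in sketch */
        }
    }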
--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.
Reply to
Hans-Bernhard Broeker

For another data point, look at the new 'monster 80C51' devices with the Farcall C390 core: they claim 100 MIPS, with 512K Flash, and prices are apparently sub-$6.

I'd define the data sizes carefully; depending on the code/data split, you might already be above single-chip devices. There are very few 1-Mbyte ARM uCs, so this might push you into a microprocessor solution, which is a quite different animal.

It depends on the project, but another viable solution would be to split into two controllers: the stable low-level stuff stays in 80C51(s), and the new stuff goes into some 32-bit core. uCs these days are so cheap, they cost less than the packaging, or the cables, in many projects.

-jg

Reply to
Jim Granville
