8052 emulator in C

Does anybody have a simple 8052 emulator in C source? I'm trying to reverse engineer
some code and would like to emulate/simulate it to get a better
understanding, as it looks like it was written in C and compiled by a very bad
compiler.

joolz



--
--------------------------------- --- -- -
Posted with NewsLeecher v5.0 Beta 6
Re: 8052 emulator in C
Quoted text here. Click to load it

What is the target MCU?  The 51 family is huge (over 600 variants) and
whilst the cores are similar there are some big differences.

Why do you want the source of the simulator?

How do you know the binary was written in C?

How big is the binary?

What is it supposed to do?

--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills  Staffs  England     /\/\/\/\/
Re: 8052 emulator in C
In reply to "Chris H" who wrote the following:

Quoted text here. Click to load it
Analog Devices ADuC84x

Quoted text here. Click to load it
So I can add in a serial driver and the output display; you know, make the
simulator behave like the real thing, with inputs and outputs.

 
Quoted text here. Click to load it
I can tell from the way the code is written! Can't you tell the difference
between human- and machine-created code?

Quoted text here. Click to load it
64k but not all used

Quoted text here. Click to load it
Can't say.




--
--------------------------------- --- -- -
Posted with NewsLeecher v5.0 Beta 6
Re: 8052 emulator in C
Quoted text here. Click to load it

This is NOT a true 8051/52 core.  Read the documentation: it is "based
on" an 8052.  Not all 8051 simulators will handle non-standard
8051 parts like this one.

Quoted text here. Click to load it

Then use the Keil Simulator that can do this already.


Quoted text here. Click to load it

Yes... However you can not tell which HLL was used.

Quoted text here. Click to load it


Use the Keil Simulator.

--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills  Staffs  England     /\/\/\/\/
Re: 8052 emulator in C
In reply to "Chris H" who wrote the following:

Quoted text here. Click to load it

Hence the reason for wanting source, so I can modify it to take in the special
features of the chip and also the interfaces.

I've got ucsim running now under the command line and I am now optimising the code;
I need more speed and am adding in things like serial in/out and the interface
devices.
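
For readers following along, here is a minimal sketch of the kind of fetch/decode/execute core such a simulator is built around, with a hook on the serial buffer. It is not ucsim's code. The `Cpu8051` type, the `serial_out` hook and the `capture_serial` example are hypothetical names made up for illustration. Only a handful of standard 8051 opcodes are handled, flags are not modelled, and none of the ADuC84x extensions are covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SFR_SBUF 0x99   /* serial data buffer on a standard 8051 */
#define SFR_ACC  0xE0   /* the accumulator is mapped at SFR 0xE0 */

typedef struct {
    uint8_t  code[65536];              /* program memory */
    uint8_t  iram[256];                /* direct space: 0x00-0x7F RAM, 0x80-0xFF SFRs */
    uint16_t pc;
    void   (*serial_out)(uint8_t b);   /* hypothetical hook, called on writes to SBUF */
} Cpu8051;

/* Copy a binary image into program memory. */
void cpu_load(Cpu8051 *c, const uint8_t *image, size_t len) {
    assert(len <= sizeof c->code);
    memcpy(c->code, image, len);
}

static uint8_t read_direct(Cpu8051 *c, uint8_t addr) { return c->iram[addr]; }

static void write_direct(Cpu8051 *c, uint8_t addr, uint8_t v) {
    c->iram[addr] = v;
    if (addr == SFR_SBUF && c->serial_out) c->serial_out(v);  /* peripheral hook */
}

/* Execute one instruction; returns 0 on success, -1 on an unimplemented opcode. */
int cpu_step(Cpu8051 *c) {
    uint8_t op = c->code[c->pc++];
    switch (op) {
    case 0x00: break;                                               /* NOP */
    case 0x04: c->iram[SFR_ACC]++; break;                           /* INC A */
    case 0x24: c->iram[SFR_ACC] += c->code[c->pc++]; break;         /* ADD A,#imm */
    case 0x74: c->iram[SFR_ACC] = c->code[c->pc++]; break;          /* MOV A,#imm */
    case 0x75: {                                                    /* MOV direct,#imm */
        uint8_t d = c->code[c->pc++];
        uint8_t v = c->code[c->pc++];
        write_direct(c, d, v);
        break;
    }
    case 0x80: {                                                    /* SJMP rel */
        int8_t rel = (int8_t)c->code[c->pc++];
        c->pc = (uint16_t)(c->pc + rel);
        break;
    }
    case 0xE5: c->iram[SFR_ACC] = read_direct(c, c->code[c->pc++]); break;  /* MOV A,direct */
    case 0xF5: write_direct(c, c->code[c->pc++], c->iram[SFR_ACC]); break;  /* MOV direct,A */
    default: return -1;
    }
    return 0;
}

/* Run until an unimplemented opcode or max_steps; returns instructions executed. */
int cpu_run(Cpu8051 *c, int max_steps) {
    int n = 0;
    while (n < max_steps && cpu_step(c) == 0) n++;
    return n;
}

/* Example hook: collect transmitted bytes in a buffer, standing in for a real driver. */
static char   tx_log[64];
static size_t tx_len;
static void capture_serial(uint8_t b) {
    if (tx_len < sizeof tx_log - 1) tx_log[tx_len++] = (char)b;
}
```

Routing every direct write through one function is what makes it easy to bolt on peripherals: a display or any other interface device can be given its own address check next to the SBUF one.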

thanks for the help

joolz



--
--------------------------------- --- -- -


Re: 8052 emulator in C
Quoted text here. Click to load it

The Keil simulator already has that.  You won't need to modify it.

Quoted text here. Click to load it

Then use the Keil simulator that already has the support.

Why spend a lot of time and energy (== money) on re-inventing the wheel?

--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills  Staffs  England     /\/\/\/\/
Re: 8052 emulator in C



Quoted text here. Click to load it

You are going to a lot of work to reverse engineer an application.
Why is this needed?

w..



Re: 8052 emulator in C
In reply to "Walter Banks" who wrote the following:

Quoted text here. Click to load it

Bugs in the code, and no source code available any more. It's a job for a friend, and I
have been doing this kind of work for over 20 years; my first was a
re-assembler for the C64 for a games company I worked for.

I have lots of experience in this but just needed an 8052 sim.

My last job was an ARM decoder and annotator, so this little chip will be a lot
easier.

joolz




--
--------------------------------- --- -- -


Re: 8052 emulator in C
Quoted text here. Click to load it
You don't want an emulator; you want a de-compiler or reverse compiler.

An emulator will just execute the binary code as the real hardware would.

Using the binary to get the C back is impossible !!!!

Except for very simple programs.

Even if you have the compiler sources and understood the compile
process, you still would not be able to get the binary -> C conversion
to work.

But, have fun and good luck.

hamilton


Re: 8052 emulator in C
Hi Hamilton,

Quoted text here. Click to load it

Actually, for some simple-minded compilers, you can often reverse
engineer the code to get much of the "C" source (neglecting
variable names, some expressions, etc.).  This is especially
true of old/early compilers that didn't do much optimization.

I was able to recreate C source for a client's libraries from
binaries using this approach.  Though it required a fair bit of
"organic computing" to recognize the "patterns" in the code
(a decompiler wasn't available).  Of course, familiarity with
the product (application) goes a long way -- especially when
it comes to annotating the sources!

Note that the "organic" method can be painfully slow -- I was
only able to decompile a few KB per week.  :<  But, the alternative
is to recreate the sources from the *specification*...

[if you've never done this, it can be a really fun problem!  Sure
beats crossword puzzles!]



Re: 8052 emulator in C
Quoted text here. Click to load it

For years I have heard that story.

I have always asked to be shown any links to the compiler in question.
So I will ask: do you have any links to this "simple compiler"?

I took a compiler class 30 years ago, and my professor at the time
stated that it was not possible.
With the better compilers available today it would be even more impossible.



Quoted text here. Click to load it

Being familiar with the code is the only way to get back the C code.
But the OP seems to have no knowledge of the application.

I have lost sources in disk crashes and have had to re-create the C
sources by watching the operation of the application.


Reverse-engineering is always easier when you have a good idea of what
is supposed to happen.


Quoted text here. Click to load it

Yes, building a spec for functions code is not bad, but as you say very
slow.

A few years ago I had a company needing to reverse engineer their legacy
assembly 68hc11 product.

I was able to recreate most of the application in C, but some of the
algorithms were so convoluted that I could never understand the
dis-assembly.

So, we repackaged the assembly into a C in-line assembly function and
everything still worked.

Lucky !!!




Re: 8052 emulator in C
Quoted text here. Click to load it

Have you looked at eg. Hex-Rays? From what I've seen of it, it's pretty
good at what it does.

-a

Re: 8052 emulator in C
In reply to " snipped-for-privacy@kapsi.spam.stop.fi.invalid" who wrote the
following:

Quoted text here. Click to load it

I want to simulate the code, hence the question! I already have the binary
and disassembly.

joolz



--
--------------------------------- --- -- -
Posted with NewsLeecher v5.0 Beta 6
Re: 8052 emulator in C
Hi Anders,

Quoted text here. Click to load it

Yes.  It has knowledge of how various compilers generate code
and uses that "backwards" to deduce what the code should have
been, based on the binary.

Re: 8052 emulator in C

Quoted text here. Click to load it

I guess "simple compiler" refers to some 1970s compilers for PDP-11,
Intel Intellecs and Motorola EXORcisers.

Writing compilers for these platforms  was problematic due to the 64
KiB address space limit. Overlay loading helped a lot (each
compilation phase in a separately loaded overlay branch), but you
still had to reserve space for the symbol table, that had to be kept
constantly in memory. Overlay loading with floppies was also very
slow; thus, not much optimization could be done. For this reason,
getting assembly output from a compiler was not the standard
situation.

I once wrote an object code disassembler for PDP-11. Compared to
ordinary disassemblers, the object code disassembler can also display
the global symbols defined in this module as well as displaying any
external function names (including library function names) in plain
text.

I analyzed quite a few object codes generated by Fortran, Pascal and C
compilers and I was capable of detecting by "organic matching" how
each compiler generated code. After this, it was quite easy to
reverse engineer some algorithms.

These days, with good compilers, it is much harder to reverse
engineer things based purely on the executable code.


Re: 8052 emulator in C
Quoted text here. Click to load it

In my case, these were PC based tools.  The only compilers I had
access to on the MDS were silly things like PL/M (which, actually,
was *an* improvement)

Quoted text here. Click to load it

I think the problem from that timeframe (mid 80's, in my case)
was a combination of things:
- targets were pretty crippled.  They really weren't designed
   with HLL's in mind (with the exception of the bigger 16/32
   bit machines).  E.g., support for stack frames was tedious
   at best.  And, even then, limited (e.g., "index registers"
   with +- 128 byte offsets)
- there were *lots* of different processor *families*.  6800/3/5
   6809, 68HC11, 8080/85/Z80/Z180, Z8000, 68000, 9900, 99000, 1802/5,
   6502/816, 2650, 8x300, etc.  With no single market leader.  A
   "compiler vendor" was almost forced to try to address *all* of
   these targets to increase the chance for a sale.  So, you ended up
   with a core compiler and varying backends.
- the PC was becoming a viable development platform (previously,
   we used CP/M boxes or vendor supported "development systems").
   So, you had lots of folks putting forth products to try to
   sell to that "development system".  Almost all were "command line"
   driven tools, no IDE's, etc.
- users were anxious to get their hands on *anything* that could
   expedite development.  ASM was just *painfully* slow for bigger
   projects.
- resources were starting to become affordable.  The $50 2KB EPROMs
   were a thing of the past.  And, you could actually think of putting
   more than a few *hundred* bytes of RAM into a system!

Quoted text here. Click to load it

Cool!  From this, you could probably port to a 68K disassembler
with little trouble.  Or even a 32K.

Quoted text here. Click to load it

Exactly.  The compiler tends to do the same thing, the same way.
It's hard for "hand-written" code to achieve that same level of
discipline.

Quoted text here. Click to load it

Have you looked at some of the code compiled for PICs?

Re: 8052 emulator in C
In reply to "D Yuniskis" who wrote the following:

Quoted text here. Click to load it

You missed one of my old favorite CPUs, the 6303, a 4-bit processor; I wrote the
firmware for a mouse on that.

Also, I've just finished 3 projects on a PIC18F4550 using the HI-TECH PRO
compiler; the code is OK and runs.

joolz



--
--------------------------------- --- -- -
Posted with NewsLeecher v5.0 Beta 6


Re: 8052 emulator in C
Quoted text here. Click to load it


IIRC, the 6303 was a 6803 (and 6301 was 6801).  They'd be called
"8 bit" processors even if their internal ALU was only 4b wide.  I
lump all of them in the "680x" category (the '09 and hc11 deserve
slightly better attention)

I.e., the i4004 was one of the few 4 bitters (IIRC, the TMS1000
and TLCS47 were equally crippled).

Re: 8052 emulator in C
Quoted text here. Click to load it

The 6303 I remember was the Hitachi HD6303; it was a CMOS replacement
for the Motorola 6803.

hamilton




Re: 8052 emulator in C
Hi Hamilton,

Quoted text here. Click to load it

<grin>  It's relatively easy to disprove a negative.  :>  I'll
drag out some examples and post them here.  I think you'll see that
most of these early compilers were pretty "straightforward" in the
way they emitted code.  You could look at stanzas and deduce from
what they were created (of course, you couldn't tell "a == b"
from "b == a" -- though sometimes you could distinguish "a > b"
from "b < a"!).

I remember thinking about "peephole optimizers" and wondering how
they could be effective ("Shirley the compiler knows what code it
*just* emitted?  Why would it ever do something as inane as
'STORE X; LOAD X'?").  But, if you saw how stanzas were "pasted"
together, you could see lots of opportunities for this kind
of micro-optimization!
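
To make the 'STORE X; LOAD X' case concrete, here is a minimal peephole pass over a toy single-accumulator instruction list. The `Insn` type and the three opcodes are invented for illustration and are not taken from any real compiler. The sketch assumes that no branch lands on the dropped LOAD and that X is ordinary memory rather than a memory-mapped peripheral.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OP_LOAD, OP_STORE, OP_ADD } Op;   /* toy accumulator machine */
typedef struct { Op op; int addr; } Insn;

/* Drop any LOAD that directly follows a STORE to the same address: the
   accumulator already holds that value. Compacts the array in place and
   returns the new instruction count. */
size_t peephole(Insn *code, size_t n) {
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (out > 0 &&
            code[i].op == OP_LOAD &&
            code[out - 1].op == OP_STORE &&
            code[out - 1].addr == code[i].addr)
            continue;                 /* redundant reload, skip it */
        code[out++] = code[i];
    }
    return out;
}
```

The redundant pair arises naturally when one statement's code ends by storing its result and the next statement's code begins by loading its first operand, each emitted without knowledge of the other.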

Perhaps Walter can shed some light on what his products were doing
in the mid 80's and how they've progressed (along with *why*)?

Quoted text here. Click to load it

A lot depends on the code being compiled, the level of optimization
used, the optimizations *available* and the actual target itself.
E.g., older "single register" machines required lots of shuffling
to get arguments into an accumulator where they could be operated
on.

Also, older devices didn't have niceties like "MUL" or (gasp!) "DIV".
So, the repertoire of "helper functions" gave you lots of insight
into what the code was actually doing.  And, those helpers didn't
have "short-circuits" where the compiler could do a "partial"
operation, etc.
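
As an illustration of what such a helper tends to look like, here is a shift-and-add multiply written in C. It is a sketch of the general technique, not the code of any particular runtime library, and the name `mul8_shift_add` is made up. In a disassembly the same loop shows up as a shift, a conditional add and a branch, which is usually distinctive enough to name the routine on sight.

```c
#include <assert.h>
#include <stdint.h>

/* 8 x 8 -> 16 bit unsigned multiply using only shifts and adds,
   the way a runtime helper does it on a CPU with no MUL instruction. */
uint16_t mul8_shift_add(uint8_t a, uint8_t b) {
    uint16_t product = 0;
    uint16_t addend  = a;                    /* a shifted left once per bit of b */
    while (b != 0) {
        if (b & 1)
            product = (uint16_t)(product + addend);
        addend = (uint16_t)(addend << 1);
        b >>= 1;
    }
    return product;
}
```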

Quoted text here. Click to load it

I disagree.  You can get back code that will recompile into the
same binary.  You can further embellish that with some ideas as
to what the code is *likely* doing.  As far as the ultimate
application... <shrug>

If you have the compiler (and binary libraries) available, you have
a huge headstart.  You can feed it test cases to see what the code
looks like for various C constructs.  You can see which helper
functions get dragged in and, thus, start giving those real "names".

If you have the hardware available (or at least the memory map),
you have known starting points for the code -- instead of picking
a spot "at random".

Chances are, it uses some part of the standard libraries.  These
are relatively easy to recognize.  So, you can put names on their
entry points and back-annotate all references to them as they
are encountered.
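
One way to mechanize that recognition is a byte signature with a wildcard mask, so that operands which change from link to link (call targets, data addresses) are ignored. The sketch below assumes a flat binary image. The `Signature` type and `find_signature` are hypothetical names, and real signatures would have to be built by compiling test cases with the actual library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char    *name;    /* label to attach to the entry point */
    const uint8_t *bytes;   /* expected opcode bytes */
    const uint8_t *mask;    /* 0xFF = must match, 0x00 = wildcard */
    size_t         len;
} Signature;

/* Return the offset of the first match of sig in image, or -1 if none. */
long find_signature(const uint8_t *image, size_t image_len, const Signature *sig) {
    if (sig->len == 0 || sig->len > image_len)
        return -1;
    for (size_t off = 0; off + sig->len <= image_len; off++) {
        size_t i = 0;
        while (i < sig->len &&
               ((image[off + i] ^ sig->bytes[i]) & sig->mask[i]) == 0)
            i++;
        if (i == sig->len)
            return (long)off;
    }
    return -1;
}
```

Every hit gives an address that can be named once and then substituted wherever a call to it appears in the disassembly.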

It's trivial to identify the strings in most applications (though
some might go to some lengths to protect or hide them -- but that
is rare and starts competing with the compiler since *it* has
a notion of what constitutes a "string").  So, library functions
that use strings (e.g., printf et al.) can be identified.  Also,
strings often give you information about the data *referenced*
there (e.g., "%d records processed.\n").
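
Finding those strings can be sketched as a scan for runs of printable bytes that end in a NUL, much like the Unix `strings` tool. The function name `scan_strings` and its parameters are made up for illustration, and the sketch assumes plain ASCII, NUL-terminated C strings in the image.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Record the start offset of every run of at least min_len printable bytes
   that is terminated by a NUL. Stores up to max_offsets results and returns
   the total number of strings found. */
size_t scan_strings(const uint8_t *image, size_t len, size_t min_len,
                    size_t *offsets, size_t max_offsets) {
    size_t count = 0, start = 0;
    for (size_t i = 0; i < len; i++) {
        if (isprint(image[i]) || image[i] == '\n' || image[i] == '\t')
            continue;                       /* still inside a candidate run */
        if (image[i] == '\0' && i - start >= min_len) {
            if (count < max_offsets)
                offsets[count] = start;
            count++;
        }
        start = i + 1;                      /* next run begins after this byte */
    }
    return count;
}
```

Each offset found is then worth searching for in the disassembly: code that loads that address is very likely about to call a print or string routine.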

Finally, most older processors used in embedded systems were small.
Few systems could afford gobs of (EP)ROM for multimegabyte images.
Likewise, tens of KB of RAM was a lot.  It's not like trying to
reverse engineer MSBloatware...

Quoted text here. Click to load it

Sure.  But it isn't a necessary prerequisite.

There are (big name) firms whose businesses are based on reverse
engineering other people's products -- e.g., to make something
"compatible" with a closed system.

In the process, one can often find obvious "mistakes" or
opportunities for improvement that the original designers
overlooked.

One of my first jobs was at a firm that designed marine navigation
equipment (among other things).  I recall the "excitement" when a
Japanese firm expressed an interest in one of our RADAR sets.  I
think they purchased 25 of them "for evaluation".

Some time later, *they* produced a similar product.  It was very
obvious that it was "heavily inspired" (avoiding the term "copied")
by our set.

My boss grumbled at the lost business and having been "suckered".
In the next breath, he pointed out how the "competing design" had
lots of little changes that were incredibly obvious after-the-fact...
but, that had been omitted in our design!

E.g., the antenna (rotor) emitted rotational pulses to tell the
display which way it was pointed.  This allowed the sweep in the
display to be synchronized (angularly) to the antenna's position.
Of course, this was done by mounting an optointerrupter and
encoder wheel (slotted disc) on the antenna's shaft.  I think
the encoder had perhaps 1 degree azimuth resolution -- or something
like that.  It was relatively costly to manufacture the disc since
it was done photographically, etc.

The competing product had a crude disc with perhaps 9 (!) slots
cut in it.  It looked like something that a child would fashion
out of cardboard.  *But*, the disc was mounted on the high side
of the reducing gearbox that drove the antenna shaft.  So, it
rotated 40 times faster than the antenna!  (i.e., same sort of
information coming from the antenna but much lower manufacturing
costs).
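
The arithmetic behind that trick, taking the remembered figures of 9 slots and a 40:1 gearbox at face value, works out as below. The function names are invented for illustration.

```c
#include <assert.h>

/* Pulses produced per full antenna revolution when the slotted disc is
   mounted on the fast side of the reduction gearbox. */
unsigned pulses_per_antenna_rev(unsigned slots, unsigned gear_ratio) {
    return slots * gear_ratio;
}

/* Azimuth resolution in degrees per pulse. */
double azimuth_resolution_deg(unsigned slots, unsigned gear_ratio) {
    unsigned pulses = pulses_per_antenna_rev(slots, gear_ratio);
    assert(pulses > 0);
    return 360.0 / (double)pulses;
}
```

With 9 slots and a ratio of 40 this gives 360 pulses per antenna revolution, i.e. the same 1 degree resolution as the photographically produced disc.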

Without seeing "our design" with that modification made to it,
I doubt it ever would have occurred to anyone!  <:-(
