Writing a simple assembler

- P
- Paul Keinanen
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Thu, Mar 9, 2006 9:48 AM

At least RSX-11 also contained a nice table driven parser (TPARS) in the run time library. Macros were used to define the state tables with all the various transitions.

This parser was initially intended for command line parsing, but it was so versatile that I wrote a cross assembler (for a strange 20 bit test instrument processor). This cross assembler was then used to write a Basic interpreter for that 20 bit processor. The parsing tables became quite complex, due to the oddities of the target processor.

The TPARS was also useable for parsing simple high level languages:-).

Paul

- E
- Everett M. Greene
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Thu, Mar 9, 2006 4:48 PM

And then there are meta-assemblers that can generate code for any processor...

- A
- ArarghMail603NOSPAM
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Thu, Mar 9, 2006 5:25 PM

Way back in the 70's I wrote a series of macros for 360 ASM F to cross compile object for a IBM 705. The cute part is the the 705 was a character machine, some what like a 14xx. Which meant that the binary addresses from the symbol table had to be converted to 4? 5? character leading zero strings. I have forgotten most of the details by now.

--
ArarghMail603 at [drop the 'http://www.' from ->] http://www.arargh.com
BCET Basic Compiler Page: http://www.arargh.com/basic/index.html

To reply by email, remove the garbage from the reply address.

- W
- Walter Banks
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Thu, Mar 9, 2006 6:04 PM

This sure brings back memories. We wrote a Cross assembler for the F8 in PDP11 macro's. It didn't set any speed records. I think about 45 minutes to assemble 12K of code. It was enough to implement a Daisy Wheel printer controller. The prototype is still around the office in case someone needs a typewriter that actually hits the paper with a hammer.

Macro 11 was a nice piece of software

w..

toby wrote:

- C
- cs_posting
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Mar 10, 2006 7:11 PM

I would imagine things get more complicated when you have both a short relative and a long call/jump addressing scheme - you don't know before you emit the intervening code how far away your target will be, so you don't know how long an instruction word you need to emit, so you don't know how far away your target will be...

Of course if the two modes have different mnemonics, then it's up to the user to pick the right one, and the assembler only to error if the short one is used where it won't work.

- T
- Tauno Voipio
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Mar 10, 2006 7:49 PM

There is an algorithm, published in Datamation (1968-1970), which enables the translation of variable-length addresses so that the task of address length selection is left to the assembler.

It needs a third pass to resolve the addressing lengths.

IIRC, the Borland Macro Assembler for i86 family can also resolve the addresses to long or short, depending on what is needed.

--

Tauno Voipio
tauno voipio (at) iki fi

- L
- les.hildenbrandt
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Mar 10, 2006 8:31 PM

I have written cross assemblers in assembly (x86(DOS) and 8080(CPM)), although it was a long time ago. The hardest part was parsing each line (order of precedence, infix to outfix, multiple radix's, math, exponents, etc..) The actual building of the symbol table (first pass), and assigning the opcodes to mnemonics (second pass) was pretty easy. The just write the output files, bin, hex, symbol, list etc..

My same basic assembler worked for PIC16c5x and 8748. I fould universal cross assemblers table driven assembler, which would work for any processor, and gave up on my own.

Les

- P
- Paul Keinanen
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Mar 10, 2006 9:16 PM

Backward references are easy to solve, the forward references are a bit trickier :-). Anyway, you could speculatively generate the long forward reference in the first pass and when the forward definition of a label is encountered, check if the forward branch is within short distance. If it is, in the symbol table, decrement any labels after the branch instruction by the number of bytes saved. If an optimum result is not needed, leave it this way.

However, when moving the labels upwards after the optimised instruction may reduce the distance from some other prior long branch instruction, which can also be optimised.

Thus, also the location of the speculative long branch instruction and a pointer to the symbol table entry of the target label needs also to be stored into a temporary symbol table. Those speculative long branch instructions have been converted to short branches can be removed from the symbol table.

When the symbol table shuffling is done, a normal second pass can be performed and since the locations of all labels are now fixed, it is easy to emit either a long or short branch instruction.

Paul

- B
- Bu
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Mar 11, 2006 5:38 PM

Hi Michael,

I looked up what i still have on this 30-year old project (old sentiment from my education). I could not find the papertape. I only found the listing (with all rem-statements removed to speed up the process).

This listing is not really 'structured' basic so it is not realy clear for somebody else. I do have a detailed description on the cross assembler program (that even has a simulator and reverse assembler!) but this is in Dutch.

So thanks for the interest but i can not make it available without a lot of extra work (scanning etc).

Bu

- I
- Isaac Bosompem
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Mar 11, 2006 7:11 PM

Hi Grant, sorry for the cross posting.

Do you know a good Python IDE? Which one do you use yourself?

- T
- toby
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Mar 11, 2006 7:28 PM

Eclipse will integrate nicely with Python - though I am not a Pythonist myself (I use Eclipse for C, C++, Perl, etc).

formatting link

- G
- Grant Edwards
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Mar 11, 2006 8:35 PM

Um, you didnt. Cross-post, that is.

Bash and an editor with a good python mode. I use Jed.

--
Grant Edwards                   grante             Yow!  Nice decor!
                                  at               
                               visi.com

- T
- toby
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Mar 11, 2006 10:27 PM

I don't know why I didn't think of this before, since I've been looking at it recently, but Mauro Persano's wicked-clever abuse of templates to create a mock-assembler syntax for GNU Lightning is also in this vein. I'm wishing for an excuse to try out his froofyjit:

formatting link

- T
- toby
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sun, Mar 12, 2006 5:43 PM

Not Mauro Persano's, but actually belonging to mncw, a.k.a. bi:

formatting link

Apologies for the misattribution. And I only use the word 'abuse' jokingly :-)

- W
- Walter Banks
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Mon, Mar 13, 2006 2:35 AM

The Datamation algorithm (Sorry I don't have the exact reference) fails to work on many instruction sets for example the National COP8 family which has some instructions that have a pc offset, offset in the current page and absolute addressing modes.

The algorithm essentially says start with the longest instruction form and reduce to a shorter version of the instruction until no more instructions in the generated code have an "in range" workable form.

The COP8 case and many others with page oriented instructions start to reduce code using the Datamation algorithm a secondary problem starts to show as one of the source or destination addresses slides over a page boundary.

The freescale 6812 has an interesting similar problem where one of the call types has a different return instruction. Then this call is used all calls to the subroutine needs to be the same type. In automatically generated code we always want to generate the shortest form but if any call cannot reach, all the calls need to be changed and all the returns for the function must be changed as well.

w..

Tauno Voipio wrote:

- T
- Tauno Voipio
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Mon, Mar 13, 2006 6:34 AM

I left out the paged architectures (PDP-8 & co from the years gone), they belong to the same place as dinosaurs and Fortran - history.

The algorithm is extensible to the paged case, but as long as there are several separately translated modules, the code produced is sub-optimal.

In any case, the paged architectures are prone to page-zero overflow.

--

Tauno Voipio
tauno voipio (at) iki fi

- D
- David Brown
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Mon, Mar 13, 2006 8:12 AM

Idle is a simple one that is part of most Python installations. For starting out, or for quick work, it's probably the fastest and easiest.

SPE and Eric are Python-specific IDEs.

Boa Constructor is a Python IDE and wxPython designer.

Most "big" editors are also IDEs, and can work with Python - Eclipse, jedit, (x)emacs, vim, etc.

- T
- toby
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Mon, Mar 13, 2006 6:20 PM

"Relaxation" method comes to mind. The Transputer was an interesting case, since you could build offsets of arbitrary magnitude with its general prefix mechanism. (i.e. There were an arbitrary number of jump instruction lengths, not just short and long.)

- R
- randyhyde
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Tue, Mar 14, 2006 12:30 AM

Branch displacement optimization is provably NP-Complete. And this is true whether there are only two branch sizes or an arbitrary number of them. This means that the best known solution requires, worst case, O(2**n) passes over the program (where n is the number of branches in the program) to produce the optimally sized program.

Some assemblers, such as MASM, Gas, and FASM on the x86 start with long branches and successively reduce their sizes on each pass (as legal) until they make a pass during which no branch sizes change. This does

*not* produce, guaranteed, the minimal sized program. Indeed, it's better in most cases to start off assuming the branches are short and extend them as necessary. Even so, this scheme is not guaranteed to produce the minimum sized program.

In fact, a "greedy" algorithm (make all the branches that can be made short, short) is not guaranteed to produce the smallest program. You can come up with some (synthetic) examples where making one particular branch longer, that could be shorter, and the resulting program, overall, consumes less bytes (think about other objects in the program, besides branches, whose size depends upon the span of two labels).

Nevertheless, a "relaxation" or refinement approach, while not guaranteed to be optimal, still produces very good results. It may take an arbitrary number of passes, though. Two or three is rarely enough for a complex program. I once generated a synthetic test file and FASM made nearly 2,000 passes over the source file during optimization. Cheers, Randy Hyde