Overview
You are required to write a basic functional Assembler for the RCX Lego
Mindstorms platform. You will process Lego Assembler (LASM) programs,
producing code that will run on the RCX.
In essence, your program will:
- Receive an RCX assembly source code as input (an .asm text file);
and
- Generate an RCX object file consisting of RCX bytecodes as output
(an .rcx binary file).
You must support the following subset of the RCX2 LASM instruction
set. These instructions manipulate the motors and sensors,
handle conditional and unconditional jumps, and do a few other things.
This subset is enough to support simple, yet functional, RCX programs.
All the details of the LASM instruction set can be found in Lego's RCX2
LASM instruction set reference. An important part of approaching
this project is first fully understanding the assembly code and object
code formats, so work through the examples below carefully before you
begin to code.
Requirements
- Your program takes only ONE parameter, the input filename with the
.asm extension. The program creates the output filename by replacing
the .asm extension with the .rcx extension.
- Both the parsing and code generation should be done during one pass
over the input file.
- Your program must correctly parse any sequence of whitespace characters
(the separators between mnemonics, constants, numbers, etc.). Valid
whitespace characters are: Carriage-Return (0x0D), Line-Feed (0x0A),
Tab (0x09), and Space (0x20).
- The input .asm file is case insensitive (upper or lowercase
allowed). Your program must correctly assemble either case (or a mix
of both).
- Any error during assembly must be reported to the user. There is no
need to specify the error type, just the line number where the
error occurred.
- The .rcx binary file must include an appropriate RCX
image header, followed by the assembled bytecode sequence.
- Your program must support ASM comments. In ASM, anything following
a semicolon (;) is treated as a comment through the end of that line.
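Several of the requirements above (output filename, whitespace, case
insensitivity, comments, and line-numbered errors) can be sketched in a
few lines. This is only a minimal illustration, not a required design:
the names AssemblyError, output_name, and clean_line are hypothetical,
and treating commas as separators alongside whitespace is an assumption
made for brevity.

```python
import re
from pathlib import Path


class AssemblyError(Exception):
    """Carries the 1-based source line number where assembly failed."""

    def __init__(self, line_number):
        super().__init__(f"Error at line {line_number}")
        self.line_number = line_number


def output_name(asm_path):
    """Replace the .asm extension (any case) with .rcx."""
    path = Path(asm_path)
    if path.suffix.lower() != ".asm":
        raise ValueError("input filename must end in .asm")
    return str(path.with_suffix(".rcx"))


def clean_line(line):
    """Drop the comment, fold case, and split into tokens.

    Only CR, LF, Tab, and Space count as whitespace; commas are
    treated as separators too (an assumption of this sketch).
    """
    code = line.split(";", 1)[0].lower()
    return [tok for tok in re.split(r"[\r\n\t ,]+", code) if tok]
```

A driver would read the input once, call clean_line on each line while
counting line numbers, and raise AssemblyError(line_number) as soon as a
line cannot be assembled.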
Useful Tips
- Understand the problem first. Analyze the assembly listing for the
sample program below (SAMPLE.LST), and derive each opcode, bytecode(s) sequence
from its equivalent LASM instruction (you must operate at the bit level,
using binary and hex numbers). That's exactly what your assembler program
needs to do with each instruction.
- Remember that you are not required to write an RCX program in the
LASM language. You are required to assemble one using the assembler
tool that you write. Of course, you need to have a few ASM files available
for testing. You can start with the SAMPLE.ASM file
given below (which includes all instructions to support), and modify
a few parameters here and there to check for correctness, at least
partially.
- Note that writing an Assembler does not require you to know how to
use each LASM instruction in an RCX program (so you don't need to know
the internals of the RCX virtual machine). Rather, you need to know how
to encode an instruction into its equivalent bytecode sequence (an
opcode, followed by zero or more bytecodes which encode the various
parameters). Use the RCX2 LASM instruction set reference for this.
- The NQC compiler has an assembly listing feature. Do NOT rely on
this feature of the NQC compiler to determine the syntax
of a LASM instruction. NQC's assembly listings omit parameters for
some instructions, or take them for granted. Again, obtain the correct
syntax for each LASM instruction using the official Lego RCX2
LASM instruction set reference.
- Do use the assembly listing feature of the NQC compiler to check
the bytecode sequence for a given LASM instruction - the bytecodes it
shows are correct.
- The RCX2 LASM instruction set reference
is fully indexed for searching - Press "Ctrl-Home" to move to the beginning
of the document, press "Ctrl-F" and type a LASM instruction to search,
check the "Match Whole Word Only" box, and press "Enter". (Hence, you
don't really need "VPB.hlp", nor do you need to follow the awkward opcode
look-up procedure shown in class).
- The branch instructions chk and jmp are
a bit tricky. The last parameter must encode the number of bytes (or
a function thereof) to jump. If the jump is to a previous label the
calculation is trivial and it can be performed immediately.
- However, if the jump is to a forward label (as in the chk instruction),
the calculation cannot be performed immediately (because we don't yet
know the number of bytes to jump). Only after we have assembled the
instructions between the jump instruction and the forward label can we
calculate the number of bytes to jump, encode it, and write it back to
the incomplete parameter bytecode in the jump instruction.
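The forward-label tip describes what is usually called backpatching. The
sketch below is one possible way to do it in a single pass. The Emitter
class and its method names are hypothetical. The offset encodings are
inferred only from the SAMPLE.LST listing in this handout (a forward
distance counted from the offset byte itself, and a backward distance
with the 0x80 bit set); applying the backward form to every branch is an
assumption, so confirm both against the RCX2 LASM instruction set
reference.

```python
def backward_offset_byte(offset_pos, target):
    """Backward form as seen for jmp in SAMPLE.LST: 0x80 flag plus distance."""
    distance = offset_pos - target
    if not 0 <= distance < 0x80:
        raise ValueError("backward distance does not fit in 7 bits")
    return 0x80 | distance


class Emitter:
    """Collects bytecode and patches forward branch offsets later."""

    def __init__(self):
        self.code = bytearray()
        self.labels = {}    # label name -> byte offset of the label
        self.pending = {}   # label name -> positions of unresolved offset bytes

    def define_label(self, name):
        """Record the label and patch every branch that was waiting for it."""
        target = len(self.code)
        self.labels[name] = target
        for pos in self.pending.pop(name, []):
            self.code[pos] = target - pos   # forward distance from offset byte

    def emit_branch_offset(self, name):
        """Emit the offset byte now, or a placeholder if the label is unknown."""
        pos = len(self.code)
        if name in self.labels:
            self.code.append(backward_offset_byte(pos, self.labels[name]))
        else:
            self.code.append(0)             # placeholder, patched later
            self.pending.setdefault(name, []).append(pos)
```

If any entry remains in pending once the whole input has been read, the
program jumps to a label that was never defined, which should be reported
as an assembly error.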
Input file (.asm)
The input file consists of an optional definition list, followed by
a list of LASM instructions, each instruction having the following format:
[<label>:] <mnemonic> [<param1>[, <param2>[, <param3> ...]]]
where
- label is used for branching purposes when needed (mainly for the
chk and jmp instructions),
- mnemonic is a LASM instruction mnemonic, and
- param1, param2, etc. are optional parameters represented as numbers
or numeric definitions.
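The instruction format above maps naturally onto a small line parser. The
following is a minimal sketch under assumptions: the function name
parse_line and the regular expression are hypothetical, identifiers are
assumed to start with a letter or underscore, and a label is allowed to
appear on a line by itself (as in the sample files).

```python
import re

LINE_RE = re.compile(
    r"^\s*(?:(?P<label>[a-z_]\w*)\s*:)?"   # optional label followed by ':'
    r"\s*(?P<mnemonic>[a-z]\w*)?"          # optional mnemonic
    r"\s*(?P<params>.*?)\s*$"              # the rest: comma-separated params
)


def parse_line(line):
    """Return (label, mnemonic, params) for one source line.

    label and mnemonic are None when absent; params is a list of
    lowercase strings. Raises ValueError on a malformed line.
    """
    code = line.split(";", 1)[0].lower()    # strip comment, fold case
    match = LINE_RE.match(code)
    label = match.group("label")
    mnemonic = match.group("mnemonic")
    raw = match.group("params")
    params = [p.strip() for p in raw.split(",")] if raw else []
    if params and mnemonic is None:
        raise ValueError("parameters without a mnemonic")
    if any(p == "" for p in params):
        raise ValueError("empty parameter")
    return label, mnemonic, params
```

Keeping parameters as strings at this stage leaves it to a later step to
decide whether each one is a number, a label, or a named definition.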
This is an example of a valid input file for the assembler:
; SAMPLE_1.ASM file
; Anything after a semicolon (;) in a line is treated as a "comment"
; List of instructions:
pwr 7, 2, 7 ; set power level to 7 for all motors
dir 2, 7 ; set "forward" direction for all motors
sent 0, 1 ; set type of Sensor_1 to "switch" (touch)
senm 0, 1, 0 ; set mode of Sensor_1 to "boolean" (0 or 1)
dir 2, 5 ; set "forward" direction for motors A,C
out 2, 5 ; turn "on" motors A,C
loop_1:
chk 2, 1, 2, 9, 0, loop_2 ; "Sensor_1 value == 1"?
dir 0, 5 ; if FALSE goto loop_2
out 2, 5
wait 2, 30 ; create 30 x 10 = 300 milliseconds delay
dir 2, 1
out 2, 1 ;
wait 2, 30 ; etc...
dir 2, 5 ;
out 2, 5
loop_2:
jmp loop_1 ; unconditional jump to "loop_1" label
Assembly Listing (.lst)
This assembly listing shows the relationship between each LASM instruction
(as documented in the RCX2 LASM instruction
set reference) and its equivalent opcode, bytecode(s) sequence.
Note that the original assembly listing output by NQC differs from the
one shown here: NQC shows neither the values nor the full number of
parameters for the pwr, senm, chk, and wait instructions. Also note that
the starting offset (relative address) of each LASM instruction and its
parameters are in decimal, while the equivalent opcode and bytecode
sequence is in hex:
; SAMPLE.LST file
; Offset LASM instructions Opcode,Bytecode(s)
; ------ ----------------- ------------------
000 pwr 7, 2, 7 ; 13 07 02 07
004 dir 2, 7 ; e1 87
006 sent 0, 1 ; 32 00 01
009 senm 0, 1, 0 ; 42 00 20
012 dir 2, 5 ; e1 85
014 out 2, 5 ; 21 85
loop_1:
016 chk 2, 1, 2, 9, 0, loop_2 ; 85 82 09 01 00 00 15
023 dir 0, 5 ; e1 05
025 out 2, 5 ; 21 85
027 wait 2, 30 ; 43 02 1e 00
031 dir 2, 1 ; e1 81
033 out 2, 1 ; 21 81
035 wait 2, 30 ; 43 02 1e 00
039 dir 2, 5 ; e1 85
041 out 2, 5 ; 21 85
loop_2:
043 jmp loop_1 ; 27 9c
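To make the listing concrete, the sketch below reproduces its byte
sequences for the instructions it contains. Every bit layout here is
inferred from this one listing (for example, dir and out packing their
first parameter into the top two bits, senm shifting the mode left by
five bits, and 16-bit values stored low byte first). The parameter names
in the comments are guesses based on the sample's comments. Treat all of
it as an assumption to verify against the RCX2 LASM instruction set
reference. The encode function and its branch argument are hypothetical.

```python
def le16(value):
    """Split a 16-bit value into [low byte, high byte]."""
    return [value & 0xFF, (value >> 8) & 0xFF]


def encode(mnemonic, p, branch=0):
    """Encode one instruction; p is a list of integer parameters.

    branch is the already-computed jump offset byte for chk and jmp.
    """
    if mnemonic == "pwr":     # motors, source, value
        return bytes([0x13, p[0], p[1], p[2]])
    if mnemonic == "dir":     # direction, motors
        return bytes([0xE1, (p[0] << 6) | p[1]])
    if mnemonic == "out":     # state, motors
        return bytes([0x21, (p[0] << 6) | p[1]])
    if mnemonic == "sent":    # sensor, type
        return bytes([0x32, p[0], p[1]])
    if mnemonic == "senm":    # sensor, mode, third parameter
        return bytes([0x42, p[0], (p[1] << 5) | p[2]])
    if mnemonic == "wait":    # source, value (16-bit)
        return bytes([0x43, p[0]] + le16(p[1]))
    if mnemonic == "chk":     # src1, val1, relop, src2, val2
        return bytes([0x85, (p[2] << 6) | p[0], p[3]]
                     + le16(p[1]) + [p[4], branch])
    if mnemonic == "jmp":     # label only
        return bytes([0x27, branch])
    raise ValueError(f"unsupported mnemonic: {mnemonic}")
```

For example, encode("dir", [2, 7]) yields the bytes e1 87 shown at
offset 004, and encode("wait", [2, 30]) yields 43 02 1e 00.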
Object file format (.rcx)
The RCX object file format to target is the RCX Image
format from David Baum's NQC compiler. You must create and include
the appropriate RCX header information along with the assembled bytecode
in the output binary file.
You don't need to include the optional symbol table (which includes
task/subroutine names, etc). Therefore, the fSymbolCount field
in the RCXIHeader structure must be zero.
Optional Project Extension
LASM format programs are not very readable. The problem is that
the RCX2 virtual machine documentation defines mnemonics for
instruction opcodes only, not for other resources used by the
RCX virtual machine, such as motors, sensors, timers, etc. However, this
does not preclude the programmer from creating such definitions directly.
This leads us to an optional extension for LASM, a definition
list.
Reading the first instruction, "pwr 7, 2, 7", it is not clear that we
want to set the power level to 7 for all motors A,B,C. However, if the
instruction is rewritten as "pwr ABC, Const, 7" instead, it becomes
clearer. Contrast the earlier assembly file with the version below,
which uses a definition list:
; SAMPLE_2.ASM file
; Optional definition list: (for improving readability)
A = 1 ; 001
AC = 5 ; 101
ABC = 7 ; 111
Fwd = 2 ; 10
Rev = 0 ; 00
Const = 2 ; next param source treated as "constant" (p.6)
Switch = 1 ; touch sensor type ("sent" p.52)
Boolean = 1 ; touch sensor mode ("senm" p.53)
Sensor_1 = 0 ; sensors are zero-based
SensorVal = 9 ; next param source treated as "sensor value" (p.6)
Equal = 2 ; "Equal to" comparison operator ("chk" p.93)
nil = 0 ; null parameter, ignore
On = 2 ; "On" motor status ("out" p.27)
; List of instructions:
pwr ABC, Const, 7 ; set power level to 7 for all motors
dir Fwd, ABC ; set "forward" direction for all motors
sent Sensor_1, Switch ; set type of Sensor_1 to "switch" (touch)
senm Sensor_1, Boolean, nil ; set mode of Sensor_1 to "boolean" (0 or 1)
dir Fwd, AC ; set "forward" direction for motors A,C
out On, AC ; turn "on" motors A,C
loop_1:
chk Const, 1, Equal, SensorVal, Sensor_1, loop_2 ; "Sensor_1 value == 1"?
dir Rev, AC ; if FALSE goto loop_2
out On, AC
wait Const, 30 ; create 30 x 10 = 300 milliseconds delay
dir Fwd, A
out On, A
wait Const, 30
dir Fwd, AC
out On, AC
loop_2:
jmp loop_1 ; unconditional jump to "loop_1" label
As an optional part of this project, implement the definition list
extension for LASM.
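One way to approach the extension is to keep a symbol table of
definitions and resolve each parameter through it. The sketch below
assumes definitions have the form "name = decimal number" and that names
are case insensitive like the rest of the file. The function names
parse_definition and resolve are hypothetical.

```python
import re

DEF_RE = re.compile(r"^\s*([a-z_]\w*)\s*=\s*([0-9]+)\s*$")


def parse_definition(line, symbols):
    """Store 'name = number' in symbols; return False if not a definition."""
    code = line.split(";", 1)[0].lower()    # strip comment, fold case
    match = DEF_RE.match(code)
    if match is None:
        return False
    symbols[match.group(1)] = int(match.group(2))
    return True


def resolve(param, symbols):
    """Turn a parameter into an integer: a literal number or a defined name."""
    param = param.lower()
    if re.fullmatch(r"[0-9]+", param):
        return int(param)
    if param in symbols:
        return symbols[param]
    raise ValueError(f"undefined name: {param}")
```

With the definitions from SAMPLE_2.ASM loaded, resolving the parameters
of "pwr ABC, Const, 7" gives 7, 2, 7, so the instruction assembles to the
same bytes as in SAMPLE_1.ASM. Branch labels such as loop_2 are not
definitions and would still be handled by the label logic.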
Resources
You may find the following links helpful:
Acknowledgements
This lab was created by Luis Paris. It has been edited and slightly
modified by Myles McNally.
© 2001, 2004 by Scott Anderson, Frank Klassner,
Pam Lawhead, and Myles McNally. This work is supported by NSF grants 0088884
and 0306096. Permission to use, copy, adapt and modify these materials
for instructional purposes is granted. These materials can be obtained
from our web site
www.mcs.alma.edu/LMICSE.
If you have suggestions for improvement, please contact us via the web
site; we would really appreciate it. This file was last modified on
June 1, 2005.