LMICSE: Lego Mindstorms in Computer Science Education


RCX Assembler Project


Overview

You are required to write a basic functional Assembler for the RCX Lego Mindstorms platform. You will process Lego Assembler (LASM) programs, producing code that will run on the RCX.

In essence, your program will:

  • Receive RCX assembly source code as input (an .asm text file); and
  • Generate an RCX object file consisting of RCX bytecodes as output (an .rcx binary file).

You must support the following subset of the RCX2 LASM instruction set. These instructions manipulate the motors and sensors, handle conditional and unconditional jumps, and do a few other things. This subset is enough to support simple, yet functional, RCX programs.

chk dir jmp out pwr senm sent wait

All the details of the LASM instruction set can be found in Lego's RCX2 LASM instruction set reference. An important part of approaching this project is first fully understanding the assembly code and object code formats, so work through the examples below carefully before you begin to code.

Requirements

  • Your program takes only ONE parameter, the input filename with the .asm extension. The program creates the output filename by replacing the .asm extension with the .rcx extension.
  • Both the parsing and code generation should be done during one pass over the input file.
  • Your program must correctly parse any sequence of whitespace characters (the separators between mnemonics, constants, numbers, etc.). Valid whitespace characters are: Carriage-Return (0x0D), Line-Feed (0x0A), Tab (0x09), and Space (0x20).
  • The input .asm file is case insensitive (upper or lowercase allowed). Your program must correctly assemble either case (or a mix of both).
  • Any error during assembly must be reported to the user. There is no need to specify the error type, just the line number where the error occurred.
  • The .rcx binary file must include an appropriate RCX image header, followed by the assembled bytecode sequence.
  • Your program must support ASM comments. In ASM, anything following a semicolon (;) is treated as a comment, up to the end of that line.
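
The requirements above on file naming, whitespace, case, comments, and error reporting can be sketched as follows. This is only a minimal illustration in Python; the helper names (output_name, tokenize, first_error_line) are hypothetical, and the error check covers just one kind of error (an unknown mnemonic). Definition-list lines from the optional extension are not handled here.

```python
from pathlib import Path

# The eight mnemonics this project must support.
MNEMONICS = {"chk", "dir", "jmp", "out", "pwr", "senm", "sent", "wait"}

def output_name(asm_path):
    """Replace the .asm extension (any case) with .rcx."""
    path = Path(asm_path)
    if path.suffix.lower() != ".asm":
        raise ValueError("input file must have the .asm extension")
    return str(path.with_suffix(".rcx"))

def tokenize(line):
    """Drop the comment, fold to lowercase, split on whitespace and commas."""
    code = line.split(";", 1)[0].lower()
    # str.split() with no argument treats CR, LF, Tab, and Space as separators.
    return code.replace(",", " ").split()

def first_error_line(lines, mnemonics=MNEMONICS):
    """Return the 1-based number of the first line with an unknown mnemonic."""
    for number, line in enumerate(lines, start=1):
        tokens = tokenize(line)
        if tokens and tokens[0].endswith(":"):   # skip a leading label
            tokens = tokens[1:]
        if tokens and tokens[0] not in mnemonics:
            return number
    return None
```

For example, tokenize("  PWR\t7, 2,7 ; all motors") yields the tokens pwr, 7, 2, 7 regardless of case or spacing.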

Useful Tips

  • Understand the problem first. Analyze the assembly listing for the sample program below (SAMPLE.LST), and derive the opcode and bytecode sequence of each LASM instruction by hand (you must operate at the bit level, using binary and hex numbers). That's exactly what your assembler program needs to do with each instruction.
  • Remember that you are not required to write an RCX program in the LASM language. You are required to assemble one using the assembler tool that you write. Of course, you need to have a few ASM files available for testing. You can start with the SAMPLE.ASM file given below (which includes all instructions to support), and modify a few parameters here and there to check for correctness, at least partially.
  • Note that writing an Assembler does not require you to know how to use each LASM instruction in an RCX program (and therefore you don't need to know the internals of the RCX virtual machine). Rather, you need to know how to encode an instruction into an equivalent bytecode sequence (an opcode, followed by zero or more bytecodes which encode the various parameters). Use the RCX2 LASM instruction set reference for this.
  • The NQC compiler has an assembly listing feature. Do NOT rely on this feature of the NQC compiler to determine the syntax of a LASM instruction. NQC's assembly listings omit parameters for some instructions, or take them for granted. Again, obtain the correct syntax for each LASM instruction using the official Lego RCX2 LASM instruction set reference.
  • Do use the assembly listing feature of the NQC compiler to check the correct bytecode sequence for a given LASM instruction - these are correct.
  • The RCX2 LASM instruction set reference is fully indexed for searching - Press "Ctrl-Home" to move to the beginning of the document, press "Ctrl-F" and type a LASM instruction to search, check the "Match Whole Word Only" box, and press "Enter". (Hence, you don't really need "VPB.hlp" nor follow the awkward opcode look-up procedure shown in class).
  • The branch instructions chk and jmp are a bit tricky. The last parameter must encode the number of bytes (or a function thereof) to jump. If the jump is to a previous label, the calculation is trivial and can be performed immediately.
  • However, if the jump is to a forward label (as in the chk instruction of the sample program), the calculation cannot be performed immediately, because we don't yet know the number of bytes to jump. Only after assembling the instructions between the jump instruction and the forward label can we calculate the number of bytes to jump, encode it, and write it back into the incomplete parameter bytecode of the jump instruction.
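
The write-back step for forward labels described in the last two tips can be sketched as below. The Emitter class and its method names are hypothetical. The offset convention is an assumption inferred only from the two branch rows of SAMPLE.LST (distance measured from the address of the offset byte itself, with bit 7 set for a backward jmp); whether a backward chk uses the same flag is not shown by the sample, so confirm it in the RCX2 LASM instruction set reference.

```python
class Emitter:
    """One-pass code buffer that patches forward branch offsets later."""

    def __init__(self):
        self.code = bytearray()
        self.labels = {}    # label name -> byte offset where it was defined
        self.pending = {}   # label name -> positions of unpatched offset bytes

    def define(self, label):
        """Record a label and patch every earlier branch that targets it."""
        here = len(self.code)
        self.labels[label] = here
        for pos in self.pending.pop(label, []):
            self.code[pos] = self._check(here - pos)   # forward distance

    def emit_branch(self, head, label):
        """Append the opcode and fixed bytes, then a one-byte jump offset."""
        self.code += head
        pos = len(self.code)                  # address of the offset byte
        if label in self.labels:              # backward: known immediately
            distance = self._check(pos - self.labels[label])
            self.code.append(0x80 | distance) # ASSUMPTION: bit 7 = backward
        else:                                 # forward: placeholder for now
            self.pending.setdefault(label, []).append(pos)
            self.code.append(0)

    @staticmethod
    def _check(distance):
        if not 0 <= distance <= 0x7F:
            raise ValueError("jump too far for a one-byte offset")
        return distance
```

Any label still left in pending when the end of the file is reached was never defined, which is an assembly error.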

Input file (.asm)

The input file consists of an optional definition list, followed by a list of LASM instructions, each instruction having the following format:

[<label>:] <mnemonic> [<param1>[, <param2>[, <param3> ...]]]
where label is used for branching purposes when needed (for the chk and jmp instructions mainly), mnemonic is a LASM instruction mnemonic, and param1, param2, etc. are optional parameters represented as numbers or numeric definitions.
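
As a rough illustration of this format, the sketch below splits one source line into its label, mnemonic, and parameter list. The function name parse_line is hypothetical, and the sketch assumes at most one instruction per line; parameters are returned as lowercase strings and are not yet converted to numbers.

```python
def parse_line(line):
    """Split '[<label>:] <mnemonic> [<p1>[, <p2> ...]]' into its parts."""
    code = line.split(";", 1)[0].strip().lower()   # drop comment, fold case
    label = None
    if ":" in code:
        # Everything before the first colon is the label.
        label, code = (part.strip() for part in code.split(":", 1))
    head = code.split(None, 1)                     # mnemonic, then the rest
    mnemonic = head[0] if head else None
    params = [p.strip() for p in head[1].split(",")] if len(head) > 1 else []
    return label, mnemonic, params
```

A line holding only a label, such as "loop_1:", yields the label with no mnemonic, and a blank or comment-only line yields nothing at all, so the caller can simply skip it.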

This is an example of a valid input file for the assembler:

; SAMPLE_1.ASM file
; Anything after a semicolon (;) in a line is treated as a "comment"

; List of instructions:
  pwr    7, 2, 7                ; set power level to 7 for all motors
  dir    2, 7                   ; set "forward" direction for all motors
  sent   0, 1                   ; set type of Sensor_1 to "switch" (touch)
  senm   0, 1, 0                ; set mode of Sensor_1 to "boolean" (0 or 1)
  dir    2, 5                   ; set "forward" direction for motors A,C
  out    2, 5                   ; turn "on" motors A,C
loop_1:
  chk    2, 1, 2, 9, 0, loop_2  ; "Sensor_1 value == 1"?
  dir    0, 5                   ; if FALSE goto loop_2
  out    2, 5
  wait   2, 30                  ; create 30 x 10 = 300 milliseconds delay
  dir    2, 1
  out    2, 1                   ;
  wait   2, 30                  ; etc...
  dir    2, 5                   ;
  out    2, 5
loop_2:
  jmp    loop_1                 ; unconditional jump to "loop_1" label

Assembly Listing (.lst)

This assembly listing shows the relationship between each LASM instruction (as documented in the RCX2 LASM instruction set reference) and its equivalent opcode and bytecode sequence. Note that the original assembly listing output by NQC differs from the one shown here: NQC reveals neither the values nor the valid number of parameters for the instructions pwr, senm, chk, and wait. Note also that the starting offset (relative address) of each LASM instruction and its parameters are in decimal, while the equivalent opcode and bytecode sequence is in hex:

; SAMPLE.LST file

; Offset   LASM instructions             Opcode,Bytecode(s)
; ------   -----------------             ------------------
  000      pwr    7, 2, 7                ; 13 07 02 07
  004      dir    2, 7                   ; e1 87
  006      sent   0, 1                   ; 32 00 01
  009      senm   0, 1, 0                ; 42 00 20
  012      dir    2, 5                   ; e1 85
  014      out    2, 5                   ; 21 85
loop_1:
  016      chk    2, 1, 2, 9, 0, loop_2  ; 85 82 09 01 00 00 15
  023      dir    0, 5                   ; e1 05
  025      out    2, 5                   ; 21 85
  027      wait   2, 30                  ; 43 02 1e 00
  031      dir    2, 1                   ; e1 81
  033      out    2, 1                   ; 21 81
  035      wait   2, 30                  ; 43 02 1e 00
  039      dir    2, 5                   ; e1 85
  041      out    2, 5                   ; 21 85
loop_2:
  043      jmp    loop_1                 ; 27 9c
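
Working backwards from the listing above, the parameter packing of the non-branch bytes can be sketched as follows. The function name encode is hypothetical, and every shift and byte order here is inferred solely from the rows of SAMPLE.LST, so parameter ranges and any bits the sample does not exercise should be checked against the RCX2 LASM instruction set reference. For chk, only the six bytes before the jump offset are produced.

```python
def encode(mnemonic, p):
    """Return the bytes for one instruction; p is a list of integers."""
    if mnemonic == "pwr":    # three parameters copied as-is
        return bytes([0x13, p[0], p[1], p[2]])
    if mnemonic == "dir":    # first parameter in the top two bits
        return bytes([0xE1, (p[0] << 6) | p[1]])
    if mnemonic == "out":    # same packing as dir
        return bytes([0x21, (p[0] << 6) | p[1]])
    if mnemonic == "sent":   # two parameters copied as-is
        return bytes([0x32, p[0], p[1]])
    if mnemonic == "senm":   # second parameter shifted left by five bits
        return bytes([0x42, p[0], (p[1] << 5) | p[2]])
    if mnemonic == "wait":   # second parameter as 16 bits, low byte first
        return bytes([0x43, p[0], p[1] & 0xFF, p[1] >> 8])
    if mnemonic == "chk":    # head only; the jump offset byte follows
        return bytes([0x85, (p[2] << 6) | p[0], p[3],
                      p[1] & 0xFF, p[1] >> 8, p[4]])
    raise ValueError("unsupported mnemonic: " + mnemonic)
```

For instance, encode("senm", [0, 1, 0]) gives 42 00 20, matching the row at offset 009.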

Object file format (.rcx)

The RCX object file format to target is the RCX Image format from David Baum's NQC compiler. You must create and include the appropriate RCX header information along with the assembled bytecode in the output binary file.

You don't need to include the optional symbol table (which includes task/subroutine names, etc). Therefore, the fSymbolCount field in the RCXIHeader structure must be zero.
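
A rough sketch of writing such a file is shown below. Only the RCXIHeader structure name and the zero fSymbolCount come from this page. Everything else is an assumption recalled from the NQC sources and must be verified against the RCX Image format document before use: the "RCXI" signature, the version value, the little-endian field order and sizes, the chunk layout, the 4-byte padding, and the numeric chunk and target type values. The function name rcx_image is hypothetical.

```python
import struct

def rcx_image(bytecode, task_number=0, target_type=0):
    """Wrap assembled bytecode in an RCX image with a single task chunk."""
    header = struct.pack(
        "<4sHHHBB",
        b"RCXI",      # ASSUMPTION: signature
        0x0102,       # ASSUMPTION: format version
        1,            # chunk count: one task
        0,            # fSymbolCount: zero, no symbol table
        target_type,  # ASSUMPTION: meaning of this value is not checked here
        0,            # reserved
    )
    # ASSUMPTION: chunk = 16-bit length, type byte (0 = task), number byte.
    chunk = struct.pack("<HBB", len(bytecode), 0, task_number) + bytes(bytecode)
    chunk += bytes(-len(bytecode) % 4)   # ASSUMPTION: pad data to 4 bytes
    return header + chunk
```

Keeping the header builder separate from the assembler proper makes it easy to correct a single field once the layout has been checked against the format document.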

Optional Project Extension

LASM format programs are not very readable. The problem is that the RCX2 virtual machine documentation defines mnemonics for instruction opcodes only, not for other resources used by the RCX virtual machine, such as motors, sensors, timers, etc. However, this does not preclude the programmer from creating such definitions directly. This leads us to an optional extension for LASM, a definition list.

From reading the first instruction "pwr 7, 2, 7", it is not clear that we want to set the power level to 7 for all motors A,B,C. However, if the instruction is rewritten as "pwr ABC, Const, 7" instead, it becomes clearer. Contrast the earlier assembly file with the version below, which uses a definition list:

; SAMPLE_2.ASM file

; Optional definition list: (for improving readability)
  A = 1          ; 001
  AC = 5         ; 101
  ABC = 7        ; 111
  Fwd = 2        ; 10
  Rev = 0        ; 00
  Const = 2      ; next param source treated as "constant" (p.6)
  Switch = 1     ; touch sensor type ("sent" p.52)
  Boolean = 1    ; touch sensor mode ("senm" p.53)
  Sensor_1 = 0   ; sensors are zero-based
  SensorVal = 9  ; next param source treated as "sensor value" (p.6)
  Equal = 2      ; "Equal to" comparison operator ("chk" p.93)
  nil = 0        ; null parameter, ignore
  On = 2         ; "On" motor status ("out" p.27)

; List of instructions:
  pwr    ABC, Const, 7            ; set power level to 7 for all motors
  dir    Fwd, ABC                 ; set "forward" direction for all motors
  sent   Sensor_1, Switch         ; set type of Sensor_1 to "switch" (touch)
  senm   Sensor_1, Boolean, nil   ; set mode of Sensor_1 to "boolean" (0 or 1)
  dir    Fwd, AC                  ; set "forward" direction for motors A,C
  out    On, AC                   ; turn "on" motors A,C
loop_1:
  chk    Const, 1, Equal, SensorVal, Sensor_1, loop_2  ; "Sensor_1 value == 1"?
  dir    Rev, AC                                       ;  if FALSE goto loop_2
  out    On, AC
  wait   Const, 30                ; create 30 x 10 = 300 milliseconds delay
  dir    Fwd, A
  out    On, A
  wait   Const, 30
  dir    Fwd, AC
  out    On, AC
loop_2:
  jmp    loop_1                   ; unconditional jump to "loop_1" label

As an optional part of this project, implement the definition list extension for LASM.
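
One possible way to approach the extension is a small symbol table that is filled from the definition lines and consulted whenever a parameter is not a plain number. The sketch below is only an illustration; the function names parse_definition and resolve are hypothetical, values are assumed to be decimal integers, and labels (which are not definitions) would still need to be handled separately by the branch logic.

```python
def parse_definition(line, symbols):
    """Store 'Name = value' in symbols; return False for any other line."""
    code = line.split(";", 1)[0]             # drop the comment
    if "=" not in code:
        return False
    name, value = (part.strip() for part in code.split("=", 1))
    symbols[name.lower()] = int(value)       # names are case insensitive
    return True

def resolve(param, symbols):
    """Turn a parameter into a number, looking up definitions first."""
    key = param.strip().lower()
    if key in symbols:
        return symbols[key]
    return int(key)                          # raises ValueError if unknown
```

With ABC, Const, and the other names defined this way, "pwr ABC, Const, 7" resolves to the same three numbers as "pwr 7, 2, 7" and therefore assembles to the same bytes.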

Resources

You may find the following links helpful:

RCX2 LASM instruction set
RCX Image format
Bricx Command Center
Kekoa's RCX internals website
Not Quite C (NQC) compiler

Acknowledgements

This lab was created by Luis Paris. It has been edited and slightly modified by Myles McNally.