Writing a Chip 8 Emulator (Part 1)

By Craig Thomas, Sat 21 June 2014, in category Emulation

chip8, emulator

While I was in the middle of my PhD, I needed a coding project to keep my skills sharp. Even though my area was Computer Science, a lot of my work with Natural Language Generation and Semantics was theoretical (read: my Computer Science degree did not involve a lot of programming). I needed something fun that was not related to the many papers and books I was reading – something that I really enjoyed for no practical reason. Enter nostalgia for 1980’s vintage computers.

The CoCo

My first computer was a TRS-80 Color Computer (http://en.wikipedia.org/wiki/TRS-80_Color_Computer). The Color Computer (CoCo) was a Motorola 6809-based machine that ran at 0.89 MHz, and was Tandy’s solution for at-home personal computing. The Color Computer line was quite different from the Zilog Z80-based business machines that earned the nickname “Trash 80”. The only similarities between those lines are the TRS-80 moniker and awesome chrome plating. My experience with the CoCo at the time consisted of punching in programs from various computer magazines (such as the amazing Rainbow publication), as well as typing school reports using Scripsit, and trying my hand at designing my own video games.

At the time, I didn’t appreciate the power of the CoCo. While it lacked dedicated sound and sprite hardware that made the Commodore 64 an awesome games platform, the 6809 processor was quite advanced at that point in time. Now – having more experience with computing theory and knowledge of computing hardware in general – I wanted to learn more about it. So, I decided to write a virtual version of one. However, having not written an emulator before, I needed to test out what techniques would work first on a smaller project.

Enter the Chip 8

The Chip 8 isn’t technically a hardware device – it’s actually an interpreted language. The whole point of the Chip 8 was to create a language which would have a standardized execution profile across different hardware platforms (much like the Java language specification and Java Virtual Machine). According to sources such as Wikipedia, Chip 8 virtual machines were written for several different platforms from the late 70’s to the early 90’s – the most notable being for HP graphing calculators.

What makes writing a Chip 8 emulator a good learning project is its simplicity. There are roughly 40 instructions, each of which is composed of 2 bytes. Compared to other architectures, the Chip 8 has only a single addressing mode (inherent), a simple memory structure, and straightforward I/O routines. There is also a wealth of knowledge readily available about the Chip 8, and many other implementations of it available for reference if you need to know how something should work.

Writing the Emulator

There are many sources out there that describe the Chip 8 and how to emulate it in more detail. I’ll touch upon some of the high points. For my first shot at writing a Chip 8 emulator, I decided to use C (later I went on to re-write it in both Python and Java). One of my reasons for doing so was C’s control over primitive types and memory allocation.

Defining Bytes and Words

There are only two real types that we need to define for the Chip 8 - a byte and a word. A byte is simply a structure or data type capable of storing an 8-bit unsigned integer. A simple way to represent a byte is to create a type definition for one using C’s char, which is usually 8-bits in length:

typedef unsigned char byte;

The other type that we need to define is a word – two bytes put together (in other words, a word is a 16-bit unsigned integer). However, we often want to access either the high or low 8-bits individually. While we could use a byte mask and shift values to accomplish this task, it’s much easier to define a structure that will allow us this type of access for free. C provides a union type that allows us to do this quite easily:

typedef union {
   unsigned short int WORD;      /* the full 16-bit value            */
   struct {
      #ifdef WORDS_LITTLE_ENDIAN
         byte high, low;         /* high byte first in memory        */
      #else
         byte low, high;         /* low byte first in memory (x86)   */
      #endif
   } BYTE;
} word;

The union essentially defines a word by saying there are two components - a WORD, which is an unsigned short integer (16-bits long), as well as a structure called a BYTE that has high and low components in it. So, to access the full 16-bit value, assuming you had a word called w, you would use w.WORD. Similarly, if you wanted the high 8-bit value of the word, it’s just w.BYTE.high, and w.BYTE.low for the lower 8-bits.
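For comparison, here is a minimal sketch of the mask-and-shift approach mentioned earlier. The helper names (high_byte, low_byte, make_word) are my own for illustration and are not part of the emulator source. The advantage of this style is that the result does not depend on how the host lays out bytes in memory.

```c
#include <stdint.h>

/* Hypothetical helpers: extract the halves of a 16-bit value using
 * shifts and masks, which works the same on any host byte order. */
static uint8_t high_byte(uint16_t value)
{
    return (uint8_t)((value >> 8) & 0xFF);
}

static uint8_t low_byte(uint16_t value)
{
    return (uint8_t)(value & 0xFF);
}

/* Rebuild the 16-bit value from its two halves. */
static uint16_t make_word(uint8_t high, uint8_t low)
{
    return (uint16_t)(((uint16_t)high << 8) | low);
}
```

For the value $FFEE, high_byte gives $FF and low_byte gives $EE, which is what w.BYTE.high and w.BYTE.low are expected to return when the union is laid out correctly for the host.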

Endianness

The #ifdef WORDS_LITTLE_ENDIAN directive is a way of dealing with byte ordering. In machines that deal with data larger than 8-bits, there are two ways to store the data. The most significant byte (MSB) can be stored in the first memory location, or it can be stored last.

For example, if we have the 16-bit value $FFEE, the most significant byte is $FF, and the least significant byte is $EE. In memory, two locations (say 54 and 55) are used to represent this value. The endianness of the architecture determines what memory location the MSB is stored in. For big-endian machines, $FF would be stored in location 54, while $EE would be stored in location 55. It would be stored as $FFEE.

With little-endian machines, the locations are reversed - $EE would be stored in location 54, while $FF would be stored in location 55. It would be stored as $EEFF.

Notice that in the union above, we define two different orderings for the byte interpretation of the word - high, low when WORDS_LITTLE_ENDIAN is defined, and low, high when it is not. By defining a word in this way, we can change the byte ordering of the emulator by defining a single constant. Notice also that the orderings appear to be backwards for the name of the constant - this is because I’m developing on an x86-based system, which is little-endian. On x86 the low byte comes first in memory, so the default low, high ordering (with the constant left undefined) is the correct one there. Defining the constant flips the ordering for a host that stores the high byte first.
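If you are not sure which ordering your own machine uses, a small check like the following can tell you. This is only a sketch, and host_is_little_endian is a hypothetical helper name that does not appear in the emulator. It stores $FFEE in a 16-bit variable and looks at whichever byte landed in the first memory location.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 when the host stores the least
 * significant byte first (little-endian), and 0 otherwise. */
static int host_is_little_endian(void)
{
    uint16_t value = 0xFFEE;
    uint8_t first;

    /* Copy only the byte at the lowest address of the value. */
    memcpy(&first, &value, 1);
    return first == 0xEE;
}
```

On an x86 machine this should return 1, since $EE is stored first.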

Memory

The Chip 8 is usually defined with 4 kilobytes (4K) of memory. The first 512 bytes are reserved for the interpreter, with the first 80 bytes being reserved for sprite information relating to the hexadecimal characters 0-9 and A-F (16 characters at 5 bytes each). Chip 8 programs typically start at memory address $200.

The memory model for the Chip 8 is quite simple. Some computer architectures at the time – such as the CoCo – had memory mapped I/O routines. This meant that to run specific input or output routines, you needed to write a byte sequence into a particular memory address. Because the Chip 8 I/O routines are captured as actual CPU instructions, there is no need to trap memory accesses and pass them to an I/O handler. If you want to, you could easily represent its memory as a fixed length array of bytes.
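As a rough illustration of the fixed-length array approach, the sketch below lays out 4K of memory and places one character sprite in the reserved area. The names (ram, glyph_address, load_glyph_zero) are hypothetical, and I am assuming the 16 character sprites are packed back to back starting at address 0, which is a common convention rather than a requirement. The bytes shown for "0" are the ones commonly used by Chip 8 interpreters.

```c
#include <stdint.h>
#include <string.h>

#define MEMORY_SIZE    0x1000  /* 4K of addressable memory           */
#define PROGRAM_START  0x200   /* first address used by programs     */
#define GLYPH_HEIGHT   5       /* each built-in character is 5 bytes */

static uint8_t ram[MEMORY_SIZE];

/* Sprite for the character "0": one byte per row, with only the top
 * four bits of each byte used. */
static const uint8_t glyph_zero[GLYPH_HEIGHT] = {
    0xF0, 0x90, 0x90, 0x90, 0xF0
};

/* Address of the sprite for a hex digit (0x0 to 0xF), assuming the
 * sprites are packed from address 0 upwards. */
static uint16_t glyph_address(uint8_t digit)
{
    return (uint16_t)((digit & 0x0F) * GLYPH_HEIGHT);
}

/* Copy the "0" sprite into place. A full emulator would copy all 16. */
static void load_glyph_zero(void)
{
    memcpy(&ram[glyph_address(0)], glyph_zero, GLYPH_HEIGHT);
}
```

With this layout the last sprite (for F) ends at byte 80, leaving the rest of the first 512 bytes free before programs begin at $200.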

For my emulator, I chose to represent memory as a global variable in the form of a pointer to a malloced block of bytes. I then created simple helper functions to initialize memory. Note that the helper function takes the size of memory as an argument. The reason behind this is that I wanted to reuse as many components as possible from the Chip 8 in a CoCo emulator, so being able to specify memory size was useful:

int
memory_init(int memorysize)
{
   memory = (byte *)malloc(sizeof (byte) * memorysize);
   return memory != NULL;
}

I also created helper functions to read and write to memory locations:

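/* Read one byte from emulated memory. */
inline byte
memory_read(register int address)
{
   return memory[address];
}

/* Write one byte to emulated memory at the given 16-bit address. */
inline void
memory_write(register word address, register byte value)
{
   memory[address.WORD] = value;
}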

The purpose of these helper functions is again for the future. Because of the simplicity of the Chip 8 memory structure, you don’t really need functions to read and write from memory. However, for the CoCo emulator, because I/O is memory mapped, you need to filter the reads and writes so that you can perform I/O functions if need be. It was just as easy to create the functions here and get used to using them. To help with performance, I used the inline keyword to ask the compiler to expand the function where it is called from, instead of jumping to the function. Likewise, the register keyword asks the compiler to keep the specified variables in a CPU register, again in an effort to make the functions faster. Both keywords are only hints, and the compiler is free to ignore them.
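To give a feel for what filtering reads might look like on a machine with memory mapped I/O, here is a small sketch. Everything in it is hypothetical: the IO_START boundary, the io_read handler and the filtered_read name are placeholders for illustration, not code from either emulator.

```c
#include <stdint.h>

#define MEM_SIZE  0x10000  /* 64K address space                       */
#define IO_START  0xFF00   /* hypothetical start of the I/O region    */

static uint8_t mem[MEM_SIZE];
static int io_reads;       /* counts trapped reads, for illustration  */

/* Stand-in for a device handler. A real one would talk to a keyboard,
 * sound chip, and so on. */
static uint8_t io_read(uint16_t address)
{
    (void)address;
    io_reads++;
    return 0x00;
}

/* Route reads in the I/O region to the handler, and everything else
 * to plain memory. */
static uint8_t filtered_read(uint16_t address)
{
    if (address >= IO_START) {
        return io_read(address);
    }
    return mem[address];
}
```

Because every read already goes through one function, adding a check like this later does not require touching the rest of the emulator.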

The Virtual CPU

General Purpose Registers

The Chip 8 CPU contains 16 general-purpose 8-bit registers. The registers - from 0 through F - can store values from 0 to 255. The registers are all unsigned, meaning that the Chip 8 does not differentiate between positive and negative numbers. The F register is usually used to store information relating to carries or borrows when addition or subtraction operations are performed. The Chip 8 registers can easily be represented in an array (note – from here on, I will be using hexadecimal instead of decimal):

byte v[0x10];
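As an example of how the F register gets used, here is a sketch of an addition between two registers that records the carry. The function name add_registers is my own, and the sketch uses uint8_t from stdint.h in place of the byte type so that it stands alone.

```c
#include <stdint.h>

static uint8_t v[0x10];  /* the 16 general-purpose registers */

/* Add register y into register x. The result wraps at 8 bits, and the
 * F register is set to 1 if the sum did not fit, or 0 if it did. */
static void add_registers(uint8_t x, uint8_t y)
{
    uint16_t sum = (uint16_t)(v[x] + v[y]);

    v[x] = (uint8_t)(sum & 0xFF);
    v[0xF] = (sum > 0xFF) ? 1 : 0;
}
```

Adding 200 and 100, for instance, leaves 44 in the target register (300 minus 256) and sets the F register to 1.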

Index Register

In addition to the general-purpose registers, the Chip 8 contains a 16-bit index register I. The index register is used when iterating over arrays or tables. Its value is typically added to some other register to provide an address into memory.

word i;
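A sketch of that kind of indexed access might look like the following. The names index_reg and read_at_index are hypothetical, and wrapping the address to stay inside 4K is my own assumption about how to handle an out-of-range result.

```c
#include <stdint.h>

static uint8_t  mem[0x1000];  /* 4K of memory                  */
static uint16_t index_reg;    /* stands in for the I register  */

/* Read the byte at I + offset. The mask keeps the address inside the
 * 4K memory (an assumption made for this sketch). */
static uint8_t read_at_index(uint8_t offset)
{
    return mem[(index_reg + offset) & 0x0FFF];
}
```

Walking a table then becomes a matter of pointing I at the start of the table and stepping the offset.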

The Stack Pointer

A special register called the stack pointer SP is used to store the current top of the hardware stack. The stack pointer is not directly accessible to the programmer. Instead, it is used implicitly by the instructions that call and return from subroutines. When a subroutine is called, the return address is stored in memory at the location pointed to by SP, and SP is then incremented. RTS (return from subroutine) reverses the process: SP is moved back and the return address is read from the stack into the program counter. Note that STOR (store registers) does not use the stack; it writes the registers to memory starting at the address held in the index register. Because the stack pointer can point anywhere in memory, it needs to be represented by a 16-bit value:

word sp;
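Here is one way the stack pointer could be used when calling and returning from a subroutine. This is a sketch only: the function names, the stack base of $52 and the choice to store the high byte first are all assumptions of mine, and it uses plain uint16_t values rather than the word union to stay self-contained.

```c
#include <stdint.h>

static uint8_t  mem[0x1000];
static uint16_t sp = 0x52;   /* hypothetical stack base          */
static uint16_t pc = 0x200;  /* address of the next instruction  */

/* Save the return address on the stack, then jump to the target. */
static void call_subroutine(uint16_t target)
{
    mem[sp++] = (uint8_t)(pc >> 8);    /* high byte of return address */
    mem[sp++] = (uint8_t)(pc & 0xFF);  /* low byte of return address  */
    pc = target & 0x0FFF;
}

/* Pop the return address off the stack back into the PC. */
static void return_from_subroutine(void)
{
    uint8_t low  = mem[--sp];
    uint8_t high = mem[--sp];

    pc = (uint16_t)((high << 8) | low);
}
```

After a call followed by a return, both the PC and SP should be back where they started.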

Delay and Sound Timers

The Chip 8 defines two different timers – the delay timer register DT and the sound timer register ST. Both timers fire 60 times a second. Each time a timer fires, the value in its register is decremented by 1 until it reaches zero. For the sound timer, a beep is emitted by the Chip 8 whenever the value is not zero. Both timers can store values from 0 to 255, making them byte sized.

byte st;
byte dt;
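The decrement logic can be sketched in a few lines. The function name tick_timers is hypothetical, and the sketch assumes something else in the emulator arranges for it to be called 60 times a second.

```c
#include <stdint.h>

static uint8_t dt;  /* delay timer */
static uint8_t st;  /* sound timer */

/* Decrement both timers, stopping at zero. Returns 1 while the sound
 * timer is still non-zero (meaning the beep should be audible), and
 * 0 otherwise. */
static int tick_timers(void)
{
    if (dt > 0) {
        dt--;
    }
    if (st > 0) {
        st--;
    }
    return st > 0;
}
```

Checking for zero before decrementing matters here, since the registers are unsigned and would otherwise wrap around to 255.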

Program Counter

Perhaps the most important register is the program counter PC. The program counter keeps track of where in memory the next instruction should be fetched from. Because the next instruction can come from anywhere in memory, the PC needs to store a 16-bit integer, making it word sized.

word pc;
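To show how the program counter drives instruction fetching, here is a minimal sketch. The function name fetch is my own, and masking the address to 4K is an assumption. Chip 8 instructions are two bytes long and stored with the most significant byte first, so the PC moves forward by 2 after each fetch.

```c
#include <stdint.h>

static uint8_t  mem[0x1000];
static uint16_t pc = 0x200;  /* programs typically start at $200 */

/* Read the two-byte instruction at the PC (high byte first), then
 * advance the PC to the next instruction. */
static uint16_t fetch(void)
{
    uint8_t high = mem[pc & 0x0FFF];
    uint8_t low  = mem[(pc + 1) & 0x0FFF];

    pc = (uint16_t)(pc + 2);
    return (uint16_t)((high << 8) | low);
}
```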

Representing the CPU

To make all of these registers easier to keep track of, I defined a single structure:

typedef struct {
    byte v[0x10];      /**< Registers                 */
    word i;            /**< Index register            */
    word pc;           /**< Program Counter register  */
    word sp;           /**< Stack Pointer register    */
    byte dt;           /**< Delay Timer register      */
    byte st;           /**< Sound Timer register      */
    word operand;      /**< The current operand       */
} chip8regset;

This allowed me to create a single global variable called cpu that is a chip8regset. One of the items that I haven’t yet described is the operand. This is essentially the instruction that was fetched by the processor from the current address pointed to by the PC. The reason I have this here is because you often need to perform actions on it, and having it in one place is very convenient.
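As an illustration of the kind of actions you end up performing on the operand, the sketch below splits a 16-bit instruction into the fields that Chip 8 instructions are usually described with. The decoded structure and the decode function are hypothetical names of mine, not part of chip8regset.

```c
#include <stdint.h>

typedef struct {
    uint8_t  family;  /* top 4 bits, selects the instruction group */
    uint8_t  x;       /* next 4 bits, usually a register number    */
    uint8_t  y;       /* next 4 bits, usually a register number    */
    uint8_t  kk;      /* low 8 bits, an immediate value            */
    uint16_t nnn;     /* low 12 bits, a memory address             */
} decoded;

/* Split a fetched instruction into its commonly used fields. */
static decoded decode(uint16_t operand)
{
    decoded d;

    d.family = (uint8_t)((operand >> 12) & 0x0F);
    d.x      = (uint8_t)((operand >> 8) & 0x0F);
    d.y      = (uint8_t)((operand >> 4) & 0x0F);
    d.kk     = (uint8_t)(operand & 0xFF);
    d.nnn    = (uint16_t)(operand & 0x0FFF);
    return d;
}
```

Not every instruction uses every field, so a real decoder would only look at the ones that apply to the instruction group in question.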

End of Part 1

In this article, I discussed the memory structure of the Chip 8, and described the elements that go into representing a Chip 8 CPU. In the next article, I will discuss the main emulator loop, as well as how to handle keyboard and screen interactions. The full source code of my Chip 8 emulator is available on GitHub. If C isn’t your language, you can also feel free to check out my Python implementation (which is meant specifically as a teaching tool), or my Java implementation.