Skip to content

haldean/x6502

Repository files navigation

+-------------------------------------------------------------+
|                                                             |
|                           x6502                             |
|                 a simple 6502 CPU emulator                  |
|                                                             |
+-------------------------------------------------------------+

    x6502 is an emulator for the 6502 class of processors.
    It currently supports the full instruction set of the
    6502 (plus a few extensions) and has a rudimentary
    simulated I/O bus. It should be able to run arbitrary
    x6502 bytecode with ``correct'' results, although most
    binaries for common 6502 systems (Atari, C64, Apple II,
    etc) won't function as expected, since they expect I/O
    devices to be mapped into memory where there are
    currently none.

    x6502 is freely available under the original 4-clause
    BSD license, the full text of which is included in the
    LICENSE file.

Building and running x6502

    To build x6502, just run `make' in the project root. You
    will need clang and Python installed. To use gcc, change
    the $(CC) var in the Makefile. No libraries beyond POSIX
    libc are used. This will produce the x6502 binary.

    x6502 takes the compiled 6502 object file as an
    argument, and runs it until it encounters an EXT
    instruction (EXT instructions are an extension to 6502
    bytecode, see below). You can use any 6502 assembler to
    compile to 6502 bytecode; `xa' is one that is bundled
    with Debian-based distros. Note that, by default, x6502
    loads code in at address 0x1000; you therefore need to
    either tell your assembler that that's the base address
    for the text section of your binary or override the
    default load address using the `-b' flag of x6502. Note
    that 0x1000 is the default load address for the `xa'
    assembler.

    If you want to compile a version of x6502 that dumps
    machine state after every instruction, run `make debug'
    instead of `make'. This will also disable compiler
    optimizations. This mode really does not play nice with
    vterm mode, so be warned.

Extensions to the 6502 instruction set

    x6502 recognizes two instructions that are not in the
    original 6502 instruction set. These are:

        DEBUG (0xFC): prints debugging information about the
                      current state of the emulator
        EXT (0xFF):   stops the emulator and exits

    Both of these are defined as macros in `stdlib/stdio.s'.
    To disable these extensions, compile with
    -DDISABLE_EXTENSIONS (right now, this can be done by
    adding that flag to the Makefile).

    This also implements a subset of the 65C02 and 65C816
    instruction set, in particular the WAI (0xCB)
    instruction. The WAI instruction pauses the emulator
    until an I/O interrupt is thrown.

I/O memory map:

    There are four I/O devices right now: a character input
    device, a character output device, a virtual terminal
    and a block device. Convenience constants and macros for
    the character I/O devices are defined in
    `stdlib/stdio.s' for use in user programs. Add stdlib to
    your include path and then add `#include <stdio.s>' to
    your program to use these constants.

    I/O options are controlled by setting bits on the I/O
    flag byte at address 0xFF02. The current set of
    supported flags are:

        VTERM_ENABLE (0x01):
            when set, activates vterm mode.
        WAIT_HALT (0x02):
            when set, waits for a keypress input before
            terminating upon receiving an EXT instruction.
            Non-vterm applications will probably want to set
            this flag, as some implementations of ncurses
            will clear the display when the emulator exits.

    When outputting characters, you can control the
    ``paint'' with which the characters are drawn. You can
    do so by modifying the PAINT flag at location 0xFEE8.
    Paints are an OR-ing of a color (bottom 4 bits) and a
    style (top 4 bits). Supported colors are:

                PAINT_BLACK         0x00
                PAINT_RED           0x01
                PAINT_GREEN         0x02
                PAINT_YELLOW        0x03
                PAINT_BLUE          0x04
                PAINT_MAGENTA       0x05
                PAINT_CYAN          0x06
                PAINT_WHITE         0x07

    Supported styles are:

                PAINT_DIM           0x20
                PAINT_UNDERLINE     0x40
                PAINT_BOLD          0x80

    Thus, as an example, an underlined, bold green character
    would have paint 0xC2.

I/O devices:

    The character output device is mapped to 0xFF00. Any
    character written to FF00 is immediately echoed to the
    terminal.

    The character input device is mapped to 0xFF01. When a
    character is available on standard in, an interrupt is
    raised and FF01 is set to the character that was
    received. Note that one character is delivered per
    interrupt; if the user types ``abc'', they will get
    three interrupts one after the other.

    The virtual terminal is activated by setting the
    VTERM_ENABLE bit on the IO flag byte. After the flag is
    set, the data in memory addresses 0xFB00 through 0xFEE7
    are mapped to a 40x25 grid in the host terminal. Data in
    this region is stored in row-major format, and any write
    will trigger a refresh of the vterm.

    Note that even in vterm-mode, the putchar-esque
    character output device is still usable, and will put
    the character at the position directly after the
    position of the last write.

    A commented example of how to use the character I/O
    capabilities of x6502 is provided in
    sample_programs/echo.s, and an example of a vterm
    application is provided in sample_programs/spam.s

    A block device can be mapped in with control addresses
    at 0xFF03 through 0xFF07. To use the block device, you
    must specify a binary disk image to back the device
    using the -d flag. To read from the block device, write
    an address in the disk image to 0xFF03 and 0xFF04, with
    the low byte in 0xFF03. The value at that location in
    the disk image will be written to 0xFF05, which your
    program can then read. To write, set the memory address
    to write to using the same method, then write the
    desired byte to 0xFF06. If any of these operations
    return an error, the byte at 0xFF07 will be nonzero.

Reading the source

    x6502 was written to be easy to understand and read. A
    good place to start is `cpu.h', which defines a few
    constants used throughout the code (mostly around CPU
    flags) as well as the `cpu' struct, which is used pretty
    much everywhere.

    `emu.c' is where the interesting stuff happens; this is
    the main loop of the emulator where opcodes are decoded
    as dispatched. It also handles interrupts and calls out
    to I/O handlers.

    The code for actual opcode interpretation is a little
    strange; there are lots of ``header'' files in the
    opcode_handlers directory that are not really header
    files at all. These files all contain code for handling
    opcode parsing and interpretation; with over 150
    opcodes, having all of the code to handle these in one
    file would be excessive and difficult to navigate, and
    dispatching out to functions to handle each opcode
    carries unnecessary overhead in what should be the
    tightest loop in the project. Thus, each of these header
    files is #included in emu.c in the middle of a switch
    statement, and gets access to the local scope within the
    main_loop function. It's weird but it gets the job done,
    and is the least bad of all considered options.

    The opcode handlers all use convenience functions
    defined in `functions.h', most of which are for the
    various addressing modes of the 6502 or for dealing with
    CPU flags.

    `io.c' is where the I/O bus lives; this is where we
    check to see if the emulated character device has been
    written to and where we raise an interrupt if we've
    gotten input from stdin.

    `generate_debug_names.py' reads the `opcodes.h' header
    and generates `debug-names.h', which contains a mapping
    from opcode to a string representation of that opcode.
    It's only used when dumping CPU state, either because
    the DEBUG flag was set at compile time or because a
    DEBUG instruction was hit in the binary.

    The rest of the files are pretty boring; `main.c' is
    pretty much only responsible for loading bytecode into
    memory and parsing command line arguments and `debug.c' is
    used to provide the `dump_cpu' function, which is a
    fascinating function consisting of almost nothing but
    printfs.

TODO:
    - support buffered input, where the program can read
      more than one input character at once.

THANKS:
    - voltagex on Github for sending me a patch to improve
      the sample_programs readme.
    - anatoly on HN for suggesting I add a bit on source
      code structure to the README.
    - shalmanese for coffee and pie.
    - daumiller for finding the subtraction bug.