This is a draft!
This is a work in progress, it's incomplete, and may have notes to myself, omissions etc.

This chapter discusses the architecture of the CFT processor from a programmer's perspective. The CFT is a solid-state, 16-bit architecture reminiscent, among others, of the DEC PDP-8 (the first computer to be famously described as a ‘mini’), with a goodly portion of MOS 6502 thrown in.

This document was updated in 2019 to include changes in the machine's architecture.

B1.1. CFT—The Little Processor That Couldn't

Look, let's face it. You're here to be amazed. ‘Hey look, this person built their own computer! Cool!’ (said no-one ever) You're probably reading this on a computer with a multi-core CPU with billions of transistors, billions of bits of memory, running at billions of cycles a second. If a device can't push literally 15 billion bits of data to its screen every second, you threaten to murder whoever designed it. It's okay. We're all jaded. So, time to face facts.

You're here to be amazed.

You won't.

It's best you face the disappointment now, and then we can manage your expectations together. Sounds good? Good.

So. The CFT processor itself is a fairly unsurprising design. It is a stored program computer, a Von Neumann architecture machine. Just like the device you're reading this on, its programs and data are stored together, mixed in the same type of memory. It just has much, much less memory.

All instructions are exactly 16 bits wide and are made up of a single 16-bit word each that specifies the instruction, addressing mode, and operand. Instructions that take memory operands use four bits to identify the instruction, two to identify the addressing mode, and ten for the operand. Instructions that don't reference memory have different formats, but they all fit in exactly one word.

Like the 6502, the PDP-8, and many of the oldest computers, and probably unlike the device you're reading this on, the CFT is an Accumulator Architecture: most instructions operate on a single, general purpose register known as the AC. Including the Accumulator, and this is a bit surprising, there are eighteen whole registers available: four 16-bit Major Registers, two 16-bit Minor Registers, eight 8-bit memory Bank Registers used in memory management, and five single-bit flag registers that can be tested to make decisions.

Of all those, only the Accumulator is accessible directly.

In keeping with the mini theme, the processor is built around a 16-bit word length. It can only access memory in 16-bit quantities. Depending on how you want to see it, it either lacks the ability to process bytes, or its bytes can be anywhere from 5 to 16 bits wide. The PDP-8 was the same, except it could only access 12-bit values and stored 6-bit bytes in ‘SIXBIT’ encoding. In most of this document, I will be using the units Word and KW to denote 16-bit words and groups of 1,024 16-bit words respectively.

To make up for the lack of registers, I follow the PDP-8 and 6502 school of thought. There are addressing modes to access the lower 1,024 words of memory and use them as global variables or other registers. This is known as Page Zero.

Unlike the 6502, but like the PDP-8, there are some magic locations on Page Zero. The PDP-8 has eight locations that act as index registers that automatically increment when referenced. The CFT has 256 locations that can be used as far memory pointers, auto-increment or auto-decrement index registers, or stack pointers. These locations behave like additional addressing modes and make coding loops a lot tighter than you'd expect from a processor that's 90% limitations, 10% happy accidents.

Like many recent home-designed processors, the CFT is microcoded. Rather than relying on a hardwired control unit, the instruction set is built as a huge truth table using ROMs. This allows instructions to be debugged without rewiring the processor. It's been a great boon in developing new behaviours and new instructions whose need became apparent as the scope of the project got more and more ~Byzantine~ extended.

The instruction set itself started off being very simple and orthogonal, just like the PDP-8's. After several versions, this is no longer the case. This rubs me the wrong way because I really like the minimalism of the PDP-8 instruction set. But at least the current version of the CFT supports re-entrancy and recursion now.

As you'd expect from a tiny home-designed machine, there are many things missing from this design.

  • There are no privilege levels. Bye bye multi-user operating systems.
  • There are no processor exceptions. Bye bye memory management.
  • There are no memory managing features. So far so bad: the CFT will never run Unix. But wait, there's less.
  • There is no pipelining. Bye bye speed.
  • There is no speculative execution. Bye bye even more speed. At least we'll never be vulnerable to Spectre exploits either.
  • There is no hardware stack. Bye bye Forth. Oh, wait. We do have Forth because masochism. The 2019 version of the CFT has multiple hardware stacks and a hardware stack pointer. The masochism remains.
  • There is no floating point arithmetic. So no, it won't run Crysis.
  • There is no integer division.
  • There is no integer multiplication.
  • Hell, there's no integer subtraction.
  • ~You can't shift or roll by an arbitrary number of bits.~ The 2019 version of the ALU can do this (provided you're okay with ‘arbitrary’ meaning ‘1 to 15’).

On the other hand, the PDP-8 had less than even this, and did okay enough!

In the style of 1960s computers, the CFT's simplicity blurs the dividing line between processor and peripherals and allows for instruction set extensions to be provided by peripherals or co-processors.

Speaking of peripherals, the CFT is a pure 16-bit machine. It can address up to 64 kiloWords of RAM—remember, that's 65,536 words, not 65,000 words (or, you know, 50,000 words if your job is marketing hard drives). A memory banking feature we'll discuss later extends this to 24 bits (16 MW!) using memory banking registers like those of the 65C816, while adding other, ~more severe shortcomings~ refreshingly challenging problems to solve.

So far the design has been very Accumulator-y, and what's more Accumulator-y than the 6502? (and isn't a Hollerith tabulating machine?). To attract the Z80 crowd too, the CFT can also address up to 65,536 words of I/O space—what Zilog and Intel termed ‘I/O ports’. This is separate from memory space and is accessed using different instructions. Limitations in the instruction set and address decoding make accessing the top 64,512 I/O addresses challenging or slow, whichever has fewer letters. 1,024 I/O addresses should be enough for anyone. Z80 kids, you only have 256 I/O ports. At least you should be ecstatic!

The next sections will discuss this trainwreck of a design in (possibly literally) painful detail.

B1.2. Power On and Reset

The processor samples its reset input. When any device asserts a reset, the processor goes into a Reset Hold condition and waits for a preset number of processor cycles for its clock generator and other units to stabilise. During this time, a number of registers are reset to their initial values. When the cycle count expires, the Reset Hold condition is also lifted, and execution begins at 24-bit address 80:0000, the first address of the ROM. If the front panel has been configured for RAM only, the address is 00:0000 instead (the beginning of RAM—but in this mode the CFT starts halted and boot code must be toggled in).

B1.3. Word Size

The word size is 16 bits. There are no facilities for accessing quantities smaller than one word, and no single-instruction facilities for accessing quantities longer than one word.

In this book, one kilo-Word (kW) is 2^10 = 1,024 Words.

B1.4. Data Types

The CFT processor's hardware is dimly aware of the following data types:

  1. 16-bit unsigned integers representing the range 0 to 65,535.
  2. 16-bit signed integers representing the range -32,768 to 32,767 in two's complement.
  3. 16-bit signed integers representing the range -32,767 to 32,767 in one's complement.
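
To make the three readings concrete, here is a minimal Python sketch (not part of the CFT toolchain; the function names are made up for illustration) that interprets the same 16-bit pattern in each way:

```python
MASK = 0xFFFF  # one 16-bit CFT word

def as_unsigned(word: int) -> int:
    """Read the word as an unsigned integer, 0 to 65,535."""
    return word & MASK

def as_twos_complement(word: int) -> int:
    """Read the word as a two's complement integer, -32,768 to 32,767."""
    word &= MASK
    return word - 0x10000 if word & 0x8000 else word

def as_ones_complement(word: int) -> int:
    """Read the word as a one's complement integer, -32,767 to 32,767."""
    word &= MASK
    # A negative value is the bitwise complement of its magnitude.
    return -(~word & MASK) if word & 0x8000 else word
```

The bit pattern 8000 reads as 32,768 unsigned, -32,768 in two's complement and -32,767 in one's complement. The hardware stores the same sixteen bits in every case.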

The choice of arithmetic operations was made so the architecture can afford to be type-agnostic. The only arithmetic operations are addition, left shifts and right shifts. Of these, right shift has a bitwise (unsigned) variant and a separate sign-extending (signed) variant.
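
The difference between the two right shifts is easiest to see on a value with the top bit set. This is a Python sketch of one-bit shifts only. The function names are illustrative and are not CFT mnemonics.

```python
MASK = 0xFFFF  # one 16-bit CFT word

def shift_right_bitwise(word: int) -> int:
    """Unsigned variant: a zero is shifted into bit 15."""
    return (word & MASK) >> 1

def shift_right_signed(word: int) -> int:
    """Sign-extending variant: bit 15 is copied back into itself."""
    word &= MASK
    return (word >> 1) | (word & 0x8000)
```

For the word 8000, the bitwise variant gives 4000 while the sign-extending variant gives C000, which is -16,384 (half of -32,768) when read as two's complement.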

There are facilities to detect numeric overflow and carry out for the addition instruction as well as incrementation and decrementation.

All other support (and even signed integer support, really) is implemented in software.
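
As an illustration of what "implemented in software" means here, the Python sketch below builds subtraction and multiplication out of nothing but 16-bit addition and one-bit shifts. It assumes a bitwise complement is available (the text only lists the arithmetic operations), and it is not the CFT's actual library code.

```python
MASK = 0xFFFF  # one 16-bit CFT word

def add16(a: int, b: int) -> int:
    """16-bit addition; the carry out is simply dropped here."""
    return (a + b) & MASK

def negate16(a: int) -> int:
    """Two's complement negation: complement, then add one."""
    return add16((a & MASK) ^ MASK, 1)

def sub16(a: int, b: int) -> int:
    """a - b, computed as a + (-b)."""
    return add16(a, negate16(b))

def mul16(a: int, b: int) -> int:
    """Shift-and-add multiplication, keeping the low 16 bits."""
    a &= MASK
    b &= MASK
    result = 0
    while b:
        if b & 1:                # low bit of the multiplier set?
            result = add16(result, a)
        a = (a << 1) & MASK      # left shift the multiplicand
        b >>= 1                  # bitwise right shift the multiplier
    return result
```

Because the result is truncated to 16 bits, the same routines work for unsigned and two's complement values, which is the type-agnostic property described above.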

B1.5. Addressing

The CFT architecture can address up to 16 MW of memory, plus 1,024 W of I/O space. Memory is accessed as banks of 64 kW indexed by memory bank registers.

B1.5.1. Banks

The 64 kW address space of the CFT is expanded to 16 MW using a memory banking scheme. The top eight bits of a 24-bit address are obtained from one of eight 8-bit Memory Bank Registers, and addressing modes generate the rest. These banks are very simplistic: attempting to access 00:FFFF+1 doesn't get you 01:0000 but 00:0000. Apple IIgs programmers will be right at home with this.
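
The wrap-around falls straight out of how the 24-bit address is assembled. This is a small Python sketch of it, where `physical_address` and `show` are names invented for this example:

```python
def physical_address(bank: int, offset: int) -> int:
    """Combine an 8-bit bank register value with a 16-bit address.

    The offset is truncated to 16 bits before the bank is attached,
    so a carry out of the low 16 bits never reaches the bank.
    """
    return ((bank & 0xFF) << 16) | (offset & 0xFFFF)

def show(addr: int) -> str:
    """Format a 24-bit address in the bank:offset style used in the text."""
    return f"{addr >> 16:02X}:{addr & 0xFFFF:04X}"
```

For example, `show(physical_address(0x00, 0xFFFF + 1))` gives `00:0000`, not `01:0000`.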

There are separate bank registers used to fetch code, to store data, to access the stack, and to access a system-wide Page Zero. Another four bank registers for general purpose access allow up to 512 KW to be accessed without (much) trickery. If you want 32-bit linear memory, better go back to ARM!

B1.5.2. Memory Management Contexts

It's becoming hard to find the venerable 74670 4×4-bit register file, four of which made up the original Memory Bank registers. The best modern fit was a small SRAM. As it happens, the smallest 8-bit SRAM I found was organised as 2,048×8 bits. We only need 8×8 bits. What to do with the remaining 2,040? Why, contexts of course!

Yes, the CFT may have an unenviable memory management unit, but it has a memory management unit.

An 8-bit Context Register controls which of 256 sets of Memory Bank Registers to use. This way, different parts of the computer may have different views of memory. This actually opens up a lot of possibilities like having separate ‘processes’ in a multi-tasking definition.

The updated hardware uses contexts internally too, for interrupt handling and to implement OS services.

B1.5.3. Pages

The instruction set's 10-bit operand fields imply a ‘page’ limitation. Within each bank, main memory is split up into pages, each 1,024 Words. Instructions usually reference memory addresses relative to the page they are executing in. This is a little bit like the universally loathed segmented architecture of the Intel 8086, minus the popularity and utility. Other than convenience and the fact that instruction operands are ten bits wide, nothing stops a program from accessing any memory page and 63 of the 64 pages in the (unexpanded) memory space don't have any special semantics. But any location further than 1,024 Words from the currently executed address must be accessed using Indirect mode, and the location of the indirection pointer must be within those 1,024 Words. Very much like the PDP-8, there.
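
In terms of bits, a 16-bit address splits into a 6-bit page number and a 10-bit offset. The Python sketch below (helper names are illustrative) shows the split and the check that decides whether a target can be reached without indirection:

```python
PAGE_BITS = 10                       # operand width, so pages are 1,024 Words
OFFSET_MASK = (1 << PAGE_BITS) - 1   # 0x3FF

def page_of(addr: int) -> int:
    """Page number (0 to 63) of a 16-bit address."""
    return (addr & 0xFFFF) >> PAGE_BITS

def offset_of(addr: int) -> int:
    """Offset (0 to 1,023) of an address within its page."""
    return addr & OFFSET_MASK

def reachable_directly(pc: int, target: int) -> bool:
    """True if target lies on the page being executed, or on Page Zero."""
    return page_of(target) in (page_of(pc), 0)
```

Note that the test is about page membership, not distance: 07FF and 0800 are adjacent words but sit on different pages.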

B1.5.4. Page Zero

The one exception is the first page, Page Zero. It is given special treatment by the instruction set. All instructions that access memory can also reference memory or I/O addresses in Page Zero, no matter what page they are executing in. As such, Page Zero is always used for system variables, constants, operating system vectors and other data that must be globally accessible.

Any instruction with the R field set will access memory in Page Zero.

B1.5.5. Page Zero Auto-Index Locations

The 256 words in Page Zero addresses 0300–03FF (inclusive) are so-called Autoindex Registers. In a way, these add supplemental addressing modes by stealing two bits from the instruction operand field when Page Zero is accessed in Indirect Mode.

There are four kinds of Autoindex Registers, and for a change of PDP, they resemble PDP-11 index registers:

  • Simple index: the register is used for indirect, cross-bank memory access.

  • Auto-increment: after this address is used to access memory indirectly, the value at that location on Page Zero is incremented.

  • Auto-decrement: after this address is used to access memory indirectly, the value at that address is decremented.

  • Stack pointer: after this address is used to write to memory indirectly, the value is incremented. Before the address is used to read from memory indirectly, the location is decremented. This is meant to implement a secondary stack.
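
The four behaviours can be modelled in a few lines of Python. This is a sketch only: it ignores memory banks entirely (everything lives in one flat dictionary), and the `kind` strings are labels invented for the example.

```python
MASK = 0xFFFF  # one 16-bit CFT word

def indirect_write(mem: dict, reg: int, kind: str, value: int) -> None:
    """Write value through the pointer held at Page Zero location reg."""
    ptr = mem.get(reg, 0)
    mem[ptr] = value & MASK
    if kind in ("increment", "stack"):   # post-increment after the write
        mem[reg] = (ptr + 1) & MASK
    elif kind == "decrement":            # post-decrement after the write
        mem[reg] = (ptr - 1) & MASK

def indirect_read(mem: dict, reg: int, kind: str) -> int:
    """Read through the pointer held at Page Zero location reg."""
    ptr = mem.get(reg, 0)
    if kind == "stack":                  # pre-decrement before the read
        ptr = (ptr - 1) & MASK
        mem[reg] = ptr
        return mem.get(ptr, 0)
    value = mem.get(ptr, 0)
    if kind == "increment":
        mem[reg] = (ptr + 1) & MASK
    elif kind == "decrement":
        mem[reg] = (ptr - 1) & MASK
    return value
```

With a stack register, two writes followed by two reads return the values in reverse order and leave the pointer where it started, which is what makes these locations usable as secondary stacks.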

There are 64 registers of each of the four types. Each of these locations, even the simple index, accesses memory relative to one of the eight Memory Banking Registers. Here is the same information in tabular form:

Address (low 10 bits)   Addresses   Function                             Relative To
0n'nnnnnnnn             512         Global variable/constant/register    -
10'nnnnnnnn             256         Global variable/constant/register    -
11'00XXXnnn             64          8 Simple Index registers             Bank Register nnn
11'01XXXnnn             64          8 Auto-increment registers           Bank Register nnn
11'10XXXnnn             64          8 Auto-decrement registers           Bank Register nnn
11'11XXXnnn             64          8 Stack Pointers                     Bank Register nnn
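
The bit patterns in the table translate directly into a decoder. The following Python sketch (the function name and the returned labels are made up for this example) classifies a Page Zero address:

```python
KINDS = ("simple index", "auto-increment", "auto-decrement", "stack pointer")

def classify(addr: int):
    """Return (kind, bank register number) for a Page Zero address.

    Only the low 10 bits are examined. Plain locations (000 to 2FF)
    return ("plain", None).
    """
    a = addr & 0x3FF
    if a < 0x300:                 # top two bits are not both 1
        return ("plain", None)
    kind = KINDS[(a >> 6) & 0b11] # bits 7-6 select the behaviour
    bank = a & 0b111              # bits 2-0 select MB0 to MB7
    return (kind, bank)
```

For instance, `classify(0x0347)` yields `("auto-increment", 7)`: bits 7-6 are 01 and the low three bits select MB7.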

B1.5.6. Page Zero Memory Map

Here's a complete Page Zero Memory Map based on this information.

Address (low 16 bits)   Contents
0000                    First Page Zero address. Nothing special about it.
02FF                    768th word of Page Zero.
0300                    First simple index register relative to MB0
0301                    First simple index register relative to MB1
0302                    First simple index register relative to MB2
0303                    First simple index register relative to MB3
0304                    First simple index register relative to MB4
0305                    First simple index register relative to MB5
0306                    First simple index register relative to MB6
0307                    First simple index register relative to MB7
0338                    Eighth simple index register relative to MB0
0339                    Eighth simple index register relative to MB1
033A                    Eighth simple index register relative to MB2
033B                    Eighth simple index register relative to MB3
033C                    Eighth simple index register relative to MB4
033D                    Eighth simple index register relative to MB5
033E                    Eighth simple index register relative to MB6
033F                    Eighth simple index register relative to MB7
0340                    First auto-increment register relative to MB0
0341                    First auto-increment register relative to MB1
0342                    First auto-increment register relative to MB2
0343                    First auto-increment register relative to MB3
0344                    First auto-increment register relative to MB4
0345                    First auto-increment register relative to MB5
0346                    First auto-increment register relative to MB6
0347                    First auto-increment register relative to MB7
0378                    Eighth auto-increment register relative to MB0
0379                    Eighth auto-increment register relative to MB1
037A                    Eighth auto-increment register relative to MB2
037B                    Eighth auto-increment register relative to MB3
037C                    Eighth auto-increment register relative to MB4
037D                    Eighth auto-increment register relative to MB5
037E                    Eighth auto-increment register relative to MB6
037F                    Eighth auto-increment register relative to MB7
0380                    First auto-decrement register relative to MB0
0381                    First auto-decrement register relative to MB1
0382                    First auto-decrement register relative to MB2
0383                    First auto-decrement register relative to MB3
0384                    First auto-decrement register relative to MB4
0385                    First auto-decrement register relative to MB5
0386                    First auto-decrement register relative to MB6
0387                    First auto-decrement register relative to MB7
03B8                    Eighth auto-decrement register relative to MB0
03B9                    Eighth auto-decrement register relative to MB1
03BA                    Eighth auto-decrement register relative to MB2
03BB                    Eighth auto-decrement register relative to MB3
03BC                    Eighth auto-decrement register relative to MB4
03BD                    Eighth auto-decrement register relative to MB5
03BE                    Eighth auto-decrement register relative to MB6
03BF                    Eighth auto-decrement register relative to MB7
03C0                    First stack register relative to MB0
03C1                    First stack register relative to MB1
03C2                    First stack register relative to MB2
03C3                    First stack register relative to MB3
03C4                    First stack register relative to MB4
03C5                    First stack register relative to MB5
03C6                    First stack register relative to MB6
03C7                    First stack register relative to MB7
03F8                    Eighth stack register relative to MB0
03F9                    Eighth stack register relative to MB1
03FA                    Eighth stack register relative to MB2
03FB                    Eighth stack register relative to MB3
03FC                    Eighth stack register relative to MB4
03FD                    Eighth stack register relative to MB5
03FE                    Eighth stack register relative to MB6
03FF                    Eighth stack register relative to MB7

B1.5.7. I/O Space

I/O space consists of 1,024 I/O locations that can be used to communicate with peripherals.

Accessing addresses in the range 0400-FFFF is still possible using indirect addressing, but slower than direct access and fraught with perils. Don't do it. Besides, only the lower 10 bits are decoded by the hardware. It would be pointless.
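
Since only the low 10 bits are decoded, every I/O address above 03FF is just an alias of one below it. A one-line Python sketch (the function name is invented here) makes the aliasing explicit:

```python
IO_DECODED_BITS = 10  # per the text, only the low 10 bits reach the decoders

def decoded_io_address(addr: int) -> int:
    """Return the I/O location the hardware actually selects."""
    return addr & ((1 << IO_DECODED_BITS) - 1)
```

So I/O address 0400 selects the same location as 0000, and FFFF the same as 03FF.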

The upper 8 bits of the 24-bit physical address bus are undefined during I/O transactions.

B1.6. Registers

There are eighteen registers in the CFT architecture. They are split into the major registers, minor registers, memory bank registers, and flag registers. Major registers and the Instruction Register are 16 bits wide; the Address Register is 24 bits wide. Memory Bank Registers are 8 bits wide, and flag registers are one bit wide.

Register   Width   Name                        Use
AC         16      Accumulator                 General purpose register
PC         16      Program Counter             Address of next instruction to fetch
DR         16      Data Register               Helps implement Indirection
SP         16      Stack Pointer               Subroutine/Interrupt stack
IR         16      Instruction Register        Instruction being executed
AR         24      Address Register            Drives the Address Bus
CTX        8       Context Register            Selects from 256 sets of MB0–MB7. (below)
MB0        8       (also MBP) Memory Bank 0    Program Memory Bank (instruction fetches)
MB1        8       (also MBD) Memory Bank 1    Bank for Indirect addressing mode
MB2        8       (also MBS) Memory Bank 2    Bank for Hardware Stack
MB3        8       (also MBZ) Memory Bank 3    Bank for Page Zero
MB4        8       Memory Bank 4               General purpose memory bank
MB5        8       Memory Bank 5               General purpose memory bank
MB6        8       Memory Bank 6               General purpose memory bank
MB7        8       Memory Bank 7               General purpose memory bank
N          1       Negative Nancy Flag         Indicates Accumulator is negative
Z          1       Zero Flag                   Indicates Accumulator is zero
V          1       Overflow Flag               Result of last addition can't fit in 16 bits
I          1       Interrupt Flag              Interrupts are allowed
L          1       Link register               Extends the Accumulator by one bit

B1.6.1. Major Registers

These registers are used directly by the programming model. These are all 16-bit registers. The hardware provides facilities for these registers to be read from, written to, incremented, and decremented. Deep magic!

B1.6.1.1. Accumulator (AC)

This is a 16-bit register, or perhaps the 16-bit register. It is the only register directly and fully accessible via the instruction set and the one almost all instructions operate on. The hardware allows this register to be read from, written to, incremented and decremented.

B1.6.1.2. Program Counter (PC)

A 16-bit register. It contains the address in memory (relative to Memory Bank Register 0 (MBP)) of the next instruction to be executed. The hardware provides facilities to read from, write to, and increment this register. The instruction set lacks a direct means of reading from the register, although there are indirect ways of doing this. Instructions that skip, jump, call subroutines, and interrupts modify this register.

B1.6.1.3. Data Register (DR)

This is a 16-bit register used internally by the CFT to buffer addresses during indirect addressing. The hardware can read from, write to, increment and decrement this register. There are instructions to transfer the AC to the DR and vice versa, which turns it into a somewhat awkward scratch register as it's clobbered every time indirection is used.

B1.6.1.4. Stack Pointer (SP)

This is a 16-bit register used to access the hardware stack. This is the stack the system pushes return addresses to for flow control, interrupts, and the like. The Stack Pointer points to the first unused location on the stack. It's incremented after pushes and decremented before pops. There are indirect facilities to access this register programmatically. The stack may be up to 64 kW in size, and may be located in any memory bank. The stack is accessed relative to Memory Bank Register 2 (MBS, also MB2), which contributes the top eight bits of the 24-bit stack address.

Additional (slower) stacks may be implemented using the 64 Page Zero stack auto-index registers.

There are no hardware bound checks on the Stack Pointer. They wouldn't be much help anyway, as the CFT lacks exception interrupts.

B1.6.2. Minor Registers

These registers are used internally by the processor, and are built with less functionality than major registers.

B1.6.2.1. Instruction Register (IR)

The Instruction Register (IR) holds the instruction currently being executed. It is 16 bits wide and there is no way to access it programmatically or read its contents. It may only be written, and this only happens at the end of an instruction fetch. The value read decides the behaviour of the Control Unit during instruction execution.

B1.6.2.2. Address Register (AR)

The Address Register (AR) is a 24-bit register that drives the Address Bus. Like the IR, the AR can't be read from; it exists as a buffer for the address to be put on the Address Bus. It receives its value from the Address Generation Logic (AGL) in conjunction with one of the Memory Banking Registers.

B1.6.3. Memory Management Registers

B1.6.3.1. Memory Bank Registers

There are eight 8-bit bank registers that are used to extend addresses from 16 bits to 24 bits. These registers may be read from and written to.

Memory Bank Register   Also Known As   Used for
MB0                    MBP             Instruction fetching, local page data access
MB1                    MBD             Indirect data access
MB2                    MBS             Hardware stack
MB3                    MBZ             Page Zero data
MB4                    -               General Purpose data bank
MB5                    -               General Purpose data bank
MB6                    -               General Purpose data bank
MB7                    -               General Purpose data bank

MBP points to the memory bank where the program being executed resides. Like the 65C816, this can be modified using long jump instructions. Unlike the 65C816, it can also be read and written to directly. Data accessed using Page Local mode also uses this bank.

MBD is the Data Bank. All data accessed using Indirect mode come from this bank.

MBS is the Stack Bank. The hardware stack addresses memory relative to this bank.

MBZ is the Page Zero Bank. All Page Zero memory accesses reference data in this bank.

Banks 4 to 7 are general purpose bank pointers and can be used for any data access. The only way to use those programmatically is via the Page Zero auto-index register locations.

Please note that all registers can easily point to the same bank. A machine with ROM will probably have its MBP pointing to a ROM bank, but a memory-poor CFT could then colocate the MBD, MBS and MBZ on the first RAM bank.

Behaviour After Reset
After reset, the Bank Registers are masked by post-reset circuitry. All addresses generated will reference either bank 80 or 00, depending on the position of the RAM/ROM switch on the front panel. After the first write to an MBR address, memory banking is enabled, the registers are unmasked, and their values are expected to be crud. It is crucial to program the first four registers as soon as possible after reset, and before any subroutines or memory writes are attempted.

B1.6.3.2. Context Register

The Memory Banking Registers are really a 256×8 array of independent, 8-bit registers. Which set of registers is active is controlled by the value in the 8-bit CTX register.
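
One way to picture this is as a 2,048-byte SRAM addressed by the context number and the bank register number together. The Python sketch below assumes the context forms the upper address bits and the register number the lower three. That wiring is my guess for illustration, not something stated in the text.

```python
class BankRegisterFile:
    """Model of 256 contexts of eight 8-bit Memory Bank Registers."""

    def __init__(self) -> None:
        self.sram = bytearray(256 * 8)  # 2,048 x 8 bits
        self.ctx = 0                    # 8-bit Context Register

    def _index(self, n: int) -> int:
        # Assumed layout: SRAM address = context * 8 + register number.
        return ((self.ctx & 0xFF) << 3) | (n & 0b111)

    def read(self, n: int) -> int:
        """Read MBn in the currently selected context."""
        return self.sram[self._index(n)]

    def write(self, n: int, value: int) -> None:
        """Write MBn in the currently selected context."""
        self.sram[self._index(n)] = value & 0xFF
```

Writing MB0 in context 0 and then switching `ctx` to 1 leaves context 0's value untouched, which is how different parts of the system can hold different views of memory.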

Behaviour After Reset
The value of the CTX is undefined after reset, but since the MBRs are hardwired at reset, this isn't important. Setting it up to a sane value is part of a good reset sequence.

There are two special contexts:

  • Context 0 is set when an Interrupt is received, including the WAIT instruction.
  • Context 1 is set by the TRAP (OS system call, service or soft interrupt) instruction.

B1.6.4. Flag Registers

These are single-bit flags. They are used to sense or set the state of the system and form the basis of flow control.

B1.6.4.1. Link Register (L)

Used as a carry bit during arithmetic, effectively extending the AC register to 17 bits, and as the 17th bit during roll instructions. It may be used as a generic flag and may be tested, set or cleared by user programs.

It is toggled automatically whenever the AC increments above FFFF (carry out), or decrements below 0000 (borrow out). This includes increment, decrement and addition instructions.
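
Note the word "toggled": a carry flips the flag, it doesn't just set it. This Python sketch of an addition follows that description. The function name is invented, and the model covers addition only.

```python
MASK = 0xFFFF  # one 16-bit CFT word

def add_with_link(ac: int, operand: int, link: int):
    """Return (new L, new AC) after adding operand to the AC.

    L is flipped when the 16-bit sum carries out, and left
    unchanged otherwise.
    """
    total = (ac & MASK) + (operand & MASK)
    if total > MASK:       # carry out of bit 15
        link ^= 1
    return link & 1, total & MASK
```

Adding 1 to FFFF with L clear gives AC = 0000 and L = 1. Doing the same with L already set gives L = 0.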

B1.6.4.2. Negative Flag (N)

This flag always follows the value of the most significant bit of the AC, which can be used to denote negative numbers. If set, interpreting the AC as a signed number indicates a negative value. If treating the AC as an unsigned quantity, the N flag is the fastest way to test the Accumulator's highest-order bit.

N   Meaning
0   Accumulator is non-negative.
1   Accumulator is negative.

B1.6.4.3. Zero Flag (Z)

Cannot be controlled directly by the user. This flag register is set when AC is zero, i.e. all 16 bits are clear. This (along with the N Flag) is the fastest means of numerical comparison on the CFT architecture.

Z   Meaning
0   Accumulator is non-zero.
1   Accumulator is zero.

B1.6.4.4. Overflow Flag (V)

Cannot be controlled directly by the user. This flag is set when a two's complement signed addition yields a result that will not fit in 16 bits.

V   Meaning
0   Addition fit in 16 bits.
1   Addition couldn't fit in 16 bits.
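
Signed overflow happens exactly when both operands have the same sign and the result's sign differs from it. The Python sketch below uses the usual XOR formulation of that rule. It shows the condition, not how the CFT's ALU is actually wired.

```python
MASK = 0xFFFF  # one 16-bit CFT word
SIGN = 0x8000  # bit 15

def add_with_overflow(a: int, b: int):
    """Return (16-bit sum, V) for a two's complement addition."""
    a &= MASK
    b &= MASK
    result = (a + b) & MASK
    # V is set when the result's sign differs from both operands' signs.
    v = 1 if (a ^ result) & (b ^ result) & SIGN else 0
    return result, v
```

7FFF + 0001 overflows (the result, 8000, looks negative), while FFFF + 0001 does not: it's just -1 + 1 = 0, even though it produces a carry out.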

B1.6.4.5. Interrupt Flag (I)

This single-bit register controls the computer’s behaviour on detecting an interrupt request. The register may be manipulated by the user to allow or mask interrupts.

I   Meaning
0   Ignore interrupts.
1   Enable servicing of interrupts.

B1.6.5. Page Zero Registers

In addition to the processor registers, the programming model treats the memory addresses 0000-03FF as 1,024 Page Zero registers, some with special meanings as described earlier. These are often termed simply ‘registers’ in the context of CFT Assembly. This is identical to the way both the PDP-8 and 6502 treated their own equivalent pages. In practical use, a ‘register’ in this context is the equivalent of a global variable, and it is up to the operating system to decide how Page Zero is laid out and utilised.

B1.7. Instruction Formats

Before 2019, the CFT used a single instruction format, but that had started to change already. In 2019, there are multiple instruction formats, which allow considerably more than 16 instructions, especially when some instructions don't need to access memory. Memory-accessing instructions share one general instruction format:

Bit:     15 14 13 12 | 11 | 10 | 9 8 7 6 5 4 3 2 1 0
Field:   Opcode      | I  | R  | Operand

Other instructions use this general format, with some instructions having more detailed custom formats for the operand.

Bit:     15 14 13 12 | 11 10 9 8 7 6 5 4 3 2 1 0
Field:   0  0  0  0  | Opcode, then Operand

From most to least significant, they are:

Instruction opcode
(most significant 4 bits). This field identifies the instruction to be performed.
Indirection Mode
(1 bit). Depending on the instruction, this bit selects between the literal and direct, or the direct and indirect addressing mode. Note that in some cases, unusual addressing modes may be selected when this bit is set. Jumps do this, for instance.
Register Mode
(1 bit). This bit controls whether addresses and literals are relative to the current page, or relative to Page Zero (the register page).
Operand or address offset
(least significant 10 bits). This allows ten bits of literals or addresses to be specified in an instruction. The most significant six bits are filled in from one of two sources as follows:
  • If the Register Mode bit is clear (0), the six most significant bits of the Program Counter (PC) are used. This is page-relative addressing.
  • If the Register Mode bit is set (1), the six most significant bits of the address are zero. This is Register addressing, also known as Page Zero addressing, since Page Zero is used for system registers.
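
Here is a Python sketch of the field layout and the address formation just described. It assumes the polarity stated in B1.5.4, namely that an instruction with R set accesses Page Zero. The class and function names are invented for the example, and indirection (the I bit) is decoded but not acted upon.

```python
from typing import NamedTuple

class MemInstruction(NamedTuple):
    opcode: int    # bits 15-12
    i: int         # bit 11, Indirection Mode
    r: int         # bit 10, Register Mode
    operand: int   # bits 9-0

def decode(word: int) -> MemInstruction:
    """Split a 16-bit memory-accessing instruction into its fields."""
    word &= 0xFFFF
    return MemInstruction(
        opcode=(word >> 12) & 0xF,
        i=(word >> 11) & 1,
        r=(word >> 10) & 1,
        operand=word & 0x3FF,
    )

def direct_address(word: int, pc: int) -> int:
    """Form the 16-bit address before any indirection is applied."""
    f = decode(word)
    if f.r:                            # assumed: R set means Page Zero
        return f.operand               # top six bits are zero
    return (pc & 0xFC00) | f.operand   # top six bits come from the PC
```

With the PC at 1234, an operand of 123 and R clear, the address is 1123 (same page as the PC). With R set, it is 0123 on Page Zero.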

The instruction format imposes some limitations, but limitations are fun, aren't they?

An instruction with a 10-bit operand can only access 1,024 locations of memory. If it sets the Register Mode (R) bit, only the 1,024 words in Page Zero may be accessed. If the bit is clear, only the 1,024 word page the instruction is executing in may be accessed. The PDP-8 was plagued by the same issue, but then again so was every single RISC microprocessor out there. The RISC solutions were considerably less masochistic than the PDP-8 and CFT ones. Here's what's done on the CFT to mitigate the problem somewhat.

RAM-based subroutines store temporary data on their own page, or use special ‘scratch’ registers in Page Zero. Both variants take the same time to execute.

In the case of literals, we either limit ourselves to constants in the range 000-3FF, store commonly-used large constants (such as -1, FFFF and -2, FFFE) in a Page Zero constant table, or use a combination of these techniques.

We must take special care when subprograms cross a page boundary. When that happens, code referencing any local data will instead refer to the same offset within the new page, and will be invalid. The CFT Assembler issues a cross-page warning when using symbolic names (labels) for such local data.

Jumps and subroutine calls suffer from the same issue. Indirect Register Addressing is commonly used as a solution to this problem: the address (vector) of the subprogram in question is stored in a Page Zero location, and the jump is made using indirection. This has the added benefit of allowing the vectors to be changed so that system services may be overridden, but costs an extra memory access cycle.

B1.8. A Note About Semantic Notation

Semantic notation shows you what an instruction does at a high level.

If you're reading this, you've probably read microprocessor handbooks and can probably grok semantic notation osmotically. If not, though, here are a few tips. Starting soon, I'll be talking about processor semantics, which is a way to say what the processor does at a high level while appearing very scientific. (win-win)

Semantics are shown like this: PC ← mem[a]. This means that memory is read at address a, and the result is stored in the PC. The left hand side can be:

  • A register of some sort: PC ← mem[a].
  • The <L,AC> vector when both L and AC are modified at once as a 17-bit value: <L,AC> ← AC + mem[a].
  • A memory write cycle: mem[a] ← PC. (write the PC to address a)
  • An I/O space write cycle: io[a] ← AC. (write the AC to I/O address a)

The right hand side can be:

  • Another register: PC ← DR.
  • A memory or I/O read. These can be nested if indirection is used. mem[a] is the value at address a. mem[mem[a]] is the value in memory at the address pointed to by a.
  • A constant: PC ← 0. The CFT has a tiny store of constants which it can write to things.
  • A simple arithmetic expression: PC ← PC + 1.
  • A conditional involving a flag: L ⇒ PC ← PC + 1. ‘If L is set, increment the PC.’
  • A conditional involving a negated flag: ¬L ⇒ PC ← PC + 1. ‘If L is clear, increment the PC.’

Lower-case variables always denote the instruction operand. The name differs to hint at the addressing mode, but this isn't consistent.

Many instructions have complex semantics. Those are shown as multiple statements like this, comma-separated. The comma indicates execution of those semantics in the order shown. We don't do parallelism. The semantics of POP are mem[a] ← mem[a] - 1, AC ← mem[mem[a]].
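
Read as code, each comma-separated statement becomes one step executed in sequence. The Python sketch below transcribes the POP semantics quoted above. Memory is modelled as a plain list and banks are ignored, both simplifications made for this example.

```python
MASK = 0xFFFF  # one 16-bit CFT word

def pop(mem: list, a: int) -> int:
    """Execute the two POP steps in order and return the new AC."""
    mem[a] = (mem[a] - 1) & MASK   # mem[a] <- mem[a] - 1
    ac = mem[mem[a]]               # AC <- mem[mem[a]]
    return ac
```

The order matters: the pointer at address a is decremented first, and the already-decremented value is what the nested mem[mem[a]] read uses.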

Note: the retro reference card uses a much denser way to show the same semantics.

B1.9. Addressing Modes

I've had a lot of trouble deciding what constitutes an Addressing Mode on the CFT architecture, and how to describe them. CPU manufacturers took pride in the number of them and often went out of their way to describe things as addressing modes that aren't addressing modes.

In some cases, like the 6502's Zero Page mode, this made sense. Instructions accessing the 6502's Page Zero only needed two bytes rather than the three needed by instructions accessing 16-bit locations. The CFT has a single instruction width though. Formally, only two bits decide the addressing mode.

However, microprograms are free to interpret those two bits in various ways, leading to many addressing modes. Add auto-index locations to that and the whole thing gets even more complicated. If I were to get rigorous about this (and I've tried), we'd end up with addressing modes like ‘Double Register Indirect Auto-Decrement Address’, which is actually used by JSR. This is scary stuff.

Before Microcode Version 7, most instructions were very regular in how they used their addressing modes. Most of them. We had the sui generis JMPII instruction that performed two indirection steps, for instance.

With Microcode Version 7, it's all up in the air. To wit, here's the surprisingly long and complex list of 14 addressing modes:

Implied
There's no operand, or no addressing.
Literal
The value in the operand is taken literally, as a binary value (of various widths from 3 to 10 bits, depending on the instruction).
Accumulator
The value of the AC is used to determine the address directly, relative to MBP. (This is currently used in flow control.)
Accumulator Indirect
The value of the AC is the location of the value in memory, relative to MBP. (This is used in flow control.)
Page-Local
The address is formed directly using the 10-bit operand, extended using the 6 most significant bits of the PC (the current page), and relative to MBP.
Register
The address is formed directly using the 10-bit operand. The 6 most significant bits are zero and the address is relative to MBZ.
Indirect
An address is formed using Page-Local addressing. The value at this address is loaded from a memory location relative to MBD.
Register Indirect
An address is formed using Register addressing. The value at this address is loaded from a memory location relative to MBZ.
Memory Bank-Relative Indirect
An address is formed using Register addressing. The value at this address is loaded from a memory location relative to an arbitrary Memory Bank Register (MBR).
Auto-Increment
Same as Memory Bank-Relative Indirect, but the value at the MBZ location is incremented after use.
Auto-Decrement
Same as Memory Bank-Relative Indirect, but the value at the MBZ location is decremented after use.
Stack
Same as Memory Bank-Relative Indirect, but the value at the MBZ location is decremented **before reads**, and incremented **after writes** to implement hardware assisted stacks separately from the main hardware stack. Note that these stacks can be relative to any MBR, not just MBS.
Auto-Increment Double Indirect
Same as Auto-Increment, but the address to be incremented is used as *another* address. This is used to implement jump tables and call lists and is basically how the Forth Interpreter works.
Auto-Decrement Double Indirect
Same as Auto-Increment Double Indirect, but the jump table pointer is decremented.
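To make the first few modes a little more concrete, here is a minimal Python sketch of how the 16-bit address could be formed for Page-Local, Register and Auto-Increment addressing. The function names are invented for the example, memory is a plain dictionary, and the Memory Bank Registers (MBP, MBZ, MBD and friends) are deliberately left out, so this only shows the part of the address that comes from the PC and the operand.

```python
# Hypothetical sketch; memory banks are deliberately ignored.
PAGE_MASK = 0xFC00     # the 6 most significant bits of a 16-bit address
OPERAND_MASK = 0x03FF  # the 10-bit instruction operand


def page_local(pc, operand):
    """Page-Local: page number from the PC, offset from the operand."""
    return (pc & PAGE_MASK) | (operand & OPERAND_MASK)


def register(operand):
    """Register: the 6 most significant bits are zero."""
    return operand & OPERAND_MASK


def auto_increment(mem, operand):
    """Auto-Increment: follow the pointer at a Register location, then bump it."""
    ptr_addr = register(operand)
    target = mem[ptr_addr]
    mem[ptr_addr] = (target + 1) & 0xFFFF  # incremented after use
    return target


mem = {0x0080: 0x2000}
print(hex(page_local(0x1234, 0x021)))   # same page as the PC
print(hex(auto_increment(mem, 0x080)))  # pointer value before the bump
print(hex(mem[0x0080]))                 # pointer value after the bump
```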

B1.10. Instruction Set Reference

B1.10.1. Some Brief Notes on Assembly Notation

To simplify understanding of the instruction set, opcode mnemonics are used rather than actual machine code. In many cases, the notation used is that of the Standard CFT Assembly language which merits some description. This is not a full definition of CFT Assembly language, merely enough of it to facilitate discussing the instruction set in a more human-readable form.

CFT Assemblers parse space-separated lexical tokens denoting either symbolic names or numeric literals. Literals can be decimal (e.g. 15), hexadecimal (e.g. &e) or binary (e.g. #1110). For ease, and since the CFT uses a lot of bitfields, binary notation treats - as 0, and ' is ignored: so #--------'-11001-- is the same as #0000000001100100. Symbols are converted to numbers using symbol tables. The Assembler defines some, the user can define others. The resultant numbers, which must be 16 bits in width, are ORred together to form an instruction.
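The literal rules above are simple enough to sketch in a few lines of Python. This is only an illustration of the three notations as described here, not the Standard Assembler's parser; parse_literal is a made-up name, and error handling is left to Python's int().

```python
# Hypothetical sketch of the literal syntax described above.
def parse_literal(token):
    """Parse a decimal, &hexadecimal or #binary literal into an integer."""
    if token.startswith("&"):
        return int(token[1:], 16)
    if token.startswith("#"):
        # In binary notation, ' is ignored and - counts as 0.
        bits = token[1:].replace("'", "").replace("-", "0")
        return int(bits, 2)
    return int(token, 10)


print(parse_literal("15"))
print(parse_literal("&e"))
print(parse_literal("#--------'-11001--") == parse_literal("#0000000001100100"))
```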

This provides great simplicity and generality at the expense of making the code slightly less readable from a modern Assembly perspective (although PDP-8 Assembly programmers will feel right at home). You can get instructions that look like SKP SNA SSL—the gut reaction of a modern Assembly programmer would be to assume someone deleted two new-lines, but this just combines to an instruction that skips the next instruction if the N or L flags are set.

Anything after the first semicolon (;) or slash (/) on a line is considered a comment and ignored.

Like most modern Assemblers (and against PDP-8 assembly conventions), labels are denoted by a colon (:) suffix. Literals may be used as labels: they change the address of the next word to be assembled.

The Standard Assembler has a number of additional features, but we won't need those to discuss the instruction set.

Here's a brief example of CFT Assembly showing some basic syntax, and also how space-separated fields are ORred together to build 16-bit instructions. For example, the I mnemonic is simply defined as 0000100000000000, which sets the I bit in an instruction. Combined with, e.g., the LOAD mnemonic we can form LOAD I &342 or, if we're very perverse (we are), we could even write &342 I LOAD. Both would assemble to the same single instruction, but the first form is a lot more conventional.

&fff0:                 ; Set the assembly address
        JMP I 1        ; Boot code: cross-page (long) jump
        .word start    ; The address of the 'start' label below

&1000:                 ; The rest of the program starts at address &1000
start:                 ; A label
        LOAD 0         ; Load direct (decimal operand)
        LOAD I 834     ; Load indirect (decimal operand)
        &342 I LOAD    ; Perversion! Valid, but against conventions.
        LOAD I R &007F ; Load Page Zero and indirect
        LOAD I R &0080 ; Load, autoindexing
        IN R PANEL 0   ; Read panel switches
        CLL RBL        ; Shift left one bit
        HALT           ; Halt the system (a macro)
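The ORing of fields can be mimicked with a toy assembler in Python. This is a sketch under assumptions: the SYMBOLS values are read off the bitfield diagrams in this chapter (LOAD is opcode 1000, I is bit 11, R is bit 10) rather than taken from the real Assembler's symbol table, and only decimal and &hex literals are handled.

```python
# Hypothetical toy assembler; symbol values are assumed from the bitfield diagrams.
SYMBOLS = {
    "LOAD": 0x8000,  # opcode 1000 in bits 15-12
    "I":    0x0800,  # bit 11
    "R":    0x0400,  # bit 10
}


def assemble(line):
    """OR together every token on the line, ignoring any ; comment."""
    word = 0
    for tok in line.split(";")[0].split():
        if tok in SYMBOLS:
            word |= SYMBOLS[tok]
        elif tok.startswith("&"):
            word |= int(tok[1:], 16)
        else:
            word |= int(tok, 10)
    return word & 0xFFFF


print(hex(assemble("LOAD I &342")))
print(assemble("LOAD I &342") == assemble("&342 I LOAD"))
```

Because OR is commutative, token order cannot matter, which is why the perverse form assembles to the same word as the conventional one.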

B1.10.2. Conditional Skips: The SKP Bitmap Super-Instruction

Bits   15-5         4  3  2  1  0
Field  00001000100  G  N  Z  L  V

If a bit is set, the corresponding flag is tested and the skip taken if it's set. With G clear, the results of each of the four tests are ORred together and the next instruction is skipped if any are set. The table below lists those combinations for which mnemonics exist. User-defined combinations may be used in CFT Assembly by writing something like SNA SZA on a single line. SKP does not need to be specified. This is the way the PDP-8 assembles similar instructions, too.

Obviously NOP makes no sense ORred with other instructions. The dashes (-) indicate don't-care values. They are defined as zeroes in the instruction table.

All variants of this instruction take 3 processor cycles to execute, whether the skip is taken or not.

G N Z L V  Mnemonic  Instruction
0 0 0 0 0  NOP       No operation (never skip)
0 1 - - -  SNA       Skip if Accumulator Negative
0 1 1 - -  SNP       Skip if Accumulator Non-Positive
0 - 1 - -  SZA       Skip if Accumulator Zero
0 - - 1 -  SSL       Skip if Link is Set
0 - - - 1  SSV       Skip if Overflow is Set

With G set, the following Group 1 skips are available. If a bit is set, the corresponding flag is tested and the skip taken if it's clear. Bits may be combined and the skip will be taken if all of the tested flags are clear.

G N Z L V  Mnemonic  Instruction
1 0 0 0 0  SKIP      Unconditional skip (always skip)
1 1 - - -  SNN       Skip if Accumulator Non-Negative
1 1 1 - -  SPA       Skip if Accumulator Positive
1 - 1 - -  SNZ       Skip if Accumulator Non-Zero
1 - - 1 -  SCL       Skip if Link is Clear
1 - - - 1  SCV       Skip if Overflow is Clear

Why the difference between Group 0 and Group 1?
G simply inverts the final result of the flag test. Because the tests are ORred together, this applies De Morgan's Law: !(a || b) == !a && !b. This is actually very useful, and allows a full range of comparisons of the Accumulator. And guess what: the PDP-8 did it this way too.

Misnomer Alert
The SKP instruction without any flag bits set is a misnomer! It performs no check and never skips, which is why that combination is called NOP. Also, it's too close to the SKIP instruction, which always skips. I'm on the fence as to whether the super-instruction mnemonic needs to be renamed to something like _SKP or removed altogether from CFT Assembly to avoid confusion.

How many instructions?
When people ask me about the size of the instruction set (that's a lie: no-one's ever asked this—or anything else for that matter), bitmapped instructions are a source of ambiguity. Does SKP count as one instruction? Does it count as 16? Does it count as the 12 defined above? These days, I just count the defined mnemonics, even though they can be combined to form other, more exotic instructions that lack mnemonics of their own.
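The way G turns "any flag set" into "all flags clear" is perhaps easiest to see in code. The following Python sketch models only the skip decision as described in this section; skip_taken and the flag parameter names are invented for the example, and the bit positions are the ones from the SKP bitfield diagram.

```python
# Hypothetical model of the SKP decision; not the actual microcode.
SKP_BASE = 0b00001000100 << 5   # fixed bits 15-5 of the SKP instruction
G, N, Z, L, V = 0x10, 0x08, 0x04, 0x02, 0x01


def skip_taken(word, neg, zero, link, ovf):
    """Return True if this SKP word skips, given the current flag values."""
    any_set = bool(
        (word & N and neg) or (word & Z and zero) or
        (word & L and link) or (word & V and ovf)
    )
    # G inverts the ORred result: "any set" becomes "all clear".
    return any_set != bool(word & G)


NOP = SKP_BASE
SKIP = SKP_BASE | G
SNA = SKP_BASE | N            # skip if negative
SPA = SKP_BASE | G | N | Z    # skip if neither negative nor zero

print(skip_taken(SNA, neg=True, zero=False, link=False, ovf=False))
print(skip_taken(SPA, neg=False, zero=True, link=False, ovf=False))
```

With no test bits set, any_set is always false, so the result is simply the value of G: never skip for NOP, always skip for SKIP.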

B1.10.3. Memory Management

B1.10.3.1. LCT — Read Context Register

Bits   15-7       6-0
Field  000010010  XXXXXXX

B1.10.3.2. SCT — Set Context Register

Bits   15-7       6-0
Field  000010011  XXXXXXX

B1.10.3.3. LMB — Read Memory Bank Register

Bits   15-7       6-3   2-0
Field  000010100  XXXX  N

B1.10.3.4. SMB — Set Memory Bank Register

Bits   15-7       6-3   2-0
Field  000010101  XXXX  N

B1.10.3.5. NMB — Initialise MBR

Bits   15-7       6-3   2-0
Field  000010110  XXXX  N

B1.10.3.6. ECT — Enter Context

Bits   15-7       6-0
Field  000010111  value

B1.10.4. Unary Operations: the UOP Bitmap Instruction

Bits   15-6        5    4    3    2    1    0
Field  0000111000  CLA  CLL  NOT  INC  DEC  CPL

Each bit set in this bitfield will result in a particular instruction being executed. This is always done in the order the bits are arranged, left to right. Defined mnemonics for this instruction are as follows:

CLA CLL NOT INC DEC CPL  Mnemonic  Instruction
 0   0   0   0   0   0   NOP8      No operation for 8 cycles
 1   -   -   -   -   -   CLA       Clear Accumulator
 -   1   -   -   -   -   CLL       Clear Link
 -   1   -   -   -   1   SEL       Set Link
 -   -   1   -   -   -   NOT       Invert Accumulator bits
 -   -   1   1   -   -   NEG       Two's complement negation of Accumulator
 -   -   -   1   -   -   INC       Increment Accumulator
 -   -   -   -   1   -   DEC       Decrement Accumulator
 -   -   -   -   -   1   CPL       Toggle Link

All of these mnemonics include the UOP instruction in their definitions, so you don't need to say UOP NEG. Just NEG will work fine.

Deceptively ‘minor’
Some of these instructions are deceptive. You might think CLA is a great way to clear the Accumulator, but the UOP instruction always completes in 8 cycles. In comparison, LI 0 does the same in 3. If you have a constant like .word MINUS1 &FFFF in Page Zero, XOR MINUS1 is also faster than NOT. The value of these minor instructions comes when they are combined with others, not on their own.
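The left-to-right ordering is what makes combinations like NEG and SEL work, and a small Python model shows why. This is a sketch under assumptions: uop is an invented name, the bit values follow the UOP bitfield diagram, and INC and DEC are assumed to wrap at 16 bits without touching L, which this section does not specify.

```python
# Hypothetical model of the UOP minor operations; L handling on INC/DEC is assumed.
CLA, CLL, NOT, INC, DEC, CPL = 0x20, 0x10, 0x08, 0x04, 0x02, 0x01


def uop(bits, ac, link):
    """Apply the selected minor operations in bitfield order, left to right."""
    if bits & CLA:
        ac = 0
    if bits & CLL:
        link = 0
    if bits & NOT:
        ac ^= 0xFFFF
    if bits & INC:
        ac = (ac + 1) & 0xFFFF
    if bits & DEC:
        ac = (ac - 1) & 0xFFFF
    if bits & CPL:
        link ^= 1
    return ac, link


NEG = NOT | INC   # invert, then add one: two's complement negation
SEL = CLL | CPL   # clear Link, then toggle it: Link always ends up set

print(uop(NEG, 5, 0))
print(uop(SEL, 0x1234, 0))
```

If the operations ran in any other order, NOT followed by INC would not negate and CLL followed by CPL would not reliably set the Link.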

B1.10.5. IFL — Unary Operations if Link Set

Bits   15-6        5    4    3    2    1    0
Field  0000111010  CLA  CLL  NOT  INC  DEC  CPL

This instruction is identical to UOP, but executes the minor instructions only if L is set. If this is the case, the instruction takes 9 processor cycles to run. If L is clear, it only needs 3.

Unlike UOP, IFL must be specified here! The minor instructions all include UOP as part of their definitions, but they need to be modified to use IFL. The instruction set has been defined such that even though CLA is really UOP CLA, IFL UOP CLA will still execute IFL. (IFL has one extra bit set)

Don't use NOP8
NOP8 is a misnomer here. If L is clear, it executes in 3 cycles. We already have a 3-cycle NOP. It's part of the SKP bitmap instruction and it's called NOP. If L is set, NOP8 executes in 9 cycles.

B1.10.6. IFV — Unary Operations if Overflow Set

Bits   15-6        5    4    3    2    1    0
Field  0000111100  CLA  CLL  NOT  INC  DEC  CPL

This instruction is identical to IFL, but executes the minor instructions only if the Overflow flag (V) is set. If this is the case, the instruction takes 9 processor cycles to run. If V is clear, it only needs 3.

Unlike UOP, IFV must be specified here! The minor instructions all include UOP as part of their definitions, but they need to be modified to use IFV. The instruction set has been defined such that even though CLA is really UOP CLA, IFV UOP CLA will still execute IFV. (IFV has one extra bit set)
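The remark about the extra bit can be checked with a little arithmetic. In this Python sketch the constants are read off the three bitfield diagrams (UOP, IFL and IFV); the helper names major and cycles are invented, and the cycle counts are simply the ones quoted in the text.

```python
# Hypothetical check of the UOP/IFL/IFV encodings; values come from the diagrams.
UOP = 0b0000111000 << 6
IFL = 0b0000111010 << 6
IFV = 0b0000111100 << 6
CLA = UOP | 0x20  # CLA already includes UOP in its definition


def major(word):
    """Return the fixed bits 15-6 that select UOP, IFL or IFV."""
    return word >> 6


def cycles(flag_is_set):
    """IFL/IFV timing as quoted in the text: 9 cycles if taken, 3 if not."""
    return 9 if flag_is_set else 3


print(major(IFL | CLA) == major(IFL))  # ORing in UOP CLA leaves IFL intact
print(major(IFV | CLA) == major(IFV))  # and the same holds for IFV
```

Every bit set in the UOP pattern is also set in the IFL and IFV patterns, so ORing a UOP-based mnemonic into either one cannot change which instruction is selected.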

Don't use NOP8
NOP8 is a misnomer here. If V is clear, it executes in 3 cycles. We already have a 3-cycle NOP. It's part of the SKP bitmap instruction and it's called NOP. If V is set, NOP8 executes in 9 cycles.

B1.10.7. Flow Control

B1.10.7.1. LJSR — Long Jump to Subroutine

Bits   15-11  10  9-0
Field  00011  R   Vector Address

B1.10.7.2. LRET — Return from Long Subroutine Jump

Bits   15-7       6-0
Field  000000001  XXXXXXX

B1.10.7.3. LJMP — Long Jump

Bits   15-11  10  9-0
Field  00101  R   Vector Address

B1.10.7.4. JSR — Jump to Subroutine

Bits   15-12  11  10  9-0
Field  0011   I   R   Address

Pushes the value of the PC onto the Hardware Stack (incrementing Stack Pointer (SP)), then jumps to the address specified in the operand. This instruction has non-standard addressing modes!

Page Local Mode. The operand is a local page offset. The PC jumps to that location.

Indirect Mode. The operand is a local page offset. The address of the subroutine is loaded from that offset and the PC jumps to it.

Page Zero Mode. The operand is a Page Zero offset. The PC jumps to that location.

B1.10.7.5. RET — Return from Subroutine

Bits   15-7       6-0
Field  000000010  XXXXXXX

B1.10.7.6. JMP — Jump

Bits   15-12  11  10  9-0
Field  0100   I   R   Address

B1.10.7.7. DSZ — Decrement and Skip if Zero

Bits   15-12  11  10  9-0
Field  1010   I   R   Address

(except if an auto-increment location is used, then it's ISZ!)

B1.10.7.8. JPA — Jump to Address in Accumulator

Bits   15-7       6-0
Field  000011000  XXXXXXX

B1.10.7.9. JSA — Jump to Subroutine Address in Accumulator

Bits   15-7       6-0
Field  000011001  XXXXXXX

B1.10.8. Device I/O

B1.10.8.1. IN — Input from I/O Space

Bits   15-12  11  10  9-0
Field  0101   I   R   I/O Address

B1.10.8.2. OUT — Output to I/O Space

Bits   15-12  11  10  9-0
Field  0110   I   R   I/O Address

B1.10.8.3. IOT — I/O Space Transaction

Bits   15-12  11  10  9-0
Field  0111   I   R   I/O Address

B1.10.9. Memory Access

B1.10.9.1. LOAD — Load from Memory

Bits   15-12  11  10  9-0
Field  1000   I   R   Address

B1.10.9.2. STORE — Store to Memory

Bits   15-12  11  10  9-0
Field  1001   I   R   Address

B1.10.9.3. IND — Load Address in Accumulator

Bits   15-7       6-0
Field  000011111  XXXXXXX

B1.10.10. Stack

B1.10.10.1. PHA — Push Accumulator onto Stack

Bits   15-7       6-0
Field  000001001  XXXXXXX

B1.10.10.2. PPA — Pop Accumulator from Stack

Bits   15-7       6-0
Field  000001010  XXXXXXX

B1.10.10.3. PHF — Push Flags onto Stack

Bits   15-7       6-0
Field  000001011  XXXXXXX

B1.10.10.4. PPF — Pop Flags from Stack

Bits   15-7       6-0
Field  000001100  XXXXXXX

B1.10.10.5. PEEK — Peek at top of hardware stack

Bits   15-7       6-0
Field  000011010  XXXXXXX

B1.10.11. Shifting and Rolling

This is accomplished using a 1-bit serial barrel shifter. It takes a while to work, but it's faster and simpler than the pre-2019 version.

Historical Note
The pre-2019 CFT included the very basics of shifting and rolling: bit and nybble shifts and rolls. There was no arbitrary shifting or sign extension. Ironically, with 1 and 4 bit distances, this was half of a barrel shifter architecture, implemented in software. This was stored in the ALU ROMs though, and 2 and 8 bit operations wouldn't have fit.

B1.10.11.1. SHL — Bitwise Shift Left

Bits   15-4          3-0
Field  000010000000  Distance

B1.10.11.2. SHR — Bitwise Shift Right

Bits   15-4          3-0
Field  000010000001  Distance

B1.10.11.3. ASR — Arithmetic Shift Right

Bits   15-4          3-0
Field  000010000011  Distance

B1.10.11.4. ROL — Rotate Left

Bits   15-4          3-0
Field  000010000100  Distance

B1.10.11.5. ROR — Rotate Right

Bits   15-4          3-0
Field  000010000101  Distance

B1.10.12. Arithmetic and Logic

B1.10.12.1. ADD — Add Memory to Accumulator

Bits   15-12  11  10  9-0
Field  1100   I   R   Address

B1.10.12.2. AND — Bitwise And Memory with Accumulator

Bits   15-12  11  10  9-0
Field  1101   I   R   Address

B1.10.12.3. OR — Bitwise Or Memory with Accumulator

Bits   15-12  11  10  9-0
Field  1110   I   R   Address

B1.10.12.4. XOR — Exclusive Or Memory with Accumulator

Bits   15-12  11  10  9-0
Field  1111   I   R   Address

B1.10.12.5. SWAB — Swap high and low bytes of Accumulator

Bits   15-7       6-0
Field  000000111  XXXXXXX

B1.10.13. Register Transfers

B1.10.13.1. TAS — Transfer Accumulator to Stack Pointer

Bits   15-7       6-0
Field  000000011  XXXXXXX

B1.10.13.2. TSA — Transfer Stack Pointer to Accumulator

Bits   15-7       6-0
Field  000000100  XXXXXXX

B1.10.13.3. TAD — Transfer Accumulator to Data Register

Bits   15-7       6-0
Field  000000101  XXXXXXX

B1.10.13.4. TDA — Transfer Data Register to Accumulator

Bits   15-7       6-0
Field  000000110  XXXXXXX

B1.10.14. Interrupts and System Calls

B1.10.14.1. TRAP — Software Interrupt

Bits   15-7       6-0
Field  000001000  value

B1.10.14.2. IRET — Return from Interrupt Service Routine

Bits   15-7       6-0
Field  000000000  XXXXXXX

B1.10.14.3. STI — Set I Flag, Allow Interrupts

Bits   15-7       6-0
Field  000001101  XXXXXXX

B1.10.14.4. CLI — Clear I Flag, Block Interrupts

Bits   15-7       6-0
Field  000001110  XXXXXXX

B1.10.14.5. WAIT — Wait for Interrupt

Bits   15-7       6-0
Field  000001111  XXXXXXX

B1.10.15. Miscellaneous

B1.10.15.1. LIA — Set Accumulator to Address

Loads the AC with the literal value specified in the instruction. The Page Local form of this instruction is used to load the AC with an address in the current page. The Page Zero form loads an address in Page Zero. Please note that these addresses are 16 bits wide. What Memory Bank they will be relative to depends on how the Accumulator will then be used.

Bits   15-11  10  9-0
Field  00010  R   Address

Some examples:

&1000:
        LIA &21         ; Set Page-Relative Address (sets AC to &1021)
        LIA R &21       ; Set Register Address (sets AC to &21)
        LI &21          ; Same as above (note that R is implied in LI)
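The three examples can be traced with a short Python sketch. The encodings are taken from the LIA and LI bitfield diagrams; execute_lia is an invented name, and the page is assumed to come from the 6 most significant bits of the PC as described under Page-Local addressing.

```python
# Hypothetical model of LIA and LI; encodings are read off the bitfield diagrams.
LIA = 0b00010 << 11   # fixed bits 15-11
R = 1 << 10           # the R bit
LI = LIA | R          # LI is LIA with R already set


def execute_lia(word, pc):
    """Return the value that ends up in the AC."""
    operand = word & 0x03FF
    if word & R:
        return operand                  # Register form: top 6 bits are zero
    return (pc & 0xFC00) | operand      # Page-Local form: page taken from PC


print(hex(execute_lia(LIA | 0x21, 0x1000)))      # LIA &21
print(hex(execute_lia(LIA | R | 0x21, 0x1001)))  # LIA R &21
print(hex(execute_lia(LI | 0x21, 0x1002)))       # LI &21
```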

B1.10.15.2. LI — Set Accumulator to Literal

This is really an alias of LIA R above. The difference is only in the semantics: LI loads a 10-bit literal value into the Accumulator. It just so happens that a Page Zero address does the exact same thing.

Bits   15-10   9-0
Field  000100  Literal

B1.10.15.3. HCF — Halt and Catch Fire

Bits   15-7       6-0
Field  000011011  XXXXXXX