Asynchronous SPI device stack for Sming. Initially written for ESP8266 but supports other architectures.

Problem statement

The ESP8266 has limited I/O and the most useful way to expand it is using serial devices. There are several types available:

I2C
Can be fast-ish but more limited than SPI. However, slave devices are generally cheaper, more widely available and simpler to interface as they only require 2 pins. The ESP8266 does not have hardware support, so a bit-banging solution is required, which is inefficient.

I2S
Designed for streaming audio devices but has other uses. See Esp8266 Drivers.

RS232
Tied in with RS485, Modbus, DMX512, etc. Well-supported with asynchronous driver.

SPI
Speed is generally limited by slave device capability. Can be multiplexed (‘overlapped’) onto SPI0 (flash memory) pins. The two SPI modules are identical, but the additional pins for quad/dual modes are only brought out for SPI0. In addition, three user chip selects are available in this mode; there is only one in normal mode, although chip select can also be handled using a regular GPIO.

The purpose of this driver is to provide a consistent interface which makes better use of the hardware capabilities.

  • Classes may inherit from HSPI::Device to provide support for specific devices such as Flash memory, SPI RAM, LCD controllers, etc.

  • Device objects are attached to the stack via specified PinSet (overlapped or normal) and chip select.

  • Multiple concurrent devices are supported, limited only by available chip selects and physical constraints such as wire length and bus speed.

  • 2 and 4-bit modes are supported via overlap pins.

  • A HSPI::Request object supports transfers of up to 64K. The controller splits these into smaller transactions as required.

  • Asynchronous execution so application does not block during SPI transfer. Application may provide a per-request callback to be notified when request has completed.

  • Blocking requests are also supported.

  • Support for moving data between Sming Streams and SPI memory devices using the HSPI::StreamAdapter.

SPI system expansion

A primary use-case for this driver is to provide additional resources for the ESP8266 using SPI devices such as:

  • MCP23S17 SPI bus expander. Operates at 10MHz (regular SPI) and provides 16 additional I/O with interrupt capability.

  • High-speed shift registers. These can be wired directly to SPI buses to expand GPIO capability.

  • Epson S1D13781 display controller. See TFT_S1D13781. Evaluation boards are inexpensive and are a useful way to evaluate display modules with TFT interfaces. The Newhaven NHD-5.0-800480TF-ATXL#-CTP was used during development of this driver.

  • Bridgetek FT813 EVE TFT display controller. This supports dual/quad modes and clocks up to 30MHz.

  • NRF24L01 RF transceiver. Rated bus speed is 8MHz but it seems to work fine at 40MHz.

  • Serial memory devices. The library contains a driver for the IS62/65WVS2568GALL fast serial RAM, which clocks up to 45MHz and supports SDI/SQI modes.

Software operation


We have:

  • Controller: SPI hardware

  • Device: Slave device on the SPI bus

  • Request: Details of a transaction for a specific device

The Controller maintains an active list of requests (as a linked-list), but does not own the request objects. These will be allocated by the application, configured then submitted to the Controller for execution.

The Device specifies how a slave is connected to the bus, and that may change dynamically. For example, at reset an SPI RAM may be in SPI mode but can be switched into SDI/SQI modes. This is device-specific so would be implemented by subclassing HSPI::Device.


Each HSPI::Request is split into transactions. A transaction has up to five phases: command - address - MOSI - dummy - MISO. All phases are optional. The dummy bits are typically used in read modes and specified by the device datasheet. No data is transferred during this phase.

The ESP8266 hardware FIFO is used for MOSI/MISO phases and is limited to 64 bytes, so larger transfers must be broken into chunks. The driver handles this automatically.
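The splitting described above can be sketched as follows. This is a minimal illustration only: `splitRequest` and `fifoSize` are hypothetical names and not part of the driver, and the real controller may also apply per-device alignment when choosing chunk sizes.

```cpp
#include <cstddef>
#include <vector>

// Size of the ESP8266 SPI hardware FIFO in bytes, as stated in the text.
constexpr size_t fifoSize = 64;

// Hypothetical helper (not part of the HSPI API): return the data length of
// each hardware transaction needed to carry a request of the given size.
std::vector<size_t> splitRequest(size_t requestLength, size_t maxChunk = fifoSize)
{
    std::vector<size_t> chunks;
    if(maxChunk == 0) {
        return chunks; // Nothing sensible can be done with a zero-sized FIFO
    }
    while(requestLength > 0) {
        size_t n = (requestLength < maxChunk) ? requestLength : maxChunk;
        chunks.push_back(n);
        requestLength -= n;
    }
    return chunks;
}
```

With these assumptions, a 150-byte request becomes three transactions of 64, 64 and 22 bytes.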

Requests may be executed asynchronously so the call will not block and the CPU can continue with normal operations. An optional callback is invoked when the request has completed. As an example, consider moving a 128KByte file from flash storage into FT813 display memory:

  1. Read the first file chunk into a RAM buffer, submit an SPI request1 to transfer it asynchronously

  2. Read the second file chunk into another RAM buffer, and prepare request2 for that (but do not submit it yet)

  3. When request1 has completed, submit request2 (from the interrupt callback). Schedule a task to read the next chunk and prepare request1.

  4. When request2 has completed, continue from step (2) to submit request1, etc.
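The alternating scheme in the steps above can be modelled without any hardware. The sketch below is a simulation under stated assumptions: `FakeRequest`, `FakeController` and `pingPongTransfer` are hypothetical stand-ins, not the HSPI classes, and the "interrupt" is simulated by calling `poll()`. In a real application the refill would run as a scheduled task rather than inside the completion callback.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Stand-in for a request: a data buffer plus a completion callback.
struct FakeRequest {
    std::vector<uint8_t> data;
    std::function<void()> onComplete;
};

// Stand-in for the controller: one request in flight, completed by poll().
class FakeController
{
public:
    void submit(FakeRequest& req) { active = &req; }

    // Simulate the transfer-complete interrupt for the in-flight request.
    bool poll()
    {
        if(active == nullptr) {
            return false;
        }
        FakeRequest* req = active;
        active = nullptr;
        sink.insert(sink.end(), req->data.begin(), req->data.end());
        if(req->onComplete) {
            req->onComplete();
        }
        return true;
    }

    std::vector<uint8_t> sink; // Everything the simulated device received

private:
    FakeRequest* active{nullptr};
};

// Copy `source` to the simulated device using two alternating requests.
std::vector<uint8_t> pingPongTransfer(const std::vector<uint8_t>& source, size_t chunkSize)
{
    FakeController spi;
    FakeRequest req[2];
    bool ready[2] = {false, false};
    size_t offset = 0;
    int current = 0; // Index of the request currently in flight

    // Fill a request with the next chunk; returns false when no data remains.
    auto prepare = [&](FakeRequest& r) {
        size_t n = std::min(chunkSize, source.size() - offset);
        r.data.assign(source.data() + offset, source.data() + offset + n);
        offset += n;
        return n != 0;
    };

    std::function<void()> completed = [&]() {
        int next = 1 - current; // Prepared while `current` was in flight
        if(ready[next]) {
            spi.submit(req[next]);
        }
        ready[current] = prepare(req[current]); // Refill the finished request
        current = next;
    };
    req[0].onComplete = completed;
    req[1].onComplete = completed;

    ready[0] = prepare(req[0]); // Step 1: first chunk, submitted immediately
    if(ready[0]) {
        spi.submit(req[0]);
    }
    ready[1] = prepare(req[1]); // Step 2: second chunk, prepared but not submitted
    while(spi.poll()) {
        // Steps 3 and 4 repeat inside the completion callback
    }
    return spi.sink;
}
```

The point of the design is that one buffer is always being filled while the other is on the bus, so the bus never waits for a file read.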


A 64-byte data transfer (full hardware FIFO with 1 command byte and 3-byte address) at 26MHz would take 21us (5.25us in QIO mode) or 1680 (420) CPU cycles. To transfer 128Kbytes would take 2048 such transactions, 43ms (11ms for QIO), not including memory copy overheads.
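The figures above can be reproduced with a few lines of arithmetic. This sketch makes two assumptions that are inferred from the numbers rather than stated: the CPU runs at 80MHz (1680 cycles / 21us), and in the QIO case every bit is treated as moving four per clock. The function names are hypothetical.

```cpp
// Time in microseconds to clock one transaction onto the bus.
// Simplification: all phases are assumed to move `bitsPerClock` bits per clock.
constexpr double transactionTimeUs(unsigned cmdBits, unsigned addrBits, unsigned dataBytes, double clockMHz,
                                   unsigned bitsPerClock)
{
    return double(cmdBits + addrBits + dataBytes * 8) / bitsPerClock / clockMHz;
}

// CPU cycles elapsed during that time (80MHz CPU clock is an assumption).
constexpr double cpuCycles(double timeUs, double cpuMHz = 80.0)
{
    return timeUs * cpuMHz;
}

// Number of transactions needed to move `totalBytes` in chunks of `chunkBytes`.
constexpr unsigned transactionCount(unsigned totalBytes, unsigned chunkBytes)
{
    return (totalBytes + chunkBytes - 1) / chunkBytes;
}
```

For example, `transactionTimeUs(8, 24, 64, 26.0, 1)` gives roughly 20.9us, and multiplying by `transactionCount(128 * 1024, 64)` gives roughly 43ms, in line with the rounded values quoted above.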

In practice request sizes will be much smaller due to RAM constraints. Nevertheless, at high clock speeds the interrupt rate increases to the point where it consumes more CPU cycles than the actual transfer. The driver therefore disables interrupts in these situations and executes the request in task mode.

Bear in mind that issuing a blocking request will also require all queued requests to complete.

The driver does not currently support out-of-order execution, which might prioritise faster devices.

Pin Set

To avoid confusion, we’ll refer to the flash memory SPI bus as SPI0, and the user bus as SPI1. This driver doesn’t support direct use of SPI0 as on most devices it is reserved for flash memory. However, an overlap mode is supported which makes use of hardware arbitration to perform SPI1 transactions using SPI0 pins. This has several advantages:

  • Liberates three GPIO which would normally be required for MOSI, MISO and SCLK.

  • Only one additional pin is required for chip select.

  • Additional 2/4 bits-per-clock modes are available for supported devices.

For the ESP8266, these are the HSPI::PinSet assignments:

PinSet::normal
MISO=GPIO12, MOSI=GPIO13, SCLK=GPIO14. One chip select:
  1. GPIO15 (HSPI_CS)

PinSet::overlap
MISO=SD0, MOSI=SD1, IO2=SD3, IO3=SD2, SCLK=CLK. Three chip selects:
  1. GPIO15 (HSPI_CS)

  2. GPIO1 (SPI_CS1 / UART0_TXD). This conflicts with the normal serial TX pin which should be swapped to GPIO2 if required.

  3. GPIO0 (SPI_CS2)

PinSet::manual
Typically a GPIO will be assigned to perform chip select (CS). The application should register a callback function via HSPI::Controller::onSelectDevice() which performs the actual switching. The callback MUST be in IRAM.


The connections for IO2/3 look wrong above, but on two different models of SPI RAM chip these have been verified as correct by writing in SPIHD mode and reading in quad mode.

Multiplexed CS

Multiple devices can be supported on a single CS using, for example, a HC138 3:8 decoder. The CS line is connected to an enable input, with three GPIO outputs setting A0-2.

A custom controller should be created like this:

class CustomController: public HSPI::Controller
{
public:
   bool startDevice(HSPI::Device& dev, HSPI::PinSet pinSet, uint8_t chipSelect, uint32_t clockSpeed) override
   {
      /*
       * You should perform any custom validation here and return false on failure.
       * For example, if we're only using 4 of the 8 available outputs.
       */
      auto addr = chipSelect & 0x07;
      if(addr > 3) {
         debug_e("Invalid CS addr: %u", addr);
         return false;
      }

      /*
       * Provide a callback to route chip select signal as required.
       * Note this only needs to be done once, so could be called externally without
       * subclassing HSPI::Controller if the other mechanisms aren't required.
       */
      onSelectDevice(selectDevice);

      /*
       * Initialise hardware Controller
       */
      auto cs = chipSelect >> 3;
      return HSPI::Controller::startDevice(dev, pinSet, cs, clockSpeed);
   }

private:
   static void IRAM_ATTR selectDevice(uint8_t chipSelect, bool active);

   // Static so that the static callback can access it
   static uint8_t activeChipSelect;
};

Now in the .cpp file:

CustomController spi;

uint8_t CustomController::activeChipSelect{0};

void IRAM_ATTR CustomController::selectDevice(uint8_t chipSelect, bool active)
{
   // Only perform GPIO if CS changes as GPIO is expensive
   if(active && chipSelect != activeChipSelect) {
      auto addr = chipSelect & 0x07;
      digitalWrite(PIN_MUXADDR0, addr & 0x01);
      digitalWrite(PIN_MUXADDR1, addr & 0x02);
      // As we only need 2 address lines, can leave this one
      // digitalWrite(PIN_MUXADDR2, addr & 0x04);

      activeChipSelect = chipSelect;
   }

   // If using a hardware CS output then we're done.

   if(spi.getActivePinSet() == HSPI::PinSet::manual) {
      // For manual chip select operation, set the state directly here
   }
}
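The example above packs two values into the single `chipSelect` argument: the low three bits hold the decoder address and the remaining bits hold the hardware chip select number. A small sketch of that encoding follows. The helper names are hypothetical and exist only to make the bit layout explicit.

```cpp
#include <cstdint>

// Combine a hardware CS number and a 3-bit decoder address into one value,
// using the same layout as the example: address in bits 0-2, CS above that.
constexpr uint8_t makeChipSelect(uint8_t hardwareCs, uint8_t muxAddr)
{
    return uint8_t((hardwareCs << 3) | (muxAddr & 0x07));
}

// Decoder address (A0-A2), matching `chipSelect & 0x07` in the example.
constexpr uint8_t muxAddress(uint8_t chipSelect)
{
    return chipSelect & 0x07;
}

// Hardware chip select number, matching `chipSelect >> 3` in the example.
constexpr uint8_t hardwareCs(uint8_t chipSelect)
{
    return chipSelect >> 3;
}
```

A device on decoder output 2 behind hardware CS 1 would then be started with `makeChipSelect(1, 2)`, which is 0x0A.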

IO Modes

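Not to be confused with HSPI::ClockMode, the HSPI::IoMode determines how the command, address and data phases are transferred:

Bits per clock for each phase:

IO Mode   Command   Address   Data   Notes
SPI       1         1         1      Full-duplex
SPIHD     1         1         1      Half-duplex
DUAL      1         1         2
DIO       1         2         2
SDI       2         2         2
QUAD      1         1         4
QIO       1         4         4
SQI       4         4         4
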
SDI and SQI are not supported directly by hardware, but are implemented within the driver using the address phase. In these modes, commands are limited to 8 bits.

This seems to be consistent with the ESP32 IDF driver, as in spi_ll.h:

/** IO modes supported by the master. */
typedef enum {
    SPI_LL_IO_MODE_NORMAL = 0,  ///< 1-bit mode for all phases
    SPI_LL_IO_MODE_DIO,         ///< 2-bit mode for address and data phases, 1-bit mode for command phase
    SPI_LL_IO_MODE_DUAL,        ///< 2-bit mode for data phases only, 1-bit mode for command and address phases
    SPI_LL_IO_MODE_QIO,         ///< 4-bit mode for address and data phases, 1-bit mode for command phase
    SPI_LL_IO_MODE_QUAD,        ///< 4-bit mode for data phases only, 1-bit mode for command and address phases
} spi_ll_io_mode_t;

Some devices (e.g. W25Q32 flash) have specific commands to support these modes, but others (e.g. IS62/65WVS2568GALL fast serial RAM) do not, and the SDI/SQI mode setting applies to all phases. This needs to be implemented in the driver as otherwise the user code is more complex than necessary and performance suffers considerably.
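To make the difference between the modes concrete, the sketch below maps each mode to its bits per clock for the command, address and data phases, then counts the clocks a transaction needs. The `Mode` enum is a local mirror of the mode names for illustration only; it is not the library's `HSPI::IoMode` and its values are not meant to match. Dummy bits are ignored and bit counts are assumed to divide evenly.

```cpp
#include <cstdint>

// Local mirror of the IO mode names (illustration only).
enum class Mode { SPI, SPIHD, DUAL, DIO, SDI, QUAD, QIO, SQI };

// Bits transferred per clock in each phase.
struct PhaseWidths {
    uint8_t command;
    uint8_t address;
    uint8_t data;
};

constexpr PhaseWidths phaseWidths(Mode mode)
{
    switch(mode) {
    case Mode::DUAL:
        return PhaseWidths{1, 1, 2};
    case Mode::DIO:
        return PhaseWidths{1, 2, 2};
    case Mode::SDI:
        return PhaseWidths{2, 2, 2};
    case Mode::QUAD:
        return PhaseWidths{1, 1, 4};
    case Mode::QIO:
        return PhaseWidths{1, 4, 4};
    case Mode::SQI:
        return PhaseWidths{4, 4, 4};
    default:
        return PhaseWidths{1, 1, 1}; // SPI and SPIHD
    }
}

// Clock cycles needed for the command, address and data phases.
constexpr unsigned clocksFor(Mode mode, unsigned cmdBits, unsigned addrBits, unsigned dataBits)
{
    return cmdBits / phaseWidths(mode).command + addrBits / phaseWidths(mode).address +
           dataBits / phaseWidths(mode).data;
}
```

For an 8-bit command, 24-bit address and 64 data bytes this gives 544 clocks in SPI mode, 142 in QIO and 136 in SQI, which shows why applying the wide mode to all phases matters most for short transfers.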


The HSPI::StreamAdapter provides support for streaming of data to/from memory devices.

This would be used, for example, to transfer content asynchronously between a FileStream or FlashMemoryStream and SPI RAM.

Supported devices must inherit from HSPI::MemoryDevice.


enum HSPI::ClockMode

SPI clock polarity (CPOL) and phase (CPHA)


enumerator mode0

CPOL: 0 CPHA: 0.

enumerator mode1

CPOL: 0 CPHA: 1.

enumerator mode2

CPOL: 1 CPHA: 0.

enumerator mode3

CPOL: 1 CPHA: 1.
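The four modes follow the usual SPI numbering, where the mode number encodes CPOL in bit 1 and CPHA in bit 0. The sketch below uses a local enum, `ClockModeSketch`, numbered 0 to 3; whether `HSPI::ClockMode` uses the same underlying values is an assumption that should be checked against the header.

```cpp
#include <cstdint>

// Local enum for illustration, numbered 0-3 in declaration order.
enum class ClockModeSketch : uint8_t { mode0, mode1, mode2, mode3 };

// Clock polarity: true if the clock idles high.
constexpr bool cpol(ClockModeSketch mode)
{
    return (uint8_t(mode) & 0x02) != 0;
}

// Clock phase: true if data is sampled on the second clock edge.
constexpr bool cpha(ClockModeSketch mode)
{
    return (uint8_t(mode) & 0x01) != 0;
}
```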

enum HSPI::IoMode

Mode of data transfer.


enumerator SPI

One bit per clock, MOSI stage concurrent with MISO (full-duplex)

enumerator SPIHD

One bit per clock, MISO stage follows MOSI (half-duplex)

enumerator SPI3WIRE

Half-duplex using MOSI for both sending and receiving data.

enumerator DUAL

Two bits per clock for Data, 1-bit for Command and Address.

enumerator DIO

Two bits per clock for Address and Data, 1-bit for Command.

enumerator SDI

Two bits per clock for Command, Address and Data.

enumerator QUAD

Four bits per clock for Data, 1-bit for Command and Address.

enumerator QIO

Four bits per clock for Address and Data, 1-bit for Command.

enumerator SQI

Four bits per clock for Command, Address and Data.

enumerator MAX

enum HSPI::PinSet

How SPI hardware pins are connected.


enumerator none


enumerator normal

Standard HSPI pins.

enumerator manual

HSPI pins with manual chip select.

enumerator overlap

Overlapped with SPI 0.

struct HSPI::Request

Defines an SPI Request Packet.

Request fields may be accessed directly or by use of helper methods.

Application is responsible for managing Request object construction/destruction. Queuing is managed as a linked list so the objects aren’t copied.

Applications will typically only require a couple of Request objects, so one can be prepared whilst the other is in flight. This helps to minimise the setup latency between SPI transactions.

Set value for command phase

inline void setCommand(uint16_t command, uint8_t bitCount)
  • command

  • bitCount – Length of command in bits

inline void setCommand8(uint8_t command)

Set 8-bit command.



inline void setCommand16(uint16_t command)

Set 16-bit command.



Set value for address phase

inline void setAddress(uint32_t address, uint8_t bitCount)
  • address

  • bitCount – Length of address in bits

inline void setAddress24(uint32_t address)

Set 24-bit address.



Public Functions

inline void setAsync(Callback callback = nullptr, void *param = nullptr)

Set request to asynchronous execution with optional callback.

Public Members

Device *device = {nullptr}

Target device for this request.

Request *next = {nullptr}

Controller uses this to queue requests.

uint16_t cmd = {0}

Command value.

uint8_t cmdLen = {0}

Command bits, 0 - 16.

uint8_t async

Set for asynchronous operation.

uint8_t task

Controller will execute this request in task mode.

uint8_t busy

Request in progress.

uint32_t addr = {0}

Address value.

uint8_t addrLen = {0}

Address bits, 0 - 32.

uint8_t dummyLen = {0}

Dummy read bits between address and read data, 0 - 255.

uint8_t sizeAlign = {0}

Required size alignment of each transaction (if split up)

Data out

Outgoing data.

Data in

Incoming data.

Callback callback = {nullptr}

Completion routine.

void *param = {nullptr}

User parameter.
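As a rough picture of how the command and address helpers relate to the public fields, here is a self-contained stand-in. `RequestSketch` only mirrors the documented field and method names; it is not the real `HSPI::Request`, and the helper bodies are an assumption about what the inline methods do. The 0x03 opcode in `makeReadHeader` is the READ command used by many serial memories and is chosen purely as an example; check the device datasheet.

```cpp
#include <cstdint>

// Stand-in with the same field and helper names as the documented struct.
struct RequestSketch {
    uint16_t cmd{0};    // Command value
    uint8_t cmdLen{0};  // Command bits, 0 - 16
    uint32_t addr{0};   // Address value
    uint8_t addrLen{0}; // Address bits, 0 - 32

    void setCommand(uint16_t command, uint8_t bitCount)
    {
        cmd = command;
        cmdLen = bitCount;
    }

    void setCommand8(uint8_t command) { setCommand(command, 8); }

    void setCommand16(uint16_t command) { setCommand(command, 16); }

    void setAddress(uint32_t address, uint8_t bitCount)
    {
        addr = address;
        addrLen = bitCount;
    }

    void setAddress24(uint32_t address) { setAddress(address, 24); }
};

// Hypothetical helper: header for a read at `address` from a serial memory.
inline RequestSketch makeReadHeader(uint32_t address)
{
    RequestSketch req;
    req.setCommand8(0x03); // Example opcode only, device-specific
    req.setAddress24(address);
    return req;
}
```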

struct HSPI::Data

Specifies a block of incoming or outgoing data.

Data can be specified directly within Data, or as a buffer reference.

Command or address are stored in native byte order and rearranged according to the requested byteOrder setting. Data is always sent and received LSB first (as stored in memory) so any re-ordering must be done by the device or application.

Set internal data value of 1-4 bytes


Data is sent LSB, MSB (native byte order)

inline void set8(uint8_t data)

Set to single 8-bit value.



inline void set16(uint16_t data)

Set to single 16-bit value.



inline void set32(uint32_t data, uint8_t len = 4)

Set to 32-bit data.

  • data

  • len – Length in bytes (1 - 4)
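Because internal values are sent LSB first, the byte order on the wire for a multi-byte value can be surprising. The sketch below shows the order in which the bytes of a 1 to 4 byte value would go out under that rule. `toWireOrder` is a hypothetical helper for illustration and is not part of `HSPI::Data`.

```cpp
#include <cstdint>
#include <vector>

// Return the bytes of `value` in transmission order (least significant first).
// `len` is the number of bytes to send, clamped to 4.
std::vector<uint8_t> toWireOrder(uint32_t value, uint8_t len = 4)
{
    std::vector<uint8_t> bytes;
    for(uint8_t i = 0; i < len && i < 4; ++i) {
        bytes.push_back(uint8_t(value >> (8 * i)));
    }
    return bytes;
}
```

So a 32-bit value of 0x12345678 goes out as 0x78, 0x56, 0x34, 0x12. A device expecting the most significant byte first needs the application to swap the bytes before calling the setter.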

Public Functions

inline void clear()

Reset to zero-length.

inline void set(const void *data, uint16_t count)

Set to reference external data block.

  • data – Location of data

  • count – Number of bytes

Public Members

void *ptr

Pointer to data.

uint16_t length

Number of bytes of data.

uint16_t isPointer

If set, data is referenced indirectly, otherwise it’s stored directly.

class HSPI::Device

Manages a specific SPI device instance attached to a controller.

Subclassed by Graphics::SpiDisplay, Graphics::XPT2046, HSPI::MemoryDevice

Public Functions

inline bool begin(PinSet pinSet, uint8_t chipSelect, uint32_t clockSpeed)

Register device with controller and prepare for action.

  • pinSet – Use PinSet::normal for Esp32, other values for Esp8266

  • chipSelect – Identifies the CS number for ESP8266, or the GPIO pin for ESP32

  • clockSpeed – Bus speed

inline bool isReady() const

Determine if the device is initialised.



virtual IoModes getSupportedIoModes() const = 0

Return set of IO modes supported by a device implementation.

inline bool isSupported(IoMode mode) const

Determine if the device/controller combination supports an IO mode. Must be called after begin() as other settings (e.g. pinset) can affect support.

inline void onTransfer(Callback callback)

Set a callback to be invoked before a request is started, and when it has finished.


callback – Invoked in interrupt context, MUST be in IRAM

class HSPI::MemoryDevice : public HSPI::Device

Base class for read/write addressable devices.

Subclassed by HSPI::RAM::IS62_65, HSPI::RAM::PSRAM64

Prepare a write request

virtual void prepareWrite(HSPI::Request &req, uint32_t address) = 0

Prepare request without data.

inline void prepareWrite(HSPI::Request &req, uint32_t address, const void *data, size_t len)

Prepare request with data.

  • data – Data to send

  • len – Size of data in bytes

Prepare a read request

virtual void prepareRead(HSPI::Request &req, uint32_t address) = 0

Prepare without buffer.

inline void prepareRead(HSPI::Request &req, uint32_t address, void *buffer, size_t len)

Prepare with buffer.

  • buffer – Where to write incoming data

  • len – Size of buffer

Public Functions

inline void write(uint32_t address, const void *data, size_t len)

Write a block of data.


Limited by current operating mode

  • address

  • data

  • len

inline void read(uint32_t address, void *buffer, size_t len)

Read a block of data.


Limited by current operating mode

  • address

  • buffer

  • len

class PSRAM64 : public HSPI::MemoryDevice

PSRAM64(H) pseudo-SRAM.

class IS62_65 : public HSPI::MemoryDevice

IS62/65WVS2568GALL fast serial RAM.

class HSPI::Controller : public ControllerBase

Manages access to SPI hardware.

Public Types

using SelectDevice = void (*)(uint8_t chipSelect, bool active)

Interrupt callback for custom Controllers.

For manual CS (PinSet::manual) the actual CS GPIO must be asserted/de-asserted.

Expanding the SPI bus using a HC138 3:8 multiplexer, for example, can also be handled here, setting the GPIO address lines appropriately.

  • chipSelect – The value passed to startDevice()

  • active – true when transaction is about to start, false when completed

Public Functions

void end()

Disable HSPI controller.


Reverts HSPI pins to GPIO and disables the controller

IoModes getSupportedIoModes(const Device &dev) const

Determine which IO modes are supported for the given device.

May be restricted by both controller and device capabilities.

inline void onSelectDevice(SelectDevice callback)

Set interrupt callback to use for manual CS control (PinSet::manual) or if CS pin is multiplexed.


Callback MUST be marked IRAM_ATTR

virtual bool startDevice(Device &dev, PinSet pinSet, uint8_t chipSelect, uint32_t clockSpeed)

Assign a device to a CS# using a specific pin set. Only one device may be assigned to any CS.

Custom controllers should override this method to verify/configure chip selects, and also provide a callback (via onSelectDevice()).

virtual void stopDevice(Device &dev)

Release CS for a device.

void configChanged(Device &dev)

Devices call this method to tell the Controller about configuration changes. Internally, we just set a flag and update the register values when required.

inline SpiBus getBusId() const

Get the active bus identifier.

On successful call to begin() returns actual bus in use.

bool loopback(bool enable)

For testing, tie MISO <-> MOSI internally.

  • enable – true to enable loopback, false for normal receive operation

Returns true on success, false if loopback is not supported.

class StreamAdapter

Helper class for streaming data to/from SPI devices.


SoC support

  • esp32

  • esp32c3

  • esp32s2

  • esp32s3

  • esp8266

  • host

  • rp2040