FPGA Zero to Hero Vol 4 EBAZ4205 Chronicles

In the previous blog post, we explored UART communication on an Altera/Intel FPGA, diving deep into a bare-metal transmitter and receiver implementation. Unfortunately, I mixed up the polarity of the Vcc and Gnd pins and burnt my board. Attempts to salvage it by replacing the low-dropout (LDO) voltage regulators were not successful. I decided to re-order another board, but by now the cost has increased significantly.

After searching through a lot of forums, I struck gold deep in the internet on some Japanese forums. The board I found costs just 20 bucks and has a Xilinx (now AMD) Zynq Z7010 FPGA. It is a used part taken from old FPGA boards that Chinese bitcoin miners used in their crypto miners. The board has some quirks, but it is the cheapest Zynq-based FPGA board I have seen that can be used as a development board. It is known as the EBAZ4205 crypto miner board and can be found on eBay and AliExpress in an insane 15 to 30 bucks price range.

EBAZ4205

To continue the FPGA journey, I wanted to explore another serial communication protocol that is very commonly used in embedded systems: the Serial Peripheral Interface (SPI).

This protocol differs from UART in that it relies on a shared clock (generated by the master) and uses dedicated lines to transmit (MOSI) and receive (MISO) data. The signal names already explain which side transmits and which side receives:

  1. MOSI (Master Out Slave In)
  2. MISO (Master In Slave Out)
  3. SCK (Serial Clock)
  4. SS (Slave Select), sometimes called CS (Chip Select)

While it uses more pins than UART, SPI can offer faster data rates and full-duplex communication. The protocol itself defines no upper limit on the data rate; the limit depends on what the slave device can handle, and it can go up to 80 MHz for some devices. UART, in comparison, tops out around 921600 baud, and at that rate the bit error rate can be quite high.

Understanding SPI
#

SPI is a synchronous serial communication interface used in embedded systems to achieve high-speed data transfers. Unlike UART, SPI does not need start or stop bits, and the two sides do not have to agree on a baud rate in advance. Instead, the SPI master outputs a clock (SCK), and both master and slave shift data on the MOSI and MISO lines in synchronization with this clock.

A typical SPI communication flow from Master to Slave looks like this:

  1. SS / CS is driven low by the master to select the slave device.
  2. The master toggles SCK; on each clock cycle the master shifts one bit out on MOSI and reads one bit from the slave on MISO.
  3. After the data is fully transferred (typically 8 bits), the master can drive SS/CS high again to end the transaction or start another transfer sequence.

Remember that SPI has no formal standard, but there are a few common conventions: most significant bit first, 8 bits per transfer, an active-low enable (chip select) line, a clock that is low when inactive, and data that is valid on the leading clock edge. So it is always important to read the datasheet of the master or slave device you want to use before implementing anything. For example, if you want to use an SPI-based analog-to-digital converter (ADC) with an FPGA, the factors described above are the first things to check in order to integrate it successfully. I will be implementing the

SPI Modes
#

SPI can operate in four modes (Mode 0, 1, 2, 3), determined by clock polarity (CPOL) and clock phase (CPHA). In this blog example, we’ll illustrate Mode 1 (CPOL=0, CPHA=1). Adapting to other modes involves shifting data on different clock edges or inverting the initial clock state, but the underlying principles remain the same.
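
The relationship between the mode number, CPOL, CPHA and the sampling edge can be sketched as a small decoder. This is an illustrative helper only: the module name spi_mode_decode and its port names are made up here, and the SPI master in this post does not use it (it is hard-wired to Mode 1).

```verilog
// Hypothetical helper, not part of the design in this post.
// Decodes an SPI mode number (0..3) into CPOL/CPHA and tells which
// SCK edge is used to sample data.
module spi_mode_decode (
    input  wire [1:0] mode,             // SPI mode number, {CPOL, CPHA}
    output wire       cpol,             // idle level of SCK
    output wire       cpha,             // 0: sample on leading edge, 1: on trailing edge
    output wire       sample_on_rising  // 1 when data is sampled on the rising SCK edge
);
    assign cpol = mode[1];
    assign cpha = mode[0];

    // The leading edge is rising when CPOL=0 and falling when CPOL=1.
    // CPHA=0 samples on the leading edge, CPHA=1 on the trailing edge,
    // so the sampling edge is a rising edge exactly when CPOL == CPHA.
    assign sample_on_rising = (cpol == cpha);
endmodule
```

For Mode 1 (mode = 2'b01) this gives cpol = 0, cpha = 1 and sample_on_rising = 0: data is changed on the rising edge of SCK and sampled on the falling edge.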

SPI Master Code
#

Below is a simple SPI master module written in Verilog. We assume a 50 MHz system clock and want an SPI clock of, say, 1 MHz for demonstration. This is done with a counter-based clock divider, similar to how we derived a baud rate in the UART example.

module spi_master #(
    parameter CLOCK_FREQ = 50_000_000,
    parameter SPI_CLK    = 1_000_000
)(
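    input  wire       clk,        // system clock (50 MHz by default)
    input  wire       rst,        // asynchronous reset, active high
    input  wire [7:0] tx_data,    // byte to transmit
    output wire [7:0] rx_data,    // received byte, complete once the transfer ends
    output reg        busy,       // high while a transfer is in progress
    output reg        sck,        // SPI clock, idles low (CPOL=0)
    output reg        mosi,       // Master Out Slave In
    input  wire       miso,       // Master In Slave Out
    output reg        cs          // chip select, active low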
);
    localparam integer DIVIDER = CLOCK_FREQ / (2*SPI_CLK);
    localparam [1:0]  IDLE  = 2'b00, START = 2'b01,
                      TRANS = 2'b10, DONE  = 2'b11;
    reg [1:0]  state;
    reg [7:0]  shifter_tx;
    reg [7:0]  shifter_rx;
    reg [2:0]  bit_cnt; 
    reg [15:0] clk_cnt;   
    assign rx_data = shifter_rx;
    always @(posedge clk or posedge rst) begin
        if (rst) begin
            state       <= IDLE;
            sck         <= 1'b0;
            cs          <= 1'b1;
            busy        <= 1'b0;
            shifter_tx  <= 8'h00;
            shifter_rx  <= 8'h00;
            bit_cnt     <= 3'd0;
            clk_cnt     <= 16'd0;
            mosi        <= 1'b0;
        end else begin
            case (state)
            IDLE : begin
                cs         <= 1'b1;
                sck        <= 1'b0;
                clk_cnt    <= 16'd0;
                shifter_tx <= tx_data;   // latch the byte to send
                bit_cnt    <= 3'd7;      // 8 bits, MSB first
                busy       <= 1'b1;      // a new transfer starts right away
                state      <= START;
            end
            START : begin
                cs <= 1'b0;
                if (clk_cnt == DIVIDER-1) begin
                    clk_cnt <= 16'd0;
                    state   <= TRANS;
                end else begin
                    clk_cnt <= clk_cnt + 1;
                end
            end
            TRANS : begin
                if (clk_cnt == DIVIDER-1) begin
                    clk_cnt <= 16'd0;
                    sck     <= ~sck;
                    if (~sck) begin
                        mosi       <= shifter_tx[7];
                        shifter_tx <= {shifter_tx[6:0], 1'b0};
                    end
                    else begin
                        shifter_rx <= {shifter_rx[6:0], miso};
                        if (bit_cnt == 3'd0) begin
                            state <= DONE;
                        end else begin
                            bit_cnt <= bit_cnt - 1;
                        end
                    end
                end else begin
                    clk_cnt <= clk_cnt + 1;
                end
            end
            DONE : begin
                cs   <= 1'b1;
                sck  <= 1'b0;
                busy <= 1'b0;
                state<= IDLE;
            end
            endcase
        end
    end
endmodule

Let's dig deeper into the SPI master code. The first part of the code, where we define the module with its input and output wires and registers, was already explained in the previous blog. The most important signals are:

    output reg   sck,        
    output reg   mosi,       
    input  wire  miso,       
    output reg   cs 

The state machine takes the data from input wire [7:0] tx_data and serializes it bit by bit onto mosi.

The first part of the code resets everything before the state machine starts.

    state       <= IDLE;
    sck         <= 1'b0;
    cs          <= 1'b1;
    busy        <= 1'b0;
    shifter_tx  <= 8'h00;
    shifter_rx  <= 8'h00;
    bit_cnt     <= 3'd0;
    clk_cnt     <= 16'd0;
    mosi        <= 1'b0;

After all registers and outputs are set to their initial values, the IDLE state begins. In this state the module sets busy high, keeps cs high and sck low, and clears clk_cnt. The shifter_tx register, which is used to serialize the input data, is loaded from tx_data. The bit counter bit_cnt is set to 7 and the next state is set to START. Note that there is no start input, so the module begins a new transfer as soon as it returns to IDLE.

    cs         <= 1'b1;
    sck        <= 1'b0;
    clk_cnt    <= 16'd0;
    shifter_tx <= tx_data;
    bit_cnt    <= 3'd7;
    busy       <= 1'b1;
    state      <= START;

In the START state the chip select cs is driven low, which begins the SPI transaction. The START state lasts for half an SCK period, that is DIVIDER system clock cycles. This is timed with the clock counter clk_cnt, which counts up to DIVIDER-1 before the state changes to TRANS.

    cs          <= 1'b0;
    if (clk_cnt == DIVIDER-1) begin
        clk_cnt <= 16'd0;
        state   <= TRANS;
    end else begin
        clk_cnt <= clk_cnt + 1;
    end
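
To put numbers on this timing, here is a minimal sketch of the divider arithmetic with the default parameters. The module name spi_divider_example and the ACTUAL_SPI_CLK parameter are made up for illustration and are not part of the design. It is meant for simulation only.

```verilog
// Illustration only: spi_divider_example is a hypothetical, simulation-only
// module that mirrors the DIVIDER calculation from spi_master.
module spi_divider_example;
    localparam integer CLOCK_FREQ = 50_000_000;  // system clock in Hz
    localparam integer SPI_CLK    = 1_000_000;   // requested SCK in Hz

    // Same expression as in spi_master:
    // number of system clock cycles per half SCK period.
    localparam integer DIVIDER = CLOCK_FREQ / (2*SPI_CLK);

    // SCK frequency that is actually produced after integer truncation.
    localparam integer ACTUAL_SPI_CLK = CLOCK_FREQ / (2*DIVIDER);

    initial begin
        $display("DIVIDER        = %0d system clock cycles", DIVIDER);
        $display("ACTUAL_SPI_CLK = %0d Hz", ACTUAL_SPI_CLK);
        $finish;
    end
endmodule
```

With the defaults, DIVIDER is 50,000,000 / 2,000,000 = 25, so each half period of SCK lasts 25 cycles of 20 ns, or 500 ns. Because this is an integer division, a requested rate that does not divide evenly is rounded: asking for 3 MHz gives DIVIDER = 8 and an actual SCK of 3.125 MHz. The 16-bit clk_cnt also limits DIVIDER to at most 65536.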

In the transmission state TRANS, the clk_cnt counter again controls the timing as before. Each time clk_cnt reaches DIVIDER-1, sck is toggled. When sck is currently low (so this toggle is a rising edge), mosi is set to the MSB of shifter_tx and the register is shifted left by one. When sck is currently high (a falling edge), the current value of miso is shifted into shifter_rx. On that same falling edge, if bit_cnt has reached 0 the state changes to DONE; otherwise bit_cnt is decremented by 1.

    if (clk_cnt == DIVIDER-1) begin
        clk_cnt <= 16'd0;
        sck     <= ~sck;
        if (~sck) begin
            mosi       <= shifter_tx[7];
            shifter_tx <= {shifter_tx[6:0], 1'b0};
        end
        else begin
            shifter_rx <= {shifter_rx[6:0], miso};
            if (bit_cnt == 3'd0) begin
                state <= DONE;
            end else begin
                bit_cnt <= bit_cnt - 1;
            end
        end
    end else begin
        clk_cnt <= clk_cnt + 1;
    end

The DONE state is quite easy to understand, so I will leave it as homework for the reader of the blog.
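
Before going to hardware, the module can be checked in simulation. The following is a minimal loopback testbench sketch, not something from the original design: the name spi_master_tb and the idea of wiring mosi straight back into miso are assumptions made for this example, and it assumes the spi_master module above is compiled together with it. The constant 0xAA matches the value used for the hardware test later in this post.

```verilog
`timescale 1ns/1ps
// Hypothetical loopback testbench for the spi_master module above.
// MOSI is fed straight back into MISO, so the master should read back
// the same byte it sends.
module spi_master_tb;
    reg        clk = 1'b0;
    reg        rst = 1'b1;
    wire [7:0] rx_data;
    wire       busy, sck, mosi, cs;

    spi_master #(
        .CLOCK_FREQ(50_000_000),
        .SPI_CLK   (1_000_000)
    ) dut (
        .clk    (clk),
        .rst    (rst),
        .tx_data(8'hAA),     // constant test byte
        .rx_data(rx_data),
        .busy   (busy),
        .sck    (sck),
        .mosi   (mosi),
        .miso   (mosi),      // loopback: MISO driven by MOSI
        .cs     (cs)
    );

    always #10 clk = ~clk;   // 20 ns period = 50 MHz

    initial begin
        #100 rst = 1'b0;     // release reset after a few clock cycles
        @(negedge cs);       // first transfer starts
        @(posedge cs);       // first transfer is finished
        $display("tx=0xAA rx=0x%h", rx_data);
        $finish;
    end
endmodule
```

Since the master changes mosi on the rising edge of sck and samples miso on the falling edge, each looped-back bit is stable when it is sampled, and rx_data should read back 0xAA once cs returns high.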

From Zero to Block Design in Xilinx Design Software
#

Create a new Vivado project
#

Download the latest version of the Vivado design software from the AMD website. There is a free WebPACK license version, but you need to register on the website. I am using Ubuntu 22.04, but it should work the same on Windows.

The first step after downloading is to create a new project:

  1. File -> New -> Project
Create Project Create Project 2
  2. Project Name: MySPI
Create Project 3
  3. Choose RTL Project -> Do not add sources at this stage.
Create Project 4
  4. Part/Board Selection
    • Select the bare silicon: “XC7Z010‑1CLG400”
Create Project 5
  5. Finish. Vivado drops you into the Flow Navigator.
Create Project 6

Empty Block Design
#

  1. Flow Navigator -> IP Integrator -> Create Block Design
    • Name it system
  2. Right-click the canvas -> Add IP -> Zynq7 Processing System.
Add IP
  3. With the PS selected, click Run Block Automation.
Add IP 2 Add IP 3
  4. We need one more clock as the clk input for our SPI module.
    • Double-click the ZYNQ7 Processing System IP and go to Clock Configuration as shown below
Add IP 3
  5. Create an HDL wrapper by right-clicking the block design in the design sources.
Add IP 4

Creating SPI Master module
#

SPI Master Module 1 SPI Master Module 2 SPI Master Module 3 SPI Master Module 4

Adding SPI Master module to Block Design
#

SPI to Block 1

After adding the module, we need to connect its ports to external interfaces as shown below:

SPI to Block 2

Also add a Constant IP that provides the value we want to send via the SPI master interface (for now we are just interested in testing the interface).

SPI to Block 2

Connect the constant to the tx_data port of the SPI module. The final diagram should look something like this:

Final block diagram

Adding Constraints file
#

Now we need to define how the ports of the block design (for example sck_0) are connected to the real hardware GPIO pins.

We do this with a constraints file that defines these connections.

It is created with the same workflow as a Verilog file.

Constraint 1 Constraint 2

Once EBAZ4205.xdc is open, copy the following content into it and save it.

set_property -dict { PACKAGE_PIN H15   IOSTANDARD LVCMOS33 } [get_ports { miso_0 }];  
set_property -dict { PACKAGE_PIN B19   IOSTANDARD LVCMOS33 } [get_ports { mosi_0 }];  
set_property -dict { PACKAGE_PIN C20   IOSTANDARD LVCMOS33 } [get_ports { cs_0 }];    
set_property -dict { PACKAGE_PIN H17   IOSTANDARD LVCMOS33 } [get_ports { sck_0 }];   
set_property -dict { PACKAGE_PIN D18   IOSTANDARD LVCMOS33 } [get_ports { busy_0 }];   
set_property -dict { PACKAGE_PIN A20   IOSTANDARD LVCMOS33 } [get_ports { rst_0 }];   

This connects the pins on the Data1 port to the SPI master module ports.

Synthesize and Program the FPGA
#

Click the Run Synthesis button, then run Run Implementation and Generate Bitstream.

Finally, the bitstream file with the .bit extension is ready to be flashed. I use a Waveshare USB Platform Cable as the JTAG programmer to program it.

SPI final result

Final result
#

As in my previous blogs, I tested the output with a logic analyser and got 0xAA as expected.

SPI final result

Although the SPI master is working, we are still not using the full power of the Zynq FPGA. The Zynq series of FPGAs has a dual-core Cortex™-A9 based processing system (PS) which can run a small Linux image. In my next blog I will continue the journey and try to connect this module to the ARM cores through an AXI-Lite interface. This will enable us to send data to and receive data from the module from within a Linux instance running on the ARM cores.