Ruprecht-Karls-Universität Heidelberg


Verification and Implementation of PCI Express Endpoint Remote Configuration using EXTOLL

Diploma Thesis by Dirk Frey

Abstract:

Recent developments in graphics processing unit (GPU) accelerated computing provide add-in cards with more than 1 teraFLOPS of computing power. These cards communicate over a standardised PCI Express (PCIe) interface, so the computing power of a system can be increased significantly by using many of them. Combining such an add-in card with a low-latency network allows small form factor, power-efficient remote nodes that consist only of an accelerator board and a network card, without a processor or main memory. Such remote nodes cannot configure themselves, however, and must be configured remotely. This thesis presents an approach for remote configuration of PCIe devices such as GPUs via the Extoll network.

Full configuration support for directly connecting a PCIe GPU to a PCIe network device acting as Root Port requires remote access and support for all types of PCIe packets that might be needed to initialize and configure the GPU. To reduce the hardware effort of translating or tunneling PCIe traffic and to give software maximum flexibility, this thesis presents an approach that uses a hardware module called the PCIe Backdoor. This module allows a remote software process to create any kind of PCIe request and to extract specific PCIe packets that are of interest to the software.
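As an illustration of what creating a PCIe request in software can look like, the following minimal sketch assembles the three-doubleword header of a Type 0 configuration request as laid out in the PCI Express Base Specification. It is not the interface of the PCIe Backdoor module, which this abstract does not describe: the function name build_cfg_header, its parameters, and the idea of returning the header as raw bytes are assumptions made for this example only.

```python
import struct

# Fmt/Type byte values for Type 0 configuration requests
# (first header byte: Fmt in bits 7:5, Type in bits 4:0).
CFG_RD0 = 0x04  # configuration read, 3DW header, no data payload
CFG_WR0 = 0x44  # configuration write, 3DW header, one DW of data follows


def build_cfg_header(fmt_type, requester_id, tag, bus, device, function,
                     register, first_be=0xF):
    """Hypothetical helper: pack a 12-byte configuration request header.

    'register' is the byte offset into configuration space (DW aligned).
    """
    if register % 4 != 0 or not 0 <= register < 4096:
        raise ValueError("register must be a DW-aligned offset below 4096")
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")

    # DW0: Fmt/Type in the top byte, Length = 1 DW; TC, Attr, TD, EP left at 0.
    dw0 = (fmt_type << 24) | 1
    # DW1: Requester ID, Tag, Last DW BE = 0, First DW BE in the low nibble.
    dw1 = ((requester_id & 0xFFFF) << 16) | ((tag & 0xFF) << 8) | (first_be & 0xF)
    # DW2: target Bus/Device/Function, then extended register and register number.
    dw2 = (bus << 24) | (device << 19) | (function << 16) | (register & 0xFFC)

    # TLP headers are transmitted most significant byte first.
    return struct.pack(">III", dw0, dw1, dw2)


if __name__ == "__main__":
    # Read the Vendor ID / Device ID register of bus 1, device 0, function 0.
    header = build_cfg_header(CFG_RD0, requester_id=0x0000, tag=1,
                              bus=1, device=0, function=0, register=0x00)
    print(header.hex())
```

A configuration write would use CFG_WR0 and be followed by one doubleword of payload. How such a packet is handed to the hardware and how the completion is read back is specific to the module and is not shown here.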

This thesis gives an insight into the PCI Express protocol in general and describes the necessary existing and newly developed hardware modules. A verification environment, written in SystemVerilog and using the Universal Verification Methodology (UVM) class library, was created to exhaustively test the new hardware module and its interaction with the Extoll network chip acting as PCIe Root Port.

The operation of the testbench is described in detail, and the methodology and techniques used are explained. The results of the simulation and of a real hardware implementation are presented and evaluated.

 
