NI 78xx API Reference

Using DRAM Effectively with NI R Series Multifunction RIO

The following design considerations can affect the throughput and storage capacity that you can achieve with the dynamic random access memory (DRAM) interface of the FPGA on R Series devices:

Access Size and Frequency

Access size refers to the amount of information stored in one memory address. You can set up memory items to use a variety of data types, but if you want to achieve the best performance, use a data type that matches the access size of the device. The access size is the exact number of bits that are written and read in a given memory access.

The following table lists the specifications for the NI PCIe and NI PXIe R Series devices that support FPGA-accessible DRAM, including the optimum access size.

| Device | Number of DRAM Banks | Size per Bank | Maximum Theoretical Bandwidth per Bank | Access Size |
|---|---|---|---|---|
| NI PCIe-7821R | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PCIe-7822R | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PCIe-7857 | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PCIe-7858 | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PXIe-7821R | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PXIe-7822R | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PXIe-7847R | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PXIe-7857R | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PXIe-7858R | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PXIe-7861 | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PXIe-7862 | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PXIe-7865 | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PXIe-7866 | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PXIe-7867R | 1 | 512 MB | 800 MB/s | 64 bits |
| NI PXIe-7868R | 1 | 512 MB | 800 MB/s | 64 bits |

Data Recommendations

If you use a data type that is smaller than the access size, the remaining bits receive unknown, invalid values but are still written, consuming both space and bandwidth. For example, if the access size is 64 bits and you choose a 32-bit data type when configuring the DRAM, the remaining 32 bits of each access contain unknown, invalid data. The following figure shows an optimized memory element and a memory element in which the data type is smaller than the access size.

NI recommends using data types that are exactly the same width as the access size of the DRAM to ensure that each access is fully utilized. Memory items accept clusters as data types, and you can package data into clusters to create data types wider than those native to LabVIEW.
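
For example, two 32-bit samples can be combined into a single 64-bit value so that every access carries a full payload. The following C sketch only illustrates that packing concept; in LabVIEW you would instead bundle the two U32 values into a cluster for the memory item.

```c
#include <stdint.h>
#include <stdio.h>

/* Pack two 32-bit samples into one 64-bit word so that each 64-bit DRAM
 * access carries a full payload instead of leaving half of it unused. */
static uint64_t pack_samples(uint32_t first, uint32_t second)
{
    return ((uint64_t)first << 32) | (uint64_t)second;
}

static void unpack_samples(uint64_t word, uint32_t *first, uint32_t *second)
{
    *first  = (uint32_t)(word >> 32);
    *second = (uint32_t)(word & 0xFFFFFFFFu);
}

int main(void)
{
    uint32_t a = 0x12345678u, b = 0x9ABCDEF0u;
    uint64_t word = pack_samples(a, b);

    uint32_t out_a, out_b;
    unpack_samples(word, &out_a, &out_b);
    printf("packed = 0x%016llX, a = 0x%08X, b = 0x%08X\n",
           (unsigned long long)word, (unsigned)out_a, (unsigned)out_b);
    return 0;
}
```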

NI recommends that you push data into the memory item interface within the DRAM clock domain, which is 100 MHz. Right-click the FPGA target in a LabVIEW project, select New»FPGA Base Clock, and choose the DRAM Clock resource. Bandwidth is maximized when data is pushed into the memory item interface at this clock rate.

Note: You can use clock sources other than DRAM Clock, but performance will not be optimized.

Request Pipelining

The DRAM architecture is highly pipelined, resulting in relatively long latency between data requests and the execution of those requests. NI recommends prerequesting samples, which helps maintain high throughput.

To prerequest samples, request the samples you want to read without waiting for the data valid strobe of the retrieve method. Even though each individual request is still subject to latency and some nondeterminism, you now get much higher transfer rates because DRAM can access several pieces of data sequentially instead of treating each request separately.
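
The following C sketch is a toy, host-side model of that idea, not the LabVIEW FPGA memory item API: a fixed-latency, in-order memory answers read requests, and the loop keeps several requests outstanding instead of waiting for each response before issuing the next. The constants and names (LATENCY, IN_FLIGHT, and so on) are assumptions chosen for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Toy model of a pipelined memory interface. Requests are answered in order
 * after a fixed latency; the behavior here is assumed for illustration only. */
#define LATENCY   8    /* cycles between a request and its data becoming valid */
#define DEPTH     64   /* words in the toy memory                              */
#define IN_FLIGHT 8    /* how many requests are kept outstanding               */

static uint64_t memory[DEPTH];

int main(void)
{
    for (uint32_t i = 0; i < DEPTH; i++)
        memory[i] = (uint64_t)i * 100;

    uint32_t next_request = 0;      /* next address to request             */
    uint32_t next_retire  = 0;      /* next address whose data will arrive */
    uint32_t cycle        = 0;
    uint32_t issued_at[DEPTH];

    /* Keep several requests in flight instead of waiting for each data-valid
     * strobe before issuing the next request. Each request still sees LATENCY
     * cycles of delay, but throughput approaches one word per cycle once the
     * pipeline is full. */
    while (next_retire < DEPTH) {
        if (next_request < DEPTH && next_request - next_retire < IN_FLIGHT) {
            issued_at[next_request] = cycle;            /* issue a read request */
            next_request++;
        }
        if (next_retire < next_request &&
            cycle >= issued_at[next_retire] + LATENCY) {
            uint64_t data = memory[next_retire];        /* data-valid strobe    */
            if (next_retire < 4)
                printf("cycle %3u: address %u -> %llu\n", (unsigned)cycle,
                       (unsigned)next_retire, (unsigned long long)data);
            next_retire++;
        }
        cycle++;
    }
    printf("read %u words in %u cycles with %u requests in flight\n",
           (unsigned)DEPTH, (unsigned)cycle, (unsigned)IN_FLIGHT);
    return 0;
}
```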

Sequential Access

DRAM is optimized for high storage density and high bandwidth. DRAM accesses data sequentially and in large blocks. For example, you read the data at address 0x1 after reading the data at address 0x0, much as a processor reads large blocks of memory into cache even when the program being executed requests only a single byte.

To maximize performance, avoid switching between reading and writing, accessing noncontiguous addresses, and writing to memory in a decrementing-address fashion. The most efficient access strategy is to perform only one type of access, either reading or writing, on a large number of sequential addresses. Although this is optimal, it is not practical for most applications. A more practical approach is to maximize the amount of sequential data being accessed and to minimize changes in access modes.
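
As a conceptual illustration only (ordinary host memory, not the FPGA DRAM interface), the following C sketch contrasts an interleaved write/read pattern with separate sequential write and read passes; the preferred pattern keeps each access mode running over contiguous, incrementing addresses.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define WORDS 1024

static uint64_t dram[WORDS];   /* stand-in for a DRAM-backed memory item */

/* Less efficient pattern: alternate a write and a read on every address,
 * forcing the memory system to switch access modes constantly. */
static uint64_t interleaved(void)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < WORDS; i++) {
        dram[i] = (uint64_t)i;   /* write */
        sum += dram[i];          /* immediately read back */
    }
    return sum;
}

/* Preferred pattern: one long sequential write pass followed by one long
 * sequential read pass, so each access mode runs over contiguous,
 * incrementing addresses. */
static uint64_t sequential(void)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < WORDS; i++)
        dram[i] = (uint64_t)i;   /* sequential write pass */
    for (size_t i = 0; i < WORDS; i++)
        sum += dram[i];          /* sequential read pass  */
    return sum;
}

int main(void)
{
    printf("interleaved sum = %llu\n", (unsigned long long)interleaved());
    printf("sequential sum  = %llu\n", (unsigned long long)sequential());
    return 0;
}
```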
