High Throughput Multiply

Updated 2025-01-28

Computes the product of x and y.

This function supports only scalar and array values of the fixed-point data type.

Dialog Box Options


Fixed-Point Configuration

Specifies the encodings, word lengths, and integer word lengths of the input and output terminals of this function. The configurations you specify determine the value range of the terminals.

x Type: Specifies the fixed-point configuration of the x input terminal. If you wire a fixed-point data type to this terminal, LabVIEW dims this section and uses information from the wire.
  Signed: Specifies that this terminal is signed.
  Unsigned: Specifies that this terminal is unsigned.
  Word length: Specifies the word length of this terminal.
  Integer word length: Specifies the integer word length of this terminal.

y Type: Specifies the fixed-point configuration of the y input terminal. If you wire a fixed-point data type to this terminal, LabVIEW dims this section and uses information from the wire.
  Signed: Specifies that this terminal is signed.
  Unsigned: Specifies that this terminal is unsigned.
  Word length: Specifies the word length of this terminal.
  Integer word length: Specifies the integer word length of this terminal.

x*y Type: Specifies the fixed-point configuration of the x*y output terminal.
  Adapt to source: Specifies whether LabVIEW automatically adjusts the fixed-point configuration of the output data type to avoid overflow and rounding errors. By default, this checkbox contains a checkmark and LabVIEW dims the following options.
  Note: LabVIEW supports a maximum word length of 64 bits and a maximum integer word length of 2047 bits. If you place a checkmark in this checkbox and the output data type requires a word length that exceeds these maximum values, overflow and/or rounding errors might occur.
  Signed: Specifies that this terminal is signed.
  Unsigned: Specifies that this terminal is unsigned.
  Word length: Specifies the word length of this terminal.
  Integer word length: Specifies the integer word length of this terminal.
  Include overflow status: Specifies whether the output terminal includes the overflow status. LabVIEW propagates this status to downstream nodes. Including this status requires additional FPGA resources. By default, this checkbox does not contain a checkmark. If you place a checkmark in this checkbox, the overflow status becomes TRUE in either of the following situations:
    The overflow status of an input terminal is TRUE.
    Overflow occurs during the operation of this function.
  If you place a checkmark in the Adapt to source checkbox, LabVIEW sets Include overflow status depending on whether an input terminal includes this status.
  Overflow mode: Specifies how this function handles overflow. You can choose either Wrap (default) or Saturate.
  Note: The Saturate option requires more FPGA resources and a longer combinatorial path than the Wrap option does. Choosing Saturate might therefore decrease the maximum clock rate at which this function can compile.
  Rounding mode: Specifies how this function rounds the output data if rounding is necessary. You can choose Truncate (default), Round Half-Up, or Round Half-Even. If rounding occurs, the option you choose might affect the amount of resources this function requires.

Execution Mode

Specifies how this function executes.

Outside single-cycle Timed Loop: Configures this Express VI to execute outside a single-cycle Timed Loop. If you select this option and place this Express VI inside a single-cycle Timed Loop, the Code Generation Errors window reports an error when you compile the FPGA VI.
Inside single-cycle Timed Loop: Configures this Express VI to execute inside a single-cycle Timed Loop. If you select this option and place this Express VI outside a single-cycle Timed Loop, the Code Generation Errors window reports an error when you compile the FPGA VI.

Pipelining Options

Specifies options for pipelining this function internally. These options affect the maximum clock rate this function can achieve. Refer to the Improving Function Performance by Pipelining section of this topic for information about this relationship.

Number of pipelining stages: Specifies the number of pipelining stages this function uses internally. Increasing the number of stages increases the clock rate at which this function can compile but also increases the amount of FPGA resources this function requires. The default is 0 stages, which specifies no pipelining. The maximum is 12 stages because adding more than 12 stages does not increase the maximum clock rate at which this function can compile.
Implementation resource: Specifies how to implement the multiplier. To enable this option, set the Number of pipelining stages to a value greater than 0. You can choose from the following options:
  Auto (default): Specifies that the compiler decides whether to use embedded block multipliers or look-up tables (LUTs) to implement the multiplier.
  Look-Up Table: Specifies that this function uses LUTs to implement the multiplier. Selecting this option might increase the clock rate at which this function can compile.

Registers

Specifies whether to add internal registers for the inputs and outputs of this function. This section is available only if you select Inside single-cycle Timed Loop.

Note: Adding registers can reduce the length of the combinatorial path, which can prevent compilation errors that result from a long combinatorial path. However, adding registers also increases the latency of this function, which means this function takes additional clock cycles to return a valid result.

Register inputs: Adds internal registers after the inputs to this function. Selecting this option increases the latency of the function by one cycle.
Register outputs: Adds internal registers before the outputs of this function. Selecting this option increases the latency of the function by one cycle.

Optional Terminal

Specifies a setting for displaying an optional block diagram terminal.

Operation overflow: Specifies that this function displays the operation overflow output terminal on the block diagram. This terminal indicates whether overflow occurred during the operation of this function.

Configuration Feedback

Displays information about how this function executes. This information is based on the configuration options you specify.
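
The three Rounding mode choices named above can be compared with a small Python sketch. It is not LabVIEW code and does not describe how the function is implemented on the FPGA. It assumes that the full-precision result is held as a plain integer with extra fractional bits, that Truncate discards the low-order bits (rounding toward negative infinity), that Round Half-Up rounds ties toward positive infinity, and that Round Half-Even rounds ties to the nearest even value. The function name `drop_fraction_bits` and the mode strings are hypothetical.

```python
def drop_fraction_bits(value: int, shift: int, mode: str = "truncate") -> int:
    """Remove `shift` low-order bits from `value` using the given rounding mode.

    `value` stands for a fixed-point number stored as an integer; removing
    bits models an output type with fewer fractional bits. All names and
    the exact tie-breaking rules are assumptions made for illustration.
    """
    if shift < 1:
        raise ValueError("shift must be at least 1")

    floor = value >> shift                  # arithmetic shift rounds toward -infinity
    remainder = value - (floor << shift)    # discarded part, 0 <= remainder < 2**shift
    half = 1 << (shift - 1)                 # exactly half of one output step

    if mode == "truncate":
        return floor
    if mode == "half_up":
        # Ties (remainder == half) go to the larger neighbor.
        return floor + 1 if remainder >= half else floor
    if mode == "half_even":
        # Ties go to whichever neighbor is even.
        if remainder > half or (remainder == half and floor & 1):
            return floor + 1
        return floor
    raise ValueError(f"unknown rounding mode: {mode}")
```

With one bit dropped, the integer 5 stands for 2.5 in output steps: under these assumptions Truncate gives 2, Round Half-Up gives 3, and Round Half-Even gives 2.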

Inputs/Outputs

x —

Specifies the multiplicand.

y —

Specifies the multiplier.

input valid —

Specifies whether the next data point has arrived for processing. Wire the output valid output of an upstream node to this input to transfer data from the upstream node to this node.

To display this handshaking terminal, select the Inside single-cycle Timed Loop option and perform either of the following actions:

Place a checkmark in either the Register inputs or Register outputs checkbox.
Set the Number of pipelining stages to at least 1.

These options are located in the configuration dialog box.

ready for output —

Specifies whether downstream nodes are ready for this node to return a new value. The default is TRUE. Use a Feedback Node to wire the ready for input output of a downstream node to this input of the current node.

Note: If this terminal is FALSE during a given cycle, the output valid terminal returns FALSE during that cycle.

To display this terminal, select the Inside single-cycle Timed Loop option and perform either of the following actions:

Place a checkmark in either the Register inputs or Register outputs checkbox.
Set the Number of pipelining stages to at least 1.

These options are located in the configuration dialog box.

x*y —

Returns the product of x and y.

operation overflow —

Returns TRUE if the theoretical computed value exceeds the valid range of the output data type. If operation overflow returns TRUE, the Overflow mode option determines the value this function returns.

LabVIEW displays the operation overflow terminal only if you place a checkmark in the Operation overflow checkbox. This checkbox is located in the Optional Terminal section of the configuration dialog box.
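
As a rough illustration of how the Overflow mode option affects the returned value when operation overflow is TRUE, the following Python sketch fits an integer result into a given word length. It is not LabVIEW code. The function name `fit_to_word` and its parameters are hypothetical, and the Wrap behavior shown (keeping the low-order bits, interpreted as two's complement for signed types) is an assumption about what Wrap means here.

```python
def fit_to_word(raw: int, word_length: int, signed: bool, mode: str = "wrap"):
    """Return (value, overflow) after fitting `raw` into `word_length` bits.

    `raw` is the theoretical result as an unbounded integer. `overflow`
    plays the role of the operation overflow terminal. Hypothetical helper.
    """
    if signed:
        low = -(1 << (word_length - 1))
        high = (1 << (word_length - 1)) - 1
    else:
        low = 0
        high = (1 << word_length) - 1

    overflow = raw < low or raw > high
    if not overflow:
        return raw, False

    if mode == "saturate":
        # Clamp to the nearest representable limit.
        return (high if raw > high else low), True

    if mode == "wrap":
        # Keep only the low-order bits, then reinterpret as signed if needed.
        wrapped = raw & ((1 << word_length) - 1)
        if signed and wrapped > high:
            wrapped -= 1 << word_length
        return wrapped, True

    raise ValueError(f"unknown overflow mode: {mode}")
```

For example, with a signed 8-bit output the theoretical result 200 is out of range: under these assumptions Saturate returns 127 and Wrap returns -56, and the overflow flag is TRUE in both cases.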

output valid —

Returns TRUE if this node has computed a result that downstream nodes can use. Wire this output to the input valid input of a downstream node to transfer data from this node to the downstream node.

To display this terminal, select the Inside single-cycle Timed Loop option and perform either of the following actions:

Place a checkmark in either the Register inputs or Register outputs checkbox.
Set the Number of pipelining stages to at least 1.

These options are located in the configuration dialog box.

ready for input —

Returns TRUE if this node is ready to accept new input data. Use a Feedback Node to wire this output to the ready for output input of an upstream node.

Note: If this terminal returns FALSE during a given cycle, LabVIEW discards any data that other nodes send to this node during the following cycle. LabVIEW discards this data even if the input valid terminal is TRUE during the following cycle.

To display this terminal, select the Inside single-cycle Timed Loop option and perform either of the following actions:

Place a checkmark in either the Register inputs or Register outputs checkbox.
Set the Number of pipelining stages to at least 1.

These options are located in the configuration dialog box.

During the cycles before output valid returns TRUE for the first time, this function might return different results on an FPGA target than on a host computer. The results become identical after the first time output valid returns TRUE.
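
The two handshaking notes above (for ready for output and ready for input) can be expressed as a cycle-by-cycle Python sketch. It models only those two stated rules, not the full behavior of the node. The function names, the list-per-cycle representation, and the `ready_before_start` parameter (the assumed state of ready for input before the first cycle) are hypothetical.

```python
def accepted_inputs(input_valid, ready_for_input, ready_before_start=True):
    """Per cycle, report whether data sent to the node is accepted.

    Rule modeled: if ready for input was FALSE in the previous cycle, data
    sent in the current cycle is discarded even if input valid is TRUE.
    """
    accepted = []
    previous_ready = ready_before_start  # assumption for the very first cycle
    for valid, ready in zip(input_valid, ready_for_input):
        accepted.append(valid and previous_ready)
        previous_ready = ready
    return accepted


def gated_output_valid(result_available, ready_for_output):
    """Per cycle, report output valid.

    Rule modeled: if ready for output is FALSE during a cycle, output valid
    returns FALSE during that same cycle.
    """
    return [avail and ready
            for avail, ready in zip(result_available, ready_for_output)]
```

In this model, if ready for input is FALSE in cycle 1, the data presented in cycle 2 is dropped regardless of input valid, which is why an upstream node needs to observe ready for input before sending.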

Improving Function Performance by Pipelining

You can improve the timing performance of this function on an FPGA target by adjusting the Number of pipelining stages. Functionally, a pipelined multiplier is equivalent to a non-pipelined multiplier followed by a chain of registers, where the number of registers equals the number of pipelining stages.
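
The equivalence described above can be modeled in a few lines of Python: a multiplier whose result passes through one register per pipelining stage. This is a behavioral sketch of the latency only, not of timing or resource use. The class name `PipelinedMultiplier` is hypothetical, and `None` stands in for whatever the registers hold before the first valid result arrives.

```python
from collections import deque


class PipelinedMultiplier:
    """Non-pipelined multiply followed by `stages` registers (behavioral model)."""

    def __init__(self, stages: int):
        if stages < 0:
            raise ValueError("stages must be non-negative")
        # One slot per register; None marks a register with no valid data yet.
        self.registers = deque([None] * stages)

    def clock(self, x: int, y: int):
        """Advance one clock cycle and return the value leaving the last register."""
        product = x * y
        if not self.registers:
            return product              # 0 stages: result appears in the same cycle
        self.registers.append(product)  # new product enters the first register
        return self.registers.popleft() # oldest value leaves the last register
```

With two stages, feeding the pairs (1, 2), (3, 4), (5, 6), (7, 8) on consecutive cycles returns None, None, 2, 12 in this model, so each product appears two cycles after its inputs.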

In general, increasing the Number of pipelining stages also increases the maximum clock rate this function can achieve. However, the actual clock rate depends on many factors, including the following ones:

The FPGA target you use
The size of the multiplier
The rounding and overflow modes you select
The Implementation resource you select
Other FPGA logic besides the multiplier

The following graphs show the Number of pipelining stages vs. maximum clock rate estimations on Xilinx Virtex-II, Virtex-5, and Spartan-3 FPGA targets, respectively.

Note: NI obtained these estimates after synthesis and without considering the impact of the rounding mode, overflow mode, or routing. Therefore, the clock rate estimates might be higher than what you actually can achieve.

In the previous figure, each line represents a multiplier of a certain size that uses a particular Implementation resource. For example, the I32*I32 Block line shows a multiplier that multiplies two 32-bit signed integers together by using an embedded block multiplier. The result is a signed 64-bit integer: <+/–,32,32> * <+/–,32,32> = <+/–,64,64>. If you use one pipelining stage on a Virtex-II FPGA target, this type of multiplier can achieve a maximum clock rate of about 51 MHz. If you use three pipelining stages, the maximum clock rate becomes about 76 MHz.
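
The type arithmetic in the I32*I32 example, <+/–,32,32> * <+/–,32,32> = <+/–,64,64>, can be reproduced with a short Python sketch. It assumes a general rule that the full-precision product type adds the word lengths and the integer word lengths of the inputs and is signed if either input is signed. The text above confirms this only for the signed 32-bit case, so the general rule is an assumption, and `FxpType` and both function names are hypothetical. The 64-bit limit is the maximum word length stated in the Adapt to source note.

```python
from typing import NamedTuple

MAX_WORD_LENGTH = 64  # maximum word length stated in the Adapt to source note


class FxpType(NamedTuple):
    """Fixed-point configuration: <sign, word length, integer word length>."""
    signed: bool
    word_length: int
    integer_word_length: int


def full_precision_product_type(x: FxpType, y: FxpType) -> FxpType:
    """Assumed rule: lengths add, and the result is signed if either input is."""
    return FxpType(
        signed=x.signed or y.signed,
        word_length=x.word_length + y.word_length,
        integer_word_length=x.integer_word_length + y.integer_word_length,
    )


def exceeds_max_word_length(t: FxpType) -> bool:
    """True if the type needs more bits than the stated 64-bit maximum."""
    return t.word_length > MAX_WORD_LENGTH
```

Under this rule, two signed 32-bit integers give a signed 64-bit result that fits the limit exactly, while two 40-bit inputs would need 80 bits, which is the situation in which the note says overflow or rounding errors might occur.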

The following figures show similar information for other FPGA targets.

Examples

Refer to the following example files included with the LabVIEW FPGA Module.

labview\examples\CompactRIO\FPGA Fundamentals\FPGA Math and Analysis\High-Throughput Math\Vector Normalization\Vector Normalization.lvproj
labview\examples\R Series\FPGA Fundamentals\FPGA Math and Analysis\High-Throughput Math\Vector Normalization\Vector Normalization.lvproj

LabVIEW FPGA Module Programming Reference Manual
