Technical Data Management Streaming (TDMS) is the most common file format used by NI software to store acquired data channels, and it is also open to third-party tools. For an overview of the benefits of the TDMS file format and the many programs and APIs that write and read these files, please review the following document:
For an overview of the bit-level binary structure of the TDMS file, please review the following document:
TDMS File Format Internal Structure
Often, just writing a valid TDMS file is not enough. This document outlines the best practices for TDMS file creation so that later on you can optimally find, analyze, compare, and report the acquired data. Following these TDMS file recommendations will enable additional data management functionality, improve channel timing clarity, and maximize loading speed.
TDMS data files typically contain two different types of information: the acquired data arrays, often called bulk data, and the set-up conditions and/or scalar results, often called meta-data. TDMS files always store the bulk data in individual 1D arrays called channels. TDMS files can store meta-data as scalar name-value pairs, called properties. They can be attached to the file level, the channel group level, or the channel level. What information you store as properties, which level you attach those properties to, and how you name each property have a great effect on the usefulness of the meta-data for data management purposes.
Writing Meta-Data to Properties, Not Channels
Meta-data should always be stored to properties, which are searchable, instead of to data channels, whose values are not searchable. TDMS data channels are designed for arrays of data for graphing and analysis. Storing scalar set-up information and scalar result values to one-value data channels confuses the reading application, because it mixes “real” data channels with “decoy” data channels. When named scalar information is stored instead to properties, it vastly improves the browsing and searching experience in many TDMS reading applications such as NI LabVIEW and DIAdem. For example, see the intuitive simultaneous display of channel data values (graph) and channel properties (table) that follows:
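As a minimal sketch of this rule, the following Python snippet sorts one measurement record into scalar properties and array channels before anything is written. The function name, the record layout, and the example values are hypothetical, and the snippet does not write a TDMS file; it only illustrates the split that the writing application should make.

```python
def split_measurement(record):
    """Separate scalar meta-data (properties) from array data (channels).

    Hypothetical helper: anything that is a list or tuple is treated as
    bulk data; everything else is treated as a scalar property.
    """
    properties, channels = {}, {}
    for name, value in record.items():
        if isinstance(value, (list, tuple)):
            channels[name] = list(value)
        else:
            properties[name] = value
    return properties, channels


# Illustrative record mixing set-up values, a scalar result, and one array.
record = {
    "Operator": "J. Smith",
    "Setpoint_Temperature": 85.0,
    "Peak_Voltage": 4.97,
    "Voltage": [4.91, 4.97, 4.93, 4.90],
}

properties, channels = split_measurement(record)
```

With this split, "Peak_Voltage" ends up as a searchable property rather than as a one-value "decoy" channel next to the real "Voltage" channel.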
Figure 1. Channel Data and Properties
Writing Properties With Valid Names
A TDMS property has three components: the property value, the property name, and the property level. The TDMS file format allows any characters to be used in the property value, but many of the TDMS file writing and reading applications have property name restrictions. The following property name recommendations will guarantee that the property names will remain unchanged regardless of which TDMS reading application is used.
There are two special property names that are important to use in all TDMS files which are listed in the following table. Note that the (lower case) capitalization for these two particular properties is important.
Table 1. Special Property Names
Level | Name | Description |
File | datetime | Start DateTime of the whole TDMS file |
Channel | unit_string | String representation of the channel unit |
The "datetime" property on the File level is the only date or time property that is guaranteed to be queryable in all DataFinders. This is because a date or time property is only queryable in a DataFinder if it has been optimized. The "datetime" property on the File level is one of a small number of base model properties that are always optimized in every DataFinder. The "unit_string" property on the Channel level is the starting point of engineering unit handling in all NI data management software. Store your engineering unit symbol in this property, and you will have many more options for unit management downstream.
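Because the capitalization of these two names matters, a small check before writing can catch near misses. The sketch below is one possible way to do this in Python; the helper name and the dictionaries are hypothetical, and only the two property names and their levels come from Table 1.

```python
from datetime import datetime, timezone

# The two special names from Table 1, keyed by the level they belong to.
SPECIAL_NAMES = {"file": "datetime", "channel": "unit_string"}


def miscapitalized_special_names(level, properties):
    """Return property names that match a special name except for case."""
    special = SPECIAL_NAMES.get(level)
    if special is None:
        return []
    return [name for name in properties
            if name.lower() == special and name != special]


file_properties = {"datetime": datetime(2024, 5, 17, 9, 30, tzinfo=timezone.utc)}
channel_properties = {"Unit_String": "V"}  # wrong capitalization on purpose

file_problems = miscapitalized_special_names("file", file_properties)
channel_problems = miscapitalized_special_names("channel", channel_properties)
```

Here the file level passes the check, while the channel level reports "Unit_String" so that it can be renamed to "unit_string" before the file is written.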
Writing Properties to the Right TDMS Level
A TDMS property has three components: the property value, the property name, and the property level. The TDMS file format allows you to save a property to the file, any of the channel groups in the file, or any of the channels in any of the channel groups. Where you save the property makes a big difference in its usefulness. The following recommendations maximize your ability to search for desired parts of a TDMS file based on one or more property conditions.
The converse rules also apply:
A common mistake is to write a collection of set-up properties to a channel group called something like Setup Info which contains no channels. Because of the rules behind the hierarchical DataFinder searching, this makes it impossible to search for selected channel groups or selected channels that satisfy these set-up property conditions.
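The effect of this mistake can be illustrated with a deliberately simplified model of a hierarchical search. In the Python sketch below, a channel is found only if the properties on its own file, group, and channel satisfy the query. This is an assumption made for illustration and not the actual DataFinder algorithm; the data layout, names, and values are hypothetical.

```python
def find_channels(tdms_file, conditions):
    """Return (group, channel) pairs whose own hierarchy meets all conditions.

    Simplified model: a channel "sees" the properties of its file, its
    parent group, and itself, but never those of a sibling group.
    """
    hits = []
    for group_name, group in tdms_file["groups"].items():
        for channel_name, channel_props in group["channels"].items():
            visible = {**tdms_file["properties"],
                       **group["properties"],
                       **channel_props}
            if all(visible.get(name) == value
                   for name, value in conditions.items()):
                hits.append((group_name, channel_name))
    return hits


# Set-up properties parked in a channel-less "Setup Info" group.
separate_setup_group = {
    "properties": {},
    "groups": {
        "Setup Info": {"properties": {"Fuel_Type": "Diesel"}, "channels": {}},
        "Run 1": {"properties": {},
                  "channels": {"Torque": {"unit_string": "Nm"}}},
    },
}

# The same set-up property attached to the group that holds the data.
setup_on_data_group = {
    "properties": {},
    "groups": {
        "Run 1": {"properties": {"Fuel_Type": "Diesel"},
                  "channels": {"Torque": {"unit_string": "Nm"}}},
    },
}

query = {"Fuel_Type": "Diesel"}
missed = find_channels(separate_setup_group, query)
found = find_channels(setup_on_data_group, query)
```

In the first layout the query finds no channels, because the "Torque" channel's hierarchy never includes the "Setup Info" group. In the second layout the same query finds the channel.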
Acquired data channels have associated timing information, either implicitly (constant sampling rate) or explicitly (channel of time values). In order to automatically analyze these data channels, the associated time information must be reliably identifiable when reading the TDMS file. Note that the terms timing and time here are intended to include a larger category of associated x-axis information.
Most often, acquired data is plotted versus time on the x-axis, but other examples of associated x-axis information include angle, frequency, displacement, and so on. The following recommendations pertain equally to these other associated x-axis quantities, even though the term time is used below for simplicity. Two commonly used methods of recording the associated timing information are implicit waveform channel properties and explicit date/time channels. Each requires a number of additional channel properties to be completely documented.
Associating Time Channels and Data Channels
If your TDMS file’s associated time values for the acquired data channels are stored in one or more explicit time channels, then you need a convention to indicate which acquired data channels are associated with which explicit time channels. A convention is needed here because the TDMS file format does not provide a built-in method to make this association.
The clearest and simplest approach is to always have only one explicit time channel inside each channel group and to always position the explicit time channel as the first channel in that channel group. This leads to two common cases: one explicit time channel plus one acquired data channel in each channel group (XY) or one explicit time channel plus multiple acquired data channels in each channel group (XYYY), as the following figures illustrate:
Figure 2. XY Channel Group
Figure 3. XYYY Channel Group
If you adopt this suggestion to have only one time channel in each channel group, then you should set the “wf_xcolumns” property on the channel group level to have the value "one", which will tell all NI software that the first channel in that group is the time channel. If for some reason your time channel is not first, you can set the “xchannel” property on the channel group level and fill it with the name of the time channel in that channel group (in the current example, that name is “Time”).
Table 2. Properties to set for XY Channel Relationships
Level | Name | Value |
Group | wf_xcolumns | one |
Group | xchannel | Time |
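A reading application can resolve the time channel of a group from these two properties. The Python sketch below shows one way this lookup might work. The function name is hypothetical, and the choice to check "xchannel" before "wf_xcolumns" is an assumption, since the text above does not state which property takes precedence when both are present.

```python
def time_channel_name(group_properties, channel_names):
    """Return the name of the group's time channel, or None if undeclared.

    Assumption: an explicit "xchannel" name wins over "wf_xcolumns".
    """
    explicit = group_properties.get("xchannel")
    if explicit is not None:
        if explicit not in channel_names:
            raise ValueError(f"xchannel {explicit!r} is not in this group")
        return explicit
    if group_properties.get("wf_xcolumns") == "one" and channel_names:
        # The first channel in the group is the time channel.
        return channel_names[0]
    return None


first_is_time = time_channel_name({"wf_xcolumns": "one"},
                                  ["Time", "Voltage", "Current"])
named_time = time_channel_name({"xchannel": "Time"},
                               ["Voltage", "Time"])
undeclared = time_channel_name({}, ["Time", "Voltage"])
```

The last call returns None even though a channel called "Time" exists, which shows why the association should be declared with a group property rather than left to be guessed from channel names.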
Writing Complete DateTime Channels and Properties
TDMS files offer a native datetime data type for both properties and data channels. When you save datetime information, make sure to always use this built-in datetime option. Writing a numeric value of elapsed seconds is not sufficient to record a datetime, because different applications that write and read TDMS files have different conventions for the starting datetime value and even the increment metric (seconds as opposed to days). For NI LabVIEW programs, you should always wire a brown datetime wire directly to the property value or channel data input, as shown by the following:
Figure 4. DateTime Property and Data Channel in NI LabVIEW
An additional consideration for datetime values is the recording and reading of geographic location (time zone). Some applications that write and read TDMS files are geo-relative (assume the same time zone), while others are geo-absolute (UTC, based on Greenwich, England). If the TDMS writing and reading applications do not match in their geography expectations, your read datetime values can be different from your written datetime values. To safeguard against this possibility, it is best to save an additional UTC_Offset property (as a real number) which stores the number of (fractional) hours between the TDMS writing application and Greenwich Mean Time.
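The UTC_Offset value can be derived from a timezone-aware timestamp. The following Python sketch uses only the standard library. The helper name is hypothetical, and the sign convention (local time minus UTC, so positive east of Greenwich) is an assumption, because the recommendation above does not specify a sign.

```python
from datetime import datetime, timedelta, timezone


def utc_offset_hours(local_time):
    """Return the fractional hours between an aware datetime and UTC.

    Assumed convention: local time minus UTC (positive east of Greenwich).
    """
    offset = local_time.utcoffset()
    if offset is None:
        raise ValueError("a timezone-aware datetime is required")
    return offset.total_seconds() / 3600.0


# Example with a fixed zone that is 5 hours 30 minutes ahead of UTC.
zone = timezone(timedelta(hours=5, minutes=30))
stamp = datetime(2024, 5, 17, 9, 30, tzinfo=zone)

file_properties = {
    "datetime": stamp,
    "UTC_Offset": utc_offset_hours(stamp),  # stored as a real number
}
```

For the time zone of the acquisition computer, `datetime.now().astimezone()` returns an aware datetime that can be passed to the same helper.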
Writing Complete Waveform Channels
If your data channels are acquired at irregular time intervals, then an explicit time channel is required to document the timing accurately. More often, though, all data channels are acquired at a constant sampling rate, usually hardware timed. In this case, simply storing the timing information as a set of channel properties is an elegant and entirely sufficient approach to accurately document their timing—no explicit time channels are needed. The standard waveform property names for TDMS files are listed in the following table.
Table 3. Standard Waveform Property Names
Name | Example | Required? | Description |
wf_xname | Time | Required | Name of the x-axis quantity |
wf_xunit_string | s | Required | Unit of the x-axis quantity |
wf_start_offset | 0 | Required | Start offset value of the x-axis |
wf_increment | 0.001 | Required | Increment value of the x-axis |
wf_start_time | | Optional | Start DateTime value of the time axis |
wf_samples | 1000 | Required | Number of values of the x-axis |
Note that the property names are case sensitive and must be in lowercase. The wf_xname and wf_xunit_string properties are not set by default in NI LabVIEW; you need to add those properties yourself to every waveform channel in the TDMS file.
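The waveform properties in Table 3 are enough to rebuild the x-axis of a channel: the value at index i is wf_start_offset + i * wf_increment. The Python sketch below builds the property set and reconstructs the axis from it. The function names and default arguments are hypothetical; the property names and example values are taken from the table.

```python
def waveform_properties(samples, increment, start_offset=0.0,
                        xname="Time", xunit="s", start_time=None):
    """Build the waveform property set for one channel."""
    props = {
        "wf_xname": xname,
        "wf_xunit_string": xunit,
        "wf_start_offset": start_offset,
        "wf_increment": increment,
        "wf_samples": samples,
    }
    if start_time is not None:
        # Optional property; only written when a start DateTime is known.
        props["wf_start_time"] = start_time
    return props


def x_axis(props):
    """Reconstruct the implicit x-axis values from the waveform properties."""
    return [props["wf_start_offset"] + i * props["wf_increment"]
            for i in range(props["wf_samples"])]


props = waveform_properties(samples=1000, increment=0.001)
x = x_axis(props)
```

With 1000 samples at an increment of 0.001 s, the reconstructed axis runs from 0 s to approximately 0.999 s without any explicit time channel being stored.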
The TDMS file format was designed to stream data as quickly as possible while still being flexible enough to accommodate changes in the number of channels and their sampling rates during the acquisition. Data files that stream quickly, though, do not necessarily load quickly. The TDMS file is an entirely binary file that consists of multiple sections, one layered on top of the other as you write to the file. These sections contain buffers of data values assigned to one or more channels and/or meta-data properties attached to one or more levels. The fewer sections the TDMS file has, in general, the faster it will load.
Each time a TDMS file is written or read, a TDMS_Index file is created that contains a map of the binary sections. Subsequent reads of the same TDMS file consult the TDMS_Index file to read all the properties and to determine the correct byte positions to read out each stored block of each channel. Roughly speaking, if the resulting TDMS_Index file is similar in size to its TDMS file, then this TDMS file is “fragmented,” meaning it has more sections than it needs and will therefore load slower than it should. There are several approaches you can implement, both during and after the acquisition, that will minimize the number of unnecessary sections and maximize the reading speed of the resulting TDMS file.
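The size comparison described above can be automated as a rough screening step. The Python sketch below assumes that the index file sits next to the data file with "_index" appended to its name (for example, run.tdms and run.tdms_index) and uses a hypothetical threshold of 0.5 for "similar in size". Both are assumptions to adjust for your own files.

```python
import os


def looks_fragmented(tdms_path, ratio_threshold=0.5):
    """Return True if the index file is large relative to the data file.

    Returns None when no index file exists yet or the data file is empty.
    The index file name and the threshold are assumptions, not NI rules.
    """
    index_path = tdms_path + "_index"
    if not os.path.exists(index_path):
        return None
    data_size = os.path.getsize(tdms_path)
    if data_size == 0:
        return None
    return os.path.getsize(index_path) / data_size >= ratio_threshold
```

Calling `looks_fragmented("run.tdms")` on a folder of acquired files gives a quick list of candidates for the approaches described in the following sections.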
Writing TDMS Files with Minimal Fragmentation
First off, if you are acquiring data with NI’s data acquisition hardware, consider using the NI-DAQmx TDMS writing capability, since it automatically writes un-fragmented TDMS files. If you are using NI LabVIEW to acquire your data channels, you can choose the VIs from the TDMS Advanced palette to write a minimally fragmented TDMS file. If you are using the standard TDMS writing functions, the following tips will help minimize the TDMS file fragmentation.
Defragmenting TDMS Files After the Acquisition
Even if your data acquisition constraints force you to create fragmented TDMS data files, you can still address the issue after the acquisition. If you are using NI LabVIEW, you can execute the TDMS Defragment function to rewrite the TDMS file with minimal fragmentation. Alternatively, if you load the TDMS data file into NI DIAdem and simply resave it, the resulting TDMS data file will be minimally fragmented.
Writing Load-Speed-Enhancing Channel Properties
If your target application to read TDMS files is NI DIAdem, you can dramatically improve loading speed into NI DIAdem by creating the following four properties attached to every numeric or datetime data channel in the TDMS file. If any one of these four properties is missing or not populated with a valid value when a given TDMS data channel is being loaded, then NI DIAdem will automatically calculate all four properties for that channel in order to speed up graph axis auto-scaling. If these properties are already created and filled with valid values and attached to each data channel in the TDMS file, then that TDMS file will load much faster into NI DIAdem, because that calculation for each channel will be avoided during the TDMS file loading process.
Note that the property names are case sensitive and must be in lowercase. None of these four channel properties are set by default in NI LabVIEW; you need to add those properties yourself to every numeric or datetime data channel you create in the TDMS file.
Table 4. Properties to Improve Loading Speed in NI DIAdem
Name | Example | Description |
minimum | -3.14 | The minimum value of the channel |
maximum | 3.14 | The maximum value of the channel |
monotony | Not monotone | If the channel is monotone rising or falling |
novaluekey | No | If any NaN values are in the channel |
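If the writing application already holds the full channel in memory, these four values can be computed before the file is written. The Python sketch below is one possible approach. Only the strings "Not monotone" and "No" come from the table above; "Increasing", "Decreasing", and "Yes" are placeholder strings, and ignoring NaN values when judging monotony is an assumption. Check the values that NI DIAdem expects before relying on this.

```python
import math


def load_speed_properties(values):
    """Compute the four load-speed properties for one numeric channel."""
    finite = [v for v in values if not math.isnan(v)]
    if not finite:
        raise ValueError("channel contains no valid values")
    pairs = list(zip(finite, finite[1:]))
    if pairs and all(a < b for a, b in pairs):
        monotony = "Increasing"      # placeholder string (assumption)
    elif pairs and all(a > b for a, b in pairs):
        monotony = "Decreasing"      # placeholder string (assumption)
    else:
        monotony = "Not monotone"    # string from the table above
    return {
        "minimum": min(finite),
        "maximum": max(finite),
        "monotony": monotony,
        # "Yes" is a placeholder; "No" is the string from the table above.
        "novaluekey": "Yes" if len(finite) != len(values) else "No",
    }


signal = load_speed_properties([0.0, 3.14, -3.14, 1.0])
ramp_with_gap = load_speed_properties([1.0, 2.0, float("nan"), 3.0])
```

The first call reproduces the example row values of the table: a minimum of -3.14, a maximum of 3.14, "Not monotone", and "No".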
You collect data to make decisions. However, inefficiently organizing the raw data may cause problems when analyzing your data. The key to organizing raw data in your application involves thinking about the current system requirements and how the file can adapt for future application needs.