The architecture described in this paper provides a design pattern for a rocket test system. Implementing this architecture requires many engineering decisions. The goal of this paper is to guide a team through the architecture so that the most critical aspects are considered at the start of the design process.
When making decisions about how to implement the architecture, the following topics have proven to be some of the most critical to consider as early in the process as possible.
System Latency
Within each subsystem, and across the entire test facility, what is the worst possible delay that can be tolerated to maintain control and safety in the test area?
Latency across the system is the result of several design decisions. Faster loops will increase the communication speed from one component in one system to another component in another system. But engineers must also consider the number of hops between systems—if data must be passed among several systems to reach the final target, the cumulative delays will be more than if data can be passed more directly between control systems. When making design decisions, consider how data is made available to other systems—directly, or by being copied multiple times among the systems.
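The effect of hops on cumulative delay can be illustrated with a small latency budget. This is a minimal sketch, not a measurement method: the `Hop` type, the system names, and all timing values are hypothetical, and the model assumes that in the worst case data waits one full loop period at each hop before it is forwarded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One system-to-system transfer on the path (all values are illustrative)."""
    name: str
    loop_period_ms: float  # assumed worst case: data waits one full loop before it is read
    transfer_ms: float     # assumed worst-case transport delay to the next system


def worst_case_latency_ms(path):
    """Sum the worst-case delay contributed by every hop on the path."""
    return sum(hop.loop_period_ms + hop.transfer_ms for hop in path)


def meets_budget(path, budget_ms):
    """True if the cumulative worst-case delay fits within the tolerated delay."""
    return worst_case_latency_ms(path) <= budget_ms


# Hypothetical direct path: one hop between two control systems.
direct = [Hop("stand controller -> safety system", 1.0, 0.5)]

# Hypothetical relayed path: the same data copied through two intermediate systems.
relayed = [
    Hop("stand controller -> facility supervisor", 1.0, 0.5),
    Hop("facility supervisor -> data server", 10.0, 2.0),
    Hop("data server -> safety system", 10.0, 2.0),
]

if __name__ == "__main__":
    for label, path in (("direct", direct), ("relayed", relayed)):
        print(label, worst_case_latency_ms(path), "ms, within 5 ms budget:",
              meets_budget(path, 5.0))
```

With these assumed numbers, the relayed path misses a 5 ms budget even though its first hop is as fast as the direct path, because the slower intermediate loops dominate the total.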
Timing
Systems spread across the facility will have clocks that differ from one another. How much time skew can you tolerate across your measurements?
Most systems tag measurements with the system clock from the local device. When analyzing data across systems, it is helpful to be able to correlate the data on a single timeline. A common solution is to provide an absolute time across all the systems, using IEEE 1588 (Precision Time Protocol) or a similar protocol. The time may be provided by the facility supervisor system, or systems may rely on a GPS signal for the timebase.
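Once each system's clock offset from the common timebase is known, checking skew against a tolerance and re-stamping data is simple arithmetic. The sketch below is not an implementation of IEEE 1588; it assumes the offsets have already been measured by some synchronization mechanism, and the system names, offset values, and tolerance are hypothetical. Integer microseconds are used to avoid floating-point rounding.

```python
def to_common_time_us(local_timestamp_us, offset_us):
    """Convert a local timestamp to the common timebase.

    offset_us is assumed to be (local clock - reference clock), in microseconds.
    """
    return local_timestamp_us - offset_us


def worst_case_skew_us(offsets_us):
    """Largest disagreement between any two clocks in the facility."""
    values = list(offsets_us.values())
    return max(values) - min(values)


def skew_within_tolerance(offsets_us, tolerance_us):
    """True if every pair of clocks agrees to within the tolerated skew."""
    return worst_case_skew_us(offsets_us) <= tolerance_us


# Hypothetical measured offsets relative to the facility reference clock.
offsets_us = {
    "facility_supervisor": 0,
    "stand_a_daq": 12,
    "feed_system_plc": -30,
}

if __name__ == "__main__":
    print("worst-case skew:", worst_case_skew_us(offsets_us), "us")
    print("within 100 us:", skew_within_tolerance(offsets_us, 100))
```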
A related consideration is how to correlate the data between the process computer and the ground system. In a rocket test this is fairly straightforward, but in a launch situation this becomes more complicated because any links between the ground system and the rocket will be lost at launch. Since test systems replicate launch conditions, this should be considered when designing the test stand.
Distribution of Shared Resources
Which subsystems will be shared among test stands, and which subsystems will be dedicated to a specific test stand?
There are two costs to balance when considering shared resources—the costs of duplicating resources and the costs of sharing resources. There is a significant cost to set up a cryogenic tank system. But running cryogenic fluids to two separate test stands also incurs a significant cost and complexity. Sharing resources may also limit the ability to run two activities in parallel if they both require the resource to run.
Managing Race Conditions
How will shared resources be protected from competing instructions from control systems?
Any control system that accepts commands from multiple command systems runs the risk of performing an unintended action because of a lack of discipline in the communication system. For example, a valve may start an operation based on a request from one test stand, but if a second test stand overwrites the set point, the result could be a catastrophic failure in both stands.
The design team must carefully review the system for potential race conditions to ensure that there is a proper lockout procedure for any command signal in the system.
Race conditions can also affect measurement data if data is overwritten before a storage system retrieves the data—the data being retrieved may not be the intended data.
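One way to picture a lockout for a shared command signal is an ownership check in front of the set point. The following is a minimal sketch under stated assumptions: the `SharedValve` class, its method names, and the stand identifiers are hypothetical, and a real facility would enforce the lockout in the control hardware and procedures rather than in a single Python object.

```python
import threading


class SharedValve:
    """A shared resource that accepts set points only from its current owner."""

    def __init__(self):
        self._guard = threading.Lock()  # makes each check-and-update atomic
        self._owner = None              # command system that holds the lockout
        self.setpoint = 0.0

    def acquire(self, requester):
        """Grant the lockout if the valve is free or already held by the requester."""
        with self._guard:
            if self._owner is None or self._owner == requester:
                self._owner = requester
                return True
            return False

    def release(self, requester):
        """Release the lockout; only the current owner may do so."""
        with self._guard:
            if self._owner == requester:
                self._owner = None
                return True
            return False

    def command(self, requester, setpoint):
        """Apply a set point only if the requester holds the lockout."""
        with self._guard:
            if self._owner != requester:
                return False  # competing command is rejected, not applied
            self.setpoint = setpoint
            return True


if __name__ == "__main__":
    valve = SharedValve()
    print(valve.acquire("stand_a"))        # stand A takes the lockout
    print(valve.command("stand_b", 90.0))  # stand B cannot overwrite the set point
    print(valve.command("stand_a", 40.0))
    print(valve.setpoint)
```

The design choice here is to reject a competing command outright rather than queue it, so the second stand learns immediately that the resource is unavailable.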
Redundancy
What systems must have redundant controls in place? What level of redundancy will be in place?
Redundancy can be applied at many places within a system—there can be redundant sensors, wiring, acquisition devices, processors, algorithms, or power supplies. Some space companies require triple redundancy throughout the system for maximum safety. Others identify the highest-risk failure points and focus redundancy efforts on those failure points.
There are several models of redundancy the design team can choose from for each point in the system. In standby redundancy, an identical secondary unit backs up the primary unit. In a cold standby system, the secondary unit is idle, operating only when a watchdog identifies that the primary unit has failed. In a hot standby system, the secondary unit is powered up and actively monitoring the system, but its outputs are not used until a watchdog switches control to it. Hot standby can shorten downtime in a failure, but it does not preserve the reliability of the secondary unit, since that unit is in active operation the whole time.
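The watchdog's role in standby redundancy can be sketched as a heartbeat timeout. This is an illustrative sketch only: the `StandbyWatchdog` class, the timeout value, and the decision to latch onto the secondary unit after a failover are all assumptions, not details from a specific product. Time is passed in explicitly so the logic can be exercised without real clocks.

```python
class StandbyWatchdog:
    """Selects the active unit based on heartbeats from the primary unit."""

    def __init__(self, timeout_s, start_s=0.0):
        self.timeout_s = timeout_s
        self._last_heartbeat_s = start_s
        self._failed_over = False

    def heartbeat(self, now_s):
        """Record that the primary unit reported in at time now_s."""
        self._last_heartbeat_s = now_s

    def active_unit(self, now_s):
        """Return which unit's outputs should be used at time now_s."""
        if now_s - self._last_heartbeat_s > self.timeout_s:
            self._failed_over = True
        # Assumption: once control moves to the secondary unit it stays there
        # until an operator intervenes, even if the primary unit reappears.
        return "secondary" if self._failed_over else "primary"


if __name__ == "__main__":
    wd = StandbyWatchdog(timeout_s=0.5)
    wd.heartbeat(0.25)
    print(wd.active_unit(0.5))  # primary is still reporting in time
    print(wd.active_unit(1.0))  # heartbeat is overdue, control switches
```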
Modular redundancy is similar to the hot standby approach, but both systems run in parallel, and both generate outputs for the system. A voting system, sometimes called an auctioneer or voter, decides which outputs should be used. This provides bumpless transfers in the event of a failure of one of the controllers. This model can be extended beyond two controllers to multiple controllers. These and other examples are discussed in the NI white paper on Redundant System Basic Concepts.
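The voting step in modular redundancy can be sketched in a few lines. This is one common approach, offered as an assumption rather than as the method of any particular system: a strict majority vote for discrete outputs and a median for analog outputs, which masks a single faulty controller among three. The function names and sample values are hypothetical.

```python
from collections import Counter
from statistics import median


def vote_discrete(outputs):
    """Return the value reported by a strict majority of controllers."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 > len(outputs):
        return value
    raise ValueError("no majority among controller outputs")


def vote_analog(outputs):
    """Return the median, which ignores a single outlying controller."""
    return median(outputs)


if __name__ == "__main__":
    # Three controllers, one of which disagrees.
    print(vote_discrete(["OPEN", "OPEN", "CLOSED"]))
    print(vote_analog([50.1, 50.2, 97.0]))
```

With only two controllers a disagreement produces no majority, which is why the sketch raises an error in that case instead of picking a side.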
Environmental Requirements
What environments will the measurement equipment be subjected to? What additional infrastructure do we need to protect the measurement and control equipment?
During a propulsion test, equipment on or near the stand will be subject to extreme environmental conditions. These may include sudden shocks, continuous vibration, and high temperatures.
Between tests, equipment will also be subject to environmental extremes. Hot or cold temperatures, humidity, and salt spray are all specific threats to the availability of the equipment for a test.
Engineers must be aware of the environmental conditions of the test stand. With that information, they must select or design equipment that exceeds the potential requirements of the system. This may require that they buy ruggedized equipment, add protection like conformal coating, or protect the equipment in a cabinet or an environmentally controlled outbuilding.
Network Topology
What network technologies will provide the optimum performance for data transfer on the network, including redundancy in case of a component failure?
There are many options available when designing a network topology. A successful facility topology will require detailed conversations between the IT infrastructure team and the test engineering teams. Test teams will need to describe data bandwidth, latency, and technology needs. The IT team will need to understand encryption, layout, and redundancy to plan the network layout.
Among the decisions in designing the network, the design team must choose a redundancy model, which may include running redundant network cables throughout the facility, using Rapid Spanning Tree Protocol (RSTP), and using multiple distribution switches.
I/O Coverage
What signals do we need to measure or control?
One of the first tasks the engineering team faces is collecting a list of the signals that need to be measured or controlled in the test stand. While documenting signals, they need to list the signal type, location, resolution, data rate, excitation needs, safety needs, and voltage and current levels.
With this information, engineers can collect the signals into measurement banks, and then select the right hardware to provide access to all the signals.
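The signal list and its grouping into measurement banks can be represented as a simple data structure. The sketch below is hypothetical throughout: the `Signal` fields mirror the attributes named above, while the signal names, values, and the choice to group by location and signal type are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """One entry in the I/O list (field names and values are illustrative)."""
    name: str
    signal_type: str        # e.g. "thermocouple", "pressure", "valve_command"
    location: str
    sample_rate_hz: float
    resolution_bits: int
    max_voltage_v: float
    needs_excitation: bool = False
    safety_critical: bool = False


def group_into_banks(signals):
    """Group signals by (location, signal type) as candidate measurement banks."""
    banks = defaultdict(list)
    for signal in signals:
        banks[(signal.location, signal.signal_type)].append(signal)
    return dict(banks)


# Hypothetical entries from an I/O list.
signals = [
    Signal("TC-101", "thermocouple", "stand_a", 10.0, 24, 0.1),
    Signal("TC-102", "thermocouple", "stand_a", 10.0, 24, 0.1),
    Signal("PT-201", "pressure", "stand_a", 1000.0, 16, 10.0, needs_excitation=True),
    Signal("PT-301", "pressure", "tank_farm", 100.0, 16, 10.0, needs_excitation=True),
]

if __name__ == "__main__":
    for key, members in group_into_banks(signals).items():
        print(key, [s.name for s in members])
```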
Data Bandwidth
Can the network topology handle the amount of data expected during a test?
The design of the network—including computing devices, switch hardware, and subnetwork architecture—establishes the limit to the amount of data that can be moved across the network. The design team must carefully review the components of the network, looking for any bottlenecks in the system.
A theoretical calculation can provide guidance to a system design, but network applications never achieve full theoretical data rates. Data overhead and latencies impact the total throughput on the network. In designing a network, it is advisable to keep data rates significantly below the theoretical limit.
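A rough version of this calculation is shown below. It is a sketch under stated assumptions: the channel count, sample rate, sample size, 20 percent protocol overhead, and 50 percent utilization ceiling are all hypothetical planning values, not recommendations from this paper or measurements from a real network.

```python
def required_mbps(channels, sample_rate_hz, bytes_per_sample, overhead_factor=1.2):
    """Estimate network load in Mbit/s, including an assumed protocol overhead."""
    bits_per_second = channels * sample_rate_hz * bytes_per_sample * 8
    return bits_per_second * overhead_factor / 1e6


def fits_on_link(load_mbps, link_mbps, max_utilization=0.5):
    """True if the load stays below the chosen fraction of the theoretical rate."""
    return load_mbps <= link_mbps * max_utilization


if __name__ == "__main__":
    # Hypothetical test: 256 channels at 100 kS/s with 4-byte samples.
    load = required_mbps(256, 100_000, 4)
    print(f"estimated load: {load:.0f} Mbit/s")
    print("fits on 1 Gbit/s link:", fits_on_link(load, 1_000))
    print("fits on 10 Gbit/s link:", fits_on_link(load, 10_000))
```

In this example the load is close to the theoretical rate of a 1 Gbit/s link, so the link would be rejected even though the raw data alone appears to fit.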
Safety
What safety systems will be required to be in place?
A rocket facility has many dangerous conditions. A mistake in the design, implementation, or operation of the systems may result in a catastrophic accident. The design team must be aware of safety protocols required by federal and local laws. The design team must also consider how to protect the personnel, equipment, and area associated with the test station in ways that are not covered by laws.
Some of the areas at a rocket facility are hazardous zones because of the gases used to power the rocket engine. Some of these gases cannot be fully contained, creating a zone where any electrical spark can result in a fire or explosion. To prevent this situation, any equipment in the hazardous area must be intrinsically safe, that is, incapable of releasing enough electrical or thermal energy to ignite the surrounding atmosphere. The requirement can often be avoided by moving electrical equipment outside of the hazardous zone. For example, an electrically controlled valve can be placed outside the zone, so that the only equipment in the zone is the pipe leading away from the valve.
If a device must be located inside the hazardous zone, the equipment must be certified as intrinsically safe by the equipment vendor. In the U.S., this means Class I, Division 1 certification. In Europe, this means ATEX certification appropriate to the gas type.
If a device is outside the hazardous zone but runs a signal into it, the device must have an intrinsic barrier to prevent a spark generated in the device from being passed into the zone. Even low-level devices, like thermocouple measurement instruments, require an intrinsic barrier to prevent power from the device (for example, from an attached power supply) from passing into the zone. An intrinsic barrier is installed in the signal path between the device and the hazardous zone and provides protection against both voltage and current spikes. Note that intrinsic barriers vary based on the signal type, so a barrier designed for a thermocouple would not be appropriate for a valve controller.
Certifications
What certifications need to be met by the facility, support systems, and test stand?
Different certifications are required for different areas based on populations, the purpose of the facility, local laws, and the purpose of the rocket equipment. For example, a rocket test performed on a U.S. Air Force base may require AFSPCMAN 91-710 certification prior to any rocket activity.
In addition to the certifications required to perform the test, certifications can shape the goal of the testing itself. If the purpose of the testing is to certify the rocket engine for use, the test stand design must meet the demands of that certification. For example, MIL-STD-810 ensures that the device being tested withstands the environmental conditions expected during use of the product. MIL-STD-202 ensures that components under 300 lbs. meet the electrical and environmental requirements of a demanding application. These may be required if the U.S. Department of Defense is an intended customer of the technology being tested.