Test levels, constraints, and recovery procedures

Discuss the generic proposals for SJTAG
Bradford Van Treuren
SJTAG Chair Emeritus
Location: VT Enterprises Consulting Services, USA

This discussion topic was raised during the 11 February 2008 meeting. The purpose of this thread is to discuss the granularity of test levels required for system-level boundary-scan testing, the tests that may be performed at each of these levels, the associated test constraints for each test type, and the circuit recovery procedure required after tests are applied at the various test levels.

The identified test levels so far are:
  • Power-On Self Test (POST)
    • Pre-boot (test co-processor initiated test)
    • Boot level test (Primarily focuses on the hardware required to boot the OS, plus other hardware if the boot time budget allows. This level of testing could be performed by a test co-processor or by a software-based testing process applying boundary-scan vectors to the circuit under test. In the latter case, only the circuits external to the boot hardware can be tested using boundary-scan, since the boot hardware itself is needed to run the test. Because the boot hardware is exercised while the boundary-scan vectors are applied, it may be presumed to be functioning properly if the scan tests pass. Clearly, this approach may leave some gaps in the coverage of the boot hardware.)
    • Firmware or Board Support Package level test (Tests the hardware that could not be tested by the Boot level test and/or additional hardware on the board that is testable without requiring a total reboot and that fits within the overall system boot time budget.)
    • Initiated test level (All tests at this level may be selected and applied on demand by the system diagnostic management software.)
  • Initiated test or On-demand testing
    • These tests may be run by the system-level diagnostic software and fall into two categories according to the state of the board being tested. These state categories are:
      1. Off-line or out-of-service state (In this state, the circuit under test is not performing its functional mission (e.g., call processing, image processing). The board is either responsive to the system controller and poised for operation, or it is in a failed state that prevents it from performing its operation and needs attention. This is the state in which intrusive interconnect tests and configuration changes to the circuit may be performed.)
      2. Active or On-line state (In this state, the circuit under test is performing its functional mission. Thus, only non-intrusive JTAG operations may be performed on the circuit in this state. These operations may include register monitoring (e.g., voltage or temperature values), SAMPLE of signal states, or other non-intrusive operations as defined by the IEEE 1149.1 standard or the device vendor.)
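
To make the two state categories concrete, here is a minimal Python sketch of how diagnostic software might gate JTAG instructions by board state. The names BoardState and is_permitted are hypothetical, and the split of instructions into intrusive and non-intrusive sets is only an illustrative classification, not something agreed at the meeting. Anything the sketch does not recognize (for example, a vendor-specific instruction) is conservatively treated as intrusive.

```python
from enum import Enum


class BoardState(Enum):
    """The two on-demand testing states described above."""
    OFFLINE = "off-line"   # out of service; intrusive tests allowed
    ONLINE = "on-line"     # performing functional mission


# Illustrative classification only (an assumption of this sketch).
# These IEEE 1149.1 instructions leave the device in normal operation.
NON_INTRUSIVE = frozenset({"BYPASS", "IDCODE", "SAMPLE"})
# These take control of pins or internal logic.
INTRUSIVE = frozenset({"EXTEST", "INTEST"})


def is_permitted(state: BoardState, instruction: str) -> bool:
    """Return True if the instruction may be applied in the given state."""
    name = instruction.upper()
    if name in NON_INTRUSIVE:
        return True
    # Known intrusive instructions, and anything unrecognized,
    # are only allowed while the board is out of service.
    return state is BoardState.OFFLINE
```

With this policy, is_permitted(BoardState.ONLINE, "SAMPLE") is allowed while is_permitted(BoardState.ONLINE, "EXTEST") is refused. A real system would need a per-device table, since vendors may define additional non-intrusive instructions.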
There has been much discussion as to what constitutes a POST operation. The three areas identified during the discussion are:
  • FRU level POST (Typically, this involves the autonomous testing of the smallest field replaceable unit (FRU) by itself. This may be a board in a chassis slot, a mezzanine board plugged into a carrier board, or a carrier board with its associated mezzanine boards treated in aggregate as a single FRU.)
  • System level POST (This includes the FRU level POST as well as testing the interconnections between the FRU members.)
  • Functional Block Test (This area represents the partitioning of an FRU into its functional blocks, where multiple similar functional blocks provide a parallel processing structure for a circuit and one or more blocks may be decommissioned without preventing the overall circuit from performing its functional mission. An example is multiple DSP filters used to process information in parallel to boost the overall throughput/bandwidth of a circuit. It is believed that future systems may contain more of this type of circuitry and demand a more precise partitioning of testing in the system.)
For test constraints, board-level constraints seem to be the easiest to identify, but not all issues have been identified yet. For board structural testing (i.e., interconnect testing on the board), the edge connector signals that go off-board must be constrained so that they cannot affect the operation of the rest of the system. It has been noted that there is no fixed set of constraints that suits all tests, and that test developers need the freedom to choose which signals to constrain for each test performed in the system. I feel there may be some general guidelines we can produce that would help the community ensure a test will work in a system without causing system operational problems.
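
One guideline of this kind could be a simple pre-flight check: before a structural test is released to run in the system, confirm that every off-board signal it would disturb has a constraint. The following Python sketch illustrates the idea. The function name and all net names are hypothetical, and a real flow would derive these sets from the board netlist and the test description rather than from hand-written literals.

```python
def unconstrained_edge_signals(driven_nets, edge_connector_nets, constrained_nets):
    """Return the off-board nets a test would disturb without a constraint.

    driven_nets:         nets the test drives or toggles
    edge_connector_nets: nets that leave the board through the edge connector
    constrained_nets:    nets the test developer has chosen to hold safe
    """
    at_risk = set(driven_nets) & set(edge_connector_nets)
    return sorted(at_risk - set(constrained_nets))


# Hypothetical example: one backplane signal was left unconstrained.
driven = {"U1_D0", "BP_CLK_EN", "BP_RESET_N"}
edge = {"BP_CLK_EN", "BP_RESET_N", "BP_PRESENT"}
constrained = {"BP_CLK_EN"}

problems = unconstrained_edge_signals(driven, edge, constrained)
print(problems)  # ['BP_RESET_N']
```

An empty result does not prove a test is safe to run in the system; it only shows that each off-board net the test touches has at least been considered by the test developer.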

The subject of test recovery in a system is a difficult one. Boundary-scan testing is notorious for leaving the internal logic of devices in weird states following the application of an interconnect test. This is because many of the stimulus states applied by the test are unusual and abnormal for the circuit design. Since the internals of a device are not insulated from the input changes applied during EXTEST (or even INTEST), the stimulus can cause circuits to lock up and not function properly after a test. Thus, there needs to be some form of recovery process that transitions the circuit under test into a known stable state before it goes into service. The method discussed is to force a reset of the board to reinitialize the circuitry involved in the testing. The difficulty with this approach is that a reset will trigger POST to run again. Various schemes were discussed at the meeting for how people get around this "chicken and egg" problem (what comes first?). All agree there needs to be a way to conditionally apply the boundary-scan testing at power-up, and that the recovery process must persistently record, across the restart, that boundary-scan POST has already been performed and does not need to be run again. This is true for both hardware co-processor test interfaces and software-based interfaces.
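
As one illustration of such a scheme (not necessarily any of those described at the meeting), the sketch below keeps a one-shot "skip" flag in storage that survives the recovery reset. A JSON file stands in for whatever non-volatile resource a real board would use, such as an EEPROM location or a register that survives a warm reset. The function names and the skip_bscan_post key are hypothetical.

```python
import json
import os


def _load(path):
    """Read the persistent record; a missing or corrupt record reads as empty."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def _save(path, data):
    """Write the record via a temporary file so a reset mid-write cannot corrupt it."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(data, f)
    os.replace(tmp, path)


def mark_post_done_before_reset(path):
    """Call after boundary-scan POST passes, just before the recovery reset."""
    data = _load(path)
    data["skip_bscan_post"] = True
    _save(path, data)


def should_run_bscan_post(path):
    """Call early in boot. The flag is consumed, so it skips exactly one POST."""
    data = _load(path)
    skip = data.pop("skip_bscan_post", False)
    if skip:
        _save(path, data)  # clear the flag so the next cold boot tests again
    return not skip
```

Clearing the flag as soon as it is read means a later genuine power cycle runs the full boundary-scan POST again. Treating an unreadable record as "run POST" is the conservative default, at the cost of a possible extra test cycle.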

I leave further discussion for replies to this thread.

Bradford Van Treuren
Distinguished Member of Technical Staff
VT Enterprises Consulting Services