diff options
Diffstat (limited to 'Documentation/driver-api/soundwire/error_handling.rst')
-rw-r--r-- | Documentation/driver-api/soundwire/error_handling.rst | 65 |
1 files changed, 65 insertions, 0 deletions
diff --git a/Documentation/driver-api/soundwire/error_handling.rst b/Documentation/driver-api/soundwire/error_handling.rst new file mode 100644 index 000000000000..aa3a0a23a066 --- /dev/null +++ b/Documentation/driver-api/soundwire/error_handling.rst @@ -0,0 +1,65 @@ +======================== +SoundWire Error Handling +======================== + +The SoundWire PHY was designed with care and errors on the bus are going to +be very unlikely, and if they happen it should be limited to single bit +errors. Examples of this design can be found in the synchronization +mechanism (sync loss after two errors) and short CRCs used for the Bulk +Register Access. + +The errors can be detected with multiple mechanisms: + +1. Bus clash or parity errors: This mechanism relies on low-level detectors + that are independent of the payload and usages, and they cover both control + and audio data. The current implementation only logs such errors. + Improvements could be invalidating an entire programming sequence and + restarting from a known position. In the case of such errors outside of a + control/command sequence, there is no concealment or recovery for audio + data enabled by the SoundWire protocol, the location of the error will also + impact its audibility (most-significant bits will be more impacted in PCM), + and after a number of such errors are detected the bus might be reset. Note + that bus clashes due to programming errors (two streams using the same bit + slots) or electrical issues during the transmit/receive transition cannot + be distinguished, although a recurring bus clash when audio is enabled is a + indication of a bus allocation issue. The interrupt mechanism can also help + identify Slaves which detected a Bus Clash or a Parity Error, but they may + not be responsible for the errors so resetting them individually is not a + viable recovery strategy. + +2. Command status: Each command is associated with a status, which only + covers transmission of the data between devices. The ACK status indicates + that the command was received and will be executed by the end of the + current frame. A NAK indicates that the command was in error and will not + be applied. In case of a bad programming (command sent to non-existent + Slave or to a non-implemented register) or electrical issue, no response + signals the command was ignored. Some Master implementations allow for a + command to be retransmitted several times. If the retransmission fails, + backtracking and restarting the entire programming sequence might be a + solution. Alternatively some implementations might directly issue a bus + reset and re-enumerate all devices. + +3. Timeouts: In a number of cases such as ChannelPrepare or + ClockStopPrepare, the bus driver is supposed to poll a register field until + it transitions to a NotFinished value of zero. The MIPI SoundWire spec 1.1 + does not define timeouts but the MIPI SoundWire DisCo document adds + recommendation on timeouts. If such configurations do not complete, the + driver will return a -ETIMEOUT. Such timeouts are symptoms of a faulty + Slave device and are likely impossible to recover from. + +Errors during global reconfiguration sequences are extremely difficult to +handle: + +1. BankSwitch: An error during the last command issuing a BankSwitch is + difficult to backtrack from. Retransmitting the Bank Switch command may be + possible in a single segment setup, but this can lead to synchronization + problems when enabling multiple bus segments (a command with side effects + such as frame reconfiguration would be handled at different times). A global + hard-reset might be the best solution. + +Note that SoundWire does not provide a mechanism to detect illegal values +written in valid registers. In a number of cases the standard even mentions +that the Slave might behave in implementation-defined ways. The bus +implementation does not provide a recovery mechanism for such errors, Slave +or Master driver implementers are responsible for writing valid values in +valid registers and implement additional range checking if needed. |