Coding bug??? Not sure where you are getting this. I specifically excluded SEU because that is one situation that can cause problems with *any* design in unpredictable ways. Otherwise this is a system design issue. If your system is subject to "glitches", then a means should be designed into the handshake to resolve timeouts. Resetting the entire FPGA or board shouldn't be necessary.
You are talking about a specified timeout on a communications protocol, not a watchdog.
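To make the distinction concrete, here is a rough cycle-level model in Python (not HDL) of a handshake whose timeout is part of the protocol itself. The state names, the `TIMEOUT_CYCLES` value and the `step`/`run` helpers are all made up for illustration; treat it as a sketch, not as anyone's actual design.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    WAIT_ACK = auto()
    DONE = auto()
    TIMED_OUT = auto()


TIMEOUT_CYCLES = 8  # hypothetical limit that the protocol would specify


def step(state, timer, start, ack):
    """Model one clock edge; return (next_state, next_timer)."""
    if state is State.IDLE:
        return (State.WAIT_ACK, 0) if start else (State.IDLE, 0)
    if state is State.WAIT_ACK:
        if ack:
            return (State.DONE, 0)
        if timer + 1 >= TIMEOUT_CYCLES:
            # The timeout is an ordinary, specified transition.
            return (State.TIMED_OUT, 0)
        return (State.WAIT_ACK, timer + 1)
    # DONE and TIMED_OUT both fall back to IDLE on the next cycle.
    return (State.IDLE, 0)


def run(ack_at, cycles=20):
    """Issue one request at cycle 0; ack is high only on cycle ack_at (None = never)."""
    state, timer = State.IDLE, 0
    trace = []
    for cycle in range(cycles):
        state, timer = step(state, timer, start=(cycle == 0), ack=(cycle == ack_at))
        trace.append(state)
    return trace
```

With `run(3)` the request completes through `DONE`; with `run(None)` the same state machine passes through `TIMED_OUT` and returns to `IDLE` on its own. Only this one handshake is affected, and nothing else gets reset.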
If you can design sequential control logic that doesn't have FSMs, then you are a better man than I am... or you are good at renaming circuits. Everything sequential is an FSM other than simple data registers. A counter is an FSM.
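As a small illustration of the counter point, here is a sketch in Python (a model, not HDL) that writes a hypothetical 2-bit counter with an enable input as an explicit state transition table. `WIDTH`, `next_state` and `TABLE` are names I picked for the example.

```python
WIDTH = 2  # hypothetical 2-bit counter
NUM_STATES = 1 << WIDTH


def next_state(state, enable):
    """Next-state function: the state is simply the current count."""
    return (state + 1) % NUM_STATES if enable else state


# The same counter as an explicit (state, input) -> next state table.
TABLE = {
    (state, enable): next_state(state, enable)
    for state in range(NUM_STATES)
    for enable in (False, True)
}
```

Four states, one input, and a fully defined next state for each pair: that is a state machine, whatever it gets called in the source.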
Yes, defining transitions for every possible state is a good tool, if needed. But by default adding a watchdog timer is overkill, especially when it simply masks a bug rather than exposing it.
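For what "defining transitions for every possible state" can look like, here is a minimal Python model (again not HDL) of a three-state machine held in a 2-bit register, so one encoding is unused. The state names and the choice to recover to `IDLE` are assumptions for the sake of the example.

```python
IDLE, LOAD, SEND = 0b00, 0b01, 0b10  # encoding 0b11 is unused
LEGAL = {IDLE, LOAD, SEND}


def next_state(state, go):
    """Next-state function defined for every 2-bit encoding."""
    if state == IDLE:
        return LOAD if go else IDLE
    if state == LOAD:
        return SEND
    if state == SEND:
        return IDLE
    # Any other encoding (here only 0b11): recover to IDLE.
    return IDLE


def all_encodings_covered(bits=2):
    """Exhaustively check that every encoding and input leads to a legal state."""
    return all(
        next_state(state, go) in LEGAL
        for state in range(1 << bits)
        for go in (False, True)
    )
```

The exhaustive loop in `all_encodings_covered` is cheap at this size, and it is the kind of check that exposes a missing transition instead of hiding it.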
The Transputer had math instructions that would halt the CPU when an overflow occurred. It sounded crazy at the time, but that is actually preferable to letting an erroneous system continue running. Watchdogs often do the latter: they let a system continue running in a corrupt state rather than pointing to the bug.
Again, I don't call that a watchdog, since it is actually a part of your protocol. A watchdog is used to catch problems you know nothing about when you still want the system to continue to run. In CPUs they reset the system so user intervention isn't required. But it is still a disruption to the user if they are using it at the time.
In FPGAs the logic can be designed to not hang. It may require work to do the proper analysis, but it is not only possible, it also saves money in the long run because you don't need to fix difficult-to-find bugs. Bottom line is that adding a watchdog to an FPGA to catch unknown problems shows that something is missing from the design process.
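One simple form that analysis can take is a reachability check on the state graph: a state from which the home state cannot be reached is a potential hang. The sketch below is Python, with two made-up transition maps (`GOOD` and `BAD`) where each state maps to the set of states it can move to under some input. It only shows that a path home exists, which is weaker than proving the machine always gets there, so take it as a starting point.

```python
from collections import deque


def can_reach(transitions, start, target):
    """Breadth-first search: is target reachable from start for some input sequence?"""
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        if state == target:
            return True
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def hang_states(transitions, home):
    """Return the states that have no path back to the home state."""
    return sorted(s for s in transitions if not can_reach(transitions, s, home))


# Hypothetical designs: state -> set of possible next states.
GOOD = {"IDLE": {"IDLE", "REQ"}, "REQ": {"REQ", "XFER", "IDLE"}, "XFER": {"IDLE"}}
BAD = {"IDLE": {"IDLE", "REQ"}, "REQ": {"REQ", "XFER"}, "XFER": {"XFER"}}
```

Here `hang_states(GOOD, "IDLE")` comes back empty, while `hang_states(BAD, "IDLE")` flags `REQ` and `XFER`, because `XFER` has no exit and `REQ` has no way home except through it.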