Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Watchdog timer
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
==Fault detection== A watchdog timer provides automatic detection of catastrophic malfunctions that prevent the computer from kicking it. However, computers can have other, less-severe types of faults which do not interfere with kicking, but which still require watchdog oversight. To support these, a computer system is typically designed so that its watchdog timer will be kicked only if the computer deems the system functional. The computer determines whether the system is functional by conducting one or more fault detection tests and will kick the watchdog only if all tests have passed.<ref name="ganssle"/> [[File:Wdctl screenshot.png|upright=1.5|thumb|Screenshot of <code>[[wdctl]]</code>, a program that shows watchdog status]] In computers that are running an operating system and multiple [[Process (computing)|processes]], a single, simple test might be insufficient to guarantee normal operation, as it could fail to detect a subtle fault condition and consequently kick the watchdog even though a fault condition exists. For example, in the case of the Linux operating system, a user-space watchdog [[Daemon (computing)|daemon]] may simply kick the watchdog periodically without performing any tests. As long as the daemon runs normally, the system will be protected against serious system crashes such as a [[kernel panic]]. To detect less severe faults, the daemon<ref name="LinuxWatchdogManpage"/> can perform tests that cover various aspects of the system condition, including resource availability (e.g., [[Computer memory|memory]], [[file handles]], CPU time), evidence of expected process activity (e.g., system daemons running, specific files being present or updated), overheating, and network activity.<ref name="LinuxWatchdogTests"/> Upon discovery of a failed test, the computer may attempt to perform a sequence of corrective actions under software control, culminating with a software-initiated reboot. If the software fails to invoke a reboot, the hardware watchdog timer β if available β will timeout and invoke a hardware reset. In effect, this is a multistage watchdog timer in which the software comprises the first and the hardware WDT the final stage. In a Linux system, for example, the watchdog daemon can be configured to attempt to perform a software-initiated reboot, which may be preferable to a hardware reset as it allows file systems to be safely [[Mount (computing)|unmounted]] and fault information to be logged prior to the reboot. It is essential, however, to have the insurance provided by a hardware WDT, to allow for the case in which a fault causes the daemon itself to malfunction, and thus become unable to invoke a reboot.<ref name="ganssle"/>
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)