.\"	$NetBSD: heartbeat.9,v 1.6 2024/06/02 13:28:45 andvar Exp $
.\"
.\" Copyright (c) 2023 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd July 6, 2023
.Dt HEARTBEAT 9
.Os
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh NAME
.Nm heartbeat
.Nd periodic checks to ensure CPUs are making progress
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh SYNOPSIS
.Cd "options HEARTBEAT"
.Cd "options HEARTBEAT_MAX_PERIOD_DEFAULT=15"
.Pp
.\"
.In sys/heartbeat.h
.\"
.Ft void
.Fn heartbeat_start void
.Ft void
.Fn heartbeat void
.Ft void
.Fn heartbeat_suspend void
.Ft void
.Fn heartbeat_resume void
.Fd "#ifdef DDB"
.Ft void
.Fn heartbeat_dump void
.Fd "#endif"
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh DESCRIPTION
The
.Nm
subsystem verifies that soft interrupts
.Pq Xr softint 9
and the system
.Xr timecounter 9
are making progress over time, and panics if they appear stuck.
.Pp
The number of seconds before
.Nm
panics without progress is controlled by the sysctl knob
.Li kern.heartbeat.max_period ,
which defaults to 15.
If set to zero, heartbeat checks are disabled.
.Pp
The periodic hardware timer interrupt handler calls
.Fn heartbeat
every tick on each CPU.
Once per second
.Po
i.e., every
.Xr hz 9
ticks
.Pc ,
.Fn heartbeat
schedules a soft interrupt at priority
.Dv SOFTINT_CLOCK
to advance the current CPU's view of
.Xr time_uptime 9 .
.Pp
.Fn heartbeat
checks whether
.Xr time_uptime 9
has changed, to see if either the
.Xr timecounter 9
or soft interrupts on the current CPU are stuck.
If it has not advanced within
.Li kern.heartbeat.max_period
seconds' worth of ticks, or if it has advanced but the current CPU's
view of it lags behind by more than
.Li kern.heartbeat.max_period
seconds, then
.Fn heartbeat
panics.
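.Pp
The two conditions above can be sketched as the following simplified
user-space model.
It is not the kernel implementation: the structure
.Vt struct hb_cpu ,
its fields, and the function
.Fn hb_local_stuck
are hypothetical names introduced only for illustration, and the exact
comparisons are an assumption drawn from the description above.
.Bd -literal -offset indent
#include <stdbool.h>

/* Hypothetical per-CPU state; names are illustrative only. */
struct hb_cpu {
	unsigned count;	/* ticks counted on this CPU */
	unsigned cache;	/* this CPU's view of the uptime (s) */
	unsigned stamp;	/* value of count at last cache update */
};

/*
 * Return true if the local check would call for a panic.
 * uptime is the global uptime in seconds, hz the tick rate,
 * max_period the limit in seconds (0 disables the check).
 */
static bool
hb_local_stuck(const struct hb_cpu *c, unsigned uptime,
    unsigned hz, unsigned max_period)
{
	if (max_period == 0)
		return false;
	if (c->cache == uptime) {
		/* Uptime unchanged: too many ticks since then? */
		return c->count - c->stamp > max_period * hz;
	}
	/* Uptime advanced: is this CPU's view too far behind? */
	return uptime - c->cache > max_period;
}
.Ed
.Pp
In this model, the first branch corresponds to a stuck timecounter and
the second to stuck soft interrupts on the current CPU.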
.Pp
.Fn heartbeat
also checks whether the next online CPU has advanced its view of
.Xr time_uptime 9 ,
to see if soft interrupts
.Pq including Xr callout 9
on that CPU are stuck.
If it hasn't updated within
.Li kern.heartbeat.max_period
seconds,
.Fn heartbeat
sends an
.Xr ipi 9
to panic on that CPU.
If that CPU has not acknowledged the
.Xr ipi 9
within one second,
.Fn heartbeat
panics on the current CPU instead.
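.Pp
The cross-CPU decision described above can be sketched as a small
user-space model.
The type
.Vt enum hb_action
and the function
.Fn hb_check_next
are hypothetical names used only for illustration; the
.Fa ipi_acked
argument stands in for whether the other CPU acknowledged the
.Xr ipi 9
within one second.
.Bd -literal -offset indent
#include <stdbool.h>

/* Hypothetical outcomes of the cross-CPU check. */
enum hb_action {
	HB_OK,			/* next CPU is keeping up */
	HB_PANIC_REMOTE,	/* next CPU took the IPI and panics */
	HB_PANIC_LOCAL		/* no acknowledgement: panic here */
};

/*
 * uptime is the global uptime in seconds, next_cache the next
 * online CPU's view of it, max_period the limit in seconds
 * (0 disables the check).
 */
static enum hb_action
hb_check_next(unsigned uptime, unsigned next_cache,
    unsigned max_period, bool ipi_acked)
{
	if (max_period == 0)
		return HB_OK;
	if (uptime - next_cache <= max_period)
		return HB_OK;
	return ipi_acked ? HB_PANIC_REMOTE : HB_PANIC_LOCAL;
}
.Ed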
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh FUNCTIONS
.Bl -tag -width Fn
.It Fn heartbeat
Check for timecounter and soft interrupt progress on this CPU and on
another CPU, and schedule a soft interrupt to advance this CPU's view
of timecounter progress.
.Pp
Called by
.Xr hardclock 9
periodically.
.It Fn heartbeat_dump
Print each CPU's heartbeat counter, uptime cache, and uptime cache
timestamp (in units of heartbeats) to the console.
.Pp
Can be invoked from
.Xr ddb 9
by
.Ql call heartbeat_dump .
.It Fn heartbeat_resume
Resume heartbeat monitoring of the current CPU.
.Pp
Called after a CPU has started running but before it has been
marked online.
.It Fn heartbeat_start
Start monitoring heartbeats systemwide.
.Pp
Called by
.Fn main
in
.Pa sys/kern/init_main.c
as soon as soft interrupts can be established.
.It Fn heartbeat_suspend
Suspend heartbeat monitoring of the current CPU.
.Pp
Called after the current CPU has been marked offline but before it has
stopped running.
.El
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh CODE REFERENCES
The
.Nm
subsystem is implemented in
.Pa sys/kern/kern_heartbeat.c .
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh SEE ALSO
.Xr swwdog 4 ,
.Xr wdogctl 8
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh HISTORY
The
.Nm
subsystem first appeared in
.Nx 11.0 .