History log of /openbsd-src/sys/kern/kern_sched.c (Results 1 – 25 of 103)
Revision Date Author Comments
# 35bbce86 24-Nov-2024 claudio <claudio@openbsd.org>

Add KASSERT on P_WSLEEP in setrunqueue() and sched_chooseproc().

P_WSLEEP indicates that the thread is still on a CPU executing and
has not yet mi_switched away to sleep. So it is a bug to make such
a thread runnable or even worse try to switch to it.
OK mpi@


# 7b3f8d1d 08-Oct-2024 claudio <claudio@openbsd.org>

Move common code to update the proc runtime into tuagg_add_runtime().

OK mpi@ kn@


# 1b325262 06-Oct-2024 jsg <jsg@openbsd.org>

remove unused sched_cost_load variable


# 9c699cfd 09-Jul-2024 claudio <claudio@openbsd.org>

In sched_toidle() only call the TRACEPOINT if curproc is set.
sched_toidle() is called by cpu_hatch() to start APs and then curproc
may be NULL.
OK mpi@


# 6fc6c6cc 08-Jul-2024 mpi <mpi@openbsd.org>

Remove the KASSERT() in sched_unpeg_curproc().

This fixes rebooting a GENERIC.MP kernel on SP machines because unpeg is out
of the loop in smr_thread().


# cf31dfde 08-Jul-2024 mpi <mpi@openbsd.org>

Introduce sched_unpeg_curproc() to abstract the current implementation.

ok kettenis@, mlarkin@, miod@, claudio@


# 241d6723 08-Jul-2024 claudio <claudio@openbsd.org>

Rework per proc and per process time usage accounting

For procs (threads) the accounting happens now lockless by curproc using
a generation counter. Callers need to use tu_enter() and tu_leave() for this.
To read the proc p_tu struct tuagg_get_proc() should be used. It ensures
that the values read are consistent.

For processes only the time of exited threads is accumulated in ps_tu and
to get the proper process time usage tuagg_get_process() needs to be called.
tuagg_get_process() will sum up all procs p_tu plus the ps_tu.

This removes another SCHED_LOCK() dependency. Adjust the code in
exit1() and exit2() to correctly account for the full run time.
For this adjust sched_exit() to do the runtime accounting like it is done
in mi_switch().

OK jca@ dlg@
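
The lockless scheme described above resembles a sequence-lock pattern. Below is a minimal userland sketch of that idea, not the kernel code: the struct and every tu_sketch_* name are hypothetical, and C11 atomics stand in for whatever primitives the kernel actually uses. The single writer (the thread itself) bumps a generation counter before and after an update, and readers retry until they see the same even generation on both sides of their reads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-thread usage record. */
struct tu_sketch {
	atomic_uint_fast64_t gen;	/* odd while an update is in progress */
	atomic_uint_fast64_t uticks;	/* user ticks */
	atomic_uint_fast64_t sticks;	/* system ticks */
};

static void
tu_sketch_init(struct tu_sketch *tu)
{
	atomic_init(&tu->gen, 0);
	atomic_init(&tu->uticks, 0);
	atomic_init(&tu->sticks, 0);
}

/* Writer side: only the owning thread calls enter/add/leave. */
static void
tu_sketch_enter(struct tu_sketch *tu)
{
	atomic_fetch_add(&tu->gen, 1);		/* generation becomes odd */
}

static void
tu_sketch_add(struct tu_sketch *tu, uint64_t u, uint64_t s)
{
	/* Updates are only valid between enter and leave. */
	assert(atomic_load(&tu->gen) & 1);
	atomic_fetch_add(&tu->uticks, u);
	atomic_fetch_add(&tu->sticks, s);
}

static void
tu_sketch_leave(struct tu_sketch *tu)
{
	atomic_fetch_add(&tu->gen, 1);		/* generation becomes even */
}

/* Reader side: retry until a stable, even generation brackets the reads. */
static void
tu_sketch_read(struct tu_sketch *tu, uint64_t *u, uint64_t *s)
{
	uint64_t g0, g1;

	do {
		g0 = atomic_load(&tu->gen);
		*u = atomic_load(&tu->uticks);
		*s = atomic_load(&tu->sticks);
		g1 = atomic_load(&tu->gen);
	} while (g0 != g1 || (g0 & 1));
}
```

The sketch uses sequentially consistent atomics throughout for simplicity; a real implementation would likely use cheaper barriers.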


# a09e9584 03-Jun-2024 claudio <claudio@openbsd.org>

Remove the now unused s argument to SCHED_LOCK and SCHED_UNLOCK.

The SPL level is now tracked by the mutex and we no longer need to track
this in the callers.
OK miod@ mlarkin@ tb@ jca@


# 2286c11f 28-Feb-2024 mpi <mpi@openbsd.org>

No need to kick a CPU twice when putting a thread on its runqueue.

From Christian Ludwig, ok claudio@


# 1d970828 24-Jan-2024 cheloha <cheloha@openbsd.org>

clockintr: switch from callee- to caller-allocated clockintr structs

Currently, clockintr_establish() calls malloc(9) to allocate a
clockintr struct on behalf of the caller. mpi@ says this behavior is
incompatible with dt(4). In particular, calling malloc(9) during the
initialization of a PCB outside of dt_pcb_alloc() is (a) awkward and
(b) may conflict with future changes/optimizations to PCB allocation.

To side-step the problem, this patch changes the clockintr subsystem
to use caller-allocated clockintr structs instead of callee-allocated
structs.

clockintr_establish() is named after softintr_establish(), which uses
malloc(9) internally to create softintr objects. The clockintr subsystem
is no longer using malloc(9), so the "establish" naming is no longer apt.
To avoid confusion, this patch also renames "clockintr_establish" to
"clockintr_bind".

Requested by mpi@. Tweaked by mpi@.

Thread: https://marc.info/?l=openbsd-tech&m=170597126103504&w=2

ok claudio@ mlarkin@ mpi@
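
To illustrate the difference between callee- and caller-allocated handles, here is a small sketch under stated assumptions. None of these names are the real clockintr API: ci_sketch, spc_sketch, and the functions below are hypothetical. The point is only that the caller embeds the handle in its own structure, so "binding" fills in existing storage and never needs to allocate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handle; the real struct clockintr has more members. */
struct ci_sketch {
	void (*ci_func)(void *);	/* callback */
	void *ci_arg;			/* callback argument */
};

/* Hypothetical per-CPU state that owns the handle's storage. */
struct spc_sketch {
	struct ci_sketch spc_roundrobin;	/* embedded, not malloc'd */
	unsigned int spc_ticks;
};

/* Bind a callback to caller-provided storage; no allocation happens. */
static void
ci_sketch_bind(struct ci_sketch *ci, void (*func)(void *), void *arg)
{
	assert(func != NULL);
	ci->ci_func = func;
	ci->ci_arg = arg;
}

/* Run the bound callback, if any. */
static void
ci_sketch_dispatch(struct ci_sketch *ci)
{
	if (ci->ci_func != NULL)
		ci->ci_func(ci->ci_arg);
}

/* Example callback: count how often it was dispatched. */
static void
spc_sketch_tick(void *arg)
{
	struct spc_sketch *spc = arg;

	spc->spc_ticks++;
}
```

With this layout, binding cannot fail for lack of memory, and the handle's lifetime is tied to the structure that contains it.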


# bb00e811 24-Oct-2023 claudio <claudio@openbsd.org>

Normally context switches happen in mi_switch() but there are 3 cases
where a switch happens outside. Clean up these code paths and make them
machine independent.

- when a process forks (fork, tfork, kthread), the new proc needs to
somehow be scheduled for the first time. This is done by proc_trampoline.
Since proc_trampoline is machine dependent assembler code, change
the MP specific proc_trampoline_mp() to proc_trampoline_mi() and make
sure it is now always called.
- cpu_hatch: when booting APs the code needs to jump to the first proc
running on that CPU. This should be the idle thread for that CPU.
- sched_exit: when a proc exits it needs to switch away from itself and
then instruct the reaper to clean up the rest. This is done by switching
to the idle loop.

Since the last two cases require a context switch to the idle proc factor
out the common code to sched_toidle() and use it in those places.

Tested by many on all archs.
OK miod@ mpi@ cheloha@


# 709f9596 19-Sep-2023 claudio <claudio@openbsd.org>

Add a KASSERT for p->p_wchan == NULL to setrunqueue()

There is the same check in sched_chooseproc() but that is too late
to know where the bad insertion into the runqueue was done.
OK mpi@


# a332869a 14-Sep-2023 cheloha <cheloha@openbsd.org>

clockintr, scheduler: move statclock handle from clockintr_queue to schedstate_percpu

Move the statclock handle from clockintr_queue.cq_statclock to
schedstate_percpu.spc_statclock. Establish spc_statclock during
sched_init_cpu() alongside the other scheduler clock interrupts.

Thread: https://marc.info/?l=openbsd-tech&m=169428749720476&w=2


# a3464c93 10-Sep-2023 cheloha <cheloha@openbsd.org>

clockintr: support an arbitrary callback function argument

Callers can now provide an argument pointer to clockintr_establish().
The pointer is kept in a new struct clockintr member, cl_arg. The
pointer is passed as the third parameter to clockintr.cl_func when it
is executed during clockintr_dispatch(). Like the callback function,
the callback argument is immutable after the clockintr is established.

At present, nothing uses this. All current clockintr_establish()
callers pass a NULL arg pointer. However, I am confident that dt(4)'s
profile provider will need this in the near future.

Requested by dlg@ back in March.


# 529ac442 06-Sep-2023 cheloha <cheloha@openbsd.org>

clockintr: clockintr_establish: change first argument to a cpu_info pointer

All CPUs control a single clockintr_queue. clockintr_establish()
callers don't need to know about the underlying clockintr_queue.
Accepting a cpu_info pointer as argument simplifies the API.

From mpi@.

ok mpi@


# 3cdedeae 31-Aug-2023 cheloha <cheloha@openbsd.org>

sched_cpu_init: remove unnecessary NULL-checks for clockintr pointers

sched_cpu_init() is only run once per cpu_info struct, so we don't
need these NULL-checks.

The NULL-checks are a vestige of clockintr_cpu_init(), which runs more
than once per CPU and uses the checks to avoid leaking clockintr handles.

Thread: https://marc.info/?l=openbsd-tech&m=169349579804340&w=2

ok claudio@


# 94c38e45 29-Aug-2023 claudio <claudio@openbsd.org>

Remove p_rtime from struct proc and replace it by passing the timespec
as argument to the tuagg_locked function.

- Remove incorrect use of p_rtime in other parts of the tree. p_rtime was
almost always 0 so including it in any sum did not alter the result.
- In main() the update of time can be further simplified since at that time
only the primary cpu is running.
- Add missing nanouptime() call in cpu_hatch() for hppa
- Rename tuagg_unlocked to tuagg_locked like it is done in the rest of
the tree.

OK cheloha@ dlg@


# 9b3d5a4a 14-Aug-2023 mpi <mpi@openbsd.org>

Extend scheduler tracepoints to follow CPU jumping.

- Add two new tracepoints sched:fork & sched:steal
- Include selected CPU number in sched:wakeup
- Add sched:unsleep corresponding to sched:sleep which matches add/removal
of threads on the sleep queue

ok claudio@


# 9ac452c7 11-Aug-2023 cheloha <cheloha@openbsd.org>

hardclock(9), roundrobin: make roundrobin() an independent clock interrupt

- Remove the roundrobin() call from hardclock(9).

- Revise roundrobin() to make it a valid clock interrupt callback.
It is still periodic and it still runs at one tenth of the hardclock
frequency.

- Account for multiple expirations in roundrobin(): if two or more
roundrobin periods have elapsed, set SPCF_SHOULDYIELD on the running
thread immediately to simulate normal behavior.

- Each schedstate_percpu has its own roundrobin() handle, spc_roundrobin.
spc_roundrobin is started/advanced during clockintr_cpu_init().
Intervals elapsed across suspend/resume are discarded.

- rrticks_init and schedstate_percpu.spc_rrticks are now useless:
delete them.

Tweaked by mpi@. With input from mpi@ and claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169127381314651&w=2

ok mpi@ claudio@
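
The handling of multiple expirations can be sketched as follows. This is a simplified model, not the kernel's roundrobin(): the SK_* flags and rr_sketch_expire() are hypothetical, and the two-step marking (first expiration sets a "seen" flag, the next sets "should yield") is an assumption about what "normal behavior" means here. When two or more periods have elapsed at once, the sketch skips the intermediate step and requests a yield immediately.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags modelled on the scheduler's per-CPU flags. */
#define SK_SEENRR	0x1	/* one roundrobin period already observed */
#define SK_SHOULDYIELD	0x2	/* running thread should yield */

/*
 * Advance *deadline past now in whole periods and return the updated
 * flags. Nothing changes if the deadline has not been reached yet.
 */
static unsigned int
rr_sketch_expire(unsigned int flags, uint64_t now, uint64_t *deadline,
    uint64_t period)
{
	uint64_t count;

	assert(period > 0);
	if (now < *deadline)
		return flags;

	/* Number of periods that have elapsed, including the current one. */
	count = (now - *deadline) / period + 1;
	*deadline += count * period;

	if ((flags & SK_SEENRR) || count >= 2)
		flags |= SK_SHOULDYIELD;	/* second (or later) period */
	else
		flags |= SK_SEENRR;		/* first period: just mark it */
	return flags;
}
```

For example, with a period of 10 and a deadline of 20, calling the function at time 45 counts three elapsed periods, moves the deadline to 50, and sets SK_SHOULDYIELD right away.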


# 44e0cbf2 05-Aug-2023 cheloha <cheloha@openbsd.org>

hardclock(9): move setitimer(2) code into itimer_update()

- Move the setitimer(2) code responsible for updating the ITIMER_VIRTUAL
and ITIMER_PROF timers from hardclock(9) into a new clock interrupt
routine, itimer_update(). itimer_update() is periodic and runs at the
same frequency as the hardclock.

+ Revise itimerdecr() to run within itimer_mtx instead of entering
and leaving it.

- Each schedstate_percpu has its own itimer_update() handle, spc_itimer.
A new scheduler flag, SPCF_ITIMER, indicates whether spc_itimer was
started during the last mi_switch() and needs to be stopped during the
next mi_switch() or sched_exit().

- A new per-process flag, PS_ITIMER, indicates whether ITIMER_VIRTUAL
and/or ITIMER_PROF are running. Checking the flag is easier than
entering itimer_mtx to check process.ps_timer[]. The flag is set
and cleared in a new helper function, process_reset_itimer_flag().

- In setitimer(), call need_resched() when the state of ITIMER_VIRTUAL
or ITIMER_PROF is changed to force an mi_switch() and update
spc_itimer.

claudio@ notes that ITIMER_PROF could be implemented as a high-res
timer using the thread's execution time as a guide for when to
interrupt the process and assert SIGPROF. This would probably work
really well in single-threaded processes. ITIMER_VIRTUAL would be
more difficult to make high-res, though, as you need to exclude time
spent in the kernel.

Tested on powerpc64 by gkoehler@. With input from claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169038818517101&w=2

ok claudio@


# 1588c842 05-Aug-2023 claudio <claudio@openbsd.org>

Remove the P_WSLEEP specific KASSERT(). Not only procs in state SSTOP
can be added to the run queue but also procs in state SRUN. The latter
happens when schedcpu() kicks in before the proc had a chance to run.
Problem spotted by gkoehler@
OK cheloha@


# 834cc80d 03-Aug-2023 claudio <claudio@openbsd.org>

Remove the per-cpu loadavg calculation.
The current scheduler usage is highly questionable and probably not helpful.
OK kettenis@ cheloha@ deraadt@


# 96496668 27-Jul-2023 cheloha <cheloha@openbsd.org>

sched_init_cpu: move profclock staggering to clockintr_cpu_init()

initclocks() runs after sched_init_cpu() is called for secondary CPUs,
so profclock_period is still zero and the clockintr_stagger() call for
spc_profclock is useless. For now, just stagger spc_profclock during
clockintr_cpu_init() along with everything else.


# 671537bf 25-Jul-2023 cheloha <cheloha@openbsd.org>

statclock: move profil(2), GPROF code to profclock(), gmonclock()

This patch isolates profil(2) and GPROF from statclock(). Currently,
statclock() implements both profil(2) and GPROF through a complex
mechanism involving both platform code (setstatclockrate) and the
scheduler (pscnt, psdiv, and psratio). We have a machine-independent
interface to the clock interrupt hardware now, so we no longer need to
do it this way.

- Move profil(2)-specific code from statclock() to a new clock
interrupt callback, profclock(), in subr_prof.c. Each
schedstate_percpu has its own profclock handle. The profclock is
enabled/disabled for a given CPU when it is needed by the running
thread during mi_switch() and sched_exit().

- Move GPROF-specific code from statclock() to a new clock interrupt
callback, gmonclock(), in subr_prof.c. Where available, each cpu_info
has its own gmonclock handle. The gmonclock is enabled/disabled for
a given CPU via sysctl(2) in prof_state_toggle().

- Both profclock() and gmonclock() have a fixed period, profclock_period,
that is initialized during initclocks().

- Export clockintr_advance(), clockintr_cancel(), clockintr_establish(),
and clockintr_stagger() via <sys/clockintr.h>. They have external
callers now.

- Delete pscnt, psdiv, psratio. From schedstate_percpu, also delete
spc_pscnt and spc_psdiv. The statclock frequency is not dynamic
anymore so these variables are now useless.

- Delete code/state related to the dynamic statclock frequency from
kern_clockintr.c. The statclock frequency can still be pseudo-random,
so move the contents of clockintr_statvar_init() into clockintr_init().

With input from miod@, deraadt@, and claudio@. Early revisions
cleaned up by claudio@. Early revisions tested by claudio@. Tested by
cheloha@ on amd64, arm64, macppc, octeon, and sparc64 (sun4v).
Compile- and boot- tested on i386 by mlarkin@. riscv64 compilation
bugs found by mlarkin@. Tested on riscv64 by jca@. Tested on
powerpc64 by gkoehler@.


# f2e7dc09 14-Jul-2023 claudio <claudio@openbsd.org>

struct sleep_state is no longer used, remove it.
Also remove the priority argument to sleep_finish(); the code can use
the p_flag P_SINTR flag to know if the signal check is needed or not.
OK cheloha@ kettenis@ mpi@

