#
7ced3071 |
| 23-Jul-2024 |
Aaron LI <aly@aaronly.me> |
vfs/procfs: Whitespace and style fixes
|
#
2b3f93ea |
| 13-Oct-2023 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Add per-process capability-based restrictions
* This new system allows userland to set capability restrictions which turn off numerous kernel features and root accesses. These restrictions are inherited by sub-processes recursively. Once set, restrictions cannot be removed.
Basic restrictions that mimic an unadorned jail can be enabled without creating a jail, but generally speaking real security also requires creating a chrooted filesystem topology, and a jail is still needed to really segregate processes from each other. If you do so, however, you can (for example) disable mount/umount and most global root-only features.
* Add new system calls and a manual page for syscap_get(2) and syscap_set(2)
* Add sys/caps.h
* Add the "setcaps" userland utility and manual page.
* Remove priv.9 and the priv_check infrastructure, replacing it with a newly designed caps infrastructure.
* The intention is to add path restriction lists and similar features to improve jailless security in the near future, and to optimize the priv_check code.
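As a rough illustration of how a process might apply such a restriction before launching a service, here is a minimal userland sketch. The real prototypes and capability constants live in <sys/caps.h> and the syscap_set(2) manual page added by this commit; the signature and the SYSCAP_RESTRICTEDROOT identifier used below are assumptions for illustration only, not copied from the header.
    /*
     * Hedged sketch: apply a capability restriction, then exec a service.
     * The restriction is inherited by all children and cannot be removed.
     * Assumed API (check <sys/caps.h> and syscap_set(2) for the real one):
     *   int syscap_set(int cap, int flags, void *data, size_t bytes);
     */
    #include <sys/types.h>
    #include <sys/caps.h>           /* capability ids (assumed location) */
    #include <err.h>
    #include <unistd.h>

    int
    main(void)
    {
            /* Hypothetical capability id; once set it is permanent. */
            if (syscap_set(SYSCAP_RESTRICTEDROOT, 0, NULL, 0) != 0)
                    err(1, "syscap_set");

            /* Children inherit the restriction recursively. */
            execl("/usr/local/sbin/myservice", "myservice", (char *)NULL);
            err(1, "execl");
    }
The same effect should be reachable from a shell via the "setcaps" utility mentioned above.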
|
Revision tags: v6.4.0, v6.4.0rc1, v6.5.0, v6.2.2, v6.2.1, v6.3.0, v6.0.1 |
|
#
5229377c |
| 07-Sep-2021 |
Sascha Wildner <saw@online.de> |
kernel/libc: Remove the old vmm code.
Removes the kernel code and two system calls.
Bump __DragonFly_version too.
Reviewed-by: aly, dillon
|
#
1eeaf6b2 |
| 20-May-2021 |
Aaron LI <aly@aaronly.me> |
vm: Change 'kernel_map' global to type of 'struct vm_map *'
Change the global variable 'kernel_map' from type 'struct vm_map' to a pointer to this struct. This simplifies the code a bit, since all invocations take its address. This change also aligns with NetBSD's 'kernel_map', which is likewise a pointer, and helps the porting of NVMM.
No functional changes.
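A minimal sketch of what the type change means for callers (declarations simplified; the real ones live in the vm headers):
    /* Hedged sketch of the kernel_map type change; names simplified. */
    struct vm_map;                          /* opaque for this sketch */

    #ifdef BEFORE_THIS_COMMIT
    extern struct vm_map  kernel_map;       /* callers passed &kernel_map */
    #else
    extern struct vm_map *kernel_map;       /* callers now pass kernel_map */
    #endif
Every call site that previously took the address of the global can now pass the pointer directly, matching how NetBSD declares its kernel_map.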
|
Revision tags: v6.0.0, v6.0.0rc1, v6.1.0, v5.8.3, v5.8.2, v5.8.1, v5.8.0, v5.9.0, v5.8.0rc1, v5.6.3 |
|
#
13dd34d8 |
| 18-Oct-2019 |
zrj <rimvydas.jasinskas@gmail.com> |
kernel: Cleanup <sys/uio.h> issues.
The iovec_free() inline greatly complicates inclusion of this header. The NULL check is not always seen from <sys/_null.h>. Luckily only three kernel sources need it: kern_subr.c, sys_generic.c and uipc_syscalls.c. Also, just a single dev/drm source makes use of 'struct uio'.
* Include <sys/uio.h> explicitly first in drm_fops.c to avoid the kfree() macro override in the drm compat layer.
* Use <sys/_uio.h> where only the enums and struct uio are needed, but ensure that userland will not include it, for possible later <sys/user.h> use.
* Stop using <sys/vnode.h> as a shortcut for the uiomove*() prototypes. The uiomove*() family of functions possibly transfers data across the kernel/user space boundary; this header's presence explicitly marks sources as such.
* Prefer to add <sys/uio.h> after <sys/systm.h>, but before <sys/proc.h> and definitely before <sys/malloc.h> (except for the 3 mentioned sources). This will allow removing <sys/malloc.h> from <sys/uio.h> later on.
* Adjust <sys/user.h> to use component headers instead of <sys/uio.h>.
While there, use the opportunity for a minimal whitespace cleanup.
No functional differences observed in compiler intermediates.
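The include-ordering rule the commit describes can be sketched as follows (the surrounding file is hypothetical; only the ordering is taken from the commit message):
    /*
     * Preferred ordering per the commit: <sys/uio.h> after <sys/systm.h>
     * but before <sys/proc.h> and <sys/malloc.h>, so <sys/malloc.h> can
     * later be dropped from <sys/uio.h>. <sys/_uio.h> suffices when only
     * the enums and struct uio are needed.
     */
    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/uio.h>            /* struct uio, uiomove*() prototypes */
    #include <sys/proc.h>
    #include <sys/malloc.h>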
|
Revision tags: v5.6.2, v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0 |
|
#
2a7bd4d8 |
| 18-May-2019 |
Sascha Wildner <saw@online.de> |
kernel: Don't include <sys/user.h> in kernel code.
There is really no point in doing that because its main purpose is to expose kernel structures to userland. In the majority of cases it wasn't needed at all, and the rest required only a couple of other includes.
|
Revision tags: v5.4.3, v5.4.2 |
|
#
fcf6efef |
| 02-Mar-2019 |
Sascha Wildner <saw@online.de> |
kernel: Remove numerous #include <sys/thread2.h>.
Most of them were added when we converted spl*() calls to crit_enter()/crit_exit(), almost 14 years ago. We can now remove a good chunk of them again where crit_*() calls are no longer used.
I had to adjust some files that were relying on thread2.h or headers that it includes coming in via other headers that it was removed from.
|
Revision tags: v5.4.1, v5.4.0, v5.5.0, v5.4.0rc1, v5.2.2, v5.2.1, v5.2.0, v5.3.0, v5.2.0rc, v5.0.2 |
|
#
7a45978d |
| 09-Nov-2017 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Fix bug in vm_fault_page()
* Fix a bug in vm_fault_page() and vm_fault_page_quick(). The code is not intended to update the user pmap, but if the vm_map_lookup() results in a COW, any existing page in the underlying pmap will no longer match the page that should be there.
The user process will still work correctly in that it will fault the COW'd page if/when it tries to issue a write to that address, but userland will not have visibility to any kernel use of vm_fault_page() that modifies the page and causes a COW if the page has already been faulted in.
* Fixed by detecting the COW and at least removing the pte from the pmap to force userland to re-fault it.
* This fixes gdb operation on programs. The problem did not rear its head before because the kernel did not pre-populate as many pages in the initial exec as it does now.
* Enhance vm_map_lookup()'s &wired argument to return wflags instead, which includes FS_WIRED and also now has FS_DIDCOW.
Reported-by: profmakx
|
Revision tags: v5.0.1 |
|
#
22b7a3db |
| 17-Oct-2017 |
Sascha Wildner <saw@online.de> |
kernel: Remove <sys/sysref{,2}.h> inclusion from files that don't need it.
Some of the headers are public in one way or another so bump __DragonFly_version for safety.
While here, add a missing <sys/objcache.h> include to kern_exec.c which was previously relying on it coming in via <sys/sysref.h> (which was included by <sys/vm_map.h> prior to this commit).
|
Revision tags: v5.0.0, v5.0.0rc2, v5.1.0, v5.0.0rc1, v4.8.1, v4.8.0, v4.6.2, v4.9.0, v4.8.0rc |
|
#
dc039ae0 |
| 28-Jan-2017 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Change vm_fault_page[_quick]() semantics + vkernel fixes
* vm_fault_page[_quick]() needs to be left busied for PROT_WRITE so modifications made by the caller do not race other operations in the kernel. Modify the API to accommodate the behavior.
* Fix procfs write race with new vm_fault_page() API.
* Fix bugs in ept_swapu32() and ept_swapu64() (vkernel + VMM)
* pmap_fault_page_quick() doesn't understand EPT page tables, so have it fail for that case too. This fixes bugs in vkernel + VMM mode.
* Also do some minor normalization of variable names in pmap.c
* vkernel/pmap - Use atomic_swap_long() to modify PTEs instead of a simple (non-atomic) assignment.
* vkernel/pmap - Fix numerous bugs in the VMM and non-VMM code for pmap_kenter*(), pmap_qenter*(), etc.
* vkernel/pmap - Collapse certain pmap_qremove_*() routines into the base pmap_qremove().
|
#
3091de50 |
| 17-Dec-2016 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Tag vm_map_entry structure, slight optimization to zalloc, misc.
* Tag the vm_map_entry structure, allowing debugging programs to break-down how KMEM is being used more easily.
This requires an additional argument to vm_map_find() and most kmem_alloc*() functions.
* Remove the page chunking parameter to zinit() and zinitna(). It was only being used degeneratively. Increase the chunking from one page to four pages, which will reduce the amount of vm_map_entry spam in the kernel_map.
* Use atomic ops when adjusting zone_kern_pages.
|
Revision tags: v4.6.1, v4.6.0, v4.6.0rc2, v4.6.0rc, v4.7.0, v4.4.3, v4.4.2, v4.4.1, v4.4.0, v4.5.0, v4.4.0rc, v4.2.4, v4.3.1, v4.2.3, v4.2.1, v4.2.0, v4.0.6, v4.3.0, v4.2.0rc, v4.0.5, v4.0.4, v4.0.3, v4.0.2, v4.0.1, v4.0.0, v4.0.0rc3, v4.0.0rc2, v4.0.0rc, v4.1.0, v3.8.2 |
|
#
93f86408 |
| 23-Jul-2014 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Redo struct vmspace allocator and ref-count handling.
* Get rid of the sysref-based allocator and ref-count handler and replace with objcache. Replace all sysref API calls in other kernel modules with vmspace_*() API calls (adding new API calls as needed).
* Roll our own, hopefully safer, ref-count handling. We get rid of exitingcnt and instead just leave holdcnt bumped during the exit/reap sequence. We add vm_refcnt and redo vm_holdcnt.
Now a formal reference (vm_refcnt) is ALSO covered by a holdcnt. Stage-1 termination occurs when vm_refcnt transitions from 1->0. Stage-2 termination occurs when vm_holdcnt transitions from 1->0.
* Should fix rare reported panic under heavy load.
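The two-stage termination described above can be sketched with ordinary C11 atomics. This is an analogy only, not the kernel's actual vmspace code; the real work happens in vm_map.c with the kernel's own atomic and objcache primitives.
    /*
     * Hedged sketch of the ref/hold pattern: every formal reference is
     * also covered by a hold. Stage-1 teardown runs when the last formal
     * reference goes away; stage-2 (freeing) runs when the last hold does.
     */
    #include <stdatomic.h>
    #include <stdio.h>

    struct obj {
            atomic_int refcnt;      /* formal references */
            atomic_int holdcnt;     /* holds; >= refcnt by construction */
    };

    static void
    stage1_terminate(struct obj *o)
    {
            (void)o;
            printf("stage-1: tear down the contents\n");
    }

    static void
    stage2_free(struct obj *o)
    {
            (void)o;
            printf("stage-2: free the structure\n");
    }

    static void
    obj_rele(struct obj *o)
    {
            if (atomic_fetch_sub(&o->refcnt, 1) == 1)
                    stage1_terminate(o);    /* last formal reference */
            if (atomic_fetch_sub(&o->holdcnt, 1) == 1)
                    stage2_free(o);         /* last hold */
    }

    int
    main(void)
    {
            struct obj o = { 1, 1 };        /* one reference, one hold */
            obj_rele(&o);                   /* triggers both stages */
            return 0;
    }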
|
Revision tags: v3.8.1, v3.6.3, v3.8.0, v3.8.0rc2, v3.9.0, v3.8.0rc, v3.6.2, v3.6.1, v3.6.0, v3.7.1, v3.6.0rc |
|
#
a86ce0cd |
| 20-Sep-2013 |
Matthew Dillon <dillon@apollo.backplane.com> |
hammer2 - Merge Mihai Carabas's VKERNEL/VMM GSOC project into the main tree
* This merge contains work primarily by Mihai Carabas, with some misc fixes also by Matthew Dillon.
* Special note on GSOC core
This is, needless to say, a huge amount of work compressed down into a few paragraphs of comments. Adds the pc64/vmm subdirectory and tons of stuff to support hardware virtualization in guest-user mode, plus the ability for programs (vkernels) running in this mode to make normal system calls to the host.
* Add system call infrastructure for VMM mode operations in kern/sys_vmm.c which vectors through a structure to machine-specific implementations.
vmm_guest_ctl_args() vmm_guest_sync_addr_args()
vmm_guest_ctl_args() - bootstrap VMM and EPT modes. Copydown the original user stack for EPT (since EPT 'physical' addresses cannot reach that far into the backing store represented by the process's original VM space). Also installs the GUEST_CR3 for the guest using parameters supplied by the guest.
vmm_guest_sync_addr_args() - A host helper function that the vkernel can use to invalidate page tables on multiple real cpus. This is a lot more efficient than having the vkernel try to do it itself with IPI signals via cpusync*().
* Add Intel VMX support to the host infrastructure. Again, tons of work compressed down into a one paragraph commit message. Intel VMX support added. AMD SVM support is not part of this GSOC and not yet supported by DragonFly.
* Remove PG_* defines for PTE's and related mmu operations. Replace with a table lookup so the same pmap code can be used for normal page tables and also EPT tables.
* Also include X86_PG_V defines specific to normal page tables for a few situations outside the pmap code.
* Adjust DDB to disassemble VMX-related (Intel) instructions.
* Add infrastructure to exit1() to deal with related structures.
* Optimize pfind() and pfindn() to remove the global token when looking up the current process's PID (Matt)
* Add support for EPT (double layer page tables). This primarily required adjusting the pmap code to use a table lookup to get the PG_* bits.
Add an indirect vector for copyin, copyout, and other user address space copy operations to support manual walks when EPT is in use.
A multitude of system calls which manually looked up user addresses via the vm_map now need a VMM layer call to translate EPT.
* Remove the MP lock from trapsignal() use cases in trap().
* (Matt) Add pthread_yield()s in most spin loops to help situations where the vkernel is running on more cpu's than the host has, and to help with scheduler edge cases on the host.
* (Matt) Add a pmap_fault_page_quick() infrastructure that vm_fault_page() uses to try to shortcut operations and avoid locks. Implement it for pc64. This function checks whether the page is already faulted in as requested by looking up the PTE. If not it returns NULL and the full blown vm_fault_page() code continues running.
* (Matt) Remove the MP lock from most of the vkernel's trap() code
* (Matt) Use a shared spinlock when possible for certain critical paths related to the copyin/copyout path.
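The table-lookup idea mentioned above (replacing hardwired PG_* defines so the same pmap code can drive both regular page tables and EPT) could be sketched roughly as follows; all names and bit values here are illustrative, not taken from the DragonFly sources.
    /*
     * Hedged sketch: index PTE bit masks through a per-format table so
     * generic pmap code never hardcodes PG_V/PG_RW-style constants.
     * The bit values are made up for illustration.
     */
    #include <stdint.h>
    #include <stdio.h>

    enum pmap_bit { PM_VALID, PM_WRITE, PM_USER, PM_BIT_COUNT };

    struct pmap_fmt {
            uint64_t bits[PM_BIT_COUNT];
    };

    static const struct pmap_fmt fmt_normal = {
            .bits = { [PM_VALID] = 0x01, [PM_WRITE] = 0x02, [PM_USER] = 0x04 },
    };
    static const struct pmap_fmt fmt_ept = {
            .bits = { [PM_VALID] = 0x01, [PM_WRITE] = 0x02, [PM_USER] = 0x10 },
    };

    /* Build a PTE for either format without format-specific #defines. */
    static uint64_t
    pte_make(const struct pmap_fmt *fmt, int writable)
    {
            uint64_t pte = fmt->bits[PM_VALID] | fmt->bits[PM_USER];
            if (writable)
                    pte |= fmt->bits[PM_WRITE];
            return pte;
    }

    int
    main(void)
    {
            printf("normal pte %#llx, ept pte %#llx\n",
                (unsigned long long)pte_make(&fmt_normal, 1),
                (unsigned long long)pte_make(&fmt_ept, 1));
            return 0;
    }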
|
Revision tags: v3.4.3 |
|
#
dc71b7ab |
| 31-May-2013 |
Justin C. Sherrill <justin@shiningsilence.com> |
Correct BSD License clause numbering from 1-2-4 to 1-2-3.
Apparently everyone's doing it: http://svnweb.freebsd.org/base?view=revision&revision=251069
Submitted-by: "Eitan Adler" <lists at eitanadler.com>
|
Revision tags: v3.4.2 |
|
#
2702099d |
| 06-May-2013 |
Justin C. Sherrill <justin@shiningsilence.com> |
Remove advertising clause from all that isn't contrib or userland bin.
By: Eitan Adler <lists@eitanadler.com>
|
Revision tags: v3.4.1, v3.4.0, v3.4.0rc, v3.5.0, v3.2.2, v3.2.1, v3.2.0, v3.3.0, v3.0.3, v3.0.2, v3.0.1, v3.1.0, v3.0.0 |
|
#
a2ee730d |
| 02-Dec-2011 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Refactor the vmspace locking code and several use cases
* Reorder the vnode ref/rele sequence in the exec path so p_textvp is left in a more valid state while being initialized.
* Remove the vm_exitingcnt test in exec_new_vmspace(). Release various resources unconditionally on the last exiting thread regardless of the state of exitingcnt. This just moves some of the resource releases out of the wait*() system call path and back into the exit*() path.
* Implement a hold/drop mechanic for vmspaces and use them in procfs_rwmem(), vmspace_anonymous_count(), and vmspace_swap_count(), and various other places.
This does a better job protecting the vmspace from deletion while various unrelated third parties might be trying to access it.
* Implement vmspace_free() for other code to call instead of them trying to call sysref_put() directly. Interlock with a vmspace_hold() so final termination processing always keys off the vm_holdcount.
* Implement vm_object_allocate_hold() and use it in a few places in order to allow OBJT_SWAP objects to be allocated atomically, so other third parties (like the swapcache cleaning code) can't wiggle their way in and access a partially initialized object.
* Reorder the vmspace_terminate() code and introduce some flags to ensure that resources are terminated at the proper time and in the proper order.
|
#
40cb3ccd |
| 01-Dec-2011 |
Venkatesh Srinivas <me@endeavour.zapto.org> |
Merge branch 'master' of /repository/git/dragonfly
|
#
82354ad8 |
| 01-Dec-2011 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Fix race between procfs / proc sysctls and exec, refactor PHOLD/etc
* During a [v]fork/exec sequence the exec will replace the VM space of the target process. A concurrent 'ps' operation could access the target process's vmspace as it was being ripped out, resulting in memory corruption.
* The P_INEXEC test in procfs was insufficient, the exec code itself must also wait for procfs's PHOLD() on the process to go away before it can proceed. This should properly interlock the entire operation.
* Can occur with procfs or non-procfs ps's (via proc sysctls).
* Possibly related to the seg-fault issue we have where the user stack gets corrupted.
* Also revamp PHOLD()/PRELE() and add PSTALL(), changing all manual while() loops waiting on p->p_lock to use PSTALL().
These functions now integrate a wakeup request flag into p->p_lock using atomic ops and no longer tsleep() for 1 tick (or hz ticks, or whatever). Wakeups are issued proactively.
|
#
86d7f5d3 |
| 26-Nov-2011 |
John Marino <draco@marino.st> |
Initial import of binutils 2.22 on the new vendor branch
Future versions of binutils will also reside on this branch rather than continuing to create new binutils branches for each new version.
|
#
07cdb1d2 |
| 16-Nov-2011 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Fix races in procfs
* Use cache_copy to acquire a stable textnch
* Do not try to access the vmspace (for process args) for processes in the SIDL or SZOMB state as it may be a moving target even with the process PHOLD()en.
|
#
8db21154 |
| 16-Nov-2011 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Attempt to make procfs MPSAFE (3)
* More fixes to silly bugs. Well, I did say 'attempt' :-)
|
#
4643740a |
| 15-Nov-2011 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Major signal path adjustments to fix races, tsleep race fixes, +more
* Refactor the signal code to properly hold the lp->lwp_token. In particular the ksignal() and lwp_signotify() paths.
* The tsleep() path must also hold lp->lwp_token to properly handle lp->lwp_stat states and interlocks.
* Refactor the timeout code in tsleep() to ensure that endtsleep() is only called from the proper context, and fix races between endtsleep() and lwkt_switch().
* Rename proc->p_flag to proc->p_flags
* Rename lwp->lwp_flag to lwp->lwp_flags
* Add lwp->lwp_mpflags and move flags which require atomic ops (are adjusted when not the current thread) to the new field.
* Add td->td_mpflags and move flags which require atomic ops (are adjusted when not the current thread) to the new field.
* Add some freeze testing code to the x86-64 trap code (default disabled).
|
#
b0c15cdf |
| 10-Nov-2011 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Remove ad-hoc increment/decrement of vm->vm_sysref
* Remove the ad-hoc increment/decrement of vm->vm_sysref when pulling data out for a ps. Acquire p->p_token instead.
This is an attempt to determine whether these ad-hoc operations are responsible for causing a race that results in the seg-fault issue we see on monster.
|
Revision tags: v2.12.0, v2.13.0, v2.10.1, v2.11.0, v2.10.0, v2.9.1, v2.8.2, v2.8.1, v2.8.0, v2.9.0, v2.6.3, v2.7.3, v2.6.2, v2.7.2, v2.7.1, v2.6.1, v2.7.0, v2.6.0, v2.5.1, v2.4.1, v2.5.0, v2.4.0 |
|
#
e54488bb |
| 19-Aug-2009 |
Matthew Dillon <dillon@apollo.backplane.com> |
AMD64 - Refactor uio_resid and size_t assumptions.
* uio_resid changed from int to size_t (size_t == unsigned long equivalent).
* size_t assumptions in most kernel code have been refactored to operate in a 64 bit environment.
* In addition, the 2G limitation for VM related system calls such as mmap() has been removed in 32 bit environments. Note however that because read() and write() return ssize_t, these functions are still limited to a 2G byte count in 32 bit environments.
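The remaining read(2)/write(2) limitation noted above follows from the ssize_t return type. A small sketch of the arithmetic on a 32-bit environment (generic C, not DragonFly-specific code):
    /*
     * Hedged sketch: even with uio_resid as size_t, a single read() or
     * write() cannot report more than SSIZE_MAX bytes back to the caller,
     * which on a 32-bit environment is just under 2G.
     */
    #include <sys/types.h>
    #include <limits.h>
    #include <stdio.h>

    int
    main(void)
    {
            size_t want = (size_t)1 << 31;          /* a 2 GiB request */

            if (want > (size_t)SSIZE_MAX)
                    printf("%zu bytes cannot be returned in an ssize_t "
                        "(max %zd)\n", want, (ssize_t)SSIZE_MAX);
            else
                    printf("64-bit ssize_t can represent %zu bytes\n", want);
            return 0;
    }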
|
Revision tags: v2.3.2, v2.3.1, v2.2.1, v2.2.0, v2.3.0 |
|
#
08abcb65 |
| 03-Jan-2009 |
Matthew Dillon <dillon@apollo.backplane.com> |
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into devel
|