History log of /dflybsd-src/sys/vm/vm_object.h (Results 1 – 25 of 74)
Revision Date Author Comments
# 712b6620 21-May-2021 Aaron LI <aly@aaronly.me>

vm: Change 'kernel_object' global to pointer type

Following the previous commits, this commit changes the 'kernel_object'
to pointer type of 'struct vm_object *'. This makes it align better
with 'kernel_map' and simplifies the code a bit.

No functional changes.
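
For illustration, a minimal sketch of the declaration change described above (the surrounding DragonFly definitions are not shown and may differ slightly):

    /* Before: a statically allocated object, referenced as &kernel_object. */
    extern struct vm_object kernel_object;

    /* After: exported as a pointer, matching the style of kernel_map. */
    extern struct vm_object *kernel_object;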


# cdf89dcf 05-May-2020 Sascha Wildner <saw@online.de>

kernel/vm: Rename VM_PAGER_PUT_* to OBJPC_*.

While here, rename the rest of the VM_PAGER_* flags too.

Suggested-by: dillon


# a7c16d7a 25-Feb-2020 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Simple cache line optimizations

* Reorder struct vm_page, struct vnode, and struct vm_object a bit
to improve cache-line locality.

* Use atomic_fcmpset_*() instead of atomic_cmpset_*() in several
places to reduce the inter-cpu cache coherency load a bit.
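
The fcmpset pattern can be illustrated with C11 atomics, whose compare-exchange hands back the observed value on failure just as atomic_fcmpset_*() does; this is a userland sketch of the two loop styles, not the kernel's atomic implementation:

    #include <stdatomic.h>

    /* cmpset style: a failed attempt must re-read the word before retrying,
     * touching the contended cache line an extra time. */
    static void add_flag_cmpset(atomic_uint *p, unsigned flag)
    {
        unsigned old;
        do {
            old = atomic_load(p);                /* separate re-read */
        } while (!atomic_compare_exchange_strong(p, &old, old | flag));
    }

    /* fcmpset style: a failed compare-exchange already leaves the current
     * value in 'old', so the loop retries without a separate read. */
    static void add_flag_fcmpset(atomic_uint *p, unsigned flag)
    {
        unsigned old = atomic_load(p);           /* single initial read */
        while (!atomic_compare_exchange_weak(p, &old, old | flag))
            ;                                    /* 'old' already updated */
    }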


# 567a6398 18-May-2019 Matthew Dillon <dillon@apollo.backplane.com>

kernel - VM rework part 11 - Core pmap work to remove terminal PVs

* Remove pv_entry_t belonging to terminal PTEs. The pv_entry's for
PT, PD, PDP, and PML4 remain. This reduces kernel memory use for
pv_entry's by 99%.

The pmap code now iterates vm_object->backing_list (of vm_map_backing
structures) to run-down pages for various operations.

* Remove vm_page->pv_list. This was one of the biggest sources of
contention for shared faults. However, in this first attempt I
am leaving all sorts of ref-counting intact so the contention has
not been entirely removed yet.

* Current hacks:

- Dynamic page table page removal currently disabled because the
vm_map_backing scan needs to be able to deterministically
run-down PTE pointers. Removal only occurs at program exit.

- PG_DEVICE_IDX probably isn't being handled properly yet.

- Shared page faults not yet optimized.

* So far minor improvements in performance across the board.
This is relatively unoptimized. The buildkernel test improves
by 2% and the zero-fill fault test improves by around 10%.

Kernel memory use is improved (reduced) enormously.


# 530e94fc 17-May-2019 Matthew Dillon <dillon@apollo.backplane.com>

kernel - VM rework part 9 - Precursor work for terminal pv_entry removal

* Cleanup the API a bit

* Get rid of pmap_enter_quick()

* Remove unused procedures.

* Document that vm_page_protect() (and thus the related
pmap_page_protect()) must be called with a hard-busied page. This
ensures that the operation does not race a new pmap_enter() of the page.
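
A sketch of the calling convention this documents, assuming DragonFly's vm_page_busy_wait()/vm_page_wakeup() helpers; the real call sites differ in detail:

    /* The page must be hard-busied across the protection change so a
     * concurrent pmap_enter() cannot re-enter it while it is downgraded. */
    vm_page_busy_wait(m, FALSE, "pgprot");
    vm_page_protect(m, VM_PROT_READ);    /* ends up in pmap_page_protect() */
    vm_page_wakeup(m);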


# 67e7cb85 14-May-2019 Matthew Dillon <dillon@apollo.backplane.com>

kernel - VM rework part 8 - Precursor work for terminal pv_entry removal

* Adjust structures so the pmap code can iterate backing_ba's with
just the vm_object spinlock.

Add a ba.pmap back-pointer.

Move entry->start and entry->end into the ba (ba.start, ba.end).
This replicates the base entry->ba.start and entry->ba.end,
but local modifications are locked by individual objects to allow
pmap ops to just look at backing ba's iterated via the object.

Remove the entry->map back-pointer.

Remove the ba.entry_base back-pointer.

* ba.offset is now an absolute offset and not additive. Adjust all code
that calculates and uses ba.offset (fortunately it is all concentrated
in vm_map.c and vm_fault.c).

* Refactor ba.start/offset/end modifications to be atomic with
the necessary spin-locks to allow the pmap code to safely iterate
the vm_map_backing list for a vm_object.

* Test VM system with full synth run.


# 5b329e62 11-May-2019 Matthew Dillon <dillon@apollo.backplane.com>

kernel - VM rework part 7 - Initial vm_map_backing index

* Implement a TAILQ and hang vm_map_backing structures off
of the related object. This feature is still in progress
and will eventually be used to allow pmaps to manipulate
vm_page's without pv_entry's.

At the same time, remove all sharing of vm_map_backing.
For example, clips no longer share the vm_map_backing. We
can't share the structures if they are being used to
itemize areas for pmap management.

TODO - reoptimize this at some point.

TODO - not yet quite deterministic enough for pmap
searches (due to clips).

* Refactor vm_object_reference_quick() to again allow
operation on any vm_object whose ref_count is already
at least 1, or which belongs to a vnode. The ref_count
is no longer being used for complex vm_object collapse,
shadowing, or migration code.

This allows us to avoid a number of unnecessary token
grabs on objects during clips, shadowing, and forks.

* Cleanup a few fields in vm_object. Name TAILQ_ENTRY()
elements blahblah_entry instead of blahblah_list.

* Fix an issue with a.out binaries (that are still supported but
nobody uses) where the object refs on the binaries were not
being properly accounted for.
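
The list arrangement described above can be sketched with the <sys/queue.h> macros; the structure and field names below are simplified stand-ins for the real vm_object/vm_map_backing definitions:

    #include <sys/queue.h>

    struct backing;                             /* simplified vm_map_backing */

    struct object {                             /* simplified vm_object */
        TAILQ_HEAD(backing_head, backing) backing_list;
    };

    struct backing {
        TAILQ_ENTRY(backing) entry;             /* "*_entry" naming, per above */
        struct object *object;
    };

    static void object_init(struct object *obj)
    {
        TAILQ_INIT(&obj->backing_list);
    }

    /* Hang a backing structure off its object; in the kernel this happens
     * with the object's spinlock held. */
    static void backing_attach(struct object *obj, struct backing *ba)
    {
        ba->object = obj;
        TAILQ_INSERT_TAIL(&obj->backing_list, ba, entry);
    }

    /* The pmap code can then TAILQ_FOREACH(ba, &obj->backing_list, entry)
     * to visit every mapping range that references the object's pages. */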


# 8492a2fe 10-May-2019 Matthew Dillon <dillon@apollo.backplane.com>

kernel - VM rework part 5 - Cleanup

* Cleanup vm_map_entry_shadow()

* Remove (unused) vmspace_president_count()
Remove (barely used) struct lwkt_token typedef.

* Cleanup the vm_map_aux, vm_map_entry, vm_map, and vm_object
structures

* Adjustments to in-code documentation


# 1c024bc6 10-May-2019 Matthew Dillon <dillon@apollo.backplane.com>

kernel - VM rework part 4 - Implement vm_fault_collapse()

* Add the function vm_fault_collapse(). This function simulates
faults to copy all pages from backing objects into the front
object, allowing the backing objects to be disconnected
from the map entry.

This function is called under certain conditions from the
vmspace_fork*() code prior to a fork to potentially collapse
the entry's backing objects into the front object. The
caller then disconnects the backing objects, truncating the
list to a single object (the front object).

This optimization is necessary to prevent the backing_ba list
from growing in an unbounded fashion. In addition, being able
to disconnect the graph allows redundant backing store to
be freed more quickly, reducing memory use.

* Add sysctl vm.map_backing_shadow_test (default enabled).
The vmspace_fork*() code now does a quick all-shadowed test on
the first backing object and calls vm_fault_collapse()
if it comes back true, regardless of the chain length.

* Add sysctl vm.map_backing_limit (default 5).
The vmspace_fork*() code calls vm_fault_collapse() when the
ba.backing_ba list exceeds the specified number of entries.

* Performance is a tad faster than the original collapse
code.
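
The fork-time decision described above reduces to roughly the following; the variable names and argument lists are illustrative, only the two sysctls are taken from the commit text:

    /* Hypothetical sketch of the vmspace_fork*() collapse decision. */
    if (shadow_test_enabled && first_backing_all_shadowed(entry)) {
        vm_fault_collapse(map, entry);   /* regardless of chain length */
    } else if (backing_chain_length(entry) > map_backing_limit) {
        vm_fault_collapse(map, entry);   /* list exceeded vm.map_backing_limit */
    }
    /* The caller then truncates the entry's list to the front object. */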


# 9de48ead 09-May-2019 Matthew Dillon <dillon@apollo.backplane.com>

kernel - VM rework part 2 - Replace backing_object with backing_ba

* Remove the vm_object based backing_object chains and all related
chaining code.

This removes an enormous number of locks from the VM system and
also removes object-to-object dependencies which required careful
traversal code. A great deal of complex code has been removed
and replaced with far simpler code.

Ultimately the intention will be to support removal of pv_entry
tracking from vm_pages to gain lockless shared faults, but that
is far in the future. It will require hanging vm_map_backing
structures off of a list based in the object.

* Implement the vm_map_backing structure which is embedded in the
vm_map_entry and then links to additional dynamically allocated
vm_map_backing structures via entry->ba.backing_ba. This structure
contains the object and offset and essentially takes over the
functionality that object->backing_object used to have.

backing objects are now handled via vm_map_backing. In this
commit, fork operations create a fan-in tree to shared subsets
of backings via vm_map_backing. In this particular commit,
these subsets are not collapsed in any way.

* Remove all the vm_map_split and collapse code. Every last line
is gone. It will be reimplemented using vm_map_backing in a
later commit.

This means that as-of this commit both recursive forks and
parent-to-multiple-children forks cause an accumulation of
inefficient lists of backing objects to occur in the parent
and children. This will begin to get addressed in part 3.

* The code no longer releases the vm_map lock (typically shared)
across (get_pages) I/O. There are no longer any chaining locks to
get in the way (hopefully). This means that the code does not
have to re-check as carefully as it did before. However, some
complexity will have to be added back in once we begin to address
the accumulation of vm_map_backing structures.

* Paging performance improved by 30-40%
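
A simplified sketch of the relationship described above; the real field names and types in vm_map.h may differ:

    struct vm_map_backing {                      /* simplified */
        struct vm_map_backing *backing_ba;       /* next (deeper) backing */
        struct vm_object      *object;           /* backing store */
        vm_ooffset_t           offset;           /* offset into the object */
    };

    struct vm_map_entry {                        /* simplified */
        /* ... */
        struct vm_map_backing ba;                /* embedded; ba.backing_ba
                                                  * chains to dynamically
                                                  * allocated ones */
    };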


# 6f76a56d 07-May-2019 Matthew Dillon <dillon@apollo.backplane.com>

kernel - VM rework part 1 - Remove shadow_list

* Remove shadow_head, shadow_list, shadow_count.

* This leaves the kernel operational but without collapse optimizations
on 'other' processes when a program exits.


# fcf6efef 02-Mar-2019 Sascha Wildner <saw@online.de>

kernel: Remove numerous #include <sys/thread2.h>.

Most of them were added when we converted spl*() calls to
crit_enter()/crit_exit(), almost 14 years ago. We can now
remove a good chunk of them again where the crit_*() calls are
no longer used.

I had to adjust some files that were relying on thread2.h
or headers that it includes coming in via other headers
that it was removed from.


# 562ffbba 20-Apr-2018 Matthew Dillon <dillon@backplane.com>

kernel - Increase vm_object hash table

* Increase table from 64 to 256 entries.

* Improve the hash algorithm considerably for better coverage.
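
The commit does not show the new hash, but "better coverage" typically means folding the high bits of the key into the index instead of masking the low bits directly; an illustrative (not actual) example:

    #include <stdint.h>

    #define OBJ_HSIZE 256                   /* table grown from 64 entries */

    /* naive: low bits of pool-allocated pointers cluster badly */
    static unsigned obj_hash_naive(const void *obj)
    {
        return ((uintptr_t)obj >> 4) & (OBJ_HSIZE - 1);
    }

    /* better coverage: mix the upper bits down before masking */
    static unsigned obj_hash_mixed(const void *obj)
    {
        uintptr_t v = (uintptr_t)obj >> 4;
        v ^= v >> 16;
        v ^= v >> 8;
        return v & (OBJ_HSIZE - 1);
    }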


# 641f3b0a 02-Nov-2017 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Refactor vm_fault and vm_map a bit.

* Allow the virtual copy feature to be disabled via a sysctl.
Default enabled.

* Fix a bug in the virtual copy test. Multiple elements were
not being retested after reacquiring the map lock.

* Change the auto-partitioning of vm_map_entry structures from
16MB to 32MB. Add a sysctl to allow the feature to be disabled.
Default enabled.

* Cleanup map->timestamp bumps. Basically we bump it in
vm_map_lock(), and also fix a bug where it was not being
bumped after relocking the map in the virtual copy feature.

* Fix an incorrect assertion in vm_map_split(). Refactor tests
in vm_map_split(). Also, acquire the chain lock for the VM
object in the caller to vm_map_split() instead of in vm_map_split()
itself, allowing us to include the pmap adjustment within the
locked area.

* Make sure OBJ_ONEMAPPING is cleared for nobject in vm_map_split().

* Fix a bug in a call to vm_map_transition_wait() that
double-locked the vm_map in the partitioning code.

* General cleanups in vm/vm_object.c


# 46b71cbe 06-Oct-2017 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Refuse to swapoff under certain conditions

* Both tmpfs and vn can't handle swapoff's method of bringing pages
back in from the swap partition being decommissioned.

* Fixing this properly is fairly involved. The normal swapoff procedure
is to page swap into the related VM object, but tmpfs and vn use their
VM objects ONLY to track swap blocks and not for vm_page manipulation,
so that just won't work. In addition, the swap code may associate
a swap block with a VM object before issuing the write I/O to page
out the data, and the swapoff code's asynchronous pagein might cause
problems.

For now, just make sure that swapoff refuses to remove the partition
under these conditions, so it doesn't blow up tmpfs or vn.


# 0062b9ff 26-Jan-2017 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Remove object->agg_pv_list_count

* Remove the object->agg_pv_list_count field. It represents an unnecessary
global cache bounce, was only being used to help report vkernel RSS,
and wasn't working very well anyway.


# 76f1911e 23-Jan-2017 Matthew Dillon <dillon@apollo.backplane.com>

kernel - pmap and vkernel work

* Remove the pmap.pm_token entirely. The pmap is currently protected
primarily by fine-grained locks and the vm_map lock. The intention
is to eventually be able to protect it without the vm_map lock at all.

* Enhance pv_entry acquisition (representing PTE locations) to include
a placemarker facility for non-existent PTEs, allowing the PTE location
to be locked whether a pv_entry exists for it or not.

* Fix dev_dmmap (struct dev_mmap) (for future use), it was returning a
page index for physical memory as a 32-bit integer instead of a 64-bit
integer.

* Use pmap_kextract() instead of pmap_extract() where appropriate.

* Put the token contention test back in kern_clock.c for real kernels
so token contention shows up as sys% instead of idle%.

* Modify the pmap_extract() API to also return a locked pv_entry,
and add pmap_extract_done() to release it. Adjust users of
pmap_extract().

* Change madvise/mcontrol MADV_INVAL (used primarily by the vkernel)
to use a shared vm_map lock instead of an exclusive lock. This
significantly improves the vkernel's performance and significantly
reduces stalls and glitches when typing in one under heavy loads.

* The new placemarkers also have the side effect of fixing several
difficult-to-reproduce bugs in the pmap code, by ensuring that
shared and unmanaged pages are properly locked whereas before only
managed pages (with pv_entry's) were properly locked.

* Adjust the vkernel's pmap code to use atomic ops in numerous places.

* Rename the pmap_change_wiring() call to pmap_unwire(). The routine
was only being used to unwire (and could only safely be called for
unwiring anyway). Remove the unused 'wired' and the 'entry'
arguments.

Also change how pmap_unwire() works to remove a small race condition.

* Fix race conditions in the vmspace_*() system calls which could lead
to pmap corruption. Note that the vkernel did not trigger any of
these conditions, I found them while looking for another bug.

* Add missing maptypes to procfs's /proc/*/map report.


# a17c6c05 18-Jan-2017 Antonio Huete Jimenez <tuxillo@quantumachine.net>

kernel: Add a new vm_object_init()


# 4f077c8a 18-Jan-2017 Antonio Huete Jimenez <tuxillo@quantumachine.net>

kernel: Rename vm_object_init() to vm_object_init1()

- No functional change.


# fde6be6a 03-Jan-2017 Matthew Dillon <dillon@apollo.backplane.com>

kernel - vm_object work

* Adjust OBJT_SWAP object management to be more SMP friendly. The hash
table now uses a combined structure to reduce unnecessary cache
interactions.

* Allocate VM objects via kmalloc() instead of zalloc. Remove the zalloc
pool for VM objects and use kmalloc(). Early initialization of the kernel
does not have to access vm_object allocation functions until after basic
VM initialization.

* Remove a vm_page_cache console warning that is no longer applicable.
(It could be triggered by the RSS rlimit handling code).


# 534ee349 28-Dec-2016 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Implement RLIMIT_RSS, Increase maximum supported swap

* Implement RLIMIT_RSS by forcing pages out to swap if a process's RSS
exceeds the rlimit. Currently the algorithm used to choose the pages
is fairly unsophisticated (we don't have the luxury of a per-process
vm_page_queues[] array).

* Implement the swap_user_async sysctl, default off. This sysctl can be
set to 1 to enable asynchronous paging in the RSS code. This is mostly
for testing and is not recommended since it allows the process to eat
memory more quickly than it can be paged out.

* Reimplement vm.swap_burst_read so the sysctl now specifies the number
of pages that are allowed to be burst. Still disabled by default (will
be enabled in a followup commit).

* Fix an overflow in the nswap_lowat and nswap_hiwat calculations.

* Refactor some of the pageout code to support synchronous direct
paging, which the RSS code uses. The new code also implements a
feature that will move clean pages to PQ_CACHE, making them immediately
reallocatable.

* Refactor the vm_pageout_deficit variable, using atomic ops.

* Fix an issue in vm_pageout_clean() (originally part of the inactive scan)
which prevented clustering from operating properly on write.

* Refactor kern/subr_blist.c and all associated code that uses it to increase
swblk_t from int32_t to int64_t, and to increase the radix supported from
31 bits to 63 bits.

This increases the maximum supported swap from 2TB to some ungodly large
value. Remember that, by default, space for up to 4 swap devices
is preallocated so if you are allocating insane amounts of swap it is
best to do it with four equal-sized partitions instead of one so kernel
memory is efficiently allocated.

* There are two kernel data structures associated with swap. The blmeta
structure which has approximately a 1:8192 ratio (ram:swap) and is
pre-allocated up-front, and the swmeta structure whose KVA is reserved
but not allocated.

The swmeta structure has a 1:341 ratio. It tracks swap assignments for
pages in vm_object's. The kernel limits the number of structures to
approximately half of physical memory, meaning that if you have a machine
with 16GB of ram the maximum amount of swapped-out data you can support
with that is 16/2*341 = 2.7TB. Not that you would actually want to eat
half your ram just to do that.

A large system with, say, 128GB of ram, would be able to support
128/2*341 = 21TB of swap. The ultimate limitation is the 512GB of KVM.
The swap system can use up to 256GB of this so the maximum swap currently
supported by DragonFly on a machine with > 512GB of ram is going to be
256/2*341 = 43TB. To expand this further would require some adjustments
to increase the amount of KVM supported by the kernel.

* WARNING! swmeta is allocated via zalloc(). Once allocated, the memory
can be reused for swmeta but cannot be freed for use by other subsystems.
You should only configure as much swap as you are willing to reserve ram
for.
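
The capacity figures quoted above follow directly from the 1:341 swmeta ratio and the "half of physical memory" cap; a small worked example (standalone, not kernel code):

    #include <stdio.h>

    /* Maximum swap (in GB) for a given amount of RAM (in GB), using the
     * 1:341 swmeta ratio and the half-of-RAM cap quoted in the commit. */
    static unsigned long max_swap_gb(unsigned long ram_gb)
    {
        return ram_gb / 2 * 341;
    }

    int main(void)
    {
        printf("16GB ram  -> %lu GB (~2.7TB)\n", max_swap_gb(16));   /* 2728 */
        printf("128GB ram -> %lu GB (~21TB)\n",  max_swap_gb(128));  /* 21824 */
        /* Beyond 512GB of ram the 256GB KVM reservation is the limit,
         * so the ceiling is 256/2*341 = 43648 GB (~43TB). */
        printf("KVM cap   -> %lu GB (~43TB)\n",  max_swap_gb(256));
        return 0;
    }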


# 3b2f3463 18-Jul-2016 zrj <rimvydas.jasinskas@gmail.com>

sys: Various include guard fixes.


# c66c7e2f 25-Jan-2016 zrj <rimvydas.jasinskas@gmail.com>

Correct BSD License clause numbering from 1-2-4 to 1-2-3.


# 15553805 30-Dec-2014 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Fix a major (pageable) memory leak

* Under certain relatively easy to reproduce conditions an extra ref_count
can be added to a VM object during a fork(), preventing the object from
ever being destroyed. Its pages may even be paged out, but the system
will eventually run out of swap space too.

* The actual fix is to assign 'map_object = object' in vm_map_insert()
(see the diff). The rest of this commit is conditionalized debugging
code and code documentation.

* Because this change implements a relatively esoteric feature in the VM
system by allowing an anonymous VM object to be extended to cover an
area even though it might have a gap (so a new VM object does not have
to be allocated), further testing is needed before we can MFC this to
the RELEASE branch.


# 99ebfb7c 06-May-2014 Sascha Wildner <saw@online.de>

kernel: Fix some boolean_t vs. int confusion.

When boolean_t is defined to be _Bool instead of int (not part of this
commit), this is what gcc is sad about.
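
One common instance of such confusion, shown as an illustrative example rather than the actual diff: a prototype and a definition that spell the same parameter boolean_t and int. With boolean_t defined as int the two are compatible; with _Bool they are not, and gcc reports conflicting types:

    typedef _Bool boolean_t;        /* previously: typedef int boolean_t; */

    void set_flag(boolean_t on);    /* header prototype uses boolean_t */

    void set_flag(int on)           /* old definition spelled it 'int';
                                     * gcc: conflicting types for 'set_flag' */
    {
        (void)on;
    }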

