| 8078b160 | 03-Jun-2021 |
Aaron LI <aly@aaronly.me> |
pmap: Eliminate a simple macro 'pte_load_clear()'
First, this macro is not used in vkernel64's pmap code. Secondly, the macro seems out of place and looks unrelated to everything else in the pmap.h header. So just substitute it in the pmap code and get rid of it.
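A hedged sketch of what the substitution amounts to, assuming pte_load_clear() simply wrapped an atomic read-and-clear of a PTE; the real pmap code uses the kernel's own atomic primitives rather than compiler builtins, and the helper name below is hypothetical.

    /* Illustrative only, not the real pmap.h. */
    #include <stdint.h>

    typedef uint64_t pt_entry_t;

    /* Former one-off macro: atomically fetch a PTE and replace it with 0. */
    #define pte_load_clear(ptep) \
            __atomic_exchange_n((ptep), (pt_entry_t)0, __ATOMIC_SEQ_CST)

    /* After the change, callers open-code the same atomic swap directly. */
    static pt_entry_t
    remove_pte_sketch(pt_entry_t *ptep)
    {
            return (__atomic_exchange_n(ptep, (pt_entry_t)0, __ATOMIC_SEQ_CST));
    }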
|
| 7e0dbbc6 | 03-Jun-2021 |
Aaron LI <aly@aaronly.me> |
vm/pmap.h: Move vtophys() and vtophys_pte() macros here
The two macros are defined in terms of pmap_kextract(), which is also declared in this header file, so it's a better place to hold the two macros.
In addition, this adjustment avoids duplicating them in both pc64 and vkernel64.
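The relocated macros are thin wrappers around pmap_kextract(); roughly as follows, with stand-in typedefs, and the exact casts may differ from the real header.

    #include <stdint.h>

    typedef uint64_t  vm_paddr_t;   /* stand-ins for the kernel's types */
    typedef uintptr_t vm_offset_t;
    typedef uint64_t  pt_entry_t;

    vm_paddr_t pmap_kextract(vm_offset_t va);   /* already declared in <vm/pmap.h> */

    #define vtophys(va)      pmap_kextract((vm_offset_t)(va))
    #define vtophys_pte(va)  ((pt_entry_t)pmap_kextract((vm_offset_t)(va)))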
|
| 95270b7e | 01-Feb-2017 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Many fixes for vkernel support, plus a few main kernel fixes
REAL KERNEL
* The big enchilada is that the main kernel's thread switch code has a small timing window where it clears the PM_ACTIVE bit for the cpu while switching between two threads. However, it *ALSO* checks and avoids loading the %cr3 if the two threads have the same pmap.
This results in a situation where an invalidation of the pmap on another cpu may not have visibility to the cpu doing the switch, and yet the cpu doing the switch also decides not to reload %cr3 and so does not invalidate the TLB either. The result is a stale TLB and bad things happen.
For now just unconditionally load %cr3 until I can come up with code to handle the case (see the sketch after this list).
This bug is very difficult to reproduce on a normal system, it requires a multi-threaded program doing nasty things (munmap, etc) on one cpu while another thread is switching to a third thread on some other cpu.
* KNOTE after handling the vkernel trap in postsig() instead of before.
* Change the kernel's pmap_inval_smp() code to take a 64-bit npgs argument instead of a 32-bit npgs argument. This fixes situations that crop up when a process uses more than 16TB of address space.
* Add an lfence to the pmap invalidation code that I think might be needed.
* Handle some wrap/overflow cases in pmap_scan() related to the use of large address spaces.
* Fix an unnecessary invltlb in pmap_clearbit() for unmanaged PTEs.
* Test PG_RW after locking the pv_entry to handle potential races.
* Add bio_crc to struct bio. This field is only used for debugging for now but may come in useful later.
* Add some global debug variables in the pmap_inval_smp() and related paths. Refactor the npgs handling.
* Load the tsc_target field after waiting for completion of the previous invalidation op instead of before. Also add a conservative mfence() in the invalidation path before loading the info fields.
* Remove the global pmap_inval_bulk_count counter.
* Adjust swtch.s to always reload the user process %cr3, with an explanation. FIXME LATER!
* Add some test code to vm/swap_pager.c which double-checks that the page being paged out does not get corrupted during the operation. This code is #if 0'd.
* We must hold an object lock around the swp_pager_meta_ctl() call in swp_pager_async_iodone(). I think.
* Reorder when PG_SWAPINPROG is cleared. Finish the I/O before clearing the bit.
* Change the vm_map_growstack() API to pass a vm_map in instead of curproc.
* Use atomic ops for vm_object->generation counts, since objects can be locked shared.
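To make the first item concrete, here is a minimal C sketch of the race and the interim fix, assuming a simplified rendition of the switch path; the real logic lives in swtch.s and the pmap code, and every name here (struct sw_pmap, struct sw_thread, cpu_switch_sketch) is illustrative.

    #include <stdint.h>

    struct sw_pmap {
            volatile uint64_t pm_active;  /* mask of cpus using this pmap */
            uint64_t          pm_cr3;     /* phys addr of top-level page table */
    };

    struct sw_thread {
            struct sw_pmap *td_pmap;
    };

    extern void load_cr3(uint64_t pa);    /* reload %cr3, flushing non-global TLB */

    static void
    cpu_switch_sketch(struct sw_thread *oldtd, struct sw_thread *newtd, int cpuid)
    {
            /*
             * Window opens: this cpu leaves the old pmap's PM_ACTIVE set, so a
             * remote invalidation of that pmap may no longer be sent to us.
             */
            __atomic_and_fetch(&oldtd->td_pmap->pm_active, ~(1ULL << cpuid),
                __ATOMIC_SEQ_CST);

            /*
             * The old optimization skipped this reload when oldtd and newtd
             * shared a pmap, so a TLB entry invalidated remotely inside the
             * window could survive (stale TLB).  Interim fix: reload always.
             */
            load_cr3(newtd->td_pmap->pm_cr3);

            __atomic_or_fetch(&newtd->td_pmap->pm_active, 1ULL << cpuid,
                __ATOMIC_SEQ_CST);
    }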
VKERNEL
* Unconditionally save the FP state after returning from VMSPACE_CTL_RUN. This solves a severe FP corruption bug in the vkernel due to calls it makes into libc (which uses %xmm registers all over the place).
This is not a complete fix. We need a formal userspace/kernelspace FP abstraction. Right now the vkernel doesn't have a kernelspace FP abstraction so if a kernel thread switches preemptively bad things happen.
* The kernel tracks and locks pv_entry structures to interlock pte's. The vkernel never caught up, and does not really have a pv_entry or placemark mechanism. The vkernel's pmap really needs a complete re-port from the real-kernel pmap code. Until then, we use poor hacks.
* Use the vm_page's spinlock to interlock pte changes.
* Make sure that PG_WRITEABLE is set or cleared with the vm_page spinlock held.
* Have pmap_clearbit() acquire the pmobj token for the pmap in the iteration. This appears to be necessary, currently, as most of the rest of the vkernel pmap code also uses the pmobj token.
* Fix bugs in the vkernel's swapu32() and swapu64().
* Change pmap_page_lookup() and pmap_unwire_pgtable() to fully busy the page. Note however that a page table page is currently never soft-busied. Also adjust other vkernel code that busies a page table page.
* Fix some sillycode in a pmap->pm_ptphint test.
* Don't inherit e.g. PG_M from the previous pte when overwriting it with a pte of a different physical address.
* Change the vkernel's pmap_clear_modify() function to clear VPTE_RW (which also clears VPTE_M), and not just VPTE_M. Formally we want the vkernel to be notified when a page becomes modified and it won't be unless we also clear VPTE_RW and force a fault (a sketch follows this list). <--- I may change this back after testing.
* Wrap pmap_replacevm() with a critical section.
* Scrap the old grow_stack() code. vm_fault() and vm_fault_page() handle it (vm_fault_page() just now got the ability).
* Properly flag VM_FAULT_USERMODE.
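A minimal sketch of the pmap_clear_modify() change noted above (clear VPTE_RW together with VPTE_M so the next write faults and the vkernel learns the page was modified again); the bit values and helper name are assumptions, not the real vkernel definitions.

    #include <stdint.h>

    typedef uint64_t vpte_t;

    #define VPTE_RW 0x0000000000000002ULL   /* illustrative bit value */
    #define VPTE_M  0x0000000000000040ULL   /* illustrative bit value */

    static void
    vkernel_clear_modify_sketch(volatile vpte_t *ptep)
    {
            /* Previously only VPTE_M was cleared; now VPTE_RW goes too. */
            __atomic_and_fetch(ptep, ~(VPTE_RW | VPTE_M), __ATOMIC_SEQ_CST);
    }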
|
| 76f1911e | 23-Jan-2017 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - pmap and vkernel work
* Remove the pmap.pm_token entirely. The pmap is currently protected primarily by fine-grained locks and the vm_map lock. The intention is to eventually be able to protect it without the vm_map lock at all.
* Enhance pv_entry acquisition (representing PTE locations) to include a placemarker facility for non-existent PTEs, allowing the PTE location to be locked whether a pv_entry exists for it or not.
* Fix dev_dmmap (struct dev_mmap) (for future use), it was returning a page index for physical memory as a 32-bit integer instead of a 64-bit integer.
* Use pmap_kextract() instead of pmap_extract() where appropriate.
* Put the token contention test back in kern_clock.c for real kernels so token contention shows up as sys% instead of idle%.
* Modify the pmap_extract() API to also return a locked pv_entry, and add pmap_extract_done() to release it. Adjust users of pmap_extract() (see the sketch after this list).
* Change madvise/mcontrol MADV_INVAL (used primarily by the vkernel) to use a shared vm_map lock instead of an exclusive lock. This significantly improves the vkernel's performance and significantly reduces stalls and glitches when typing in one under heavy loads.
* The new placemarkers also have the side effect of fixing several difficult-to-reproduce bugs in the pmap code, by ensuring that shared and unmanaged pages are properly locked whereas before only managed pages (with pv_entry's) were properly locked.
* Adjust the vkernel's pmap code to use atomic ops in numerous places.
* Rename the pmap_change_wiring() call to pmap_unwire(). The routine was only being used to unwire (and could only safely be called for unwiring anyway). Remove the unused 'wired' and the 'entry' arguments.
Also change how pmap_unwire() works to remove a small race condition.
* Fix race conditions in the vmspace_*() system calls which could lead to pmap corruption. Note that the vkernel did not trigger any of these conditions, I found them while looking for another bug.
* Add missing maptypes to procfs's /proc/*/map report.
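A rough usage sketch of the reworked pmap_extract()/pmap_extract_done() pair from the item above, with stand-in typedefs; representing the locked pv_entry/placemarker as an opaque void pointer is an assumption for illustration.

    #include <stdint.h>

    typedef uint64_t  vm_paddr_t;
    typedef uintptr_t vm_offset_t;
    typedef struct pmap *pmap_t;

    /* New shape: the extract returns with the backing PTE location locked. */
    vm_paddr_t pmap_extract(pmap_t pmap, vm_offset_t va, void **handlep);
    void       pmap_extract_done(void *handle);

    static vm_paddr_t
    extract_sketch(pmap_t pmap, vm_offset_t va)
    {
            void *handle;
            vm_paddr_t pa;

            pa = pmap_extract(pmap, va, &handle);  /* PTE location interlocked */
            /* ... use pa while the lock is held ... */
            pmap_extract_done(handle);             /* release it */
            return (pa);
    }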
|