1. Create and use an interrupt stack.
    Well actually, use the master SP for kernel stacks instead of
    the interrupt SP.  Right now we use the interrupt stack for
    everything.  Doing so allows for more accurate accounting of
    systime.  In theory, it could also allow for smaller kernel
    stacks but we only use one page anyway.

2. Copy/clear primitives could be tuned.
    What is best is highly CPU and cache dependent.  One thing to look
    at is the copyin/copyout primitives.  Rather than looping using
    MOVS instructions, you could map an entire page at a time and use
    bcopy, MOVE16, or whatever.  This would lose big on the VAC models
    however.

3. Sendsig/sigreturn are pretty bogus.
    Currently we can call a signal handler even if an exception
    occurs in the middle of an instruction.  This causes the handler
    to return right back to the middle of the offending instruction
    which will most likely lead to another exception/signal.
    Technically, I feel this is the correct behavior but it requires
    saving a lot of state on the user's stack, state that we don't
    really want the user messing with.  Other 68k implementations
    (e.g. Sun) will delay signals or abort execution of the current
    instruction to reduce saved state.  Even if we stick with the
    current philosophy, the code could be cleaned up.

4. Ditto for AST and software interrupt emulation.
    Both are possibly over-elaborate and inefficiently implemented.
    We could possibly handle them by using an appropriately planted
    PS trace bit.

5. Make use of transparent translation registers on 030/040 MMU.
    With a little rearranging of the KVA space we could use one to
    map the entire external IO space [ 600000 - 20000000 ).  Since
    the translation must be 1-1, this would limit the kernel to 6MB
    (some would say that is hardly a limit) or divide it into two
    pieces.  Another promising use would be to map physical memory
    within the kernel.  This allows a much simpler and more efficient
    implementation of /dev/mem, pmap_zero_page, pmap_copy_page and
    possibly even kernel-user cross address space copies.  However,
    it does eat up a significant piece of kernel address space.

6. Create a 32-bit timer.
    Timers 2 and 3 on the MC6840 clock chip can be concatenated together to
    get a 32-bit countdown timer.  There are at least three uses for this:
        1. Monitor the interval timer ("clock") to detect lost "ticks".
           (Idea from Scott Marovich)
        2. Implement the DELAY macro properly instead of approximating with
           the current "while (--count);" loop.  Because of caches, the current
           method is potentially way off.
        3. Export as a user-mappable timer for high-precision (4us) timing.
    Note that by doing this we can no longer use timer 3 as a separate
    statistics/profiling timer.  Should be able to compile-time (runtime?)
    select between the two.

7. Conditional MMU code should be restructured.
    Right now it reflects the evolutionary path of the code: 320/350 MMU
    was supported and PMMU support was glued on.  The latter can be ifdef'ed
    out when not needed, but not all of the former (e.g. ``mmutype'' tests).
    Also, PMMU is made to look like the HP MMU, somewhat hamstringing it.
    Since HP MMU models are dead, the excess baggage should fall on them
    instead (though it could be argued that they benefit more from the minor
    performance impact).  MMU code should probably not be ifdef'ed on model
    type, but rather on more relevant tags (e.g. MMU_HP, MMU_MOTO).

8. Redo cache handling.
    There are way too many routines which are specific to particular
    cache types.  We should be able to come up with a more coherent
    scheme (though HP 68k boxes have just about every caching scheme
    imaginable: internal/external, physical/virtual, writeback/writethrough).
    See, for example, Wheeler and Bershad in ASPLOS 92.  For more efficient
    handling of physical caches see also Kessler and Hill in Nov. 92 TOCS.

9. Sort the free page list.
    The DMA hardware on the 300 cannot do scatter/gather IO.  For example,
    if an 8k system buffer consists of two non-contiguous physical pages
    it will require two DMA transfers (and hence two interrupts) to do the
    operation.  It would take only one transfer if they were physically
    contiguous.  By keeping the free list ordered we could potentially
    allocate contiguous pages and reduce the number of interrupts.  We can
    consider doing this since pages in the free list are not reclaimed and
    thus we don't have to worry about distorting any LRU behavior.
----
Mike Hibler
University of Utah CSS group
mike@cs.utah.edu