27852ebe | 21-Feb-2016 |
David van Moolenbroek <david@minix3.org> |
UDS: full rewrite
This new implementation of the UDS service is built on top of the libsockevent library. It thereby inherits all the advantages that libsockevent brings. However, the fundamental restructuring required for that change also paved the way for resolution of a number of other important open issues with the old UDS code. Most importantly, the rewrite brings the behavior of the service much closer to POSIX compliance and NetBSD compatibility. These are the most important changes:
- due to the use of libsockevent, UDS now supports multiple suspending calls per socket and a large number of standard socket flags and options;
- socket address matching is now based on <device,inode> lookups instead of canonized path names, and socket addresses are no longer altered either due to canonization or at connect time;
- the socket state machine is now well defined, most importantly resolving the erroneous reset-on-EOF semantics of the old UDS, but also allowing socket reuse;
- sockets are now connected before being accepted instead of being held in connecting state, unless the LOCAL_CONNWAIT option is set on either the connecting or the listening socket;
- connect(2) on datagram sockets is now supported (needed by syslog), and proper datagram socket disconnect notification is provided;
- the receive queue now supports segmentation, associating ancillary data (in-flight file descriptors and credentials) with each segment instead of being kept fully separately; this is a POSIX requirement (and needed by tmux);
- as part of the segmentation support, the receive queue can now hold as many packets as can fit, instead of one;
- in addition to the flags supported by libsockevent, the MSG_PEEK, MSG_WAITALL, MSG_CMSG_CLOEXEC, MSG_TRUNC, and MSG_CTRUNC send and receive flags are now supported;
- the SO_PASSCRED and SO_PEERCRED socket options are replaced by LOCAL_CREDS and LOCAL_PEEREID respectively, now following NetBSD semantics and allowing use of NetBSD libc's getpeereid(3);
- memory usage is reduced by about 250 KB due to centralized in-flight file descriptor tracking, with a limit of OPEN_MAX total rather than of OPEN_MAX per socket;
- memory usage is reduced by another ~50 KB due to removal of state redundancy, despite the fact that socket path names may now be up to 253 bytes rather than the previous 104 bytes;
- compared to the old UDS, there is now very little direct indexing on the static array of sockets, thus allowing dynamic allocation of sockets more easily in the future;
- the UDS service now has RMIB support for the net.local sysctl tree, implementing preliminary support for NetBSD netstat(1).
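To illustrate two of the points above from the caller's side (MSG_PEEK support, and a receive queue that holds more than one packet), here is a minimal sketch using only standard socket calls. The helper name peek_then_read is hypothetical and is not part of the UDS code; it only shows the behavior an application could expect from a conforming implementation.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: queue two datagrams on one end of a UNIX domain
 * socket pair, peek at the first one without consuming it, then read
 * both. Returns 0 if everything behaved as expected, -1 otherwise.
 */
int peek_then_read(char first[8], char second[8])
{
	int sv[2];
	char peeked[8];
	int ok = -1;

	memset(first, 0, 8);
	memset(second, 0, 8);
	memset(peeked, 0, sizeof(peeked));

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
		return -1;

	/* Two sends in a row: both packets must fit in the receive queue. */
	if (send(sv[0], "one", 4, 0) == 4 &&
	    send(sv[0], "two", 4, 0) == 4 &&
	    /* MSG_PEEK copies the first packet but leaves it queued. */
	    recv(sv[1], peeked, sizeof(peeked), MSG_PEEK) == 4 &&
	    recv(sv[1], first, 8, 0) == 4 &&
	    recv(sv[1], second, 8, 0) == 4 &&
	    strcmp(peeked, first) == 0)
		ok = 0;

	close(sv[0]);
	close(sv[1]);
	return ok;
}
```

With a receive queue limited to a single packet, the second send in this sketch would be expected to block or fail rather than succeed immediately.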
Change-Id: I4a9b6fe4aaeef0edf2547eee894e6c14403fcb32
|
491d647a | 25-Jul-2016 |
David van Moolenbroek <david@minix3.org> |
VFS: support for suspending close(2) for sockets
This change effectively adds the VFS side of support for the SO_LINGER socket option, by allowing file descriptor close operations to be suspended (and later resumed) by socket drivers. Currently, support is limited to the close(2) system call; in all other cases where file descriptors are closed (dup2, close-on-exec, process exit, and so on), the close operation still completes instantly. As a general policy, the close(2) return value will always indicate that the file descriptor has been closed: either 0, or -1 with errno set to EINPROGRESS. The latter error may be returned only when a suspended close is interrupted by a signal.
As necessary for UDS, this change also introduces a closenb(2) system call extension, allowing the caller to bypass blocking SO_LINGER close behavior. This extension allows UDS to avoid blocking on closing the last reference to an in-flight file descriptor, in an atomic fashion. The extension is currently part of libsys, but there is no reason why userland would not be allowed to make this call, so it is deliberately not protected from use by userland.
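From an application's point of view, the close(2) policy described above could be handled roughly as follows. This is a sketch under stated assumptions: close_lingering is a hypothetical helper name, and the treatment of EINPROGRESS as "closed, but lingering was interrupted" follows the policy stated in this commit message rather than any guarantee made by other systems.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: enable SO_LINGER with the given timeout and close
 * the socket.
 * Returns  0 if the descriptor was closed normally,
 *          1 if it was closed but the lingering close was interrupted
 *            (EINPROGRESS, per the policy described above),
 *         -1 on any other error (errno is left set by the failing call).
 */
int close_lingering(int fd, int seconds)
{
	struct linger lg;

	lg.l_onoff = 1;		/* enable lingering on close */
	lg.l_linger = seconds;	/* maximum time to linger, in seconds */

	if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg)) != 0)
		return -1;	/* the descriptor has not been closed */

	if (close(fd) == 0)
		return 0;

	/* Assumption: EINPROGRESS still means the descriptor is gone. */
	return (errno == EINPROGRESS) ? 1 : -1;
}
```

The important design point is that the caller never retries close(2) after EINPROGRESS, since the descriptor number may already have been reused.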
Change-Id: Iec77d6665232110346180017fc1300b1614910b7
|
6956dd2b | 05-Oct-2016 |
David van Moolenbroek <david@minix3.org> |
libc: bugfixes for minix's poll(3) wrapper
- clear "revents" fields even when the call times out;
- do not call FD_ISSET with a negative file descriptor number.
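The caller-visible contract behind both fixes can be sketched with a small check. This is not the wrapper code itself; poll_idle is a hypothetical helper that deliberately pre-fills "revents" with stale values, polls an idle pipe together with a negative descriptor using a zero timeout, and reports what poll(2) left in each field.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: poll an idle pipe and an ignored (negative)
 * descriptor with a zero timeout. Returns the poll() result, or -1 if
 * the pipe could not be created, and stores the resulting "revents"
 * values through the two pointers.
 */
int poll_idle(short *rev_pipe, short *rev_neg)
{
	int p[2];
	struct pollfd fds[2];
	int r;

	if (pipe(p) != 0)
		return -1;

	fds[0].fd = p[0];		/* read end, no data pending */
	fds[0].events = POLLIN;
	fds[0].revents = POLLIN;	/* stale value, must be cleared */

	fds[1].fd = -1;			/* negative fd: entry is ignored */
	fds[1].events = POLLIN;
	fds[1].revents = POLLIN;	/* stale value, must be cleared */

	r = poll(fds, 2, 0);		/* zero timeout: returns at once */

	*rev_pipe = fds[0].revents;
	*rev_neg = fds[1].revents;

	close(p[0]);
	close(p[1]);
	return r;
}
```

On a conforming implementation the call times out with a return value of 0 and both "revents" fields come back as 0.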
Change-Id: I7aeaae79e73e39aed127a75495ea08256b18c182
|
3ac58492 | 24-Sep-2016 |
David van Moolenbroek <david@minix3.org> |
Add LLVM GCOV coverage support
With this patch, it is now possible to generate coverage information for MINIX3 system services with LLVM. In particular, the system can be built with MKCOVERAGE=yes, either with a native "make build" or with crosscompilation. Either way, MKCOVERAGE=yes will build the MINIX3 system services with coverage profiling support, generating a .gcno file for each source module. After a reboot it is possible to obtain runtime coverage data (.gcda files) for individual system services using gcov-pull(8). The combination of the .gcno and .gcda files can then be inspected with llvm-cov(1).
For reasons documented in minix.gcov.mk, only system service program modules are supported for now; system service libraries (libsys etc.) are not included. Userland programs are not affected by MKCOVERAGE.
The heart of this patch is the libsys code that writes data generated by the LLVM coverage hooks into a serialized format using the routines we already had for GCC GCOV. Unfortunately, the new llvm_gcov.c code is LLVM ABI dependent, and may therefore have to be updated later when we upgrade LLVM. The current implementation should support all LLVM versions 3.x with x >= 4.
The rest of this patch is mostly a light cleanup of our existing GCOV infrastructure, the most visible change being that gcov-pull(8) now takes a service label string rather than a PID number.
Change-Id: I6de055359d3d2b3f53e426f3fffb17af7877261f
|
232819dd | 04-Jan-2016 |
David van Moolenbroek <david@minix3.org> |
VFS: store process suspension state as union
Previously, VFS would use various subsets of a number of fproc structure fields to store state when the process is blocked (suspended) for various reasons. As a result, there was a fair amount of abuse of fields, hidden state, and confusion as to which fields were used with which suspension states.
Instead, the suspension state is now split into per-state structures, which are then stored in a union. Each of the union's structures should be accessed only right before, during, and right after the fp_blocked_on field is set to the corresponding blocking type. As a result, it is now very clear which fields are in use at which times, and we even save a bit of memory as a side effect.
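The general technique can be sketched as a tagged union. Every name below (fproc_sketch, the BLOCKED_* values, and the per-state members) is hypothetical and chosen for illustration only; the actual VFS structure and field names differ.

```c
#include <stddef.h>

/* Hypothetical blocking types; the tag selects the valid union member. */
enum blocked_on { BLOCKED_NONE, BLOCKED_PIPE, BLOCKED_LOCK, BLOCKED_SELECT };

struct fproc_sketch {
	enum blocked_on blocked_on;
	union {
		struct {		/* valid only for BLOCKED_PIPE */
			int fd;
			char *buf;
			size_t nbytes;
		} pipe;
		struct {		/* valid only for BLOCKED_LOCK */
			int fd;
			int cmd;
			void *arg;
		} lock;
		struct {		/* valid only for BLOCKED_SELECT */
			int nfds;
			void *fdsets;
		} sel;
	} u;
};

/* Fill in the per-state structure right before setting the tag. */
void suspend_on_pipe(struct fproc_sketch *fp, int fd, char *buf,
	size_t nbytes)
{
	fp->u.pipe.fd = fd;
	fp->u.pipe.buf = buf;
	fp->u.pipe.nbytes = nbytes;
	fp->blocked_on = BLOCKED_PIPE;
}

/* Read a per-state field only while the matching tag is set. */
size_t pending_bytes(const struct fproc_sketch *fp)
{
	return (fp->blocked_on == BLOCKED_PIPE) ? fp->u.pipe.nbytes : 0;
}
```

Because the members share storage, the structure is only as large as its largest per-state structure, which is where the memory saving mentioned above would come from.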
Change-Id: I5c24e353b6cb0c32eb41c70f89c5cfb23f6c93df
|
84ed480e | 27-Feb-2016 |
David van Moolenbroek <david@minix3.org> |
libc: fix local from-source upgrades
Commit git-c38dbb9 inadvertently broke local MINIX3-on-MINIX3 builds, since its libc changes relied on VFS being upgraded already as well. As a result, after installing the new libc, networking ceased to work, leading to curl(1) failing later on in the build process. This patch introduces transitional code that is necessary for the build process to complete, after which it is obsolete again.
Change-Id: I93bf29c01d228e3d7efc7b01befeff682954f54d
|
c38dbb97 | 21-Feb-2016 |
David van Moolenbroek <david@minix3.org> |
Prepare for switch to native BSD socket API
Currently, the BSD socket API is implemented in libc, translating the API calls to character driver operations underneath. This approach has several issues:
- it is inefficient, as most character driver operations are specific to the socket type, thus requiring that each operation start by bruteforcing the socket protocol family and type of the given file descriptor using several system calls;
- it requires that libc itself be changed every time system support for a new protocol is added;
- various parts of the libc implementations violate the asynchronous signal safety POSIX requirements.
In order to resolve all these issues at once, the plan is to turn the BSD socket calls into system calls, thus making the BSD socket API the "native" ABI, removing the complexity from libc and instead letting VFS deal with the socket calls.
The overall change is going to break all networking functionality. In order to smoothen the transition, this patch introduces the fifteen new BSD socket system calls, and makes libc try these first before falling back on the old behavior. For now, the VFS implementations of the new calls fail such that libc will always use the fallback cases. Later on, when we introduce the actual implementation of the native BSD socket calls, all statically linked programs will automatically use the new ABI, thus limiting actual application breakage.
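The "try the new call first, then fall back" pattern can be sketched as below. This is an illustration, not the libc code: the function-pointer type, the wrapper name, and both stub functions are hypothetical, and the use of ENOSYS as the "not implemented yet" signal is an assumption that the commit message does not confirm.

```c
#include <errno.h>

/* Hypothetical signature shared by the native and legacy code paths. */
typedef int (*socket_fn)(int domain, int type, int protocol);

/*
 * Try the native system call first. Fall back on the legacy path only
 * if the native call reports that it is not implemented (assumed here
 * to be signaled with ENOSYS); any other result is returned as is.
 */
int socket_with_fallback(socket_fn native, socket_fn legacy,
	int domain, int type, int protocol)
{
	int r = native(domain, type, protocol);

	if (r != -1 || errno != ENOSYS)
		return r;

	return legacy(domain, type, protocol);
}

/* Stand-in for a native call that the server does not implement yet. */
static int native_stub(int domain, int type, int protocol)
{
	(void)domain; (void)type; (void)protocol;
	errno = ENOSYS;
	return -1;
}

/* Stand-in for the old libc implementation; returns a fake descriptor. */
static int legacy_stub(int domain, int type, int protocol)
{
	(void)domain; (void)type; (void)protocol;
	return 3;
}
```

Once the native call starts succeeding, the same binary stops reaching the legacy path without being rebuilt, which matches the transition described above for statically linked programs.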
In other words: by itself, this patch does nothing, except add a bit of transitional overhead that will disappear in the future. The largest part of the patch is concerned with adding full support for the new BSD socket system calls to trace(1) - this early addition has the advantage of making system call tracing output of several socket calls much more readable already.
Both the system call interfaces and the trace(1) support have already been tested using code that will be committed later on.
Change-Id: I3460812be50c78be662d857f9d3d6840f3ca917f
|
0df28c9f | 16-Jan-2016 |
David van Moolenbroek <david@minix3.org> |
libc: reorganize vector I/O wrappers
The reorganization allows other libc system call wrappers (namely, sendmsg and recvmsg) to perform I/O vector coalescing as well.
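I/O vector coalescing means copying the scattered segments of an iovec array into one contiguous buffer so that a single-buffer operation can be issued underneath. A minimal sketch of the idea follows; coalesce_iov is a hypothetical helper, not the libc routine, and it omits the overflow checks on the total length that real code would need.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: copy all segments of an I/O vector into a single
 * newly allocated buffer. Stores the total length in *total and returns
 * the buffer (to be released with free()), or NULL if allocation fails.
 */
void *coalesce_iov(const struct iovec *iov, int iovcnt, size_t *total)
{
	size_t len = 0, off = 0;
	char *buf;
	int i;

	/* First pass: compute the total number of bytes. */
	for (i = 0; i < iovcnt; i++)
		len += iov[i].iov_len;
	*total = len;

	/* Allocate at least one byte so that success is never NULL. */
	buf = malloc(len > 0 ? len : 1);
	if (buf == NULL)
		return NULL;

	/* Second pass: append each non-empty segment in order. */
	for (i = 0; i < iovcnt; i++) {
		if (iov[i].iov_len > 0) {
			memcpy(buf + off, iov[i].iov_base, iov[i].iov_len);
			off += iov[i].iov_len;
		}
	}

	return buf;
}
```

For the receive direction the same idea runs in reverse: one contiguous buffer is filled first and then scattered back into the caller's segments.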
Change-Id: I116b48a6db39439053280ee805e0dcbdaec667a3
|
c33d6ef3 | 14-Jan-2016 |
David van Moolenbroek <david@minix3.org> |
VFS: start off cleanup of pipe2 IPC message
There is no reason to use a single message for nonoverlapping requests and replies combined, and in fact splitting them out allows reuse of messages and avoids various problems with field layouts. Since the upcoming socketpair(2) system call will be using the same reply as pipe2(2), split up the single message used for the latter. In order to keep the used parts of messages at the front, start a transitional phase to move the pipe(2) flags field to the front of its request.
Change-Id: If3f1c3d348ec7e27b7f5b7147ce1b9ef490dfab9
|
d991a2be | 09-Oct-2015 |
David van Moolenbroek <david@minix3.org> |
Retire sysuname(2), synchronize sys/utsname.h
Now that uname(3) uses sysctl(2), we no longer need sysuname(2). Backward compatibility is retained for old statically linked binaries for a short while.
Also remove the now-obsolete MINIX3-specific "arch" field from the utsname structure. While this is an ABI break at the libc level, it should pose no problems in practice, because:
- statically linked programs (i.e., all of the base system) are not affected, as they will use headers synchronized with libc;
- the structure is getting smaller, thus, older dynamically linked programs (typically in pkgsrc) using the new libc will end up with garbage in the "arch" field, but it is unlikely they will use this field anyway, since it was specific to MINIX3;
- new dynamically linked programs using an old libc could end up with memory corruption, but this is not a scenario that is expected to occur in the first place - certainly not with programs from pkgsrc.
Change-Id: I29c76576f509feacc8f996f0bd353ca8961d4917
|