
Add kqueue backend#114

Merged
mvandeberg merged 1 commit into cppalliance:develop from mvandeberg:feature/kqueue
Feb 9, 2026

Conversation


@mvandeberg mvandeberg commented Feb 9, 2026

Summary by CodeRabbit

  • New Features
    • kqueue is now the default I/O context backend on BSD/macOS.
    • Added a complete kqueue-backed backend: reactor/scheduler, async socket I/O, and connection accept support for native, edge-triggered event handling and improved concurrency/cancellation semantics.

@mvandeberg mvandeberg changed the title from "Add kqueue backend and fix ARM atomic memory access" to "Add kqueue backend" on Feb 9, 2026

coderabbitai bot commented Feb 9, 2026

📝 Walkthrough

Adds a BSD kqueue-based asynchronous I/O backend: kqueue_context, a kqueue_scheduler reactor, kqueue operation types, kqueue socket and acceptor services, and switches io_context to alias kqueue_context when BOOST_COROSIO_HAS_KQUEUE is defined.

Changes

Cohort / File(s) / Summary

• Context & Selection
  Files: include/boost/corosio/io_context.hpp, include/boost/corosio/kqueue_context.hpp, src/corosio/src/kqueue_context.cpp, perf/common/backend_selection.hpp
  Adds kqueue_context, exposes it when BOOST_COROSIO_HAS_KQUEUE is defined, and wires backend selection to recognize "kqueue".
• Scheduler
  Files: src/corosio/src/detail/kqueue/scheduler.hpp, src/corosio/src/detail/kqueue/scheduler.cpp
  New kqueue_scheduler implementing the reactor loop, descriptor registration, work accounting, signaling, and run/poll/wait APIs.
• Operation Core
  Files: src/corosio/src/detail/kqueue/op.hpp
  Adds descriptor_state and the kqueue_op hierarchy (connect/read/write/accept) with lifecycle, cancellation, and completion mechanics.
• Socket Backend
  Files: src/corosio/src/detail/kqueue/sockets.hpp, src/corosio/src/detail/kqueue/sockets.cpp
  Implements kqueue_socket_impl and kqueue_socket_service: connect/read/write, socket options, cancellation, registration with the scheduler, and endpoint management.
• Acceptor Backend
  Files: src/corosio/src/detail/kqueue/acceptors.hpp, src/corosio/src/detail/kqueue/acceptors.cpp
  Implements kqueue_acceptor_impl and kqueue_acceptor_service: listen/open, the accept flow, cancellation, and peer socket integration.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant kqueue_context
    participant kqueue_scheduler
    participant kqueue_socket_service
    participant Kernel
    Client->>kqueue_context: create socket / initiate op
    kqueue_context->>kqueue_socket_service: create_impl / open_socket
    kqueue_socket_service->>kqueue_scheduler: register_descriptor(fd, desc)
    kqueue_socket_service->>Kernel: socket non-blocking / connect()/listen()
    Kernel-->>kqueue_scheduler: kqueue event (EVFILT_READ/WRITE/USER)
    kqueue_scheduler->>kqueue_socket_service: post deferred completions / enqueue ops
    kqueue_socket_service->>kqueue_socket_service: invoke kqueue_op.operator() -> complete
    kqueue_socket_service-->>Client: resume coroutine / return result
```

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes


Poem

"I hopped into the reactor's queue,
kqueue bells rang, handlers flew,
sockets danced, accepts took flight,
corosio hums through BSD night,
a rabbit cheers — async delight!" 🐰

🚥 Pre-merge checks | ✅ 2 passed
  • Description check: ✅ Passed (check skipped because CodeRabbit's high-level summary is enabled)
  • Title check: ✅ Passed (the title "Add kqueue backend" is clear, concise, and directly describes the main change: introducing kqueue as a new I/O backend for Boost Corosio)

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
src/corosio/src/detail/kqueue/scheduler.cpp (1)

1113-1125: task_op_ sentinel not re-pushed on exception — verify this is intentional.

If run_task throws (fatal kevent error at line 983), the catch block resets task_running_ but does not push task_op_ back into completed_ops_. This means no thread will ever become the reactor again. For a fatal kqueue error this is arguably correct (the scheduler is broken), but if the user catches the exception from run() and attempts to restart, the scheduler will hang — all threads will wait on the condvar with no reactor ever running.

If the intent is to make the scheduler unusable after a fatal kevent error, a brief comment explaining this would help future maintainers.

src/corosio/src/detail/kqueue/sockets.cpp (1)

844-895: Consider using SOCK_NONBLOCK | SOCK_CLOEXEC on FreeBSD to reduce syscalls.

The comment at line 844 already notes this. On FreeBSD, socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0) would collapse 3 syscalls (socket + 2× fcntl) into 1. A #ifdef for FreeBSD (or a runtime check) could be added as a follow-up. Same applies to accept4() in the accept path.
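
A rough sketch of that follow-up (hypothetical helper names, not corosio code; the single-syscall path applies on Linux and FreeBSD, while macOS lacks SOCK_NONBLOCK/SOCK_CLOEXEC and keeps the fcntl fallback):

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: open a non-blocking, close-on-exec TCP socket
// with as few syscalls as the platform allows.
int open_nonblocking_socket()
{
#if defined(SOCK_NONBLOCK) && defined(SOCK_CLOEXEC)
    // Linux/FreeBSD: one syscall sets both flags atomically.
    return ::socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
#else
    // macOS and other platforms: socket plus fcntl calls.
    int fd = ::socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    int fl = ::fcntl(fd, F_GETFL, 0);
    if (fl < 0 || ::fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0 ||
        ::fcntl(fd, F_SETFD, FD_CLOEXEC) < 0)
    {
        ::close(fd);
        return -1;
    }
    return fd;
#endif
}
```

Either branch leaves the descriptor with O_NONBLOCK and FD_CLOEXEC set, so the `#ifdef` stays a compile-time optimization rather than a behavioral difference.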

src/corosio/src/detail/kqueue/acceptors.cpp (1)

101-133: SO_NOSIGPIPE set twice in the async accept path.

For asynchronous accepts, SO_NOSIGPIPE is first set in kqueue_accept_op::perform_io() (in op.hpp, lines 411–419) and then set again here in operator()() at line 124. The sync accept path only hits this code in operator()(), so it's correctly handled.

You could remove SO_NOSIGPIPE from perform_io() since operator()() always runs for both paths, keeping the setup in one place. Alternatively, removing it here would require adding it to the sync accept path in accept(). The first option is cleaner.

src/corosio/src/detail/kqueue/op.hpp (1)

382-428: SO_NOSIGPIPE in perform_io() is redundant with operator()() in acceptors.cpp.

As noted in the acceptors.cpp review, the async path sets SO_NOSIGPIPE here and again in kqueue_accept_op::operator()(). Removing it from perform_io() would consolidate socket-option setup in operator()(), where all accepted-fd post-processing already lives.

Non-blocking and FD_CLOEXEC setup can remain here since operator()() doesn't duplicate those.

src/corosio/src/detail/kqueue/sockets.hpp (1)

142-142: Consider a defensive assertion in set_socket.

If set_socket is called while the socket already holds a valid file descriptor (fd_ >= 0), the old descriptor silently leaks. A debug-only assertion would catch accidental misuse without adding runtime cost:

Suggested assertion
-    void set_socket(int fd) noexcept { fd_ = fd; }
+    void set_socket(int fd) noexcept
+    {
+        BOOST_ASSERT(fd_ == -1);
+        fd_ = fd;
+    }


codecov bot commented Feb 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 79.70%. Comparing base (de218ef) to head (515ae24).
⚠️ Report is 1 commit behind head on develop.


@@             Coverage Diff             @@
##           develop     #114      +/-   ##
===========================================
- Coverage    79.96%   79.70%   -0.27%     
===========================================
  Files           65       65              
  Lines         5661     5661              
===========================================
- Hits          4527     4512      -15     
- Misses        1134     1149      +15     

see 2 files with indirect coverage changes



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
include/boost/corosio/io_context.hpp (1)

47-48: ⚠️ Potential issue | 🟡 Minor

Stale [future] tags in the docstring.

Line 47 says kqueue_context (kqueue) [future] and line 48 says select_context (select) [future], but both are now implemented. These tags should be removed to avoid confusion.

Proposed fix
-    - BSD/macOS: `kqueue_context` (kqueue) [future]
-    - Other POSIX: `select_context` (select) [future]
+    - BSD/macOS: `kqueue_context` (kqueue)
+    - Other POSIX: `select_context` (select)
🤖 Fix all issues with AI agents
In `@src/corosio/src/detail/kqueue/acceptors.cpp`:
- Around line 192-196: The coroutine resume is using saved_ex.dispatch(saved_h)
which differs from other acceptor ops that call resume_coro(saved_ex, saved_h);
change the resume call in this scope to use resume_coro(saved_ex, saved_h)
instead of saved_ex.dispatch(saved_h), preserving the existing move of ex and h
(saved_ex and saved_h) and keeping prevent_premature_destruction (impl_ptr)
alive; mirror the pattern used in kqueue_op::operator()() and
kqueue_connect_op::operator()() so any extra behavior in resume_coro is applied
consistently.

In `@src/corosio/src/detail/kqueue/scheduler.cpp`:
- Around line 982-983: The throw from throw_system_error can unwind through
run_task and leave task_running_ stuck true; modify do_one so the call to
run_task (the section that sets task_running_ around line 1100) is protected by
a scope guard (RAII or try/finally) that will clear task_running_ on exit
whether run_task returns normally or throws; ensure the guard is constructed
immediately after task_running_ is set and that its destructor (or
catch/finally) resets task_running_ = false so task_running_ cannot remain true
if an exception from kevent propagates.
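
The guard this instruction describes can be sketched with a minimal RAII type (stand-in names; corosio's real scheduler state is more involved):

```cpp
#include <cassert>
#include <utility>

// Minimal scope guard: runs the callable on every scope exit,
// including exits caused by an exception unwinding the stack.
template <class F>
struct scope_exit
{
    F f;
    explicit scope_exit(F fn) : f(std::move(fn)) {}
    ~scope_exit() { f(); }
};

// Stand-in for the scheduler: task_running_ must never stay true
// after run_task() exits, even when run_task() throws.
struct scheduler_stub
{
    bool task_running_ = false;

    void do_one()
    {
        task_running_ = true;
        scope_exit guard([this] { task_running_ = false; });
        run_task();  // may throw, e.g. on a fatal kevent error
    }

    void run_task() { throw 42; }  // simulates throw_system_error
};
```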
🧹 Nitpick comments (11)
src/corosio/src/detail/kqueue/acceptors.hpp (2)

119-119: Consider using native_handle_type for consistency with the socket impl.

kqueue_socket_impl::native_handle() (in sockets.hpp line 118) returns native_handle_type, while this acceptor accessor returns raw int. Although this is a non-virtual, non-override accessor on an internal type, using the same typedef would be more consistent across the kqueue backend.


194-201: Move operations are implicitly deleted but not explicitly declared.

kqueue_context (line 81-82 of kqueue_context.hpp) explicitly deletes both copy and move. Here only copy is deleted; move is implicitly deleted due to the deleted copy. Consider adding explicit = delete for move for consistency and clarity.

src/corosio/src/detail/kqueue/sockets.hpp (3)

171-184: Encapsulation inconsistency with kqueue_acceptor_state.

kqueue_acceptor_state (in acceptors.hpp lines 172-187) uses private members with friend class kqueue_acceptor_service, while kqueue_socket_state exposes all members publicly. Consider aligning the two for consistency — either both use friend or both use public access.


142-142: set_socket has no precondition guard against leaking an existing fd.

If fd_ is already valid (≥ 0) when set_socket is called, the old fd leaks. Currently this is only called from accept after create_impl() where fd_ is -1, but a defensive check or assertion would prevent future misuse.

Proposed fix
-    void set_socket(int fd) noexcept { fd_ = fd; }
+    void set_socket(int fd) noexcept
+    {
+        BOOST_ASSERT(fd_ < 0 && "set_socket called on already-open socket");
+        fd_ = fd;
+    }

191-198: Same optional nit as the acceptor service: explicit move deletion for consistency.

kqueue_context deletes both copy and move explicitly. This service only deletes copy; move is implicitly deleted. Consider making it explicit.

src/corosio/src/kqueue_context.cpp (1)

44-45: Narrowing unsignedint for the concurrency hint.

static_cast<int>(concurrency_hint) would produce implementation-defined behavior if the value exceeds INT_MAX. This is practically unreachable via the default constructor (hardware thread counts won't hit INT_MAX), but an explicit user-supplied hint could theoretically trigger it. A clamp or assertion would make the contract clear.

Proposed defensive clamp
-    sched_ = &make_service<detail::kqueue_scheduler>(
-        static_cast<int>(concurrency_hint));
+    sched_ = &make_service<detail::kqueue_scheduler>(
+        static_cast<int>(std::min(concurrency_hint,
+            static_cast<unsigned>(std::numeric_limits<int>::max()))));
src/corosio/src/detail/kqueue/sockets.cpp (1)

298-355: Repeated EAGAIN park-retry-cancel pattern across do_read_io/do_write_io.

The work_started() → check read_ready → park or retry → cancellation check pattern at lines 298–355 and 382–438 is structurally identical to the one in connect() and acceptors.cpp::accept(). Each differs only in which desc_state_ slot is used (read_op, write_op, connect_op).

This isn't a correctness issue, but consider extracting a helper template or function that takes the slot pointer as a parameter to reduce the ~50-line duplication per call site. Not urgent for this PR.

Also applies to: 382-438
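
One possible shape for such a helper (purely illustrative; the names and signatures are not corosio's, and the real version would also thread through the desc_state_ slot and work accounting):

```cpp
#include <cassert>
#include <cerrno>
#include <utility>

// Hypothetical sketch: factor out the shared "try the syscall; on
// EAGAIN park the op on its descriptor slot; otherwise complete".
template <class TryIo, class Park, class Complete>
void run_or_park(TryIo try_io, Park park, Complete complete)
{
    auto [n, err] = try_io();  // the non-blocking read/write/connect
    if (err == EAGAIN || err == EWOULDBLOCK)
    {
        park();  // would store the op in its desc_state_ slot; the
                 // reactor re-runs it when the fd becomes ready
        return;
    }
    complete(err, n);  // success or a hard error: finish immediately
}
```

Each call site would then differ only in the lambdas it passes, rather than in fifty lines of repeated control flow.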

src/corosio/src/detail/kqueue/acceptors.cpp (2)

100-132: Accepted socket setup in operator() duplicates perform_io socket configuration.

kqueue_accept_op::perform_io() in op.hpp (lines 392–419) already sets non-blocking, close-on-exec, and SO_NOSIGPIPE on the accepted fd. Then operator()() here at lines 121–132 sets SO_NOSIGPIPE again. For the synchronous accept path (where perform_io isn't called), operator() is the only place these are set — but non-blocking and cloexec are set in accept() at lines 239–257, while SO_NOSIGPIPE is deferred to operator().

This split means socket flag setup is spread across three locations. Consider centralizing the accepted-fd configuration (non-blocking, cloexec, SO_NOSIGPIPE) into a single helper to avoid divergence.

Example helper
// In a shared header or anonymous namespace:
inline std::error_code configure_accepted_fd(int fd) noexcept
{
    int flags = ::fcntl(fd, F_GETFL, 0);
    if (flags == -1 || ::fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return make_err(errno);
    if (::fcntl(fd, F_SETFD, FD_CLOEXEC) == -1)
        return make_err(errno);
    int one = 1;
    if (::setsockopt(fd, SOL_SOCKET, SO_NOSIGPIPE, &one, sizeof(one)) == -1)
        return make_err(errno);
    return {};
}

This could then be used in perform_io(), accept() sync path, and operator()().


561-567: socket_service() uses dynamic_cast — consider performance for hot paths.

dynamic_cast at line 566 involves RTTI. If socket_service() is called per-accept (which it is, at line 104–106 of operator()), this happens on every accepted connection. In most builds, RTTI overhead is negligible, but for high-throughput acceptors, a cached pointer could avoid the repeated cast.

src/corosio/src/detail/kqueue/op.hpp (2)

382-428: kqueue_accept_op::perform_io configures the accepted fd inline, duplicating accept() and operator().

As noted in the acceptors.cpp review, the accepted fd configuration (non-blocking, cloexec, SO_NOSIGPIPE) is performed here in perform_io(), and again partially in acceptors.cpp::accept() (sync path) and operator()(). Since operator() always runs after perform_io(), the SO_NOSIGPIPE in operator() is redundant for the async path.

Consider whether perform_io() should handle ALL fd configuration (removing it from operator()), or whether a shared helper should be used. This would consolidate the logic and prevent inconsistencies if requirements change.


30-43: Platform-specific headers in an .hpp file under src/.

Lines 30–43 include <unistd.h>, <errno.h>, <fcntl.h>, <netinet/in.h>, <sys/socket.h>, and <sys/uio.h>. This is permitted since the file is under src/ and guarded by BOOST_COROSIO_HAS_KQUEUE, but these headers are included because perform_io() implementations are defined inline in this header. If any of these ops' perform_io() were moved to .cpp files, the POSIX includes could be removed from the header.

Not blocking — just noting the trade-off between inline performance and header hygiene.


cppalliance-bot commented Feb 9, 2026

An automated preview of the documentation is available at https://114.corosio.prtest3.cppalliance.org/index.html

If more commits are pushed to the pull request, the docs will rebuild at the same URL.

2026-02-09 21:41:20 UTC


cppalliance-bot commented Feb 9, 2026

GCOVR code coverage report https://114.corosio.prtest3.cppalliance.org/gcovr/index.html
LCOV code coverage report https://114.corosio.prtest3.cppalliance.org/genhtml/index.html
Coverage Diff Report https://114.corosio.prtest3.cppalliance.org/diff-report/index.html

Build time: 2026-02-09 21:47:06 UTC

@mvandeberg mvandeberg force-pushed the feature/kqueue branch 3 times, most recently from 794b9cb to 17f4bcf on February 9, 2026 at 20:47
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@src/corosio/src/detail/kqueue/scheduler.cpp`:
- Around line 754-771: The kevent() call in
kqueue_scheduler::interrupt_reactor() currently ignores its return value and can
silently fail; update interrupt_reactor() to check kevent(...)'s return (and/or
errno) and handle failures by emitting a diagnostic (e.g., debug log or assert)
including errno/message so a failed trigger is visible; locate the kevent
invocation near EV_SET(&ev, 0, EVFILT_USER, 0, NOTE_TRIGGER, 0, nullptr) and
user_event_armed_/kq_fd_ and add a conditional that logs/asserts when kevent
returns -1 (or unexpected value) with the error details.
- Around line 1054-1069: The race is that
completed_ops_.splice(ctx->private_queue) can expose work before
outstanding_work_ is incremented; update outstanding_work_ by adding
ctx->private_outstanding_work (and then zero ctx->private_outstanding_work)
before calling completed_ops_.splice(ctx->private_queue) so the global counter
reflects the incoming work (matching the drain_thread_queue pattern), then
proceed to splice and signal via maybe_unlock_and_signal_one; this prevents
task_cleanup's destructor later from double-counting and avoids premature
stop()/interrupt_reactor() when work_finished() runs concurrently.
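
The required ordering can be modeled with a small stand-alone sketch (std::deque, a mutex, and a free function stand in for corosio's intrusive queue and member function; only the count-before-splice ordering is the point):

```cpp
#include <atomic>
#include <cassert>
#include <deque>
#include <mutex>

std::mutex queue_mutex;                 // stands in for the scheduler lock
std::deque<int> completed_ops;          // stands in for completed_ops_
std::atomic<long> outstanding_work{0};  // stands in for outstanding_work_

void drain_private(std::deque<int>& private_queue, long& private_outstanding)
{
    // 1) Account for the incoming work first, so a concurrent
    //    work_finished() can never observe the counter at zero while
    //    the ops are already visible in the shared queue.
    outstanding_work.fetch_add(private_outstanding, std::memory_order_relaxed);
    private_outstanding = 0;

    // 2) Only then publish the ops.
    std::lock_guard<std::mutex> lk(queue_mutex);
    completed_ops.insert(completed_ops.end(),
                         private_queue.begin(), private_queue.end());
    private_queue.clear();
}
```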

In `@src/corosio/src/detail/kqueue/sockets.cpp`:
- Around line 361-443: kqueue_socket_impl::do_write_io() must explicitly handle
the case when ::writev returns 0 to avoid using a stale errno; add an early
check after ssize_t n = ::writev(...) to detect n == 0 and treat it as a closed
peer (e.g. call op.complete(EPIPE, 0); svc_.post(&op); return;), mirroring the
do_read_io() pattern so we never branch on an indeterminate errno value.
🧹 Nitpick comments (7)
src/corosio/src/detail/kqueue/acceptors.cpp (2)

83-197: Accept completion handler: missing deregister_descriptor on SO_NOSIGPIPE failure path.

When SO_NOSIGPIPE fails (line 124), the code closes the fd (line 128) and destroys the impl (line 130), but doesn't explicitly call deregister_descriptor before closing. While kqueue auto-removes closed fds, there's a brief window where the reactor could deliver events for this fd to the descriptor_state that's about to be destroyed. In practice this is safe because destroy_impl drops the last shared_ptr and no ops are parked, but adding deregister_descriptor before ::close() would be more defensive.

Suggested fix
                 if (::setsockopt(accepted_fd, SOL_SOCKET, SO_NOSIGPIPE, &one, sizeof(one)) == -1)
                 {
                     if (ec_out)
                         *ec_out = make_err(errno);
+                    socket_svc->scheduler().deregister_descriptor(accepted_fd);
                     ::close(accepted_fd);
                     accepted_fd = -1;
                     socket_svc->destroy_impl(impl);

234-268: Consider using accept4() on FreeBSD to avoid the fcntl TOCTOU window.

The comment on line 234 notes FreeBSD supports accept4() with SOCK_NONBLOCK | SOCK_CLOEXEC. Using it (with a compile-time or runtime check) would atomically set both flags, eliminating the fcntl TOCTOU window and the three extra syscalls. This could be deferred to a follow-up optimization.

src/corosio/src/detail/kqueue/op.hpp (1)

382-428: SO_NOSIGPIPE is set redundantly for the async accept path.

perform_io() sets SO_NOSIGPIPE on the accepted fd (lines 412–419), but kqueue_accept_op::operator()() in acceptors.cpp (lines 122–133) sets it again on the same fd. The synchronous accept path in accept() (acceptors.cpp lines 237–268) only relies on operator()() for this. The duplicate in perform_io() is harmless (setsockopt is idempotent) but wastes a syscall.

Consider removing SO_NOSIGPIPE from perform_io() and relying solely on operator()() for consistency with the synchronous path, or removing it from operator()() and adding it to the synchronous path — either way, setting it in exactly one place.

src/corosio/src/detail/kqueue/sockets.hpp (2)

148-162: Public internal members are documented but could benefit from encapsulation.

The conn_, rd_, wr_, desc_state_, and initiator members are public with a clear comment explaining they're for internal reactor/scheduler integration. Consider making them private with friend class kqueue_scheduler; friend struct descriptor_state; to prevent accidental misuse, though the current approach is acceptable for an internal detail class.


171-184: kqueue_socket_state data members are public without a friend declaration.

Unlike kqueue_acceptor_state (which uses friend class kqueue_acceptor_service + private: data), kqueue_socket_state exposes sched_, mutex_, socket_list_, and socket_ptrs_ as public. For consistency with the acceptor counterpart, consider making these private with a friend declaration.

Suggested change
 class kqueue_socket_state
 {
+    friend class kqueue_socket_service;
+
 public:
     explicit kqueue_socket_state(kqueue_scheduler& sched) noexcept
         : sched_(sched)
     {
     }
 
+private:
     kqueue_scheduler& sched_;
     std::mutex mutex_;
     intrusive_list<kqueue_socket_impl> socket_list_;
     std::unordered_map<kqueue_socket_impl*, std::shared_ptr<kqueue_socket_impl>> socket_ptrs_;
 };
src/corosio/src/detail/kqueue/scheduler.cpp (2)

906-930: work_cleanup may leave lock in an unexpected state for the caller.

Line 920 re-acquires the lock if the private queue is non-empty, but does not re-acquire it if the queue is empty. The caller (do_one at Line 1132) returns immediately after work_cleanup destructs, and run() at Line 570 conditionally re-locks with if (!lock.owns_lock()). This works but creates a subtle contract: work_cleanup may or may not leave the lock held, and do_one's caller must always check. The existing code handles this correctly, but a brief comment at Line 920 noting "lock left held for do_one's caller to check" would help future maintainers.


440-448: Use-after-free pattern is safe but fragile — consider a comment or alternative.

Line 442 copies the coroutine handle, Line 443 deletes this, then Line 447 issues an atomic_thread_fence. The existing comment on the fence is good, but the delete this + use-local-copy pattern could trip up future maintainers. The comment focuses on the fence semantics but doesn't call out that the delete this is intentional and safe because h is a local copy. A one-liner like // Safe: h is a stack copy; 'this' is no longer accessed after delete. would help.

