Edge-Triggered Epoll and Backend Improvements by sgerbino · Pull Request #96 · cppalliance/corosio

sgerbino · 2026-01-31T02:48:32Z

Summary

This PR introduces edge-triggered epoll (EPOLLET) with persistent descriptor registration, significantly improving latency under concurrent load while maintaining throughput parity. It also includes bug fixes, documentation improvements, and new benchmarking infrastructure.

Key Changes

1. Edge-Triggered Epoll Implementation

Switched from level-triggered to edge-triggered epoll with persistent registration:

- Persistent registration: File descriptors are registered with epoll once and stay registered until closed, eliminating repeated epoll_ctl calls
- Edge-triggered mode: Uses the EPOLLET flag, so epoll reports a descriptor only when its readiness changes instead of on every wait while it stays ready
- Readiness caching: Atomic read_ready/write_ready flags cache edge events that arrive before an operation is registered, preventing missed events
- Memory ordering: Uses seq_cst for critical synchronization between operation registration and reactor event delivery
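
As a rough illustration of the registration model described above (not the PR's actual code), the following Linux-only sketch registers a descriptor once with EPOLLET and then drains it until EAGAIN, which is what edge-triggered mode requires of the consumer. The helper names `register_once`, `drain`, and `run_edge_demo` are hypothetical, and a pipe stands in for a socket.

```cpp
#include <sys/epoll.h>
#include <fcntl.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper: register fd with epoll exactly once, edge-triggered.
// No further epoll_ctl call is needed until the fd is closed.
inline bool register_once(int epfd, int fd, unsigned events)
{
    assert(epfd >= 0 && fd >= 0);
    epoll_event ev{};
    ev.events = events | EPOLLET;
    ev.data.fd = fd;
    return ::epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0;
}

// Read a non-blocking fd until the kernel reports EAGAIN. With EPOLLET the
// next notification only arrives on a new edge, so leftover data would
// otherwise go unnoticed. Returns total bytes read, or -1 on error.
inline long drain(int fd)
{
    char buf[256];
    long total = 0;
    for (;;)
    {
        ssize_t n = ::read(fd, buf, sizeof(buf));
        if (n > 0) { total += n; continue; }
        if (n == 0) return total;                       // end of stream
        if (errno == EINTR) continue;                   // interrupted, retry
        if (errno == EAGAIN || errno == EWOULDBLOCK) return total;
        return -1;
    }
}

struct edge_demo_result { long bytes_read; int events_after_drain; };

// Demo: one write produces one edge; after draining, a zero-timeout wait
// reports nothing even though the fd is still registered.
inline edge_demo_result run_edge_demo()
{
    edge_demo_result r{-1, -1};
    int fds[2];
    if (::pipe(fds) != 0) return r;
    ::fcntl(fds[0], F_SETFL, ::fcntl(fds[0], F_GETFL) | O_NONBLOCK);
    int ep = ::epoll_create1(0);
    char payload[1000] = {};
    epoll_event ev{};
    if (ep >= 0
        && register_once(ep, fds[0], EPOLLIN)
        && ::write(fds[1], payload, sizeof(payload))
               == static_cast<ssize_t>(sizeof(payload))
        && ::epoll_wait(ep, &ev, 1, 1000) == 1)
    {
        r.bytes_read = drain(fds[0]);
        r.events_after_drain = ::epoll_wait(ep, &ev, 1, 0);
    }
    if (ep >= 0) ::close(ep);
    ::close(fds[0]);
    ::close(fds[1]);
    return r;
}
```

Calling `run_edge_demo()` exercises the whole path: a single `epoll_ctl(ADD)`, one edge notification, and a drain loop that ends on EAGAIN.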

2. Performance Results

Latency improvements (lower is better):

| Test | develop | This PR | Change |
|---|---|---|---|
| 1 pair p99 | 8.58 µs | 4.96 µs | -42% |
| 4 pairs p99 | 24.32 µs | 19.18 µs | -21% |
| 16 pairs p99 | 131.06 µs | 71.46 µs | -45% |

HTTP Server (concurrent connections):

| Test | develop | This PR | Change |
|---|---|---|---|
| 16 connections throughput | 203.99 Kops/s | 233.33 Kops/s | +14% |
| 4 threads, 32 connections | 246.14 Kops/s | 292.27 Kops/s | +19% |
| 16 connections p99 latency | 118.20 µs | 82.85 µs | -30% |

3. Bug Fixes

- Use-after-free in select backend: Fixed by setting impl_ptr = shared_from_this() immediately after work_started() in async operation paths
- Timer starvation in select scheduler: Added timer processing in do_one() loop to prevent timers from starving when handlers are continuously posted
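
The timer starvation fix can be pictured with a toy scheduler. This is a minimal sketch under stated assumptions, not the select scheduler itself: `toy_scheduler` and its members are hypothetical, and the current time is passed in explicitly to keep the example deterministic. The point is only the ordering inside `do_one()`: expired timers are moved to the handler queue before a handler is popped, so a handler that keeps reposting itself cannot hold timers back indefinitely.

```cpp
#include <cassert>
#include <chrono>
#include <deque>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

class toy_scheduler
{
public:
    using clock = std::chrono::steady_clock;

    void post(std::function<void()> h)
    {
        assert(h && "handler must be callable");
        handlers_.push_back(std::move(h));
    }

    void expires_at(clock::time_point when, std::function<void()> h)
    {
        timers_.push(timer_entry{when, std::move(h)});
    }

    // Runs at most one handler. Returns false when nothing is ready.
    bool do_one(clock::time_point now)
    {
        // Checking timers first on every iteration is the fix being
        // illustrated: without it, a never-empty handler queue would
        // prevent timers from ever being examined.
        process_expired_timers(now);
        if (handlers_.empty())
            return false;
        auto h = std::move(handlers_.front());
        handlers_.pop_front();
        h();
        return true;
    }

private:
    struct timer_entry
    {
        clock::time_point when;
        std::function<void()> fn;
        bool operator>(const timer_entry& o) const { return when > o.when; }
    };

    void process_expired_timers(clock::time_point now)
    {
        while (!timers_.empty() && timers_.top().when <= now)
        {
            handlers_.push_back(timers_.top().fn); // top() is const, so copy
            timers_.pop();
        }
    }

    std::deque<std::function<void()>> handlers_;
    std::priority_queue<timer_entry, std::vector<timer_entry>,
                        std::greater<timer_entry>> timers_;
};
```

With a handler that reposts itself on every run, the timer callback still executes on the second `do_one()` call, because it was queued ahead of the reposted handler.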

4. Documentation Improvements

- Renamed signal_set::async_wait() to wait() for consistency with timer::wait()
- Changed documentation to reference capy::cond::canceled instead of capy::error::canceled
- Added usage examples showing correct error condition comparison pattern
- Removed line divider comments and redundant section headers per coding standards
- Added brief Javadoc to implementation classes
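
The usage examples added by the PR are not reproduced on this page. As a standard-library analogue only, the sketch below compares an error code against a portable error condition instead of one specific error value. It assumes capy::cond::canceled plays a role similar to `std::errc::operation_canceled`, which this page does not confirm; `is_canceled` is a hypothetical helper.

```cpp
#include <cassert>
#include <system_error>

// Comparing against an error *condition* matches any error code whose
// category maps to that condition, not just one exact (value, category) pair.
inline bool is_canceled(const std::error_code& ec)
{
    return ec == std::errc::operation_canceled;
}
```

A caller would branch on `is_canceled(ec)` after an awaited operation returns, treating cancellation separately from real failures.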

5. New Benchmarking Infrastructure

- HTTP server benchmark: New benchmark comparing corosio vs Asio HTTP server performance
- Multi-backend support: Benchmarks now support --backend flag to select epoll or select
- JSON output: Added --json flag for machine-readable benchmark results
- Selective execution: Added --only flag to run specific benchmark suites

6. Multi-Backend Test Support

Socket stress tests now run on both epoll and select backends:

- boost.corosio.socket_stress.* - epoll backend
- boost.corosio.socket_stress.*.select - select backend

Summary by CodeRabbit

Release Notes

API Changes
- Signal set asynchronous wait method renamed from async_wait() to wait()
Improvements
- Enhanced cancellation error handling with updated references to capy::cond::canceled
- Improved timer processing to prevent starvation in event loops
- Better lifecycle management and state consistency for socket operations
Documentation
- Updated examples demonstrating the new wait() API usage
- Clarified cancellation behavior with detailed examples and error code comparisons

Enhance the benchmark infrastructure with machine-readable output and improved usability for CI/automation workflows.

Changes to bench/common/benchmark.hpp:
- Add metric, benchmark_result, and result_collector classes
- result_collector serializes to JSON with metadata (backend, timestamp)
- benchmark_result supports fluent API for adding metrics
- Add add_latency_stats() helper for statistics objects

Changes to all benchmark executables:
- Add --output <file> to write JSON results (stdout still works)
- Add --bench <name> to run a single benchmark instead of all
- Add --help with list of available benchmarks
- Refactor to use run_benchmarks() function for consistent structure
- Add descriptive comments explaining what each benchmark measures and why it's useful

JSON output format:

    {
      "metadata": {"backend": "epoll", "timestamp": "..."},
      "benchmarks": [
        {"name": "...", "metric": value, ...}
      ]
    }

This enables:
- Programmatic consumption of results (CI, regression tracking)
- Quick iteration by running only specific benchmarks
- Better documentation of benchmark purpose

Add a mock HTTP server benchmark that measures request throughput using read_until with dynamic buffers. Both implementations use equivalent composed operations for fair comparison.

Benchmark scenarios:
- single_conn: Single connection, sequential requests
- concurrent: Multiple concurrent connections (1, 4, 16, 32)
- multithread: Multi-threaded run() with varying thread counts

Replace per-operation epoll_ctl(ADD)/epoll_ctl(DEL) with persistent registration using EPOLLONESHOT. File descriptors are registered once and re-armed via epoll_ctl(MOD) when operations need to wait.

- Add descriptor_data struct to track per-fd registration state
- Implement lazy registration (register on first wait, not on open)
- Use EPOLLONESHOT to auto-disarm after events
- Apply to both sockets and acceptors
- Remove legacy per-operation registration code paths

Reduces epoll_ctl calls from 2 to 1 per waiting I/O operation.
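
The register-once, re-arm-per-wait scheme from this intermediate commit can be sketched with plain epoll calls. This is a Linux-only illustration, not the commit's code; `add_oneshot`, `rearm`, and `run_oneshot_demo` are hypothetical names and a pipe stands in for a socket.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>

// Hypothetical helper: register once with EPOLLONESHOT.
inline bool add_oneshot(int epfd, int fd, unsigned events)
{
    assert(epfd >= 0 && fd >= 0);
    epoll_event ev{};
    ev.events = events | EPOLLONESHOT;
    ev.data.fd = fd;
    return ::epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0;
}

// Hypothetical helper: one epoll_ctl(MOD) re-arms a disarmed descriptor.
inline bool rearm(int epfd, int fd, unsigned events)
{
    epoll_event ev{};
    ev.events = events | EPOLLONESHOT;
    ev.data.fd = fd;
    return ::epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev) == 0;
}

struct oneshot_demo_result { int first_wait; int while_disarmed; int after_rearm; };

inline oneshot_demo_result run_oneshot_demo()
{
    oneshot_demo_result r{-1, -1, -1};
    int fds[2];
    if (::pipe(fds) != 0) return r;
    int ep = ::epoll_create1(0);
    char byte = 'x';
    epoll_event ev{};
    if (ep >= 0
        && add_oneshot(ep, fds[0], EPOLLIN)
        && ::write(fds[1], &byte, 1) == 1)
    {
        r.first_wait = ::epoll_wait(ep, &ev, 1, 1000);  // event delivered
        // The byte is still unread, but the fd auto-disarmed after delivery.
        r.while_disarmed = ::epoll_wait(ep, &ev, 1, 0);
        if (rearm(ep, fds[0], EPOLLIN))
            r.after_rearm = ::epoll_wait(ep, &ev, 1, 1000);
    }
    if (ep >= 0) ::close(ep);
    ::close(fds[0]);
    ::close(fds[1]);
    return r;
}
```

The demo shows why this scheme costs one `epoll_ctl` per waiting operation: after each delivered event the descriptor stays silent until it is re-armed.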

- Add bench/common/backend_selection.hpp with shared dispatch_backend() utility for runtime backend selection
- Update all corosio benchmarks to support --backend and --list options
- Change make_socket_pair to accept basic_io_context& for polymorphic context support, enabling benchmarks to use any backend type

Switch from level-triggered to edge-triggered epoll (EPOLLET) mode. This reduces epoll_ctl syscalls by registering descriptors once with all events rather than modifying per-operation. Atomic operations coordinate between reactor and cancellation paths to prevent races. Readiness caching handles edge events that arrive before operations are registered. Also enables socket_stress tests on Linux.
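
One way to picture readiness caching is a two-sided handoff through a pair of atomics. The sketch below is an assumption-laden illustration rather than the PR's descriptor_data: `descriptor_sketch`, `op_sketch`, `on_read_edge`, and `register_read` are hypothetical, and only the read direction is shown. Each side stores its own flag and then reads the other side's with sequentially consistent operations, so at least one of them observes the other and an edge cannot be lost between "I/O returned EAGAIN" and "operation parked".

```cpp
#include <atomic>
#include <cassert>

struct op_sketch {};  // placeholder for a pending read operation

struct descriptor_sketch
{
    std::atomic<op_sketch*> read_op{nullptr};  // operation parked on this fd
    std::atomic<bool> read_ready{false};       // cached edge, if any
};

// Reactor side, called when epoll reports a read edge. Returns the parked
// operation to run, or nullptr when nobody was waiting; in that case the
// edge stays cached in read_ready.
inline op_sketch* on_read_edge(descriptor_sketch& d)
{
    d.read_ready.store(true, std::memory_order_seq_cst);
    return d.read_op.exchange(nullptr, std::memory_order_seq_cst);
}

// Initiator side, called after a read returned EAGAIN. Returns true when a
// cached edge was consumed and the caller should retry the read right away;
// returns false when the operation is parked or the reactor already took it.
inline bool register_read(descriptor_sketch& d, op_sketch* op)
{
    assert(op != nullptr);
    d.read_op.store(op, std::memory_order_seq_cst);
    if (!d.read_ready.exchange(false, std::memory_order_seq_cst))
        return false;  // no cached edge: stay parked until the reactor fires
    // An edge was cached. Try to take the operation back; if the exchange
    // returns nullptr the reactor claimed it first and will complete it.
    return d.read_op.exchange(nullptr, std::memory_order_seq_cst) == op;
}
```

In this sketch the flag can be left set after the reactor claims an operation, which costs the next operation one spurious retry but never a missed event.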

Enable stress tests to run on both epoll and select backends using template-based test implementations. Fix timer starvation in the select scheduler that caused tests to hang when synchronous I/O completions were continuously posted without going through the reactor. Timers are now processed at the start of each do_one() iteration, matching the epoll scheduler behavior.

Set impl_ptr immediately after work_started() to keep the socket/acceptor impl alive while the operation is pending. Previously, impl_ptr was only set in sync completion and cancel paths, leaving a window where the reactor could complete an operation after the impl was destroyed. Fixes segfault in accept stress test on macOS.
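
The lifetime rule in this commit can be sketched with `std::enable_shared_from_this`. The types below are hypothetical and much simpler than the select backend's: the pending operation holds a `shared_ptr` to its owning impl from the moment it starts waiting, and releases it on completion.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct socket_impl_sketch;

struct pending_op_sketch
{
    // Keeps the owning impl alive while the operation is pending.
    std::shared_ptr<socket_impl_sketch> impl_ptr;

    void complete()
    {
        // Move the reference into a local so it is released only when this
        // function returns. The op lives inside the impl, so nothing may
        // touch *this after that release.
        auto keep_alive = std::move(impl_ptr);
        // ... deliver the result to the waiting coroutine here ...
    }
};

struct socket_impl_sketch : std::enable_shared_from_this<socket_impl_sketch>
{
    pending_op_sketch read_op;

    void start_read()
    {
        // In the PR this assignment follows work_started(); setting it up
        // front closes the window in which the reactor could complete the
        // operation after the impl had already been destroyed.
        read_op.impl_ptr = shared_from_this();
    }
};
```

If the user drops the last external reference while a read is pending, the impl survives until `complete()` runs and is destroyed when `complete()` returns.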

Replace explicit atomic_thread_fence(seq_cst) with seq_cst on the atomic operations themselves. The fence after every operation registration was expensive (10-100 cycles). Moving the ordering guarantee into the store and exchange operations achieves the same synchronization with less overhead.

Remove line divider comments and section headers that just repeat function names. Add brief Javadoc to impl classes. Preserves meaningful comments that explain why, not what.

- Rename async_wait() to wait() for consistency with timer
- Change documentation to use capy::cond::canceled instead of capy::error::canceled for error condition comparisons
- Add usage examples showing correct cancellation checking pattern
- Update all references in tests, docs, and implementation comments

coderabbitai · 2026-01-31T02:48:46Z

📝 Walkthrough

Walkthrough

This PR refactors the epoll backend to use persistent per-descriptor registration instead of per-operation state tracking, introduces a new descriptor_data structure for managing descriptor-level state, and renames the public signal wait API from async_wait() to wait(). Documentation is updated to reference capy::cond::canceled instead of capy::error::canceled.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Signal Set API and Documentation: `include/boost/corosio/signal_set.hpp`, `include/boost/corosio/timer.hpp`, `src/corosio/src/detail/posix/signals.cpp`, `src/corosio/src/detail/win/signals.cpp` | Renamed public method from `async_wait()` to `wait()`. Updated documentation blocks to reference `capy::cond::canceled` instead of `capy::error::canceled` with expanded examples demonstrating cancellation handling and stop_token behavior. |
| Epoll Backend: Core Data Structures: `src/corosio/src/detail/epoll/op.hpp` | Replaced tri-state `registration_state` enum with persistent-registration model. Introduced `descriptor_data` struct containing per-descriptor state: atomic operation pointers (`read_op`, `write_op`, `connect_op`), readiness flags, registered events, and file descriptor. Updated lifecycle commentary to reflect persistent-registration semantics. |
| Epoll Backend: Scheduler: `src/corosio/src/detail/epoll/scheduler.hpp`, `src/corosio/src/detail/epoll/scheduler.cpp` | Refactored scheduler API from op-centric to descriptor-centric: renamed `register_fd()` → `register_descriptor()`, `modify_fd()` → `update_descriptor_events()`, `unregister_fd()` → `deregister_descriptor()`. Reworked event loop to operate on descriptor-level state with per-operation atomic handoffs. Enhanced error handling for `EPOLLERR`/`EPOLLHUP`. Added `eventfd_armed_` tracking and increased epoll wait buffer capacity. |
| Epoll Backend: Acceptors: `src/corosio/src/detail/epoll/acceptors.hpp`, `src/corosio/src/detail/epoll/acceptors.cpp` | Added `update_epoll_events()` method and `descriptor_data` member. Extended class to inherit from `intrusive_list<epoll_acceptor_impl>::node`. On accept, initializes descriptor state and routes pending operations through shared post/finish mechanism with EAGAIN/EWOULDBLOCK retry handling. |
| Epoll Backend: Sockets: `src/corosio/src/detail/epoll/sockets.hpp`, `src/corosio/src/detail/epoll/sockets.cpp` | Added `update_epoll_events()` method and `descriptor_data` member. Extended class to inherit from `intrusive_list<epoll_socket_impl>::node`. Reworked socket lifecycle to use per-operation atomic descriptor pointers and atomic exchanges for cancellation/claiming instead of registration_state logic. |
| Select Backend: Acceptors: `src/corosio/src/detail/select/acceptors.hpp`, `src/corosio/src/detail/select/acceptors.cpp` | Minor documentation updates and lifecycle fixes. Added `impl_ptr` assignment in EAGAIN/EWOULDBLOCK path to ensure acceptor_impl remains alive for pending operations. |
| Select Backend: Sockets: `src/corosio/src/detail/select/sockets.hpp`, `src/corosio/src/detail/select/sockets.cpp` | Extended class to inherit from `intrusive_list<select_socket_impl>::node`. Added `impl_ptr` assignments in connect/read/write paths when operations enter waiting state to guarantee socket_impl lifetime during async operations. Added documentation comments. |
| Select Backend: Scheduler and Cleanup: `src/corosio/src/detail/select/scheduler.cpp`, `src/corosio/src/detail/select/op.hpp` | Added timer-expiry processing in `do_one()` to prevent timer starvation by checking and processing expired timers before popping completed operations. Removed separator comments from operation struct definitions. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Implement single reactor scheduler for epoll backend #49: Coordinated refactoring of epoll backend data structures and scheduler APIs, updating op.hpp, scheduler, sockets, and acceptors with descriptor-centric persistent-registration model.
Take a copy of capy's intrusive list as an implementation detail. #46: Modifications to intrusive-list inheritance across socket, acceptor, and signal implementation classes.
Implement socket options API with named methods #79: Changes to epoll socket implementation class signature and member additions (descriptor_data, update_epoll_events).

Poem

🐰 From async to wait, the signal does flow,
Descriptors persistent, through epoll they go,
No more per-op states in chaotic array,
Just one home per fd—a cleaner way!
Timers won't starve, and sockets won't tire,
The refactor burns bright like a rabbit's desire! ✨

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Edge-Triggered Epoll and Backend Improvements' directly corresponds to the main changes: introducing edge-triggered epoll implementation with persistent descriptor registration and various backend improvements across multiple scheduler implementations.

cppalliance-bot · 2026-01-31T02:53:17Z

An automated preview of the documentation is available at https://96.corosio.prtest3.cppalliance.org/index.html

If more commits are pushed to the pull request, the docs will rebuild at the same URL.

2026-01-31 02:53:16 UTC

codecov · 2026-01-31T02:53:20Z

Codecov Report

❌ Patch coverage is 61.37339% with 90 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.21%. Comparing base (c2b933e) to head (965a3f2).
⚠️ Report is 12 commits behind head on develop.

Files with missing lines	Patch %	Lines
src/corosio/src/detail/epoll/sockets.cpp	51.80%	40 Missing ⚠️
src/corosio/src/detail/epoll/scheduler.cpp	65.59%	32 Missing ⚠️
src/corosio/src/detail/epoll/acceptors.cpp	60.00%	16 Missing ⚠️
src/corosio/src/detail/select/sockets.cpp	66.66%	1 Missing ⚠️
src/corosio/src/test/socket_pair.cpp	87.50%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop      #96      +/-   ##
===========================================
+ Coverage    82.14%   82.21%   +0.06%     
===========================================
  Files           56       58       +2     
  Lines         4951     5195     +244     
===========================================
+ Hits          4067     4271     +204     
- Misses         884      924      +40

Files with missing lines	Coverage Δ
include/boost/corosio/signal_set.hpp	`93.54% <100.00%> (ø)`
include/boost/corosio/timer.hpp	`94.44% <ø> (ø)`
src/corosio/src/detail/epoll/acceptors.hpp	`100.00% <ø> (ø)`
src/corosio/src/detail/epoll/op.hpp	`84.68% <ø> (+0.75%)`	⬆️
src/corosio/src/detail/epoll/sockets.hpp	`91.66% <ø> (ø)`
src/corosio/src/detail/posix/signals.cpp	`89.55% <ø> (ø)`
src/corosio/src/detail/select/acceptors.cpp	`65.72% <100.00%> (+2.16%)`	⬆️
src/corosio/src/detail/select/acceptors.hpp	`100.00% <ø> (ø)`
src/corosio/src/detail/select/op.hpp	`74.80% <ø> (+3.81%)`	⬆️
src/corosio/src/detail/select/scheduler.cpp	`73.62% <100.00%> (+0.29%)`	⬆️
... and 6 more

... and 6 files with indirect coverage changes

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

sgerbino added 10 commits January 30, 2026 22:39

Clean up documentation in epoll and select backends

d21c12c

Remove line divider comments and section headers that just repeat function names. Add brief Javadoc to impl classes. Preserves meaningful comments that explain why, not what.

sgerbino merged commit a71204e into cppalliance:develop Jan 31, 2026
17 of 18 checks passed

sgerbino deleted the pr/epoll-improvements branch January 31, 2026 03:10

This was referenced Feb 3, 2026

Epoll #101

Merged

Epoll optimizations #106

Merged

Benchmark enhancements #107

Merged

Add kqueue backend #114

Merged

coderabbitai bot mentioned this pull request Feb 18, 2026

Document move preconditions and awaitable lifetime requirements acros… #152

Merged
