-
Notifications
You must be signed in to change notification settings - Fork 108
Replacing Dask with Ray #100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
32 commits
Select commit
Hold shift + click to select a range
df06331
dask-dependencies
recursix b4231ca
minor
recursix 765c746
replace with ray
recursix f3f9843
adjust tests and move a few things
recursix 0227afa
markdown report
recursix 3eab2ed
automatic relaunch
recursix 98a8836
add dependencies
recursix ff1ce46
reformat
recursix 3e2994c
fix unit-test
recursix bd45ae7
catch timeout
recursix ba3c4b8
Merge branch 'dask-dependencies' of github.com:ServiceNow/AgentLab in…
recursix 76e0507
fixing bugs and making things work
recursix b0b92ba
adress comments and black format
recursix af41a5b
new dependencies viewer
recursix b6295f6
Update benchmark to use visualwebarena instead of webarena
recursix 351f30c
Fix import and uncomment code in get_ray_url.py
recursix 0f4ca46
Add ignore_dependencies option to Study and _agents_on_benchmark func…
recursix 6ea1772
Update load_most_recent method to include contains parameter
recursix 418a05d
Update load_most_recent method to accept contains parameter and add w…
recursix b8580e6
Refactor backend preparation in Study class and improve logging for i…
recursix e910ae3
finallly some results with claude on webarena
recursix 2e28f2e
Add warnings for Windows timeouts and clarify parallel backend option…
recursix aece13b
black
recursix 209c8d0
ensure timeout is int (For the 3rd time?)
recursix 54c4681
Merge branch 'dev' into dask-dependencies
recursix 729ec92
Refactor timeout handling in context manager; update test to reduce a…
recursix 4f937f2
black
recursix 366afbc
Change parallel backend from "joblib" to "ray" in run_experiments fun…
recursix 7b1ecf1
Update src/agentlab/experiments/study.py
recursix 8e8f07a
Update src/agentlab/analyze/inspect_results.py
recursix a922678
Refactor logging initialization and update layout configurations in d…
recursix c3b61ba
Merge branch 'dev' into dask-dependencies
recursix File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -20,3 +20,4 @@ gradio>=5 | |
| gitpython # for the reproducibility script | ||
| requests | ||
| matplotlib | ||
| ray[default] | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,6 +4,12 @@ | |
| from tqdm import tqdm | ||
| import logging | ||
| from browsergym.experiments.loop import ExpArgs | ||
| from contextlib import contextmanager | ||
| import signal | ||
| import sys | ||
| from time import time, sleep | ||
|
|
||
| logger = logging.getLogger(__name__) # Get logger based on module name | ||
|
|
||
|
|
||
| # TODO move this to a more appropriate place | ||
|
|
@@ -19,8 +25,148 @@ | |
| RESULTS_DIR.mkdir(parents=True, exist_ok=True) | ||
|
|
||
|
|
||
| def run_exp(exp_arg: ExpArgs, *dependencies, avg_step_timeout=60): | ||
| """Run exp_args.run() with a timeout and handle dependencies.""" | ||
| episode_timeout = _episode_timeout(exp_arg, avg_step_timeout=avg_step_timeout) | ||
| with timeout_manager(seconds=episode_timeout): | ||
| return exp_arg.run() | ||
|
|
||
|
|
||
| def _episode_timeout(exp_arg: ExpArgs, avg_step_timeout=60): | ||
| """Some logic to determine the episode timeout.""" | ||
| max_steps = getattr(exp_arg.env_args, "max_steps", None) | ||
| if max_steps is None: | ||
| episode_timeout_global = 10 * 60 * 60 # 10 hours | ||
| else: | ||
| episode_timeout_global = exp_arg.env_args.max_steps * avg_step_timeout | ||
|
|
||
| episode_timeout_exp = getattr(exp_arg, "episode_timeout", episode_timeout_global) | ||
|
|
||
| return min(episode_timeout_global, episode_timeout_exp) | ||
|
|
||
|
|
||
| @contextmanager | ||
| def timeout_manager(seconds: int = None): | ||
| """Context manager to handle timeouts.""" | ||
|
|
||
| if isinstance(seconds, float): | ||
| seconds = max(1, int(seconds)) # make sure seconds is at least 1 | ||
|
|
||
| if seconds is None or sys.platform == "win32": | ||
| try: | ||
| logger.warning("Timeouts are not supported on Windows.") | ||
| yield | ||
| finally: | ||
| pass | ||
| return | ||
|
|
||
| def alarm_handler(signum, frame): | ||
|
|
||
| logger.warning( | ||
| f"Operation timed out after {seconds}s, sending SIGINT and raising TimeoutError." | ||
| ) | ||
| # send sigint | ||
| os.kill(os.getpid(), signal.SIGINT) | ||
|
|
||
| # Still raise TimeoutError for immediate handling | ||
| raise TimeoutError(f"Operation timed out after {seconds} seconds") | ||
|
|
||
| previous_handler = signal.signal(signal.SIGALRM, alarm_handler) | ||
| signal.alarm(seconds) | ||
|
|
||
| try: | ||
| yield | ||
| finally: | ||
| signal.alarm(0) | ||
| signal.signal(signal.SIGALRM, previous_handler) | ||
|
Comment on lines
+77
to
+81
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is black magic to me. I'll just trust you :) |
||
|
|
||
|
|
||
| def add_dependencies(exp_args_list: list[ExpArgs], task_dependencies: dict[str, list[str]] = None): | ||
recursix marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| """Add dependencies to a list of ExpArgs. | ||
|
|
||
| Args: | ||
| exp_args_list: list[ExpArgs] | ||
| A list of experiments to run. | ||
| task_dependencies: dict | ||
| A dictionary mapping task names to a list of task names that they | ||
| depend on. If None or empty, no dependencies are added. | ||
|
|
||
| Returns: | ||
| list[ExpArgs] | ||
| The modified exp_args_list with dependencies added. | ||
| """ | ||
|
|
||
| if task_dependencies is None or all([len(dep) == 0 for dep in task_dependencies.values()]): | ||
recursix marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| # nothing to be done | ||
| return exp_args_list | ||
|
|
||
| for exp_args in exp_args_list: | ||
| exp_args.make_id() # makes sure there is an exp_id | ||
|
|
||
| exp_args_map = {exp_args.env_args.task_name: exp_args for exp_args in exp_args_list} | ||
| if len(exp_args_map) != len(exp_args_list): | ||
| raise ValueError( | ||
| ( | ||
| "Task names are not unique in exp_args_map, " | ||
| "you can't run multiple seeds with task dependencies." | ||
| ) | ||
| ) | ||
|
|
||
| for task_name in exp_args_map.keys(): | ||
| if task_name not in task_dependencies: | ||
| raise ValueError(f"Task {task_name} is missing from task_dependencies") | ||
|
|
||
| # turn dependencies from task names to exp_ids | ||
| for task_name, exp_args in exp_args_map.items(): | ||
| exp_args.depends_on = tuple( | ||
| exp_args_map[dep_name].exp_id for dep_name in task_dependencies[task_name] | ||
| ) | ||
|
|
||
| return exp_args_list | ||
|
|
||
|
|
||
| # Mock implementation of the ExpArgs class with timestamp checks for unit testing | ||
| class MockedExpArgs: | ||
recursix marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| def __init__(self, exp_id, depends_on=None): | ||
| self.exp_id = exp_id | ||
| self.depends_on = depends_on if depends_on else [] | ||
| self.start_time = None | ||
| self.end_time = None | ||
| self.env_args = None | ||
|
|
||
| def run(self): | ||
| self.start_time = time() | ||
|
|
||
| # # simulate playright code, (this was causing issues due to python async loop) | ||
| # import playwright.sync_api | ||
|
|
||
| # pw = playwright.sync_api.sync_playwright().start() | ||
| # pw.selectors.set_test_id_attribute("mytestid") | ||
| sleep(3) # Simulate task execution time | ||
| self.end_time = time() | ||
| return self | ||
|
|
||
|
|
||
| def make_seeds(n, offset=42): | ||
| raise DeprecationWarning("This function will be removed. Comment out this error if needed.") | ||
| return [seed + offset for seed in range(n)] | ||
|
|
||
|
|
||
| def order(exp_args_list: list[ExpArgs]): | ||
| raise DeprecationWarning("This function will be removed. Comment out this error if needed.") | ||
| """Store the order of the list of experiments to be able to sort them back. | ||
|
|
||
| This is important for progression or ablation studies. | ||
| """ | ||
| for i, exp_args in enumerate(exp_args_list): | ||
| exp_args.order = i | ||
| return exp_args_list | ||
|
|
||
|
|
||
| # This was an old function for filtering some issue with the experiments. | ||
| def hide_some_exp(base_dir, filter: callable, just_test): | ||
| """Move all experiments that match the filter to a new name.""" | ||
| raise DeprecationWarning("This function will be removed. Comment out this error if needed.") | ||
| exp_list = list(yield_all_exp_results(base_dir, progress_fn=None)) | ||
|
|
||
| msg = f"Searching {len(exp_list)} experiments to move to _* expriments where `filter(exp_args)` is True." | ||
|
|
@@ -38,17 +184,3 @@ def hide_some_exp(base_dir, filter: callable, just_test): | |
| _move_old_exp(exp.exp_dir) | ||
| filtered_out.append(exp) | ||
| return filtered_out | ||
|
|
||
|
|
||
| def make_seeds(n, offset=42): | ||
| return [seed + offset for seed in range(n)] | ||
|
|
||
|
|
||
| def order(exp_args_list: list[ExpArgs]): | ||
| """Store the order of the list of experiments to be able to sort them back. | ||
|
|
||
| This is important for progression or ablation studies. | ||
| """ | ||
| for i, exp_args in enumerate(exp_args_list): | ||
| exp_args.order = i | ||
| return exp_args_list | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| import ray | ||
|
|
||
| context = ray.init(address="auto", ignore_reinit_error=True) | ||
|
|
||
| print(context) |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.