Merged
32 commits
f6f1680
Update unit_tests.yml (#101)
gasse Oct 31, 2024
490a142
request is done once and then reused
TLSDC Nov 7, 2024
7b1ec6e
Merge branch 'dev' into tracking_fix
TLSDC Nov 7, 2024
8b10f18
Patching minor stuff (#69)
TLSDC Oct 16, 2024
c15e934
Improve agent xray app (#70)
xhluca Oct 17, 2024
3e1fb4d
added tmlr definitive config (#71)
TLSDC Oct 17, 2024
207cff5
downgrading gradio version (#77)
TLSDC Oct 19, 2024
ffc3df3
Study refactor (#73)
recursix Oct 20, 2024
9844bd8
adding message class and updating generic agent accordingly (#68)
TLSDC Oct 21, 2024
7a4e98c
version bump
TLSDC Oct 21, 2024
4e9cbe0
Updating generic_agent to fit use BGym's goal_object (#83)
TLSDC Oct 22, 2024
4b4efc4
Minor revert (#86)
TLSDC Oct 22, 2024
7eda60f
Add tabs (#84)
recursix Oct 22, 2024
f4d83a2
Fix reproduce study (#87)
recursix Oct 23, 2024
4efe47e
upgrading gradio dependency (#88)
TLSDC Oct 23, 2024
77b076c
bgym update (#90)
TLSDC Oct 23, 2024
470f1be
Workarena TMLR experiments (#89)
TLSDC Oct 24, 2024
d5a2f7f
handling sequntial in VWA (#91)
recursix Oct 24, 2024
e6c8df8
Tmlr workarena (#92)
TLSDC Oct 24, 2024
f889272
tmp
recursix Oct 24, 2024
8eea0b8
reformat
recursix Oct 24, 2024
7d0c48d
adding assistantbench to reproducibility_util.py
TLSDC Oct 24, 2024
2e29559
gitignore (#97)
gasse Oct 30, 2024
bd28f00
Vision fix (#105)
TLSDC Nov 5, 2024
1b8bbff
L2 tmlr (#93)
TLSDC Nov 5, 2024
6b42824
Replacing Dask with Ray (#100)
recursix Nov 6, 2024
f5ab035
switching to 2 for loops in _agents_on_benchmark (#107)
TLSDC Nov 6, 2024
760a6b1
yet another way to kill timedout jobs (#108)
recursix Nov 6, 2024
14564ee
request is done once and then reused
TLSDC Nov 7, 2024
fbd1e55
switched to caching original function bc it doesnt break to tests
TLSDC Nov 7, 2024
f6c6562
added a catch for some openrouter under-the-hood error
TLSDC Nov 7, 2024
801ce6d
Merge branch 'tracking_fix' of github.com:ServiceNow/AgentLab into tr…
TLSDC Nov 7, 2024
5 changes: 4 additions & 1 deletion .github/workflows/unit_tests.yml
@@ -38,6 +38,9 @@ jobs:
       - name: Install Playwright
         run: playwright install chromium --with-deps

+      - name: Download WebArena / VisualWebArena ressource files
+        run: python -c 'import nltk; nltk.download("punkt_tab")'
+
       - name: Fetch MiniWob
         uses: actions/checkout@v4
         with:
@@ -58,4 +61,4 @@ jobs:
       - name: Run AgentLab Unit Tests
         env:
           MINIWOB_URL: "http://localhost:8080/miniwob/"
-        run: pytest -n 5 --durations=10 -m 'not pricy' -v tests/
+        run: pytest -n 5 --durations=10 -m 'not pricy' -v tests/
10 changes: 10 additions & 0 deletions src/agentlab/llm/chat_api.py
@@ -208,6 +208,10 @@ def handle_error(error, itr, min_retry_wait_time, max_retry):
     return error_type


+class OpenRouterError(openai.OpenAIError):
+    pass
+
+
 class ChatModel(AbstractChatModel):
     def __init__(
         self,
@@ -274,6 +278,12 @@ def __call__(self, messages: list[dict]) -> dict:
                     temperature=self.temperature,
                     max_tokens=self.max_tokens,
                 )
+
+                if completion.usage is None:
+                    raise OpenRouterError(
+                        "The completion object does not contain usage information. This is likely a bug in the OpenRouter API."
+                    )
+
                 self.success = True
                 break
             except openai.OpenAIError as e:
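
The hunk above raises `OpenRouterError` inside the `try` block. Because it subclasses `openai.OpenAIError`, the existing `except` clause catches it and a response without usage data goes through the same retry path as any other API failure. A minimal sketch of that pattern follows. It is not the AgentLab implementation: `ProviderError`, `MissingUsageError`, `call_with_retry`, and the fake responses are hypothetical stand-ins so the example runs without the `openai` package.

```python
from types import SimpleNamespace


class ProviderError(Exception):
    """Hypothetical stand-in for openai.OpenAIError."""


class MissingUsageError(ProviderError):
    """Hypothetical stand-in for the OpenRouterError added in this PR."""


def call_with_retry(create_completion, max_retry=3):
    """Call create_completion until a response carries usage data."""
    last_error = None
    for _ in range(max_retry):
        try:
            completion = create_completion()
            if completion.usage is None:
                # A subclass of the provider error is caught by the same
                # except clause as every other API failure, so it is retried.
                raise MissingUsageError("completion has no usage information")
            return completion
        except ProviderError as e:
            last_error = e
    raise RuntimeError(f"gave up after {max_retry} attempts") from last_error


# Fake backend: the first response lacks usage, the second includes it.
responses = iter(
    [
        SimpleNamespace(usage=None),
        SimpleNamespace(usage=SimpleNamespace(prompt_tokens=12, completion_tokens=3)),
    ]
)
result = call_with_retry(lambda: next(responses))
print(result.usage.prompt_tokens)  # 12
```

Raising a dedicated exception type, rather than letting a later attribute access on `None` fail, keeps the failure inside the error class that the retry loop already handles.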
2 changes: 2 additions & 0 deletions src/agentlab/llm/tracking.py
@@ -1,3 +1,4 @@
+from functools import cache
 import os
 import threading
 from contextlib import contextmanager
@@ -61,6 +62,7 @@ def wrapper(self, obs):
     return wrapper


+@cache
 def get_pricing_openrouter():
     api_key = os.getenv("OPENROUTER_API_KEY")
     assert api_key, "OpenRouter API key is required"