add support for llama2 and claude via bedrock #54
Conversation
Haven't been able to get the bedrock bits working locally, but I did my mistral experiment based on this, and seems to be working. So.... LGTM!
Python usage note... I don't think it's for the same PR, and there are serious downsides too (like where do you set default values?), but some use of *args and **kwargs could potentially DRY up ai_label_student_work, and make more visible the cases where args/kwargs are changed/overridden before being passed into the model, e.g.:
def ai_label_student_work(self, *args, **kwargs):
    if self.llm_model.startswith("gpt"):
        return self.openai_label_student_work(*args, **kwargs)
Obviously has some downsides too as you suddenly can't see what ai_label_student_work takes params-wise, but I've seen this pattern help a lot of scientific and data sci code if there are lots of layers, and lots of params, and you add/remove params and suddenly have to pipe them around everywhere. May or may not be useful, YMMV.
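To make the pattern concrete, here's a minimal self-contained sketch. The class and method names (`Labeler`, `bedrock_label_student_work`) are hypothetical stand-ins, not the PR's actual code; it also shows one answer to the "where do you set default values" question, namely at the single dispatching entry point:

```python
class Labeler:
    """Hypothetical sketch of dispatching on model name with *args/**kwargs.

    Only the dispatch pattern is the point; names are placeholders.
    """

    def __init__(self, llm_model):
        self.llm_model = llm_model

    def ai_label_student_work(self, *args, **kwargs):
        # Defaults can live here, at the one public entry point, so the
        # backend-specific methods stay thin and params aren't piped
        # through every layer by hand.
        kwargs.setdefault("temperature", 0.0)
        if self.llm_model.startswith("gpt"):
            return self.openai_label_student_work(*args, **kwargs)
        return self.bedrock_label_student_work(*args, **kwargs)

    def openai_label_student_work(self, *args, **kwargs):
        # Stub: a real implementation would call the OpenAI API here.
        return ("openai", args, kwargs)

    def bedrock_label_student_work(self, *args, **kwargs):
        # Stub: a real implementation would call Bedrock here.
        return ("bedrock", args, kwargs)
```

The trade-off the comment mentions is visible here: `ai_label_student_work`'s signature no longer documents its parameters, so you rely on the docstring (or the backend methods) for that.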
can you please try running
exciting! How are mistral results looking so far?
I agree that this is a problem, and ideas for pythonic solutions are always welcome as I am still very new at python. Thank you for flagging.

did an accuracy regression test run before merging to confirm no regression on
Follows #53
Adds support for the following models via bedrock:
- anthropic.claude-v2
- meta.llama2-13b-chat-v1
- meta.llama2-70b-chat-v1

This enables rubric tester to evaluate the following experiments in s3://cdo-ai/teaching_assistant/experiments/:

- ai-rubrics-json-llama2
- ai-rubrics-json-reason-llama2
- ai-rubrics-json-reason-claude

Cost warning: we pay cash (not AWS credits) for the use of these models. A complete test run with Claude costs about $4. See README updates in #53.
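One wrinkle worth noting for reviewers: each Bedrock model family expects a differently shaped request body. Below is a minimal sketch of that per-family branching, based on the publicly documented Bedrock request formats for Claude v2 and Llama 2, not on this PR's actual code; `build_bedrock_body` is a hypothetical helper name:

```python
import json


def build_bedrock_body(model_id, prompt, max_tokens=512):
    """Build a JSON request body for a Bedrock invoke_model call.

    Shapes follow the public Bedrock docs for these model families;
    treat this as a sketch, not the PR's implementation.
    """
    if model_id.startswith("anthropic.claude"):
        # Claude v2 on Bedrock expects the Human/Assistant prompt framing
        # and a max_tokens_to_sample limit.
        return json.dumps({
            "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
            "max_tokens_to_sample": max_tokens,
        })
    if model_id.startswith("meta.llama2"):
        # Llama 2 chat models take a plain prompt and max_gen_len.
        return json.dumps({
            "prompt": prompt,
            "max_gen_len": max_tokens,
        })
    raise ValueError(f"unsupported bedrock model: {model_id}")
```

The resulting body would then be passed to boto3's `bedrock-runtime` client via `client.invoke_model(modelId=model_id, body=body)` (omitted here since it requires AWS credentials and incurs the real-money cost noted above).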