add support for llama2 and claude via bedrock#54

Merged
davidsbailey merged 8 commits into main from bedrock-models
Feb 29, 2024
Conversation

@davidsbailey
Member

@davidsbailey davidsbailey commented Feb 16, 2024

Follows #53

Adds support for the following models via bedrock:

  • anthropic.claude-v2
  • meta.llama2-13b-chat-v1
  • meta.llama2-70b-chat-v1

This enables rubric tester to evaluate the following experiments in s3://cdo-ai/teaching_assistant/experiments/:

  • ai-rubrics-json-llama2
  • ai-rubrics-json-reason-llama2
  • ai-rubrics-json-reason-claude

Cost warning: we pay cash (not AWS credits) for the use of these models. A complete test run with Claude costs about $4. See README updates in #53.
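For reference, the request body for Bedrock's InvokeModel differs per model family. A minimal sketch of how a call to one of the models above might be assembled (field names follow the Bedrock docs for the Anthropic and Meta families; the helper names here are hypothetical, not this PR's code):

```python
import json


def build_bedrock_body(model_id: str, prompt: str) -> str:
    """Return a JSON request body for bedrock-runtime InvokeModel.

    Field names per the Bedrock docs: Anthropic models take
    max_tokens_to_sample; Meta Llama 2 models take max_gen_len.
    """
    if model_id.startswith("anthropic."):
        # Claude v2 expects the Human/Assistant prompt format.
        body = {
            "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
            "max_tokens_to_sample": 500,
            "temperature": 0.0,
        }
    elif model_id.startswith("meta."):
        body = {
            "prompt": prompt,
            "max_gen_len": 500,
            "temperature": 0.0,
        }
    else:
        raise ValueError(f"unsupported model: {model_id}")
    return json.dumps(body)


def invoke_bedrock(model_id: str, prompt: str) -> str:
    """Call Bedrock; requires AWS credentials and model access (and costs money)."""
    import boto3

    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=model_id,
        body=build_bedrock_body(model_id, prompt),
    )
    result = json.loads(response["body"].read())
    # Claude responses carry "completion"; Llama 2 responses carry "generation".
    return result.get("completion") or result.get("generation")
```

`invoke_bedrock("meta.llama2-13b-chat-v1", "...")` would then hit the live endpoint, so mind the cost warning above.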

This was referenced Feb 16, 2024
@snickell snickell mentioned this pull request Feb 20, 2024
Base automatically changed from handle-json-response to main February 26, 2024 18:13
@davidsbailey davidsbailey marked this pull request as ready for review February 26, 2024 18:16
@davidsbailey davidsbailey requested review from a team and snickell February 26, 2024 18:20
Collaborator

@snickell snickell left a comment


Haven't been able to get the bedrock bits working locally, but I did my mistral experiment based on this, and it seems to be working. So... LGTM!

Python usage note... I don't think it's for the same PR, and there are serious downsides too (like: where do you set default values?), but some use of *args and **kwargs could potentially DRY up ai_label_student_work and make more visible the cases where args/kwargs are changed or overridden before being passed into the model, e.g.:

def ai_label_student_work(self, *args, **kwargs):
    if kwargs.get("llm_model", "").startswith("gpt"):
        return self.openai_label_student_work(*args, **kwargs)

Obviously this has some downsides too, as you suddenly can't see what params ai_label_student_work takes, but I've seen this pattern help a lot in scientific and data-sci code when there are lots of layers and lots of params, and you add/remove params and suddenly have to pipe them around everywhere. May or may not be useful, YMMV.

@davidsbailey
Member Author

Haven't been able to get the bedrock bits working locally,

can you please try running bin/aws_llama_test.py and share your output?

but I did my mistral experiment based on this, and it seems to be working. So... LGTM!

exciting! How are mistral results looking so far?

Python usage note... I don't think it's for the same PR, and there are serious downsides too (like: where do you set default values?), but some use of *args and **kwargs could potentially DRY up ai_label_student_work and make more visible the cases where args/kwargs are changed or overridden before being passed into the model, e.g.:

def ai_label_student_work(self, *args, **kwargs):
    if kwargs.get("llm_model", "").startswith("gpt"):
        return self.openai_label_student_work(*args, **kwargs)

Obviously this has some downsides too, as you suddenly can't see what params ai_label_student_work takes, but I've seen this pattern help a lot in scientific and data-sci code when there are lots of layers and lots of params, and you add/remove params and suddenly have to pipe them around everywhere. May or may not be useful, YMMV.

I agree that this is a problem, and ideas for pythonic solutions are always welcome as I am still very new at python. Thank you for flagging.
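On the "where do you set default values" concern: one possible shape (a sketch only; the DEFAULTS dict, the llm_model keyword, and bedrock_label_student_work are hypothetical names, not this PR's code) keeps the defaults in one place at the dispatch layer and applies them with kwargs.setdefault before forwarding:

```python
class Labeler:
    # Hypothetical dispatcher sketch: defaults live in one dict and are
    # applied before forwarding, so the per-model methods stay thin.
    DEFAULTS = {"temperature": 0.0, "num_responses": 1}

    def ai_label_student_work(self, *args, llm_model="gpt-4", **kwargs):
        # setdefault only fills in values the caller did not supply.
        for key, value in self.DEFAULTS.items():
            kwargs.setdefault(key, value)
        if llm_model.startswith("gpt"):
            return self.openai_label_student_work(*args, **kwargs)
        return self.bedrock_label_student_work(llm_model, *args, **kwargs)

    # Stand-ins for the real per-model methods, to keep the sketch runnable:
    def openai_label_student_work(self, prompt, **kwargs):
        return ("openai", prompt, kwargs)

    def bedrock_label_student_work(self, llm_model, prompt, **kwargs):
        return ("bedrock", llm_model, prompt, kwargs)
```

A caller-supplied value (e.g. `temperature=0.5`) passes through untouched, while omitted params pick up the shared defaults, which addresses part of the downside at the cost of the opaque signature noted above.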

@snickell snickell mentioned this pull request Feb 29, 2024
@davidsbailey
Member Author

Did an accuracy regression test run before merging to confirm no regression on experiments/ai-rubrics-json-reason-gpt-4-turbo.

@davidsbailey davidsbailey merged commit 5c1fc9d into main Feb 29, 2024
@davidsbailey davidsbailey deleted the bedrock-models branch February 29, 2024 19:05
