-
-
Notifications
You must be signed in to change notification settings - Fork 211
Rework local openml directory #987
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
openml/config.py
Outdated
| try: | ||
| os.mkdir(expanded_openml_dir) | ||
| except PermissionError: | ||
| warnings.warn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer using logging module here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the code to use logging instead of warnings.
openml/utils.py
Outdated
| os.makedirs(cache_dir, exist_ok=True) | ||
| except Exception as e: | ||
| raise openml.exceptions.OpenMLCacheException( | ||
| f"Cannot create cache directory {cache_dir} due to exception {e}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You do not need to have due to exception {e} because you have from e later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, makes sense.
|
Not fully. My main issue is that OpenML initializes stuff at package import and not on first use. So people who would not even use OpenML codepath in our package (which depends on I do not think you should be calling in You should call those at first call which requires any of those. See: If cache directory is not writable, this will fail, no? And it is called from |
|
For reference, current patch we are using: https://gitlab.com/datadrivendiscovery/images/-/blob/master/libs/patches/openml.patch |
|
@mfeurer: Thanks for the notification. Looks fine by me. |
|
@mfeurer Would you have time to look at this or should we ask someone else? |
|
Alright, thanks for your suggestions @mitar. I tried to incorporate them as much as possible, but I'm afraid I didn't do all you asked for.
I agree that this might be nicer. On the other hand, I think the current solution is more in the spirit of 'fail fast' and gives a warning as quickly as possible. In any case, I currently don't have the time to work on this, wouldn't know how to achieves this and think the changes in this PR cover most use cases and issues and will for now leave it is at is (unless there's an issue with my recent changes).
I made this conditional on the cache directory being writable, so there shouldn't be any failure from this. Tests might fail because the server is very flaky at the moment, so we probably have to rebase this on #1000 and rerun. |
783a756 to
d58bfa5
Compare
|
@PGijsbers do you think you could have a look at the Windows failures? I believe it'll be much easier to debug for you on a Windows machine than for me by pushing to github |
mitar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. This looks great now.
Closes #883 #884 #906 #972
What does this PR implement/fix? Explain your changes.
How should this PR be tested?
Unit tests.
Any other comments?
@mitar does this resolve the issues you had with the package?
@PhMueller does this help you using OpenML inside HPOlib?