
Parameter types #12

Closed
t-muser wants to merge 25 commits into master from parameter-types

Conversation

t-muser (Owner) commented Oct 6, 2024

No description provided.

t-muser (Owner, Author) commented Oct 7, 2024

@phi-friday would you be interested in helping me with this PR? It implements optimization over different parameter types. It's pretty much fully functional, but I was hoping it could be implemented in a cleaner way. I know this is not your usual area, so no worries if you'd rather not.

phi-friday replied:

> @phi-friday would you be interested in helping me with this PR? It implements optimizing over different parameter types, it's pretty much fully functional but I was hoping that it could be implemented in a cleaner way. I know this is not what you're usually interested in, so no worries if you're not interested.

I can help, but I'm not sure how - I don't really understand how Bayesian optimization works. All I know is that if I specify a range, I can get an optimized value.
I think I can only help with static type checking or some performance issues, is that what you need help with?

phi-friday commented Oct 7, 2024

I did a little digging.

  1. It seems that real and integer parameters return a scalar from to_float or to_param, but CategoricalParameter should return an array.
     This causes problems with the types of params_to_array and array_to_params in TargetSpace (since the values are not scalars).
     It seems like make_masks solves this problem, is that correct?
     If so, we need to fix the types returned by params_to_array and array_to_params, and also look at the other functions that use them.

     Ignoring the type is one way to work around the problem. However, if you do this, you need to be more careful about possible errors at runtime.

  2. to_param seems to handle a zero-dimensional array, and this function seems to be used only internally. If so, wouldn't it be sufficient to always pass a scalar value, rather than handling the zero-dimensional case?

phi-friday mentioned this pull request Oct 7, 2024
t-muser (Owner, Author) commented Oct 7, 2024

> I can help, but I'm not sure how - I don't really understand how Bayesian optimization works. All I know is that if I specify a range, I can get an optimized value.
> I think I can only help with static type checking or some performance issues, is that what you need help with?

That's understandable. I was hoping for a more big-picture review of the PR, but if you don't feel comfortable with that, no worries!
The PR is to add support for optimization over non-float parameters. I added a notebook that somewhat shows how this works. I will quickly explain the idea, and then you can decide if it sounds too far out of your comfort zone:
The PR is based on this paper; the idea is to transform the float representation of a non-float parameter before calculating the kernel (e.g. by rounding for an int parameter, or by mapping to a one-hot-encoded vector for a categorical parameter). This means there are a number of different representations of a parameter.

Consider a fruit parameter with values Apple, Banana and Orange. It has:

  • a "canonical" version: "Apple", "Banana" or "Orange"
  • a float version (in this case, one-hot-encoded): [1, 0, 0], [0, 1, 0] or [0, 0, 1]

However, during the optimization process you might not get such "clean" float representations. Instead, the gradient optimization might give you a point like this: [0.323, 0.758, 0.163]. The idea is then to map this to a corresponding float version (in this case, by checking which entry of the vector is largest and constructing the matching one-hot encoding): [0, 1, 0].
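The round-trip described above can be sketched in a few lines. This is a hypothetical illustration of the idea, not the PR's actual code; the names to_float and to_param are taken from the discussion, but the real signatures in the PR may differ.

```python
import numpy as np

# Hypothetical sketch of the categorical round-trip: the category list
# and function names are assumptions based on the discussion above.
categories = ["Apple", "Banana", "Orange"]

def to_float(value: str) -> np.ndarray:
    """One-hot encode a category as its float representation."""
    vec = np.zeros(len(categories))
    vec[categories.index(value)] = 1.0
    return vec

def to_param(float_repr: np.ndarray) -> str:
    """Map an arbitrary float vector back to the nearest category
    by taking the argmax of the vector."""
    return categories[int(np.argmax(float_repr))]

print(to_float("Banana"))                         # [0. 1. 0.]
print(to_param(np.array([0.323, 0.758, 0.163])))  # Banana
```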

This doesn't sound so hard, but there are a few more implications.
Before, every parameter was one-dimensional. Now, suddenly, a parameter can span multiple dimensions, so we need to make clear which part of a registered point belongs to which parameter. For this, I added the masking.
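The masking idea can be sketched as follows: each parameter owns a contiguous slice of the flattened point. This is only an assumed sketch of what a make_masks-style helper might look like; the per-parameter dimensionalities and the function's actual shape in the PR may differ.

```python
import numpy as np

# Hedged sketch: `dims` maps each parameter name to its assumed
# dimensionality in the flattened float representation.
def make_masks(dims: dict) -> dict:
    total = sum(dims.values())
    masks, start = {}, 0
    for name, d in dims.items():
        mask = np.zeros(total, dtype=bool)
        mask[start:start + d] = True  # this parameter's slice
        masks[name] = mask
        start += d
    return masks

# A float "x", an int "n", and a 3-category one-hot "fruit":
masks = make_masks({"x": 1, "n": 1, "fruit": 3})
point = np.array([0.5, 2.0, 0.1, 0.7, 0.2])
print(point[masks["fruit"]])  # [0.1 0.7 0.2]
```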

Sidenote: Tests currently fail because the kernel wrapping is not pickleable. That's fine, I think we need a better method for serialization anyways, no need to worry about this.

phi-friday replied:

> I will quickly explain the idea, and then you can decide if it sounds too far out of your comfort zone:

I understand the idea: convert a value that is not represented by a real number into a real number, and convert it back into the expected form (whether an integer or a categorical value) on return.

I also understand the use of one-hot encoding to represent categorical values, and how this necessitates a vector rather than a scalar.

And if this is the implementation based on the paper, I have no reason to doubt it.

To implement this, you've defined to_float and to_param, and I think the key to making this work as intended is whether make_masks is well defined (it should work in both the scalar and the vector case).

Am I correct that you want me to focus on the implementation of masking like make_masks and the new files like parameter.py?

phi-friday commented Oct 8, 2024

I'm not sure I've checked it thoroughly enough, but everything looks good except for the two things I mentioned (serialization, and not being sure where the domain attribute is assigned).

I think it would be better if the value returned by make_mask returned a two-dimensional array rather than a dictionary of one-dimensional arrays, but that seems to be a matter of preference.

And this is off topic, but the mix of _masks and mask makes the code a bit dizzying to read. It seems better to use only one or the other consistently. Of course, this is also a matter of preference.

It also looks like we need to reduce the number of loop statements later on, and I don't think that can be solved with a partial fix, so I'll look into it further if this PR is merged.

See #13 for more information about the typehints.
(I seem to recall that PR mostly fixed typehints, with a few runtime changes,
like using repr() instead of __repr__(), defining a __reduce__ method, etc.)

```python
Bounds = Union[FloatBounds, IntBounds, CategoricalBounds]
BoundsMapping = Mapping[str, Bounds]

# FIXME: categorical parameters can be of any type.
```
t-muser (Owner, Author) commented on this diff:

@phi-friday Do you have a better idea on how to handle this?

t-muser (Owner, Author) commented Oct 9, 2024

> serialization, defining a __reduce__ method

I think there should be a separate PR at some point that overhauls the Observer pattern, the ScreenLogger and replaces the JSON logger with some other way to serialize things. I would suggest to not worry about that now.

> I think it would be better if the value returned by make_mask returned a two-dimensional array rather than a dictionary of one-dimensional arrays, but that seems to be a matter of preference.

so 2-dim of shape (1, n) instead of 1-dim of shape (n,)?

> And this is off topic, but the mix of _masks and mask makes the code a bit dizzying to read. It seems better to use only one or the other consistently. Of course, this is also a matter of preference.

should be fixed -- you're right of course!

> repr

This is what I called the method that defines how a parameter value is printed, e.g. for use by the ScreenLogger. I could also call it to_string/to_str or something if you think that's better!

phi-friday commented Oct 9, 2024

> I think there should be a separate PR at some point that overhauls the Observer pattern, the ScreenLogger and replaces the JSON logger with some other way to serialize things. I would suggest to not worry about that now.

What I was concerned about was the WrappedKernel. I have defined __reduce__ in #13 , but I haven't checked it.

> so 2-dim of shape (1, n) instead of 1-dim of shape (n,)?

Currently, the mask is organized like this:

```python
{
    "a": [True, False, False, False],
    "b": [False, True, False, False],
    "c": [False, False, True, True],
}
```

If so, we could probably construct it as something like this instead:

```python
[[True, False, False],
 [False, True, False],
 [False, False, True],
 [False, False, True]]
```

This is just a matter of preference, so you don't need to worry about it.
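For the record, the two forms are trivially interconvertible: stacking the dict's one-dimensional masks along the second axis yields exactly the two-dimensional layout shown above (column j marks the dimensions owned by the j-th parameter). A minimal sketch, assuming the dict layout from the comment:

```python
import numpy as np

# Dict-of-1-D-masks form, as in the comment above.
masks = {
    "a": np.array([True, False, False, False]),
    "b": np.array([False, True, False, False]),
    "c": np.array([False, False, True, True]),
}

# Stack the values as columns: shape (n_dims, n_params).
mask_2d = np.stack(list(masks.values()), axis=1)
print(mask_2d)
```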

> This is what I called the method that defines how a parameter value is printed, e.g. for use by the ScreenLogger. I could also call it to_string/to_str or something if you think that's better!

The modification I made replaced direct calls to the magic method __repr__. It is recommended to call the built-in function repr() rather than calling __repr__ directly.

PS: There is absolutely nothing wrong with defining a repr method.
Some might question the name overlap with the built-in function, but that's purely a matter of convention.
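The repr() vs __repr__ point can be seen in a small, hypothetical example: the built-in repr() looks the method up on the object's type, so a direct __repr__() call and repr() can disagree when an instance attribute shadows the method.

```python
# Hypothetical class, only to illustrate the lookup difference.
class Fruit:
    def __repr__(self) -> str:
        return "Fruit('Apple')"

f = Fruit()
f.__repr__ = lambda: "patched"  # instance attribute shadows the method

print(f.__repr__())  # patched -- direct call hits the instance attribute
print(repr(f))       # Fruit('Apple') -- repr() looks it up on the type
```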

codecov-commenter commented:


Codecov Report

Attention: Patch coverage is 97.26444% with 9 lines in your changes missing coverage. Please review.

Project coverage is 96.04%. Comparing base (8206277) to head (9b1fbc1).
Report is 15 commits behind head on master.

Files with missing lines Patch % Lines
bayes_opt/target_space.py 96.15% 5 Missing ⚠️
bayes_opt/acquisition.py 88.88% 3 Missing ⚠️
bayes_opt/bayesian_optimization.py 93.33% 1 Missing ⚠️


Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master      #12      +/-   ##
==========================================
- Coverage   96.18%   96.04%   -0.14%     
==========================================
  Files          10       12       +2     
  Lines         865     1139     +274     
==========================================
+ Hits          832     1094     +262     
- Misses         33       45      +12     
```

t-muser closed this Dec 18, 2024