Description
I want to register a flow referring to the setup we used to conduct the AutoML benchmark presented at the ICML workshop this year. We want to do this to more easily share results through OpenML. I want to run this by you all.
Now, I know that creating your own flow in openml-python is discouraged ('Flows should not be generated manually'), but for this scenario I don't really see a better way, other than using e.g. openml-r.
I want to describe that we used the code in our GitHub repo at a specific tag, and provide a URL to download that code. I also figured I would tag the AutoML tool that was used (in this case auto-sklearn).
There is currently no direct way to instantiate the script with the right settings, but the flow below should be enough to configure a reproducible run. Again, the flow is created mainly to link and share results, and to provide the best possible pointers for reproducing it.
Sketching out how I think the flow should look:
```python
from collections import OrderedDict

import openml

# The existing auto-sklearn flow is attached as a component (subflow).
auto_sklearn_flow = openml.flows.get_flow(15275)  # auto-sklearn 0.5.1

amlb_flow = openml.flows.OpenMLFlow(
    name='automlbenchmark_autosklearn',
    description='Auto-sklearn as set up by the AutoML Benchmark',
    external_version='amlb==0.9',
    # Resource constraints used by the benchmark, with their default values.
    parameters=OrderedDict(
        time='240',
        memory='32',
        cores='8',
    ),
    parameters_meta_info=OrderedDict(
        time=dict(data_type='int', description='time in minutes'),
        memory=dict(data_type='int', description='memory in gigabytes'),
        cores=dict(data_type='int', description='number of available cores'),
    ),
    language='English',
    # Link to the AutoML tool's own flow so results stay connected to it.
    components=OrderedDict(automl_tool=auto_sklearn_flow),
)
```

I would still need to find a place for the following information:
source: https://github.com/openml/automlbenchmark/releases/tag/v0.9
subfolder of particular interest: /frameworks/autosklearn
I do see url fields in OpenMLFlow.__init__ but they are set by the server.
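Since those fields can't be set from the client, one possible workaround (just an idea, not an established OpenML convention) would be to embed the code pointers in the free-text description before publishing; `publish()` should then register the flow and give it an id that runs can reference:

```python
# Hypothetical workaround: carry the code location in the description text,
# since the url fields in OpenMLFlow.__init__ are filled in by the server.
amlb_flow.description = (
    'Auto-sklearn as set up by the AutoML Benchmark. '
    'Source: https://github.com/openml/automlbenchmark/releases/tag/v0.9 '
    '(framework definition under /frameworks/autosklearn).'
)

# Publishing registers the flow on the server (assuming it validates) and
# sets amlb_flow.flow_id, which shared runs can then point to.
amlb_flow.publish()
print(amlb_flow.flow_id)
```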
I think there are a couple of questions here:
- Do you think I should even try doing this through the Python API?
And if so,
- Should I create only one flow for the entire benchmark and take the AutoML tool as a parameter (see the sketch after this list)? I think it's more generic, but it might make the link to the AutoML tool's flow harder to find.
- Where should I include the code URL?
- Any other things I need to know?
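To make the second question more concrete, here is a rough sketch of the single-generic-flow variant, mirroring the sketch above and reusing its imports; the parameter names `framework` and `framework_version` are made up for illustration:

```python
# Rough sketch of the alternative: one benchmark-level flow where the AutoML
# tool is just another parameter (parameter names here are hypothetical).
generic_amlb_flow = openml.flows.OpenMLFlow(
    name='automlbenchmark',
    description='A run of the AutoML Benchmark with a configurable AutoML tool.',
    external_version='amlb==0.9',
    parameters=OrderedDict(
        framework='autosklearn',
        framework_version='0.5.1',
        time='240',
        memory='32',
        cores='8',
    ),
    parameters_meta_info=OrderedDict(
        framework=dict(data_type='str', description='AutoML tool to benchmark'),
        framework_version=dict(data_type='str', description='version of the AutoML tool'),
        time=dict(data_type='int', description='time in minutes'),
        memory=dict(data_type='int', description='memory in gigabytes'),
        cores=dict(data_type='int', description='number of available cores'),
    ),
    language='English',
    # No subflow here, so the link to auto-sklearn's own flow is only implicit.
    components=OrderedDict(),
)
```

The trade-off is exactly the one raised above: this version is more generic, but the connection to auto-sklearn's flow (15275) then only exists through a string parameter instead of a proper component link.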