Skip to content

Conversation

@yifuzhou
Copy link
Contributor

@yifuzhou yifuzhou commented Apr 1, 2021

Description

In current cluster version, if we have more than 2,000 timeseries, when we commend SHOW TIMESERIES, it will only display 2,000 timeseries, the same as the SHOW DEVICES...

In the default settings, if we do not set any limit, the default plan.limit will be 1000(set as the fetchsize). This is weird, because if we do not set the limit, it means that I want to fetch all the data instead of fetchsize. So I remove it.

In my opinion, hasNextWithoutConstraint() method is not correct: after I have already fetched all the timeseries, it is not necessary to do fetch again...
For example,
if we have 10,000 timeseries, so the result.size() should be 10,000.
After the first display(fetch size==1000), it will only display the first 1000 timeseries,
now the index is 1000, so it is still index < result.size() and it will keep display the rest.

And there is the only extreme condition we need this method, if we do not set any limit value, it means that we want to fetch all the data, but if the data size is bigger than Integer.MAX_VALUE, the exceed part cannot be displayed. Then we need to do second fetch..

I think it is unnecessary because the user may not want to display more than Integer.MAX_VALUE timeseries...
Please correct me if I am wrong, Thanks a lot!

@jixuan1989 jixuan1989 added 0.12.1 or 0.13 Module - Cluster PRs for the cluster module labels Apr 1, 2021
// do not use limit and offset in sub-queries unless offset is 0, otherwise the results are
// not combinable
if (offset != 0) {
plan.setLimit(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be removed. Consider this case:
The offset is 10000, the limit is 10, and the results are distributed on 3 nodes. If you keep the limit, each of the 3 nodes will only return 10 results, 30 results in total, which will all be filtered when the offset is applied.
You may set the limit to limit + offset so the unnecessary fetch can be reduced.

@jixuan1989
Copy link
Member

Hi, no test... could you add a screenshot to show the correction?
BTW, once PR #3024 is merged, we can add IT for the cluster.

@yifuzhou
Copy link
Contributor Author

Hi all. This PR is still in progress actually. In this PR, first, if we do not set the 'limit', it will fetch as large as it can, it may cause the problem of Memory Explosion. Second, the sequence of fetch data is not in order, I think it violates the purpose of 'offset' and 'limit' field because we can find what we want by SHOW TIMESERIES OFFSET ? LIMIT ?.
I will fix it later and after the IT for cluster merged, I will add the test case of this part, thanks!

@qiaojialin
Copy link
Member

Hi, to make the offset and limit meaningful in show timeseries, we need to manage timeseries in some order in the MTree firstly.
Besides, I remember that hasNextWithoutConstraint() does not consider the offset and limit. Only hasNext considers the offset and limit.

@OneSizeFitsQuorum
Copy link
Contributor

Hi all. This PR is still in progress actually. In this PR, first, if we do not set the 'limit', it will fetch as large as it can, it may cause the problem of Memory Explosion. Second, the sequence of fetch data is not in order, I think it violates the purpose of 'offset' and 'limit' field because we can find what we want by SHOW TIMESERIES OFFSET ? LIMIT ?.
I will fix it later and after the IT for cluster merged, I will add the test case of this part, thanks!

ping, you can add IT tests now~

@jt2594838 jt2594838 merged commit 55d9260 into apache:master May 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Module - Cluster PRs for the cluster module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants