Skip to content

[io] fix crash due to overflow in buffer length variable#14627

Merged
vepadulano merged 2 commits intoroot-project:masterfrom
ferdymercury:bufferio
Feb 21, 2024
Merged

[io] fix crash due to overflow in buffer length variable#14627
vepadulano merged 2 commits intoroot-project:masterfrom
ferdymercury:bufferio

Conversation

@ferdymercury
Copy link
Collaborator

@ferdymercury ferdymercury commented Feb 8, 2024

This Pull request:

Changes or fixes:

ROOT is known to have a 1 GB maximum IO size, see #6734, but did not provide proper safety checks against this (since 18y). This lead to silent crashes in the WriteFastArray function. The length of the buffer Int_t l takes a negative value if we do not add these checks and n is high enough.

This fixes the crash reported here: https://root-forum.cern.ch/t/crash-when-writing-canvas-to-tfile/58010/8?u=ferhue
A reproducer of the crash is also in that link.
Seems there are more people seeing a similar crash on the forum https://root-forum.cern.ch/t/writing-out-events-larger-than-1-gb/33106

I did not check in depth, but this PR also probably fixes cms-sw/cmssw#40132

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

Fixes #14644

@ferdymercury ferdymercury requested a review from pcanal as a code owner February 8, 2024 09:26
@phsft-bot

This comment was marked as outdated.

@phsft-bot

This comment was marked as outdated.

@phsft-bot

This comment was marked as outdated.

1 similar comment
@phsft-bot

This comment was marked as outdated.

@ferdymercury ferdymercury added bug affects:master in:I/O experiment Affects an experiment / reported by its software & computimng experts affects:6.28 affects:6.30 labels Feb 8, 2024
@phsft-bot

This comment was marked as outdated.

1 similar comment
@phsft-bot

This comment was marked as outdated.

@phsft-bot

This comment was marked as outdated.

@phsft-bot

This comment was marked as outdated.

@phsft-bot

This comment was marked as outdated.

@phsft-bot

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@phsft-bot

This comment was marked as outdated.

@ferdymercury ferdymercury requested a review from pcanal February 8, 2024 16:47
@pcanal
Copy link
Member

pcanal commented Feb 9, 2024

Maybe I create a templated function for this?

Sure. It makes sense.

@ferdymercury
Copy link
Collaborator Author

Maybe I create a templated function for this?

Sure. It makes sense.

Finally I did not do it because there are cases with fixed width of 8 that are not data types. So like now I think it's compact enough.
What do you think?

In the future, maybe the whole WriteFastArray function could be templates with some specializations/overriding for the fixed 8byte stuff.

@ferdymercury ferdymercury linked an issue Feb 9, 2024 that may be closed by this pull request
1 task
Copy link
Member

@vepadulano vepadulano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much! Here some comments/suggestions

Also, I would agree we should introduce the Fatal everywhere the 1GB limit may play a role. Maybe the other classes can be improved in a follow-up PR though

@phsft-bot
Copy link

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

1 similar comment
@phsft-bot
Copy link

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@phsft-bot
Copy link

Build failed on ROOT-ubuntu2204/nortcxxmod.
Running on root-ubuntu-2204-3.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on ROOT-ubuntu2004/python3.
Running on root-ubuntu-2004-1.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on ROOT-performance-centos8-multicore/soversion.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

Failing tests:

Copy link
Member

@vepadulano vepadulano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ferdymercury thank you so much for this! Everything looks good on my side now. We have a bit of a noisy CI these days but the failures look unrelated to me.

Copy link
Member

@pcanal pcanal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this nice improvements.

@ferdymercury
Copy link
Collaborator Author

Thanks to you for your review!

Should we add a new issue to keep track of applying similar changes in the future to:

  • TBufferSQL*
  • TBufferJSON*
  • TBufferXML*

@vepadulano
Copy link
Member

Thanks to you for your review!

Should we add a new issue to keep track of applying similar changes in the future to:

* TBufferSQL*

* TBufferJSON*

* TBufferXML*

Done #14770

@vepadulano
Copy link
Member

@ferdymercury Maybe also in this PR it would be nice to clean the commit history, I guess we want to keep the first two commits as they are independent non-trivial changes but squash the following 2 commits as they were part of the review process

Fixes root-project#14644

ROOT is known to have a 1 GB maximum IO size, but did not provide since safety checks against this (since 18y). This lead to silent crashes. The length of the buffer Int_t l takes a negative value if we do not add these checks and n is high enough.

[io] add underflow safety checks for writefastarray-objects

[io] check for already preallocated elements

apply pcanals suggestion

[io] prevent underflow in SetByteCount

cnt variable could go into negative otherwise

[io] simplify width

Co-authored-by: Philippe Canal <pcanal@fnal.gov>

[io] more detailed assertion

[io] raise fatal error instead of silently skipping

apply pcanals suggestion
@phsft-bot
Copy link

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@ferdymercury
Copy link
Collaborator Author

@ferdymercury Maybe also in this PR it would be nice to clean the commit history, I guess we want to keep the first two commits as they are independent non-trivial changes but squash the following 2 commits as they were part of the review process

Done :)

@phsft-bot
Copy link

Build failed on ROOT-ubuntu2204/nortcxxmod.
Running on root-ubuntu-2204-3.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on ROOT-ubuntu2004/python3.
Running on root-ubuntu-2004-1.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@pcanal
Copy link
Member

pcanal commented Feb 20, 2024

I am not sure how but the failure in Ubuntu2310: https://github.com/root-project/root/actions/runs/7971661830/job/21783650878?pr=14627 seems limites to this PR ....

@ferdymercury
Copy link
Collaborator Author

I am not sure how but the failure in Ubuntu2310: https://github.com/root-project/root/actions/runs/7971661830/job/21783650878?pr=14627 seems limited to this PR ....

Weird. No clue.
I just see:

grep -r WriteFast * -n
bindings/pyroot/pythonizations/src/CPPInstancePyz.cxx:50:      buf->WriteFastArray(PyBytes_AS_STRING(pybuf), PyBytes_GET_SIZE(pybuf));

@vepadulano
Copy link
Member

vepadulano commented Feb 21, 2024

I am not sure how but the failure in Ubuntu2310: https://github.com/root-project/root/actions/runs/7971661830/job/21783650878?pr=14627 seems limites to this PR ....

This test sporadically fails on that platform, not the first time I see it. I doubt it is related to this PR. Let me rerun just that platform

@vepadulano
Copy link
Member

Latest ubuntu2310 run is green, the error is not related to this PR strictly. Merging now, thanks again @ferdymercury for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

affects:master affects:6.28 affects:6.30 bug experiment Affects an experiment / reported by its software & computimng experts in:I/O

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[io] crash due to overflow in buffer length variable ROOT write errors in prompt reconstruction

4 participants