Skip to content

Conversation

@austinkelleher
Copy link
Contributor

@austinkelleher austinkelleher commented Apr 8, 2022

Description

The specification of atob has various different conditions that we need
to abide by.

See: https://infra.spec.whatwg.org/#forgiving-base64-decode

Fixes: #42646

Considerations

Each of the new cases that are handled added could be considered breaking changes. Consider the following tests run directly from the Chrome console (Version 100.0.4896.75 (Official Build) (arm64)):

atob(undefined)
VM85:1 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.
    at <anonymous>:1:1

atob(false)
VM123:1 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.
    at <anonymous>:1:1

atob(1)
VM152:1 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.

atob(false)
VM187:1 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.

atob(0)
VM217:1 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.
    at <anonymous>:1:1

atob('a')
VM234:1 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.
    at <anonymous>:1:1

atob('a ')
VM243:1 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.
    at <anonymous>:1:1

atob(' a')
VM257:1 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.
    at <anonymous>:1:1
    
atob('aaaaa')
VM283:1 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.
    at <anonymous>:1:1

@nodejs-github-bot nodejs-github-bot added buffer Issues and PRs related to the buffer subsystem. needs-ci PRs that need a full CI run. labels Apr 8, 2022
@austinkelleher austinkelleher force-pushed the 42646-buffer-validation branch from a924347 to 473c352 Compare April 8, 2022 21:19
@austinkelleher
Copy link
Contributor Author

Additional note - I'm happy to split this change into multiple PRs if it's helpful. 😄

@austinkelleher austinkelleher force-pushed the 42646-buffer-validation branch from 473c352 to 2ad1a00 Compare April 8, 2022 21:23
@zloirock
Copy link

zloirock commented Apr 8, 2022

atob now immediately throws when undefined, false, or a number is supplied

It should be converted to a string, so

atob(1234) === '×mø'

is correct

@austinkelleher
Copy link
Contributor Author

@zloirock Thank you for catching...I'll update the PR.

The specification of `atob` has various different conditions that we
need to abide by. The specific changes that were made:

* `atob` now immediately throws when `undefined`, `false`, or a `number`
  is supplied
* `atob` now strips ASCII whitespace before attempting to decode
* `atob` now validates that the code point's length divided by 4 leaves
  a remainder that is not 1

See: https://infra.spec.whatwg.org/#forgiving-base64-decode

Fixes: nodejs#42646
* Use approach of counting non-ASCII whitespace characters instead of
  removing the characters via string replacement
* Additional atob validation tests have been added
@austinkelleher austinkelleher force-pushed the 42646-buffer-validation branch from 1c109ce to 08e4c75 Compare April 9, 2022 12:54
@austinkelleher austinkelleher changed the title buffer: atob throw error when the input string is invalid buffer: atob throw error when the input value is invalid Apr 9, 2022
* Chrome and Firefox both use `InvalidCharacterError` when the string
  to be decoded is not correctly encoded
* Remove unnecessary atob test
Copy link
Contributor

@aduh95 aduh95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: we could keep one array and save a lookup per char.Given the fact that this is a legacy API for which performance are not really a concern, you can ignore this suggestion if you think it makes the code harder to read.

* Small optimization in atob validation by assuming positions of
  whitespace characters in allowed chars array
* Improved tests for ASCII whitespace characters
* Explicitly validate properties of `DOMException` in tests
@austinkelleher austinkelleher force-pushed the 42646-buffer-validation branch from 689d51f to a5828d8 Compare April 9, 2022 14:26
@aduh95 aduh95 added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 9, 2022
@nodejs-github-bot
Copy link
Collaborator

@austinkelleher
Copy link
Contributor Author

It seems that some of the tests may have failed intermittently unless I'm missing something. Anything that I'm missing? If not, can someone please re-trigger? Thanks!

@nodejs-github-bot
Copy link
Collaborator

@austinkelleher
Copy link
Contributor Author

austinkelleher commented Apr 11, 2022

It looks like all checks have passed except for the pending node-test-linux-linked-debug. When I review the Jenkins job output, it looks like that job has actually succeeded?

@aduh95 aduh95 merged commit 7533d08 into nodejs:master Apr 12, 2022
@aduh95
Copy link
Contributor

aduh95 commented Apr 12, 2022

Landed in 7533d08, thanks for the contribution 🎉

@danielleadams
Copy link
Contributor

@austinkelleher should this land in v16.x? If so, do you mind creating a backport PR for this to v16.x-staging? (docs) Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

author ready PRs that have at least one approval, no pending requests for changes, and a CI started. buffer Issues and PRs related to the buffer subsystem. needs-ci PRs that need a full CI run.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

atob should throw an error if the input is not correctly encoded

7 participants