da06b5c to fb6840a
I've updated the proposal so that […]
code looks good to me, utACK fb6840a
I know it is bikeshedding, but I kinda want to rename […]
(just saw this) I presume this is some sort of Simplicity groundwork?
Nope. Taproot is effectively a Simplicity prerequisite, and I'd somewhat like these opcodes added along with taproot to make Covenants and Oracles a little bit easier by not having to contend with the 520-byte (or whatever) stack item limit.
Oh I misread the top post, nevermind.
fb6840a to 15961e0
```cpp
bool CSHA256::SafeWrite(const unsigned char* data, size_t len) {
    const uint64_t SHA256_MAX = 0x1FFFFFFFFFFFFFFF; // SHA256's maximum allowed message length in bytes.
    if (SHA256_MAX < bytes || SHA256_MAX - bytes < len) return false;
```
Checking for SHA256_MAX - MAX_SCRIPT_ELEMENT_SIZE in Load() would allow getting rid of three ifs (turning if (!ctx.SafeWrite(...)) into ctx.Write(...)), because Write() would then always be safe with regard to this check: the size of data on the stack cannot be bigger than MAX_SCRIPT_ELEMENT_SIZE. Why do you think that doing this in Write() is better? Note that in OP_SHA256INITIALIZE it is even impossible to fail this check, because the context is new and can only grow up to MAX_SCRIPT_ELEMENT_SIZE. I think an if check is due in Load(), and here this condition can just be checked with an assert to be safe.
Is it because you want to allow UPDATE with a user-supplied midstate to be able to go up to SHA256_MAX exactly?
You could make Load() take the length and then check SHA256_MAX - bytes < len inside Load().
Although this makes the Load/Save API less generic. But if you want it to be more generic so it can be used in other places, you probably want that check inside Load() anyway, because in general you cannot prevent anyone from using plain Write() instead of SafeWrite() after Load().
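For concreteness, a rough sketch of that variant; the signature, the field layout, and the midstate word order are assumptions rather than code from this PR, and the deserialization checks follow the Load() snippet quoted further down:

```cpp
#include <crypto/common.h> // ReadLE64, ReadBE32
#include <crypto/sha256.h>
#include <cstring>
#include <vector>

// Sketch: Load() takes the number of bytes the caller intends to Write()
// afterwards and fails early if that would exceed SHA256's maximum message
// length, so a plain Write() is always safe after a successful Load().
bool CSHA256::Load(const std::vector<unsigned char>& vch, size_t len)
{
    if (vch.size() < 40) return false;
    uint64_t bits = ReadLE64(&vch[32]);
    size_t buf_size = (bits >> 3) % 64;
    if ((bits & 0x07) != 0 || vch.size() != 40 + buf_size) return false;

    const uint64_t SHA256_MAX = 0x1FFFFFFFFFFFFFFF; // max message length in bytes
    if (SHA256_MAX - (bits >> 3) < len) return false; // 'fail early'

    for (int i = 0; i < 8; ++i) s[i] = ReadBE32(&vch[i * 4]); // midstate (word order assumed)
    bytes = bits >> 3;
    std::memcpy(buf, vch.data() + 40, buf_size);
    return true;
}
```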
I suppose I could write Load to return an int64_t giving the number of bytes used so far (or alternatively the number of remaining bytes available to be processed), returning -1 on failure.
Is this really all worthwhile just to avoid one if statement in SHA256INITIALIZE?
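Sketching the second alternative (remaining capacity), under the same includes and assumptions as the sketch above:

```cpp
// Variant: Load() returns the number of bytes that can still be written
// before the SHA256 message-length limit is hit, or -1 if the serialized
// context is invalid; call sites compare the result against the length
// they are about to Write().
int64_t CSHA256::Load(const std::vector<unsigned char>& vch)
{
    if (vch.size() < 40) return -1;
    uint64_t bits = ReadLE64(&vch[32]);
    size_t buf_size = (bits >> 3) % 64;
    if ((bits & 0x07) != 0 || vch.size() != 40 + buf_size) return -1;

    for (int i = 0; i < 8; ++i) s[i] = ReadBE32(&vch[i * 4]); // midstate (word order assumed)
    bytes = bits >> 3;
    std::memcpy(buf, vch.data() + 40, buf_size);

    const uint64_t SHA256_MAX = 0x1FFFFFFFFFFFFFFF;
    return int64_t(SHA256_MAX - (bits >> 3)); // remaining capacity in bytes
}
```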
I suppose we also get rid of SafeWrite, which is kinda nice.
All three if statements that are in place for SafeWrite now are not needed if Write is always safe, i.e. if the SHA256_MAX - bytes < len check is done inside Load(), at least with regard to manipulating the state from the script. Load() can just fail if the subsequent Write() would be invalid. But my opinion is not strong on this; it is mostly bikeshedding. I would add, though, that checking inside Load() follows the 'fail early' principle, which I favour.
Having Load take an argument giving the number of bytes I intend to write later is such an unusual API that I am very hesitant to implement it that way.
Just rename Load to Prepare then :-)
As I said, my opinion on this is not strong. Speaking of returning the number of bytes still available to process from Load and then checking it: that might be messier than the SafeWrite approach, because you will need to have that check in two places (in UPDATE and FINALIZE).
```cpp
uint64_t bits = ReadLE64(&vch[32]);
size_t buf_size = (bits >> 3) % 64;

if ((bits & 0x07) != 0 || vch.size() != 40 + buf_size) return false;
```
Maybe add a comment to state that the bits & 0x07 check is there to prevent malleability?
It's a bit more than that: the SHA256 class is simply incapable of processing sub-byte data as it stands. But I can write a comment to that effect.
I mean, you could just do bits &= ~0x07 and not do the check, but this would introduce non-determinism, and the comment could tell the reader why that is not a good idea.
I had hoped it went without saying that one shouldn't mangle an invalidly serialized data blob into a valid one.
The validity of serialized data depends on the definition of validity, but that is not explicitly specified here (the serialization code in Save() is only one possible implementation of an implicit specification). On the other hand, you can view the (bits & 0x07) != 0 check as a statement of the de-facto specification "low-order bits should be zero", and in this light no extra comment seems necessary.
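For illustration, the check with such a comment might read (the wording is a suggestion, not from the PR):

```cpp
// The bit counter must denote a whole number of bytes: CSHA256 cannot
// process sub-byte data, and accepting nonzero low-order bits would make
// the serialized context malleable.
if ((bits & 0x07) != 0 || vch.size() != 40 + buf_size) return false;
```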
15961e0 to 5a6b7b1
5a6b7b1 to e70448f
See some discussion of the concept here: https://lists.linuxfoundation.org/pipermail/lightning-dev/2019-October/002218.html
There was some discussion on whether it should be allowed to pass in midstates or not. I think a lot of scripts can benefit from a batch-update-style API where you specify how many items to take off the stack.
One data point regarding the linked 2019 discussion: I encountered the limitation of OP_CAT in practice, where it limited the number of possible outputs (the 520-byte stack item limit), and thus one option for a complex covenant was not possible. Some of the outputs were blinded, and thus much larger than normal outputs; I could fit two blinded outputs, two non-blinded outputs, plus a fee output in a transaction, but a third blinded output was too much. With streaming SHA256 operations, this limitation does not exist.
This is of course specific to Elements, where blinded outputs (which are larger than normal outputs) are possible. Without blinding, the 520-byte limit might not be a problem for practical purposes.
It's true for checktemplateverify that it imposes a limitation on how many outputs/inputs you can add dynamically, so it's true beyond Elements.
This particular limitation has no connection to CTV, because fixing the outputs hash via CHECKSIGFROMSTACK does not impose any limits by itself, as the outputs hash being checked comes from just a normal segwit sighash. This is about dynamically generating that outputs hash within the script from a fixed part (which enforces the covenant) and a dynamic part (explicit fees and non-restricted outputs). CAT is used for that. The full outputs data needs to be constructed on the stack before doing SHA256 on it, and this is where the 520-byte stack item size limit becomes a factor.
@dgpv that's exactly what I'm saying; it imposes a limit if you're dynamically adding data to a CTV template using op_cat. Hence sha256stream being preferred there too.
This is kind of ancient I guess, but: rather than having "update" plus "initialize-and-update", just have "update" and "finalize" treat an empty stack element as an empty context. Then you can ditch "initialize" entirely.
gwillen, if I understand your comment, that would entail requiring an extra OP_0 opcode to create the empty context in all typical uses of this functionality.
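For concreteness, a sketch of that variant, assuming a bool-returning CSHA256::Load(const valtype&) deserializer along the lines discussed above (the helper name is hypothetical):

```cpp
#include <crypto/sha256.h>
#include <vector>

using valtype = std::vector<unsigned char>;

// Treat an empty stack element as the initial SHA256 context, so UPDATE and
// FINALIZE alone suffice; the trade-off is the extra OP_0 push noted above.
static bool ContextFromStackElement(const valtype& elem, CSHA256& ctx)
{
    if (elem.empty()) return true; // empty element == fresh context
    return ctx.Load(elem);         // otherwise restore the saved midstate
}
```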
Closing, superseded by #1020
My proposal is:

- OP_SHA256INITIALIZE to pop a bytestring and push a SHA256 context created by adding the bytestring to the initial SHA256 context.
- OP_SHA256UPDATE to pop a SHA256 context and a bytestring, and push an updated context created by adding the bytestring to the data stream being hashed.
- OP_SHA256FINALIZE to pop a SHA256 context and a bytestring, and push a SHA256 hash value after adding the bytestring and completing the padding.

I propose defining the SHA256 context as a bytestring that is 40 to 103 bytes in length, where:
- the first 32 bytes are the SHA256 midstate;
- the next 8 bytes are the number of message bits processed so far, as a little-endian 64-bit integer;
- the remaining 0 to 63 bytes are the SHA256 buffer, whose length must be X/8 % 64 (equiv. (X % 512) / 8), where X is the integer value of the previous field.

Any invalid SHA256 context will cause the operations that process the context to fail.
If any input would cause the SHA256 bit counter to overflow, the operation will fail.
I could easily be convinced to use bytes instead of bits in the second field. Bits is nicely aligned with how SHA-256 works, and the size is perfectly aligned. OTOH, bytes is what the SHA256 internal state uses in practice and would be marginally faster to process.
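For concreteness, a Save() serializing the context in this format might look as follows; this is a sketch under the layout above (the midstate word order is an assumption), not code from the PR:

```cpp
#include <crypto/common.h> // WriteLE64, WriteBE32
#include <crypto/sha256.h>
#include <cstring>
#include <vector>

// Serialize as: 32-byte midstate || 8-byte little-endian bit counter ||
// 0-63 byte partial-block buffer. Total size 40 + (bytes % 64), i.e. 40-103.
std::vector<unsigned char> CSHA256::Save() const
{
    size_t buf_size = bytes % 64; // input bytes not yet compressed
    std::vector<unsigned char> vch(40 + buf_size);
    for (int i = 0; i < 8; ++i) WriteBE32(&vch[i * 4], s[i]); // midstate (word order assumed)
    WriteLE64(&vch[32], bytes << 3); // message length in bits
    std::memcpy(vch.data() + 40, buf, buf_size);
    return vch;
}
```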
Motivation: There already exists an OP_SHA256 opcode that performs SHA256 computations. However, because MAX_SCRIPT_ELEMENT_SIZE is 520 bytes, that is the maximum message size that OP_SHA256 can operate on. These proposed operations allow us to circumvent the MAX_SCRIPT_ELEMENT_SIZE limit on message size; however, these opcodes might be subject to abuse (e.g. freestart SHA-256).
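To make the motivation concrete, here is roughly what a script like `<c1> OP_SHA256INITIALIZE <c2> OP_SHA256UPDATE <c3> OP_SHA256FINALIZE` would compute, expressed with the CSHA256 class (a sketch; the function and chunk names are illustrative):

```cpp
#include <crypto/sha256.h>
#include <vector>

using valtype = std::vector<unsigned char>;

// Each chunk is at most MAX_SCRIPT_ELEMENT_SIZE (520) bytes, but the total
// message hashed can be arbitrarily larger than any single stack element.
void StreamingSha256(const valtype& c1, const valtype& c2, const valtype& c3,
                     unsigned char out[CSHA256::OUTPUT_SIZE])
{
    CSHA256 ctx;                     // OP_SHA256INITIALIZE starts from the
    ctx.Write(c1.data(), c1.size()); // initial context and absorbs c1
    ctx.Write(c2.data(), c2.size()); // OP_SHA256UPDATE absorbs c2
    ctx.Write(c3.data(), c3.size()); // OP_SHA256FINALIZE absorbs c3,
    ctx.Finalize(out);               // completes the padding, and pushes the digest
}
```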