Conversation
sanoursa
left a comment
There was a problem hiding this comment.
I don't think this is really a new protocol per se, but I do like the idea of picking a terminator character that is less likely to appear in the body of a message. Changing the encoding could also be a reasonable thing to do, as long as it's meaningfully improving reliability and not overly hurting perf.
| Debug.Assert(bytesRead != 0, "Not expecting 0 bytes at this time"); | ||
| } | ||
|
|
||
| return Encoding.UTF8.GetString(bytes.ToArray()); |
There was a problem hiding this comment.
It seems like the real value add of this overall change is using a different terminator character. What additional protection do we get from using UTF8 encoding? At first glance, it appears that this will just add perf overhead.
There was a problem hiding this comment.
Good points - you are correct - the main value of this change is using a different terminator character. We don't need to use UTF8 (it doesn't add any additional protection) - we just need the sender / receiver to agree on the encoding. There are some reasons I chose UTF8 for this iteration:
-
GVFS currently sends bytes as UTF8 (implicitly via the
StreamReader/StreamWriterclasses). Using UTF8 again would be keeping with current behavior, so there shouldn't be any additional overhead. -
The native code that reads / writes data from the named pipe treats the data as ASCII (which I understand is a subset of UTF8). If we switched away from UTF8, we would need to do convert the encoding in the native code.
-
The ModifiedPathsList hook also reads data from the named pipe and directly sends those bytes to git, which works with UTF8 (at least for our current scenarios), but would not work with UTF16 (which is the default representation of strings in C#).
I propose we continue using UTF8 (explicitly now) for this change? We can consider if a different encoding would make more sense...
Right - this is just a tweak / refinement of the existing behavior. I can tweak the PR description to make the intention more clear.
I wasn't trying to change the encoding (as you point out - UTF8 encoding doesn't afford extra protection) - but am just being explicit about it now. GVFS currently uses UTF8 (implicitly via the |
wilbaker
left a comment
There was a problem hiding this comment.
Just a few questions, overall the approach looks good to me
| } | ||
|
|
||
| [TestCase] | ||
| [Category(Categories.MacTODO.M4)] |
There was a problem hiding this comment.
Does this fail on Mac? If so, what's the error (out of curiosity)?
There was a problem hiding this comment.
I have not tried running this on the Mac - I will test and see what happens (or remove this, if possible)
There was a problem hiding this comment.
This test fails on the "cleanup" code (not related to the protocol changes). After "resetting" the repository, there are unexpected changes in the test repository. Because of this (and because the failure is not related to the protocol changes), I will leave this as a to for Mac
GVFS/GVFS.ReadObjectHook/main.cpp
Outdated
| #define DLO_RESPONSE_LENGTH 2 | ||
| // "S\x3" -> Success | ||
| // "F\x3" -> Failure | ||
| #define DLO_RESPONSE_LENGTH 3 |
There was a problem hiding this comment.
Why was this changed from 2 to 3?
There was a problem hiding this comment.
This should be a 2 - I will undo that
77681ee to
d226563
Compare
sanoursa
left a comment
There was a problem hiding this comment.
Curious to hear your thoughts on a length field rather than a terminator, and some other minor cleanup
d226563 to
6278f87
Compare
sanoursa
left a comment
There was a problem hiding this comment.
The optimized read here avoids the reallocations, so that looks good to me. Thanks for making those changes.
| // this was the end of the message), but the stream has been closed. Throw an exception | ||
| // and let upper layer deal with this condition. | ||
|
|
||
| throw new IOException("Incomplete message read from stream. Stream closed before message terminating byte."); |
There was a problem hiding this comment.
I would throw a BrokenPipeException here
There was a problem hiding this comment.
I can do that - I was trying to decide between those two exceptions. Currently, NamedPipeClient and NamedPipeServer both handle IOException , which is why I chose this exception. If I switch to BrokenPipeException, I will also update the handling code in NamedPipeServer to handle BrokenPipeExceptions in the same manner as it currently does for BrokenPipeExceptions.
I can go with either approach - do you have a preference?
There was a problem hiding this comment.
I'm guessing we were handling IOExceptions before because that's what the built in reader/writer were throwing. I would update this to throw/catch our specific exception type, and remove the handler for IOException unless there's still a valid reason for it.
There was a problem hiding this comment.
Yeah - the underlying Stream class can still throw IOException. We could have the NamedPipeStreamReader and NamedPipeStreamWriter class catch and wrap all underlying IOExceptions as BrokenPipeExceptions.
Would you be OK with keeping this as an IOException for now, which would allow us to keep all existing error handling in place? I think it would be easier if we extracted that change into its isolated change, so we can see the update to the exception type (and all handlers) by itself?
| /// </summary> | ||
| public class NamedPipeStreamReader | ||
| { | ||
| public const int BufferLength = 1024; |
There was a problem hiding this comment.
I think the only external code that uses this is unit tests, so please make it private. In my opinion it's not a good idea to share constants between production and test code, because they will just end up agreeing with each other no matter what.
There was a problem hiding this comment.
I was trying to enforce wat that we test sending messages that are larger than the buffer size. If we completely hide this, then we could update the production code to use a buffer larger than what the test is using.
What if we instead expose an option for callers to control the buffer size. Then the tests could force a smaller buffer size and ensure it is testing the expected functionality...
There was a problem hiding this comment.
I wouldn't do either of those things. A unit test should be testing a requirement, not knowledge of an implementation. So the unit test should decide how big of a message is reasonable to send and test that limit, and the implementation is then responsible for handling that.
| catch (IOException) | ||
| catch (IOException ex) | ||
| { | ||
| this.tracer.RelatedError($"Error reading message from NamedPipe: {ex.Message}"); |
There was a problem hiding this comment.
Minor suggestions:
- Include the full stack trace
- Create an EventMetadata and add a value for the stack trace
|
|
||
| if (bytesRead == 0) | ||
| { | ||
| // We have read a partial message (the last byte received does not indicate that |
There was a problem hiding this comment.
Should we log the partial message?
79e1fd2 to
524c434
Compare
GVFS currently sends a text based data across a named pipe with lines / messages separated by newline characters. This prohibits sending messages that include a newline as part of the message content. The main current manifestation of this problem is that GVFS does not handle git command lines that contain a newline character. There is another problem that GVFS needs to send paths across the named pipe (for ModifiedPathsList queries), and these paths can include newline characters on certain filesystems (macOS, Linux). Additionally, the protocol that GVFS uses to communicate across the named pipe is currently not consistent. It usually uses a text based stream, but for the ModifiedPathsList response, it uses null bytes to seperate entries in a list. To address these issues, this change tweaks the protocol used to communicate via the named pipe to use the 0x3 byte to indicate the end of a line / message (this is the End of text ASCII code).
67431a5 to
659c2d4
Compare
Tweak the protocol used for communicating over Named Pipes. Named Pipes is the mechanism used for interprocess communication between the git hooks, mount process, service, etc... This is to address problems of transmitting messages that include newline characters (i.e. #142).
Data is currently encoded / transmitted using a
TextWriterderived class, which uses newline characters to delineate the end of a line of text. This approach will have trouble transmitting messages that include newline characters as part of the message. Any encoding scheme we choose will need to replicated for both a managed and native implementation.To address this issue, this PR switches to writing bytes into the named pipe (instead of text into a
StreamWriter), and switching the byte value used to delineate the end of a message to0x3(ASCII code for End of text). Text will be encoded via UTF-8, as that is the format currently used implicitly via theStreamWriter. This will result in the minimal change, and not require further reactions on the native pipe clients to handle a different encoding.