Tweak Named pipe protocol by jamill · Pull Request #259 · microsoft/VFSForGit

jamill · 2018-09-11T22:50:53Z

Tweak the protocol used for communicating over Named Pipes. Named Pipes is the mechanism used for interprocess communication between the git hooks, mount process, service, etc. This is to address problems with transmitting messages that include newline characters (see #142).

Data is currently encoded and transmitted using a TextWriter-derived class, which uses newline characters to mark the end of a line of text. This approach cannot reliably transmit messages that contain newline characters as part of the message. Any encoding scheme we choose will need to be replicated in both a managed and a native implementation.

To address this issue, this PR switches to writing bytes into the named pipe (instead of text into a StreamWriter) and changes the byte value that marks the end of a message to 0x3 (the ASCII code for End of Text). Text is encoded as UTF-8, the format currently used implicitly via the StreamWriter. This keeps the change minimal and does not require the native pipe clients to handle a different encoding.
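As a rough illustration of the framing described above (not the code from this PR), the sketch below appends a 0x3 byte to a UTF-8 payload and splits a receive buffer on that byte. It is written in C++ in the spirit of the native pipe clients. `FrameMessage`, `TryExtractMessage`, and `kMessageTerminator` are hypothetical names, and the sketch assumes the payload itself never contains 0x3.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Terminator taken from the description above: 0x3 (ASCII End of Text).
constexpr char kMessageTerminator = '\x03';

// Hypothetical helper: append the terminator to a UTF-8 payload.
// Assumption: the payload does not already contain the terminator byte.
std::string FrameMessage(const std::string& utf8Payload)
{
    assert(utf8Payload.find(kMessageTerminator) == std::string::npos);
    return utf8Payload + kMessageTerminator;
}

// Hypothetical helper: move the first complete message out of `buffer`
// (dropping its terminator) into `message`. Returns false, leaving both
// arguments untouched, if no terminator has arrived yet.
bool TryExtractMessage(std::string& buffer, std::string& message)
{
    const std::size_t end = buffer.find(kMessageTerminator);
    if (end == std::string::npos)
    {
        return false;
    }

    message = buffer.substr(0, end);
    buffer.erase(0, end + 1);
    return true;
}
```

Because the terminator is no longer a newline, a payload such as a git command line with an embedded `\n` passes through unchanged.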

sanoursa

I don't think this is really a new protocol per se, but I do like the idea of picking a terminator character that is less likely to appear in the body of a message. Changing the encoding could also be a reasonable thing to do, as long as it's meaningfully improving reliability and not overly hurting perf.

sanoursa · 2018-09-11T23:01:43Z

GVFS/GVFS.Common/NamedPipes/NamedPipeStreamReader.cs

+                Debug.Assert(bytesRead != 0, "Not expecting 0 bytes at this time");
+            }
+
+            return Encoding.UTF8.GetString(bytes.ToArray());


It seems like the real value add of this overall change is using a different terminator character. What additional protection do we get from using UTF8 encoding? At first glance, it appears that this will just add perf overhead.

Good points - you are correct - the main value of this change is using a different terminator character. We don't need to use UTF8 (it doesn't add any additional protection) - we just need the sender / receiver to agree on the encoding. There are some reasons I chose UTF8 for this iteration:

GVFS currently sends bytes as UTF8 (implicitly via the StreamReader / StreamWriter classes). Using UTF8 again would be keeping with current behavior, so there shouldn't be any additional overhead.

The native code that reads / writes data from the named pipe treats the data as ASCII (which I understand is a subset of UTF8). If we switched away from UTF8, we would need to convert the encoding in the native code.

The ModifiedPathsList hook also reads data from the named pipe and directly sends those bytes to git, which works with UTF8 (at least for our current scenarios), but would not work with UTF16 (which is the default representation of strings in C#).

I propose we continue using UTF8 (explicitly now) for this change. We can consider later whether a different encoding would make more sense.

NamedPipeProtocol.md

jamill · 2018-09-12T01:01:06Z

I don't think this is really a new protocol per se,

Right - this is just a tweak / refinement of the existing behavior. I can tweak the PR description to make the intention more clear.

Changing the encoding could also be a reasonable thing to do, as long as it's meaningfully improving reliability and not overly hurting perf.

I wasn't trying to change the encoding (as you point out - UTF8 encoding doesn't afford extra protection) - but am just being explicit about it now. GVFS currently uses UTF8 (implicitly via the StreamWriter), and the native readers currently expect ASCII / UTF8. I can also include this in the PR description to make this point more clear.

wilbaker

Just a few questions, overall the approach looks good to me

wilbaker · 2018-09-12T20:58:33Z

GVFS/GVFS.FunctionalTests/Tests/GitCommands/UpdateIndexTests.cs

        }

+        [TestCase]
+        [Category(Categories.MacTODO.M4)]


Does this fail on Mac? If so, what's the error (out of curiosity)?

I have not tried running this on the Mac - I will test and see what happens (or remove this, if possible)

This test fails in the "cleanup" code (not related to the protocol changes). After "resetting" the repository, there are unexpected changes in the test repository. Because of this (and because the failure is not related to the protocol changes), I will leave this as a TODO for Mac.

wilbaker · 2018-09-12T21:02:31Z

GVFS/GVFS.ReadObjectHook/main.cpp

-#define DLO_RESPONSE_LENGTH 2
+// "S\x3" -> Success
+// "F\x3" -> Failure
+#define DLO_RESPONSE_LENGTH 3


Why was this changed from 2 to 3?

This should be a 2 - I will undo that

jamill · 2018-09-17T13:09:29Z

This is ready for review now. Thanks for the initial feedback on the approach!

/cc @sanoursa @benpeart @wilbaker

sanoursa

Curious to hear your thoughts on a length field rather than a terminator, and some other minor cleanup

GVFS/GVFS.Common/NamedPipes/NamedPipeStreamReader.cs

sanoursa

The optimized read here avoids the reallocations, so that looks good to me. Thanks for making those changes.

sanoursa · 2018-09-19T15:09:22Z

GVFS/GVFS.Common/NamedPipes/NamedPipeStreamReader.cs

+                    // this was the end of the message), but the stream has been closed. Throw an exception
+                    // and let upper layer deal with this condition.
+
+                    throw new IOException("Incomplete message read from stream. Stream closed before message terminating byte.");


I would throw a BrokenPipeException here

I can do that - I was trying to decide between those two exceptions. Currently, NamedPipeClient and NamedPipeServer both handle IOException, which is why I chose this exception. If I switch to BrokenPipeException, I will also update the handling code in NamedPipeServer to handle BrokenPipeExceptions in the same manner as it currently does for IOExceptions.

I can go with either approach - do you have a preference?

I'm guessing we were handling IOExceptions before because that's what the built in reader/writer were throwing. I would update this to throw/catch our specific exception type, and remove the handler for IOException unless there's still a valid reason for it.

Yeah - the underlying Stream class can still throw IOException. We could have the NamedPipeStreamReader and NamedPipeStreamWriter class catch and wrap all underlying IOExceptions as BrokenPipeExceptions.

Would you be OK with keeping this as an IOException for now, which would allow us to keep all existing error handling in place? I think it would be easier if we extracted that work into its own isolated change, so we can see the update to the exception type (and all handlers) by itself.
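To make the condition under discussion concrete, here is a minimal C++ sketch (not the C# reader from this PR) of a reader that pulls fixed-size chunks from a stream until it sees the 0x3 terminator and throws if the stream closes first. `ReadFn` and `MessageReader` are hypothetical names, `std::runtime_error` merely stands in for whichever exception type is chosen, and the tiny chunk size is picked so that messages span several reads.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

constexpr char kTerminator = '\x03';  // ASCII End of Text

// Hypothetical stand-in for a pipe read: fills `buffer` with up to `capacity`
// bytes and returns the count, or 0 once the other end has closed the stream.
using ReadFn = std::function<std::size_t(char* buffer, std::size_t capacity)>;

class MessageReader
{
public:
    explicit MessageReader(ReadFn read) : read_(std::move(read)) {}

    // Returns the next message without its terminator. Throws if the stream
    // closes before a terminator is seen (simplification: this also covers
    // the case where no bytes at all were pending).
    std::string ReadMessage()
    {
        for (;;)
        {
            // Bytes left over from an earlier read may already hold a message.
            const std::size_t end = pending_.find(kTerminator);
            if (end != std::string::npos)
            {
                std::string message = pending_.substr(0, end);
                pending_.erase(0, end + 1);
                return message;
            }

            char chunk[8];  // deliberately small for illustration
            const std::size_t bytesRead = read_(chunk, sizeof(chunk));
            assert(bytesRead <= sizeof(chunk));
            if (bytesRead == 0)
            {
                throw std::runtime_error(
                    "Incomplete message read from stream. "
                    "Stream closed before message terminating byte.");
            }

            pending_.append(chunk, bytesRead);
        }
    }

private:
    ReadFn read_;
    std::string pending_;
};
```

Keeping unread bytes in `pending_` is a design choice of this sketch: it lets one chunk carry the end of one message and the start of the next without merging them.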

GVFS/GVFS.Common/NamedPipes/NamedPipeStreamReader.cs

sanoursa · 2018-09-19T15:12:42Z

GVFS/GVFS.Common/NamedPipes/NamedPipeStreamReader.cs

+    /// </summary>
+    public class NamedPipeStreamReader
+    {
+        public const int BufferLength = 1024;


I think the only external code that uses this is unit tests, so please make it private. In my opinion it's not a good idea to share constants between production and test code, because they will just end up agreeing with each other no matter what.

I was trying to enforce that we test sending messages that are larger than the buffer size. If we completely hide this, then we could update the production code to use a buffer larger than what the test is using.

What if we instead expose an option for callers to control the buffer size. Then the tests could force a smaller buffer size and ensure it is testing the expected functionality...

I wouldn't do either of those things. A unit test should be testing a requirement, not knowledge of an implementation. So the unit test should decide how big of a message is reasonable to send and test that limit, and the implementation is then responsible for handling that.

NamedPipeProtocol.md

wilbaker

Approved with suggestions

wilbaker · 2018-09-19T15:36:26Z

GVFS/GVFS.Common/NamedPipes/NamedPipeServer.cs

-                catch (IOException)
+                catch (IOException ex)
                {
+                    this.tracer.RelatedError($"Error reading message from NamedPipe: {ex.Message}");


Minor suggestions:

Include the full stack trace

Create an EventMetadata and add a value for the stack trace

wilbaker · 2018-09-19T15:38:14Z

GVFS/GVFS.Common/NamedPipes/NamedPipeStreamReader.cs

+
+                if (bytesRead == 0)
+                {
+                    // We have read a partial message (the last byte received does not indicate that


Should we log the partial message?

GVFS currently sends text-based data across a named pipe, with lines / messages separated by newline characters. This prevents sending messages that include a newline as part of the message content. The main current manifestation of this problem is that GVFS does not handle git command lines that contain a newline character. In addition, GVFS needs to send paths across the named pipe (for ModifiedPathsList queries), and these paths can include newline characters on certain filesystems (macOS, Linux). Finally, the protocol that GVFS uses to communicate across the named pipe is not currently consistent: it usually uses a text-based stream, but for the ModifiedPathsList response it uses null bytes to separate entries in a list. To address these issues, this change tweaks the protocol used to communicate via the named pipe to use the 0x3 byte to indicate the end of a line / message (this is the End of Text ASCII code).
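The description mentions that the ModifiedPathsList response separates its entries with null bytes. As a hedged sketch of why that suits paths better than newlines, the C++ below splits a message body on `\0`. `SplitNullSeparated` is a hypothetical helper, and the sketch assumes the body has already had its 0x3 terminator stripped. Whether the list stays null-separated after this change is not spelled out in the description.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a message body into entries separated by null
// bytes. A trailing null does not produce an extra empty entry.
std::vector<std::string> SplitNullSeparated(const std::string& body)
{
    std::vector<std::string> entries;
    std::size_t start = 0;
    while (start < body.size())
    {
        std::size_t end = body.find('\0', start);
        if (end == std::string::npos)
        {
            end = body.size();  // last entry has no trailing separator
        }

        entries.push_back(body.substr(start, end - start));
        start = end + 1;
    }

    return entries;
}
```

A path containing a newline, which is legal on macOS and Linux, comes back as a single entry because only `\0` is treated as a separator.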

jamill requested review from benpeart, jrbriggs and sanoursa September 11, 2018 22:50

sanoursa reviewed Sep 11, 2018

wilbaker reviewed Sep 12, 2018

jamill force-pushed the named_pipe_stream_reader branch 2 times, most recently from 77681ee to d226563 Compare September 14, 2018 02:21

jamill changed the title ~~RFC: Named pipe protocol~~ Tweak Named pipe protocol Sep 14, 2018

sanoursa suggested changes Sep 17, 2018

jamill force-pushed the named_pipe_stream_reader branch from d226563 to 6278f87 Compare September 18, 2018 21:08

sanoursa approved these changes Sep 19, 2018

wilbaker approved these changes Sep 19, 2018

jamill force-pushed the named_pipe_stream_reader branch 2 times, most recently from 79e1fd2 to 524c434 Compare September 20, 2018 16:50

jamill added 3 commits September 21, 2018 15:34

Tests demonstrating issue with newline in Git commands

3aac413

Add automated tests around NamedPipeStream protocol

659c2d4

jamill force-pushed the named_pipe_stream_reader branch from 67431a5 to 659c2d4 Compare September 21, 2018 19:34

jamill merged commit 788aaca into microsoft:master Sep 21, 2018

jamill deleted the named_pipe_stream_reader branch September 21, 2018 20:39

derrickstolee mentioned this pull request Oct 1, 2018

LockData: Limit number of '|' chars that split the message #319

Closed

wilbaker mentioned this pull request Oct 3, 2018

NamedPipeStreamReader might read in two messages as one #333

Closed

jrbriggs mentioned this pull request Oct 12, 2018

Merge milestones/M142 to releases/shipped #381

Merged

4 participants
