Remove temp paths from ModifiedPathsDatabase#207
Remove temp paths from ModifiedPathsDatabase#207kewillford merged 14 commits intomicrosoft:masterfrom
Conversation
sanoursa
left a comment
There was a problem hiding this comment.
First pass through the code. Haven't had a chance to test it yet, hoping to do so by EOD.
|
Any idea why the Mac functional tests failed? |
| case FileSystemTask.OperationType.OnFileDeleted: | ||
| metadata.Add("virtualPath", gitUpdate.VirtualPath); | ||
| result = this.AddModifiedPathAndRemoveFromPlaceholderList(gitUpdate.VirtualPath); | ||
| if (this.newPaths.Contains(gitUpdate.VirtualPath)) |
There was a problem hiding this comment.
Two related comments:
-
Currently
OnFileDeletedis scheduled from the PreDelete callback on Mac. However, on Mac we can't remove files from ModifiedPaths in PreDelete because we don't actually know that the file has been deleted. -
Related to 1 (and something @jrbriggs and I were just discussing), should we try to design this to work on Mac and Windows from the start? Or at this point should we just ensure what whatever we do for Windows does not break Mac?
For this specific feature it might only take a few tweaks to get things working on both platforms (e.g. on Mac we could take an extra step of checking if the file actually was deleted before calling TryRemoveModifiedPath). Thoughts?
CC: @sanoursa for his thoughts as well.
There was a problem hiding this comment.
However, on Mac we can't remove files from ModifiedPaths in PreDelete because we don't actually know that the file has been deleted.
The bare minimum change for this PR could be to wrap the new code here in an IsUnderConstruction check.
There's still the open question though of whether we design\implement for both platforms from the start.
There was a problem hiding this comment.
I think it would be nice to start thinking about how to make all new features cross-platform from the start. Unless we're up against a part of the Mac design that is still unknown or in heavy flux, I would take the time now to make this work for both.
241eea9 to
259e056
Compare
wilbaker
left a comment
There was a problem hiding this comment.
Comments so far, I still need to finish reviewing FileSystemCallbacks.cs
GVFS/GVFS.FunctionalTests/Tests/EnlistmentPerTestCase/PersistedModifiedPathsTests.cs
Show resolved
Hide resolved
| { | ||
| string tempFolder = this.CreateDirectory(fileSystem, "Temp"); | ||
| string tempFile1 = this.CreateFile(fileSystem, "Temp\\temp1.txt"); | ||
| string tempFile2 = this.CreateFile(fileSystem, "Temp\\temp2.txt"); |
There was a problem hiding this comment.
Use Path.Combine rather than hardcoding \\ so that this will work on Mac as well
b1b3d0e to
bc005ee
Compare
| metadata.Add("Area", "ModifiedPathsDatabase"); | ||
| metadata.Add(nameof(entry), entry); | ||
| metadata.Add(nameof(isFolder), isFolder); | ||
| metadata.Add("Exception", e.ToString()); |
There was a problem hiding this comment.
What would you think of adding a null check before calling e.ToString()? Most of our other CreateEventMetadata helper methods handle null.
wilbaker
left a comment
There was a problem hiding this comment.
Overall looking good, a few outstanding questions on the folder rename scenario.
GVFS/GVFS.FunctionalTests.Windows/Windows/Tests/DiskLayoutUpgradeTests.cs
Show resolved
Hide resolved
| if (!this.newlyCreatedFileAndFolderPaths.Contains(itemVirtualPath)) | ||
| { | ||
| relativeFolderPaths.Enqueue(itemVirtualPath); | ||
| this.newlyCreatedFileAndFolderPaths.Add(itemVirtualPath); |
There was a problem hiding this comment.
It looks like before we were adding itemVirtualPath to ModifiedPaths.dat as well (old line 688):
result = this.TryAddModifiedPath(itemVirtualPath, isFolder: false);
Do we still need to do this?
There was a problem hiding this comment.
We are adding the top level folder result = this.TryAddModifiedPath(gitUpdate.VirtualPath, isFolder: true); so we don't need to add all the subfolder and files. This is kind of left over from when we would add the top level folder to that sparse-checkout and then need to add files to the always_exclude. Since this is just one list now we only need the top level folder and git will handle it correctly for both the sparse and exclude cases.
| string itemVirtualPath = Path.Combine(folderPath, itemInfo.Name); | ||
| if (itemInfo.IsDirectory) | ||
| string oldItemVirtualPath = gitUpdate.OldVirtualPath + itemVirtualPath.Substring(gitUpdate.VirtualPath.Length); | ||
| if (!this.newlyCreatedFileAndFolderPaths.Contains(itemVirtualPath)) |
There was a problem hiding this comment.
There's no need to call Contains first, we can just call this.newlyCreatedFileAndFolderPaths.Add(itemVirtualPath); because ConcurrentHashSet.Add is really TryAdd:
public bool Add(T entry)
{
return this.dictionary.TryAdd(entry, true);
}
59d5580 to
39f9448
Compare
| if (result == FileSystemTaskResult.Success) | ||
| { | ||
| this.newlyCreatedFileAndFolderPaths.Add(gitUpdate.VirtualPath); | ||
| this.TryRemoveModifiedPath(gitUpdate.OldVirtualPath, isFolder: true); |
There was a problem hiding this comment.
Should we check the return value of TryRemoveModifiedPath here?
There was a problem hiding this comment.
I was up in the air about that because on the one hand there won't be any issues if we fail to remove the path except the modified paths doesn't get as small. On the other we could retry when needed and keep the modified paths small but I'm not sure we should kill the mount process if we get a FatalError. What are you thoughts?
There was a problem hiding this comment.
I agree there's minimal harm to this failing, but I'm also thinking that if there are failures it could indicate some larger issue with the state of the file system (that I'm not sure we should ignore).
I lean toward checking the value (and killing the process if we see a FatalError). @sanoursa, what are your thoughts?
There was a problem hiding this comment.
Yea I don't think we should ever ignore a FatalError. That said, why is TryRemoveModifiedPath ever returning FatalErrors if we don't intend to shut down as a result? It feels like that method shouldn't even be a Try method - if we focus it down to tmp file scenarios, it can just be a best effort and not even return a result.
There was a problem hiding this comment.
It is still trying to write to the modified paths database file and the pattern is to retry if there was and IOException and return Fatal if there is some other exception. To keep with the pattern I think we should shut down because we don't know what the state of the modified paths database file is in if we get a fatal error writing to it.
| { | ||
| if (!this.newlyCreatedFileAndFolderPaths.Contains(virtualPath)) | ||
| { | ||
| return FileSystemTaskResult.Invalid; |
There was a problem hiding this comment.
Just noticed this now, but I believe that FileSystemTaskResult.Invalid is never supposed to be used in code (i.e. it's an invalid value for FileSystemTaskResult).
Given the choice I think it would be cleaner to move this check back out of TryRemoveModifiedPath and that way the return value of TryRemoveModifiedPath reflects only whether we succeeded\failed when trying to update ModifiedPaths.
There was a problem hiding this comment.
I see that William commented on the Invalid value too, and I agree, from reading this code it seems like this check maybe just doesn't belong in here, at least as currently written.
I would probably change the usage of this method to something like:
if (this.IsTempFile(virtualPath))
{
result = this.TryRemoveModifiedPath(virtualPath, isFolder);
}
else
{
result = this.TryAddModifiedPath(virtualPath, isFolder);
}
sanoursa
left a comment
There was a problem hiding this comment.
The change looks good overall, just needs some cleanup around the pattern for actually removing the entries.
| { | ||
| if (!this.newlyCreatedFileAndFolderPaths.Contains(virtualPath)) | ||
| { | ||
| return FileSystemTaskResult.Invalid; |
There was a problem hiding this comment.
I see that William commented on the Invalid value too, and I agree, from reading this code it seems like this check maybe just doesn't belong in here, at least as currently written.
I would probably change the usage of this method to something like:
if (this.IsTempFile(virtualPath))
{
result = this.TryRemoveModifiedPath(virtualPath, isFolder);
}
else
{
result = this.TryAddModifiedPath(virtualPath, isFolder);
}
| if (result == FileSystemTaskResult.Success) | ||
| { | ||
| this.newlyCreatedFileAndFolderPaths.Add(gitUpdate.VirtualPath); | ||
| this.TryRemoveModifiedPath(gitUpdate.OldVirtualPath, isFolder: true); |
There was a problem hiding this comment.
Yea I don't think we should ever ignore a FatalError. That said, why is TryRemoveModifiedPath ever returning FatalErrors if we don't intend to shut down as a result? It feels like that method shouldn't even be a Try method - if we focus it down to tmp file scenarios, it can just be a best effort and not even return a result.
7618ea2 to
a6ba326
Compare
sanoursa
left a comment
There was a problem hiding this comment.
Code looks good. Just some perf questions, but feel free to ignore if that makes sense.
| return false; | ||
| } | ||
|
|
||
| return true; |
There was a problem hiding this comment.
Remove this line since the very next line is also the same :-)
| return false; | ||
| } | ||
|
|
||
| return true; |
There was a problem hiding this comment.
Same here, remove redundant line
|
|
||
| public virtual void OnFileRenamed(string oldRelativePath, string newRelativePath) | ||
| { | ||
| this.newlyCreatedFileAndFolderPaths.Add(newRelativePath); |
There was a problem hiding this comment.
Do we actually need this? How often does someone rename into a temp path? If we don't expect benefit from this, I would remove it so that we're not adding unnecessary contention on the concurrent data structure.
There was a problem hiding this comment.
This would be so that if something deleted the renamed file we would remove it from the modified paths. Not sure how often something would rename, then delete though.
There was a problem hiding this comment.
Yep I get that that's the scenario it's supporting. What I don't know is how often that actually happens, but my guess is it's pretty unlikely. We know that we're adding lock contention by including this line, so we should only do that if we can also measure an actual benefit.
There was a problem hiding this comment.
I will remove this for now because we aren't measuring how often this is happening so we can't measure any benefit if we don't know what or if there are current difficulties that are from this.
72472d8 to
8b16562
Compare
wilbaker
left a comment
There was a problem hiding this comment.
Latest commit looks good, just a minor suggestion on adding some comment(s)
19f3acd to
243f621
Compare
Builds and other processes will write temp files or create temp directories in the source directory which then adds the file or folder path to the ModifiedPathsDatabase. This causes performance issues because the ModifiedPathsDatabase is currently append only.
This PR is an initial change to address the issue of files or folder being created and deleted. This change only removes from the ModifiedPathsDatabase when the file or folder was created and then delete and the index has not changed in between to solve the issue with builds and other processes creating and deleting a bunch of files and folders.
See #117