Submitter | phabricator |
---|---|
Date | Nov. 6, 2019, 5:16 a.m. |
Message ID | <differential-rev-PHID-DREV-ogrsmordypahfwve6ojs-req@mercurial-scm.org> |
Download | mbox | patch |
Permalink | /patch/42798/ |
State | Superseded |
Headers | show |
Comments
Alphare added a comment. Sorry to necropost, but since this broke the Rust implementation, I was wondering what the best approach would be to replicate this behavior, and I am starting to think that this should be reverted. IIUC, currently any new path passes through the `pathauditor` first for validation, so any further check would be completely redundant. My intuition is that adding this check is purely here to satisfy the fuzzer, but would never happen in real life. Adding checks to this (very) internal data structure comes at a cost, both in performance and in code ergonomics. REPOSITORY rHG Mercurial CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D7252/new/ REVISION DETAIL https://phab.mercurial-scm.org/D7252 To: durin42, #hg-reviewers, indygreg Cc: Alphare, mercurial-devel
durin42 added a comment. In D7252#109627 <https://phab.mercurial-scm.org/D7252#109627>, @Alphare wrote: > Sorry to necropost, but since this broke the Rust implementation, I was wondering what the best approach would be to replicate this behavior, and I am starting to think that this should be reverted. Oh, you mean the Rust version doesn't do the same rejection? Given that you're about to do a hash lookup, I'm a little skeptical that a `endswith('/')` check would show up meaningfully in a profiler, but I'm willing to be proven wrong? > IIUC, currently any new path passes through the `pathauditor` first for validation, so any further check would be completely redundant. Plausible, but I'd like some sort of test coverage demonstrating that. > My intuition is that adding this check is purely here to satisfy the fuzzer, but would never happen in real life. Adding checks to this (very) internal data structure comes at a cost, both in performance and in code ergonomics. Yes, this was largely added to make the fuzzer not get stuck on OOM conditions. That said, if the pathauditor can't catch this, we need to defend against this DoS vector at this layer, and it's such a small check at this layer I'm inclined to keep it unless it is measurably slowing down real uses... REPOSITORY rHG Mercurial CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D7252/new/ REVISION DETAIL https://phab.mercurial-scm.org/D7252 To: durin42, #hg-reviewers, indygreg Cc: Alphare, mercurial-devel
durin42 added a comment. In D7252#109656 <https://phab.mercurial-scm.org/D7252#109656>, @durin42 wrote: > In D7252#109627 <https://phab.mercurial-scm.org/D7252#109627>, @Alphare wrote: > >> Sorry to necropost, but since this broke the Rust implementation, I was wondering what the best approach would be to replicate this behavior, and I am starting to think that this should be reverted. > > Oh, you mean the Rust version doesn't do the same rejection? > Given that you're about to do a hash lookup, I'm a little skeptical that a `endswith('/')` check would show up meaningfully in a profiler, but I'm willing to be proven wrong? > >> IIUC, currently any new path passes through the `pathauditor` first for validation, so any further check would be completely redundant. > > Plausible, but I'd like some sort of test coverage demonstrating that. > >> My intuition is that adding this check is purely here to satisfy the fuzzer, but would never happen in real life. Adding checks to this (very) internal data structure comes at a cost, both in performance and in code ergonomics. > > Yes, this was largely added to make the fuzzer not get stuck on OOM conditions. That said, if the pathauditor can't catch this, we need to defend against this DoS vector at this layer, and it's such a small check at this layer I'm inclined to keep it unless it is measurably slowing down real uses... I should add: if we can substantiate that such a path can't make it through the pathauditor (and we have tests for that in the pathauditor layer so we don't break that in the future!) we can push this consecutive-slashes check into dirs_fuzzer.cc and remove it from dirs.c. REPOSITORY rHG Mercurial CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D7252/new/ REVISION DETAIL https://phab.mercurial-scm.org/D7252 To: durin42, #hg-reviewers, indygreg Cc: Alphare, mercurial-devel
Alphare added a comment. > Oh, you mean the Rust version doesn't do the same rejection? It does not, currently. > Given that you're about to do a hash lookup, I'm a little skeptical that a `endswith('/')` check would show up meaningfully in a profiler, but I'm willing to be proven wrong? I have the same intuition you do, it's mostly about not repeating the same operations that have any impact at all on performance/code. > I should add: if we can substantiate that such a path can't make it through the pathauditor (and we have tests for that in the pathauditor layer so we don't break that in the future!) we can push this consecutive-slashes check into dirs_fuzzer.cc and remove it from dirs.c. It was my candid intuition that the `pathauditor` was already "ensured" as the barrier for path-based vulnerabilities. Since my plan for the Rust implementation was to enforce that very fact, I guess I'll be the one to write said tests when I write a Rust `pathauditor`. It should come up not too long after my current work (in-progress) about matchers, since handling unknown files in any capacity will require the `pathauditor`. I'm not 100% sure of the implications of writing the aforementioned tests, if it's too much work, I can replicate this check in Rust for the time being. REPOSITORY rHG Mercurial CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D7252/new/ REVISION DETAIL https://phab.mercurial-scm.org/D7252 To: durin42, #hg-reviewers, indygreg Cc: Alphare, mercurial-devel
marmoute added a comment. In practice, this means the tests consistently fails when testing with Rust. Can we either have a quick fix of the Rust code or a temporary backout of this (until we Rust code is fixed)? REPOSITORY rHG Mercurial CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D7252/new/ REVISION DETAIL https://phab.mercurial-scm.org/D7252 To: durin42, #hg-reviewers, indygreg Cc: marmoute, Alphare, mercurial-devel
Patch
diff --git a/tests/test-dirs.py b/tests/test-dirs.py new file mode 100644 --- /dev/null +++ b/tests/test-dirs.py @@ -0,0 +1,27 @@ +from __future__ import absolute_import + +import unittest + +import silenttestrunner + +from mercurial import util + + +class dirstests(unittest.TestCase): + def testdirs(self): + for case, want in [ + (b'a/a/a', [b'a', b'a/a', b'']), + (b'alpha/beta/gamma', [b'', b'alpha', b'alpha/beta']), + ]: + d = util.dirs() + d.addpath(case) + self.assertEqual(sorted(d), sorted(want)) + + def testinvalid(self): + with self.assertRaises(ValueError): + d = util.dirs() + d.addpath(b'a//b') + + +if __name__ == '__main__': + silenttestrunner.main(__name__) diff --git a/mercurial/util.py b/mercurial/util.py --- a/mercurial/util.py +++ b/mercurial/util.py @@ -3545,6 +3545,8 @@ def finddirs(path): pos = path.rfind(b'/') while pos != -1: + if path[pos - 2 : pos - 1] == '/': + raise ValueError("found invalid consecutive slashes in path") yield path[:pos] pos = path.rfind(b'/', 0, pos) yield b'' diff --git a/mercurial/cext/dirs.c b/mercurial/cext/dirs.c --- a/mercurial/cext/dirs.c +++ b/mercurial/cext/dirs.c @@ -66,6 +66,14 @@ while ((pos = _finddir(cpath, pos - 1)) != -1) { PyObject *val; + /* Sniff for trailing slashes, a marker of an invalid input. */ + if (pos > 0 && cpath[pos - 1] == '/') { + PyErr_SetString( + PyExc_ValueError, + "found invalid consecutive slashes in path"); + goto bail; + } + key = PyBytes_FromStringAndSize(cpath, pos); if (key == NULL) goto bail;