Submitter | Enrique A. Tobis |
---|---|
Date | Dec. 11, 2014, 3:47 p.m. |
Message ID | <a2351ea044f117402b17.1418312868@eowyn> |
Download | mbox | patch |
Permalink | /patch/7051/ |
State | Rejected |
Headers | show |
Comments
On Thu, 2014-12-11 at 10:47 -0500, Enrique A. Tobis wrote: > # HG changeset patch > # User Enrique A. Tobis <enrique@tobis.com.ar> > # Date 1418275100 18000 > # Thu Dec 11 00:18:20 2014 -0500 > # Branch stable > # Node ID a2351ea044f117402b17809050bdc5dc2de29ff4 > # Parent c67b563426866e1c717c6a25b0056bb51cb155e5 > scmutil: record all symlinks to a repository in walkrepos > > In walkrepos, list all paths that lead to a repository when symlinks are > followed, while still avoiding infinite loops. This patch is too complex and includes unrelated changes that interfere with its readability. For instance: > - def errhandler(err): > - if err.filename == path: > - raise err ..we used to have this error handling code that was presumably there for a reason. You removed it with no explanation. Even if there is a good reason to remove it, it doesn't belong in this patch. Relatedly, you've replaced os.walk without implementing its error handling. If you really need to rewrite os.walk (I'm not convinced) to get the behavior you want, you should do your patches like this: - cleanups - replace os.walk, no functional changes - behavior change
On Fri, Dec 12, 2014 at 3:52 PM, Matt Mackall <mpm@selenic.com> wrote: > On Thu, 2014-12-11 at 10:47 -0500, Enrique A. Tobis wrote: >> # HG changeset patch >> # User Enrique A. Tobis <enrique@tobis.com.ar> >> # Date 1418275100 18000 >> # Thu Dec 11 00:18:20 2014 -0500 >> # Branch stable >> # Node ID a2351ea044f117402b17809050bdc5dc2de29ff4 >> # Parent c67b563426866e1c717c6a25b0056bb51cb155e5 >> scmutil: record all symlinks to a repository in walkrepos >> >> In walkrepos, list all paths that lead to a repository when symlinks are >> followed, while still avoiding infinite loops. > > This patch is too complex and includes unrelated changes that interfere > with its readability. For instance: > >> - def errhandler(err): >> - if err.filename == path: >> - raise err > > ..we used to have this error handling code that was presumably there for > a reason. You removed it with no explanation. > > Even if there is a good reason to remove it, it doesn't belong in this > patch. > > Relatedly, you've replaced os.walk without implementing its error > handling. > > If you really need to rewrite os.walk (I'm not convinced) to get the > behavior you want, you should do your patches like this: > > - cleanups > - replace os.walk, no functional changes > - behavior change > Thank you for your feedback! I'll be more careful with my patches in the future. The current implementation of scmutil.walkrepo will follow symlinks if asked, but if it finds a repository more than once, only the first time it's found is yielded. My intention was to change that behavior and make it return every occurence, not just the first. That is only relevant in the context of symlinks, so I also want to avoid infinite loops. if there is an infinite loop, I think it's reasonable to traverse it once, but not more. My idea for implementing this new behavior is to change the meaning of seen_dirs. Right now it's, approximately, "every directory I've seen so far since the top-most invocation of walkrepos." I want it to be "every directory I've traversed to get here starting from the top-most path." In other words, a stack of the directories traversed. I couldn't come up with a way to do that using os.walk. Do you think my change in behavior makes sense? If so, do you think it can be implemented using os.walk? Thanks again, Enrique
On Sat, 2014-12-13 at 00:10 -0500, Enrique A. Tobis wrote: > On Fri, Dec 12, 2014 at 3:52 PM, Matt Mackall <mpm@selenic.com> wrote: > > On Thu, 2014-12-11 at 10:47 -0500, Enrique A. Tobis wrote: > >> # HG changeset patch > >> # User Enrique A. Tobis <enrique@tobis.com.ar> > >> # Date 1418275100 18000 > >> # Thu Dec 11 00:18:20 2014 -0500 > >> # Branch stable > >> # Node ID a2351ea044f117402b17809050bdc5dc2de29ff4 > >> # Parent c67b563426866e1c717c6a25b0056bb51cb155e5 > >> scmutil: record all symlinks to a repository in walkrepos > >> > >> In walkrepos, list all paths that lead to a repository when symlinks are > >> followed, while still avoiding infinite loops. > > > > This patch is too complex and includes unrelated changes that interfere > > with its readability. For instance: > > > >> - def errhandler(err): > >> - if err.filename == path: > >> - raise err > > > > ..we used to have this error handling code that was presumably there for > > a reason. You removed it with no explanation. > > > > Even if there is a good reason to remove it, it doesn't belong in this > > patch. > > > > Relatedly, you've replaced os.walk without implementing its error > > handling. > > > > If you really need to rewrite os.walk (I'm not convinced) to get the > > behavior you want, you should do your patches like this: > > > > - cleanups > > - replace os.walk, no functional changes > > - behavior change > > > Thank you for your feedback! I'll be more careful with my patches in the future. > > The current implementation of scmutil.walkrepo will follow symlinks if > asked, but if it finds a repository more than once, only the first > time it's found is yielded. My intention was to change that behavior > and make it return every occurence, not just the first. Sounds like we probably had that behavior to start with.. and then nixed it for some reason. So you need to do some archaelogy to figure out why we did that and either a) convince us to change it back (not very likely, because it just means someone else will be trying to convince us to go the other way again next year) or b) convince us to make it an option.
On 12/15/2014 11:07 AM, Matt Mackall wrote: > On Sat, 2014-12-13 at 00:10 -0500, Enrique A. Tobis wrote: >> On Fri, Dec 12, 2014 at 3:52 PM, Matt Mackall <mpm@selenic.com> wrote: >>> On Thu, 2014-12-11 at 10:47 -0500, Enrique A. Tobis wrote: >>>> # HG changeset patch >>>> # User Enrique A. Tobis <enrique@tobis.com.ar> >>>> # Date 1418275100 18000 >>>> # Thu Dec 11 00:18:20 2014 -0500 >>>> # Branch stable >>>> # Node ID a2351ea044f117402b17809050bdc5dc2de29ff4 >>>> # Parent c67b563426866e1c717c6a25b0056bb51cb155e5 >>>> scmutil: record all symlinks to a repository in walkrepos >>>> >>>> In walkrepos, list all paths that lead to a repository when symlinks are >>>> followed, while still avoiding infinite loops. >>> >>> This patch is too complex and includes unrelated changes that interfere >>> with its readability. For instance: >>> >>>> - def errhandler(err): >>>> - if err.filename == path: >>>> - raise err >>> >>> ..we used to have this error handling code that was presumably there for >>> a reason. You removed it with no explanation. >>> >>> Even if there is a good reason to remove it, it doesn't belong in this >>> patch. >>> >>> Relatedly, you've replaced os.walk without implementing its error >>> handling. >>> >>> If you really need to rewrite os.walk (I'm not convinced) to get the >>> behavior you want, you should do your patches like this: >>> >>> - cleanups >>> - replace os.walk, no functional changes >>> - behavior change >>> >> Thank you for your feedback! I'll be more careful with my patches in the future. >> >> The current implementation of scmutil.walkrepo will follow symlinks if >> asked, but if it finds a repository more than once, only the first >> time it's found is yielded. My intention was to change that behavior >> and make it return every occurence, not just the first. > > Sounds like we probably had that behavior to start with.. and then nixed > it for some reason. So you need to do some archaelogy to figure out why > we did that and either a) convince us to change it back (not very > likely, because it just means someone else will be trying to convince us > to go the other way again next year) or b) convince us to make it an > option. I remember this has been changed around the creation of the clowncopter repository. But I cannot remember the original reasoning. The patch was probably issued by David Soria Para, Durham Goode or Siddharth Agarwa >
Patch
diff -r c67b56342686 -r a2351ea044f1 mercurial/scmutil.py --- a/mercurial/scmutil.py Wed Dec 10 23:46:47 2014 -0500 +++ b/mercurial/scmutil.py Thu Dec 11 00:18:20 2014 -0500 @@ -448,9 +448,6 @@ def walkrepos(path, followsym=False, seen_dirs=None, recurse=False): '''yield every hg repository under path, always recursively. The recurse flag will only control recursion into repo working dirs''' - def errhandler(err): - if err.filename == path: - raise err samestat = getattr(os.path, 'samestat', None) if followsym and samestat is not None: def adddir(dirlst, dirname): @@ -463,35 +460,33 @@ if not match: dirlst.append(dirstat) return not match + removedir = lambda dirlst: dirlst.pop() + if seen_dirs is None: + seen_dirs = [] else: followsym = False - - if (seen_dirs is None) and followsym: - seen_dirs = [] - adddir(seen_dirs, path) - for root, dirs, files in os.walk(path, topdown=True, onerror=errhandler): + adddir = lambda dirlst, dirname: not os.path.islink(dirname) + removedir = lambda dirlst: None + def _walkrepos(path, followsym, seen_dirs, recurse): + dirs = [dirname for dirname in os.listdir(path) + if os.path.isdir(os.path.join(path, dirname))] dirs.sort() if '.hg' in dirs: - yield root # found a repository - qroot = os.path.join(root, '.hg', 'patches') + yield path + qroot = os.path.join(path, '.hg', 'patches') if os.path.isdir(os.path.join(qroot, '.hg')): - yield qroot # we have a patch queue repo here + yield qroot if recurse: - # avoid recursing inside the .hg directory dirs.remove('.hg') else: - dirs[:] = [] # don't descend further - elif followsym: - newdirs = [] - for d in dirs: - fname = os.path.join(root, d) - if adddir(seen_dirs, fname): - if os.path.islink(fname): - for hgname in walkrepos(fname, True, seen_dirs): - yield hgname - else: - newdirs.append(d) - dirs[:] = newdirs + return # don't descend further + for dirname in (os.path.join(path, dirname) for dirname in dirs): + if adddir(seen_dirs, dirname): + for repo in _walkrepos(dirname, followsym, seen_dirs, recurse): + yield repo + removedir(seen_dirs) + for repo in _walkrepos(path, followsym, seen_dirs, recurse): + yield repo def osrcpath(): '''return default os-specific hgrc search path''' diff -r c67b56342686 -r a2351ea044f1 tests/test-hgwebdirsym.t --- a/tests/test-hgwebdirsym.t Wed Dec 10 23:46:47 2014 -0500 +++ b/tests/test-hgwebdirsym.t Thu Dec 11 00:18:20 2014 -0500 @@ -40,6 +40,9 @@ /al/ /b/ /c/ + /circle/al/ + /circle/b/ + /circle/c/ $ "$TESTDIR/get-with-headers.py" localhost:$HGPORT 'al/file/tip/a?style=raw' 200 Script output follows @@ -53,28 +56,38 @@ 200 Script output follows c - + $ "$TESTDIR/get-with-headers.py" localhost:$HGPORT 'circle/al/file/tip/a?style=raw' + 200 Script output follows + + a + $ "$TESTDIR/get-with-headers.py" localhost:$HGPORT 'circle/b/file/tip/b?style=raw' + 200 Script output follows + + b + $ "$TESTDIR/get-with-headers.py" localhost:$HGPORT 'circle/c/file/tip/c?style=raw' + 200 Script output follows + + c should fail - $ "$TESTDIR/get-with-headers.py" localhost:$HGPORT 'circle/al/file/tip/a?style=raw' + $ "$TESTDIR/get-with-headers.py" localhost:$HGPORT 'circle/circle/al/file/tip/a?style=raw' 404 Not Found - error: repository circle/al/file/tip/a not found + error: repository circle/circle/al/file/tip/a not found [1] - $ "$TESTDIR/get-with-headers.py" localhost:$HGPORT 'circle/b/file/tip/a?style=raw' + $ "$TESTDIR/get-with-headers.py" localhost:$HGPORT 'circle/circle/b/file/tip/b?style=raw' 404 Not Found - error: repository circle/b/file/tip/a not found + error: repository circle/circle/b/file/tip/b not found [1] - $ "$TESTDIR/get-with-headers.py" localhost:$HGPORT 'circle/c/file/tip/a?style=raw' + $ "$TESTDIR/get-with-headers.py" localhost:$HGPORT 'circle/circle/c/file/tip/c?style=raw' 404 Not Found - error: repository circle/c/file/tip/a not found + error: repository circle/circle/c/file/tip/c not found [1] - collections errors $ cat error-collections.log diff -r c67b56342686 -r a2351ea044f1 tests/test-walkrepo.py --- a/tests/test-walkrepo.py Wed Dec 10 23:46:47 2014 -0500 +++ b/tests/test-walkrepo.py Thu Dec 11 00:18:20 2014 -0500 @@ -22,22 +22,22 @@ def runtest(): reposet = frozenset(walkrepos('.', followsym=True)) - if sym and (len(reposet) != 3): + if sym and (len(reposet) != 7): print "reposet = %r" % (reposet,) - print ("Found %d repositories when I should have found 3" + print ("Found %d repositories when I should have found 7" % (len(reposet),)) if (not sym) and (len(reposet) != 2): print "reposet = %r" % (reposet,) print ("Found %d repositories when I should have found 2" % (len(reposet),)) sub1set = frozenset((pjoin('.', 'sub1'), - pjoin('.', 'circle', 'subdir', 'sub1'))) + pjoin('.', 'circle', 'circle', 'subdir', 'sub1'))) if len(sub1set & reposet) != 1: print "sub1set = %r" % (sub1set,) print "reposet = %r" % (reposet,) print "sub1set and reposet should have exactly one path in common." - sub2set = frozenset((pjoin('.', 'subsub1'), - pjoin('.', 'subsubdir', 'subsub1'))) + sub2set = frozenset((pjoin('.', 'sub1'), + pjoin('.', 'circle', 'subdir', 'circle', 'top1'))) if len(sub2set & reposet) != 1: print "sub2set = %r" % (sub2set,) print "reposet = %r" % (reposet,)