Patchwork D6876: phabricator: support automatically obsoleting old revisions of pulled commits

login
register
mail settings
Submitter phabricator
Date Sept. 25, 2019, 5:17 a.m.
Message ID <differential-rev-PHID-DREV-looitrxgt3omaau7a7qk-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/41749/
State New
Headers show

Comments

phabricator - Sept. 25, 2019, 5:17 a.m.
mharbison72 created this revision.
Herald added subscribers: mercurial-devel, Kwan.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This is basically an import of the `pullcreatemarkers` extension[1] from the FB
  repo, with minor adjustments to `getmatchingdiff()` to work with modern hg.
  Since this is very phabricator specific, it makes more sense to me to bundle it
  into the existing extension.  It wasn't very obvious from the old name what
  functionality was provided, and it may make sense to do this in other scenarios
  besides `hg pull`.
  
  There are two use cases that I can see- first, ensuring that old revisions are
  cleaned up for a contributor (I seem to recall something I submitted recently
  needed to be explicitly pruned, though most submissions do clean up
  automatically).  Second, any `hg phabread | hg import -` would otherwise need to
  be manually cleaned up.  The latter is annoying enough that I tend not to grab
  the code and try it when reviewing.
  
  It is currently guarded by a config option (off by default), because @marmoute
  expressed concerns about duplicate marker creation if the pushing reviewer also
  creates a marker.  I don't think that's possible here, since the obsolete
  revisions are explicitly excluded.  But maybe there are other reasons someone
  wouldn't want older revisions obsoleted.  The config name reflects the fact that
  I'm not sure if other things like import should get this too.
  
  I suspect that we could wrap a function deeper in the pull sequence to improve
  both the code and the UX.  For example, when pulling an obsolete marker, it can
  print out a warning that the working directory parent is obsolete, but that
  doesn't happen here.  (It won't happen with this test.  It *should* without the
  `--bypass` option, but doesn't.)  It should also be possible to not have to
  query the range of new revisions, and maybe it can be added to the existing
  transaction.
  
  [1] https://bitbucket.org/facebook/hg-experimental/src/default/hgext3rd/pullcreatemarkers.py

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6876

AFFECTED FILES
  hgext/phabricator.py
  tests/test-phabricator.t

CHANGE DETAILS




To: mharbison72, #hg-reviewers
Cc: Kwan, mercurial-devel, marmoute
phabricator - Oct. 5, 2019, 1:35 p.m.
pulkit added a comment.
pulkit added subscribers: sheehan, pulkit.


  @sheehan this might be of interest to you.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6876/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6876

To: mharbison72, #hg-reviewers
Cc: pulkit, sheehan, Kwan, mercurial-devel, marmoute
phabricator - Oct. 8, 2019, 12:43 a.m.
marmoute added a comment.


  The feature seens pretty usful, but is also a potential foot-gun/data-loss engine. I think it is useful to take the feature, but maybe with proper documentaiton warning and turned of by efaut. I made a couple of comment about the implementation.

INLINE COMMENTS

> phabricator.py:187
> +    if m:
> +        return b"D%s" % m.group(r'id')
> +

So, this only checks the diff number, right ? So in theory, we could obsolete a later version of this, silently dropping change (instead of detecting potential phase-divergence)

> phabricator.py:192
> +@eh.wrapcommand(b"pull")
> +def _pull(orig, ui, repo, *args, **opts):
> +    if not (obsolete.isenabled(repo, obsolete.createmarkersopt)

It would probably make sense to wrap the pull exchange logic instead. It would catch more instance and could reuse the same transaction.

> phabricator.py:216
> +    unfiltered = repo.unfiltered()
> +    for rev in unfiltered.revs("draft() - obsolete()"):
> +        n = unfiltered[rev]

This should be `non public()` instead of `draft()` these could be draft.

> phabricator.py:225
> +
> +    with unfiltered.lock(), unfiltered.transaction('phabpullcreatemarkers'):
> +        obsolete.createmarkers(unfiltered, tocreate)

We should do the computation before locking to avoid other process updating the repo under our feet.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6876/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6876

To: mharbison72, #hg-reviewers
Cc: pulkit, sheehan, Kwan, mercurial-devel, marmoute
phabricator - Oct. 8, 2019, 4:20 a.m.
mharbison72 added a comment.


  In D6876#102925 <https://phab.mercurial-scm.org/D6876#102925>, @marmoute wrote:
  
  > The feature seens pretty usful, but is also a potential foot-gun/data-loss engine. I think it is useful to take the feature, but maybe with proper documentaiton warning and turned of by efaut. I made a couple of comment about the implementation.
  
  I basically agree with the inline comments.  This was an import with minimal changes as a baseline, and I didn't want to spend time making a bunch of follow ups until I gauged interest in this.
  
  Alternately, if there's concern over obsolete marker growth, should this use the archived phase instead?  Since the use case is mostly around stuff a developer imports locally, I don't see much value in those markers being exchangeable.  The only potential hole I see is if someone has pushed this to a non publishing repo before pulling.  Then everyone else could pull the (not obsoleted) commit.  But then //they// would archive it on pull too, instead of N developers potentially creating N markers for the same commit.

INLINE COMMENTS

> marmoute wrote in phabricator.py:187
> So, this only checks the diff number, right ? So in theory, we could obsolete a later version of this, silently dropping change (instead of detecting potential phase-divergence)

It looks like, based on the loop where it's used, it will obsolete //all// (non public) commits that match.  The only way I can think of where that happens is if I `phabimport` someone else's patches, they make updates, and then I `phabimport` again.  That's why I was wondering aloud about doing this on import, but if the eventual pull cleans it all up, that may be sufficient for the use case I had in mind.  I don't see how a developer would do that on their own commits with amend and friends.

Do you have any ideas to detect phase divergence?  And can that even happen if we're ignoring public commits?

> marmoute wrote in phabricator.py:192
> It would probably make sense to wrap the pull exchange logic instead. It would catch more instance and could reuse the same transaction.

I wonder if it would be better in some txn hook.  And then it could (I think) share the existing transaction and lock.

> marmoute wrote in phabricator.py:216
> This should be `non public()` instead of `draft()` these could be draft.

Agreed.

I saw mention of the archived phase the other day; how would that (and other special phases) play here?

> marmoute wrote in phabricator.py:225
> We should do the computation before locking to avoid other process updating the repo under our feet.

You mean //after// locking, correct?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6876/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6876

To: mharbison72, #hg-reviewers
Cc: pulkit, sheehan, Kwan, mercurial-devel, marmoute
phabricator - Oct. 8, 2019, 10:17 a.m.
marmoute added inline comments.

INLINE COMMENTS

> mharbison72 wrote in phabricator.py:187
> It looks like, based on the loop where it's used, it will obsolete //all// (non public) commits that match.  The only way I can think of where that happens is if I `phabimport` someone else's patches, they make updates, and then I `phabimport` again.  That's why I was wondering aloud about doing this on import, but if the eventual pull cleans it all up, that may be sufficient for the use case I had in mind.  I don't see how a developer would do that on their own commits with amend and friends.
> 
> Do you have any ideas to detect phase divergence?  And can that even happen if we're ignoring public commits?

User  A phasend a patch

- User A phasend a changeset `A` as D1337 <https://phab.mercurial-scm.org/D1337>
- User B phabimport D1337 <https://phab.mercurial-scm.org/D1337>, rebase (as `A'`) and pushes
- User A fix critical bug in D1337 <https://phab.mercurial-scm.org/D1337> and amend it as `A''`
- User A pull the **old** version of D1337 <https://phab.mercurial-scm.org/D1337> without the fix.

If we obsolete the **new** version of D1337 <https://phab.mercurial-scm.org/D1337> (`A''`), we are silently dropping the fix to the critical bug.

If markers were properly exchange, the evolution machinery would detect the divergence, warn the user and be able to automatically solve it. Preserving the fix to the criticial bug.

> mharbison72 wrote in phabricator.py:216
> Agreed.
> 
> I saw mention of the archived phase the other day; how would that (and other special phases) play here?

Thing could be in `secret` phase too.

> mharbison72 wrote in phabricator.py:225
> You mean //after// locking, correct?

I do mean **after** locking (sorry)

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6876/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6876

To: mharbison72, #hg-reviewers
Cc: pulkit, sheehan, Kwan, mercurial-devel, marmoute
phabricator - Oct. 8, 2019, 12:12 p.m.
mharbison72 added inline comments.

INLINE COMMENTS

> marmoute wrote in phabricator.py:187
> User  A phasend a patch
> 
> - User A phasend a changeset `A` as D1337 <https://phab.mercurial-scm.org/D1337>
> - User B phabimport D1337 <https://phab.mercurial-scm.org/D1337>, rebase (as `A'`) and pushes
> - User A fix critical bug in D1337 <https://phab.mercurial-scm.org/D1337> and amend it as `A''`
> - User A pull the **old** version of D1337 <https://phab.mercurial-scm.org/D1337> without the fix.
> 
> If we obsolete the **new** version of D1337 <https://phab.mercurial-scm.org/D1337> (`A''`), we are silently dropping the fix to the critical bug.
> 
> If markers were properly exchange, the evolution machinery would detect the divergence, warn the user and be able to automatically solve it. Preserving the fix to the criticial bug.

I see. So maybe this should only obsolete things that have no precursors, and warn (and ignore) things that do.  That means it doesn’t clean up if the reviewer accidentally forgets to import with `—obsolete`, but that’s easy enough to fix with a copy/paste command if A and A' are identified in the output.

> marmoute wrote in phabricator.py:216
> Thing could be in `secret` phase too.

Right, and this is probably the typical case. (The `phabimport` alias I have imports as secret to prevent accidental push.)  But secret is visible to the user, and benefits from the cleanup.  Archived isn’t visible (IIUC), so it probably doesn’t need to be obsoleted too- especially if we’re worried about marker growth.

(I’m also not sure how well thg handles archived, and that probably needs to be squared away first if we opt to archive instead of obsolete here.)

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6876/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6876

To: mharbison72, #hg-reviewers
Cc: pulkit, sheehan, Kwan, mercurial-devel, marmoute

Patch

diff --git a/tests/test-phabricator.t b/tests/test-phabricator.t
--- a/tests/test-phabricator.t
+++ b/tests/test-phabricator.t
@@ -136,7 +136,8 @@ 
   D1253 - skipped - 1acd4b60af38: create comment for phabricator test
 
 Phabreading a DREV with a local:commits time as a string:
-  $ hg phabread --test-vcr "$VCR/phabread-str-time.json" D1285
+  $ hg phabread --test-vcr "$VCR/phabread-str-time.json" D1285 > ../d1285.diff
+  $ cat ../d1285.diff
   # HG changeset patch
   # User test <test>
   # Date 1562019844 0
@@ -153,5 +154,20 @@ 
   @@ * @@ (glob)
   +test
   
+  $ hg init ../simple
+  $ hg -R ../simple import -q ../d1285.diff
+  $ hg import --bypass -q ../d1285.diff
+  $ hg pull -f ../simple --config experimental.evolution.createmarkers=True \
+  >                      --config experimental.phabricator.fabricate-obsmarkers=True
+  pulling from ../simple
+  searching for changes
+  warning: repository is unrelated
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 0 changes to 1 files (+1 heads)
+  new changesets b68c34204469
+  (run 'hg heads' to see heads, 'hg merge' to merge)
 
   $ cd ..
diff --git a/hgext/phabricator.py b/hgext/phabricator.py
--- a/hgext/phabricator.py
+++ b/hgext/phabricator.py
@@ -57,6 +57,7 @@ 
     exthelper,
     httpconnection as httpconnectionmod,
     mdiff,
+    obsolete,
     obsutil,
     parser,
     patch,
@@ -87,7 +88,11 @@ 
 command = eh.command
 configtable = eh.configtable
 templatekeyword = eh.templatekeyword
+uisetup = eh.finaluisetup
 
+eh.configitem(b'experimental', b'phabricator.fabricate-obsmarkers',
+    default=False,
+)
 # developer config: phabricator.batchsize
 eh.configitem(b'phabricator', b'batchsize',
     default=12,
@@ -176,6 +181,52 @@ 
                        optionalrepo=optionalrepo)(inner)
     return decorate
 
+def getmatchingdiff(ctx):
+    m = _differentialrevisiondescre.search(ctx.description())
+    if m:
+        return b"D%s" % m.group(r'id')
+
+    return None
+
+@eh.wrapcommand(b"pull")
+def _pull(orig, ui, repo, *args, **opts):
+    if not (obsolete.isenabled(repo, obsolete.createmarkersopt)
+            and ui.configbool(b'experimental',
+                              b'phabricator.fabricate-obsmarkers')):
+        return orig(ui, repo, *args, **opts)
+    maxrevbeforepull = len(repo.changelog)
+    r = orig(ui, repo, *args, **opts)
+    maxrevafterpull = len(repo.changelog)
+
+    # Collect the diff number of the landed diffs
+    landeddiffs = {}
+    for rev in range(maxrevbeforepull, maxrevafterpull):
+        n = repo[rev]
+        if n.phase() == phases.public:
+            diff = getmatchingdiff(n)
+            if diff is not None:
+                landeddiffs[diff] = n
+
+    if not landeddiffs:
+        return r
+
+    # Try to find match with the drafts
+    tocreate = []
+    unfiltered = repo.unfiltered()
+    for rev in unfiltered.revs("draft() - obsolete()"):
+        n = unfiltered[rev]
+        diff = getmatchingdiff(n)
+        if diff in landeddiffs and landeddiffs[diff].rev() != n.rev():
+            tocreate.append((n, (landeddiffs[diff],)))
+
+    if not tocreate:
+        return r
+
+    with unfiltered.lock(), unfiltered.transaction('phabpullcreatemarkers'):
+        obsolete.createmarkers(unfiltered, tocreate)
+
+    return r
+
 def urlencodenested(params):
     """like urlencode, but works with nested parameters.