Patchwork D6953: sidedatacopies: read rename information from sidedata

login
register
mail settings
Submitter phabricator
Date Oct. 3, 2019, 5:56 a.m.
Message ID <differential-rev-PHID-DREV-5xepsur27qyxgnv57s6p-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/41940/
State Superseded
Headers show

Comments

phabricator - Oct. 3, 2019, 5:56 a.m.
marmoute created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Repository using the new format now use changeset centric algorithm and read the
  copies information from the changelog sidedata.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

AFFECTED FILES
  mercurial/changelog.py
  mercurial/context.py
  mercurial/copies.py
  tests/test-copies-unrelated.t
  tests/test-copies.t

CHANGE DETAILS




To: marmoute, #hg-reviewers
Cc: mercurial-devel
phabricator - Oct. 9, 2019, 5:45 p.m.
martinvonz added inline comments.

INLINE COMMENTS

> changelog.py:369-371
> +        if rawindices is not None:
> +            rawindices = decodefileindices(self.files, rawindices)
> +        return rawindices

Calling the list of filenames `rawindices` is misleading. I see that you've changed this code to return an empty list of files when the list of indices was itself an empty list. That makes sense. Could you extract that to a separate patch ? Does it matter in practice for sidedata? Please explain in the commit message if it does. That patch would also assign the decoded list of indices to a variable called `files` or something like that.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 6:10 p.m.
martinvonz added inline comments.

INLINE COMMENTS

> context.py:536-550
>          filesadded = self._changeset.filesadded
> -        if True:
> +        if self._repo.filecopiesmode == b'changeset-sidedata':
> +            if filesadded is None:
> +                filesadded = []
> +        else:
>              source = self._repo.ui.config(b'experimental', b'copies.read-from')
>              if source == b'changeset-only':

What do you think about writing this as follows? That reduces some of the `if filesremoved is None: filesremoved = []` duplication.

  filesadded = self._changeset.filesadded
  if self._repo.filecopiesmode != b'changeset-sidedata':
      source = self._repo.ui.config(b'experimental', b'copies.read-from')
      if source == b'compatibility':
          if filesadded is None:
              filesadded = copies.computechangesetfilesadded(self)
      elif source != b'changeset-only':
          filesadded = copies.computechangesetfilesadded(self)
  if filesadded is None:
      filesadded = []
  return filesadded

Analogous changes can be made to `filesremoved()` and `p[12]copies()`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 6:13 p.m.
martinvonz added inline comments.

INLINE COMMENTS

> test-copies.t:457-459
> +  y -> z (sidedata !)
>    y -> z (compatibility !)
>    y -> z (changeset !)

nit: combine these into a `no-filelog` case?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 6:30 p.m.
marmoute added inline comments.

INLINE COMMENTS

> martinvonz wrote in changelog.py:369-371
> Calling the list of filenames `rawindices` is misleading. I see that you've changed this code to return an empty list of files when the list of indices was itself an empty list. That makes sense. Could you extract that to a separate patch ? Does it matter in practice for sidedata? Please explain in the commit message if it does. That patch would also assign the decoded list of indices to a variable called `files` or something like that.

> I see that you've changed this code to return an empty list of files when the list of indices was itself an empty list.

I dont' follow this sentence. What do you mean ?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 6:33 p.m.
martinvonz added inline comments.

INLINE COMMENTS

> marmoute wrote in changelog.py:369-371
> > I see that you've changed this code to return an empty list of files when the list of indices was itself an empty list.
> 
> I dont' follow this sentence. What do you mean ?

I thought the reason you explicitly did `if rawindices is not None` was in order to not run the next statement if `rawindices` was an empty list. But that should have no effect anyway, so I was clearly wrong about that. So why did you make the `is not None` explicit? Can you just revert it if there was no good reason for it?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 6:34 p.m.
marmoute added inline comments.

INLINE COMMENTS

> martinvonz wrote in context.py:536-550
> What do you think about writing this as follows? That reduces some of the `if filesremoved is None: filesremoved = []` duplication.
> 
>   filesadded = self._changeset.filesadded
>   if self._repo.filecopiesmode != b'changeset-sidedata':
>       source = self._repo.ui.config(b'experimental', b'copies.read-from')
>       if source == b'compatibility':
>           if filesadded is None:
>               filesadded = copies.computechangesetfilesadded(self)
>       elif source != b'changeset-only':
>           filesadded = copies.computechangesetfilesadded(self)
>   if filesadded is None:
>       filesadded = []
>   return filesadded
> 
> Analogous changes can be made to `filesremoved()` and `p[12]copies()`.

I think I prefer my version for being a bit more explicite (all `if`s are positive) and hence easier to read for people new to this code. However, I don't mind your version if you really want it that way. Just let me know.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 6:39 p.m.
martinvonz added inline comments.

INLINE COMMENTS

> marmoute wrote in context.py:536-550
> I think I prefer my version for being a bit more explicite (all `if`s are positive) and hence easier to read for people new to this code. However, I don't mind your version if you really want it that way. Just let me know.

Is there a way of writing `self._repo.filecopiesmode != b'changeset-sidedata'` in positive way? Maybe  `self._repo.filecopiesmode is None`? The `source != b'changeset-only'` is because we want any invalid/unexpected config to be treated as "filelog-only".

I won't insist on changing.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 6:40 p.m.
marmoute added inline comments.

INLINE COMMENTS

> martinvonz wrote in changelog.py:369-371
> I thought the reason you explicitly did `if rawindices is not None` was in order to not run the next statement if `rawindices` was an empty list. But that should have no effect anyway, so I was clearly wrong about that. So why did you make the `is not None` explicit? Can you just revert it if there was no good reason for it?

I am confused, the function should return a list or None. Since rawindides are bytes, they are not eligible for returns.
I updated the code as follow:

  --- a/mercurial/changelog.py
  +++ b/mercurial/changelog.py
  @@ -366,9 +366,9 @@ class changelogrevision(object):
               rawindices = self._sidedata.get(sidedatamod.SD_FILESADDED)
           else:
               rawindices = self.extra.get(b'filesadded')
  -        if rawindices is not None:
  -            rawindices = decodefileindices(self.files, rawindices)
  -        return rawindices
  +        if rawindices is None:
  +            return None
  +        return decodefileindices(self.files, rawindices)
   
       @property
       def filesremoved(self):

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 6:45 p.m.
martinvonz added inline comments.

INLINE COMMENTS

> marmoute wrote in changelog.py:369-371
> I am confused, the function should return a list or None. Since rawindides are bytes, they are not eligible for returns.
> I updated the code as follow:
> 
>   --- a/mercurial/changelog.py
>   +++ b/mercurial/changelog.py
>   @@ -366,9 +366,9 @@ class changelogrevision(object):
>                rawindices = self._sidedata.get(sidedatamod.SD_FILESADDED)
>            else:
>                rawindices = self.extra.get(b'filesadded')
>   -        if rawindices is not None:
>   -            rawindices = decodefileindices(self.files, rawindices)
>   -        return rawindices
>   +        if rawindices is None:
>   +            return None
>   +        return decodefileindices(self.files, rawindices)
>    
>        @property
>        def filesremoved(self):

I think you're saying that you changed it in order to handle `rawindices == b''` correctly. Makes sense.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 6:46 p.m.
marmoute added inline comments.

INLINE COMMENTS

> martinvonz wrote in context.py:536-550
> Is there a way of writing `self._repo.filecopiesmode != b'changeset-sidedata'` in positive way? Maybe  `self._repo.filecopiesmode is None`? The `source != b'changeset-only'` is because we want any invalid/unexpected config to be treated as "filelog-only".
> 
> I won't insist on changing.

What about something like this: We "duplicate" the intend code, and factor the actual "computation" out.

  def filesadded(self):
      filesadded = self._changeset.filesadded
      compute_on_none = True
      if self._repo.filecopiesmode == b'changeset-sidedata':
          compute_on_none = False
      else:
          source = self._repo.ui.config(b'experimental', b'copies.read-from')
          if source == b'changeset-only':
              compute_on_none = False
          elif source != b'compatibility':
              # filelog mode, ignore any changelog content
              filesadded = None
      if filesadded is None:
          if compute_on_none:
              filesadded = scmutil.computechangesetfilesadded(self)
          else:
              filesadded = []
      return filesadded

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 6:48 p.m.
marmoute added inline comments.
marmoute marked 3 inline comments as done.

INLINE COMMENTS

> martinvonz wrote in changelog.py:369-371
> I think you're saying that you changed it in order to handle `rawindices == b''` correctly. Makes sense.

Yes, this is to handle `b''` in  different way from `None`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 9:49 p.m.
martinvonz added inline comments.

INLINE COMMENTS

> marmoute wrote in changelog.py:369-371
> Yes, this is to handle `b''` in  different way from `None`.

Actually, both `decodefileindices()` and `decodecopies()` seem to properly handle the "empty input" case. So I think you can revert this change. As I suggested earlier, it may be best as a separate patch anyway since it seems unrelated to this patch (although maybe it is required by, if I'm still misunderstanding and it really is a bug). Sorry I didn't notice this earlier.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 9:56 p.m.
marmoute added inline comments.

INLINE COMMENTS

> martinvonz wrote in changelog.py:369-371
> Actually, both `decodefileindices()` and `decodecopies()` seem to properly handle the "empty input" case. So I think you can revert this change. As I suggested earlier, it may be best as a separate patch anyway since it seems unrelated to this patch (although maybe it is required by, if I'm still misunderstanding and it really is a bug). Sorry I didn't notice this earlier.

On this side of the discussion, confusion increases I want `None` value to be preserved (if not relevant now, if will quickly become so). As far as I understand, `decodefileindices` does not do this.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Oct. 9, 2019, 9:59 p.m.
martinvonz added inline comments.

INLINE COMMENTS

> marmoute wrote in changelog.py:369-371
> On this side of the discussion, confusion increases I want `None` value to be preserved (if not relevant now, if will quickly become so). As far as I understand, `decodefileindices` does not do this.

I think we've talked about this long enough now that it's clear that changing this behavior is not the purpose of this patch and that it deserves its own patch where you can explain what you're changing and why.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6953/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6953

To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel

Patch

diff --git a/tests/test-copies.t b/tests/test-copies.t
--- a/tests/test-copies.t
+++ b/tests/test-copies.t
@@ -309,7 +309,6 @@ 
   x -> z
   $ hg debugpathcopies 0 2
   x -> z (filelog !)
-  x -> z (sidedata !)
 
 Copy file that exists on both sides of the merge, different content
   $ newrepo
@@ -338,12 +337,14 @@ 
      x
   $ hg debugp1copies -r 2
   x -> z (changeset !)
+  x -> z (sidedata !)
   $ hg debugp2copies -r 2
-  x -> z (no-changeset !)
+  x -> z (no-changeset no-sidedata !)
   $ hg debugpathcopies 1 2
   x -> z (changeset !)
+  x -> z (sidedata !)
   $ hg debugpathcopies 0 2
-  x -> z (no-changeset !)
+  x -> z (no-changeset no-sidedata !)
 
 Copy x->y on one side of merge and copy x->z on the other side. Pathcopies from one parent
 of the merge to the merge should include the copy from the other side.
@@ -403,7 +404,7 @@ 
   $ hg debugpathcopies 2 3
   y -> z
   $ hg debugpathcopies 1 3
-  y -> z (no-filelog no-sidedata !)
+  y -> z (no-filelog !)
 
 Create x and y, then rename x to z on one side of merge, and rename y to z and
 modify z on the other side. When storing copies in the changeset, we don't
@@ -448,18 +449,18 @@ 
   o  0 add x and y
      x y
   $ hg debugpathcopies 1 4
-  y -> z (no-filelog no-sidedata !)
+  y -> z (no-filelog !)
   $ hg debugpathcopies 2 4
-  x -> z (no-filelog no-sidedata !)
+  x -> z (no-filelog !)
   $ hg debugpathcopies 0 4
   x -> z (filelog !)
-  x -> z (sidedata !)
+  y -> z (sidedata !)
   y -> z (compatibility !)
   y -> z (changeset !)
   $ hg debugpathcopies 1 5
-  y -> z (no-filelog no-sidedata !)
+  y -> z (no-filelog !)
   $ hg debugpathcopies 2 5
-  x -> z (no-filelog no-sidedata !)
+  x -> z (no-filelog !)
   $ hg debugpathcopies 0 5
   x -> z
 
diff --git a/tests/test-copies-unrelated.t b/tests/test-copies-unrelated.t
--- a/tests/test-copies-unrelated.t
+++ b/tests/test-copies-unrelated.t
@@ -179,8 +179,8 @@ 
   o  0 add x
      x
   $ hg debugpathcopies 0 5
-  x -> y (no-filelog no-sidedata !)
-#if no-filelog no-sidedata
+  x -> y (no-filelog !)
+#if no-filelog
   $ hg graft -r 2
   grafting 2:* "modify x again" (glob)
   merging y and x to y
@@ -347,8 +347,8 @@ 
   o  0 base
      a
   $ hg debugpathcopies 1 5
-  x -> y (no-filelog no-sidedata !)
-#if no-filelog no-sidedata
+  x -> y (no-filelog !)
+#if no-filelog
   $ hg graft -r 2
   grafting 2:* "modify x" (glob)
   merging y and x to y
diff --git a/mercurial/copies.py b/mercurial/copies.py
--- a/mercurial/copies.py
+++ b/mercurial/copies.py
@@ -182,6 +182,8 @@ 
 
 def usechangesetcentricalgo(repo):
     """Checks if we should use changeset-centric copy algorithms"""
+    if repo.filecopiesmode == 'changeset-sidedata':
+        return True
     readfrom = repo.ui.config('experimental', 'copies.read-from')
     changesetsource = ('changeset-only', 'compatibility')
     return readfrom in changesetsource
diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -456,7 +456,10 @@ 
 
     def filesadded(self):
         filesadded = self._changeset.filesadded
-        if True:
+        if self._repo.filecopiesmode == 'changeset-sidedata':
+            if filesadded is None:
+                filesadded = []
+        else:
             source = self._repo.ui.config('experimental', 'copies.read-from')
             if source == 'changeset-only':
                 if filesadded is None:
@@ -470,7 +473,10 @@ 
 
     def filesremoved(self):
         filesremoved = self._changeset.filesremoved
-        if True:
+        if self._repo.filecopiesmode == 'changeset-sidedata':
+            if filesremoved is None:
+                filesremoved = []
+        else:
             source = self._repo.ui.config('experimental', 'copies.read-from')
             if source == 'changeset-only':
                 if filesremoved is None:
@@ -486,7 +492,12 @@ 
     def _copies(self):
         p1copies = self._changeset.p1copies
         p2copies = self._changeset.p2copies
-        if True:
+        if self._repo.filecopiesmode == 'changeset-sidedata':
+            if p1copies is None:
+                p1copies = {}
+            if p2copies is None:
+                p2copies = {}
+        else:
             source = self._repo.ui.config('experimental', 'copies.read-from')
             # If config says to get copy metadata only from changeset, then
             # return that, defaulting to {} if there was no copy metadata.  In
diff --git a/mercurial/changelog.py b/mercurial/changelog.py
--- a/mercurial/changelog.py
+++ b/mercurial/changelog.py
@@ -339,23 +339,43 @@ 
 
     @property
     def filesadded(self):
-        rawindices = self.extra.get('filesadded')
-        return rawindices and decodefileindices(self.files, rawindices)
+        if sidedatamod.SD_FILESADDED in self._sidedata:
+            rawindices = self._sidedata.get(sidedatamod.SD_FILESADDED)
+        else:
+            rawindices = self.extra.get('filesadded')
+        if rawindices is not None:
+            rawindices = decodefileindices(self.files, rawindices)
+        return rawindices
 
     @property
     def filesremoved(self):
-        rawindices = self.extra.get('filesremoved')
-        return rawindices and decodefileindices(self.files, rawindices)
+        if sidedatamod.SD_FILESREMOVED in self._sidedata:
+            rawindices = self._sidedata.get(sidedatamod.SD_FILESREMOVED)
+        else:
+            rawindices = self.extra.get('filesremoved')
+        if rawindices is not None:
+            rawindices = decodefileindices(self.files, rawindices)
+        return rawindices
 
     @property
     def p1copies(self):
-        rawcopies = self.extra.get('p1copies')
-        return rawcopies and decodecopies(self.files, rawcopies)
+        if sidedatamod.SD_P1COPIES in self._sidedata:
+            rawcopies = self._sidedata.get(sidedatamod.SD_P1COPIES)
+        else:
+            rawcopies = self.extra.get('p1copies')
+        if rawcopies is not None:
+            rawcopies = decodecopies(self.files, rawcopies)
+        return rawcopies
 
     @property
     def p2copies(self):
-        rawcopies = self.extra.get('p2copies')
-        return rawcopies and decodecopies(self.files, rawcopies)
+        if sidedatamod.SD_P2COPIES in self._sidedata:
+            rawcopies = self._sidedata.get(sidedatamod.SD_P2COPIES)
+        else:
+            rawcopies = self.extra.get('p2copies')
+        if rawcopies is not None:
+            rawcopies = decodecopies(self.files, rawcopies)
+        return rawcopies
 
     @property
     def description(self):