Patchwork [1,of,3] lfs: always exclude '.hg*' text files

login
register
mail settings
Submitter Matt Harbison
Date Jan. 14, 2018, 4:26 a.m.
Message ID <12187caa0688c05c7399.1515903961@Envy>
Download mbox | patch
Permalink /patch/26726/
State Accepted
Headers show

Comments

Matt Harbison - Jan. 14, 2018, 4:26 a.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1515892034 18000
#      Sat Jan 13 20:07:14 2018 -0500
# Node ID 12187caa0688c05c7399f4a08df95f5b0630fa04
# Parent  c780e0649e4152180e892418f9879b82e51542c3
lfs: always exclude '.hg*' text files

I can't think of any problematic scenarios (though things might get interesting
with .hgtags, since every head is consulted).  The eol extension explicitly
disables handling these files, and that seems reasonable here too.
Yuya Nishihara - Jan. 15, 2018, 12:56 p.m.
On Sat, 13 Jan 2018 23:26:01 -0500, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1515892034 18000
> #      Sat Jan 13 20:07:14 2018 -0500
> # Node ID 12187caa0688c05c7399f4a08df95f5b0630fa04
> # Parent  c780e0649e4152180e892418f9879b82e51542c3
> lfs: always exclude '.hg*' text files
> 
> I can't think of any problematic scenarios (though things might get interesting
> with .hgtags, since every head is consulted).  The eol extension explicitly
> disables handling these files, and that seems reasonable here too.
> 
> diff --git a/hgext/lfs/wrapper.py b/hgext/lfs/wrapper.py
> --- a/hgext/lfs/wrapper.py
> +++ b/hgext/lfs/wrapper.py
> @@ -130,7 +130,9 @@
>          textlen -= offset
>  
>      lfstrack = self.opener.options['lfstrack']
> -    if lfstrack(self.filename, textlen):
> +
> +    # Always exclude hg owned files
> +    if not self.filename.startswith('.hg') and lfstrack(self.filename, textlen):

Seems fine. Queued, thanks.

CC-ed Jun since I'm not pretty sure if this is considered a nice feature.
Jun Wu - Jan. 20, 2018, 7:43 a.m.
Excerpts from Yuya Nishihara's message of 2018-01-15 21:56:49 +0900:
> Seems fine. Queued, thanks.
> 
> CC-ed Jun since I'm not pretty sure if this is considered a nice feature.

It is considered as a nice feature. For example:

  - If the LFS server hostname expires, it's possible to "revive" dead links
    without rewriting hashes.
  - It's possible to convert old normal large files to LFS, to save server's
    disk space, without triggering integrity check errors client-side.

IIRC, mpm said lfs has a better design than largefiles. I *guess* the hash
preserving behavior is one of the things that make it better.
Augie Fackler - Jan. 22, 2018, 8:12 p.m.
> On Jan 20, 2018, at 02:43, Jun Wu <quark@fb.com> wrote:
> 
> Excerpts from Yuya Nishihara's message of 2018-01-15 21:56:49 +0900:
>> Seems fine. Queued, thanks.
>> 
>> CC-ed Jun since I'm not pretty sure if this is considered a nice feature.
> 
> It is considered as a nice feature. For example:
> 
>  - If the LFS server hostname expires, it's possible to "revive" dead links
>    without rewriting hashes.
>  - It's possible to convert old normal large files to LFS, to save server's
>    disk space, without triggering integrity check errors client-side.
> 
> IIRC, mpm said lfs has a better design than largefiles. I *guess* the hash
> preserving behavior is one of the things that make it better.

IMO that's a killer feature compared to largefiles.

> 
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Yuya Nishihara - Jan. 23, 2018, 12:07 p.m.
On Mon, 22 Jan 2018 15:12:57 -0500, Augie Fackler wrote:
> > On Jan 20, 2018, at 02:43, Jun Wu <quark@fb.com> wrote:
> > Excerpts from Yuya Nishihara's message of 2018-01-15 21:56:49 +0900:
> >> Seems fine. Queued, thanks.
> >> 
> >> CC-ed Jun since I'm not pretty sure if this is considered a nice feature.
> > 
> > It is considered as a nice feature. For example:
> > 
> >  - If the LFS server hostname expires, it's possible to "revive" dead links
> >    without rewriting hashes.
> >  - It's possible to convert old normal large files to LFS, to save server's
> >    disk space, without triggering integrity check errors client-side.
> > 
> > IIRC, mpm said lfs has a better design than largefiles. I *guess* the hash
> > preserving behavior is one of the things that make it better.
> 
> IMO that's a killer feature compared to largefiles.

Absolutely yes.

My original question was whether it makes sense to forcibly exclude .hg* files
(e.g. .hgtags) from lfs. That seems totally fine, but fewer "if"s might be
better on the other hand.
Jun Wu - Jan. 23, 2018, 5:50 p.m.
Excerpts from Yuya Nishihara's message of 2018-01-23 21:07:47 +0900:
> My original question was whether it makes sense to forcibly exclude .hg*
> files (e.g. .hgtags) from lfs. That seems totally fine, but fewer "if"s
> might be better on the other hand.

I see. In theory .hgtags do not need to be treated specially - it should
work even if it's LFS. I also like less "if"s and more flexible for power
users. Given the fact that the filter language is expressive enough to
exclude .hgtags. I personally think documentation would be enough, i.e.
say something like "you might want to exclude .hgtags in lfs.track".
Matt Harbison - Jan. 24, 2018, 1:57 a.m.
On Tue, 23 Jan 2018 12:50:22 -0500, Jun Wu <quark@fb.com> wrote:

> Excerpts from Yuya Nishihara's message of 2018-01-23 21:07:47 +0900:
>> My original question was whether it makes sense to forcibly exclude .hg*
>> files (e.g. .hgtags) from lfs. That seems totally fine, but fewer "if"s
>> might be better on the other hand.
>
> I see. In theory .hgtags do not need to be treated specially - it should
> work even if it's LFS. I also like less "if"s and more flexible for power
> users. Given the fact that the filter language is expressive enough to
> exclude .hgtags. I personally think documentation would be enough, i.e.
> say something like "you might want to exclude .hgtags in lfs.track".

It feels weird to allow it, but I don't feel that strongly, so I'll drop  
the exclusion.  We can always add it back later without breaking existing  
repos, if we find a problem.

Patch

diff --git a/hgext/lfs/wrapper.py b/hgext/lfs/wrapper.py
--- a/hgext/lfs/wrapper.py
+++ b/hgext/lfs/wrapper.py
@@ -130,7 +130,9 @@ 
         textlen -= offset
 
     lfstrack = self.opener.options['lfstrack']
-    if lfstrack(self.filename, textlen):
+
+    # Always exclude hg owned files
+    if not self.filename.startswith('.hg') and lfstrack(self.filename, textlen):
         flags |= revlog.REVIDX_EXTSTORED
 
     return orig(self, text, transaction, link, p1, p2, cachedelta=cachedelta,
diff --git a/tests/test-lfs.t b/tests/test-lfs.t
--- a/tests/test-lfs.t
+++ b/tests/test-lfs.t
@@ -17,7 +17,10 @@ 
 
 # Commit small file
   $ echo s > smallfile
-  $ hg commit -Aqm "add small file"
+  $ echo '**.py = LF' > .hgeol
+  $ hg --config lfs.track='size(">1000B") | "path:.hgeol"' commit -Aqm "add small file"
+  $ hg debugdata .hgeol 0
+  **.py = LF
 
 # Commit large file
   $ echo $LONG > largefile
@@ -70,7 +73,7 @@ 
   adding changesets
   adding manifests
   adding file changes
-  added 2 changesets with 2 changes to 2 files
+  added 2 changesets with 3 changes to 3 files
   calling hook pretxnchangegroup.lfs: hgext.lfs.checkrequireslfs
   $ grep lfs $TESTTMP/server/.hg/requires
   lfs
@@ -104,8 +107,8 @@ 
   adding changesets
   adding manifests
   adding file changes
-  added 2 changesets with 2 changes to 2 files
-  new changesets b29ba743f89d:00c137947d30
+  added 2 changesets with 3 changes to 3 files
+  new changesets 0ead593177f7:b88141481348
   (run 'hg update' to get a working copy)
   $ grep lfs .hg/requires $TESTTMP/server/.hg/requires
   .hg/requires:lfs
@@ -117,7 +120,7 @@ 
 
 # Update to the last revision containing the large file
   $ hg update
-  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
 
 # Check the blobstore has been populated on update
   $ find .hg/store/lfs/objects | sort