Patchwork [3,of,3] eol: on update, only re-check files if filtering changed

login
register
mail settings
Submitter Mads Kiilerich
Date Oct. 9, 2016, 2:19 p.m.
Message ID <956e0c953cca045fc2bb.1476022781@madski>
Download mbox | patch
Permalink /patch/17013/
State Accepted
Headers show

Comments

Mads Kiilerich - Oct. 9, 2016, 2:19 p.m.
# HG changeset patch
# User Mads Kiilerich <madski@unity3d.com>
# Date 1476021289 -7200
#      Sun Oct 09 15:54:49 2016 +0200
# Node ID 956e0c953cca045fc2bb57867b8c95f123d66841
# Parent  5240a8c9b6289838592f24f4e41a1158a48b951a
eol: on update, only re-check files if filtering changed

Before, update would mark all files as 'normallookup' in dirstate if .hgeol
changed so all files would get the new filtering applied. That takes some time
... and is pointless if the filtering for that file didn't change.

Instead, keep track of the old filtering and only check files where the
filtering is changed.

To keep the old filtering, change to write the applied .hgeol content to
.hg/eol.cache instead of just touching it. That change is backwards/forwards
compatible.

In a real world test, this takes an update that is changing .hgeol and 30000
files from 12s to 4s - where the remaining eol overhead is 1-2s.
Pierre-Yves David - Oct. 9, 2016, 2:45 p.m.
On 10/09/2016 04:19 PM, Mads Kiilerich wrote:
> # HG changeset patch
> # User Mads Kiilerich <madski@unity3d.com>
> # Date 1476021289 -7200
> #      Sun Oct 09 15:54:49 2016 +0200
> # Node ID 956e0c953cca045fc2bb57867b8c95f123d66841
> # Parent  5240a8c9b6289838592f24f4e41a1158a48b951a
> eol: on update, only re-check files if filtering changed
>
> Before, update would mark all files as 'normallookup' in dirstate if .hgeol
> changed so all files would get the new filtering applied. That takes some time
> ... and is pointless if the filtering for that file didn't change.
>
> Instead, keep track of the old filtering and only check files where the
> filtering is changed.
>
> To keep the old filtering, change to write the applied .hgeol content to
> .hg/eol.cache instead of just touching it. That change is backwards/forwards
> compatible.
>
> In a real world test, this takes an update that is changing .hgeol and 30000
> files from 12s to 4s - where the remaining eol overhead is 1-2s.
>
> diff --git a/hgext/eol.py b/hgext/eol.py
> --- a/hgext/eol.py
> +++ b/hgext/eol.py
> @@ -312,10 +312,15 @@ def reposetup(ui, repo):
>                  self._eolmatch = util.never
>                  return
>
> +            oldeol = None
>              try:
>                  cachemtime = os.path.getmtime(self.join("eol.cache"))

This seems like it should live in the "cache/" directory. Any reason not 
to ?
Mads Kiilerich - Oct. 9, 2016, 4:19 p.m.
On 10/09/2016 04:45 PM, Pierre-Yves David wrote:
>
>
> On 10/09/2016 04:19 PM, Mads Kiilerich wrote:
>> diff --git a/hgext/eol.py b/hgext/eol.py
>> --- a/hgext/eol.py
>> +++ b/hgext/eol.py
>> @@ -312,10 +312,15 @@ def reposetup(ui, repo):
>>                  self._eolmatch = util.never
>>                  return
>>
>> +            oldeol = None
>>              try:
>>                  cachemtime = os.path.getmtime(self.join("eol.cache"))
>
> This seems like it should live in the "cache/" directory. Any reason 
> not to ?
>

I don't know. I guess it predates the cache directory. I would probably 
agree it now would be better if it lived there.

This patch is backwards compatible and doesn't change that. Changing the 
location would be a separate independent change - and apparently trivial 
and backwards compatible.


/Mads
Pierre-Yves David - Oct. 13, 2016, 11:42 a.m.
On 10/09/2016 06:19 PM, Mads Kiilerich wrote:
> On 10/09/2016 04:45 PM, Pierre-Yves David wrote:
>>
>>
>> On 10/09/2016 04:19 PM, Mads Kiilerich wrote:
>>> diff --git a/hgext/eol.py b/hgext/eol.py
>>> --- a/hgext/eol.py
>>> +++ b/hgext/eol.py
>>> @@ -312,10 +312,15 @@ def reposetup(ui, repo):
>>>                  self._eolmatch = util.never
>>>                  return
>>>
>>> +            oldeol = None
>>>              try:
>>>                  cachemtime = os.path.getmtime(self.join("eol.cache"))
>>
>> This seems like it should live in the "cache/" directory. Any reason
>> not to ?
>>
>
> I don't know. I guess it predates the cache directory. I would probably
> agree it now would be better if it lived there.
>
> This patch is backwards compatible and doesn't change that. Changing the
> location would be a separate independent change - and apparently trivial
> and backwards compatible.

Ha yep. We should probably move the file in the cache directory in a 
follow up patch.

I've pushed that patch as is. I've seen some scary (pre-existing) wlock 
handling in the neighborhood of this patch, I'll send a follow up for that.

Cheers,

Patch

diff --git a/hgext/eol.py b/hgext/eol.py
--- a/hgext/eol.py
+++ b/hgext/eol.py
@@ -312,10 +312,15 @@  def reposetup(ui, repo):
                 self._eolmatch = util.never
                 return
 
+            oldeol = None
             try:
                 cachemtime = os.path.getmtime(self.join("eol.cache"))
             except OSError:
                 cachemtime = 0
+            else:
+                olddata = self.vfs.read("eol.cache")
+                if olddata:
+                    oldeol = eolfile(self.ui, self.root, olddata)
 
             try:
                 eolmtime = os.path.getmtime(self.wjoin(".hgeol"))
@@ -324,17 +329,37 @@  def reposetup(ui, repo):
 
             if eolmtime > cachemtime:
                 self.ui.debug("eol: detected change in .hgeol\n")
+
+                hgeoldata = self.wvfs.read('.hgeol')
+                neweol = eolfile(self.ui, self.root, hgeoldata)
+
                 wlock = None
                 try:
                     wlock = self.wlock()
                     for f in self.dirstate:
-                        if self.dirstate[f] == 'n':
-                            # all normal files need to be looked at
-                            # again since the new .hgeol file might no
-                            # longer match a file it matched before
-                            self.dirstate.normallookup(f)
-                    # Create or touch the cache to update mtime
-                    self.vfs("eol.cache", "w").close()
+                        if self.dirstate[f] != 'n':
+                            continue
+                        if oldeol is not None:
+                            if not oldeol.match(f) and not neweol.match(f):
+                                continue
+                            oldkey = None
+                            for pattern, key, m in oldeol.patterns:
+                                if m(f):
+                                    oldkey = key
+                                    break
+                            newkey = None
+                            for pattern, key, m in neweol.patterns:
+                                if m(f):
+                                    newkey = key
+                                    break
+                            if oldkey == newkey:
+                                continue
+                        # all normal files need to be looked at again since
+                        # the new .hgeol file specify a different filter
+                        self.dirstate.normallookup(f)
+                    # Write the cache to update mtime and cache .hgeol
+                    with self.vfs("eol.cache", "w") as f:
+                        f.write(hgeoldata)
                     wlock.release()
                 except error.LockUnavailable:
                     # If we cannot lock the repository and clear the