Submitter | Boris Feld |
---|---|
Date | July 16, 2018, 6:50 p.m. |
Message ID | <c34ef3def14e04dca76c.1531767023@FB-lair> |
Permalink | /patch/32872/ |
State | Accepted |
Comments
On Mon, Jul 16, 2018 at 11:50 AM Boris Feld <boris.feld@octobus.net> wrote:

> # HG changeset patch
> # User Paul Morelle <paul.morelle@octobus.net>
> # Date 1528179575 -7200
> #      Tue Jun 05 08:19:35 2018 +0200
> # Node ID c34ef3def14e04dca76c43667766496a99636b44
> # Parent  5ae60e5a705ef273d316e33df401e4c44a4c482a
> # EXP-Topic write-for-sparse-read
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r c34ef3def14e
> sparse-revlog: implement algorithm to write sparse delta chains (issue5480)
>
> The classic behavior of revlog._isgooddeltainfo is to consider the span
> size of the whole delta chain and limit it to 4 * textlen.
> Once sparse-revlog writing is allowed (and enforced with a requirement),
> revlog._isgooddeltainfo considers the span of the largest chunk as the
> distance used in the verification, instead of the span of the whole
> delta chain.
>
> In order to compute the span of the largest chunk, we need to slice into
> chunks a chain with the new revision at the top of the revlog, and take
> the maximal span of these chunks. The sparse read density is a parameter
> to the slicing: it stops when the global read density reaches this
> threshold. For instance, a density of 50% means that 2 of 4 read bytes
> are actually used for the reconstruction of the revision (the others are
> part of other chains).
>
> This allows a new revision to be stored with a diff against another
> revision anywhere in the history, instead of forcing it into the last
> 4 * textlen. The result is much better compression on repositories that
> have many concurrent branches. Here is a comparison between deltas from
> current upstream (aggressive-merge-deltas on by default) and deltas from
> a sparse-revlog.

Did you consider simply removing the restriction on span distance? See how
FB did it in
https://bitbucket.org/facebook/hg-experimental/commits/f60c6d7b9976ce1be85f51883d49b12b13302b27.
I tried it on the pypy repo and it shrunk the 00manifest.d a little more
(from 329MB with this patch to 282MB with Durham's). I couldn't measure any
operations getting slower. Do you have any operations that you would expect
to get slower with that patch? I tried setting maxchainlen=1000 and it
produced an insignificantly larger 00manifest.d (~1MB), in case you're
thinking that the smaller manifest is at the expense of very long chains.
Or maybe the difference would mostly be noticed on cold disk? I didn't test
that at all.

> Comparison of `.hg/store/` size:
>
>   mercurial (6.74% merges):
>     before:     46,831,873 bytes
>     after:      46,795,992 bytes (no relevant change)
>   pypy (8.30% merges):
>     before:    333,524,651 bytes
>     after:     308,417,511 bytes  (-8%)
>   netbeans (34.21% merges):
>     before:  1,141,847,554 bytes
>     after:   1,131,093,161 bytes  (-1%)
>   mozilla-central (4.84% merges):
>     before:  2,344,248,850 bytes
>     after:   2,328,459,258 bytes  (-1%)
>   large-private-repo-A (19.73% merges):
>     before: 41,510,550,163 bytes
>     after:   8,121,763,428 bytes  (-80%)
>   large-private-repo-B (23.77% merges):
>     before: 58,702,221,709 bytes
>     after:   8,351,588,828 bytes  (-76%)
>
> Comparison of `00manifest.d` size:
>
>   mercurial (6.74% merges):
>     before:      6,143,044 bytes
>     after:       6,107,163 bytes
>   pypy (8.30% merges):
>     before:     52,941,780 bytes
>     after:      27,834,082 bytes  (-48%)
>   netbeans (34.21% merges):
>     before:    130,088,982 bytes
>     after:     119,337,636 bytes  (-10%)
>   mozilla-central (4.84% merges):
>     before:    215,096,339 bytes
>     after:     199,496,863 bytes  (-8%)
>   large-private-repo-A (19.73% merges):
>     before: 33,725,285,081 bytes
>     after:     390,302,545 bytes  (-99%)
>   large-private-repo-B (23.77% merges):
>     before: 49,457,701,645 bytes
>     after:   1,366,752,187 bytes  (-97%)
>
> The better delta chains provide a performance boost on relevant
> repositories:
>
>   pypy, bundling 1000 revisions:
>     before: 1.670s
>     after:  1.149s  (-31%)
>
> Unbundling got a bit slower, probably because the sparse algorithm is
> still pure Python:
>
>   pypy, unbundling 1000 revisions:
>     before: 4.062s
>     after:  4.507s  (+10%)
>
> Performance of bundle/unbundle on repositories with few concurrent
> branches (e.g. mercurial) is unaffected.
>
> No significant differences have been noticed when timing `hg push` and
> `hg pull` locally. More stable timings are being gathered.
>
> As with aggressive-merge-delta, better deltas come with longer delta
> chains, and longer chains have a performance impact. For example, the
> length of the chain needed to get the manifest of pypy's tip moves from
> 82 items to 1929 items. This moves the restore time from 3.88ms to
> 11.3ms.
>
> Delta chain length is an independent issue that also affects repositories
> without this change. It will be dealt with separately.
>
> No significant differences have been observed on repositories where
> `sparse-revlog` has little effect (mercurial, unity, netbeans). On pypy,
> small differences have been observed on some operations affected by delta
> chain building and retrieval:
>
>   pypy, perfmanifest:
>     before: 0.006162s
>     after:  0.017899s  (+190%)
>
>   pypy, commit:
>     before: 0.382s
>     after:  0.376s  (-1%)
>
>   pypy, status:
>     before: 0.157s
>     after:  0.168s  (+7%)
>
> More comprehensive and stable timing comparisons are in progress.
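To make the density rule in the description concrete, here is a minimal, self-contained sketch of the slicing idea. The function name `slice_to_density`, the callable `start`/`length` parameters, and the greedy gap-skipping loop are illustrative assumptions; the real implementation is `_slicechunktodensity` in the patch below.

```python
def slice_to_density(revs, start, length, targetdensity=0.5):
    """Split sorted `revs` into read chunks whose payload/span ratio is at
    least `targetdensity`.

    `start(rev)` and `length(rev)` return a revision's offset and
    compressed size in the data file.
    """
    if not revs:
        return []
    span = start(revs[-1]) + length(revs[-1]) - start(revs[0])
    payload = sum(length(r) for r in revs)
    if not span or payload / float(span) >= targetdensity:
        return [revs]  # already dense enough: a single read suffices

    # Collect the gaps between consecutive revisions, largest first.
    gaps = sorted(
        ((start(cur) - (start(prev) + length(prev)), cur)
         for prev, cur in zip(revs, revs[1:])),
        reverse=True)

    # Greedily stop reading across the largest gaps until the density of
    # what remains to be read reaches the target.
    readdata = span
    splitpoints = set()
    for gapsize, cur in gaps:
        if gapsize <= 0 or payload / float(readdata) >= targetdensity:
            break
        readdata -= gapsize
        splitpoints.add(cur)

    # Cut the chain at every skipped gap.
    chunks, current = [], [revs[0]]
    for rev in revs[1:]:
        if rev in splitpoints:
            chunks.append(current)
            current = []
        current.append(rev)
    chunks.append(current)
    return chunks

# Example: revisions at offsets 0, 10, 1000 with sizes 10, 10, 10. The
# 980-byte gap is skipped, giving two small reads instead of one huge one.
offsets = {0: 0, 1: 10, 2: 1000}
sizes = {0: 10, 1: 10, 2: 10}
print(slice_to_density([0, 1, 2], offsets.get, sizes.get))
# -> [[0, 1], [2]]
```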
On 27/07/2018 01:57, Martin von Zweigbergk via Mercurial-devel wrote:

> Did you consider simply removing the restriction on span distance? See
> how FB did it in
> https://bitbucket.org/facebook/hg-experimental/commits/f60c6d7b9976ce1be85f51883d49b12b13302b27.
> I tried it on the pypy repo and it shrunk the 00manifest.d a little more
> (from 329MB with this patch to 282MB with Durham's). [...]

We came to the same conclusion and sent a series last year to allow setting
it up with `experimental.maxdeltachainspan=0` (see e9d325cfe071 for
details).

Testing it on several repositories in the wild showed that reading the full
unlimited span introduces memory consumption issues. The sparse-read
capabilities and delta strategies are the solutions to these issues.
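For reference, the override discussed here would be configured along these lines. This hgrc snippet is a sketch: only the option name `experimental.maxdeltachainspan` comes from the thread; the comment on its semantics is an assumption based on the discussion.

```ini
[experimental]
# 0 removes the span limit entirely, so a read may cover arbitrarily
# distant parts of the revlog (the memory-consumption concern above)
maxdeltachainspan = 0
```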
On Fri, Jul 27, 2018 at 1:24 PM Boris FELD <boris.feld@octobus.net> wrote:

> We came to the same conclusion and sent a series last year to allow
> setting it up with `experimental.maxdeltachainspan=0` (see e9d325cfe071
> for details).

Thanks for the pointer.

> Testing it on several repositories in the wild showed that reading the
> full unlimited span introduces memory consumption issues.

What operations would use too much memory? Can the effect be seen on some
open-source repo?

> The sparse-read capabilities and delta strategies are the solutions to
> these issues.
Patch
diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -216,6 +216,9 @@ class _testrevlog(object):
     def length(self, rev):
         return self.end(rev) - self.start(rev)
 
+    def __len__(self):
+        return len(self._data)
+
 def _trimchunk(revlog, revs, startidx, endidx=None):
     """returns revs[startidx:endidx] without empty trailing revs
 
@@ -293,7 +296,7 @@ def _segmentspan(revlog, revs):
         return 0
     return revlog.end(revs[-1]) - revlog.start(revs[0])
 
-def _slicechunk(revlog, revs, targetsize=None):
+def _slicechunk(revlog, revs, deltainfo=None, targetsize=None):
     """slice revs to reduce the amount of unrelated data to be read from disk.
 
     ``revs`` is sliced into groups that should be read in one time.
@@ -341,20 +344,27 @@ def _slicechunk(revlog, revs, targetsize
     [[1, 2], [5, 8, 10, 11], [14]]
 
     Slicing with a maximum chunk size
-    >>> list(_slicechunk(revlog, [0, 11, 13, 15], 15))
+    >>> list(_slicechunk(revlog, [0, 11, 13, 15], targetsize=15))
     [[0], [11], [13], [15]]
-    >>> list(_slicechunk(revlog, [0, 11, 13, 15], 20))
+    >>> list(_slicechunk(revlog, [0, 11, 13, 15], targetsize=20))
     [[0], [11], [13, 15]]
     """
     if targetsize is not None:
         targetsize = max(targetsize, revlog._srmingapsize)
+    # targetsize should not be specified when evaluating delta candidates:
+    # * targetsize is used to ensure we stay within specification when reading,
+    # * deltainfo is used to pick a good delta chain when writing.
+    if not (deltainfo is None or targetsize is None):
+        msg = 'cannot use `targetsize` with a `deltainfo`'
+        raise error.ProgrammingError(msg)
     for chunk in _slicechunktodensity(revlog, revs,
+                                      deltainfo,
                                       revlog._srdensitythreshold,
                                       revlog._srmingapsize):
         for subchunk in _slicechunktosize(revlog, chunk, targetsize):
             yield subchunk
 
-def _slicechunktosize(revlog, revs, targetsize):
+def _slicechunktosize(revlog, revs, targetsize=None):
     """slice revs to match the target size
 
     This is intended to be used on chunk that density slicing selected by that
@@ -431,12 +441,16 @@ def _slicechunktosize(revlog, revs, targ
             endrevidx = idx
     yield _trimchunk(revlog, revs, startrevidx)
 
-def _slicechunktodensity(revlog, revs, targetdensity=0.5, mingapsize=0):
+def _slicechunktodensity(revlog, revs, deltainfo=None, targetdensity=0.5,
+                         mingapsize=0):
     """slice revs to reduce the amount of unrelated data to be read from disk.
 
     ``revs`` is sliced into groups that should be read in one time.
     Assume that revs are sorted.
 
+    ``deltainfo`` is a _deltainfo instance of a revision that we would append
+    to the top of the revlog.
+
     The initial chunk is sliced until the overall density (payload/chunks-span
     ratio) is above `targetdensity`. No gap smaller than `mingapsize` is
     skipped.
@@ -487,13 +501,21 @@ def _slicechunktodensity(revlog, revs, t
         yield revs
         return
 
-    readdata = deltachainspan = _segmentspan(revlog, revs)
+    nextrev = len(revlog)
+    nextoffset = revlog.end(nextrev - 1)
+
+    if deltainfo is None:
+        deltachainspan = _segmentspan(revlog, revs)
+        chainpayload = sum(length(r) for r in revs)
+    else:
+        deltachainspan = deltainfo.distance
+        chainpayload = deltainfo.compresseddeltalen
 
     if deltachainspan < mingapsize:
         yield revs
         return
 
-    chainpayload = sum(length(r) for r in revs)
+    readdata = deltachainspan
 
     if deltachainspan:
         density = chainpayload / float(deltachainspan)
@@ -504,13 +526,21 @@ def _slicechunktodensity(revlog, revs, t
         yield revs
         return
 
+    if deltainfo is not None:
+        revs = list(revs)
+        revs.append(nextrev)
+
     # Store the gaps in a heap to have them sorted by decreasing size
     gapsheap = []
     heapq.heapify(gapsheap)
     prevend = None
     for i, rev in enumerate(revs):
-        revstart = start(rev)
-        revlen = length(rev)
+        if rev < nextrev:
+            revstart = start(rev)
+            revlen = length(rev)
+        else:
+            revstart = nextoffset
+            revlen = deltainfo.deltalen
 
         # Skip empty revisions to form larger holes
         if revlen == 0:
@@ -1980,7 +2010,7 @@ class revlog(object):
         if not self._withsparseread:
             slicedchunks = (revs,)
         else:
-            slicedchunks = _slicechunk(self, revs, targetsize)
+            slicedchunks = _slicechunk(self, revs, targetsize=targetsize)
 
         for revschunk in slicedchunks:
             firstrev = revschunk[0]
@@ -2393,13 +2423,33 @@ class revlog(object):
         # deltas we need to apply -- bounding it limits the amount of CPU
         # we consume.
 
-        distance = deltainfo.distance
+        if self._sparserevlog:
+            # As sparse-read will be used, we can consider that the distance,
+            # instead of being the span of the whole chunk,
+            # is the span of the largest read chunk
+            base = deltainfo.base
+
+            if base != nullrev:
+                deltachain = self._deltachain(base)[0]
+            else:
+                deltachain = []
+
+            chunks = _slicechunk(self, deltachain, deltainfo)
+            distance = max(map(lambda revs: _segmentspan(self, revs), chunks))
+        else:
+            distance = deltainfo.distance
+
         textlen = revinfo.textlen
         defaultmax = textlen * 4
         maxdist = self._maxdeltachainspan
         if not maxdist:
             maxdist = distance # ensure the conditional pass
         maxdist = max(maxdist, defaultmax)
+        if self._sparserevlog and maxdist < self._srmingapsize:
+            # In multiple places, we ignore irrelevant data ranges below a
+            # certain size. We also apply this tradeoff here and relax the
+            # span constraint for small enough content.
+            maxdist = self._srmingapsize
         if (distance > maxdist or deltainfo.deltalen > textlen or
             deltainfo.compresseddeltalen > textlen * 2 or
             (self._maxchainlen and deltainfo.chainlen > self._maxchainlen)):
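To summarize the acceptance criterion in isolation: once the sparse path has computed `distance` as the span of the largest read chunk, the patch applies the checks sketched below. This is an editorial restatement under assumptions: the flat `is_good_delta` function, its parameter list, and the 256 KB minimum-gap default are illustrative, not the real `revlog._isgooddeltainfo` API, and the slicing that produces `distance` is elided.

```python
def is_good_delta(distance, deltalen, compresseddeltalen, chainlen, textlen,
                  maxdeltachainspan=-1, maxchainlen=0,
                  sparserevlog=False, srmingapsize=256 * 1024):
    """Return True if a candidate delta should be stored.

    With `sparserevlog`, `distance` is the span of the largest chunk a
    sparse read would need (from the slicing), not the span of the whole
    delta chain.
    """
    maxdist = maxdeltachainspan
    if not maxdist:
        maxdist = distance  # 0 means unlimited: the span check always passes
    maxdist = max(maxdist, textlen * 4)
    if sparserevlog and maxdist < srmingapsize:
        # Small gaps are read through anyway, so relax the span constraint
        # for content below the minimum gap size.
        maxdist = srmingapsize
    if distance > maxdist:
        return False
    if deltalen > textlen or compresseddeltalen > textlen * 2:
        return False
    if maxchainlen and chainlen > maxchainlen:
        return False
    return True

# A 1 KB file whose candidate delta sits 100 KB away: the classic
# 4 * textlen rule rejects it, while sparse-revlog accepts it because the
# span stays below the (assumed 256 KB) minimum gap size.
print(is_good_delta(100 * 1024, 200, 300, 10, 1024))                     # False
print(is_good_delta(100 * 1024, 200, 300, 10, 1024, sparserevlog=True))  # True
```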