Submitter | phabricator |
---|---|
Date | June 20, 2019, 9:09 p.m. |
Message ID | <differential-rev-PHID-DREV-fh4g6ufirvk523hpzp4v-req@mercurial-scm.org> |
Download | mbox | patch |
Permalink | /patch/40633/ |
State | Superseded |
Headers | show |
Comments
> while work: > - r, i1, copies1 = heapq.heappop(work) > + r, i1, copies = heapq.heappop(work) > if work and work[0][0] == r: > # We are tracing copies from both parents > r, i2, copies2 = heapq.heappop(work) > - copies = {} > - allcopies = set(copies1) | set(copies2) > - for dst in allcopies: > + for dst, src in copies2.items(): > # Unlike when copies are stored in the filelog, we consider > # it a copy even if the destination already existed on the > # other branch. It's simply too expensive to check if the > # file existed in the manifest. > - if dst in copies1: > - # If it was copied on the p1 side, mark it as copied from > + if dst not in copies: > + # If it was copied on the p1 side, leave it as copied from > # that side, even if it was also copied on the p2 side. > - copies[dst] = copies1[dst] > - else: > copies[dst] = copies2[dst] Are we sure there's no `copies` alias held by later `work`? I'm just asking because we've optimized some `copies.copy()`s away at 5ceb91136ebe. I don't know if it's safe or not to mutate `copies` at this point.
yuja added a comment. > while work: > > - r, i1, copies1 = heapq.heappop(work) > > + r, i1, copies = heapq.heappop(work) > > if work and work[0][0] == r: > # We are tracing copies from both parents > r, i2, copies2 = heapq.heappop(work) > > - copies = {} > - allcopies = set(copies1) | set(copies2) > - for dst in allcopies: > > + for dst, src in copies2.items(): > > 1. Unlike when copies are stored in the filelog, we consider > 2. it a copy even if the destination already existed on the > 3. other branch. It's simply too expensive to check if the > 4. file existed in the manifest. > - if dst in copies1: > - # If it was copied on the p1 side, mark it as copied from > > + if dst not in copies: > + # If it was copied on the p1 side, leave it as copied from > > 1. that side, even if it was also copied on the p2 side. > - copies[dst] = copies1[dst] > - else: copies[dst] = copies2[dst] Are we sure there's no `copies` alias held by later `work`? I'm just asking because we've optimized some `copies.copy()`s away at 5ceb91136ebe <https://phab.mercurial-scm.org/rHG5ceb91136ebe79215da1723d3a905009b5f2a64f>. I don't know if it's safe or not to mutate `copies` at this point. REPOSITORY rHG Mercurial CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D6561/new/ REVISION DETAIL https://phab.mercurial-scm.org/D6561 To: martinvonz, #hg-reviewers Cc: yuja, mercurial-devel
martinvonz added a comment. In D6561#95834 <https://phab.mercurial-scm.org/D6561#95834>, @yuja wrote: >> while work: >> >> - r, i1, copies1 = heapq.heappop(work) >> >> + r, i1, copies = heapq.heappop(work) >> >> if work and work[0][0] == r: >> # We are tracing copies from both parents >> r, i2, copies2 = heapq.heappop(work) >> >> - copies = {} >> - allcopies = set(copies1) | set(copies2) >> - for dst in allcopies: >> >> + for dst, src in copies2.items(): >> >> 1. Unlike when copies are stored in the filelog, we consider >> 2. it a copy even if the destination already existed on the >> 3. other branch. It's simply too expensive to check if the >> 4. file existed in the manifest. >> - if dst in copies1: >> - # If it was copied on the p1 side, mark it as copied from >> >> + if dst not in copies: >> + # If it was copied on the p1 side, leave it as copied from >> >> 1. that side, even if it was also copied on the p2 side. >> - copies[dst] = copies1[dst] >> - else: copies[dst] = copies2[dst] > > Are we sure there's no `copies` alias held by later `work`? > I'm just asking because we've optimized some `copies.copy()`s away at > 5ceb91136ebe <https://phab.mercurial-scm.org/rHG5ceb91136ebe79215da1723d3a905009b5f2a64f>. I don't know if it's safe or not to mutate `copies` at this > point. I *think* it should be safe. The policy is that each instance can only appear in one place in the queue. I should send a patch to document that policy. Let me know if you see a violation of that policy. REPOSITORY rHG Mercurial CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D6561/new/ REVISION DETAIL https://phab.mercurial-scm.org/D6561 To: martinvonz, #hg-reviewers Cc: yuja, mercurial-devel
Patch
diff --git a/mercurial/copies.py b/mercurial/copies.py --- a/mercurial/copies.py +++ b/mercurial/copies.py @@ -272,25 +272,19 @@ heapq.heapify(work) alwaysmatch = match.always() while work: - r, i1, copies1 = heapq.heappop(work) + r, i1, copies = heapq.heappop(work) if work and work[0][0] == r: # We are tracing copies from both parents r, i2, copies2 = heapq.heappop(work) - copies = {} - allcopies = set(copies1) | set(copies2) - for dst in allcopies: + for dst, src in copies2.items(): # Unlike when copies are stored in the filelog, we consider # it a copy even if the destination already existed on the # other branch. It's simply too expensive to check if the # file existed in the manifest. - if dst in copies1: - # If it was copied on the p1 side, mark it as copied from + if dst not in copies: + # If it was copied on the p1 side, leave it as copied from # that side, even if it was also copied on the p2 side. - copies[dst] = copies1[dst] - else: copies[dst] = copies2[dst] - else: - copies = copies1 if r == b.rev(): _filter(a, b, copies) return copies