Patchwork [5,of,5] sshpeer: run the ssh command unbuffered

login
register
mail settings
Submitter Pierre-Yves David
Date May 20, 2015, 11:53 p.m.
Message ID <41cdf78c6c38b842b8db.1432166004@marginatus.alto.octopoid.net>
Download mbox | patch
Permalink /patch/9210/
State Superseded
Commit c88975a4d2643009a94a12f1e37604712a9d13ee
Headers show

Comments

Pierre-Yves David - May 20, 2015, 11:53 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@fb.com>
# Date 1432139498 18000
#      Wed May 20 11:31:38 2015 -0500
# Node ID 41cdf78c6c38b842b8dbed6c0d9e87144f8fedce
# Parent  f0f5363aafdce72c18936ab478480a47b230a941
sshpeer: run the ssh command unbuffered

This will be necessary to get the successful non-blocking IO we need for
restoring real time output with ssh peer.

Changeset fce065538bcf is talking about 5x regression on Mac OS X when playing
with this value. I'll just assume this was not about sshpeer performance. We'll
see if Mac OS X user come knocking at my door with pitchfork.
Matt Mackall - May 22, 2015, 4:37 p.m.
On Wed, 2015-05-20 at 18:53 -0500, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1432139498 18000
> #      Wed May 20 11:31:38 2015 -0500
> # Node ID 41cdf78c6c38b842b8dbed6c0d9e87144f8fedce
> # Parent  f0f5363aafdce72c18936ab478480a47b230a941
> sshpeer: run the ssh command unbuffered

I'm going to wait on this one until we've researched whether it
regresses OS X. I've queued the rest for default, thanks.
Matt Mackall - May 22, 2015, 4:51 p.m.
On Wed, 2015-05-20 at 18:53 -0500, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1432139498 18000
> #      Wed May 20 11:31:38 2015 -0500
> # Node ID 41cdf78c6c38b842b8dbed6c0d9e87144f8fedce
> # Parent  f0f5363aafdce72c18936ab478480a47b230a941
> sshpeer: run the ssh command unbuffered

I'm going to wait on this one until we've researched whether it
regresses OS X. I've queued the rest for default, thanks.
Pierre-Yves David - May 22, 2015, 5:08 p.m.
On 05/22/2015 11:37 AM, Matt Mackall wrote:
> On Wed, 2015-05-20 at 18:53 -0500, Pierre-Yves David wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david@fb.com>
>> # Date 1432139498 18000
>> #      Wed May 20 11:31:38 2015 -0500
>> # Node ID 41cdf78c6c38b842b8dbed6c0d9e87144f8fedce
>> # Parent  f0f5363aafdce72c18936ab478480a47b230a941
>> sshpeer: run the ssh command unbuffered
>
> I'm going to wait on this one until we've researched whether it
> regresses OS X. I've queued the rest for default, thanks.

On my linux here is what local ssh operation give me with this series 
(on mercurial-devel repository evolve enabled)

1) current public tip
2) ssh unbuffered
3) full patch series for real time output

First number is the number of read call. The second one is the timing of 
the whole operation in second

clone
1) 72068 40s
2) 99064 41s
3) 98983 43s

pull 1000:
1) 3940 7.1s
2) 5447 7.1s
3) 5445 7.3s

push 1000:
1) 2026 9.0s
2) 2739 8.4s
3) 2748 8.3s

note:
- the push number are puzzling
- I think our lazy reading from the stderr pipe could be more efficient.
Augie Fackler - May 27, 2015, 7:43 p.m.
On Fri, May 22, 2015 at 12:37 PM, Matt Mackall <mpm@selenic.com> wrote:
> I'm going to wait on this one until we've researched whether it
> regresses OS X. I've queued the rest for default, thanks.


What in particular are you worried about on OS X? I can try and run some tests.
Matt Mackall - May 31, 2015, 5:03 p.m.
On Wed, 2015-05-27 at 15:43 -0400, Augie Fackler wrote:
> On Fri, May 22, 2015 at 12:37 PM, Matt Mackall <mpm@selenic.com> wrote:
> > I'm going to wait on this one until we've researched whether it
> > regresses OS X. I've queued the rest for default, thanks.
> 
> 
> What in particular are you worried about on OS X? I can try and run some tests.

The 5x performance regression mentioned in commit cited in the patch.
Basically, at some point we were reading from ssh a byte at a time
according to dtrace... and context switching each time.

This actually will probably be less visible today as Macs are all
multicore and thus can more efficiently slowly dribble data over a pipe
between two processes, but it will probably still manifest as
significantly higher CPU usage.

Patch

diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
--- a/mercurial/sshpeer.py
+++ b/mercurial/sshpeer.py
@@ -85,11 +85,12 @@  class sshpeer(wireproto.wirepeer):
         self.ui.debug('running %s\n' % cmd)
         cmd = util.quotecommand(cmd)
 
         # while self.subprocess isn't used, having it allows the subprocess to
         # to clean up correctly later
-        self.pipeo, self.pipei, self.pipee, self.subprocess = util.popen4(cmd)
+        sub = util.popen4(cmd, bufsize=0) # no buffer allow the use of 'select'
+        self.pipeo, self.pipei, self.pipee, self.subprocess = sub
 
         # skip any noise generated by remote shell
         self._callstream("hello")
         r = self._callstream("between", pairs=("%s-%s" % ("0"*40, "0"*40)))
         lines = ["", "dummy"]