Patchwork D8499: rust-regex: increase the DFA size limit for the `regex` crate

Submitter phabricator
Date May 7, 2020, 10:30 a.m.
Message ID <differential-rev-PHID-DREV-dvaapwmfhx7ekn7cont2-req@mercurial-scm.org>
Permalink /patch/46265/
State Superseded

Comments

phabricator - May 7, 2020, 10:30 a.m.
Alphare created this revision.
Herald added a reviewer: hg-reviewers.
Herald added a subscriber: mercurial-patches.

REVISION SUMMARY
  The DFA size limit for `re2` is already increased in
  `rust/hg-core/src/re2/rust_re2.cpp`; the same has to be done for the `regex`
  crate.
  
  Big repositories with big `.hgignore` files will sometimes hit the default
  limit and suffer extreme performance regressions (I've seen `hg status` take
  *minutes* in one such repository).

REPOSITORY
  rHG Mercurial

BRANCH
  stable

REVISION DETAIL
  https://phab.mercurial-scm.org/D8499

AFFECTED FILES
  rust/hg-core/src/matchers.rs

CHANGE DETAILS




To: Alphare, #hg-reviewers
Cc: mercurial-patches, mercurial-devel

Patch

diff --git a/rust/hg-core/src/matchers.rs b/rust/hg-core/src/matchers.rs
--- a/rust/hg-core/src/matchers.rs
+++ b/rust/hg-core/src/matchers.rs
@@ -358,6 +358,10 @@ 
     let pattern_string = unsafe { String::from_utf8_unchecked(escaped_bytes) };
     let re = regex::bytes::RegexBuilder::new(&pattern_string)
         .unicode(false)
+        // Big repos with big `.hgignore` will hit the default limit and
+        // incur a significant performance hit. One repo's `hg status` hit
+        // multiple *minutes*.
+        .dfa_size_limit(50 * (1 << 20))
         .build()
         .map_err(|e| PatternError::UnsupportedSyntax(e.to_string()))?;
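
For readers checking the magic number in the patch: `1 << 20` is one mebibyte in bytes, so `50 * (1 << 20)` asks for a DFA cache of roughly 50 MiB. For comparison, the `regex` 1.x default is, to my understanding, `2 * (1 << 20)`. The sketch below only spells out the arithmetic using the standard library; the names `mib` and `DFA_SIZE_LIMIT_BYTES` are hypothetical and do not appear in `matchers.rs`.

```rust
/// Hypothetical helper (not part of the patch): converts mebibytes to bytes.
const fn mib(n: usize) -> usize {
    n * (1 << 20)
}

/// The value the patch passes to `dfa_size_limit`, written out as a
/// named constant for illustration only.
const DFA_SIZE_LIMIT_BYTES: usize = 50 * (1 << 20);

fn main() {
    // Both spellings denote the same number of bytes.
    println!(
        "{} bytes ({} MiB)",
        DFA_SIZE_LIMIT_BYTES,
        DFA_SIZE_LIMIT_BYTES / mib(1)
    );
}
```

A named constant like this would make the intent of the limit readable at the call site, though the patch keeps the inline expression.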