by Toast_25 on 2/9/18, 4:30 PM with 20 comments
by twic on 2/9/18, 9:50 PM
#![feature(test)]
extern crate regex;
extern crate test;

use regex::Regex;

pub fn check(pattern: &Regex, input: &str) -> bool {
    pattern.is_match(input)
}

#[cfg(test)]
mod tests {
    use super::*;
    use test::Bencher;

    #[bench]
    fn bench_check(b: &mut Bencher) {
        // n copies of "a?" followed by n copies of "a", matched against n copies of "a".
        let n = 29;
        let pattern = Regex::new(&format!("{}{}", "a?".repeat(n), "a".repeat(n))).unwrap();
        let input = "a".repeat(n);
        // The regex is compiled once, outside the timed closure; only matching is measured.
        b.iter(|| check(&pattern, &input));
    }
}
And then:
$ cat rust-toolchain
nightly-2018-02-09
$ cargo bench
Compiling void v1.0.2
Compiling lazy_static v1.0.0
Compiling regex-syntax v0.4.2
Compiling libc v0.2.36
Compiling utf8-ranges v1.0.0
Compiling unreachable v1.0.0
Compiling thread_local v0.3.5
Compiling memchr v2.0.1
Compiling aho-corasick v0.6.4
Compiling regex v0.2.6
Compiling regexps v0.1.0 (file:///home/twic/Code/Regexps)
Finished release [optimized] target(s) in 20.11 secs
Running target/release/deps/regexps-9c084e63d7d31ac3
running 1 test
test tests::bench_check ... bench: 212 ns/iter (+/- 8)
test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured; 0 filtered out
Rust's standard regex library isn't the fastest in the world, and the lack of backtracking can be limiting, but it is highly optimised for Rust's main use case of posting showoff comments on Hacker News.
I also tried this in Java (OpenJDK 1.8.0_161-b14), and it took about 24 seconds per iteration at 29 characters.
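To make the gap between those two numbers a bit more concrete, here is a rough sketch of the two matching strategies applied to this one pattern shape. It is not the regex crate's or the JDK's actual implementation: `Item`, `build`, `backtrack`, `simulate` and `close` are names made up for the illustration, the matcher only understands "a?" and "a", and it does an anchored full match rather than a search. The step counters are only there to show how the work grows with n.

```rust
// Toy pattern element: both variants match the literal byte b'a'.
#[derive(Clone, Copy)]
enum Item {
    Optional, // "a?"
    Required, // "a"
}

// Build n copies of "a?" followed by n copies of "a".
fn build(n: usize) -> Vec<Item> {
    let mut pattern = vec![Item::Optional; n];
    pattern.extend(vec![Item::Required; n]);
    pattern
}

// Greedy recursive backtracking: try to consume, and on failure retry with a skip.
fn backtrack(pattern: &[Item], input: &[u8], steps: &mut u64) -> bool {
    *steps += 1;
    match pattern.split_first() {
        None => input.is_empty(),
        Some((Item::Required, rest)) => {
            input.first() == Some(&b'a') && backtrack(rest, &input[1..], steps)
        }
        Some((Item::Optional, rest)) => {
            (input.first() == Some(&b'a') && backtrack(rest, &input[1..], steps))
                || backtrack(rest, input, steps)
        }
    }
}

// An optional item can be skipped without consuming input, so position i
// being reachable makes position i + 1 reachable too. One forward pass suffices.
fn close(pattern: &[Item], set: &mut [bool], steps: &mut u64) {
    for i in 0..pattern.len() {
        *steps += 1;
        if set[i] {
            if let Item::Optional = pattern[i] {
                set[i + 1] = true;
            }
        }
    }
}

// State-set simulation: track every pattern position that is reachable
// after each input byte, instead of exploring one path at a time.
fn simulate(pattern: &[Item], input: &[u8], steps: &mut u64) -> bool {
    let m = pattern.len();
    let mut current = vec![false; m + 1];
    current[0] = true;
    close(pattern, &mut current, steps);
    for &byte in input {
        let mut next = vec![false; m + 1];
        for i in 0..m {
            *steps += 1;
            if current[i] && byte == b'a' {
                next[i + 1] = true;
            }
        }
        close(pattern, &mut next, steps);
        current = next;
    }
    current[m]
}

fn main() {
    for n in [4usize, 8, 12, 16, 20] {
        let pattern = build(n);
        let input = vec![b'a'; n];
        let (mut bt_steps, mut sim_steps) = (0u64, 0u64);
        let bt = backtrack(&pattern, &input, &mut bt_steps);
        let sim = simulate(&pattern, &input, &mut sim_steps);
        println!("n={n}: backtrack={bt} ({bt_steps} steps), simulate={sim} ({sim_steps} steps)");
    }
}
```

The backtracking version has to try every combination of "consume" and "skip" for the optional items before it reaches the one that works (skip all of them), so its step count more than doubles each time n goes up by one. The state-set version does a fixed amount of work per input byte per pattern position, so it grows roughly with n squared.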
by thedirt0115 on 2/9/18, 7:26 PM
by glangdale on 2/10/18, 4:19 AM
In our experience, regex matching can be simple and fast, but only sometimes. RE2's approach is straightforward but naive; we built a considerably faster (on some metrics) and more scalable system in Hyperscan. The cost of this turned out to be considerable complexity - and there are still many regexes and inputs that don't perform particularly well in either system.
by patrickmay on 2/9/18, 7:40 PM
by carapace on 2/9/18, 9:27 PM
by harpocrates on 2/9/18, 7:19 PM
[1]: https://news.ycombinator.com/item?id=466845
by anilshanbhag on 2/9/18, 9:36 PM
by u801e on 2/9/18, 8:51 PM
[1] http://vimhelp.appspot.com/options.txt.html#%27regexpengine%...
by gumby on 2/9/18, 7:20 PM
The article (and Wikipedia) say Thompson put regexps into a version of QED for CTSS, but CTSS was only used at MIT. He must have traveled between MIT and Bell Labs as part of the Multics project. The CTSS reference is pretty obscure.
by lerax on 2/9/18, 7:33 PM