by __exit__ on 12/16/21, 10:09 AM with 41 comments
by zwegner on 12/16/21, 10:37 AM
Discussion: https://news.ycombinator.com/item?id=29439403
The article mentions in an addendum (and BeeOnRope also pointed it out in the HN thread) a nice CLMUL trick for dealing with quotes originally discovered by Geoff Langdale. That should work here for a nice speedup.
But without the CLMUL trick, I'd guess that the unaligned loads that generally occur after a vector containing both quotes and newlines in this version (the "else" case on lines 34-40) would hamper the performance somewhat, since they would eat up twice as much L1 cache bandwidth. I'd suggest dealing with the masks using bitwise operations in a loop, and letting i stay divisible by 16. Or just use CLMUL :)
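For anyone who hasn't seen the quote trick: the idea, as I understand it, is to take the bitmask of quote positions in a block and compute its prefix XOR, which gives a mask of the bytes that are inside quotes. A carry-less multiply of the quote mask by an all-ones value produces that prefix XOR in its low 64 bits. Below is a rough portable sketch in C that uses a shift/XOR cascade instead of the CLMUL instruction, so it runs anywhere. The function names (`prefix_xor`, `byte_mask`, `unquoted_newlines`) are made up for illustration and are not from the article, and the scalar `byte_mask` loop just stands in for the SIMD compare + movemask step.

```c
#include <stddef.h>
#include <stdint.h>

/* Prefix XOR: bit i of the result is the XOR of bits 0..i of x.
 * This matches the low 64 bits of a carry-less multiply of x by
 * an all-ones 64-bit value. */
uint64_t prefix_xor(uint64_t x) {
    x ^= x << 1;
    x ^= x << 2;
    x ^= x << 4;
    x ^= x << 8;
    x ^= x << 16;
    x ^= x << 32;
    return x;
}

/* Hypothetical helper: bit i is set iff buf[i] == c, for up to 64 bytes.
 * Real code would build this mask with vector compares. */
uint64_t byte_mask(const char *buf, size_t n, char c) {
    uint64_t m = 0;
    for (size_t i = 0; i < n && i < 64; i++) {
        if (buf[i] == c) {
            m |= (uint64_t)1 << i;
        }
    }
    return m;
}

/* Mask of newlines that are NOT inside a quoted field.
 * Assumes the block starts outside quotes and ignores escaped quotes. */
uint64_t unquoted_newlines(const char *buf, size_t n) {
    uint64_t quotes = byte_mask(buf, n, '"');
    uint64_t newlines = byte_mask(buf, n, '\n');
    /* Set from each opening quote up to (not including) its closing quote. */
    uint64_t inside = prefix_xor(quotes);
    return newlines & ~inside;
}
```

With PCLMULQDQ available, `prefix_xor` could be replaced by a single `_mm_clmulepi64_si128` against an all-ones operand. To handle more than one block you would also need to carry the "still inside quotes" state from the previous block, for example by inverting `inside` when the previous block ended inside a quoted field; that part is left out here.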
by jagrsw on 12/16/21, 11:18 AM
gigabytes per second
to
gigabits per siemens
:)
by mattewong on 12/18/21, 7:32 AM
by liuliu on 12/16/21, 5:23 PM
Nice article otherwise!
by michaelg7x on 12/16/21, 6:44 PM
by Tuna-Fish on 12/16/21, 11:18 AM
by rwmj on 12/16/21, 10:46 AM