by simplegeek on 6/16/21, 12:14 PM with 40 comments
by version_five on 6/17/21, 3:52 PM
by 6gvONxR4sf7o on 6/17/21, 4:33 PM
It turns out that a lot of things that initially seem trivial to define precisely aren't actually that precisely defined, like the length of the California coastline. This is, in my mind--and as a complete tangent--a great argument for wide programming and math education. When you're forced to be so goddamned precise all the time, it's very clear when an idea isn't fully defined.
by SethTro on 6/17/21, 6:03 PM
"ago--never": G docs 1 word vs OpenOffice 2 words.
"tiger-lillies--what": G docs 1 word vs OpenOffice 2 words (IDK what this should "really" be)
"Wanting?--Water": GD 3 words vs OO 2 words
In this case the disagreement springs exclusively from whether the docs engine believes that a double hyphen joins a compound word (and potentially from how it handles punctuation in the middle of such a compound word).
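A rough sketch of how two plausible rules can disagree on those same strings. Neither function is claimed to be what Google Docs or OpenOffice actually does; `count_whitespace` and `count_dash_break` are hypothetical names, and the second rule (a run of two or more hyphens breaks a word) is an assumption made only for illustration.

```python
import re

def count_whitespace(text: str) -> int:
    # Rule A (assumed): a word is any run of non-whitespace characters.
    return len(text.split())

def count_dash_break(text: str) -> int:
    # Rule B (assumed): whitespace OR a run of two or more hyphens ends a word.
    # A single hyphen, as in "tiger-lillies", stays inside the word.
    return len([tok for tok in re.split(r"\s+|-{2,}", text) if tok])

samples = ["ago--never", "tiger-lillies--what", "Wanting?--Water"]
for sample in samples:
    print(sample, count_whitespace(sample), count_dash_break(sample))
```

Under these two rules every sample is 1 word by rule A and 2 words by rule B. The 3-word result reported for "Wanting?--Water" would need yet another rule for the punctuation, which this sketch does not try to guess.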
by thehappypm on 6/17/21, 3:00 PM
by PaulHoule on 6/17/21, 2:45 PM
Look at the waveform of speech and you will see long silent gaps inside "words" as well as there frequently being no gap between "words".
There are phrases like "Skinny Puppy" that can do the same job as a word, there are also structures smaller than words that people smush together to make words. The two even work together:
missile
anti-missile missile
anti-anti-missile missile missile
If you see "words" as the molecules of text there will always be an asymptote you can't overcome because segmenting text into words will sometimes introduce errors that you might not be able to recover from.
by Pet_Ant on 6/17/21, 5:05 PM
"I work at the F.B.I. I like it there." "I work at the F.B.I., I like it there." "I work at the F.B.I. I like it there!"
It's not as simple as counting periods.
So it is definitely understandable that counting words has corner cases. Is "&" a word? It is literally just 'e' + 't' superimposed and "et" is definitely a word.
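To make the F.B.I. examples above concrete, here is a minimal sketch of the naive rule "every period ends a sentence". `count_periods` is a hypothetical helper, not any editor's real algorithm.

```python
def count_periods(text: str) -> int:
    # Naive rule (assumed for illustration): each "." marks a sentence end.
    return text.count(".")

examples = [
    "I work at the F.B.I. I like it there.",
    "I work at the F.B.I., I like it there.",
    "I work at the F.B.I. I like it there!",
]
for text in examples:
    print(count_periods(text), text)
```

The naive rule reports 4, 4 and 3, while a reader sees at most two sentences in each line. The periods inside the abbreviation are counted, and the one after "F.B.I." in the first and third examples is doing double duty as abbreviation mark and sentence end.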
by LarryMade2 on 6/17/21, 2:57 PM
by asciimov on 6/17/21, 4:44 PM
I remember taking a typing class some 25 years ago and being told that a word count is typically one word for every 5 characters. That way someone doesn't pad out their word count by using lots of small words.
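A quick sketch of that convention next to a whitespace count. Whether spaces and punctuation count toward the 5 characters is an assumption here (this version counts every character), and `standard_words` is a hypothetical name.

```python
def standard_words(text: str, chars_per_word: int = 5) -> float:
    # Assumed convention: every character, spaces included, counts toward
    # the total, and each group of chars_per_word characters is one "word".
    return len(text) / chars_per_word

sample = "I am at it"
print(len(sample.split()))     # words separated by whitespace
print(standard_words(sample))  # "standard" words of 5 characters each
```

For this sample the whitespace count is 4 while the 5-character rule gives 2.0, which is the padding effect the rule is meant to cancel out.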
by jakub_g on 6/17/21, 5:57 PM
This is why each browser used to parse HTML differently.
This is why you'd have compat or even security issues: some software split lines on \r\n while other software split on \n.
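A small illustration of the newline mismatch, using only Python's built-in string methods. The sample strings are made up.

```python
payload = "alpha\r\nbeta\r\ngamma"

# Splitting on "\n" alone leaves a stray "\r" on every line but the last.
split_lf = payload.split("\n")

# Splitting on "\r\n" works here, but misses bare "\n" in mixed input.
split_crlf = payload.split("\r\n")
mixed = "a\nb\r\nc".split("\r\n")

# str.splitlines() accepts both conventions.
split_any = "a\nb\r\nc".splitlines()

print(split_lf, split_crlf, mixed, split_any)
```

Two programs that each pick one of the first two rules will disagree about how many lines the same bytes contain, and about what each line holds.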
Luckily the browser vendors formed WHATWG which created pretty precise specs which are maybe convoluted but at least everyone parses HTML in the same way, and each browser pretends to be every other browser for compatibility.
2021 is really great for web compat, maybe not all browsers implement every API, but existing APIs are accompanied by very thorough test suites (Web Platform Tests): https://github.com/web-platform-tests/wpt
Live results from nightly builds: https://wpt.fyi/results/?label=experimental&label=master&ali...
Having said that, I don't see vendors aligning on a definition of word count any time soon, due to corporate inertia, lack of incentives and the lack of a "Word editors consortium" (or is there one?)
A truly good function for word count might be pretty complex, and perhaps different for every language.
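One small example of the language dependence, assuming a whitespace-based counter. The Chinese sample (roughly "I like cats") is my own illustration, not something from the thread.

```python
def count_whitespace(text: str) -> int:
    # Whitespace rule: a word is any run of non-whitespace characters.
    return len(text.split())

english = "I like cats"
chinese = "我喜欢猫"  # written without spaces between words

print(count_whitespace(english))
print(count_whitespace(chinese))
```

The English sentence counts as 3 words and the Chinese one as 1, although a reader would arguably see three words in both. A counter for text like this needs a dictionary or a segmentation model rather than a delimiter rule.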
by dcolkitt on 6/17/21, 6:15 PM
[1]http://johnsalvatier.org/blog/2017/reality-has-a-surprising-...
by AbrahamParangi on 6/17/21, 4:43 PM
"I've seen things, you people wouldn't believe"
by Zababa on 6/17/21, 6:59 PM
Also, believing that complex problems are easy is not something only programmers do. I currently work a lot with Excel automation, and most people have no idea what can be automated easily and what can't. Some people come to me asking to automate a task they've never done manually and don't really know how to do precisely. I think that's the same mechanism of "overabstraction" that leads people to say "WET instead of DRY" (Write Everything Twice instead of Don't Repeat Yourself).
by spoonjim on 6/17/21, 7:23 PM
by lbriner on 6/17/21, 3:57 PM
by bilater on 6/17/21, 5:57 PM
by sgtnoodle on 6/17/21, 5:57 PM
I can tell when there's a perf/promo cycle coming up at Google because all the core apps on my phone change their UIs and get buggier and slower.
by juancn on 6/17/21, 6:14 PM