by dropbox_miner on 6/16/22, 6:24 PM with 795 comments
by skohan on 6/17/22, 6:47 AM
I was working in C, and looking back I came up with quite a performant solution mostly by accident: all the memory was allocated up front in a very cache-friendly way.
The first time I ran the program, it finished in a couple seconds. I was sure something must have failed, so I looked at the output to try to find the error, but to my surprise it was totally correct. I added some debug statements to check that all the data was indeed being read, and it was working totally as expected.
I think before then I had a mental model of a little person inside the CPU looking over each line of code and dutifully executing it, and that was a real eye-opener about how computers actually work.
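The same effect is easy to glimpse even from Python with numpy; a rough sketch of the pattern (not the commenter's actual C code):

    import numpy as np

    n = 1_000_000

    # Growing a list element by element: many small allocations,
    # pointer-chasing, poor locality.
    out = []
    for i in range(n):
        out.append(i * 2)

    # Allocating one contiguous block up front and working in place:
    # sequential, cache-friendly access and no reallocation.
    buf = np.arange(n, dtype=np.int64)
    buf *= 2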
by forinti on 6/16/22, 7:27 PM
If you hold up a sign with, say, a multiplication, a CPU will produce the result before light reaches a person a few metres away.
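Back-of-envelope, assuming a ~4 GHz core where a multiply has a latency of a few cycles: the result is ready in roughly a nanosecond, while light at 3×10⁸ m/s needs about 10 nanoseconds to cover 3 metres.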
by bcatanzaro on 6/16/22, 7:37 PM
2. But, there is a real limit to the speed of a particular piece of code. You can try finding it with a roofline model, for example. This post didn't do that. So we don't know if 201ms is good for this benchmark. It could still be very slow.
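For anyone unfamiliar, the roofline bound is just min(peak compute, memory bandwidth × arithmetic intensity). A minimal sketch, with made-up hardware numbers:

    # Roofline bound: attainable FLOP/s for a kernel, given its
    # arithmetic intensity (FLOPs performed per byte moved).
    PEAK_FLOPS = 100e9      # assumed peak compute: 100 GFLOP/s
    PEAK_BANDWIDTH = 20e9   # assumed DRAM bandwidth: 20 GB/s

    def roofline_bound(flops_per_byte):
        return min(PEAK_FLOPS, PEAK_BANDWIDTH * flops_per_byte)

    # A streaming kernel doing ~0.25 FLOP/byte is memory-bound:
    print(roofline_bound(0.25))  # 5e9 -- 5 GFLOP/s, nowhere near peak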
by journey_16162 on 6/17/22, 8:38 AM
The reason I don't use a high-end laptop and am not eager to upgrade is that I can relate to the average user of the software I develop. I've seen plenty of popular web apps feel really sluggish.
by tpoacher on 6/17/22, 11:28 AM
Don't get me wrong, pandas is a nice library ... but the odd thing is, numpy already has, like, 99% of that functionality built in, in the form of structured arrays and records, it's super-optimised under the hood, and it's just that nobody uses it or knows anything about it. Most people will never have heard of it.
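For anyone who hasn't seen it, a minimal illustration of numpy structured arrays (the column names and dtypes here are made up):

    import numpy as np

    # Named, typed columns stored in one contiguous block --
    # the part of pandas most people actually use.
    people = np.array(
        [("alice", 34, 55.0), ("bob", 29, 72.5)],
        dtype=[("name", "U10"), ("age", "i4"), ("weight", "f8")],
    )

    print(people["age"].mean())        # vectorised column access: 31.5
    print(people[people["age"] > 30])  # pandas-style row filtering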
To me pandas seems to be the sort of library that became popular because it mimics the interface of a popular library from a language people wanted to migrate from (namely dataframes from R), but that's about it.
Compounding this is that it is now becoming the effective library for doing these things, even if that's backwards, because the network effect means that people are building stuff to work on top of pandas rather than on top of numpy.
The only times I've had to use pandas in my personal projects was either:
a) when I needed a library that 'used pandas rather than numpy' to hijack a function I couldn't be bothered writing myself (most recently seaborn heatmaps, and exponentially weighted averages - both relatively trivial things to do with pure numpy, and probably faster, but, eh. Leftpad mentality etc ...)
b) when I knew I'd have to share the code with people who would then be looking for the pandas stuff.
I'm probably wrong, but ...
by w0mbat on 6/17/22, 4:05 AM
by jerf on 6/16/22, 7:15 PM
I have a hard time using (pure) Python anymore for any task where speed is even remotely a consideration. Not only is it slow even at the best of times, but so many of its features beg you to slow down even more without thinking about it.
by charlie0 on 6/16/22, 7:54 PM
by Taywee on 6/16/22, 8:13 PM
The pure C++ version is so fast, it finishes before you even start it!
by porcoda on 6/16/22, 7:17 PM
by dragontamer on 6/16/22, 7:55 PM
It's hilarious how quickly things work these days if you just use the 90s-era APIs.
It's also fun to play with ControlSpy++ and see the dozens, maybe hundreds, of messages that your Win32 windows receive, and imagine all the function calls that occur in a short period of time (e.g. moving your mouse cursor over a button and moving it around a bit).
by hamstergene on 6/16/22, 7:50 PM
Think of a mobile game that could last 8 hours instead of 2 if it weren't doing unnecessary linear searches on a timer in JavaScript.
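A sketch of the kind of fix that implies, in Python for brevity (the game in question would be JavaScript):

    import time

    entities = list(range(100_000))
    entity_set = set(entities)

    def tick_linear(target):
        return target in entities     # O(n) scan on every tick

    def tick_indexed(target):
        return target in entity_set   # O(1) hash lookup

    for fn in (tick_linear, tick_indexed):
        t0 = time.perf_counter()
        for _ in range(1_000):
            fn(99_999)
        print(fn.__name__, time.perf_counter() - t0)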
by tiffanyh on 6/16/22, 7:28 PM
Nim should be part of the conversation.
Typically, people trade slower compute time for faster development time.
With Nim, you don't need to make that trade-off. It allows you to develop in a high-level language but get C-like performance.
I'm surprised it's not more widely used.
by user_7832 on 6/16/22, 7:19 PM
And on a slightly ranty note, Apple's A12Z and A14 are still apparently "too weak" to run multiple windows simultaneously :)
by eterm on 6/16/22, 7:19 PM
by justsomeuser on 6/17/22, 8:11 AM
by etaioinshrdlu on 6/16/22, 7:26 PM
by vjerancrnjak on 6/16/22, 7:16 PM
My usual 1-to-1 translations result in C++ being 1-5% of Python exec time, even on combinatorial stuff.
by pdimitar on 6/17/22, 11:50 AM
+----------------------------------------------------+
| People really do love Python to death, don't they? |
+----------------------------------------------------+
I find that extremely weird. As a bystander who never relied on Python for anything important, and as a person who regularly had to wrestle with it and tried to use it several times, I find the language non-intuitive in terms of syntax, ecosystem, package management, different language version management, probably 10+ ways to install dependencies by now, a subpar standard library, and an absolute cosmic-wide Wild West state of things in general. Not to mention people keep making command-line tools with it, ignoring the fact that it often takes 0.3 seconds to even boot.

Why a programmer who wants semi-predictable productivity would choose Python today (or even 10 years ago) remains a mystery to me. (Example: I don't like Go that much but it seems to do everything that Python does, and better.)
Can somebody chime in and give me something better than "I got taught Python in university and never moved on since" or "it pays the bills and I don't want to learn more"?
And please don't give me the fabled "Python is good, you are just biased" crap. Python is, technically and factually and objectively, not that good at all. There are languages out there that do everything that it does much better, and some are pretty popular too (Go, Nim).
I suppose it's the well-trodden path of integrating with pandas and numpy?
Or is it a collective delusion and a self-feeding cycle of "we only ever hired for Python" from companies and "professors teach Python because it's all they know" from universities? Perhaps this is the most plausible explanation -- inertia. Maybe people just want to believe because they are scared they have to learn something else.
I am interested in what people think about why Python is popular despite a lot of objective evidence that, as a tech, it's not impressive at all.
by mulmboy on 6/16/22, 10:19 PM
by pointernil on 6/16/22, 9:18 PM
Software/System Developers using 'good enough' stacks/solutions are externalising costs for their own benefit.
Making those externalities transparent will drive a lot of the transformation needed.
by andrewclunn on 6/16/22, 9:50 PM
by illys on 6/20/22, 12:27 PM
You could have had these discussions at any time since upgraded computers and microprocessors became compatible with the previous generation (i.e. the x86 and PC lines).
The point is that software efficiency measurement has never changed: it is human patience. The developers and their bosses decide the user can wait a reasonable time for the provided service. It is one-to-five seconds for non-real-time applications, it is often about a target framerate or refresh in 3D or real-time applications... The optimization stops when the target is met with current hardware, no matter how powerful it is.
This measure drives the use of programming languages, libraries, data load... all getting heavier and heavier when more processing power gets available. And that will probably never change.
Not sure about it? Just open your browser debugger on the Network tab and load the Google homepage (a field, a logo and 2 buttons). I just did: 2.2 MB, loaded in 2 seconds. It is sized for current hardware and 100 Mbps fiber, not for the service actually provided!
by djmips on 6/17/22, 9:15 PM
by muziq on 6/16/22, 7:08 PM
by xupybd on 6/16/22, 7:55 PM
Using Pandas in production might make sense if your production system only has a few users. Who cares if 3 people have to wait 20 minutes 4 times a year? But if you're public facing and speed equals user retention then no way can you be that slow.
by vlovich123 on 6/16/22, 9:11 PM
No. O3 is fine. -ffast-math is dangerous.
by mg on 6/16/22, 8:06 PM
https://codegolf.stackexchange.com/questions/215216/high-thr...
An optimized assembler implementation is 500 times faster than a naive Python implementation.
By the way, it is still missing a JavaScript entry!
by reedjosh on 6/16/22, 9:10 PM
Then rewrite it in a more performant language or with Cython hooks.
Developing features quickly is greatly aided by nice tools like Python and Pandas. And these tools make it easy to drop into something better when needed.
Eat your cake and have it too!
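A sketch of that workflow (slow_feature is a hypothetical hot spot, and numpy stands in for "something better" -- Cython or Rust would follow the same shape):

    import cProfile
    import pstats
    import numpy as np

    def slow_feature(data):                  # hypothetical bottleneck
        return [x * x for x in data]

    def main():
        slow_feature(list(range(1_000_000)))

    # 1. Profile first, to find where the time actually goes.
    cProfile.run("main()", "profile.out")
    pstats.Stats("profile.out").sort_stats("cumulative").print_stats(5)

    # 2. Rewrite only the hot path.
    def fast_feature(data):
        a = np.asarray(data)
        return a * a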
by abraxas on 6/16/22, 8:11 PM
by wodenokoto on 6/16/22, 6:57 PM
by modeless on 6/16/22, 7:46 PM
by tonto on 6/17/22, 7:24 AM
by ineedasername on 6/16/22, 9:20 PM
This is for normal computer tasks -- browser, desktop applications, UI. The exception to this seems to be tasks that were previously bottlenecked by HDD speeds, which have been much improved by solid-state disks.
It amazes me, for example, that keeping a dozen miscellaneous tabs open in Chrome will eat roughly the same amount of idling CPU time as a dozen tabs did a decade ago, while RAM usage is 5-10x higher.
by fullstackchris on 6/17/22, 12:15 AM
/s
Sorry for the rude sarcasm, but isn't this a post truly just about the efficiency pitfalls of Python? (or any language / framework choice for that matter)
Of course modern computers are lightning fast. The overhead of every language, framework, and tool adds significant additional compute, however, eroding that lightning speed with each complex abstraction level.
I don't know, I guess I'm just surprised this post is so popular, this stuff seems quite obvious.
by varispeed on 6/16/22, 7:44 PM
For instance running unoptimised code can eat a lot of energy unnecessarily, which has an impact on carbon footprint.
Do you think we are going to see regulation in this area akin to car emission bands?
Even to an extent that some algorithms would be illegal to use when there are more optimal ways to perform a task? Like using BubbleSort when QuickSort would perform much better.
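For scale, a quick sketch of that particular gap (Python's built-in Timsort standing in for quicksort):

    import random
    import time

    def bubble_sort(a):
        a = a[:]
        for i in range(len(a)):
            for j in range(len(a) - 1 - i):
                if a[j] > a[j + 1]:
                    a[j], a[j + 1] = a[j + 1], a[j]
        return a

    data = [random.random() for _ in range(5_000)]

    t0 = time.perf_counter()
    bubble_sort(data)            # O(n^2), in interpreted Python
    t1 = time.perf_counter()
    sorted(data)                 # O(n log n), in C
    t2 = time.perf_counter()
    print(f"bubble: {t1 - t0:.3f}s  built-in: {t2 - t1:.5f}s")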
by Ultimatt on 6/17/22, 11:06 AM
by pelorat on 6/16/22, 9:50 PM
by aaaaaaaaaaab on 6/16/22, 8:05 PM
by physicsguy on 6/16/22, 8:29 PM
I agree though. I used these tricks a lot in scientific computing. Go out into the wider world and people are just unaware. That said, there is a cost to introducing those tricks: needing your team to learn new tools and techniques, maintaining the build process across different operating systems, etc. Python extension modules on Windows, for example, are still a PITA if you're not able to use Conda.
by wdroz on 6/17/22, 11:39 AM
[0] -- https://www.pola.rs/
by avianes on 6/17/22, 11:34 AM
As an example, with an ILP of ~4 instructions/cycle at 5 GHz, we get 20 billion instructions executed each second on a single core. This number is not really tangible, but it is striking.
by FirstLvR on 6/16/22, 8:22 PM
Nothing really happened in the end, but it's a funny story around the office.
by quickthrower2 on 6/16/22, 11:54 PM
by Someone on 6/17/22, 9:56 AM
[…]
Took ~8 seconds to do 1000 calls. Not good at all :(
Isn’t that 8 ms per call, way faster than the target performance? Or should that “500ms” be “500 μs”?
by hermitcrab on 6/17/22, 7:15 AM
by Zetaphor on 6/17/22, 5:47 PM
by xvilka on 6/17/22, 12:12 PM
by FpUser on 6/17/22, 3:11 AM
Believe me I do. This is why my backends are single file native C++ with no Docker/VM/etc. The performance on decent hardware (dedicated servers rented from OVH/Hetzner/Selfhost) is nothing short of amazing.
by lucidguppy on 6/17/22, 1:15 AM
by Shorel on 6/17/22, 7:01 PM
by bawolff on 6/17/22, 8:13 AM
by Havoc on 6/17/22, 10:52 AM
Every cloud / SaaS is throwing free tier compute capacity at people and it’s just overwhelming (in a good way I suppose)
by thanzex on 6/17/22, 10:57 AM
It could be a bit overkill, but whenever I'm writing code, on top of optimizing data structures and memory allocations, I always try to minimize the use of if statements to reduce the possibility of branch mispredictions. Seeing woefully unoptimized Python code being used in a production environment just breaks my heart.
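An illustration of the idea from Python, where numpy's vectorised select plays the role of the branch-free rewrite (in C you'd reach for conditional moves or masks instead):

    import numpy as np

    x = np.random.randn(1_000_000)

    # Branchy version: one unpredictable if per element.
    out = np.empty_like(x)
    for i, v in enumerate(x):
        out[i] = v if v > 0.0 else 0.0

    # Branch-free version: a single vectorised select.
    out2 = np.maximum(x, 0.0)

    assert np.allclose(out, out2)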
by liprais on 6/16/22, 8:02 PM
by dqpb on 6/17/22, 3:33 AM
by thrwyoilarticle on 6/16/22, 10:43 PM
>double score_array[]
by javajosh on 6/16/22, 7:51 PM
by tintor on 6/16/22, 9:58 PM
by jiggawatts on 6/16/22, 10:44 PM
E.g.: call a "ping" function that does no computation using different styles.
In-process function call.
In-process virtual ("abstract") function.
Cross-process RPC call in the same operating system.
Cross-VM call on the same box (2 VMs on the same host).
Remote call across a network switch.
Remote call across a firewall and a load balancer.
Remote call across the above, but with HTTPS and JSON encoding.
Same as above, but across Availability Zones.
In my tests these scenarios have a performance range of about 1 million from the fastest to slowest. Languages like C++ and Rust will inline most local calls, but even when that's not possible overhead is typically less than 10 CPU clocks, or about 3 nanoseconds. Remote calls in the typical case start at around 1.5 milliseconds and HTTPS+JSON and intermediate hops like firewalls or layer-7 load balancers can blow this out to 3+ milliseconds surprisingly easily.
To put it another way, a synchronous/sequential stream of remote RPC calls in the typical case can only provide about 300-600 calls per second to a function that does nothing. Performance only goes downhill from here if the function does more work, or calls other remote functions.
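A minimal sketch measuring the two extremes of that range from Python -- a loopback HTTP server stands in for the remote hop, so real cross-network numbers would be worse still:

    import time, threading
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.request import urlopen

    def ping():
        return None

    # In-process function call.
    t0 = time.perf_counter()
    for _ in range(1_000_000):
        ping()
    local_ns = (time.perf_counter() - t0) / 1_000_000 * 1e9

    # Loopback HTTP "RPC": no switch, firewall, or TLS in the way.
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.end_headers()
        def log_message(self, *args):  # silence request logging
            pass

    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"

    t0 = time.perf_counter()
    for _ in range(100):
        urlopen(url).read()
    remote_us = (time.perf_counter() - t0) / 100 * 1e6

    print(f"in-process: ~{local_ns:.0f} ns/call, "
          f"loopback HTTP: ~{remote_us:.0f} us/call")
    server.shutdown()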
Yet every enterprise architecture you will ever see, without exception, has layers upon layers, hop upon hop, and everything is HTTPS and JSON as far as the eye can see.
I see K8s architectures growing side-cars, envoys, and proxies like mushrooms, and then having all of that go across external L7 proxies ("ingress"), multiple firewall hops, web application firewalls, etc...
by streamlining on 6/17/22, 6:40 AM
With NixOS I switch between GNOME 40 (I do like the GNOME workflow) and i3 w/ some Xfce4 packages, but lately on my older machine the performance of GNOME (especially while running Firefox) is so sluggish in comparison that I may have switched back permanently now.
by newaccount2021 on 6/16/22, 7:34 PM