from Hacker News

Ask HN: Is anyone using PyPy for real work?

by mattip on 7/31/23, 10:53 AM with 180 comments

I have been the release manager for PyPy, an alternative Python interpreter with a JIT [0], since 2015, and have done a lot of work to make it available via conda-forge [1] or by direct download [2]. This includes not only packaging PyPy, but improving an entire C-API emulation layer, so that today we can run (albeit more slowly) almost the entire scientific Python data stack. We get very limited feedback about real people using PyPy in production or research, which is frustrating. Just keeping up with the yearly CPython release cycle is significant work. Efforts to improve the underlying technology need to be guided by user experience, but we hear too little to direct our very limited energy. If you are using PyPy, please let us know, either here or via any of the methods listed in [3].

[0] https://www.pypy.org/contact.html [1] https://www.pypy.org/posts/2022/11/pypy-and-conda-forge.html [2] https://www.pypy.org/download.html [3] https://www.pypy.org/contact.html

  • by ggm on 7/31/23, 11:52 AM

    I'm using PyPy to analyse 350M DNS events a day, through Python cached dicts to avoid DNS lookup stalls. I am getting a 95% dict cache hit rate, and use threads with queue locks.

    Moving to PyPy definitely sped things up a bit, though not as much as I'd hoped; the cost is probably all in string indexing into dicts and dict management. I may recode it as a radix tree, but it's hard to work out in advance how different that would be: people have optimised the core data structures pretty well.

    The uplift from normal Python was trivial. Most dev time was spent fixing pip3 for PyPy on Debian, not knowing which apt packages to install, amid a lot of "stop using pip" messaging.

  • by reftel on 7/31/23, 12:25 PM

    I use it at work for a script that parses and analyzes some log files in an unusual format. I wrote a naive parser with a parser-combinator library. It was too slow to be usable with CPython. I tried PyPy and got a 50x speed increase (yes, 50 times faster). Very happy with the results, actually =)
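
    As an illustration of why a JIT helps so much here: a naive parser-combinator builds parsers out of many tiny Python closures, so every input character flows through several Python-level calls that PyPy can trace into cheap machine code. A toy sketch (not the library used above; all names hypothetical):

```python
# Each parser maps (text, pos) -> (value, new_pos), or None on failure.
def char(c):
    def parse(s, i):
        return (c, i + 1) if i < len(s) and s[i] == c else None
    return parse

def many(p):
    def parse(s, i):
        out = []
        while (r := p(s, i)) is not None:
            out.append(r[0])
            i = r[1]
        return (out, i)
    return parse

def seq(*ps):
    def parse(s, i):
        out = []
        for p in ps:
            r = p(s, i)
            if r is None:
                return None
            out.append(r[0])
            i = r[1]
        return (out, i)
    return parse

# parse lines shaped like "ab" followed by any number of "c"s
line = seq(char("a"), char("b"), many(char("c")))
```
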
  • by macNchz on 7/31/23, 12:38 PM

    I put PyPy in production at a previous job, running a pretty high traffic Flask web app. It was quick and pretty straightforward to integrate, and sped up our request timings significantly. Wound up saving us money because server load went down to process the same volume of requests, so we were able to spin down some instances.

    Haven’t used it in a bit mostly because I’ve been working on projects that haven’t had the same bottleneck, or that rely on incompatible extensions.

    Thank you for your work on the project!

  • by ADcorpo on 7/31/23, 11:50 AM

    This post is a funny coincidence, as I tried today to speed up a CI pipeline running ~10k tests with pytest by switching to PyPy.

    I am still working on it, but the main issue for now is psycopg support: I had to install psycopg2cffi in my test environment, and that will probably prevent me from using PyPy to run our test suite, because psycopg2cffi does not have the same features and versions as psycopg2. This means either we switch our prod to PyPy, which won't be possible because I am very new on this team and the others would see it as a big, risky change, or we accept that the tests do not run on the exact same runtime as the production servers (which might let bugs go unnoticed and reach production, or cause failing tests that would otherwise pass in a live environment).

    I think if I ever started a Python project right now, I'd probably try to use PyPy from the start, since (at least for web development) there don't seem to be any downsides to using it.

    Anyway, thank you very much for your hard work!

  • by PaulHoule on 7/31/23, 11:40 AM

    I use CPython most of the time, but PyPy was a real lifesaver when I was doing a project that bridged EMOF and RDF; in particular, I was working with moderately sized RDF models (say, 10 million triples) in rdflib.

    With CPython, I was frustrated by how slow it was and complained about it to the people I was working with. PyPy was a simple upgrade that sped up my code to the point where it was comfortable to work with.

  • by eigenvalue on 7/31/23, 3:04 PM

    Thanks for reminding me to look at PyPy again. I usually start all my new Python projects with this block of commands that I keep handy:

    Create venv and activate it and install packages:

      python3 -m venv venv
      source venv/bin/activate
      python3 -m pip install --upgrade pip
      python3 -m pip install wheel
      pip install -r requirements.txt
    
    
    I wanted a similar one-liner that I could use on a fresh Ubuntu machine so I can try out PyPy easily in the same way. After a bit of fiddling, I came up with this monstrosity which should work with both bash and zsh (though I only tested it on zsh):

    Create venv and activate it and install packages using pyenv/pypy/pip:

      if [ -d "$HOME/.pyenv" ]; then rm -Rf $HOME/.pyenv; fi && \
      curl https://pyenv.run | bash && \
      DEFAULT_SHELL=$(basename "$SHELL") && \
      if [ "$DEFAULT_SHELL" = "zsh" ]; then RC_FILE=~/.zshrc; else RC_FILE=~/.bashrc; fi && \
      if ! grep -q 'export PATH="$HOME/.pyenv/bin:$PATH"' $RC_FILE; then echo -e '\nexport PATH="$HOME/.pyenv/bin:$PATH"' >> $RC_FILE; fi && \
      if ! grep -q 'eval "$(pyenv init -)"' $RC_FILE; then echo 'eval "$(pyenv init -)"' >> $RC_FILE; fi && \
      if ! grep -q 'eval "$(pyenv virtualenv-init -)"' $RC_FILE; then echo 'eval "$(pyenv virtualenv-init -)"' >> $RC_FILE; fi && \
      source $RC_FILE && \
      LATEST_PYPY=$(pyenv install --list | grep -P '^  pypy[0-9\.]*-\d+\.\d+' | grep -v -- '-src' | tail -1) && \
      LATEST_PYPY=$(echo $LATEST_PYPY | tr -d '[:space:]') && \
      echo "Installing PyPy version: $LATEST_PYPY" && \
      pyenv install $LATEST_PYPY && \
      pyenv local $LATEST_PYPY && \
      pypy -m venv venv && \
      source venv/bin/activate && \
      pip install --upgrade pip && \
      pip install wheel && \
      pip install -r requirements.txt
    
    Maybe others will find it useful.
  • by pdw on 7/31/23, 12:27 PM

    We don't. To be honest, I didn't realize PyPy supported Python 3. I thought it was eternally stuck on Python 2.7.

    So the good: It apparently now supports Python 3.9? You might want to update your front page; it only mentions Python 3.7.

    The bad: It only supports Python 3.9, and we use newer features throughout our code, so it'd be painful to even try it out.

  • by mkl on 7/31/23, 11:26 AM

    You should probably put "Ask HN:" in your title.

    Personally I don't use PyPy for anything, though I have followed it with interest. Most of the things I need to go faster are numerical, so Numba and Cython seem more appropriate.

  • by q3k on 7/31/23, 11:31 AM

    I use PyPy quite often as a 'free' way to make some non-numpy CPU-bound Python script faster. This is also the context for when I bring up PyPy to others.

    The biggest blockers for me in 'defaulting' to PyPy are a) issues when dealing with CPython extensions, and how often it ends up being a significant effort to 'port' more complex applications to PyPy, and b) the muscle memory of typing 'python3' instead of 'pypy3'.
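
    For anyone curious, that 'free' comparison needs no code changes at all: the same pure-Python, CPU-bound script (a made-up example) can be run under both interpreters and timed:

```python
import sys
import time

def count_primes(n):
    # naive trial division: a tight pure-Python arithmetic loop,
    # the kind of code PyPy's JIT typically speeds up for 'free'
    count = 0
    for i in range(2, n):
        if all(i % d for d in range(2, int(i ** 0.5) + 1)):
            count += 1
    return count

start = time.perf_counter()
result = count_primes(50_000)
print(result, f"{time.perf_counter() - start:.2f}s", sys.implementation.name)
```

    Running `python3 bench.py` and then `pypy3 bench.py` prints the wall time under each interpreter; `sys.implementation.name` reports `cpython` or `pypy`.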

  • by cpburns2009 on 7/31/23, 2:04 PM

    We use PyPy extensively at my employer, a small online retailer, for the website, internal web apps, ETL processes, and REST API integrations.

    We use the PyPy-provided downloads (Linux x86 64-bit) because it's easier to maintain multiple versions simultaneously on Ubuntu servers; the PyPy PPA does not allow this. I try to keep the various projects on the latest stable version of PyPy as they receive maintenance, and we're currently transitioning from 3.9/v7.3.10 to 3.10/v7.3.12.

    Thank you for all of the hard work providing a JITed Python!

  • by v3ss0n on 7/31/23, 6:08 PM

    Nice to meet you here mattip. We have used PyPy for several years, and I have raised this several times: the only thing PyPy lacks is marketing (and there is wrong information about cpyext being unsupported). PyPy gave us an 8x performance boost on average, 4x minimum, and up to 20x, especially on JSON operations in long loops.

    PyPy should have become the standard implementation; it would have saved a lot of the investment in making Python fast.

    I try to promote PyPy all the time, but thanks to the outdated website and the unusual choice of Heptapod hosting (at least put a mirror on GitHub for discoverability's sake), developers who won't look further than a GitHub page frown on me, thinking PyPy is an outdated and inactive project.

    PyPy is one of the most ambitious projects in open-source history, and the lack of publicity makes me scream internally.

  • by rsecora on 7/31/23, 12:37 PM

    I use it for data transformation, cleanup and enrichment: (TXT, CSV, JSON, XML, database) to (TXT, CSV, JSON, XML, database).

    Speedups of 30x-40x, with the highest on transformations that require logic (lots of function calls, numerical operations and dictionary lookups).
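
    The shape of such a pipeline, reduced to a hedged toy example (the field names and the 1.2 enrichment factor are made up), is per-record Python logic, which is exactly where the JIT pays off:

```python
import csv
import io
import json

# stand-in for a real CSV input file
src = io.StringIO("id,price\n1,10.0\n2,12.5\n")

records = []
for row in csv.DictReader(src):
    price = float(row["price"])
    records.append(json.dumps({
        "id": int(row["id"]),
        "price": price,
        "price_with_tax": round(price * 1.2, 2),  # the 'enrichment' step
    }))
```
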

  • by ghj on 7/31/23, 6:05 PM

    Copying from an older comment of mine shilling Pypy https://news.ycombinator.com/item?id=25595590

    PyPy is pretty well stress-tested by the competitive programming community.

    https://codeforces.com/contests has around 20-30k participants per contest, with contests happening roughly twice a week. I would say around 10% of them use python, with the vast majority choosing pypy over cpython.

    I would guesstimate at least 100k lines of PyPy code are written per week just for these contests. This covers virtually every textbook algorithm you can think of, all automatically graded for correctness/speed/memory. Note that there's no special time multiplier for choosing a slower language, so if you're not within 2x the speed of the equivalent C++, your solution won't pass! (Hence the popularity of pypy over cpython.)

    The sheer volume of advanced algorithms executed in pypy gives me huge amount of confidence in it. There was only one instance where I remember a contestant running into a bug with the jit, but it was fixed within a few days after being reported: https://codeforces.com/blog/entry/82329?#comment-693711 https://foss.heptapod.net/pypy/pypy/-/issues/3297.

    New edit since that previous comment: there's now a Legendary Grandmaster (Elo rating > 3000, ranked 33 out of hundreds of thousands) who almost exclusively uses pypy: https://codeforces.com/submissions/conqueror_of_tourist

  • by eigenvalue on 7/31/23, 3:22 PM

    I do think it would be very useful to have an online tool that lets you paste in your requirements.txt and then tells you which of the libraries have been recently verified to work properly with PyPy without a lot of additional fuss.

    Also, you might want to flag the libraries that technically "work" but still require an extremely long and involved build process. For example, I recently started the process of installing Pandas with pip in a PyPy venv and it was stuck on `Getting requirements to build wheel ...` for a very long time, like 20+ minutes.

  • by Twirrim on 7/31/23, 3:00 PM

    I was experimenting with some dynamic-programming 0/1 knapsack code last week. The PyPy available through the distro (7.3.9) was giving a reasonable speedup, but nothing phenomenal. Out of curiosity I grabbed the latest version through pyenv (7.3.12), and it looks like some changes between them put the code in a sweet spot: I saw a couple of orders of magnitude better performance. Good work.

    I'm rarely using Python at work in places where it would suit (lots of Python usage, but more on the order of short-run tools), but I'm always looking for chances, and always using it for random little personal things.
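
    For reference, the classic DP for this problem is a doubly nested pure-Python loop, precisely the shape of code where a tracing JIT shines (a generic sketch, not the commenter's actual code):

```python
def knapsack(values, weights, capacity):
    # 0/1 knapsack, O(n * capacity): dp[c] is the best value at capacity c
    dp = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        for c in range(capacity, w - 1, -1):  # descending: each item used once
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]

best = knapsack([60, 100, 120], [10, 20, 30], 50)  # classic textbook instance
```
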

  • by twp on 7/31/23, 1:05 PM

    Yes. We have a legacy Python-based geospatial data processing pipeline. Switching from CPython to PyPy sped it up by a factor of 30x or so, which was extremely helpful.

    Thank you for your amazing work!

  • by ant6n on 7/31/23, 1:24 PM

    When I worked at Transit App, I built a backend pre-processing pipeline to compress transit and OSM data in Python [1], and also another pipeline to process transit map data in Python [2]. Since the ops people complained about how long it took to compress the transit feeds (I think London took 10h each time something changed), I migrated everything to PyPy. Back then that was a bit annoying because it meant I had to remove numpy as a requirement, but other than that there were few issues. It also meant we were stuck on 2.7 for quite a while, so long that I hadn't prepared a possible migration to 3.x. The migration happened after I left. Afaik they still use PyPy.

    Python is fun to work with (except classes…), but it's just sooo slow. PyPy can be a life saver.

    [1] https://blog.transitapp.com/how-we-shrank-our-trip-planner-t... [2] https://blog.transitapp.com/how-we-built-the-worlds-pretties...

  • by wiz21c on 7/31/23, 11:45 AM

    I don't use PyPy because when I'm stuck with performance issues I go to numpy, and if that really doesn't work I go to Cython/Numba (because it means that 99% of my Python code continues to work the same; only the 1% that gets optimized is different, whereas if I went with PyPy I'd have to check my whole code again). I do mostly computational fluid dynamics.

    (nevertheless, PyPy is impressive :-) )

  • by oebs on 7/31/23, 4:02 PM

    I'm maintaining an internal change-data-capture application that uses a Python library to decode the MySQL binlog and store the change records as JSON in the data lake (like Debezium). For our busiest databases a single CPython process couldn't process the amount of incoming changes in real time (thousands of events per second). It's not something that can be easily parallelized, as the bulk of the work happens in the binlog decoding library (https://github.com/julien-duponchelle/python-mysql-replicati...).

    So we've made it configurable to run some instances with PyPy, which was able to work through the data in real time, i.e. without generating a lag in the data stream. The downside of using PyPy was increased memory usage (4-8x), which isn't really a problem. An actual problem that I didn't really track down was that the test suite (running pytest) took 2-3 times longer with PyPy than with CPython.

    A few months ago I upgraded the system to run on CPython 3.11, and the 10-20% performance improvements that come with that version actually allowed us to drop PyPy and run only CPython, which is more convenient and makes the deployment and configuration less complex.

  • by eslaught on 7/31/23, 4:36 PM

    We use PyPy for performing verification of our software stack [1], and also for profiling tools [2]. The verification tool is basically a complete reimplementation of our main product, and therefore encodes a massive amount of business logic (which makes it difficult, if not impossible, to rewrite in another language). As with other users, we found the switch to PyPy seamless; it provides us with something like a 2.5x speedup out of the box, with (I think) higher speedups in some specific cases.

    We eventually rewrote the profiler tool in Rust for additional speedups, but as mentioned, the verification engine is probably too complicated to ever get the same treatment, so we really appreciate drop-in tools like PyPy that can speed up our code.

    [1]: https://github.com/StanfordLegion/legion/blob/master/tools/l...

    [2]: https://github.com/StanfordLegion/legion/blob/master/tools/l...

  • by waysa on 7/31/23, 11:52 AM

    I used PyPy with SymPy when I was helping out a mathematician friend. SymPy is not exactly fast, so a free performance boost was very welcome.
  • by t90fan on 7/31/23, 12:03 PM

    I can't remember exactly what the use case was, but we used it at my old work (a startup providing a web CDN/WAF-type service; think the kind of stuff Cloudflare does nowadays) in ~2013 for some sort of batch analytics/billing job, using MRJob and AWS Elastic MapReduce over a seriously large data set.

    The performance of PyPy over CPython saved us loads and loads of time, and thus $$$s, from what I can recall.

  • by tgbugs on 7/31/23, 6:26 PM

    We use pypy3 on musl via gentoo in production to run dataset validation pipelines. The easiest place to see that we use pypy3 is probably [1]. The build process and patches we carry are under [2].

    We also use pypy3 to accelerate rdflib parsing and serialization of various RDF formats. See for example [3].

    Thanks to you and the whole PyPy team!

    1. https://github.com/tgbugs/dockerfiles/blob/6f4ad5d873b7ab267...

    2. https://github.com/tgbugs/dockerfiles/blob/6f4ad5d873b7ab267...

    3. https://github.com/SciCrunch/sparc-curation/blob/0fdf393e26f...

  • by fragebogen on 8/1/23, 11:50 AM

    I'm running a constrained convex optimization project at work, where we need as close to real time (<10s is great, <1min is acceptable) responses for a web interface.

    Basically I'm using SciPy exclusively for the optimization routine:

    * minimize(method="SLSQP") [0]

    * A list comprehension which calls ~10-500 pre-fitted PchipInterpolator [1] functions and stores the values as an np.array().

    The Pchip functions (and their first derivatives) are used in the main opt function as well as in several constraints.

    Most jobs took about 10 seconds, but the long tail might sometimes take up to 10 min. I tried PyPy 3.8 (7.3.9) and saw similar compute times on the shorter jobs, but roughly ~2x slower compute times on the heavier jobs. This obviously was not what I expected, but I had very limited experience with PyPy and didn't know how to debug further.

    Eventually Python 3.10 came around and gave a 1.25x speed increase, and then 3.11 gave another 1.6-1.7x, for a decent ~2x cumulative speedup, but the occasional heavy job still sits in the 5 min range and would obviously have been nicer at 10-30s.

    Still, I'd say trying PyPy out was a quite smooth experience; staying within SciPy land, it took me half a day to switch and benchmark. But if anyone else has experience with PyPy and SciPy and knows some obvious pitfalls, I'd much appreciate hearing about them.

    [0] https://docs.scipy.org/doc/scipy/reference/optimize.minimize...

    [1] https://docs.scipy.org/doc/scipy/reference/generated/scipy.i...
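
    Roughly, the setup described above looks like the following (a hedged sketch with synthetic data; the real model has domain-specific curves and constraints). Because nearly all of the time is spent inside compiled SciPy code, PyPy's cpyext emulation adds overhead at every boundary crossing instead of removing interpreter overhead, which is consistent with the slowdown observed:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator
from scipy.optimize import minimize

rng = np.random.default_rng(0)
xs = np.linspace(0.0, 1.0, 20)

# stand-ins for the ~10-500 pre-fitted monotone curves (10 here for brevity)
curves = [PchipInterpolator(xs, np.cumsum(rng.random(20))) for _ in range(10)]

def objective(w):
    # list comprehension over the interpolators, gathered into an array
    return -np.array([f(wi) for f, wi in zip(curves, w)]).sum()

n = len(curves)
res = minimize(objective, x0=np.full(n, 0.5), method="SLSQP",
               bounds=[(0.0, 1.0)] * n,
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
```
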

  • by Apreche on 7/31/23, 1:18 PM

    I don’t actually use PyPy, but I’m very aware of it. My understanding is that the only reason to use PyPy instead of the default Python is performance gains. For the vast majority of projects I work on, the performance of our code on the CPU is almost never the bottleneck. The slowness is always in IO, databases, networks, etc.

    That said, if I do ever run into a situation where I need my code to perform better, PyPy is high on my list of things to try. It’s nice to know it’s an option.

  • by cool-RR on 7/31/23, 6:08 PM

    Hi Matti. I'm happy to see that you're doing community outreach. I haven't tried PyPy in a while. The general impression I have about PyPy is that as soon as you try to do anything a little bit complicated, things break in unexpected ways and there's little support. Also, I love using Wing IDE for debugging, and if I'm not mistaken it can't debug PyPy code.

    I'm currently doing multi-agent reinforcement learning research using RLlib, which is part of Ray. I tried to install a PyPy environment for it. It failed because Ray doesn't provide a wheel for it:

        Could not find a version that satisfies the requirement ray (from versions: none)
    
    My hunch is that even if Ray did provide that, some other roadblock would have prevented me from using PyPy.
  • by oxmane on 7/31/23, 4:24 PM

    At Alooma (https://www.linkedin.com/mwlite/company/alooma) we've been running all our integrations with data sources using PyPy. Main motivation was indeed performance gains.

    FWIW, since I've seen it mentioned, we've also been using psycopg2cffi to access Postgres sources.

    The product now lives (at least partially) as Datastream on GCP (https://cloud.google.com/datastream/docs/overview). I'm not sure though if it's still running on PyPy.

    I could try and connect with the folks still working on it, if you're interested.

  • by lsferreira42 on 7/31/23, 1:31 PM

    I'm building a bot-detector API to use with our CDN, and using PyPy was decided on day one; without PyPy the performance is just not there.

    Also, in my day job we use PyPy in all our Python deployments. To be fair, until now I thought everybody would develop in Python, test in PyPy for an easy speed boost, and only go back to CPython if PyPy was slower.

  • by _han on 7/31/23, 1:13 PM

    I hadn't heard about PyPy before, but I think you're doing great work.

    I would be interested in seeing benchmarks where PyPy is compared with more recent versions of CPython. https://www.pypy.org/ currently shows a comparison with CPython 3.7, but recent releases of CPython (3.11+) put a lot of effort into performance which is important to take into account.

  • by wg0 on 7/31/23, 12:50 PM

    While the community is here: has anyone embedded PyPy as a scripting language for some larger program, like Inkscape, or for scripting as part of a rule engine? Or is CPython more suitable for that?
  • by bofaGuy on 7/31/23, 1:17 PM

    My biggest issue is that DataDog doesn’t support PyPy. Out of curiosity, I made a new branch of our app and took out DataDog and observed a significant improvement in performance when using PyPy vs CPython on the same branch (but can’t remember how much).
  • by IshKebab on 7/31/23, 12:32 PM

    I've never used it because the (unknown) effort of switching and the chance of compatibility issues have always made it unappealing compared to just switching to a faster language.

    If I could just `pip3 install pypy` and then set an environment variable to use it or something like that then I'd give it a try. It does feel a bit like adding a jet pack to a rowing boat though. I know some people use Python in situations where the performance requirement isn't "I literally don't care" but surely not very many?

    Obviously if it was the default that would be fantastic.

  • by btown on 7/31/23, 6:08 PM

    A sub-question for the folks here: is anyone using the combination of gevent and PyPy for a production application? Or, more generally, other libraries that do deep monkey-patching across the Python standard library?

    Things like https://github.com/gevent/gevent/issues/676 and the fix at https://github.com/gevent/gevent/commit/f466ec51ea74755c5bee... indicate to me that there are subtleties in how PyPy's memory management interacts with low-level tweaks like gevent's, which have relied on often-implicit historical assumptions about memory-management timing.

    Not sure if this is limited to gevent, either - other libraries like Sentry, NewRelic, and OpenTelemetry also have low-level monkey-patched hooks, and it's unclear whether they're low-level enough that they might run into similar issues.

    For a stack without any monkey-patching I'd be overjoyed to use PyPy - but between gevent and these monitoring tools, practically every project needs at least some monkey-patching, and I think that there's a lack of clarity on how battle-tested PyPy is with tools like these.

  • by RMPR on 7/31/23, 4:16 PM

    I don't. I work for a company where we always try to track the latest stable version of Python. Right now we are on 3.11, and unfortunately Pypy is lagging behind.
  • by PartiallyTyped on 7/31/23, 1:21 PM

    Hey, you might want to delete the link to https://mesapy.org/rpython-by-example in https://doc.pypy.org/en/latest/architecture.html as it is pointing to a resource that people are unable to access.
  • by saltcured on 8/1/23, 4:33 PM

    I used it for real once over a decade ago, when I had to help some researchers who wanted to load an archive of Twitter JSON dumps into an RDBMS. This was basically cleaning/transliterating data fields into CSV that could bulk-import into PostgreSQL. I think we were using Python 2.7 back then.

    1. The same naive deserialization and dict processing code ran much faster with PyPy.

    2. Conveniently, PyPy also tolerated some broken surrogate pairs in Twitter's UTF8 stream, which threw exceptions when trying to decode the same events with the regular Python interpreter.
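
    The surrogate issue can be reproduced in a couple of lines: json.loads happily yields lone surrogates, which then have to be transliterated before they can be re-encoded as valid UTF-8 (a minimal sketch with made-up records):

```python
import csv
import io
import json

raw_lines = [
    '{"id": 1, "text": "hello"}',
    '{"id": 2, "text": "bad \\ud83d surrogate"}',  # broken surrogate pair
]

rows = []
for line in raw_lines:
    obj = json.loads(line)  # succeeds even on the lone surrogate
    # transliterate: anything that can't round-trip as UTF-8 becomes '?'
    text = obj["text"].encode("utf-8", errors="replace").decode("utf-8")
    rows.append((obj["id"], text))

out = io.StringIO()
csv.writer(out).writerows(rows)  # now safe to bulk-import into PostgreSQL
```
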

    I've had some web service code where I wished I could easily swap to PyPy, but these were conservative projects using Apache + mod_wsgi daemons with SE-Linux. If there were a mod_wsgi_pypy that could be a drop-in replacement, I would have advocated for trials/benchmarking with the ops team.

    Most other performance-critical work for me has been with combinations of numpy, PyOpenCL, PyOpenGL, and various imaging codecs like `tifffile` or piping numpy arrays in/out of ffmpeg subprocesses.

  • by ahallan on 7/31/23, 1:31 PM

    I've used it at work to speed up some standard Python code (without any c-bound library usage). It sped up the code by 5 times.

    I deployed using the pypy:3.9 Docker image.

    One thing I did notice is that it was significantly faster on my local machine than when I deployed it with AWS Lambda/Fargate. I know this is because of virtualization/virtual CPUs, but there was not much I could do to improve it.

  • by danielpassy on 8/9/23, 1:39 PM

    If I'm not mistaken, at Buser Brasil, Brazilian Flix Bus, the destination search is powered by PyPy https://www.buser.com.br/
  • by wenc on 7/31/23, 3:23 PM

    I actually donated to the Pypy project in the past but I don’t use it.

    Two reasons for my hesitation:

    1) CPython is fast enough for most things I need to do. The speed improvement from PyPy is either not enough or not necessary.

    2) Lingering doubts about subtle incompatibility (in terms of library support) that I might have to spend hours getting to the bottom of.

    I already work long hours and don’t have the bandwidth to tinker. CPython, although slow, is assuredly the standard surface that everyone targets and that I can google solutions for.

    It’s the subtle things that I waste a lot of time on. It’s analogous to an Ubuntu user trying to use Red Hat: they’re both Linuxes, but the way things are done is different enough to trip you up.

    The only way out of this quandary is for PyPy to become a first-class citizen. Guido will never endorse this, which means a bunch of us will always hesitate to put it into production systems.

  • by comboy on 7/31/23, 3:58 PM

    A bit meta: it seems like it would be nice to have no-action tickets for open-source projects.

    Quite often you just want to thank somebody, or say that you would prefer something a different way and don't understand why it is the way it is, or that it would be cool to have this or that. But opening a ticket on GitHub feels like wasting the maintainer's time, and feedback like "here's what I'd like to see" feels entitled, because, well, you can do it yourself, you can fork, etc.

    It would need to be low friction for both sides. Preferably with no way to respond so that there's zero pressure and little time waste for maintainers.

    Email feels like you want something; it works for a thank-you, but it still feels bad on the receiving end to just ignore it.

  • by CurriedHautious on 7/31/23, 5:07 PM

    What is the compatibility of PyPy with a typical web server deployment? I am currently looking at testing compatibility with Tornado -> SQLAlchemy -> psycopg2. The C extensions seem to be a common tripping point. I see the recommendation to use psycopg2cffi, but it seems that package's last release was in 2019 :(

    SQLAlchemy actually points to PyPy in its recommendations of things to try for ORM performance. https://docs.sqlalchemy.org/en/20/faq/performance.html#resul...

  • by landtuna on 7/31/23, 3:56 PM

    I used it in a situation where replacing Python code with a C-implemented module was not efficient, because too many small objects were being marshaled in and out of PyObjects. PyPy let everything stay in Python-land and still run quickly.
  • by Qem on 7/31/23, 3:34 PM

    I can't really use it at work, due to a restrictive corporate policy: I don't control my workstation setup, and I'm only allowed vanilla CPython there. Because of this, I wish PyPy were pip-installable from inside CPython, the way Hylang and Pyston are.

    But while programming as a hobby at home, mostly small-scale simulations, PyPy is my default interpreter for Python. It seems PyPy has a sweet spot for code that relies heavily on OOP style, with a lot of method calls and self invocation. I consistently get 8-10x speed improvements.
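
    A toy illustration of that sweet spot (a made-up micro-benchmark, not a rigorous one): method-call-heavy object code like this becomes nearly free under the JIT once the attribute lookups and short-lived allocations are traced away:

```python
import time

class Vec:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x, self.y = x, y

    def add(self, other):
        return Vec(self.x + other.x, self.y + other.y)

    def dot(self, other):
        return self.x * other.x + self.y * other.y

def run(n):
    acc, step, s = Vec(0.0, 0.0), Vec(1.0, 2.0), 0.0
    for _ in range(n):  # two method calls and one allocation per iteration
        acc = acc.add(step)
        s += acc.dot(step)
    return s

start = time.perf_counter()
run(1_000_000)
elapsed = time.perf_counter() - start  # compare under python3 vs pypy3
```
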

  • by hnfong on 7/31/23, 5:14 PM

    I use pypy as a drop-in replacement for CPython for some small data crunching scripts of my hobby projects. Might not count as "real work", but getting "free" speed ups is very nice and I'm very grateful for the PyPy project for providing a performant alternative to CPython.

    I was close to trying pypy on a production django deployment (which gets ~100k views a month), but given that the tiny AWS EC2 instance we're running it on is memory bound, the increased pypy memory usage made it impractical to do so.

  • by alfalfasprout on 7/31/23, 4:50 PM

    Years ago, yes I used it.

    Nowadays, to be honest, everything that I need to be fast in Python is largely numerical code, which either calls out to C/C++ (via numpy or some ML library) or I use Numba for. And these are either slower with PyPy or won't work.

    HTTP web servers are notoriously slow in Python (even the fastest ones, like Falcon), but I found they either didn't play nicely with PyPy or weren't any faster, in large part because if the API does any kind of "heavy lifting" they can't be truly concurrent.

  • by claytonjy on 7/31/23, 1:48 PM

    Can someone ELI5 why pypy doesn't or can't work with C-based packages like numpy or psycopg? I know nothing of how pypy does its magic.

    If we could use pypy, while still using those packages, I think it'd be the go-to interpreter. Why can't pypy optimize everything else, and leave the C stuff as-is?

    How does pypy handle packages written in other languages, like rust? can I use pypy if I depend on Pydantic?

  • by ideasman42 on 7/31/23, 12:51 PM

    If it were relatively up to date with Python 3 I'd use it, but as it lags behind considerably I avoid it, even for personal work.
  • by justinc-md on 7/31/23, 4:39 PM

    I used PyPy extensively at a previous employer. The use case was to accelerate an application that was CPU-bound because of serde, which could not be offloaded using multiprocessing. PyPy resulted in a 10x increase in message throughput, and made the project viable in python. Without PyPy, we would have rebuilt the application in Java.
  • by Aqueous on 7/31/23, 4:56 PM

    Can you tell us the obstacles to incorporating learnings from PyPy, or even backporting its work, into CPython?
  • by vogu66 on 7/31/23, 7:10 PM

    I've actually come across and started using Pyjion recently (https://github.com/tonybaloney/pyjion); how does Pypy compare, both in terms of performance and purpose? There seems to be a lot of overlap...
  • by garyrob on 7/31/23, 11:54 AM

    I've never ended up using PyPy other than to play with it. Numba has worked very well for me for real code.
  • by qeternity on 7/31/23, 3:59 PM

    We have evaluated PyPy but actually found Pyston to be more performant in most of our use cases (even with extensive JIT warming). That project unfortunately seems unmaintained now, but I am hoping that the improvements will be upstreamed into CPython.
  • by pyuser583 on 7/31/23, 1:16 PM

    I don’t use it, but I’d like to.

    The big obstacle is that for a while we would have multiple execution environments. It's not like we can flip a switch and have all our Dockerfiles use PyPy.

    Plus I don’t think AWS Lambda supports it.

    If I could go back in time, we would use it from the beginning.

  • by kzrdude on 7/31/23, 12:18 PM

    I wonder if programs like Rye, that distribute python in a way similar to Rust's rustup, can help. Rye already supports pypy, you can just pull down pypy3.9 at will into any particular python project managed by rye.
  • by radus on 7/31/23, 2:47 PM

    I don’t use it because I make frequent use of scientific libraries. If it were possible to use it on a function-by-function basis, with a decorator like Numba's, I would definitely give it a go.
  • by garashovb on 8/1/23, 6:34 AM

    David Beazley - PyCon 2012 Keynote Talk (Tinkering with PyPy)

    https://youtu.be/6_-5XZzJyt0

  • by zapregniqp on 8/1/23, 12:38 PM

    Yes, I have personally used it for some system-admin tasks. I've used PyPy to write scripts and tools for automating tasks, due to its performance benefits.
  • by woopwoop24 on 7/31/23, 4:43 PM

    Not using it, but thank you for the work you put in; highly appreciated. We only thrive because so many of us put in the work :)
  • by czbond on 7/31/23, 5:50 PM

    Question - new'ish to Python. Could I use PyPy with dataframes / pandas / Ray?
  • by password4321 on 7/31/23, 5:41 PM

    Time to add some opt-out telemetry! (runs for the hills)

    So... thanks for not doing that.

  • by ComplexSystems on 7/31/23, 3:58 PM

    I would like to, but aren't there issues using it with NumPy and SciPy?
  • by nurettin on 8/1/23, 6:19 PM

    I use it to speed up NEAT-Python for simulations
  • by m_antis89 on 7/31/23, 2:44 PM

    > Is anyone using PyPy for real work?

    Yes.
  • by andrewstuart on 7/31/23, 11:49 AM

    I’ve been aware of it for a long time.

    I don’t use it.

    Why would I use it, what’s the compelling benefit?

  • by ceeam on 7/31/23, 12:01 PM

    I liked Psyco a lot; it was totally awesome, with very few bugs (CPython differences), but that was looong ago. PyPy looks and feels like a monstrosity: it takes longer to build than most software, for one thing, which is off-putting. I would be more interested in a Python JIT that is to Python what LuaJIT is to Lua.