Python has earned a reputation for its power, flexibility, and ease of use. These advantages have led to its adoption across a wide variety of applications, workflows, and fields. But because of its language design, its interpreted nature, and its runtime dynamism, Python is often an order of magnitude slower than machine-native languages like C or C++.
For years, developers have proposed various workarounds for Python's speed limitations. For example, you can write performance-intensive tasks in C and wrap them in Python, as many machine learning libraries do. Or you can use Cython, a project that lets you add type information to Python code so that it can be compiled to C while you keep writing what is largely Python.
But workarounds are never ideal. Wouldn't it be nice if we could take an existing Python program as it is and simply run it faster? This is exactly what PyPy allows you to do.
PyPy and CPython
PyPy is a drop-in replacement for CPython, the standard Python interpreter. CPython compiles Python into intermediate bytecode, which is then interpreted by a virtual machine, while PyPy uses just-in-time (JIT) compilation to translate Python code into the native machine code of the local machine.
Performance improvements can be significant, depending on the task being performed. On average, PyPy speeds up Python by a factor of about 7.6, and some tasks by a factor of 50 or more. The CPython interpreter does not perform the same kinds of optimizations as PyPy, and may never do so, because this is not one of its design goals.
The best part is that developers need little or no effort to unlock the benefits that PyPy provides. Just replace CPython with PyPy, and for the most part you are done. Some exceptions are discussed below, but PyPy's goal is to run existing, unmodified Python code and deliver automatic speed improvements.
PyPy currently supports Python 2 and Python 3 through different versions of the project. In other words, you need to download a different version of PyPy depending on the version of Python you are running. PyPy's Python 2 branch has been around much longer, but the Python 3 version has made a lot of progress recently. PyPy currently supports Python 3.5 (release quality) and Python 3.6 (beta quality).
In addition to supporting the full core Python language, PyPy also works with most tools in the Python ecosystem, such as pip for packaging or virtualenv for virtual environments. Most Python packages, even those with C modules, will run as they are. There are some limitations, which are covered below.
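To make the bytecode stage described above a little more concrete, the standard-library dis module can list the instructions compiled for a function. This is only an illustrative sketch: the function add and the helper opnames are made-up names, and the exact instruction names vary between Python versions and implementations.

```python
import dis


def add(a, b):
    """Hypothetical example function; any small function works."""
    return a + b


def opnames(func):
    """Return the names of the bytecode instructions compiled for func."""
    return [instruction.opname for instruction in dis.Bytecode(func)]


if __name__ == "__main__":
    # Instruction names differ between versions, so no output is shown here.
    for name in opnames(add):
        print(name)
```

Running the same script under different interpreters is a quick way to see that the source stays unchanged while the execution strategy underneath differs.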
How does PyPy work?
PyPy uses optimization techniques for dynamic languages that are also found in other just-in-time compilers. It analyzes the running Python program to determine type information about objects as they are created and used, and then uses that type information as a guide to speed things up. For example, if a Python function works with only one or two different object types, PyPy generates machine code to handle those specific cases.
PyPy's optimizations are applied automatically at runtime, so you usually don't need to tune its performance. Advanced users may want to experiment with PyPy's command-line options to generate faster code for special cases, but this is rarely needed.
PyPy also departs from the way CPython handles some internal functions, while trying to preserve compatible behavior. For example, PyPy handles garbage collection differently than CPython. Not all objects are reclaimed as soon as they go out of scope, so Python programs running under PyPy may use more memory than when running under CPython. But you can still use Python's high-level garbage collection controls exposed through the gc module, such as gc.enable(), gc.disable(), and gc.collect().
If you want information about PyPy's JIT behavior at runtime, PyPy includes a module, pypyjit, that exposes a lot of JIT-related information to your Python application. If one of your functions or modules performs poorly with the JIT, pypyjit lets you obtain detailed statistics about it.
Another PyPy-specific module, __pypy__, exposes other features unique to PyPy, so it is useful for writing applications that take advantage of those features. Because of Python's runtime dynamism, it is possible to build Python applications that use these features when PyPy is present and ignore them when it is not.
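As a rough illustration of what "only one or two different object types" means in practice, the sketch below runs the same loop over type-stable input and over mixed input. It does not show PyPy's internals or measure anything; sum_of_squares is a hypothetical example function, and the comments only describe the kind of code a type-specializing JIT is generally expected to handle well.

```python
from fractions import Fraction


def sum_of_squares(values):
    """Hot loop: the work per iteration depends on the element types."""
    total = 0
    for value in values:
        total += value * value
    return total


if __name__ == "__main__":
    # Type-stable input: every element is an int, so a single
    # specialized code path can cover the whole loop.
    ints_only = sum_of_squares(range(1_000_000))

    # Mixed input: ints, floats, and Fractions flow through the same
    # loop, so there are more distinct cases to handle.
    mixed = sum_of_squares([1, 2.5, Fraction(1, 3)] * 1000)

    print(ints_only, mixed)
```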
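The difference in reclamation timing can be observed with a weak reference. The following is a minimal sketch, assuming a throwaway class named Resource (hypothetical); it relies only on the standard gc and weakref modules. The explicit gc.collect() call is what makes the check independent of when the interpreter would otherwise have reclaimed the object.

```python
import gc
import weakref


class Resource:
    """Hypothetical placeholder for an object that holds something scarce."""


def is_reclaimed_after_collect():
    """Drop the only strong reference, force a collection, report the result."""
    obj = Resource()
    ref = weakref.ref(obj)
    # With reference counting (CPython) the object is freed at this point;
    # a tracing collector may keep it alive until a later collection.
    del obj
    # Explicitly request a collection instead of relying on timing.
    gc.collect()
    return ref() is None


if __name__ == "__main__":
    print(is_reclaimed_after_collect())
```

For the same reason, releasing files and similar resources with a with block, rather than relying on the object going out of scope, is the safer habit for code that should behave the same on both interpreters.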
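One common way to use PyPy-only modules when they are present and skip them otherwise is a guarded import. The sketch below is an assumption about how an application might structure this, not a prescribed pattern; describe_runtime is a hypothetical helper, and no specific pypyjit functions are called.

```python
import platform
import sys

# PyPy ships __pypy__ as a built-in module, so its presence identifies PyPy.
IS_PYPY = "__pypy__" in sys.builtin_module_names

try:
    import pypyjit  # Only importable when running on PyPy.
except ImportError:
    pypyjit = None  # Fall back gracefully on other interpreters.


def describe_runtime():
    """Return a short label for the running interpreter."""
    name = platform.python_implementation()  # e.g. "CPython" or "PyPy"
    jit = "available" if pypyjit is not None else "not available"
    return f"{name} (pypyjit {jit})"


if __name__ == "__main__":
    print(describe_runtime())
```

Code elsewhere in the application can then branch on IS_PYPY or check whether pypyjit is None before touching any PyPy-specific feature.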
PyPy limitations
PyPy may seem magical, but it is not magic. It has certain limitations that can reduce or eliminate its effectiveness for certain kinds of programs. Unfortunately, PyPy is not a completely universal replacement for the CPython runtime.
PyPy is best for pure Python applications
PyPy performs best with "pure" Python applications, that is, applications written in Python and nothing else. Because of the way PyPy emulates CPython's native binary interfaces, Python packages that interface with C libraries (such as NumPy) have not fared as well.
PyPy's developers have steadily worked on this problem and made PyPy more compatible with most Python packages that rely on C extensions. For example, NumPy now works very well with PyPy. However, if you want maximum compatibility with C extensions, use CPython.
PyPy works best with long-running programs
One side effect of the way PyPy optimizes Python programs is that longer-running programs benefit the most. The longer the program runs, the more runtime type information PyPy can collect, and the more optimizations it can make. One-and-done Python scripts won't benefit from this. The applications that do benefit typically have loops that run for a long time, or run continuously in the background, as web frameworks do.
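A simple way to observe this warm-up effect is to time the same workload over several rounds and compare early rounds with later ones. The sketch below is illustrative only: busy_work and time_rounds are hypothetical names, the workload is arbitrary, and the timings depend entirely on the interpreter and machine, so no numbers are shown.

```python
import time


def busy_work(n):
    """Hypothetical CPU-bound workload: a plain Python loop."""
    total = 0
    for i in range(n):
        total += i % 7
    return total


def time_rounds(rounds=5, n=200_000):
    """Run the same workload several times and return seconds per round."""
    timings = []
    for _ in range(rounds):
        start = time.perf_counter()
        busy_work(n)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    for index, seconds in enumerate(time_rounds(), start=1):
        print(f"round {index}: {seconds:.4f}s")
```

A script that exits after a single short round never gives a JIT the chance to recoup the time spent compiling, which is why short one-off scripts tend to see little benefit.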
PyPy does not compile ahead of time
PyPy compiles Python code, but it is not a compiler for Python code. Because of the way PyPy performs its optimizations and the inherent dynamism of Python, there is no way to emit the resulting JIT-compiled code as a standalone binary and reuse it. Each program must be compiled again on each run. If you want to compile Python into faster code that can run as a standalone application, use Cython, Numba, or the currently experimental Nuitka project.