A High-Performance, Parallel Virtual Machine for Python
OPEN ACCESS
Date
2019
Publication Type
Doctoral Thesis
ETH Bibliography
yes
Abstract
Today’s hardware is increasingly parallel, and modern programming languages
must therefore let programmers use this parallelism effectively.
For languages that depend on a virtual machine (VM), it is the responsibility
of the VM to execute code efficiently and in parallel. In the case of Python, a
popular dynamic language, several VMs exist, but none of them delivers
high performance and parallel execution at the same time.
The reason for the lack of such a high-performance, parallel VM for Python
lies in Python’s concurrency semantics, which is based on a strong
memory model and atomic operations on large entities. Such a concurrency
semantics does not map well to modern hardware architectures, which is
why parallel VMs must emulate this semantics under parallel execution
through expensive synchronization.
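
As a minimal illustration of what "atomic operations on large entities" means in practice, the sketch below assumes CPython-like behavior, in which a whole `list.append` call is atomic with respect to other threads. The names `shared` and `worker` are hypothetical and do not come from the dissertation.

```python
import threading

shared = []  # one list mutated by all threads, with no explicit lock


def worker(thread_id, count):
    # Each append is a large-grained operation (it may resize the list),
    # yet Python programs rely on it appearing atomic to other threads.
    for i in range(count):
        shared.append((thread_id, i))


threads = [threading.Thread(target=worker, args=(tid, 1000)) for tid in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# No element is lost or duplicated, although no lock was taken by the program.
assert len(shared) == 4000
assert len(set(shared)) == 4000
```

A parallel VM has to preserve this guarantee for every such operation, which is where the synchronization cost mentioned above comes from.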
This dissertation introduces a new approach to the design and construction
of high-performance, parallel VMs with a concurrency semantics such
as Python’s. Introducing large-scale atomicity into the implementation
language of a VM lets a VM developer specify the concurrency semantics
independently of the VM’s synchronization mechanism. The
synchronization mechanism thereby becomes an exchangeable VM component.
For the high-performance execution of Python code, just-in-time compilation
is essential. Unfortunately, Python’s strong memory
model inhibits basic compiler optimizations under concurrency. Hence, to
allow a compiler to optimize effectively, the concept of Parallel Worlds is
introduced.
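
One way to see why a strong memory model constrains a compiler is the sketch below. It is an illustrative assumption, not an example from the dissertation: the names `state`, `spin`, and `iterations` are hypothetical. The loop must observe the write made by the other thread, so a compiler may not treat the read of `state["stop"]` as loop-invariant and hoist it out of the loop.

```python
import threading

state = {"stop": False}      # shared flag, written by the main thread
started = threading.Event()  # signals that the spinning thread is running
iterations = []              # receives the loop count once the loop exits


def spin():
    count = 0
    started.set()
    # The flag is re-read on every iteration. Hoisting this read out of
    # the loop would be a standard optimization in single-threaded code,
    # but here it would turn the loop into an infinite one.
    while not state["stop"]:
        count += 1
    iterations.append(count)


t = threading.Thread(target=spin)
t.start()
started.wait()
state["stop"] = True  # must become visible to the spinning thread
t.join(timeout=5)

assert not t.is_alive()
```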
Parallel Worlds transparently isolate concurrent computations from each
other, and thereby allow for effective optimizations under the assumption of
no concurrency. The transparent isolation of Parallel Worlds is supported by
a special-purpose software transactional memory system (STM). Apart from
isolation, this STM is the key enabler for the efficient parallel execution
of Python code. Parallel execution builds on the speculative execution
capability of the STM.
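
The general idea of speculative execution in a software transactional memory can be sketched as follows. This is a deliberately simplified, generic optimistic scheme and should not be read as the design of the special-purpose STM described here. All names (`TVar`, `Transaction`, `Conflict`, `atomically`) are hypothetical.

```python
import threading


class TVar:
    """A transactional variable: a value plus a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Conflict(Exception):
    """Raised when a transaction saw data that has since changed."""


_commit_lock = threading.Lock()  # guards only short reads and the commit


class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version observed at first read
        self.writes = {}  # TVar -> buffered (not yet visible) value

    def read(self, tvar):
        if tvar in self.writes:
            return self.writes[tvar]
        with _commit_lock:
            value, version = tvar.value, tvar.version
        if self.reads.setdefault(tvar, version) != version:
            raise Conflict()
        return value

    def write(self, tvar, value):
        self.writes[tvar] = value  # isolated until commit

    def commit(self):
        with _commit_lock:
            # Validate: everything read must still be at the version seen.
            for tvar, version in self.reads.items():
                if tvar.version != version:
                    raise Conflict()
            # Publish the buffered writes in one step.
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1


def atomically(body):
    """Run body speculatively and retry it until it commits."""
    while True:
        tx = Transaction()
        try:
            result = body(tx)
            tx.commit()
            return result
        except Conflict:
            continue  # discard buffered writes and re-execute


counter = TVar(0)


def increment(tx):
    tx.write(counter, tx.read(counter) + 1)


def worker():
    for _ in range(500):
        atomically(increment)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert counter.value == 2000
```

Each transaction runs without holding a lock over its whole body. Writes stay private until commit, and a transaction that read stale data is discarded and re-executed. This is the sense in which execution is speculative in this sketch.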
The product of this dissertation is PyPy-stm, a high-performance, parallel
VM for Python. With PyPy-stm, multi-threaded Python programs can take
advantage of the parallelism in modern hardware. On a set of benchmark
programs, PyPy-stm outperforms established Python VMs such as CPython,
Jython, IronPython, and PyPy. Compared with PyPy, the current top-of-class
in program performance, PyPy-stm achieves speedups in the range
of 1.5× to 8.0× with 8 threads available. These results confirm the viability
of the approach and show that PyPy-stm deserves to be called a
high-performance, parallel VM for Python.
Publication status
published
Publisher
ETH Zurich
Subject
Dynamic language; Virtual machine; Just-in-time Compilation; Performance; Parallel computing; Parallel programming language; Parallelism; Transactional memory
Organisational unit
03422 - Gross, Thomas (emeritus) / Gross, Thomas (emeritus)