Python / Cython / Java / Go / Rust

I’m in love with Python. At first I used it only because I had to write and maintain some Subversion hooks, and I hated its indentation, which reminded me of Fortran.

Some years ago I rediscovered it, and now I love it as a programming language: it is fun to program in, powerful, logical and complete. And it is open source. Recently I discovered Django, and it has become my web framework of choice.

But Python has a problem. It is not as fast as a rattlesnake, but as slow as a snail. As you may know, Python is interpreted. Unlike C or C++, you cannot compile it into a machine code executable that the CPU can run directly. The interpreter compiles the source file into byte-code (cached in a “*.pyc” file for imported modules), which it then executes. That is a step beyond simply interpreting the source code line by line, like BASIC, but it is not real machine code compilation. There are other languages, like Java, that also compile to byte-code.
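To look at the byte-code step directly, here is a minimal sketch using the standard `dis` module. The function name `concat_demo` is only an illustration and is not part of the benchmark, and the exact instructions printed differ between Python versions.

```python
import dis


def concat_demo(n):
    # Hypothetical helper, used only to have something to disassemble.
    s = ""
    for i in range(n):
        s = s + "/" + str(i)
    return s


# Every function carries its compiled byte-code in a code object.
bytecode = concat_demo.__code__.co_code
print(type(bytecode), len(bytecode))

# dis.dis prints the human-readable instructions the interpreter executes.
dis.dis(concat_demo)
```

The listing is what the interpreter loop walks through, one instruction at a time, which is where much of the overhead comes from.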

So let’s do something to accelerate a language we love. Can we compile Python? We have “Cython”, a tool that translates Python source code into C, which can then be compiled to machine code. Cool! So we have it! Fast Python code, the panacea. Well, it depends. Let’s have a look.

Let’s take a little piece of code that concatenates strings, in Python:

def test_fun():
    s = ""
    for i in range(100000):
        # Build the string one piece at a time.
        s = s + "/" + str(i)

def main():
    test_fun()

if __name__ == "__main__":
    main()

If we run it with the Python interpreter:

$ time python
real 0m3.766s
user 0m1.812s
sys 0m1.955s
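The shell’s `time` also counts interpreter start-up and shutdown. As a rough cross-check, the sketch below uses the standard `timeit` module to time only the function call. The parameter `n` and the reduced loop size are assumptions added to keep the run short, so its numbers are not comparable with the ones above.

```python
import timeit


def test_fun(n=100000):
    # Same loop as the benchmark; n is a parameter so smaller sizes can be tried.
    s = ""
    for i in range(n):
        s = s + "/" + str(i)
    return s


# Run the function 3 times with a reduced size and keep the best time.
best = min(timeit.repeat(lambda: test_fun(10000), number=1, repeat=3))
print("best of 3: %.4f s" % best)
```

Taking the minimum of several runs is the usual way to reduce noise from other processes on the machine.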

Cython can generate a C source file that you can compile with gcc. Let’s give it a chance:

$ cython -2 --embed
$ gcc -pthread -fPIC -fwrapv -O2 -Wall -fno-strict-aliasing -I/usr/include/python2.7 -lpython2.7 -o test test.c
$ time ./test
real 0m6.895s
user 0m3.293s
sys 0m3.552s

Oh, no! Compilation <> acceleration. I’m sure this test is too simple, that optimizations can be done, and that Cython may be good at some specific tasks. But as you can see, the time has almost doubled. Not very promising. The C code that Cython generates from plain, untyped Python is not optimized at all. In this particular case, it is pessimized.

If we want fast programs, I’m afraid Python is not the choice. But if you want to program fast, Python is your language.

Let’s try to fall in love with some other languages. Well, I have been a Java programmer for years, and I used to love it. I suspect Java will be faster. Let’s translate that simple program to Java. I don’t want to use StringBuffer; instead I want to keep it as clear and simple as my Python example above:

package javatest;

public class Test {
    public static void main(String[] args) {
        String s = "";
        // Plain String concatenation on purpose, no StringBuffer.
        for (int i = 0; i < 100000; i++) {
            s = s + "/" + Integer.toString(i);
        }
    }
}

How much faster will it be?
time will tell us the truth:

$ time java javatest.Test

real	0m36.517s
user	0m37.533s
sys	0m0.277s

Oh, no! My good old Java is slower than Python for this simple task! What can I do now?
It is not two times slower, but TEN times slower. I’m sure that with StringBuffer we can do something better, maybe another day (or you can give me the answer in the comments below).

I’m afraid I will have to learn a new programming language. No problem, I like that. I have heard of two languages with cool names: Go and Rust.

Go is a fairly new language created by people at Google. It is compiled to native code, very easy to learn and quite interesting. The translation looks something like this:

package main

import (
    "strconv"
)

func main() {
    s := ""
    // Append "/<i>" to the string on every iteration.
    for i := 0; i < 100000; i++ {
        s = s + "/" + strconv.Itoa(i)
    }
}

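Running Go from source feels similar to running Python, but the mechanism is different: “go run” compiles the program to a temporary native executable and then runs it, so the time below includes compilation. So let’s Go: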

$ time go run test.go 

real	0m2.789s
user	0m2.363s
sys	0m0.093s

Good! Faster than Python, 1 second below Python’s mark. Very promising.

Our other option for today’s little benchmark is Rust. Rust is compiled, like C. I am not a Rust expert, so I’m sure my translation below could be improved in many ways. In addition, Rust development is very active and the language definition is still changing to some degree.

fn main() {
    let mut count: i32 = 0;
    let mut s = String::new();
    let bar = "/";

    loop {
        count += 1;
        // Convert the counter to a string and append it after the separator.
        let count_s = count.to_string();
        s = s + bar + &count_s;
        if count == 100000 {
            break;
        }
    }
}

$ rustc
$ time ./test

real	0m0.038s
user	0m0.037s
sys	0m0.001s

What? 38 milliseconds? Rust is similar to C in performance, so this result was to be expected. It is a really good result in terms of performance, and Rust is being designed to be an enjoyable language, at least more so than C. I agree that C is THE language if you want to program close to the machine, but Rust can be a good alternative if you want to have fun while programming. There are even some web frameworks for it: nickel and Iron, for example.

Web frameworks for Go are also available, and they are very active. Go will go very far (I didn’t want to make more puns on the name, I swear, but it is too easy…).

I know this is an I-do-not-know-what-to-do-before-going-to-bed-let’s-do-a-benchmark kind of benchmark: not very comprehensive, not very accurate, not scientific at all. Comparing Rust with the rest of the languages is not fair. But it gave me an overview of these programming languages in terms of performance, and I wanted to share it with the community.

In summary:

  • Python: 3.8s
  • Cython compiled: 6.9s
  • Java: 36.5s
  • Go: 2.8s
  • Rust: 38ms
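To put the “ten times slower” remark in numbers, here is a small sketch that turns the wall-clock (“real”) times reported above into ratios relative to plain Python. The dictionary and variable names are just illustrative.

```python
# Wall-clock ("real") times reported in this post, in seconds.
times = {
    "Python": 3.766,
    "Cython compiled": 6.895,
    "Java": 36.517,
    "Go": 2.789,
    "Rust": 0.038,
}

baseline = times["Python"]
# A ratio above 1 means slower than plain Python, below 1 means faster.
ratios = {name: t / baseline for name, t in times.items()}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print("%-16s %6.2fx" % (name, ratio))
```

These are single runs on one machine, so the ratios are only a rough guide.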

6 thoughts on “Python / Cython / Java / Go / Rust”

  1. I also got around 3.6s for the pure python loop.

    However, if you use a list comprehension in python:

    def test_fun_listcom():
        s = "".join(["/" + str(i) for i in range(100000)])

    I get 42.2 ms!

    And with the following Cython code:

    cimport cython

    def cy_test_fun():
        str_list = []
        cdef int i
        for i in range(100000):
            str_list.append("/" + str(i))
        s = "".join(str_list)

    I get down to 34.7 ms!


  2. Using a list comprehension in Python:

    def test_fun():
        s = "".join(["/" + str(i) for i in range(100000)])

    I get 42.2 ms.

    And with the following Cython code:

    cimport cython

    def cy_test_fun():
        str_list = []
        cdef int i
        for i in range(100000):
            str_list.append("/" + str(i))
        s = "".join(str_list)

    I get down to 34.7 ms.


  3. Moreover,

    If you just replace:

    s = s + "/" + str(i)

    with

    s += "/" + str(i)

    you go from 3.5s to 62.2ms, because in the first case s + "/" copies the whole string on every iteration instead of just extending it.


  4. Wow! Thanks for your comments. You’re right, I was not looking for tricks in order to keep it as simple as possible, but in the real world tricks help a lot. The last change is impressive, very simple and very instructive as well. And Python is amazing!


  5. Converting this

    # for loop in Python
    def f(n, m):
        output = 0
        for i in range(n):
            output += i % m
        return output

    %timeit f(1000000, 42)
    best of 3: 58.1 ms per loop

    to proper Cython

    # Cython for loop
    def c(int n):
        cdef int i
        cdef int output = 0  # must be initialized before accumulating
        for i in range(n):
            output += i % 42
        return output

    %timeit c(1000000)
    best of 3: 1.43 ms per loop

    I mean, that speed-up is pretty amazing. However, one should have a good understanding of C to really take advantage of Cython’s optimizations.
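The pure-Python variants suggested in the comments above can be compared side by side with a sketch like the following. The function names and the reduced size `N` are assumptions chosen for illustration, and the timings depend heavily on the Python version and the machine.

```python
import timeit


def concat_plus(n):
    # Original form: s + "/" builds a new string on each iteration.
    s = ""
    for i in range(n):
        s = s + "/" + str(i)
    return s


def concat_inplace(n):
    # Suggested form: += can let CPython extend the string in place.
    s = ""
    for i in range(n):
        s += "/" + str(i)
    return s


def concat_join(n):
    # Suggested form: build the pieces first and join them once.
    return "".join(["/" + str(i) for i in range(n)])


N = 20000  # reduced size (an assumption) so the comparison runs quickly
for fn in (concat_plus, concat_inplace, concat_join):
    best = min(timeit.repeat(lambda: fn(N), number=1, repeat=3))
    print("%-15s %.4f s" % (fn.__name__, best))
```

All three functions return the same string, so only the way it is built differs.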

