homme.io
Clean.Precise.Quick.

Synthetic Performance Test: GCC vs Intel ICC vs LuaJIT vs LuaJIT+FFI vs Java

Hardware

Host: Intel i7 2.3 GHz, 4 cores (8 logical), 16 GB RAM, SSD, macOS Mojave + Parallels 12
Guest: VM running Oracle Linux 7.6, configured with 4 cores and 8 GB RAM

C Test File

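#include <stdio.h>

#define N 4000  /* number of array elements */
#define S 1000  /* number of sweeps over the array */

struct t {
        double a, b, f;
};


int main (int argc, char **argv) {
        int i, j;
        struct t t[N];

        /* initialize every element */
        for(i=0; i<N; i++) {
                t[i].a = 0;
                t[i].b = 1;
                t[i].f = i * 0.25;
        }

        /* S sweeps; each sweep updates all N elements independently */
        for(j=0; j<S; j++) {
                for(i=0; i<N; i++) {
                        t[i].a += t[i].b * t[i].f;
                        t[i].b -= t[i].a * t[i].f;
                }
                printf("%.6f\n", t[1].a);
        }

        return 0;
}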


 

GCC (4.8.5)

gcc lua_perf.c -o lua_perf -O3 -Wall -march=native -ftree-parallelize-loops=4 -floop-parallelize-all -ftree-vectorize

time ./lua_perf > /dev/null

real    0m5.604s
user    0m22.263s
sys    0m0.118s

top
404 root      20   0   36468   4624   1356 R 400.0(%CPU)  0.1(%MEM)   0:09.08 lua_perf

GCC8 (8.3.1 from devtoolset-8)

real    0m5.695s
user    0m22.632s
sys    0m0.124s

 

ICC (19.0.4.235)

/opt/intel/system_studio_2019/bin/icc lua_perf.c -O2 -o lua_perf_icc -no-prec-div -ipo -xSSE4.2 -parallel

time ./lua_perf_icc > /dev/null

real    0m5.322s
user    0m21.186s
sys    0m0.074s

 

top
14344 root      20   0  247848   6588   3100 R 400.0 (%CPU)  0.1(%MEM)   0:14.55 lua_perf_icc           

Lua Test File


local N = 4000  -- number of table elements
local S = 1000  -- number of sweeps over the table
local t = {}
for i = 0, N-1 do
  t[i] = { a = 0, b = 1, f = i * 0.25 }
end
for j = 0, S-1 do
  for i = 0, N-1 do
    t[i].a = t[i].a + t[i].b * t[i].f
    t[i].b = t[i].b - t[i].a * t[i].f
  end
  print(string.format("%.6f", t[1].a))
end

 

LuaJIT

time /usr/local/openresty/luajit/bin/luajit lua_perf.lua > /dev/null

real    3m4.680s
user    3m4.612s
sys    0m0.042s

15581 root      20   0   38572  28360   2000 R 100.0(%CPU)  0.4(%MEM)   0:04.08 luajit

 

LuaJIT+FFI Test File


--collectgarbage('setpause', 2000)
local ffi = require("ffi")
ffi.cdef[[
typedef struct { double a, b, f; } table_elem;
]]
local N = 140000
local S = 110000
local t = ffi.new("table_elem[?]", N)
for i = 0, N-1 do
  t[i].a = 0.0
  t[i].b = 1.0
  t[i].f = i * 0.25
end

for j = 0, S-1 do
  for i = 0, N-1 do
    t[i].a = t[i].a + t[i].b * t[i].f
    t[i].b = t[i].b - t[i].a * t[i].f
  end
  print(string.format("%.6f", t[1].a))
end

LuaJIT+FFI

time /usr/local/openresty/luajit/bin/luajit lua_perf_ffi.lua > /dev/null

real    0m22.603s
user    0m22.589s
sys    0m0.012s

15625 root      20   0   17920   7508   1968 R 100.0(%CPU)  0.1(%MEM)   0:08.00 luajit

 

Java Test File

class lua_perf {
        public double a, b, f;
        static final int N=4000;
        static final int S=1000;

        public static void main (String[] argv) {
                int i, j;
                lua_perf[] t = new lua_perf[N];
                // allocate and initialize every element
                for(i=0; i<N; i++) {
                        t[i] = new lua_perf();
                        t[i].a = 0;
                        t[i].b = 1;
                        t[i].f = i * 0.25;
                }

                // S sweeps; each sweep updates all N elements
                for(j=0; j<S; j++) {
                        for(i=0; i<N; i++) {
                                t[i].a += t[i].b * t[i].f;
                                t[i].b -= t[i].a * t[i].f;
                        }
                        System.out.println(t[1].a);
                }
        }
}

Java (without any JVM tuning options)

java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)


/u01/app/oracle/product/19.0.0/dbhome_1/jdk/bin/javac lua_perf.java

time /u01/app/oracle/product/19.0.0/dbhome_1/jdk/bin/java lua_perf > /dev/null

real    0m33.052s
user    0m32.849s
sys    0m0.425s

15516 root      20   0 4452640  37940  16056 S 100.7(%CPU)  0.6(%MEM)   0:07.33 java

File sizes

-rwxr-xr-x. 1 root root 8.6K Aug  9 18:13 lua_perf
-rw-r--r--. 1 root root  572 Aug  9 17:43 lua_perf.c
-rw-r--r--. 1 root root  822 Aug  9 19:04 lua_perf.class
-rw-r--r--. 1 root root  438 Aug  9 17:25 lua_perf_ffi.lua
-rwxr-xr-x. 1 root root 8.3K Aug  9 18:26 lua_perf_gcc8
-rwxr-xr-x. 1 root root  28K Aug  9 18:28 lua_perf_icc
-rw-r--r--. 1 root root  777 Aug  9 19:04 lua_perf.java
-rw-r--r--. 1 root root  380 Aug  9 19:03 lua_perf.lua

Conclusions

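GCC and ICC perform similarly; ICC is slightly faster in this particular test (about 1.05x against GCC 4.8.5 and 1.07x against GCC 8.3.1, by wall-clock time). LuaJIT+FFI shows C-like single-threaded performance: its 22.6 s on one core is close to the 21 to 23 s of total CPU (user) time that the C builds spread across four cores, so it would need parallelism to match the wall-clock time of C programs compiled with the parallel options. LuaJIT without FFI performs reasonably well for a scripting language. Java performs well but, as usual, uses a lot of memory (compare the VIRT column in the top output). Plain Lua (without the JIT) was not tested, since it would a priori be too slow for this test.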

sdmrnv, 2019-08-09