GitHub - DreamWeave-MP/rubic0n: OpenResty's Branch of LuaJIT 2 · GitHub

Name

openresty/luajit2 - OpenResty's maintained branch of LuaJIT.

Table of Contents

Description

This is the official OpenResty branch of LuaJIT. It is not to be considered a fork, since we still regularly synchronize changes from the upstream LuaJIT project (https://github.com/LuaJIT/LuaJIT).

OpenResty extensions

In addition to synchronizing upstream changes, we introduce our own changes which haven't been merged yet (or never will be). This document describes the changes that are specific to this branch.

New Lua APIs

table.isempty

syntax: res = isempty(tab)

Returns true when the given Lua table contains neither non-nil array elements nor non-nil key-value pairs, or false otherwise.

This API can be JIT compiled.

Usage:

local isempty = require "table.isempty"

print(isempty({}))  -- true
print(isempty({nil, dog = nil}))  -- true
print(isempty({"a", "b"}))  -- false
print(isempty({nil, 3}))  -- false
print(isempty({cat = 3}))  -- false
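
For code that may also run on a LuaJIT build without this extension, one possible pattern is to fall back to a pure-Lua check. The sketch below is not part of this branch. The fallback is an assumption that relies only on the standard next() function, and it does not get the JIT treatment of the built-in.

```lua
-- Prefer the built-in when this branch provides it.
local ok, isempty = pcall(require, "table.isempty")
if not ok then
  -- Hypothetical fallback: a table is empty when next() finds no entry.
  isempty = function(tab)
    return next(tab) == nil
  end
end

print(isempty({}))         -- true
print(isempty({cat = 3}))  -- false
```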

Back to TOC

table.isarray

syntax: res = isarray(tab)

Returns true when the given Lua table is a pure array-like Lua table, or false otherwise.

Empty Lua tables are treated as arrays.

This API can be JIT compiled.

Usage:

local isarray = require "table.isarray"

print(isarray{"a", true, 3.14})  -- true
print(isarray{dog = 3})  -- false
print(isarray{})  -- true
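
A portable approximation can be sketched in pure Lua. It assumes that "pure array-like" means every key is a positive integer; the built-in may treat edge cases differently, so this is an illustration rather than a drop-in replacement.

```lua
-- Prefer the built-in when this branch provides it.
local ok, isarray = pcall(require, "table.isarray")
if not ok then
  -- Assumption: array-like means all keys are positive integers.
  isarray = function(tab)
    for k in pairs(tab) do
      if type(k) ~= "number" or k < 1 or k % 1 ~= 0 then
        return false
      end
    end
    return true
  end
end

print(isarray{"a", true, 3.14})  -- true
print(isarray{dog = 3})          -- false
```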

Back to TOC

table.nkeys

syntax: n = nkeys(tab)

Returns the total number of elements in a given Lua table (i.e. from both the array and hash parts combined).

This API can be JIT compiled.

Usage:

local nkeys = require "table.nkeys"

print(nkeys({}))  -- 0
print(nkeys({ "a", nil, "b" }))  -- 2
print(nkeys({ dog = 3, cat = 4, bird = nil }))  -- 2
print(nkeys({ "a", dog = 3, cat = 4 }))  -- 3
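
The difference from the length operator can be shown with a small sketch. The pure-Lua fallback here is an assumption for builds without table.nkeys; it walks the table with pairs() and is therefore linear in the number of entries.

```lua
-- Prefer the built-in when this branch provides it.
local ok, nkeys = pcall(require, "table.nkeys")
if not ok then
  -- Hypothetical fallback: count every non-nil entry.
  nkeys = function(tab)
    local n = 0
    for _ in pairs(tab) do
      n = n + 1
    end
    return n
  end
end

-- The length operator only sees the array part; nkeys counts both parts.
local t = { "a", dog = 3, cat = 4 }
print(#t, nkeys(t))  -- 1   3
```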

Back to TOC

table.clone

syntax: t = clone(tab)

Returns a shallow copy of the given Lua table.

This API can be JIT compiled.

Usage:

local clone = require "table.clone"

local x = {x=12, y={5, 6, 7}}
local y = clone(x)
-- ... use y ...

Note: We observed a 7% overall speedup in the compilation speed of the edgelang-fan compiler, whose Lua code is generated by the fanlang compiler.

Note bis: Deep cloning is planned to be supported by adding true as a second argument.
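
What "shallow" means in practice can be sketched as follows. The fallback function is hypothetical and only copies top-level keys; it does not model anything else the built-in might do.

```lua
-- Prefer the built-in when this branch provides it.
local ok, clone = pcall(require, "table.clone")
if not ok then
  -- Hypothetical fallback: copy top-level keys only.
  clone = function(tab)
    local copy = {}
    for k, v in pairs(tab) do
      copy[k] = v
    end
    return copy
  end
end

local x = { x = 12, y = { 5, 6, 7 } }
local y = clone(x)

y.x = 99     -- top-level fields are independent
y.y[1] = 42  -- nested tables are shared between x and y

print(x.x)         -- 12
print(x.y[1])      -- 42
print(x.y == y.y)  -- true
```

Code that needs independent nested tables has to copy them explicitly until deep cloning is available.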

Back to TOC

jit.prngstate

syntax: state = jit.prngstate(state?)

Returns (and optionally sets) the PRNG state currently used by the JIT compiler. The state is an array of 8 Lua numbers with 32-bit integer values.

When the state argument is non-nil, it is expected to be an array of up to 8 unsigned Lua numbers, each with value less than 2**32-1. This will set the current PRNG state and return the state that was overridden.

Note: For backward compatibility, the state argument can also be an unsigned Lua number less than 2**32-1.

Note: When the state argument is an array of fewer than 8 numbers, or is a single number, the remaining positions are filled with zeros.

Usage:

local state = jit.prngstate()
local oldstate = jit.prngstate{ a, b, c, ... }

jit.prngstate(32) -- {32, 0, 0, 0, 0, 0, 0, 0}
jit.prngstate{432, 23, 50} -- {432, 23, 50, 0, 0, 0, 0, 0}

Note: This API has no effect if LuaJIT is compiled with -DLUAJIT_DISABLE_JIT; in that case it returns a table filled with zeros.
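
The zero-fill rule and a save-and-restore pattern can be sketched as below. normalize_state is a hypothetical helper that only models the rule described in the notes; it is not part of the API. The jit.prngstate calls are guarded so the snippet also loads on builds that lack the function.

```lua
-- Hypothetical helper: expand a number or short array to 8 slots,
-- filling the missing positions with zeros.
local function normalize_state(state)
  local src = type(state) == "number" and { state } or state
  local out = {}
  for i = 1, 8 do
    out[i] = src[i] or 0
  end
  return out
end

print(table.concat(normalize_state(32), ", "))
-- 32, 0, 0, 0, 0, 0, 0, 0
print(table.concat(normalize_state{432, 23, 50}, ", "))
-- 432, 23, 50, 0, 0, 0, 0, 0

-- Save-and-restore pattern around a section that should see a fixed state.
if jit and jit.prngstate then
  local saved = jit.prngstate({ 1, 2, 3 })  -- returns the overridden state
  -- ... run the code under test here ...
  jit.prngstate(saved)                      -- restore the previous state
end
```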

Back to TOC

jit.gcstats

syntax: stats = jit.gcstats(reset?)

Available only when this branch is built with -DLUAJIT_ENABLE_GCSTATS. Without that compile flag, jit.gcstats is not registered. The statistics are observability counters for the existing incremental collector; this is not a generational GC mode and does not make collection automatically faster.

The function returns a table snapshot with numeric counter fields. If reset is truthy, the snapshot contains the pre-reset values and the stored counters are then reset to zero; a following jit.gcstats() call starts from the new zeroed baseline. Counters are stored internally as uint64_t, but are returned as Lua numbers, so very large values may lose integer precision on builds where lua_Number is a double.

Current fields:

alloc_calls            free_calls             realloc_calls
alloc_bytes            free_bytes             realloc_bytes
new_gcobj_calls        step_calls             cycle_count
fullgc_calls           propagate_calls        propagate_bytes
atomic_calls           sweep_string_steps     sweep_root_steps
finalizer_scan_steps   finalizer_queued       finalizer_calls
finalizer_cfunc_calls  finalizer_cfunc_nup0_calls
finalizer_cfunc_upvalue_calls                 finalizer_lfunc_calls
finalizer_ffunc_calls  finalizer_other_calls  finalizer_error_calls
weak_tables            weak_slots_cleared
barrier_forward        barrier_back           barrier_upvalue
barrier_trace          jit_forced_exits

Stats-enabled builds in this fork expose sweep_udata_* counters for the mandatory sweep-time finalizer discovery path. Normal GC cycles discover userdata finalizers incrementally during the sweep phase instead of walking the userdata candidate list during atomic. These counters are contract-bound to this fork and are intended for validating that always-on mode. sweep_udata_preserved counts actual current-white preservation operations. The userdata itself is still preserved unconditionally before queueing. Metatables and collectable immediate __gc values are preserved only when runtime liveness shows they are still dead in the post-atomic sweep-udata window; already-live objects are counted as alive skips, and non-collectable __gc values are counted as no-preserve cases. The detailed fields are sweep_udata_preserve_udata, sweep_udata_preserve_mt_dead, sweep_udata_preserve_mt_alive_skip, sweep_udata_preserve_callable_dead, sweep_udata_preserve_callable_alive_skip, and sweep_udata_preserve_callable_nongc.

Stats-enabled builds also expose finalizer_direct_cfunc_* counters for the mandatory direct zero-upvalue C finalizer ABI, plus finalizer_nonresurrecting_cfunc_frees and finalizer_nonresurrecting_cfunc_fallbacks for the mandatory non-resurrecting direct-free ABI.

Counter groups include allocator calls and bytes (alloc_*, free_*, realloc_*), object allocation (new_gcobj_calls), incremental step and cycle progress (step_calls, cycle_count, fullgc_calls), marking and sweeping work (propagate_*, atomic_calls, sweep_*), finalizer activity (finalizer_scan_steps, finalizer_queued, finalizer_calls), weak table processing, write barriers, and JIT-forced exits. These counters are intended for comparing runs and investigating GC behavior; they should not be treated as exact semantic event counts for every allocation or object lifetime edge case.

Stats are disabled by default. Stats-enabled builds add counter writes on GC and allocation paths.
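
Since the counters are meant for comparing runs, a typical use is to diff two snapshots around a workload. The following is a minimal sketch: stats_delta and the allocation loop are hypothetical, and the jit.gcstats calls are guarded because the function is only registered in stats-enabled builds.

```lua
-- Subtract two snapshots field by field; only numeric fields are compared.
local function stats_delta(before, after)
  local delta = {}
  for name, value in pairs(after) do
    if type(value) == "number" then
      delta[name] = value - (before[name] or 0)
    end
  end
  return delta
end

if jit and jit.gcstats then
  local before = jit.gcstats()
  local junk = {}
  for i = 1, 1e5 do junk[i] = { i } end  -- hypothetical workload
  local after = jit.gcstats()
  local d = stats_delta(before, after)
  print("alloc_calls delta:", d.alloc_calls)
  print("step_calls delta:", d.step_calls)
end
```

Passing a truthy reset argument before the workload is an alternative to diffing, since the next snapshot then starts from a zeroed baseline.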

Related local tools:

  • tools/gc-validation-matrix.sh runs a conservative local matrix for GC-sensitive changes, including baseline, GC-stats, no-FFI, and assertion/API check legs.
  • bench/gcstats.lua runs diagnostic allocation scenarios using jit.gcstats(). It requires a stats-enabled build and should be used to compare deltas across builds/configurations, not as a standalone performance claim.

To compare this fork's mandatory sweep-time behavior against stock LuaJIT, rebuild and run the same focused benchmark filter with the same workload on each build.

make clean && make XCFLAGS='-DLUAJIT_ENABLE_LUA52COMPAT -DLUAJIT_ENABLE_GCSTATS'
./src/luajit bench/gcstats.lua --iterations 10000 --filter sweep-udata-

Sweep-udata stats fields are shown as n/a by the benchmark only when comparing against another LuaJIT build that does not carry this fork's mandatory sweep-time finalizer contract.

Back to TOC

math geometry

This fork exposes a small FFI-backed geometry surface under the standard math.* table. These are LuaJIT-side math values, not OpenMW/Sol userdata: engine APIs do not automatically accept them unless the engine adds explicit compatibility glue. No new build flag controls these APIs, but they are absent from builds compiled with -DLUAJIT_DISABLE_FFI.

Exposed constructors and helpers:

  • math.vector2, math.vector3, math.vector4
  • math.immutableVector2, math.immutableVector3, math.immutableVector4
  • vector property swizzles such as v.x, v.xy, or v.yxw; there is no :swizzle() method
  • math.box
  • math.transform.identity, math.transform.move, math.transform.scale, math.transform.rotate, math.transform.rotateX, math.transform.rotateY, and math.transform.rotateZ
  • math.color.rgb, math.color.rgba, math.color.hex, and math.color.commaString

The math.vector2/3/4 constructors intentionally create mutable vectors by default. Use the math.immutableVector2/3/4 constructors when immutable vector values are required. Single-component swizzles return a number; multi-component swizzles allocate a new vector cdata value matching the receiver's mutable or immutable kind.

Some convenience accessors allocate. In particular, box.vertices allocates a Lua table and eight immutable vectors. box.transform and color:asRgb()/color:asRgba() allocate new values.

Transform multiplication follows normal matrix composition order: A * B * v applies B to v first, then applies A. TransformM:inverse() throws for finite but non-invertible matrices.

Numeric geometry constructors perform type and arity checks, but generally do not finite-check numeric inputs. Vector, transform, and box values may carry NaNs and infinities through later math operations rather than laundering them into zero, identity, or another safe sentinel. Color constructors clamp normal numeric components to [0, 1]; NaN components remain visible. Engine boundaries should reject non-finite values before accepting them.

Back to TOC

thread.exdata

syntax: exdata = th_exdata(data?)

This API allows for embedding user data into a thread (lua_State).

The exdata value retrieved in Lua land is represented as a cdata object of the ctype void*.

As of this version, retrieving the exdata (i.e. th_exdata() without any argument) can be JIT compiled.

Usage:

local th_exdata = require "thread.exdata"

th_exdata(0xdeadbeefLL)  -- set the exdata of the current Lua thread
local exdata = th_exdata()  -- fetch the exdata of the current Lua thread

The following public C API functions are also available for manipulating exdata in C land:

void lua_setexdata(lua_State *L, void *exdata);
void *lua_getexdata(lua_State *L);

The exdata pointer is initialized to NULL when the main thread is created. Any child Lua thread inherits its parent's exdata, but can still override it.

Note: This API will not be available if LuaJIT is compiled with -DLUAJIT_DISABLE_FFI.

Note bis: This API is used internally by the OpenResty core, and it is strongly discouraged to use it yourself in the context of OpenResty.

Back to TOC

thread.exdata2

syntax: exdata = th_exdata2(data?)

Similar to thread.exdata but for a 2nd separate user data as a pointer value.

Back to TOC

New C API

lua_setexdata

void lua_setexdata(lua_State *L, void *exdata);

Sets extra user data as a pointer value to the current Lua state or thread.

Back to TOC

lua_getexdata

void *lua_getexdata(lua_State *L);

Gets the extra user data as a pointer value from the current Lua state or thread.

Back to TOC

lua_setexdata2

void lua_setexdata2(lua_State *L, void *exdata2);

Similar to lua_setexdata but for a 2nd user data (pointer) value.

Back to TOC

lua_getexdata2

void *lua_getexdata2(lua_State *L);

Similar to lua_getexdata but for a 2nd user data (pointer) value.

Back to TOC

lua_resetthread

void lua_resetthread(lua_State *L, lua_State *th);

Resets the state of th to the initial state of a newly created Lua thread object as returned by lua_newthread(). This is mainly for Lua thread recycling. Lua threads in arbitrary states (like yielded or errored) can be reset properly.

The current implementation does not shrink the already allocated Lua stack though. It only clears it.

Back to TOC

LUA_GCSETSTEPSIZE

lua_gc(L, LUA_GCSETSTEPSIZE, kb);

Sets the incremental GC step-size quantum in KiB and returns the previous value in KiB. The corresponding Lua API is collectgarbage("setstepsize", kb).

The default is 1 KiB, matching the old fixed quantum. Non-positive inputs are clamped to 1 KiB. Very large inputs are clamped to the implementation maximum (currently 64 MiB). The step size is a pacing quantum used with the existing setstepmul multiplier and step work requests; it is not a memory limit and does not change the collector into a generational collector.

For example, collectgarbage("setstepsize", 0) returns the previous KiB value and stores 1, while collectgarbage("setstepsize", 1024 * 1024) returns the previous KiB value and stores the implementation maximum.
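
The clamping rules can be modeled in a few lines. clamp_stepsize_kb is a hypothetical pure-Lua model of the documented behavior, not the actual implementation, and the 64 MiB constant is the value quoted above as the current maximum. The real call is wrapped in pcall because stock Lua and LuaJIT reject the "setstepsize" option.

```lua
-- Documented maximum, expressed in KiB (64 MiB).
local MAX_STEPSIZE_KB = 64 * 1024

-- Hypothetical model of the documented clamping.
local function clamp_stepsize_kb(kb)
  if kb < 1 then return 1 end
  if kb > MAX_STEPSIZE_KB then return MAX_STEPSIZE_KB end
  return kb
end

print(clamp_stepsize_kb(0))            -- 1
print(clamp_stepsize_kb(256))          -- 256
print(clamp_stepsize_kb(1024 * 1024))  -- 65536

-- Guarded call: only succeeds on builds that know "setstepsize".
local ok, previous_kb = pcall(collectgarbage, "setstepsize", 256)
if ok then
  print("previous step size (KiB):", previous_kb)
  collectgarbage("setstepsize", previous_kb)  -- restore the old value
end
```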

Back to TOC

New macros

The macros described in this section have been added to this branch.

Back to TOC

OPENRESTY_LUAJIT

In the luajit.h header file, a new macro OPENRESTY_LUAJIT is defined to help distinguish this OpenResty-specific branch of LuaJIT.

HAVE_LUA_RESETTHREAD

This macro is set when the lua_resetthread C API is present.

Back to TOC

Optimizations

Updated JIT default parameters

We use more aggressive default JIT compiler options to help large OpenResty Lua applications.

The following jit.opt options are used by default:

maxtrace=8000
maxrecord=16000
minstitch=3
maxmcode=40960  -- in KB
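
For comparison runs on a LuaJIT build that does not carry these defaults, the same values can be requested at runtime through jit.opt.start. This is a sketch only: it is guarded for interpreters without the jit library, and wrapped in pcall because older LuaJIT versions may not recognize every parameter (for example minstitch).

```lua
-- Request the same parameters explicitly; a no-op without the jit library.
local applied = false
if jit and jit.opt then
  applied = pcall(jit.opt.start,
    "maxtrace=8000",
    "maxrecord=16000",
    "minstitch=3",
    "maxmcode=40960")
end
print("JIT parameters applied:", applied)
```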

Back to TOC

String hashing

This optimization only applies to Intel CPUs supporting the SSE 4.2 instruction set. For such CPUs, and when this branch is compiled with -msse4.2, the string hashing function used for string interning is based on an optimized crc32 implementation (see lj_str_new()).

This optimization still provides constant-time hashing complexity (O(1)), but makes hash collision attacks harder for strings up to 127 bytes in size.

Back to TOC

Compile-time unpack() optimization

This fork includes TurkeyMcMac's JIT optimization for compiling unpack() calls when the start and end indexes are constants, from https://github.com/TurkeyMcMac/LuaJIT/tree/compile-unpack.

Back to TOC

Metatable specialization

This fork includes TurkeyMcMac's JIT metatable specialization work from https://github.com/TurkeyMcMac/LuaJIT/tree/mtspec.

Back to TOC

Global environment specialization

This fork includes TurkeyMcMac's JIT specialization for global environment table accesses from https://github.com/TurkeyMcMac/LuaJIT/tree/globalspec. The recorder can guard on the function environment table identity for global reads and stores, while preserving normal table access semantics for value changes, environment replacement, missing globals, and __index/__newindex metamethod behavior.

Back to TOC

Improved allocation sinking

This fork includes XmiliaH's improved allocation sinking work from https://github.com/XmiliaH/LuaJIT/tree/improved-sinking. The port preserves this fork's snapshot restore fixes and uses conservative fallbacks for heavy nested sunk allocations when identity-index or raw FFI restore invariants would be at risk.

Back to TOC

Static userdata finalizer scan contract

This fork treats userdata finalizability as static once a userdata object has survived a GC finalizer-candidate scan. If the userdata metatable does not have __gc at that scan, the object is removed from the userdata finalizer-candidate chain and is kept only on the normal GC object list. It remains a normal collectable userdata object and is swept normally, but later adding __gc to that userdata metatable is unsupported and may not run a finalizer.

Adding __gc before the userdata reaches its first finalizer-candidate scan is still supported. Mutating userdata metatables or adding __gc after that point, including through debug, FFI, sandbox bypass access, or similar privileged mechanisms, is undefined for this build.

This is a restricted semantic and performance contract chosen by this fork to avoid repeatedly scanning long-lived userdata that cannot currently be finalized. It should not be treated as stock or general LuaJIT finalizer behavior.

Sweep-phase userdata finalizer discovery is mandatory in this OpenMW-oriented fork branch. This is intentional for static/native/leaf userdata finalizers common in OpenMW/native userdata workloads. The mode is contract-bound to this fork and is not stock LuaJIT behavior. In this mode, normal GC cycles skip the atomic userdata candidate walk and instead process the candidate segment incrementally before string/root sweeping.

This mode is only for static, native/leaf userdata finalizers. Late mutation of userdata metatables or __gc after a candidate has been scanned is still unsupported, including changes through debug, FFI, sandbox bypass access, or similar privileged mechanisms. Dead finalizable userdata themselves are always preserved until finalization. Their metatable and collectable immediate __gc value are preserved only if they are still dead when the post-atomic sweep-udata step queues the userdata; if they are already live, no additional preservation is needed. Arbitrary Lua closure dependency graphs are outside the contract. Shutdown may still use the traditional close-time separation path.

This changes an observable weak-key edge case from stock LuaJIT. Weak-key associations whose keys are dying finalizable userdata may be cleared before the userdata finalizers run in this mode. This is intentional under the static/native/leaf finalizer contract. Code using this mode must not rely on weak keys retaining dying finalizable userdata through finalizer execution. Weak-table torture cases that depend on arbitrary Lua __gc closure dependency graphs being discovered and preserved exactly like stock LuaJIT across tiny incremental GC steps are likewise outside this fork contract.

Back to TOC

Direct zero-upvalue C finalizer ABI

This fork branch always includes a dangerous direct zero-upvalue C finalizer ABI for embedders that own all userdata finalizers covered by it. It is not Lua-compatible generic finalizer behavior and is unsafe for applications that allow arbitrary Lua code or third-party C modules to install __gc handlers.

A full userdata whose __gc metamethod is a zero-upvalue C function may be called directly by the collector instead of through the normal protected finalizer path. The callable must obey this complete ABI contract:

  • the object is full userdata only;
  • the __gc callable is a zero-upvalue C function;
  • the userdata is passed at positive stack index 1;
  • the finalizer returns 0;
  • the finalizer must not throw or call lua_error;
  • the finalizer must not yield;
  • the finalizer must not call Lua or re-enter Lua;
  • the finalizer must not depend on upvalues or LUA_ENVIRONINDEX;
  • the finalizer must not depend on protected finalizer error handling.

Violating this contract can crash or corrupt the runtime. Use it only when the embedder owns every such finalizer and has audited them as native leaf destructors with simple stack use.

Before relying on this ABI in production, build with LUAJIT_ENABLE_GCSTATS and check finalizer telemetry on representative workloads. In particular, review finalizer_cfunc_nup0_calls, finalizer_direct_cfunc_calls, finalizer_direct_cfunc_nonzero_results, finalizer_cfunc_upvalue_calls, finalizer_lfunc_calls, finalizer_ffunc_calls, and finalizer_other_calls to confirm that the finalizers expected to use the direct ABI are zero-upvalue C functions returning zero, and that other finalizer kinds are understood.

Back to TOC

Non-resurrecting direct-free C finalizer ABI

This fork branch always includes an even narrower and more dangerous non-resurrecting direct-free ABI for embedders that can prove their native userdata finalizers never resurrect the userdata. It builds on the mandatory direct zero-upvalue C finalizer ABI and this fork's mandatory sweep-time finalizer-discovery contract. It is not compatible with the stock atomic userdata-finalizer discovery path because that path can preserve weak-key entries whose finalized userdata keys would otherwise be freed immediately by this ABI.

This is not sweep-time immediate finalization. Userdata are still discovered, queued on mmudata, and finalized during the normal finalizer phase. This ABI only changes normal GC GCSfinalize processing after sweep-time discovery and weak clearing. It is not used by the lua_close() finalizer drain. For an eligible full userdata selected from the normal GC queue, if its __gc metamethod is a zero-upvalue C function accepted by the direct C finalizer ABI, LuaJIT does not put the userdata back on the GC root list for possible resurrection. It calls the direct C finalizer and immediately frees the userdata with lj_udata_free.

Pre-call eligibility is intentionally identical to the direct C finalizer ABI's shape checks: full userdata only and zero-upvalue C __gc only, called with the userdata at stack index 1. The finalizer is still required to return 0, not throw, not yield, not re-enter Lua, not depend on upvalues or environments, and not depend on protected error behavior. A nonzero return count is a contract violation that can only be observed after the unprotected call; it is reported by telemetry and does not safely turn the call into a fallback. Cdata finalizers, Lua finalizers, C closures with upvalues, fast functions, callable tables, and all other generic finalizers fall back to the normal protected/resurrection-capable path.

Violating the no-resurrection contract can leave a live Lua reference to freed userdata and corrupt the runtime. Use this only when the embedder owns and audits every zero-upvalue C userdata finalizer covered by the ABI. With LUAJIT_ENABLE_GCSTATS, validate finalizer_nonresurrecting_cfunc_frees, finalizer_nonresurrecting_cfunc_fallbacks, finalizer_direct_cfunc_nonzero_results, and the finalizer dispatch counters on representative workloads before relying on this ABI in production.

Back to TOC

Updated bytecode options

New -bL option

The bytecode option L was added to display Lua source line numbers.

For example, luajit -bL -e 'print(1)' now produces bytecode dumps like the following:

-- BYTECODE -- "print(1)":0-1
0001     [1]    GGET     0   0      ; "print"
0002     [1]    KSHORT   1   1
0003     [1]    CALL     0   1   2
0004     [1]    RET0     0   1

The [N] column corresponds to the Lua source line number. For example, [1] means "the first source line".

Back to TOC

Updated -bl option

The bytecode option l was updated to display the constant tables of each Lua prototype.

For example, luajit -bl a.lua now produces bytecode dumps like the following:

-- BYTECODE -- a.lua:0-48
KGC    0    "print"
KGC    1    "hi"
KGC    2    table
KGC    3    a.lua:17
KN    1    1000000
KN    2    1.390671161567e-309
...

Back to TOC

Miscellaneous

  • Increased the maximum number of allowed upvalues from 60 to 120.
  • Various important bugfixes in the JIT compiler and Lua VM which have not been merged in upstream LuaJIT.
  • Removed the GCC 4 requirement for x86 on older systems such as Solaris i386.
  • In the Makefile, make sure we always install the symlink for "luajit" even for alpha or beta versions.
  • Applied a patch to fix DragonFlyBSD compatibility. Note: this is not an officially supported target.
  • feature: jit.dump: output Lua source location after every BC.
  • feature: added internal memory-buffer-based trace entry/exit/start-recording event logging, mainly for debugging bugs in the JIT compiler. It requires -DLUA_USE_TRACE_LOGS when building LuaJIT.
  • feature: save g->jit_base to g->saved_jit_base before lj_err_throw clears g->jit_base, which would otherwise make it impossible to get a Lua backtrace in such states.

Back to TOC

Copyright & License

LuaJIT is a Just-In-Time (JIT) compiler for the Lua programming language.

Project Homepage: http://luajit.org/

LuaJIT is Copyright (C) 2005-2019 Mike Pall.

Additional patches for OpenResty are copyrighted by Yichun Zhang and OpenResty Inc.:

Copyright (C) 2017-2019 Yichun Zhang. All rights reserved.

Copyright (C) 2017-2019 OpenResty Inc. All rights reserved.

LuaJIT is free software, released under the MIT license. See full Copyright Notice in the COPYRIGHT file or in luajit.h.

Documentation for the official LuaJIT is available in HTML format. Please point your favorite browser to:

doc/luajit.html

Back to TOC
