For a couple of years now I’ve been programming a lot in Lua. One of the things I had a hard time with was the relation between states, threads (coroutines in Lua) and memory layout. This is mostly simple when doing some basic scripting, but once you start writing C libraries or library bindings you really need to be aware of all the nitty-gritty details. Now throw in some multi-threading and, if you’re not careful, you’ll get yourself in trouble in no time.
How does all this work?
Here’s a diagram I drew that would have helped me big time had it been available when I needed it (it wasn’t, so I drew it myself).
- The application embeds the Lua environment. In the case of the standalone Lua implementation, it’s just a wrapper around the Lua engine.
- The Lua engine is the code part of the Lua environment; it includes the compiler, the virtual machine, etc. It holds neither data nor Lua code; all its data is always stored in the Lua states. From an OS perspective this makes it easy to run multiple Lua states (on the same Lua engine) in parallel on different OS threads.
- Looking at a state from the perspective of a piece of Lua code, it is basically its universe. All Lua code runs within its own LuaState and cannot extend beyond it (unless specialized libraries such as Lua Lanes, Rings or the like are used).
- Multiple states can run in parallel on top of the same Lua engine
- A LuaState is not thread safe; only a single OS thread can safely access a given LuaState at any one time. Multiple states can run on separate OS threads on top of the same Lua engine.
- In C, the state itself is represented as ‘lua_State *L’. It is just the same as a coroutine, except that in this case (as the main thread) it has the global environment attached to it.
- Each LuaState has its own shared global environment accessible by all Lua code in that LuaState (except for the registry, which is not accessible from Lua)
- Access to the global environment can be manipulated to some extent
- The registry can be considered part of the global environment of a LuaState, but can only be accessed from C using the API. The registry is global, so it can be accessed from all C modules
- The global environment is created together with the main thread (coroutine); these are L1 and L5 in the picture.
- Each coroutine gets its own execution stack.
- Coroutines are the Lua equivalent of threads, and they are often called threads (which is a common cause of confusion for newcomers). Lua uses cooperative multitasking, not the pre-emptive multitasking most OSes use.
- The main thread in a LuaState (e.g. L1 and L5) also behaves as a coroutine. It has its own execution stack; the only thing it cannot do is call yield() to suspend execution and hand control back to its calling coroutine, because it has none.
‘lua_State *L’ parameter in API calls
- Whenever a C function is called (either a library, or an API call) a reference to the originating lua_State is passed.
- It is tempting to think of this reference as an ID to the LuaState, but it is not. It is a reference to a coroutine (or execution stack), so in the picture this could be any of the L1 to L8 coroutines (this has been updated in the 5.2 reference manual over the 5.1 version).
- In Lua 5.2 it is possible to get the L reference to the main thread of a LuaState (see LUA_RIDX_MAINTHREAD), which can serve as the identifier for a state. Lua 5.1 has no means of doing this. To identify the LuaState it is running in under Lua 5.1, a library must store some identifier in the registry, in an upvalue or in a function environment (function environments have been deprecated in Lua 5.2).
- Upvalues are stored in a LuaState, whereas static variables are part of the C library. Neither can be accessed from Lua, only from C. C libraries are loaded once (similar to the Lua engine), but can be used from multiple LuaStates.
- Upvalues are local to the combination of a LuaState and a C library
- Static variables are local to a C library, but shared by all LuaStates
Where to store and access data
In general, data should be stored in the LuaState. If you store data locally in the C library (in static variables), you need to take precautions so that the library can still be used with multiple Lua states. When multithreading is involved, this shared data also requires locks for safe access.
Here’s a table showing access to library data, depending on where it is stored and from where it is being accessed:
IMPORTANT: all of the above assumes ‘end-user’ Lua. So on several occasions where it is mentioned that something is not accessible from Lua, it might very well be possible to access it through the debug library (e.g. the registry through debug.getregistry()).
Once you get the way it works, it isn’t all that hard. It’s just that the official docs are sometimes hard to grasp if you don’t get the definitions entirely right. And especially for newcomers that is pretty hard (at least that’s what I found).
If you happen to find an error or an omission, please drop me a note below.