This content originally appeared on DEV Community and was authored by NDREAN
We present here a quick write-up to illustrate a simple way to distribute the local ETS database, a fast in-memory database built into the Erlang ecosystem.
Why would you choose ETS and not Mnesia, the built-in distributed solution? The main reason is the setup of Mnesia in a dynamic environment such as Kubernetes: getting synchronisation and proper startup right on node discovery can be difficult.
Mnesia has two modes, memory or disk. ETS has a disk-based alter ego, DETS.
Note: you can still make a disk copy of an ETS table. If the table name is `:users`, then the command `:ets.tab2file(:users, 'data.txt')` will save it to disk (note the single quotes: the Erlang function expects a charlist, not an Elixir string).
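As a minimal sketch, the round trip to disk and back could look like this (assuming the `:users` table already exists):

```elixir
# dump the table to disk; Erlang wants a charlist filename
:ok = :ets.tab2file(:users, 'data.txt')

# ...later, e.g. after a restart, restore it
{:ok, :users} = :ets.file2tab('data.txt')
```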
Why would you need distributed data? When you interact with a "scaled" web app, you don't know which node you will reach, so every node/pod should serve the same data. Some data can be kept client-side (e.g. your shopping cart), but some must live server-side, such as client credentials or a cache.
Some ways to manage fast-read data:
- make the app stateless by using an external database such as Redis. This however comes with a noticeable overhead, especially when you want to secure it and run a Redis cluster;
- save the data locally and implement sticky sessions with the load balancer, or an ingress affinity in the case of Kubernetes;
- cluster your nodes. The local database then needs to be synchronised, whether you sit behind a reverse proxy or not, and whether you use Kubernetes or not.
The drawback of sticky sessions is that you may not be able to spread the load among the nodes, because only new users will reach the new nodes.
We will take this last route since the BEAM (Erlang's virtual machine, which runs Elixir/Phoenix) natively supports clustering. Once the nodes are clustered, we get the pg module with PubSub for free, a second major point of Erlang's ecosystem. Thus: no single point of failure and no external dependency.
For our use case, using node discovery with libcluster and synchronising ETS via PubSub on node startup was easy and reliable. We can comfortably follow 1,000 members.
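For reference, a minimal libcluster setup might look like the sketch below; the topology name and the Gossip strategy are assumptions (a Kubernetes deployment would rather use one of the Kubernetes strategies):

```elixir
# config/config.exs
config :libcluster,
  topologies: [
    myapp: [strategy: Cluster.Strategy.Gossip]
  ]
```

The cluster supervisor is then added to the application's children:

```elixir
# in MyApp.Application.start/2
topologies = Application.get_env(:libcluster, :topologies)

children = [
  {Cluster.Supervisor, [topologies, [name: MyApp.ClusterSupervisor]]}
  # ...
]
```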
Some code. We use a GenServer to wrap all the calls to the database and to PubSub. We don't keep state - in fact, the state is just `[]` - but rather use the messaging capabilities between processes that the GenServer offers.
In the `init` function, we create the database and subscribe to a topic. We also ask Erlang's `:net_kernel` to monitor the nodes. With this listener, we broadcast the whole database to every connected node on the `:nodeup` event. We also broadcast every new write to the database. Since the ETS table is a `set`, there will be no duplicates and no data lost.
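For instance, re-inserting a row with the same key in a `set` table simply overwrites it, so replaying a broadcast is harmless:

```elixir
iex> :ets.new(:demo, [:set, :named_table])
iex> :ets.insert(:demo, {"jo@mail.com", "t1"})
iex> :ets.insert(:demo, {"jo@mail.com", "t2"})
iex> :ets.tab2list(:demo)
[{"jo@mail.com", "t2"}]
```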
Note: with this simple technique, if we preload a node with data, this data won't be broadcast to the other running nodes, since the new node doesn't receive a `:nodeup` event and therefore can't broadcast its data. To recover from this, one node needs to go down and come up again.
The code below shows how we store the session credentials, `{email, token, uuid, iss}`, in ETS and synchronise the tables on each node whenever we save or modify a client session, or when a new node joins the cluster.
```elixir
defmodule MyApp.Repo do
  use GenServer

  alias :ets, as: Ets
  require Logger

  @topic "sync_users"
  @sync_init 3_000

  def start_link(_opts \\ []),
    do: GenServer.start_link(__MODULE__, [], name: __MODULE__)

  def all,
    do: Ets.tab2list(:users)

  def backup,
    do: Ets.tab2file(:users, 'data.txt')

  def find_by_email(email),
    do: Ets.lookup(:users, email) |> List.first()

  def find_by_id(uuid),
    do: Ets.match_object(:users, {:_, :_, uuid, :_}) |> List.first()

  def save(email, token) do
    # build user = {email, token, uuid, iss} (elided)
    [...]
    GenServer.cast(__MODULE__, {:save_user, user})
  end

  @impl true
  def init([]) do
    # receive {:nodeup, node} / {:nodedown, node} messages
    :ok = :net_kernel.monitor_nodes(true)
    # a named public set: inserts with an existing key overwrite the row
    :users = Ets.new(:users, [:set, :public, :named_table, keypos: 1])
    :ok = Phoenix.PubSub.subscribe(MyApp.PubSub, @topic)
    Logger.info("ETS table started...")
    {:ok, []}
  end

  # a node joined the cluster: schedule a full sync after a short delay
  @impl true
  def handle_info({:nodeup, node}, _) do
    Logger.info("Node UP #{node}")
    Process.send_after(self(), {:perform_sync, []}, @sync_init)
    {:noreply, []}
  end

  # a node left: nothing to sync, but the message must be consumed
  @impl true
  def handle_info({:nodedown, node}, _) do
    Logger.info("Node DOWN #{node}")
    {:noreply, []}
  end

  # push our whole table to the other nodes
  @impl true
  def handle_info({:perform_sync, []}, _) do
    Phoenix.PubSub.broadcast_from(MyApp.PubSub, self(), @topic, {:sync, all()})
    {:noreply, []}
  end

  # receive a full table from another node
  @impl true
  def handle_info({:sync, messages}, _) do
    Ets.insert(:users, messages)
    {:noreply, []}
  end

  # receive a single write from another node
  @impl true
  def handle_info({:save, message}, _) do
    Ets.insert(:users, message)
    {:noreply, []}
  end

  # local write: insert, then broadcast to the other nodes
  @impl true
  def handle_cast({:save_user, message}, _) do
    Ets.insert(:users, message)
    Phoenix.PubSub.broadcast_from(MyApp.PubSub, self(), @topic, {:save, message})
    {:noreply, []}
  end
end
```
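Assuming `save/2` fills in the `uuid` and `iss` fields (elided above), usage from any node of the cluster could look like:

```elixir
# on node A
MyApp.Repo.save("jo@mail.com", "token-123")

# a moment later, on node B
MyApp.Repo.find_by_email("jo@mail.com")
#=> {"jo@mail.com", "token-123", uuid, iss}
```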
A word about configuration
The PubSub is supervised with:

```elixir
{Phoenix.PubSub, name: MyApp.PubSub, adapter: Phoenix.PubSub.PG2}
```
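Our Repo GenServer also needs to be started somewhere. A minimal sketch of the application's supervision tree (the supervisor name is an assumption) could be:

```elixir
# in MyApp.Application.start/2
children = [
  {Phoenix.PubSub, name: MyApp.PubSub, adapter: Phoenix.PubSub.PG2},
  MyApp.Repo
]

Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
```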
The endpoint config is:

```elixir
# config.exs
config :my_app, MyAppWeb.Endpoint,
  [...],
  pubsub_server: MyApp.PubSub
```