This content originally appeared on DEV Community and was authored by NDREAN
We present here a quick write-up to illustrate a simple way to distribute the local ETS database, a fast in-memory database built into the Erlang ecosystem.
Why would you choose ETS and not Mnesia, the built-in distributed solution? The main reason is the setup of Mnesia in a dynamic environment such as Kubernetes: getting synchronisation and proper startup right on node discovery can be difficult.
Mnesia has two modes, memory or disk. ETS has a disk-based alter ego, DETS.
Note: you can still make a disk copy of an ETS table. If the table name is `:users`, then the command `:ets.tab2file(:users, 'data.txt')` will save it to disk (note the single quotes: the Erlang function expects a charlist, not an Elixir string).
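As a minimal sketch, the round trip to disk and back could look like this (assuming the `:users` table already exists):

```elixir
# dump the table to disk; Erlang wants a charlist filename
:ok = :ets.tab2file(:users, 'data.txt')

# ...later, e.g. after a restart, restore it
{:ok, :users} = :ets.file2tab('data.txt')
```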
Why would you need distributed data? When you interact with a "scaled" web app, you don't know which node you will reach, so every node/pod should serve the same data. Some data can be kept client-side (e.g. your shopping cart), but some must live server-side, such as client credentials or a cache.
Some ways to manage fast-read data:
- make the app stateless by using an external database such as Redis. This however comes with a noticeable overhead, especially when you want to secure it and run a Redis cluster;
- save the data locally and implement sticky sessions with the load balancer, or an ingress affinity in the case of Kubernetes;
- cluster your nodes. The local database then needs to be synchronised, whether you sit behind a reverse proxy or not, and whether you use Kubernetes or not.
The drawback of sticky sessions is that you may not be able to spread the load among the nodes, because only new users will reach the new nodes.
We will take this last route since the BEAM (Erlang's virtual machine, which runs Elixir/Phoenix) natively supports clustering. Once the nodes are clustered, we get the pg module with PubSub for free, a second major point of Erlang's ecosystem. Thus: no single point of failure and no external dependency.
For our use case, using node discovery with libcluster and synchronising ETS via PubSub on node startup was easy and reliable. We can comfortably follow 1,000 members.
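For reference, a minimal libcluster setup might look like the sketch below; the topology name and the Gossip strategy are assumptions (a Kubernetes deployment would rather use one of the Kubernetes strategies):

```elixir
# config/config.exs
config :libcluster,
  topologies: [
    myapp: [strategy: Cluster.Strategy.Gossip]
  ]
```

The cluster supervisor is then added to the application's children:

```elixir
# in MyApp.Application.start/2
topologies = Application.get_env(:libcluster, :topologies)

children = [
  {Cluster.Supervisor, [topologies, [name: MyApp.ClusterSupervisor]]}
  # ...
]
```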
Some code. We use a GenServer to wrap all the calls to the database and to PubSub. We don't keep state - in fact, the state is just `[]` - but rather use the messaging capabilities between processes that the GenServer offers.
In the `init` function, we create the database and subscribe to a topic. We also ask Erlang's `:net_kernel` to monitor the nodes. With this listener, we broadcast the whole database to every connected node on the `:nodeup` event. We also broadcast every new write to the database. Since the ETS table is a `set`, there will be no duplicates and no data lost.
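For instance, re-inserting a row with the same key in a `set` table simply overwrites it, so replaying a broadcast is harmless:

```elixir
iex> :ets.new(:demo, [:set, :named_table])
iex> :ets.insert(:demo, {"jo@mail.com", "t1"})
iex> :ets.insert(:demo, {"jo@mail.com", "t2"})
iex> :ets.tab2list(:demo)
[{"jo@mail.com", "t2"}]
```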
Note: with this simple technique, if we preload a node with data, this data won't be broadcast to the other running nodes, since the new node doesn't receive a `:nodeup` event and therefore can't broadcast its data. To recover from this, one node needs to go down and come up again.
The code below shows how we store the session credentials, `{email, token, uuid, iss}`, in ETS and synchronise the tables on each node whenever we save or modify a client session, or when a new node joins the cluster.
```elixir
defmodule MyApp.Repo do
  use GenServer

  alias :ets, as: Ets
  require Logger

  @topic "sync_users"
  @sync_init 3_000

  def start_link(_opts \\ []),
    do: GenServer.start_link(__MODULE__, [], name: __MODULE__)

  def all,
    do: Ets.tab2list(:users)

  def backup,
    do: Ets.tab2file(:users, 'data.txt')

  def find_by_email(email),
    do: Ets.lookup(:users, email) |> List.first()

  def find_by_id(uuid),
    do: Ets.match_object(:users, {:_, :_, uuid, :_}) |> List.first()

  def save(email, token) do
    # build user = {email, token, uuid, iss} (elided)
    [...]
    GenServer.cast(__MODULE__, {:save_user, user})
  end

  @impl true
  def init([]) do
    # receive {:nodeup, node} / {:nodedown, node} messages
    :ok = :net_kernel.monitor_nodes(true)
    # a named public set: inserts with an existing key overwrite the row
    :users = Ets.new(:users, [:set, :public, :named_table, keypos: 1])
    :ok = Phoenix.PubSub.subscribe(MyApp.PubSub, @topic)
    Logger.info("ETS table started...")
    {:ok, []}
  end

  # a node joined the cluster: schedule a full sync after a short delay
  @impl true
  def handle_info({:nodeup, node}, _) do
    Logger.info("Node UP #{node}")
    Process.send_after(self(), {:perform_sync, []}, @sync_init)
    {:noreply, []}
  end

  # a node left: nothing to sync, but the message must be consumed
  @impl true
  def handle_info({:nodedown, node}, _) do
    Logger.info("Node DOWN #{node}")
    {:noreply, []}
  end

  # push our whole table to the other nodes
  @impl true
  def handle_info({:perform_sync, []}, _) do
    Phoenix.PubSub.broadcast_from(MyApp.PubSub, self(), @topic, {:sync, all()})
    {:noreply, []}
  end

  # receive a full table from another node
  @impl true
  def handle_info({:sync, messages}, _) do
    Ets.insert(:users, messages)
    {:noreply, []}
  end

  # receive a single write from another node
  @impl true
  def handle_info({:save, message}, _) do
    Ets.insert(:users, message)
    {:noreply, []}
  end

  # local write: insert, then broadcast to the other nodes
  @impl true
  def handle_cast({:save_user, message}, _) do
    Ets.insert(:users, message)
    Phoenix.PubSub.broadcast_from(MyApp.PubSub, self(), @topic, {:save, message})
    {:noreply, []}
  end
end
```
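Assuming `save/2` fills in the `uuid` and `iss` fields (elided above), usage from any node of the cluster could look like:

```elixir
# on node A
MyApp.Repo.save("jo@mail.com", "token-123")

# a moment later, on node B
MyApp.Repo.find_by_email("jo@mail.com")
#=> {"jo@mail.com", "token-123", uuid, iss}
```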
A word about configuration
The PubSub is supervised with:

```elixir
{Phoenix.PubSub, name: MyApp.PubSub, adapter: Phoenix.PubSub.PG2}
```
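Our Repo GenServer also needs to be started somewhere. A minimal sketch of the application's supervision tree (the supervisor name is an assumption) could be:

```elixir
# in MyApp.Application.start/2
children = [
  {Phoenix.PubSub, name: MyApp.PubSub, adapter: Phoenix.PubSub.PG2},
  MyApp.Repo
]

Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
```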
The endpoint config is:

```elixir
# config.exs
config :my_app, MyAppWeb.Endpoint,
  [...],
  pubsub_server: MyApp.PubSub
```