Ecto: An Introduction to Elixir's Database Toolkit

Gints Dreimanis
Article by Gints Dreimanis
November 16th, 2021
14 min read

If you have tried your hand at doing web development with Phoenix, you have definitely run into Ecto. It’s the go-to database library for Elixir programmers. Unfortunately, Phoenix tutorials usually don’t cover Ecto deeply enough to develop an intuition of why things are the way they are.

1914xps7.pasted_image_0_(3)_(1).png

In this article, we’ll instead try to understand how Ecto works. First, we’ll use Ecto to play around with a basic blog database. After that, we’ll dive deeper into different Ecto’s modules and what they offer.

If this is the first time you encounter Elixir (or Phoenix), I suggest you start slow and check out our guides to Elixir and Phoenix.

What is Ecto?


Ecto logo

Ecto is the go-to database toolkit in the Elixir ecosystem, usually used to interact with SQL databases like Postgres and MySQL. It is very powerful and can be used for all the interaction with databases you need, such as inserting, validating, changing, and querying data.

It has four main components:

  • Schema. Schemas are maps from database tables into Elixir structs; the module provides everything you need to create them.
  • Changeset. Changesets help validate the data that you want to insert into the database or modify.
  • Repo. This is the main point through which any interaction with the database is called.
  • Query. A macro-powered DSL for doing composable queries with Elixir-like syntax.

You are free to use any combination of these modules in your application, according to your requirements. There is no need to use all of them.

What are the benefits of using Ecto?

Most likely, the choice of using Ecto will come as a consequence of other choices that you’ve made. If you are using Phoenix and Elixir, the chance is you will use Ecto. 🙃

Generally, Ecto offers the same benefits as other things in the Elixir ecosystem: we focus on modularity and explicitness and of course it has macros.

Explicit. Ecto is explicit about what is happening: there is not so much behind-the-scenes magic as in other libraries. For example, if you don’t preload associated items, the framework will not fetch them for you. While it can cause some headaches in the start, it prevents you from making mistakes that you will forever remember when your project goes viral on Product Hunt.

Configurable. Ecto is very flexible. For example, by default, Ecto is made to connect to SQL databases like MySQL & PostgreSQL. But you can technically use it to connect to whichever data source your heart prefers like MongoDB or ETS, among other things.

Pick-and-choose. While Ecto is frequently called a framework, it consists of a few modules, most of which you can use or not use, depending on your preferences. Schema-less queries, pure SQL instead of Ecto.Query, just using changesets for data validation – you want it, you can do it. I’m not judging.

Ecto tutorial: create a database for a blog

Okay, now that you have a sense of what Ecto is and what it is used for, let’s delve right in!

In this tutorial, we’ll set up a basic blog database for a Phoenix project using Ecto. It will contain users, posts, and comments.

Set up the project

First, let’s create a blank Phoenix project.

mix phx.new blog

After that, cd blog and run mix ecto.create. This will create the database for the project.

Design the database

In our database, we will have three tables.

  1. Users. It will have a field for username.
  2. Posts. It will have a field for post text. It will also have a foreign key referencing a user.
  3. Comments. It will have a field for comment text. It will also have a foreign key that references a user and a foreign key that references a post.

For the reasons of simplicity, we’ll skip categories, tags, titles, meta titles, and other fluff.

Create a database migration

First off, we need to generate a blank migration.

mix ecto.gen.migration initial

After that, open the migration file with a code editor. You can find it in priv/repo/migrations.

Let’s fill in the initial migration with our desired tables.

defmodule Blog.Repo.Migrations.Initial do
  use Ecto.Migration

  def change do
    create table ("users") do
      add :username, :string

      timestamps()
    end

    create table ("posts") do
      add :user_id, references (:users)
      add :post_text, :text

      timestamps()
    end

    create table ("comments") do
      add :user_id, references (:users)
      add :post_id, references (:posts)
      add :comment_text, :text

      timestamps()
    end
  end
end

This migration replicates the database structure we wanted, and also adds timestamps for each entry. You can read more about migrations in the documentation.

Create schemas

After that, we need to create schemas (here and below, I mean Ecto schemas, not SQL schemas), which will help Ecto understand what is in the database.

We won’t bother that much about the location of the schemas. In Phoenix, the usual architecture uses contexts, which are out of the scope of this article.

Create a lib/blog/schemas and create the following schemas inside it:

user.ex

defmodule Blog.Schemas.User do
  use Ecto.Schema
  import Ecto.Changeset

  schema "users" do
    field :username, :string
    has_many :post, Blog.Schemas.Post
    has_many :comment, Blog.Schemas.Comment

    timestamps()
  end

  def changeset(user, params \\ %{}) do
    user
    |> cast(params, [:username])
    |> validate_required([:username])
  end
end

post.ex

defmodule Blog.Schemas.Post do
  use Ecto.Schema
  import Ecto.Changeset

  schema "posts" do
    field :post_text, :string

    has_many :comment, Blog.Schemas.Comment
    belongs_to :user, Blog.Schemas.User

    timestamps()
  end

  def changeset(post, params \\ %{}) do
    post
    |> cast(params, [:post_text])
    |> validate_required([:post_text])
  end
end

comment.ex

defmodule Blog.Schemas.Comment do
  use Ecto.Schema
  import Ecto.Changeset

  schema "comments" do
    field :comment_text, :string

    belongs_to :post, Blog.Schemas.Post
    belongs_to :user, Blog.Schemas.User

    timestamps()
  end

  def changeset(comment, params \\ %{}) do
    comment
    |> cast(params, [:comment_text])
    |> validate_required([:comment_text])
  end
end

Migrations vs. schemas

At some point, you might start to wonder why you are doing essentially the same thing twice. 🤔

While migrations are responsible for giving us the database tables, schemas are responsible for how Ecto looks at those tables. You can have multiple Ecto schemas for one table, depending on how you want to access it, for example. You can even have schemas without matching tables.

Try out working with the database

Now, we can play around with the database.

Let’s run the project via iex -S mix.

First off, some alias preparations:

iex(1)> alias Blog.Repo
Blog.Repo
iex(2)> alias Blog.Schemas.{User, Post, Comment}
[Blog.Schemas.User, Blog.Schemas.Post, Blog.Schemas.Comment]

After that, we can insert a user with a post.

iex(3)> {:ok, user} = Repo.insert(%User{username: "dostoevsky007"})

iex(4)> {:ok, post} = Repo.insert(%Post{post_text: "ALEXEY Fyodorovitch Karamazov was the third son of Fyodor Pavlovitch Karamazov, a landowner well known in our district in his own day, and still remembered among us owing to his gloomy and tragic death, which happened thirteen years ago, and which I shall describe in its proper place.", user: user})

After that, let’s insert a new user with a comment on the post.

iex(5)> {:ok, user2} = Repo.insert(%User{username: "tolstoy1828"})

iex(6)> {:ok, comment} = Repo.insert(%Comment{comment_text: "Happy families are all alike; every unhappy family is unhappy in its own way.", post: post, user: user2})

Now, let’s try to get a post from the database.

iex(7)> post = Repo.get(Post, 1)

Let’s assume that we want to read the comments of this post. If we try to access the comments of this post, something odd happens.

iex(8)> post.comments
#Ecto.Association.NotLoaded<association :comments is not loaded>

Classic Ecto. It does not get comments unless it is told to do that, nor does it fetch the comments we want to see behind the scenes. If we want to fetch the comments, we have to explicitly tell it to get those with Repo.preload.

post = Repo.get(Post, 1) |> Repo.preload [:comments]

Now, we have the comments as well.

iex(9)> post.comments
[
  %Blog.Schemas.Comment{
    __meta__: #Ecto.Schema.Metadata<:loaded, "comments">,
    comment_text: "Happy families are all alike; every unhappy family is unhappy in its own way.",
    id: 1,
    inserted_at: ~N[2021-11-08 22:45:18],
    post: #Ecto.Association.NotLoaded<association :post is not loaded>,
    post_id: 1,
    updated_at: ~N[2021-11-08 22:45:18],
    user: #Ecto.Association.NotLoaded<association :user is not loaded>,
    user_id: 2
  }
]

Great insights as always, tolstoy1828.

Write out functions for manipulating database records

Afterwards, we can create a separate module that will be called by our supposed frontend.

defmodule Blog.App do

  alias Blog.Schemas.{User, Post, Comment}
  alias Blog.Repo
  alias Ecto.Changeset
  import Ecto.Query

  def create_post(params, user_id) do
    user = Repo.get(User, user_id)

    %Post{}
    |> Post.changeset(params)
    |> Changeset.put_assoc(:user, user)
    |> Repo.insert()
  end

  def delete_post(id) do
    Post
    |> Repo.get(id)
    |> Repo.delete()
  end

  def create_user(params) do
    %User{}
    |> User.changeset(params)
    |> Repo.insert()
  end

  def delete_user(id) do
    user = Repo.get(User, id)
    Repo.delete(user)
  end

  def create_comment(params, user_id, post_id) do
    user = Repo.get(User, user_id)
    post = Repo.get(Post, post_id)

    %Comment{}
    |> Comment.changeset(params)
    |> Changeset.put_assoc(:user, user)
    |> Changeset.put_assoc(:post, post)
    |> Repo.insert()
  end

  def delete_comment(id) do
    Comment
    |> Repo.get(id)
    |> Repo.delete()
  end

  def display_post(id) do
    Post
    |> Repo.get(id)
    |> Repo.preload([:user, comments: :user])
  end

  def list_posts() do
    query = from p in Post,
              order_by: [desc: p.id],
              preload: :user

    Repo.all(query)
  end
end

Here, we have just some boring functions to add and delete stuff from our database, as well as functions for getting the whole post list with users (for the main page) and for displaying the post (for the post page).

There are two things here that we didn’t cover yet.

  • Changeset.put_assoc puts the associated records inside the record we are building. You can read an excellent article about Ecto associations on AppSignal’s blog.

  • list_posts() uses Ecto’s query language to get the records in descending order of their id and to preload their users. You can read more about the query language in the documentation.

How does Ecto work?

Now that we have had some hands-on experience with Ecto, let’s dive deeper into how some of its parts (modules) work.

Schemas

Schemas map data from the tables in your database to Elixir structs. Each of these, in the end, results in an Elixir struct with the name of the schema module.

iex(10)> %Blog.Schemas.User{}
%Blog.Schemas.User{
  __meta__: #Ecto.Schema.Metadata<:built, "users">,
  comments: #Ecto.Association.NotLoaded<association :comments is not loaded>,
  id: nil,
  inserted_at: nil,
  posts: #Ecto.Association.NotLoaded<association :posts is not loaded>,
  updated_at: nil,
  username: nil
}

You can use everything that Elixir offers for struct manipulation to juggle with these structs.

iex(11)> user = %User{}

iex(12)> %User{user | username: "soulofgogol"}
%Blog.Schemas.User{
  __meta__: #Ecto.Schema.Metadata<:built, "users">,
  comments: #Ecto.Association.NotLoaded<association :comments is not loaded>,
  id: nil,
  inserted_at: nil,
  posts: #Ecto.Association.NotLoaded<association :posts is not loaded>,
  updated_at: nil,
  username: "soulofgogol"
}

In case the data we want to insert into them might be invalid or faulty, a good practice is to use changesets for validation.

Changesets

A changeset is a data structure that tells Ecto how to change your data. Its corresponding module – Ecto.Changeset – contains functions to create and operate with these data structures.

Here’s what an empty changeset looks like:

%Ecto.Changeset<action: nil, changes: %{}, errors: [], data: nil, valid?: false>

There are two ways to make a changeset: change and cast.

Both of these functions make a changeset out of a struct and a map with some data.

The difference is that cast has an argument for a list of parameters that can be changed, and therefore is better to use when working with external data or when you have fields that you wouldn’t want to accidentally change.

iex(13)> changeset = Ecto.Changeset.cast(%User{}, %{username: "puhskinnotification"}, [:username])
#Ecto.Changeset<
  action: nil,
  changes: %{username: "puhskinnotification"},
  errors: [],
  data: #Blog.Schemas.User<>,
  valid?: true
>

iex(14)> changeset = Ecto.Changeset.cast(%User{}, %{username: "puhskinnotification"}, [])
#Ecto.Changeset<action: nil, changes: %{}, errors: [],
 data: #Blog.Schemas.User<>, valid?: true>

If we do not list a parameter as relevant for us, it is not included in the changeset.

Validations and constraints

After creating a changeset, you can validate it and verify it against constraints.

These are rather similar. The difference is that validations are enforced by Ecto, while constraints are enforced by the database.

For example, there is no way to validate uniqueness without interacting with the database, so you need to pass your record through unique_constraint(). For each constraint, there also needs to be a matching constraint in the database. In this example, your database table would need to have an unique index. Otherwise, an error won’t be raised.

But back to validations. Validations are very useful and should be your go-to way of validating stuff that goes into the database.

Here’s an example of passing a changeset through a validation.

iex(15)> changeset = Ecto.Changeset.cast(%User{}, %{username: "chekhovitout"}, [:username])
#Ecto.Changeset<
  action: nil,
  changes: %{username: "chekhovitout"},
  errors: [],
  data: #Blog.Schemas.User<>,
  valid?: true
>

iex(16)> changeset = Ecto.Changeset.validate_required(changeset, [:username])
#Ecto.Changeset<
  action: nil,
  changes: %{username: "chekhovitout"},
  errors: [],
  data: #Blog.Schemas.User<>,
  valid?: true
>

If we don’t have a username, we get back an invalid changeset with the error added to changeset.errors.

`iex(17)> changeset = Ecto.Changeset.cast(%User{}, %{}, [])`
#Ecto.Changeset<action: nil, changes: %{}, errors: [],
data: #Blog.Schemas.User<>, valid?: true>

iex(18)> changeset = Ecto.Changeset.validate_required(changeset, [:username])
#Ecto.Changeset<
 action: nil,
 changes: %{},
 errors: [username: {"can't be blank", [validation: :required]}],
 data: #Blog.Schemas.User<>,
 valid?: false
>

There are a lot of ready-made validations present, but you can also write your own custom validation functions and call them with validate_change.

If you would like to read more about validations and constraints, this article is very nice.

Where to define changesets?

Changesets are usually defined in the module together with the schemas they use.

defmodule Blog.Schemas.Post do
  use Ecto.Schema
  import Ecto.Changeset

  schema "posts" do
    field :post_text, :string

    has_many :comments, Blog.Schemas.Comment
    belongs_to :user, Blog.Schemas.User

    timestamps()
  end

  def changeset(post, params \\ %{}) do
    post
    |> cast(params, [:post_text])
    |> validate_required([:post_text])
  end
end

You can create more than one changeset to cover multiple contexts of insertion/change, if necessary.

Repo

While OOP language libraries like Ruby’s ActiveRecord act on (or create the illusion that they act on) objects – for example, you would write Post.find(1) – Ecto uses the Repo module as a mediator between you and the database, to which you submit commands that you want to get executed.

Repo.get(Post, 1)

Here’s a good article on the differences between ActiveRecord vs. Ecto.

If you need something, you ask Repo. If you want to insert something, you give it to Repo. If you don’t ask, you shall not receive. All communication with the database goes through Repo.

Ecto query language

While the Repo module has some basic calls to the database, the chance is you will need something more. In the Ecto.Query module, you will find the tools to create custom, composable queries with the Ecto DSL.

For example, previously we needed to get all the posts in a descending order and preload their users.

We wrote the query like this:

    query = from p in Post,
              order_by: [desc: p.id],
              preload: :user

If we wanted to ditch useless data, we could have written a join:

    query = from p in Post,
              join: u in User, on: p.user_id == u.id,
              order_by: [desc: p.id],
              select: %{post_text: p.post_text, username: u.username}

I will not delve into the syntax of the DSL; basic information is covered very well in the Ecto documentation.

Using SQL fragments

Of course, it is also possible to use SQL fragments if you find the query language is not expressible enough or if you’re not used to it yet.

While that’s not useful for our example database, here’s an example from the official documentation:

from p in Post,
  where: is_nil(p.published_at) and
         fragment("lower(?)", p.title) == ^title

You can also use the fragments in migrations.

add :timestamp, :utc_datetime, default: fragment("(now() AT TIME ZONE 'utc')"), null: false

And you can just write raw SQL queries, if you’d like.

Further reading

I’ve given a quick introduction, but there’s a lot still to explore out there. Each of the items I talked about are covered better in different places; I only tried to bring all the knowledge in one place for your convenience.

Here is a load of awesome beginner- and intermediate-level materials that you can check out to learn more about the toolkit.

  • Thinking in Ecto. If you check out just one resource from this list, watch this talk. It covers what is special about Ecto, and why it is the way it is.
  • Programming Ecto. Not much to say – it is the premier Ecto book. If you are one of those people that can read a book, it can be useful to pick it up.
  • Ecto basics playlist from Alchemist Camp. I’m a fan of the content Alchemist Camp is putting out, and this is no different. This is a rather large YouTube playlist of 13 videos that goes through some of the basics of Ecto.
  • Ecto documentation. It’s well written and should be able to answer most of your questions.
  • The Ecto lessons of Elixir School. Elixir School’s Ecto section contains a lot of useful info and code examples on changesets and queries.
  • The Little Ecto Cookbook. A cookbook with recipes on how to achieve common Ecto tasks. It’s curated by Dashbit, the company working on Elixir.
  • Leveling up with Ecto. If Thinking in Ecto was not enough and you would like to see another conference talk by Darin Wilson, this one is a bit more in-depth.

If you would like to read more articles about Elixir, be sure to follow us on Twitter or Dev.

Ecto: An Introduction to Elixir's Database Toolkit
More from Serokell
elixir in production statuspalelixir in production statuspal
introduction to phoenix thumbnailintroduction to phoenix thumbnail
What is big data thumbnailWhat is big data thumbnail