Engineering — 15 min
After joining Remote and starting to find my way around our extensive data model, I quickly noticed how often I was repeating the Repo.preload/2 call to navigate through it.
The explicit preload strategy, as opposed to the implicit lazy loading provided by other frameworks (like Rails' ActiveRecord), serves as a deterrent to the unfortunately common problem of N+1 queries popping up in Production.
With Ecto, you need to be deliberate upfront about whether you need the association loaded for you. Otherwise, you'll get this result when trying to access country.addresses:

```elixir
#Ecto.Association.NotLoaded<association :addresses is not loaded>
```
The following snippet shows that an association isn't loaded by default. Only after "preloading" it does the association become "filled" with records coming from the DB.
```elixir
iex()> country = Repo.get(Country, 2)
%EctoExplorer.Schemas.Country{...}
iex()> country.addresses
#Ecto.Association.NotLoaded<association :addresses is not loaded>
iex()> country = Repo.preload(country, :addresses)
%EctoExplorer.Schemas.Country{...}
iex()> country.addresses
[%EctoExplorer.Schemas.Address{...}, ...]
```
The explicitness of calling Repo.preload/2 to load an association is more than welcome in Production given the safety it brings to the table, but it becomes a nuisance when we are in a local IEx shell trying to smoothly navigate our data model, hopping from Ecto association to Ecto association:
```elixir
iex()> country = Repo.get(Country, 2)
%EctoExplorer.Schemas.Country{
  addresses: #Ecto.Association.NotLoaded<association :addresses is not loaded>,
  currencies: #Ecto.Association.NotLoaded<association :currencies is not loaded>,
  flag: #Ecto.Association.NotLoaded<association :flag is not loaded>,
  code: "ECU",
  id: 2,
  name: "Ecuador",
  ...
}
# 1st Repo.preload/2 for the country.flag
iex()> country = Repo.preload(country, :flag)
# ...
iex()> country.flag
%EctoExplorer.Schemas.Flag{
  colors: "YBR",
  country: #Ecto.Association.NotLoaded<association :country is not loaded>,
  country_id: 2,
  orientation: "horizontal"
}
# 2nd Repo.preload/2 for the country.currencies
iex()> country = Repo.preload(country, :currencies)
# ...
iex()> country.currencies
[
  %EctoExplorer.Schemas.Currency{
    code: "USD",
    symbol: "$"
  },
  %EctoExplorer.Schemas.Currency{
    code: "SUC",
    symbol: "Suc"
  }
]
# 3rd Repo.preload/2 for the country.addresses
iex()> country = Repo.preload(country, :addresses)
# ...
iex()> country.addresses |> Enum.at(0)
%EctoExplorer.Schemas.Address{
  city: "city_ECU_1",
  country_id: 2,
  first_line: "first_line_ECU_1",
  postal_code: "postal_code_ECU_1",
  ...
}
```
As you can see above, the pattern is always the same:
You got the struct record from the DB,
You now need to explore one of its associations
You need to resort to Repo.preload/2 before actually checking any of the association records
Rinse and repeat 🔃
I was getting really tired of the Repo.preload/2 dance, and it got to a point where I decided to scratch my own itch.
At first, I even considered a quick and dirty hack 🔨 that would consist of a vim macro that would write the Repo.preload/2 call for me 🙈 . This approach might have solved my pain, since I always have both an IEx shell running (as my REPL) and vim inside tmux, but I'm pretty sure this wouldn't be very useful for anyone else but me 😅
I started thinking about what I needed to streamline the Repo.preload/2 usage:
The end goal would be to be able to "chain" one or more association accesses and the Repo.preload/2 calls would happen behind the scenes automatically; this way we would avoid the dreaded look of the Ecto.Association.NotLoaded 🪦
Repo.preload/2 already lets us pass a nested association "chain" as the second argument (e.g. Repo.preload(flag, country: :addresses)), so the new approach needs to improve on what's already provided out of the box by Ecto;
If we consider the last example, a streamlined version of the preload/2 function like X(flag, country.addresses) would already be an improvement. Note that X would be the new function and the second argument is not a string. This improvement would already save some keypresses every day, but I would need to turn the country.addresses part into [country: :addresses]. Maybe some metaprogramming sprinkles would help here? 🤓
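To make the "metaprogramming sprinkles" idea concrete, here is a minimal sketch of how a macro could turn the quoted form of `country.addresses` into the nested preload spec `[country: :addresses]`. The module `PreloadChainSketch` and its functions `chain/1`, `hops/1` and `nest/1` are hypothetical names invented for this illustration; they are not part of Ecto or of the library described in this post.

```elixir
defmodule PreloadChainSketch do
  # Hypothetical helper, not part of Ecto or EctoExplorer.
  # Turns the quoted form of `country.addresses` into the nested
  # preload spec `[country: :addresses]` at compile time.
  defmacro chain(expr) do
    expr
    |> hops()
    |> nest()
    |> Macro.escape()
  end

  @doc false
  # `a.b` is quoted as {{:., meta, [quoted_a, :b]}, meta, []}
  def hops({{:., _, [left, key]}, _, []}) when is_atom(key), do: hops(left) ++ [key]
  # A bare name such as `country` is quoted as {:country, meta, context}
  def hops({name, _, context}) when is_atom(name) and is_atom(context), do: [name]

  @doc false
  # A single hop stays a plain atom; longer chains nest to the right
  def nest([last]), do: last
  def nest([head | rest]), do: [{head, nest(rest)}]
end
```

Under these assumptions, after `require PreloadChainSketch`, a call such as `Repo.preload(flag, PreloadChainSketch.chain(country.addresses))` would expand to `Repo.preload(flag, country: :addresses)`.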
By now I was almost certain that my path implied the usage of metaprogramming, so I tried to figure out what I'd get if I had no restrictions on the amount of metaprogramming I'm willing to use. The goal is to chain Ecto association "hops", and since these hops are similar in spirit to map accesses, if we support expressions like flag.country.addresses, we would be keeping a known Elixir pattern.
For this to work though, I would need to somehow override the . (dot) access to cater to our specific needs. From what I gathered from the Elixir source, extending the dot access would not be easy since its implementation is really intertwined with the language "core".
If I can't use the dot access for the Ecto navigation, I'd like to have something as similar as possible to it. I know that Elixir has a lot of operators provided by the Kernel module that are macro-based (check, for example, the Kernel.), so I dived into the source of these macros to understand how they come to life 🕵️‍♂️
What I found is that most of these macros use a different macro syntax than what I'm used to seeing, but this is exactly what allows one to write a ||| macro that behaves like an operator and can be used like foo|||bar.
```elixir
iex()> defmodule Example do
...()>   # "operator" macro syntax
...()>   defmacro left ||| right do
...()>     "#{left} AND #{right}"
...()>   end
...()>
...()>   # instead of the "traditional" macro syntax
...()>   defmacro my_operator(left, right) do
...()>     "#{left} AND #{right}"
...()>   end
...()> end
{:module, Example, <<70, 79, 82, ...>>, {:my_operator, 2}}
iex()> import Example
Example
iex()> "foo"|||"bar" # note the usage of `|||` as an "operator"
"foo AND bar"
iex()> my_operator("foo", "bar")
"foo AND bar"
```
💡 Note the defmacro left X right, do: ... way of defining the macro instead of the more conventional defmacro X(left, right), do: ... that is commonly seen and used.
I felt I was on the right track; I just needed to define an operator symbol for the Ecto navigation that would somehow convey the meaning of "navigating through Ecto associations". I settled on the ~> operator since the ~ looks like a wave 🌊 and the > points forward, hence it would be used to sail through a sea of Ecto navigations (cheesy I know 🙈 , but I didn't come up with a better mnemonic 😅 ).
If I define the macro now and we simply inspect what we get on both left and right parameters, you'll see that we get the quoted form of both params, as every regular macro does 🌈
```elixir
iex()> defmodule Example2 do
...()>   defmacro left ~> right do
...()>     IO.inspect(left, label: "Left")
...()>     IO.inspect(right, label: "Right")
...()>
...()>     :ok
...()>   end
...()> end
{:module, Example2, <<70, 79, 82, ...>>, {:~>, 2}}
iex()> import Example2
Example2
iex()> foo~>a
Left: {:foo, [line: 15], nil}
Right: {:a, [line: 15], nil}
:ok
iex()> foo~>a.b
Left: {:foo, [line: 16], nil}
Right: {{:., [line: 16], [{:a, [line: 16], nil}, :b]}, [no_parens: true, line: 16], []}
:ok
iex()> foo~>a.b.c.d
Left: {:foo, [line: 17], nil}
Right: {{:., [line: 17],
  [
    {{:., [line: 17],
      [
        {{:., [line: 17], [{:a, [line: 17], nil}, :b]},
         [no_parens: true, line: 17], []},
        :c
      ]}, [no_parens: true, line: 17], []},
    :d
  ]}, [no_parens: true, line: 17], []}
:ok
```
For each of these expressions, we need to convert the quoted form of the navigation part (the right-hand side of the expression, after the ~>) to a list of steps.
We'll create this list of steps by traversing the quoted right parameter with the Macro.postwalk/3 function (you can check the full code here; it is a bit longer due to the handling of indexes, i.e., steps with an index like country.addresses[3]):
```elixir
@doc false
def _steps(quoted_right) do
  quoted_right
  |> Macro.postwalk(%{visited: [], steps: []}, fn
    # ...
    {:., _, _} = node, acc ->
      acc = accumulate_node(acc, node)
      {node, acc}

    {first_step, _, _} = node, acc when is_atom(first_step) ->
      acc = accumulate_node(acc, node, %Step{key: first_step})
      {node, acc}

    step, acc when is_atom(step) ->
      acc = accumulate_node(acc, step, %Step{key: step})
      {step, acc}

    # ...
    node, acc ->
      acc = accumulate_node(acc, node)
      {node, acc}
  end)
end

defp accumulate_node(%{visited: visited} = acc, node) do
  %{acc | visited: [node | visited]}
end

defp accumulate_node(%{steps: steps} = acc, node, %Step{} = step) do
  %{accumulate_node(acc, node) | steps: [step | steps]}
end
```
In a nutshell, the postwalk logic visits each node of the right-hand side AST and accumulates the steps as a list of %Step{} structs. The _steps/1 function tests illustrate how the function works (you can find the tests here):
```elixir
test "makes steps for a basic right-hand side" do
  rhs = quote do: foo
  assert [%Step{key: :foo}] == Subject.steps(rhs)
end

test "makes steps for a 2-hop right-hand side" do
  rhs = quote do: foo.bar
  assert [%Step{key: :foo}, %Step{key: :bar}] == Subject.steps(rhs)
end

test "makes steps for a 5-hop right-hand side" do
  rhs = quote do: foo.bar.baz.bin.yas

  assert [
           %Step{key: :foo},
           %Step{key: :bar},
           %Step{key: :baz},
           %Step{key: :bin},
           %Step{key: :yas}
         ] == Subject.steps(rhs)
end

test "makes steps for a basic right-hand side with index" do
  rhs = quote do: foo[99]
  assert [%Step{key: :foo, index: 99}] == Subject.steps(rhs)
end
```
By converting each step into its own %Step{} structure we are able to keep things tidy 🧹.
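For readers who want to try the postwalk idea in isolation, here is a simplified, self-contained sketch that collects the hops of a quoted dot chain in navigation order. It is not the library's code: `StepsSketch`, its nested `Step` struct and `steps/1` are stand-ins made up for this example, and index handling is left out entirely.

```elixir
defmodule StepsSketch do
  # Simplified stand-in for the library's %Step{} struct (assumption).
  defmodule Step do
    defstruct [:key, :index]
  end

  # Collects one %Step{} per hop of a quoted dot chain, in navigation order.
  def steps(quoted_right) do
    {_ast, acc} =
      Macro.postwalk(quoted_right, [], fn
        # The dot operator node itself carries no step
        {:., _, _} = node, acc ->
          {node, acc}

        # First hop: a bare name such as `country`
        {name, _, context} = node, acc when is_atom(name) and is_atom(context) ->
          {node, [%Step{key: name} | acc]}

        # Later hops: the atom on the right-hand side of each dot
        key, acc when is_atom(key) ->
          {key, [%Step{key: key} | acc]}

        # Anything else (e.g. the node wrapping each dot call) is skipped
        node, acc ->
          {node, acc}
      end)

    # Steps were prepended while walking, so put them back in order
    Enum.reverse(acc)
  end
end
```

Calling `StepsSketch.steps(quote do: country.addresses)` yields a step for `:country` followed by a step for `:addresses`. Prepending and reversing at the end keeps the accumulation cheap, which is a common choice when building lists inside a traversal.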
As you might have guessed by now, here's what the step list of the flag~>country.addresses expression looks like:
```elixir
iex()> rhs = quote do: country.addresses
{{:., [], [{:country, [], Elixir}, :addresses]}, [no_parens: true], []}
iex()> EctoExplorer.Resolver._steps(rhs)
{^rhs,
 %{
   expected_index_steps: 0,
   steps: [
     %EctoExplorer.Resolver.Step{index: nil, key: :addresses},
     %EctoExplorer.Resolver.Step{index: nil, key: :country}
   ],
   visited: [
     {{:., [], [{:country, [], Elixir}, :addresses]}, [no_parens: true], []}, #4
     {:., [], [{:country, [], Elixir}, :addresses]}, #3
     :addresses, #2
     {:country, [], Elixir} #1
   ]
 }}
```
By looking at the visited list from the bottom up, we can figure out what happened:
Visited the country AST node (#1);
Then visited the addresses AST node (#2);
Then visited the :. (dot) navigation node (#3);
And finally visited the full expression that matches exactly the rhs value (#4).
With the navigation steps turned into a pretty list of %Step{} structs, we now need to preload the "current" struct with the immediate next step of the list and we'll obtain a new "current" struct. This approach will be repeated until we don't have any more remaining steps.
```elixir
@doc false
def _resolve(current, %Step{key: step_key, index: nil} = step) do
  case Map.get(current, step_key) do
    %Ecto.Association.NotLoaded{} ->
      current = Preloader.preload(current, step_key)
      _resolve(current, step)

    nil ->
      Logger.warn("[Current: #{inspect(current)}] Step '#{step_key}' resolved to `nil`")
      nil

    value ->
      value
  end
end
```
As you can see above, we try to get the current.step_key value and if it's an association that isn't loaded yet, we call the Preloader.preload/2 that behind the scenes relies on the Ecto.Repo.preload/2 function to fetch the association. Otherwise we just return the current.step_key value.
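The "preload only when needed, then move to the next step" loop can be tried without a database. The sketch below is an illustration under stated assumptions, not the library's implementation: `ResolveSketch`, its `NotLoaded` struct (a stand-in for `%Ecto.Association.NotLoaded{}`) and the `preload_fun` argument (playing the role of Repo.preload/2) are all hypothetical. Unlike the recursive function above, it reads the key once after preloading instead of recursing.

```elixir
defmodule ResolveSketch do
  # Stand-in for %Ecto.Association.NotLoaded{} so the sketch runs without Ecto.
  defmodule NotLoaded do
    defstruct [:field]
  end

  # Walks `keys` starting from `current`, calling `preload_fun` only when a
  # hop is not loaded yet. Stops early and returns nil if a hop resolves to nil.
  def navigate(current, keys, preload_fun) do
    Enum.reduce_while(keys, current, fn key, current ->
      case resolve(current, key, preload_fun) do
        nil -> {:halt, nil}
        value -> {:cont, value}
      end
    end)
  end

  defp resolve(current, key, preload_fun) do
    case Map.get(current, key) do
      # Not loaded yet: ask the preloader, then read the now-filled key
      %NotLoaded{} -> current |> preload_fun.(key) |> Map.get(key)
      # Already loaded (or a plain field): return it as-is
      value -> value
    end
  end
end
```

A fake preloader for experimenting could be as small as `fn record, key -> Map.put(record, key, Map.fetch!(fake_db, key)) end`, where `fake_db` is a plain map from association name to the value it should load.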
At this stage the gist of the streamlined Ecto navigation is behind us. We just need to offer an easy way for everyone to use the ~> navigation operator on their own IEx shells.
The EctoExplorer module defines its own __using__/1 macro so that people can use it to have access to the ~> operator. Since the Ecto navigation needs an actual Ecto Repo to fetch the data from the DB, the repo option is required when using the EctoExplorer module:
```elixir
defmacro __using__(repo: repo) do
  if Mix.env() not in [:dev, :test] do
    IO.puts(
      "You're using EctoExplorer on the `#{Mix.env()}` environment.\nEctoExplorer isn't in any way optimized for Production usage, and forces the preload of each association. Use with care!"
    )
  end

  {:ok, _pid} =
    repo
    |> Macro.expand(__ENV__)
    |> maybe_start_repo_agent()

  quote do
    import unquote(__MODULE__)
  end
end
```
For the navigation logic to be able to use the repo, it needs to have access to it, hence we start an Agent process to keep it. Notice that we Macro.expand/2 the repo value since its value is quoted (💡 recall that any macro parameter is passed in its quoted form, so Foo.Repo would look like {:__aliases__, [alias: false], [:Foo, :Repo]}).
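As a rough illustration of keeping the repo in an Agent, here is a minimal sketch. The module name `RepoHolderSketch` and its functions `ensure_started/1` and `repo/0` are assumptions made for this example; the library's own `maybe_start_repo_agent` may well be organised differently.

```elixir
defmodule RepoHolderSketch do
  # Hypothetical module: keeps the repo module in a named Agent so that
  # the navigation code can look it up later.

  # Starts the agent holding `repo`, or reuses the one already running.
  def ensure_started(repo) when is_atom(repo) do
    case Agent.start_link(fn -> repo end, name: __MODULE__) do
      {:ok, pid} -> {:ok, pid}
      {:error, {:already_started, pid}} -> {:ok, pid}
    end
  end

  # Returns the repo module kept by the agent.
  def repo, do: Agent.get(__MODULE__, & &1)
end
```

The quoted alias can be turned back into a module atom with `Macro.expand(quote(do: Foo.Repo), __ENV__)`, which returns `Foo.Repo`; that atom is what would be handed to `RepoHolderSketch.ensure_started/1`. Note that in this sketch a second call with a different repo keeps the first one, which is a simplification.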
Using the ~> operator is now just a matter of using the EctoExplorer module:
```elixir
iex()> use EctoExplorer, repo: EctoExplorer.Repo
EctoExplorer
iex()> f = Repo.get(Flag, 2)
%EctoExplorer.Schemas.Flag{
  __meta__: #Ecto.Schema.Metadata<:loaded, "flags">,
  colors: "YBR",
  country: #Ecto.Association.NotLoaded<association :country is not loaded>,
  country_id: 2,
  id: 2,
  orientation: "horizontal"
}
iex()> f~>country.addresses[0].postal_code

11:20:01.467 [debug] QUERY OK source="countries" db=0.0ms idle=27.6ms
SELECT c0."id", c0."name", c0."code", c0."population", c0."id" FROM "countries" AS c0 WHERE (c0."id" = ?) [2]

11:20:01.470 [debug] QUERY OK source="addresses" db=0.0ms idle=30.2ms
SELECT a0."id", a0."first_line", a0."postal_code", a0."city", a0."country_id", a0."country_id" FROM "addresses" AS a0 WHERE (a0."country_id" = ?) ORDER BY a0."country_id" [2]

"postal_code_ECU_1"
```
Note that the f~>country.addresses[0].postal_code expression performed two queries for us: one to retrieve the country, and a second one to retrieve the country's addresses association 🎉 .
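The `addresses[0]` hop deserves a short aside, since brackets are quoted differently from dots: Elixir represents `name[index]` as a call to Access.get/2. The sketch below shows one possible way to pull the key and index out of that quoted form and to apply the index to a loaded list. `IndexSketch`, `key_and_index/1` and `pick/2` are hypothetical names for this illustration only, and only literal non-negative integer indexes on a bare name are handled.

```elixir
defmodule IndexSketch do
  # Splits the quoted form of `name[index]` into {name, index}.
  # Brackets are quoted as a call to Access.get/2.
  def key_and_index({{:., _, [Access, :get]}, _, [{name, _, context}, index]})
      when is_atom(name) and is_atom(context) and is_integer(index) do
    {name, index}
  end

  # A bare name has no index.
  def key_and_index({name, _, context}) when is_atom(name) and is_atom(context) do
    {name, nil}
  end

  # Applies an optional index to an already-loaded association value.
  def pick(value, nil), do: value
  def pick(list, index) when is_list(list) and is_integer(index), do: Enum.at(list, index)
end
```

With this in place, `IndexSketch.key_and_index(quote do: addresses[0])` gives `{:addresses, 0}`, and `IndexSketch.pick/2` then selects the first element once the association has been loaded.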
You can find the library on GitHub. I hope it's as useful to you as it is to me.
By now we have walked through most of the EctoExplorer library, but so far we haven't talked about the context that made it possible 🗺️ . The pain of a bazillion Repo.preload/2 calls, the initial thought process, the additional exploration, and the eventual solution all happened during my normal scope of work here at Remote.
As much as I like to dabble with Elixir metaprogramming, balancing my personal time (family, hobbies) with my daily work means that a library like this could never have been ready so quickly if it weren't for the incredible flexibility of Remote's weekly learning time.
At Remote we are encouraged to invest two to four hours of our week to expand our knowledge and skills, and this isn't just a pretty bullet point in a slideshow presentation. Our team managers proactively encourage us to use and treasure this time. This learning window is something we truly believe makes us better as an organization. Every journey feels much better when you are allowed to grow each step of the way 🐾
This time I decided to scratch my Repo.preload/2 itch; let's see what comes next!