Converting Ruby code to idiomatic Elixir
Elixir and Ruby are often considered similar because their syntaxes look alike.
But are they really?
We’ll rewrite in Elixir a sample Ruby script that separates vowels from consonants, and measure the execution time at each transformation step.
Original Ruby code
# Immutable version
# vowels_immut.rb
module Vowels
  VOWELS = %w[a e i o u].freeze
  def self.find_vowels
    1_000_000.times do
      "HelloWorld".downcase.chars.partition { |c| VOWELS.include?(c) }.map(&:join)
    end
  end
end
Vowels.find_vowels
time ruby ./vowels_immut.rb
real 0m5.622s
user 0m5.551s
sys 0m0.034s
# Mutable version
# vowels_mut.rb
module Vowels
  VOWELS = %w[a e i o u].freeze
  def self.find_vowels
    1_000_000.times do
      input = "HelloWorld"
      input.downcase!
      found_vowels = ""
      found_consonants = ""
      input.each_char do |c|
        if VOWELS.include?(c)
          found_vowels << c
        else
          found_consonants << c
        end
      end
    end
  end
end
Vowels.find_vowels
time ruby vowels/vowels_mut.rb
real 0m3.755s
user 0m3.587s
sys 0m0.048s
First attempt
All Elixir examples are compiled first and then executed with the following command line:
elixirc <filename>; time elixir -e 'Vowels.find_vowels'
# vowels_0.ex
defmodule Vowels do
  @vowels ~w(a e i o u)
  def find_vowels do
    for _ <- 1..1_000_000 do
      "HelloWorld"
      |> String.downcase()
      |> String.codepoints()
      # Enum.split_with/2 replaces the deprecated Enum.partition/2
      |> Enum.split_with(&Enum.member?(@vowels, &1))
      |> Tuple.to_list()
      |> Enum.map(&Enum.join/1)
    end
  end
end
real 0m5.354s
user 0m4.894s
sys 0m0.443s
Not good: this is just Ruby translated to Elixir, and it is barely faster than the immutable Ruby version.
Second attempt: a typical map/reduce job, removing the for loop and going fully functional
The for loop is actually a list comprehension: it returns the value of the computation, just like map does.
We can save some time by discarding the accumulated result with Enum.each.
# vowels_1.ex
defmodule Vowels do
  @vowels ~w(a e i o u)
  def find_vowels do
    1..1_000_000
    |> Enum.each(fn _ ->
      "HelloWorld"
      |> String.downcase()
      |> String.split("")
      |> Enum.reduce(["", ""], fn
        ch, [vows, cons] when ch in @vowels ->
          [vows <> ch, cons]
        ch, [vows, cons] ->
          [vows, cons <> ch]
      end)
    end)
  end
end
real 0m5.859s
user 0m5.684s
sys 0m0.127s
Not faster, but it’s starting to look like idiomatic Elixir code that uses key features such as pattern matching and map/reduce.
For reference, using Enum.each gave us a huge boost compared to the same code executed in a for loop:
# reduce in a for loop
real 0m16.220s
user 0m14.610s
sys 0m1.307s
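To make the difference concrete, here is a minimal sketch (not part of the original benchmarks) comparing what each construct returns. The squaring function is only a placeholder workload.

```elixir
# `for` is a comprehension: it builds and returns a list of results.
squares = for n <- 1..5, do: n * n
IO.inspect(squares)  # [1, 4, 9, 16, 25]

# Enum.map/2 returns the same list.
mapped = Enum.map(1..5, fn n -> n * n end)
IO.inspect(mapped)   # [1, 4, 9, 16, 25]

# Enum.each/2 runs the function for its side effects only and returns :ok,
# so no result list is accumulated.
result = Enum.each(1..5, fn n -> n * n end)
IO.inspect(result)   # :ok
```

With one million iterations, the comprehension has to keep one million results alive, which is work Enum.each never does.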
Elixir's strength is not running a small kernel of code in a tight loop; pattern matching and recursion are.
Elixir has powerful pattern matching abilities, which also work on binaries (and therefore on strings).
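As a minimal sketch of binary pattern matching in isolation, the hypothetical module below (BinaryMatchSketch is an illustrative name, not from the original code) splits a string into its first UTF-8 codepoint and the remaining binary.

```elixir
defmodule BinaryMatchSketch do
  # Match one UTF-8 codepoint (bound as an integer) and the rest of the binary.
  def first_codepoint(<<first::utf8, rest::binary>>), do: {first, rest}
  # The empty binary has no first codepoint.
  def first_codepoint(<<>>), do: :empty
end

IO.inspect(BinaryMatchSketch.first_codepoint("hello"))  # {104, "ello"}
IO.inspect(BinaryMatchSketch.first_codepoint(""))       # :empty
```

The pattern `<<first::utf8>> <> rest` is another way of writing the same match.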
defmodule Vowels do
  @vowels 'aeiou'
  def find_vowels do
    1..1_000_000 |> Enum.each(fn _ ->
      find_vowels(String.downcase("HelloWorld"), "", "")
    end)
  end
  # <<c::utf8>> re-encodes the matched codepoint, so non-ASCII characters survive
  defp find_vowels(<<vowel::utf8>> <> rest, vowels, consonants) when vowel in @vowels do
    find_vowels(rest, vowels <> <<vowel::utf8>>, consonants)
  end
  defp find_vowels(<<consonant::utf8>> <> rest, vowels, consonants) do
    find_vowels(rest, vowels, consonants <> <<consonant::utf8>>)
  end
  defp find_vowels(<<>>, vowels, consonants) do
    {vowels, consonants}
  end
end
real 0m2.502s
user 0m2.421s
sys 0m0.115s
Now we are getting somewhere: this is a good performance boost, and it’s all due to writing more idiomatic code.
Can we do better?
String concatenation is slow in Elixir (and Erlang), but there is another way of representing strings in Elixir: IO lists.
IO lists are lists of binaries and bytes (integers), arbitrarily nested, that Elixir and Erlang know how to handle as if they were strings.
iex(1)> IO.puts(["a", [["b", "c"], "d"], [[["e"]]] ])
abcde
:ok
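The sketch below, which uses placeholder data rather than the article's program, shows how an IO list can be grown by nesting instead of concatenating, and how to flatten it only when an actual binary is needed.

```elixir
# Appending by nesting keeps the pieces in order without copying any binary.
iodata =
  Enum.reduce(["a", "b", "c"], [], fn piece, acc ->
    [acc, piece]
  end)

IO.inspect(iodata)  # [[[[], "a"], "b"], "c"]

# Flatten to a single binary only at the end.
string = IO.iodata_to_binary(iodata)
IO.inspect(string)  # "abc"

# Integers in an IO list are treated as bytes.
IO.inspect(IO.iodata_to_binary([?h, "i"]))  # "hi"
```

Because integers in IO lists are bytes, codepoints above 255 need a chardata function such as List.to_string/1 instead.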
Let’s apply this new knowledge to our program:
defmodule Vowels do
  @vowels 'aeiou'
  def find_vowels do
    1..1_000_000 |> Enum.each(fn _ ->
      find_vowels(String.downcase("HelloWorld"), [], [])
    end)
  end
  # Prepending is cheap, but it builds each list in reverse order
  defp find_vowels(<<vowel::utf8>> <> rest, vowels, consonants) when vowel in @vowels do
    find_vowels(rest, [vowel | vowels], consonants)
  end
  defp find_vowels(<<consonant::utf8>> <> rest, vowels, consonants) do
    find_vowels(rest, vowels, [consonant | consonants])
  end
  defp find_vowels(<<>>, vowels, consonants) do
    {vowels, consonants}
  end
end
real 0m1.375s
user 0m1.311s
sys 0m0.109s
This simple change cut the execution time by another 45% (from 2.502s to 1.375s).
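One caveat worth illustrating: prepending with `[head | tail]` collects the codepoints in reverse order. The sketch below uses a hypothetical PrependSketch module, which mirrors the accumulator style above without being the article's code, to show the effect and one way to recover a string in the original order.

```elixir
defmodule PrependSketch do
  # Hypothetical helper: collect every codepoint by prepending to a list.
  def collect(string), do: collect(string, [])

  defp collect(<<cp::utf8, rest::binary>>, acc), do: collect(rest, [cp | acc])
  defp collect(<<>>, acc), do: acc
end

reversed = PrependSketch.collect("hello")

# The accumulated charlist is reversed.
IO.inspect(List.to_string(reversed))                        # "olleh"

# Reverse once at the end to restore the original order.
IO.inspect(reversed |> Enum.reverse() |> List.to_string())  # "hello"
```

A single Enum.reverse/1 at the end is linear in the length of the list, so it keeps most of the benefit of cheap prepending.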
What more can we do?
One thing that comes to mind is that we are downcasing the whole string every time we run find_vowels. We can extend the @vowels char list (notice the single quotes: that’s not a string in Elixir, it is a list of characters) to include uppercase vowels as well, if we don’t mind keeping the original casing of the vowels.
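The difference between a char list and a string can be checked directly. This is a minimal sketch with a hypothetical CharlistSketch module; it uses the `~c"..."` sigil, which produces the same char list as the single-quoted form.

```elixir
defmodule CharlistSketch do
  # The attribute is expanded at compile time, so `in` can be used in a guard.
  @vowels ~c"aeiouAEIOU"

  def vowel?(c) when c in @vowels, do: true
  def vowel?(_), do: false
end

# A char list is a list of integer codepoints; a string is a UTF-8 binary.
IO.inspect(~c"abc" == [97, 98, 99])   # true
IO.inspect(is_list(~c"abc"))          # true
IO.inspect(is_binary("abc"))          # true

# Membership is tested against integers, not one-character strings.
IO.inspect(CharlistSketch.vowel?(?E))   # true
IO.inspect(CharlistSketch.vowel?("E"))  # false
```

This is why the guard works on the integer bound by `::utf8` in the binary pattern, with no conversion back to a string.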
defmodule Vowels do
  @vowels 'aeiouAEIOU'
  def find_vowels do
    1..1_000_000 |> Enum.each(fn _ ->
      find_vowels("HelloWorld", [], [])
    end)
  end
  defp find_vowels(<<vowel::utf8>> <> rest, vowels, consonants) when vowel in @vowels do
    find_vowels(rest, [vowel | vowels], consonants)
  end
  defp find_vowels(<<consonant::utf8>> <> rest, vowels, consonants) do
    find_vowels(rest, vowels, [consonant | consonants])
  end
  defp find_vowels(<<>>, vowels, consonants) do
    {vowels, consonants}
  end
end
real 0m0.513s
user 0m0.454s
sys 0m0.105s
This change alone reduced the execution time to roughly a third of the previous version.
That is about ten times faster than the first Elixir version and the first Ruby version.
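Last but not least, we can add @compile :native as a module attribute and gain another 15%. Note that this relies on HiPE, which was removed in Erlang/OTP 24, so it only applies to older releases.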
defmodule Vowels do
@compile :native
...
real 0m0.439s
user 0m0.385s
sys 0m0.085s
The lesson here is to always try to write the most idiomatic code possible to gain the maximum benefit, and to profile your code instead of making assumptions.
To measure and profile Elixir code you can use https://github.com/parroty/exprof.
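For quick measurements, the timings above include the VM startup time because they wrap the whole `elixir` process in `time`. A minimal sketch for timing a single call from inside the VM, using Erlang's :timer.tc/1, could look like the following. TimingSketch and its measure/1 function are hypothetical names, and the summing workload is only a placeholder.

```elixir
defmodule TimingSketch do
  # Hypothetical helper: run a zero-arity function and return
  # {elapsed_milliseconds, result}.
  def measure(fun) when is_function(fun, 0) do
    # :timer.tc/1 returns {elapsed_microseconds, result}.
    {microseconds, result} = :timer.tc(fun)
    {microseconds / 1000, result}
  end
end

{ms, result} = TimingSketch.measure(fn -> Enum.sum(1..1_000_000) end)
IO.puts("took #{ms} ms, result: #{result}")
```

A single run like this is noisy, so repeating the measurement several times gives a more trustworthy picture.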