While migrating a ruby app using active-record from using character
columns in our postgres database to uuid
s I learned about a difference in using the hash syntax versus the array syntax in where clauses.
TLDR; Prefer the hash syntax over array syntax. When using data from an untrusted source like the web in a where
clause for non string columns you will need to validate it like you’d validate your models using a save
if you want predictable behaviour.
Consider the following table and test data
create_table :players do |t|
t.uuid :player_id
t.number_of_games :number_of_games
end
class Player < ActiveRecord::Base
end
Player.create!(player_id: SecureRandom.uuid, number_of_games: 1)
Player.create!(player_id: SecureRandom.uuid, number_of_games: 0)
It is not uncommon for applications to use inputs from the web in queries that are directly taken from an url. For instance consider the following Sinatra code.
get '/player/:id' do
Player.where(player_id: params[:id])
end
Unlike when saving ActiveRecord::Base
objects, where you typically validate before saving, validation is usually skipped when querying. Since active-record takes case of building the query in order to avoid injection attacks this is considered safe.
Now let’s do some querying to see the difference between the hash and array syntax:
# The hash syntax
[8] pry(main)> Player.where(player_id: 'foo').first
DEBUG, 20161118 13:49:14-363: Player Load (0.2ms) SELECT "players".* FROM "players" WHERE "players"."player_id" = $1 ORDER BY "players"."id" ASC LIMIT 1 [["player_id", nil]]
=> nil
Nothing is returned. This makes sense since 'foo'
isn’t a valid uuid
. As you can see active-record converts this to player_id = NULL
which always yields to false since NULL = NULL
is never true (unlike NULL is NULL
).
What about the array syntax?
# the array syntax
[9] pry(main)> Player.where('player_id = ?', 'foo').first
DEBUG, 20161118 13:58:36-126: Player Load (8.1ms) SELECT "players".* FROM "players" WHERE (player_id = 'foo') ORDER BY "players"."id" ASC LIMIT 1
ActiveRecord::StatementInvalid: PG::InvalidTextRepresentation: ERROR: invalid input syntax for uuid: "foo"
LINE 1: ...CT "players".* FROM "players" WHERE (player_id = 'foo') ORDER BY "players"."id" ASC LIMIT 1
Unlike the hash syntax this fails with a ActiveRecord::StatementInvalid
. I was never aware of this difference until we migrated to actual uuid‘s in the database. Some specs failed that were using invalid uuid’s as test data. We fixed the code by always using the hash syntax (since that is the behaviour we wanted)
Then we also checked how this would work for integer
columns…
First let’s check the array syntax:
# the array syntax
[19] pry(main)> Player.where('number_of_games = ?', 'foo').first
DEBUG, 20161118 14:14:30-711: Player Load (1.3ms) SELECT "players".* FROM "players" WHERE (number_of_games = 'foo') ORDER BY "players"."id" ASC LIMIT 1
ActiveRecord::StatementInvalid: PG::InvalidTextRepresentation: ERROR: invalid input syntax for integer: "foo"
LINE 1: ...layers.* FROM "players" WHERE (number_of_games = 'foo') ORDER BY "players"."id" ASC LIMIT 1
This is consistent with the uuid example, both fail with a ActiveRecord::StatementInvalid
.
How about the hash syntax?
# The hash syntax
[18] pry(main)> Player.where(number_of_games: 'foo').first
DEBUG, 20161118 14:08:09-310: Player Load (0.2ms) SELECT "players".* FROM "players" WHERE "players"."number_of_games" = $1 ORDER BY "players"."id" ASC LIMIT 1 [["number_of_games", 0]]
=> #<Player:0x007fed85cfeef8 id: 2, player_id: "0728be62-e766-4c29-ad52-be4a43dc7f9f", number_of_games: 0>
Wait what? This actually returns our second player because the value 'foo'
is transformed into 0
. Although this is reported as bug a couple of times, it was dismissed because it is considered consistent with a create
or update
operation.
So I guess we have to live with this difference for now. If you were not aware of this, it leads to unexpected behaviour when using it like:
get '/players/number_of_games/:count' do
Player.where(number_of_games: params[:count])
end
As we saw this will return all players with 0 number_of_games when querying it like ‘/players/number_of_games/foobar’
So to have the same behaviour of uuid
's (and predictable behaviour) the above code can to be written as
get '/players/number_of_games/:count' do
valid_integer?(params[:count]) ?
Player.where(number_of_games: params[:count]) :
Player.none
end
helpers do
def valid_integer?(string)
Integer(string)
rescue
false
end
end