Atomic conditional inserts with complex conditions

I was going to blog about the software I’m currently writing, but that’s not happening, yet (although it’s been beta-test-ready for about a week and, given that I’ve made just under 100 subversion larger-than-ideal check-ins in the last seven days, it’s been getting refinements literally almost by the hour).  Instead, I’m going to share a little trick I came up with to do atomic conditional insertions into a database without using triggers or conflicts.  Indeed, I’m using little old sqlite3, so there’s nothing fancy going on in this database at all.  Sure, the software I’m currently working on is probably going to be open sourced at some point, but this would just get buried in there.

It’s all really quite simple.  Instead of just stating it straight up, we’ll build it from the ground up.  Consider the case of inserting an e-mail address into a table, regardless of case. ((Yes, we can create a case-insensitive index in sqlite3 or convert all e-mail addresses into a single case, but this syntax generalizes to every other database system I’ve worked with and also allows for much more complex queries — and keeps people’s e-mail addresses intact with the case they prefer.))

First, we begin with a a basic INSERT statement.  Nothing up my sleeve here!  Using python and qmark notation, we have

'''INSERT INTO users (email) VALUES (?)''', (email,)

Not very conditional-y at all.  Next, we note that we can insert from the results of a SELECT:

'''INSERT INTO users (email) (SELECT ?)''', (email,)

We seem to have introduced complexity without complex conditionals.  But note that we can use the cross-product properties of SELECT to create a 1 x 1 matrix if the e-mail address will be unique (i.e., is not yet in the database) and 0 x 0 matrix otherwise!  Here’s how:

'''SELECT 1 FROM (SELECT COUNT(*) AS theCount FROM users WHERE UPPER(email) = UPPER(?)) AS temp WHERE temp.theCount = 0''' % (email,)

Look!  We’re almost there.  We return 1 x 1 matrix [1] if the e-mail address is not yet in the database (the part that reads UPPER(email) = UPPER(?)) and an empty set if it is.  Now, instead of inserting the value 1, we insert the e-mail address.  We thus get

'''INSERT INTO users (email) (SELECT ? FROM (SELECT COUNT(*) AS theCount FROM users WHERE UPPER(email) = UPPER(?)) AS temp WHERE temp.theCount = 0''' % (email, email)

This generalizes to more complex queries that take the form

'''INSERT INTO the_table (c1, c2, ..., cn) (SELECT v1, v2, ..., vn FROM (SELECT COUNT(*) AS theCount FROM (select conditions)) AS temp WHERE temp.theCount = 0''' % (v1, v2, ..., vn, ...)

where one wishes to insert values v1 through vn into columns c1 through cn if select conditions are true, inserting appropriate variables in the qmark tuple as appropriate to feed the conditions.  Because the INSERT is being performed in the same query as the SELECTs, we’re still in atomic-land.  And there you have it.  Portable, pretty-much- arbitrarily-complex conditions…  simply!

Leave a Reply

Your email address will not be published. Required fields are marked *