Data integrity ain't that mythical

Via lesscode. Alex Bunardzic, who might have recently found himself hated by many 'database geeks', quoted this interview with David Heinemeier Hansson and wrote another article on the myth of data integrity, where he debunked the 'sacred cows' of the corporate IT practices -- the data integrity of the relational database. His main points were:

  1. Data integrity enforced by the database (stored procedures, events, triggers, etc.) is a myth.
  2. Ruby on Rails shattered all these to pieces.
  3. Data integrity must rely on application code.

Actually, I have no experience with RoR. However, if RoR is built on the mentality to "shatter all database integrity to pieces", I do not think it will outlive Bubble 2.0 Web 2.0, and its usage might be severely restricted to little gadget web applications like those produced by 37signals.

As for "data integrity", yes -- we need it in the database. Where should your defense be against data corruption? As close to the data as possible, or somewhere out there in the same layer as the web services? In a complex B2C/B2B application with multiple entry points, you really cannot afford to have only half the records updated in a transaction when the state of the system is concurrently updated via WS, CORBA and COM, all developed by different teams of programmers!
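To make the point concrete, here is a minimal sketch of what "defense close to the data" can look like. It uses Python's built-in sqlite3 module purely for convenience (not Firebird), and the `account` and `transfer` tables and the `transfer()` helper are hypothetical names made up for this illustration. The constraints live in the schema, so every entry point that touches the database gets the same checks, and a failed step rolls back the whole transaction instead of leaving half the records updated.

```python
import sqlite3


def make_db():
    """Create an in-memory database whose schema carries the integrity rules."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE account (
            id      INTEGER PRIMARY KEY,
            balance INTEGER NOT NULL CHECK (balance >= 0)
        );
        CREATE TABLE transfer (
            id      INTEGER PRIMARY KEY,
            from_id INTEGER NOT NULL REFERENCES account(id),
            to_id   INTEGER NOT NULL REFERENCES account(id),
            amount  INTEGER NOT NULL CHECK (amount > 0)
        );
        INSERT INTO account (id, balance) VALUES (1, 100), (2, 0);
    """)
    return conn


def transfer(conn, from_id, to_id, amount):
    """Move money inside one transaction; return True only if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, to_id),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, from_id),
            )
            conn.execute(
                "INSERT INTO transfer (from_id, to_id, amount) VALUES (?, ?, ?)",
                (from_id, to_id, amount),
            )
        return True
    except sqlite3.IntegrityError:
        # The database refused the write; nothing from this call was kept.
        return False


def balances(conn):
    """Return {account_id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


if __name__ == "__main__":
    conn = make_db()
    print(transfer(conn, 1, 2, 30), balances(conn))
    print(transfer(conn, 1, 2, 500), balances(conn))  # overdraft is rejected
    print(transfer(conn, 1, 99, 10), balances(conn))  # unknown account is rejected
```

Note that `transfer()` credits the receiver before it debits the sender, so an overdraft fails halfway through. Because the rule is enforced by the database inside a transaction, the credit is undone as well, no matter which client or team wrote the calling code.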

However, I have to admit that I once thought the same way as DHH. Five years ago, when we started the company, everything was wrapped in an ORM, and the database (Firebird in our case) was nothing but a data store. All SQL calls were simplified to the bare minimum, in the hope that our application might one day be ported to a different RDBMS (which never happened). All data integrity was enforced inside the application code behind CORBA.

It might sound like a utopia of ultra-portable code, but man, was I wrong! Performance sucks with an ORM when you need to join half a dozen tables. And when we started adding programmers of all levels to the project, when the application API was exposed in various forms to serve different customers, and when many of our clients' databases grew to tens of gigabytes (no problem for Firebird, btw), the lack of proper data integrity enforcement inside the database turned around and bit us -- multiple times!
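One common reason an object-by-object access layer slows down across several tables is that every related row gets loaded with its own query. The sketch below only illustrates that general pattern and is not a reconstruction of what our ORM actually did. It uses Python's sqlite3 module, and the `customer`, `product` and `orders` tables and the `CountingDb` wrapper are hypothetical names invented for the example.

```python
import sqlite3


class CountingDb:
    """Thin wrapper that counts how many SQL queries are issued."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def make_shop_db():
    """Build a tiny in-memory schema with 5 orders across 3 tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE product  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(id),
            product_id  INTEGER NOT NULL REFERENCES product(id)
        );
        INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO product  VALUES (1, 'Widget'), (2, 'Gadget');
        INSERT INTO orders   VALUES (1, 1, 1), (2, 2, 1), (3, 3, 2),
                                    (4, 1, 2), (5, 2, 2);
    """)
    return conn


def report_lazy(db):
    """Load each related row separately, the way naive lazy loading does."""
    rows = []
    orders = db.query("SELECT id, customer_id, product_id FROM orders ORDER BY id")
    for order_id, customer_id, product_id in orders:
        name = db.query("SELECT name FROM customer WHERE id = ?", (customer_id,))[0][0]
        title = db.query("SELECT title FROM product WHERE id = ?", (product_id,))[0][0]
        rows.append((order_id, name, title))
    return rows


def report_joined(db):
    """Let the database do the join in a single statement."""
    return db.query("""
        SELECT o.id, c.name, p.title
        FROM orders o
        JOIN customer c ON c.id = o.customer_id
        JOIN product  p ON p.id = o.product_id
        ORDER BY o.id
    """)


if __name__ == "__main__":
    lazy_db = CountingDb(make_shop_db())
    joined_db = CountingDb(make_shop_db())
    same = report_lazy(lazy_db) == report_joined(joined_db)
    print(same, lazy_db.queries, joined_db.queries)
```

Both functions return the same report, but the lazy version issues one query for the orders plus two more per order, while the joined version issues exactly one. With half a dozen tables and real data volumes, that gap only widens.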

Ain't I glad that I have lived out the "myth", the "myth" that a "good" programmer should write generic DB code and wrap everything in an ORM so you never need to see a line of SQL at all! Ha!