More on CouchDB

Yesterday I gave a peek at CouchDB. Now let’s go a bit more in depth, step by step.

CouchDB is:

  • A document database server, accessible via a RESTful JSON API.

Instead of dealing in terms of tables and rows, one works with collections of documents.  Documents are objects composed of named fields.  Additionally, instead of using SQL queries to manipulate and retrieve the data, one uses HTTP requests (the RESTful part) that carry documents serialized as JavaScript Object Notation (the JSON part).  For example, instead of using a SQL query such as DELETE FROM table WHERE table.id = 2, one would make an HTTP “delete” request (that’s right folks, HTTP has more than “get” and “post”) to the URL that identifies the document to be deleted.
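To make the SQL comparison concrete, here is a minimal sketch of what such a “delete” request might look like.  The helper name buildDeleteRequest, the server address, the database name “blog”, and the revision string are all assumptions for illustration, and the rev query parameter follows the current CouchDB HTTP API, so the alpha may differ.  The code only describes the request; it does not send anything.

```javascript
// Hypothetical helper: describes the HTTP request that deletes one document.
// The base URL, database name, document id and revision are illustrative only.
function buildDeleteRequest(baseUrl, dbName, docId, rev) {
  // The document is addressed by its URL: /<database>/<document id>
  const url = baseUrl + "/" + encodeURIComponent(dbName) + "/" +
    encodeURIComponent(docId) + "?rev=" + encodeURIComponent(rev);
  // No SQL text is involved; the HTTP method itself says what to do.
  return { method: "DELETE", url: url };
}

const request = buildDeleteRequest("http://localhost:5984", "blog", "2", "1-abc");
console.log(request.method, request.url);
```

The returned object could then be handed to whatever HTTP client you like, for example fetch(request.url, { method: request.method }).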

  • Ad-hoc and schema-free with a flat address space.

There is no predefined structure for the data stored in CouchDB; documents are simply accepted as is.  Structure is formed later by views, which collect and organize data.
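As a rough illustration of “schema-free”, the sketch below puts two differently shaped documents side by side.  The field names (including the type field, which is just a convention I’m assuming here) and the ids are made up for the example.  Nothing declares in advance which fields a document must have.

```javascript
// Two hypothetical documents with different fields; neither follows a declared schema.
const documents = [
  { _id: "post-1", type: "post", title: "More on CouchDB", tags: ["couchdb", "databases"] },
  { _id: "author-1", type: "author", name: "seanoc" }
];

// Serialize each document to JSON, the form in which it would travel over HTTP.
const serialized = documents.map(function (doc) { return JSON.stringify(doc); });

// Parsing gives back exactly the fields each document was written with.
const restored = serialized.map(function (text) { return JSON.parse(text); });
restored.forEach(function (doc) {
  console.log(doc._id, Object.keys(doc).join(","));
});
```

In a relational table the second document would need its own table or a pile of NULL columns; here it simply sits next to the first.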

  • Distributed, featuring robust, incremental replication with bi-directional conflict detection and management

CouchDB is designed to scale easily.  Multiple physical servers can run the same database, all performing full CRUD operations, even when they are not able to communicate with each other.  Sounds like magic?  How do they possibly do this?  To be honest, I’m still wrapping my brain around the details.  If you would like to start digging in, however, check out their wiki article on distributed updates and replication.

  • Query-able and index-able, featuring a table oriented reporting engine that uses Javascript as a query language.

This is where those views I mentioned before come in.  Using JavaScript you can define what data you want to work with as well as the structure of said data.  Once you’ve defined a view you’ll probably want to use it more than once, and this is where the indexes come in.  Much like traditional databases, CouchDB uses indexes to optimize the process of finding the documents which belong to a particular view.  This helps the database remain fast, responsive, and efficient.
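Here is a small sketch of the idea: a view function picks the documents it cares about and decides the shape of each row, and an index keeps those rows sorted by key.  Current CouchDB documentation has the server supply emit(key, value) to the view function; in this sketch emit is passed in as a parameter so the code runs on its own, and buildIndex is a hypothetical in-memory stand-in for the server, not CouchDB’s actual indexer.  The documents are made up.

```javascript
// Hypothetical documents; the field names and values are illustrative.
const docs = [
  { _id: "a", type: "post", title: "More on CouchDB", date: "2007-10-12" },
  { _id: "b", type: "author", name: "seanoc" },
  { _id: "c", type: "post", title: "A peek at CouchDB", date: "2007-10-11" }
];

// The view definition: choose which documents belong and what each row looks like.
function postsByDate(doc, emit) {
  if (doc.type === "post") {
    emit(doc.date, doc.title);
  }
}

// Stand-in for the server: run the view over every document and keep rows sorted by key.
function buildIndex(allDocs, viewFn) {
  const rows = [];
  allDocs.forEach(function (doc) {
    viewFn(doc, function (key, value) {
      rows.push({ id: doc._id, key: key, value: value });
    });
  });
  rows.sort(function (x, y) { return x.key < y.key ? -1 : x.key > y.key ? 1 : 0; });
  return rows;
}

const index = buildIndex(docs, postsByDate);
console.log(index);
```

The author document never shows up in the rows because the view ignores it, and because the rows are sorted by key, looking up posts by date doesn’t require re-reading every document.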

And CouchDB is not:

  • A relational database.

CouchDB does not base its data on tables (a.k.a. relations); therefore it’s not a relational database.

  • A replacement for relational databases.

CouchDB is organized and works in a fundamentally different way than relational databases.  Accordingly, you cannot simply use it as a “drop-in” replacement for MySQL or PostgreSQL.  Going a step further, there are a variety of applications where a relational database simply makes more sense than a document database, and CouchDB is the wrong fit for those applications.  On the other hand, if you are starting a system from scratch or doing a fundamental rewrite of a system, and a document database makes sense for your application, then you may want to consider CouchDB.

  • An object-oriented database. More specifically, CouchDB is not meant to function as a seamless persistence layer for an OO programming language.

Pretty straightforward: don’t try to simply store your OO programming language objects as documents in CouchDB and expect everything to be OK.  This is not what CouchDB is designed to do, so it won’t be able to directly support all of the details and nuances of your data model(s) that your OO language can.

You should now have a basic understanding of what CouchDB is and whether it’s of interest to you.  Be warned that CouchDB is still an alpha product and is in no way ready for prime-time production applications.  It is something that I think will become important and useful as it matures.  Keep an eye here as well as on the CouchDB site to see what develops with CouchDB.

~ by seanoc on October 12, 2007.
