Friday, July 31, 2009

FairCom C-Tree

  • similar to: SQLite

FairCom is not an illogical choice to follow InterSystems; both companies' databases claim to be among the fastest on the market.

Also, both are "developers'" platforms, designed less with a general-purpose audience in mind and more with a techie audience. Both originate from successful companies that have been in business for a long time, and yet are not so well known outside tech circles.

So what are the differences?

What most people like about FairCom cTree is the access they get to the source code, which allows them to interact with the database through various interfaces, native, ADO, ODBC, etc. I guess that this is also possible with mySQL, SQLite, and perhaps PostGreSQL as well. FairCom predates (or is a contemporary) of most of these products.

Where FairCom differs from Intersystems is that its product is even less open, the cTree Ace SQLExplorer tool notwithstanding. It takes minimal admin effort and it seems targeted at turnkey or embedded systems developers, with its heavy access on C-application layer programming. You can certainly access cTree from C#, but the product is written in C and has a C developer audience in mind first; if performance is its main selling point (which makes sense: connecting from a JVM through a JDBC/ODBC bridge to, say, a remote Cache gateway which will in turn translate the code to native requests is probably akin to entering virtual machine hell), then staying close to the core system is compulsory. More on performance later.

Another thing that Cache and C-Tree have in common (but where they also differ) is that they provide different "views" into the database engine: hierarchical/sparse arrays/B-Trees in the case of Cache, C-Trees with ISAM and SQL interfaces in the case of C-Tree. Relational databases are based on, if memory serves, B-Trees (or B+ trees). However, SQL Server for example, keeps the relational engine very close to the B-Tree structure (time to review those Kalen Delaney books); in fact, I found the whole interaction between the set-based SQL and row-based processing engine quite fascinating.

Both Cache and C-Tree take a slightly different approach; the various interfaces into their storage engines are clearly provided for convenience only; back in the day, as far as I recall, Db-Lib was the library of choice for SQL Server as well (makes you wonder where does TDS live now?) The bottom line is that if you are going to use Cache or C-Tree, you should use the native interfaces; there is no other reason why you would choose C-Tree over a mainstream product such as SQL Server or Oracle, or even mySQL.

C-Tree uses ISAM as its innermost data structure; this harkens back to the mainframe days, and what it means is is that data is accessed directly through indexes, as opposed to allowing the query optimizer to decide which indexes to use (for a relational database).

As per Wikipedia, ISAM data is fixed-length. Indexes are stored in a separate tables and not in the leaves of data tables. MySQL functions on the same principle. A relational mechanism can exist on top of the ISAM structures. A more detailed presentation of the technicalities of working with the system can be found here. 

You can see more details of the structure here - how each table corresponds to a data/index file pair.

The reason I am likening it to SQLite is that it is a niche product that caters to a well-defined group: developers of embedded or turnkey systems (which is not dissimilar to who SQLite targets - remember that it is the db of choice for iTunes and Adobe AIR).