Monday 5 February 2007

Foreword

The purpose of this blog is to gain peer review for my non-traditional database design.

The Rhizome-Tesseract (RhiTess for short) design could (theoretically) wipe the floor with other RDBMS approaches under the following circumstances.
  • One or more tables with more than 10^7 rows.
  • A storage medium (disk) where sequential reading is at least 10^2 times faster than reading random blocks.
  • A dataset more than 10 times the size of physical RAM.
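
As a rough illustration, the three circumstances above can be written as a simple check. This is only a sketch: the Workload class, its field names, and the example figures are made up for illustration and are not part of RhiTess.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a database workload (names are illustrative)."""
    largest_table_rows: int      # rows in the biggest table
    sequential_read_rate: float  # blocks per second when reading sequentially
    random_read_rate: float      # blocks per second when reading random blocks
    dataset_bytes: int           # total size of the data on disk
    ram_bytes: int               # physical RAM


def meets_stated_conditions(w: Workload) -> bool:
    """Return True when all three circumstances listed above hold."""
    big_table = w.largest_table_rows > 10**7
    slow_random_reads = w.sequential_read_rate >= 100 * w.random_read_rate
    exceeds_ram = w.dataset_bytes > 10 * w.ram_bytes
    return big_table and slow_random_reads and exceeds_ram


# Made-up figures: 5 * 10^7 rows, 200 GiB of data, 4 GiB of RAM.
example = Workload(
    largest_table_rows=5 * 10**7,
    sequential_read_rate=20_000,
    random_read_rate=150,
    dataset_bytes=200 * 2**30,
    ram_bytes=4 * 2**30,
)
print(meets_stated_conditions(example))
```

A workload that fails any one of the three tests (for example, a dataset that fits comfortably in RAM) falls outside the case this design is aimed at.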

The project currently has a few working modules (written in C++ and Ruby), but is not yet suitable for even an alpha release. I suffer from a lack of free time.

I believe the advantages of the RhiTess design include:
  • Select queries (with aggregate fields) on tables with more than 10^7 rows are performed 10^4 - 10^7 times faster than with traditional indices, e.g. SELECT sum(value) FROM sales WHERE sale_date BETWEEN '01/01/06' AND '31/12/06' AND value > 10.00
  • Select queries with multiple conditions can be 10^2 times faster, e.g. SELECT * FROM products WHERE description LIKE '%book%' AND weight > 0.2
  • Other database operations are of similar speed to traditional designs (0.5-2 times as fast).
  • Disk space required for storing the data and indices is typically 10-40% of that required by a traditional RDBMS.

The design, to be described in the following blog entries, may well already be in use. If so, please let me know.

The blog will take the form of a brain-dump, rather than a well-written paper. Publishing the design this way is still useful, as it provides prior art against any subsequent software-patent claims.

If there is sufficient interest in the blog, I will work it into a coherent paper.

I may also be making major mistakes in my performance calculations. My understanding of current databases is limited; it is not my field. I am familiar with the workings of C-ISAM/Informix SE and MySQL MyISAM tables. I am also aware of clustered indices and datacubes. There is much else in DB technology of which I am ignorant. Please forgive me, and let me know, if my design is naive or misguided.

Of course, the principal reason for writing this blog is to encourage the guys from Google to offer me a job ;-)
