Latest Articles Latest Articles

Apr 21, 2008 Powered By SQLite
Apr 20, 2008 Double-Time Blogs Are Back!
Sep 05, 2006 Finding Prime Numbers Using C++
Jul 01, 2006 Finding Prime Numbers - A Simple Assembly Language Example Using NASM

Powered By SQLite

BookGoldMine.com uses SQLite as its backend. The site is almost one and half years old and SQLite has consistently performed well, including times of fairly high traffic for this type of website - 10,000+ page views in one day. SQLite is still relatively unknown (but is far more popular now than when it was when I first started using it a few years ago) to many developers, or at least web developers. At this point I have to assume that anyone working on embedded applications or desktop applications that need to work at least part of the time offline (e.g. for applications that run on tablet PC) has at least considered SQLite, as these are the kinds of applications it REALLY shines in. My goal here is to show web developers how SQLite can help them as well.

First, I want to point out that I'm not pushing immature technology. SQLite is being used in a TON of applications. It is used as a back end in Mozilla Firefox, Apple Mac OS X, Skype and even the iPhone. You can see more high profile projects using SQLite here.

So why would a web developer want to use SQLite? Tons of reasons; SQLite is:

  • relatively scalable (more on that later)
  • free (open source and at no cost)
  • in-process
  • serverless
  • self-contained
  • transactional

Each of the line items above is discussed in detail on SQLite's about page. The first item on the list probably has most of the web developers reading this cringing. What do I mean by relatively scalable? Let's start with database size limits:

Internally, SQLite can handle databases up to 2^40 bytes (1 terabyte) in size. But the backend interface to POSIX and Win32 limits files to 2^31 (2 gigabytes). SQLite arbitrarily limits the amount of data in one row to 1 megabyte. There is a single #define in the source code that can be changed to raise this limit as high as 16 megabytes if desired. There is a theoretical limit of about 2^32 (4 billion) rows in a single table, but there is no way to test this limit without exceeding the maximum file size, so it is not really an issue. There is also a theoretical limit of about 2^32 tables and indices, but again it is not really possible to reach this limit due to the file size constraint.

Most websites have very little data (remember, even a megabyte is a lot of content; the main BookGoldMine.com database is less than 5 megabytes. I suspect that if you have two gigabytes of data (or even the chance for having that much data) in the first place, you wouldn't even be considering a product like SQLite.

How about scalability from a concurrency standpoint? According to the SQLite FAQ,

Multiple processes can have the same database open at the same time. Multiple processes can be doing a SELECT at the same time. But only one process can be making changes to the database at any moment in time, however. SQLite uses reader/writer locks to control access to the database

We are aware of no other embedded SQL database engine that supports as much concurrency as SQLite. SQLite allows multiple processes to have the database file open at once, and for multiple processes to read the database at once. When any process wants to write, it must lock the entire database file for the duration of its update. But that normally only takes a few milliseconds. Other processes just wait on the writer to finish then continue about their business. Other embedded SQL database engines typically only allow a single process to connect to the database at once.

However, client/server database engines (such as PostgreSQL, MySQL, or Oracle) usually support a higher level of concurrency and allow multiple processes to be writing to the same database at the same time. This is possible in a client/server database because there is always a single well-controlled server process available to coordinate access. If your application has a need for a lot of concurrency, then you should consider using a client/server database. But experience suggests that most applications need much less concurrency than their designers imagine.

So, if the website you are developing is like the vast, vast majority of websites out there, the fact that SQLite blocks when it writes is not a show stopper. Most websites do not update very often at all (relatively speaking; meaning not writing to the database every few milliseconds). I suspect that if you're writing the next Travelocity, SQLite was not a contender anyway.

If you're still reading this, it means that the website you are developing is not highly concurrent and does not use many Gigabytes of data. Good. You still probably have scalability questions. SQLite has this "rule of thumb",

SQLite usually will work great as the database engine for low to medium traffic websites (which is to say, 99.9% of all websites). The amount of web traffic that SQLite can handle depends, of course, on how heavily the website uses its database. Generally speaking, any site that gets fewer than 100K hits/day should work fine with SQLite. The 100K hits/day figure is a conservative estimate, not a hard upper bound. SQLite has been demonstrated to work with 10 times that amount of traffic.

...If your website is so busy that you are thinking of splitting the database component off onto a separate machine, then you should definitely consider using an enterprise-class client/server database engine instead of SQLite.

Now that we've gotten all of the weaknesses and potential pitfalls out of the way, let's talk about what kind of projects would benefit from SQLite. I'll start off with an example that hits close to home. At one time or another, I've owned and maintained several websites running on shared hosting services (these are the typical $60 - $300 a year plans). All of them used a database backend to some capacity - MySQL, SQL Server and MS Access. The results were always less than spectacular; performance and uptime were relatively low. The culprit was always the backend.

I can sense that some readers are rolling their eyes at me now. "SQL Server and MySQL couldn't perform? This guy is insane!" I'm not arguing that those two DBMSs are bad; I'm arguing that they perform generally bad on shared hosting services - which I imagine constitute the vast majority of websites out there. Why? Because shared hosted services put lots of users on their DBMS. That means that you have a good chance of a poorly written query (e.g. report) bogging down the entire database server. Compound a few of those and your website runs dog slow (if it runs at all). SQLite can run directly on the web server, which generally isn't taxed nearly as heavily as the database server. You might be wondering why MS Access performed poorly. After all, MS Access is file based just like SQLite and it can run on the web server as well. MS Access kept giving me file locking issues. Granted, it could have been poorly written queries (I was green at the time), but the experience was enough to keep me away from Access.

Another advantage to using SQLite in a shared environment is that shared environments almost always limit you to only one DBMS database and most limit you to much less disk space on the DBMS than on your web server (e.g. 100MB vs 4GB). SQLite allows you to separate databases logically and allows you to use more of that disk space you paid for anyway.

Even if you're not using a shared hosting plan, you may still want to consider SQLite. Because SQLite is in-process, it may be able to provide a faster response than a DBMS such as MySQL (and database administration is not necessary). Jared Greeno did some benchmark testing and summarized his findings,

SQLite performs at the same or better speed vs. MySQL when dealing with "single-threaded" activty, i.e. only one process reading and writing to the database. SQLite's locking must be taken into account when inserting into a table, however. Further tests on concurrent access are in order.

Which tells me that if you're website/database is getting lots of concurrent hits, MySQL is the better performer; while a website with less traffic (the majority of websites) will see better performance with SQLite. All things being equal of course; there are other reasons to want to use MySQL than just performance.

SQLite has proven itself to be very stable and fast in the desktop and embedded field. I look forward to seeing it power many small and midsize websites that need a good backend. DBMSs can handle these types of tasks easily; unfortunately they do not perform very well in many shared hosted environments. SQLite offers an excellent alternative for websites running in this type of environment.