Indexing Our PostgreSQL Database

Published in Database on Oct 23, 2020 by Brendan Quinlan

Recently, we've been diving into a few performance bottlenecks on several key pages in our app. Being the content management and collaboration platform we are, a typical user will find themselves most often editing and viewing the details and content of a listing.

Unfortunately, these pages were the biggest cultprits of sluggish performance. Given how core they are to our user experience, performance improvements were long overdue.

Using The Heroku New Relic APM Add-on

To start, because we use Heroku for deployment and sever management, we could leverage their New Relic APM add-on to inspect response times on these pages. New Relic's tools for web tansactions offer a fine level of granularity into server performance with segmented response times.

The screenshot above provides an example of this breakdown, showing the different segments of the transaction and their affect on total response time. After quite of bit of sleuthing, some key trends became clear accross these problematic listing pages: several of the the postgres select operations we're bogging down entire page loads.

Now, this wasn't entirely surprising. Listing related pages on Aryeo are typically content heavy - for a property website, for example, dozens of images, videos and other property data are pulled in to construct a single visual tour of that listing. Being a content management platform, we host a lot of images.... millions and millions of images. Therefore, pulling in the images for a listing may require a query accross a database table of multiple millions of records.

To Index Or Not To Index, That Is The Question

That being said, these queries were still a bit too long for our liking. So we dug further. And inevitably, we struck gold. Indexing, or the lack thereof. We weren't indexing some of these key database relations.

PostgreSQL doesn't automatically index foreign keys

PostgreSQL is our database system of choice at Aryeo. When defining a database relation - the listing an image belongs to, for example - foreign keys are used. What we didn't know was that PostgreSQL does not index foreign key relations automatically, a trait of some other relational database structures such as MySQL.

Why is indexing important?

Per the PosgreSQL docucumentation,

"an index allows the database server to find and retrieve specific rows much faster than it could do without an index."

There are additionally a number of different indexing algorithms that drive these performance enhancements, though that discussion is out of the scope of this article.

For our purposes, with select queries on database tables with millions of records, indexing could be gamechanging.

When to avoid indexing?

Indexes are fantastic for foreign key relations, as typically once created, the foreign key of a record won't be updated all that often.

However, additional consideration should be given to indexing database columns that are frequently updated. With indexing, insert and update operations carry added overhead, as upon each inertion or update the index structure must be updated as well. Depending on your database structure, this con may outweight the pro of the added performance benefits of select operations.

Adding Indexes Via A New Laravel Migration

We got to work adding indexes to several of these sluggish database operations. Laravel, our PHP backend framework, comes with a clean way to update database schemas out-of-the-box using migrations. Using a new migration, these updates were quite simple.

Here's an excerpt of one such migration.

 class AddIndexesToFkLookups extends Migration
 {
     public function up()
     {
         Schema::table('photos', function (Blueprint $table) {
             $table->index('uuid');
             $table->index('listing_id');
         });
		
       	 ...
     }

     /**
      * Reverse the migrations.
      *
      * @return void
      */
     public function down()
     {
         Schema::table('photos', function (Blueprint $table) {
             $table->dropIndex(['uuid']);
             $table->dropIndex(['listing_id']);
         });

         ...
     }
 }

The migration above adds indexes to the uuid and listing_id columns of our photos table, one of a number of tables included in this update.

Performance In Production

As suggested above, indexing does have some performance considerations. Overhead is added when creating new or updating current indexes.

Therefore, we did have some concerns prior to running the migration above on our production database:

  • With millions of records accross a number of tables, such a migration may take some time. How long might this take?
  • While the index structure is generating, will our database be "locked up", giving users of Aryeo no way to write new information to the database?

Tests on production database copies in our development environments set an expectation of about a 60 second runtime for the migration. Long, but not the end of the world if we needed to take the app offline for some late night maintenence.

Inevitably, however, this runtime didn't matter. The concern over run time was predicated on the concept of being "locked out" of our database while this migration ran. Following some thorough testing and consultation of some helpful friends in the Laravel community, we were able to dismiss this "locked out" concern. While running the migration, the database should still perform nominally without issue.

So, we were ready to go. And upon release, here's what happened

Select operations for our photos table:

Select operations for our groups (the organization users belong to) table:

Select operations for our property websites table:

That's A Wrap

Accross the board, indexing brought some massive performance boosts to our app. Especially on content heavy pages like property websites where several of these queries can pile up, indexing really shines.

The effects compound to provide faster page loads and an overall better user experience.