How to identify missing indexes in PostgreSQL

·

3 min read

This might sound familiar to you:

You join a new project with existing applications. The application and database schema does not look overwhelmingly complex. Yes, the queries span across multiple tables, but nothing unusual.

Anyways, the query performance remains far below your expectations. Having a short look into it, you realize that there are no or only a few indexes. It is clear that there needs to be some improvement. Now there are different approaches to get things started. You can

  • Analyze the database schema
  • Analyze the code base and queries
  • Analyze slow queries logged in the database
  • Analyze table statistics

In this post, I will focus on the last point. Let's check out the PostgreSQL table statistics and how they might help identify missing indexes. Just be aware, that this might not be the most complete solution, but it helps tackle some low-hanging fruits.

Table statistics: seq scan vs. index scan

PostgreSQL provides the table pg_stat_user_tables giving us some basic information about the tables e.g. how they are searched within your queries.

Check out the PostgreSQL documentation for details: https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ALL-TABLES-VIEW

For us the most interesting fields are

  • seq_scan : bigint
  • idx_scan : bigint

They indicate if the tables are usually searched sequentially or by an index.

As a rule of thumb, if a table is large, then we would prefer to scan it via index. As the data might not fit into the memory. And reading from the hard disk tends to be slow.

Analyzing statics

Here comes a handy query:

select
    relname,
    seq_scan - coalesce(idx_scan, 0) "seq_vs_idx",
    seq_scan,
    idx_scan,
    pg_relation_size(relname::regclass) "rel_size",
    case
        when seq_scan - coalesce(idx_scan, 0) > 0 then 'Check indexes'
        else ''
    end "hint"
from
    pg_stat_user_tables
where
    schemaname = 'public' -- or whatever schema you are using
    and pg_relation_size(relname::regclass) > 80000
order by
    seq_vs_idx desc;

It lists the tables of a certain size and hints if you should check whether all indexes exist or not. The most relevant entries are on top.

Next, you will find an example, which should be analyzed in more detail. At least the first ten entries, look like they are missing an index.

image.png

Check indexes of the table

The next query will list all the indexes of your table.

select
    tablename,
    indexname,
    indexdef
from
    pg_indexes
where
    schemaname = 'public' -- or whatever schema you are using
and
    tablename = 'foobar'  -- or whatever table you are using
order by
    tablename,
    indexname;

Here we want to make sure, that we have at least an index of the primary key. And if we already know other columns used in queries, we might want to index those columns, too.

Add missing indexes

There is some more theory about indexes in PostgreSQL, that would exceed this post. Here is a good starting point: https://www.postgresql.org/docs/9.1/indexes.html

In general, here is an example of how an index could be added:

create index INDEX_NAME on
SCHEMA.TABLE_NAME
    using btree (FIELD);

Further readings

This post was inspired by

Did you find this article valuable?

Support Erik RW by becoming a sponsor. Any amount is appreciated!