This might sound familiar to you:
You join a new project with existing applications. The application and database schema does not look overwhelmingly complex. Yes, the queries span across multiple tables, but nothing unusual.
Anyways, the query performance remains far below your expectations. Having a short look into it, you realize that there are no or only a few indexes. It is clear that there needs to be some improvement. Now there are different approaches to get things started. You can
- Analyze the database schema
- Analyze the code base and queries
- Analyze slow queries logged in the database
- Analyze table statistics
In this post, I will focus on the last point. Let's check out the PostgreSQL table statistics and how they might help identify missing indexes. Just be aware, that this might not be the most complete solution, but it helps tackle some low-hanging fruits.
Table statistics: seq scan vs. index scan
PostgreSQL provides the table pg_stat_user_tables
giving us some basic information about the tables e.g. how they are searched within your queries.
Check out the PostgreSQL documentation for details: https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ALL-TABLES-VIEW
For us the most interesting fields are
seq_scan
: bigintidx_scan
: bigint
They indicate if the tables are usually searched sequentially or by an index.
As a rule of thumb, if a table is large, then we would prefer to scan it via index. As the data might not fit into the memory. And reading from the hard disk tends to be slow.
Analyzing statics
Here comes a handy query:
select
relname,
seq_scan - coalesce(idx_scan, 0) "seq_vs_idx",
seq_scan,
idx_scan,
pg_relation_size(relname::regclass) "rel_size",
case
when seq_scan - coalesce(idx_scan, 0) > 0 then 'Check indexes'
else ''
end "hint"
from
pg_stat_user_tables
where
schemaname = 'public' -- or whatever schema you are using
and pg_relation_size(relname::regclass) > 80000
order by
seq_vs_idx desc;
It lists the tables of a certain size and hints if you should check whether all indexes exist or not. The most relevant entries are on top.
Next, you will find an example, which should be analyzed in more detail. At least the first ten entries, look like they are missing an index.
Check indexes of the table
The next query will list all the indexes of your table.
select
tablename,
indexname,
indexdef
from
pg_indexes
where
schemaname = 'public' -- or whatever schema you are using
and
tablename = 'foobar' -- or whatever table you are using
order by
tablename,
indexname;
Here we want to make sure, that we have at least an index of the primary key. And if we already know other columns used in queries, we might want to index those columns, too.
Add missing indexes
There is some more theory about indexes in PostgreSQL, that would exceed this post. Here is a good starting point: https://www.postgresql.org/docs/9.1/indexes.html
In general, here is an example of how an index could be added:
create index INDEX_NAME on
SCHEMA.TABLE_NAME
using btree (FIELD);
Further readings
This post was inspired by