PostgreSQL indexes Types and Functionality
This article will discuss PostgreSQL indexes, their types, and their uses. Indexes are designed to optimize database queries, but they add memory overhead. Therefore, they should be used wisely and with understanding.
Indexes types
The main thing to know about indexes is that they are different. Each is used in certain cases. By default, PostgreSQL creates a B-Tree index, which is the most versatile. The second most used is the Hash index, which is ideal for searching for specific records in a table. Other indexes, such as GiST, SP-GiST, GIN, and BRIN, are more situational. Now, here are a few more details about each of them.
B-Tree
The B-tree index is suitable for sorted data. In other words, the operators “greater than,” “greater than or equal to,” “less than,” “less than or equal to,” and “equal to” must be defined for the data type.
Hash
Hash indexes are designed for fast key-value searches. They are created based on the value’s hash code, which gives us incredible search speed when using “=” or “IN.”
GiST
The GiST (Generalized Search Tree) is a balanced search tree, similar to the B-Tree. But the B-Tree requires the comparisons: “more,” “less,” or “equal" to be solvable. GiST, on the other hand, doesn’t have such strict requirements on comparisons. Modern databases also store types of data for which these operators aren't applicable: i.e. geodata, text documents, or pictures.
SP-GiST
This is a “space-partitioned GiST (generalized search tree)”. This index is used to improve searching within complex data, such as geometric data or IP addresses.This index is designed to support a wide range of data types and complex queries. For example, if you have a table that contains longitude, latitude, and city name, SP-GIST will help you find all places within a 10 kilometer radius of a given point.
GIN
GIN stands for “Generalized Inverted Index,” a reverse index. It works with values that are not atomic but consist of elements; each element refers to a specific value in the table. A good example is the alphabetical index at the end of books, where for each term, a list of pages where this term was used is given.
BRIN
Block Range Index isn’t designed to find a desired value quickly but to avoid viewing unnecessary values. It splits the table into “pages” and stores the “max” and “min” values, making it fast and lightweight for large tables. But within the “page,” the search goes row by row. One column in the table must be responsible for sorting to work well. For example if you have a table “users”, you need to have a “register_date” column that can sort this table for that index.
Multicolumn indexes
An index can be defined on more than one column of a table. A multicolumn index is often called a composite index, a combined index, or a concatenated index. Only B-Tree, GIST, GIN, and BRIN index types support multicolumn indexes. A multicolumn index can include a maximum of 32 columns. You can change the maximum number of columns. But to do it you need to build your own, custom PostgreSQL by yourself.
Syntax for creating:
CREATE INDEX [IF NOT EXISTS] index_name
ON table_name (column1, column2, ...);
Index on Expression
One of the special indexes in PostgreSQL is Indexes on Expressions. This type of index allows us to optimize queries containing expressions. This reduces the time to retrieve data and operations in the database, reducing server power consumption. They allow you to store the result of the expression together and not just the column values. This is very useful when your functions contain calculations, functions, or transformations. Using this index, PostgreSQL will quickly find rows that match a given function\calculation without enumerating it for each row.
Syntax for creating:
CREATE INDEX index_name
ON table_name (expression);
For example if you have a table “users” and that table contains “birth_date” you can use the expression “DATE_PART('year', AGE(birth_date)”.
Partial indexes
PostgreSQL creates an index for the entire column when you create an index on a table. Sometimes, you may want to include only some values from a column in the index. To do this, you can use Partial indexes. This index will be built only for rows that satisfy a predicate. This is useful for optimizing memory usage. For example, create indexes only for rows frequently used.
Syntax for creating:
CREATE [IF NOT EXISTS] INDEX index_name
ON table_name(column1,column2, ...)
WHERE predicate;
Predicate example: status = 'active', amount > 1000, etc. That is, the predicate that will be used in your query
Indexes in order by
PostgreSQL adds sorting to the execution plan when performing a query with sorting. If you create a sorted index, you can exclude this sorting from the plan. That is, PostgreSQL will simply take everything sorted from the created index. This works best when using limits with sorting. PostgreSQL will need to take some of the records from the beginning or end of the sorted index.
Syntax for creating:
CREATE INDEX index_name
ON table_name (column_name [ASC | DESC] [NULLS [FIRST | LAST]]);
Summary
PostgreSQL has many different types of indexes and ways to apply them. By understanding their variability and specifics, you can significantly optimize your database’s performance and even reduce its size.
Indexes are a powerful feature but must be used correctly for enhanced database performance.