15 PostgreSQL DBA Interview Questions and Answers
Prepare for your next technical interview with this guide on PostgreSQL DBA, featuring common and advanced questions to enhance your database skills.
PostgreSQL is a powerful, open-source relational database system known for its robustness, scalability, and compliance with SQL standards. It is widely used in various industries for managing large datasets and supporting complex queries. PostgreSQL’s extensibility and support for advanced data types make it a preferred choice for many organizations looking to build reliable and efficient database solutions.
This article offers a curated selection of interview questions tailored for PostgreSQL Database Administrators (DBAs). Reviewing these questions will help you deepen your understanding of PostgreSQL’s features and best practices, ensuring you are well-prepared to demonstrate your expertise in any technical interview setting.
A PostgreSQL DBA manages and maintains PostgreSQL databases. Their responsibilities include:
To find the top 5 largest tables in a PostgreSQL database, use the following SQL query:
SELECT table_name, pg_size_pretty(pg_total_relation_size(format('%I.%I', table_schema, table_name)::regclass)) AS size FROM information_schema.tables WHERE table_schema = 'public' ORDER BY pg_total_relation_size(format('%I.%I', table_schema, table_name)::regclass) DESC LIMIT 5;
Replication in PostgreSQL involves setting up a primary server and standby servers. The primary server handles read and write operations, while standby servers can take over if the primary fails. Key steps include:
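1. Configure the Primary Server: Edit postgresql.conf to enable replication (for example, wal_level = replica and a sufficient max_wal_senders), create a role with the REPLICATION attribute, and allow that role to connect in pg_hba.conf.
2. Base Backup: Use pg_basebackup to create a copy of the primary server’s data directory.
3. Configure the Standby Server: Copy the base backup to the standby server. On PostgreSQL 12 and later, create an empty standby.signal file in the data directory and set primary_conninfo in postgresql.conf; on version 11 and earlier, put these settings in recovery.conf.
4. Start the Standby Server: Start the PostgreSQL service to begin streaming WAL records.
5. Monitor Replication: Query the pg_stat_replication view on the primary to monitor replication status.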
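The primary-side parts of these steps can be sketched in SQL. This is a minimal sketch, assuming PostgreSQL 10 or later and a superuser session; the role name replicator, its password, and the max_wal_senders value are placeholders chosen for illustration.

```sql
-- Run on the primary as a superuser (ALTER SYSTEM cannot run inside a transaction block).
ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET max_wal_senders = 10;   -- illustrative value
-- Both settings above take effect only after a server restart.

-- Hypothetical replication role; the standby connects as this user.
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'change-me';

-- Once a standby is streaming, each one appears as a row here.
SELECT client_addr,
       state,
       sent_lsn,
       replay_lsn,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
```

An empty result from the last query means no standby is currently connected.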
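To identify slow-running queries, use the pg_stat_statements view, which is provided by the extension of the same name:
SELECT query, calls, total_exec_time, mean_exec_time, stddev_exec_time, rows FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10;
These column names apply to PostgreSQL 13 and later; earlier versions use total_time, mean_time, and stddev_time. Ensure the pg_stat_statements extension is enabled: it must be listed in shared_preload_libraries and created in the database with CREATE EXTENSION.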
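Enabling the extension might look like the following. This is a sketch that assumes a superuser session and that no other libraries are already preloaded, since ALTER SYSTEM replaces the whole setting.

```sql
-- Overwrites any existing shared_preload_libraries value; merge by hand if others are set.
ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';
-- Restart the server here: preload libraries are read only at startup.

-- Then, in each database where the view should be queryable:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Optionally clear accumulated statistics before a measurement window.
SELECT pg_stat_statements_reset();
```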
PostgreSQL offers several types of indexes: B-tree (the default), Hash, GiST, SP-GiST, GIN, and BRIN.
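The index type is selected with the USING clause of CREATE INDEX. The sketch below uses a hypothetical events table, invented here only to give each index type a suitable column.

```sql
-- Hypothetical table for illustration.
CREATE TABLE events (
    id         bigint PRIMARY KEY,      -- the primary key is backed by a B-tree index
    created_at timestamptz NOT NULL,
    session_id text,
    payload    jsonb,
    period     tstzrange
);

-- B-tree is used when no USING clause is given; suits equality and range predicates.
CREATE INDEX events_created_btree ON events (created_at);

-- Hash: equality lookups only.
CREATE INDEX events_session_hash ON events USING hash (session_id);

-- GIN: containment queries on jsonb, arrays, and full-text search.
CREATE INDEX events_payload_gin ON events USING gin (payload);

-- GiST: range overlap, geometric, and nearest-neighbour searches.
CREATE INDEX events_period_gist ON events USING gist (period);

-- BRIN: very small index for large tables whose values correlate with physical order.
CREATE INDEX events_created_brin ON events USING brin (created_at);
```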
To check for table bloat, a quick first pass is to compare live and dead tuple counts in the pg_stat_user_tables statistics view:
SELECT schemaname, relname, n_live_tup, n_dead_tup, ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct, pg_size_pretty(pg_total_relation_size(relid)) AS total_size FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 10;
These counters are estimates, so treat the result as a pointer to tables worth inspecting. The pgstattuple extension reports exact figures at the cost of scanning the table.
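VACUUM in PostgreSQL reclaims storage occupied by dead tuples. Under PostgreSQL’s multiversion concurrency control (MVCC), an UPDATE or DELETE does not overwrite or remove the old row version immediately, because concurrent transactions may still need to see it. Once no transaction can see a dead tuple, VACUUM marks its space as reusable.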
The importance of VACUUM includes:
Types of VACUUM operations:
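The common variants can be sketched as follows. The table name my_table is a placeholder, and VACUUM cannot run inside a transaction block.

```sql
-- Plain VACUUM: marks dead tuples reusable without blocking reads or writes.
VACUUM (VERBOSE) my_table;

-- VACUUM with ANALYZE: also refreshes planner statistics.
VACUUM (ANALYZE) my_table;

-- VACUUM FULL: rewrites the table to return space to the operating system,
-- but holds an ACCESS EXCLUSIVE lock for the duration.
VACUUM FULL my_table;

-- See when manual and automatic vacuum last ran on each table.
SELECT relname, last_vacuum, last_autovacuum, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
```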
Configuring PostgreSQL for high availability involves strategies to ensure database accessibility during disruptions. Key components include replication, failover, and monitoring.
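For the monitoring and failover pieces, a few built-in functions are useful. This is a sketch rather than a complete failover procedure; pg_promote() assumes PostgreSQL 12 or later, and production setups usually delegate promotion to a cluster manager.

```sql
-- Returns true on a standby, false on a primary.
SELECT pg_is_in_recovery();

-- On a standby: time since the last replayed transaction was committed on the primary.
-- The value grows while the primary is idle, so it only bounds the delay.
SELECT now() - pg_last_xact_replay_timestamp() AS replay_delay;

-- On a standby, to promote it manually after the primary has failed:
-- SELECT pg_promote();
```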
To monitor disk usage by each database, use this SQL query:
SELECT pg_database.datname AS database_name, pg_size_pretty(pg_database_size(pg_database.datname)) AS size FROM pg_database;
This query lists all databases and their sizes in a human-readable format.
Partitioning in PostgreSQL can be implemented using range, list, or hash partitioning. Here’s an example using range partitioning. Note that a primary key on a partitioned table must include the partition key column:
CREATE TABLE sales ( id serial, sale_date date NOT NULL, amount numeric, PRIMARY KEY (id, sale_date) ) PARTITION BY RANGE (sale_date); CREATE TABLE sales_2022 PARTITION OF sales FOR VALUES FROM ('2022-01-01') TO ('2023-01-01'); CREATE TABLE sales_2023 PARTITION OF sales FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
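To see row routing and partition pruning with the sales table, something like the following can be run. The inserted values are made up for illustration.

```sql
-- Rows are routed to the matching partition automatically.
INSERT INTO sales (sale_date, amount)
VALUES ('2022-06-15', 100.00),
       ('2023-03-01', 250.00);

-- tableoid reveals which partition physically stores each row.
SELECT tableoid::regclass AS partition_name, id, sale_date, amount
FROM sales
ORDER BY sale_date;

-- With a predicate on the partition key, the plan should scan only sales_2023.
EXPLAIN
SELECT * FROM sales
WHERE sale_date >= '2023-01-01' AND sale_date < '2023-02-01';
```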
Benefits of partitioning include:
PostgreSQL detects deadlocks automatically: after a session has waited deadlock_timeout for a lock, the server checks for a cycle and aborts one of the transactions involved. To see which sessions are currently blocked and which sessions are blocking them, join the pg_locks system view with pg_stat_activity:
SELECT blocked_locks.pid AS blocked_pid, blocked_activity.usename AS blocked_user, blocking_locks.pid AS blocking_pid, blocking_activity.usename AS blocking_user, blocked_activity.query AS blocked_query, blocking_activity.query AS blocking_query FROM pg_catalog.pg_locks blocked_locks JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_locks.pid = blocked_activity.pid JOIN pg_catalog.pg_locks blocking_locks ON blocking_locks.locktype = blocked_locks.locktype AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid AND blocking_locks.pid != blocked_locks.pid JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_locks.pid = blocking_activity.pid WHERE NOT blocked_locks.granted AND blocking_locks.granted;
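On PostgreSQL 9.6 and later, the built-in pg_blocking_pids() function gives a shorter way to find blocked sessions. The sketch below also shows the settings that control deadlock checks and lock-wait logging.

```sql
-- Sessions that are waiting on a lock, with the PIDs holding them up.
SELECT pid,
       usename,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;

-- How long a session waits on a lock before the deadlock check runs.
SHOW deadlock_timeout;

-- When on, waits longer than deadlock_timeout are written to the server log.
SHOW log_lock_waits;
```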
Write-Ahead Logging (WAL) in PostgreSQL ensures data integrity and durability by logging changes before they are applied. WAL allows for crash recovery and point-in-time recovery (PITR).
WAL works by writing a record of each change to the log and flushing that record to disk before the transaction’s commit is acknowledged and before the modified data pages themselves are written. If the system crashes, PostgreSQL replays the log from the last checkpoint to restore the database to a consistent state.
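The current WAL state can be inspected from SQL. This sketch assumes PostgreSQL 10 or later, where the functions use the wal naming, and a primary server, since pg_current_wal_lsn() raises an error during recovery.

```sql
-- Level of detail written to WAL: minimal, replica, or logical.
SHOW wal_level;

-- Whether completed WAL segments are archived, which PITR requires.
SHOW archive_mode;

-- Current write position and the segment file that contains it.
SELECT pg_current_wal_lsn()                  AS current_lsn,
       pg_walfile_name(pg_current_wal_lsn()) AS current_segment;

-- Force a checkpoint; crash recovery replays WAL from the last checkpoint.
CHECKPOINT;
```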
Securing a PostgreSQL database involves several practices:
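One such practice, least-privilege access through roles, can be sketched as follows. The role names, password, and use of the public schema are assumptions for illustration; the REVOKE affects every role, so review it before running it on a shared database.

```sql
-- Group role that carries privileges but cannot log in itself.
CREATE ROLE app_readonly NOLOGIN;

-- Login role that inherits the group's privileges.
CREATE ROLE report_user LOGIN PASSWORD 'change-me' IN ROLE app_readonly;

-- Remove the default grants on the schema, then grant back only what is needed.
REVOKE ALL ON SCHEMA public FROM PUBLIC;
GRANT USAGE ON SCHEMA public TO app_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_readonly;

-- Apply the same read-only grant to tables created later by the current role.
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO app_readonly;

-- Check password hashing and whether TLS is enabled for connections.
SHOW password_encryption;
SHOW ssl;
```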
To list all active connections, query the pg_stat_activity system view:
SELECT pid, usename, datname, client_addr, state FROM pg_stat_activity;
This query provides details such as process ID, user, database, client address, and connection state.
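The same view can be filtered or aggregated to spot problem sessions. In this sketch the five-minute threshold is an arbitrary example, and the termination call is left commented out because it ends the target session.

```sql
-- Connection counts per state (active, idle, idle in transaction, ...).
SELECT state, count(*) AS connections
FROM pg_stat_activity
GROUP BY state
ORDER BY connections DESC;

-- Sessions holding a transaction open while idle for more than five minutes.
SELECT pid,
       usename,
       now() - state_change AS idle_for,
       query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - state_change > interval '5 minutes';

-- To end a session, pass its pid:
-- SELECT pg_terminate_backend(12345);
```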
Detecting and handling data corruption involves several steps:
To detect data corruption, use:
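Two detection tools that ship with PostgreSQL are data checksums and the amcheck extension. The sketch below creates a throwaway table, corruption_demo, so the index check has something to run against; on a real system you would pass the name of an existing B-tree index.

```sql
-- 'on' if the cluster was initialized with page checksums.
SHOW data_checksums;

-- amcheck verifies the logical consistency of B-tree indexes.
CREATE EXTENSION IF NOT EXISTS amcheck;

-- Hypothetical table; its primary key index is named corruption_demo_pkey.
CREATE TABLE corruption_demo (id int PRIMARY KEY);

-- Raises an error if the index structure is inconsistent; returns nothing otherwise.
SELECT bt_index_check('corruption_demo_pkey'::regclass);
```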
To handle data corruption, you can: