AWS Database Blog
Managing object dependencies in PostgreSQL – Overview and helpful inspection queries (Part 1)
In PostgreSQL, object binding (or dependencies) encompasses the relationships existing among various database elements. These interdependencies hold significant importance when it comes to the management and modification of objects within the database. They ensure that adjustments made to one object don’t inadvertently disrupt other dependent objects.
For instance, when a view relies on a table, any alterations to the table structure, such as modifying utilized columns, altering data types, or even dropping the table, can directly influence the functionality of the associated view. To make informed choices and preempt potential issues, it’s imperative to grasp the intricacies of these dependencies.
In this blog series, we explore the concept of object dependencies in PostgreSQL and their significance in managing a database effectively. Understanding and managing object dependencies is crucial for ensuring data integrity and making changes to the database without causing unexpected issues.
In this first post, we provide a comprehensive introduction to object dependencies and discuss various types of dependencies with examples. We also provide the queries that help you find the dependencies based on which you can take appropriate actions.
Prerequisites
To follow along with this post, complete the following prerequisites:
- Create an Amazon Aurora PostgreSQL-Compatible Edition cluster or Amazon Relational Database Service (Amazon RDS) for PostgreSQL instance if you don’t already have one. For instructions, refer to Create an Amazon Aurora PostgreSQL-Compatible DB cluster or Create a PostgreSQL DB instance, respectively.
- Create an Amazon Elastic Compute Cloud (Amazon EC2) instance to install the PostgreSQL client to access the Aurora PostgreSQL or RDS for PostgreSQL instance. For instructions, refer to Create your EC2 resources and launch your EC2 instance. Or you can set up connectivity between your RDS database and EC2 compute instance in one click.
- On Amazon Linux 2023, you can use the following commands to download the psql command line tool to run all the sql statements:
Types of object dependencies
In this section, we introduce the different types of object dependencies and provide example code snippets.
Columns
Column dependencies in PostgreSQL can occur when one column relies on another column within the same table. This is commonly seen in scenarios where a calculated or derived column is based on the values of other columns. By establishing column dependencies, PostgreSQL ensures that the derived column is automatically updated whenever the values of the dependent columns change. This helps maintain data consistency and accuracy within the table.
Let’s consider the following example.
Create a sales table:
In this example, the sales
table has a calculated column, total_amount
, which depends on the values of unit_price
and quantity
. The total_amount
column is generated as the product of unit_price
and quantity
. This is a typical use case where the values of one column depend on the values of other columns in the same table.
The GENERATED ALWAYS AS clause is used to define the derived column, and the STORED
keyword ensures that the value is physically stored, allowing for efficient querying.
This is just one example of column dependencies in PostgreSQL, showcasing how you can create calculated or derived columns that rely on other columns within the same table.
Indexes
When an index is created on a table, it becomes dependent on that table. Any changes made to the table, such as inserting, updating, or deleting rows, can impact the index. For example, if a row is inserted into the table, the index is updated to include the new row for efficient data retrieval.
Let’s consider an example to illustrate this concept:
In this example, we create a table called customers
with columns for customer information. We then create an index called idx_customers_email
on the email
column. The index is dependent on the customers
table.
Now, suppose we insert a new row into the customers
table:
This insertion triggers an update to the idx_customers_email
index, ensuring that it includes the new row for efficient email-based searches.
Sequences
A sequence is an object that generates a sequence of unique values. It is often used to generate unique identifiers for tables.
When a column in a table is defined as a sequence, it establishes a dependency between the column and the sequence object. The column depends on the sequence to generate the next unique value when inserting new records into the table.
Let’s consider the following example:
In this example, we first create a sequence named employee_id_seq
using the CREATE SEQUENCE
statement. This sequence will be used to generate unique values for the employee_id
column.
Next, we create a table called employees
with three columns: employee_id
, first_name
, and last_name
. The employee_id
column is defined with a default value of nextval('employee_id_seq')
, which means that it will automatically obtain the next value from the sequence when a new record is inserted. The employee_id
column is also declared as the primary key of the table.
User-defined data type
A user-defined data type (UDT) represents a structure or record containing single or multiple fields or attributes. It allows you to group related data together into a single object.
UDTs are used to define columns in tables or as the return type of functions. When a table column or a function returns a UDT, it establishes a dependency between the UDT and the table or function.
Let’s consider following example:
In this example, the employees
table has a dependency on the composite data type employee_info
. Any changes made to the definition of the employee_info
type will affect the employees
table.
Views
Views are virtual tables that are generated from the result of a query. They depend on the underlying tables and columns used in the query that defines the view. Any changes made to the referenced tables or columns can impact the views built on them. Views provide a dynamic and user-friendly way to interact with complex data structures, making them an essential aspect of database design.
Let’s consider the following example:
The orders_summary
view relies on the orders
table for its operation. Any adjustments applied to the orders
table columns utilized in the view, whether it involves adding or removing columns or altering the data types of columns, can have an effect on how the orders_summary
view functions.
In this specific case, the orders_summary
view selects the order_id
, customer_id
, and order_date
columns from the orders
table. Therefore, any changes to these columns in the orders
table will be reflected in the orders_summary
view.
Functions and Procedures
Functions in PostgreSQL can have dependencies on tables, views, or other functions. Understanding these dependencies is crucial when developing complex database functions, because it ensures their reliability and consistency.
Let’s consider the following example:
This function, calculate_salary
, depends on a table named employees
. It retrieves the wage
and hours
values from the employees
table based on the employee_id
parameter. The function then calculates and returns the salary by multiplying the hourly_wage
and hours_worked
values.
This function demonstrates a dependency on the employees
table because it relies on the existence of the table and the availability of the wage
and hours
columns within that table. Any changes to the table structure or the column names or types that were directly referenced in the function could potentially impact the functionality of this function.
Constraints
Constraints, such as primary key or unique constraints, establish dependencies on associated columns or tables. For example, a foreign key constraint on a table creates a direct dependency on another table. This ensures that the data in the referencing table adheres to specific rules defined in the referenced table. Constraints play a critical role in enforcing data integrity within a database.
Let’s consider the following example:
In this example, the orders_tab
table has a dependency on the customers
table through the customer_id
foreign key constraint.
Triggers
Triggers are database objects that automatically run in response to specific events, such as data modifications. They can depend on tables, views, or other triggers. When a trigger is fired, it may run actions that depend on the data or structure of the database, creating a dynamic and event-driven aspect of database management.
Let’s consider the following example:
In this example, the trigger function update_total_sales()
multiplies the quantity by 10 (assuming that the price per unit is 10) and updates the total_sales
column. The trigger products_trigger
is set to run this function before an INSERT or UPDATE operation on the products
table.
Inspecting and analyzing object dependencies
In PostgreSQL, it is possible to determine what objects, such as tables, views, or functions, have dependencies on each other. This information is stored in the PostgreSQL system catalog, a set of tables and views that contain metadata about all the objects in the database. By querying these system catalog views with the appropriate SQL, you can gain insights into object dependencies. These dependencies can be important to understand when making changes to database objects, as changes can have cascading effects on dependent objects. Thus, understanding these dependencies can help you maintain the integrity and functionality of your database.
Here are some of the basic catalog views that PostgreSQL provides:
- pg_attribute – This table stores detailed information regarding attributes for tables, views, composite types, and indices.
- pg_namespace – This table holds information related to database schemas.
- pg_class – It functions as a directory for various database objects.
- pg_rewrite – This table contains rules for transforming queries and views.
- pg_depend – It is used to track information about dependencies between different database objects.
- pg_shdepend – This table is used to monitor dependencies on objects that are utilized across multiple databases.
With these tools, you can view the inner workings of your database and understand how different parts are linked. This knowledge helps you make smart decisions when you’re managing your PostgreSQL database, almost like being an investigator in a puzzle, figuring out how all the pieces fit together.
Queries to find the object dependencies
In this section, we walk you through the queries to find the object dependencies for each object type.
Find foreign key dependencies
To find the table dependencies with respect to referential integrity constraint references, use the following query:
From previous examples, the orders
and order_items
tables are dependent. The order_id
column of the order_items
table is dependent on the order_id
column of the orders
table. Let’s run the query for the orders
table and see the dependent table:
Following is the output of the query:
The query shows the order_items
table and order_id
column as dependent in the table created.
Find table type dependencies
The following is a query to find the dependent table type for a table:
From our other example, the car table has a dependency on Person table because one of the column’s data types in the car table is the person table.
Following is the output of the query:
Find column dependencies on another column
The following is the query to find the dependent columns for a column in the same table:
From the example, the sales
table has a total_amount
column, which is a derived column from the quantity
and unit_price
columns. The following query shows those dependent columns:
Following is the output of the query:
Find indexes of a table
The following is a query to find the indexes for a table:
From the example, the customers
table has a unique key and an index. The following is a query to find the indexes for the customers
table:
Following is the output of the query:
Find sequence dependencies on the tables
The following is the query to find the sequence dependencies:
From the example, the employees
table has an employee_id
column with a default value to the next value from a sequence. The following is the query to find the sequence that refers to the employees
table:
Following is the output of the query:
Find UDT dependencies on tables and columns
The following is the query to find UDTs defined to a table’s column:
From the example, the employees_info
table’s info
column is defined as the employee_info
UDT. The following query finds the dependency:
Following is the output of the query:
Find view dependencies
The following is the query to find the dependent views of a table:
From the example, the orders_summary
view is based on the orders
table. The following query finds the dependent views of the orders
table:
Following is the output of the query:
Find triggers
The following is the query to find the trigger dependencies of a table:
From the example, the products
table has products_trigger
. The following query finds the trigger dependencies of the products
table:
Following is the output of the query:
Conclusion
In this first part of our series on object dependencies in PostgreSQL, we introduced the concept and discussed different types of dependencies. We also explored the importance of understanding object dependencies in managing a database effectively. Additionally, we highlighted some of the views and functions available in PostgreSQL for querying object dependencies.
Stay tuned for the next part of our series, where we will dive deep into the tools and techniques for managing object dependencies in PostgreSQL. We will explore how to use DROP and ALTER statements to handle dependencies and discuss best practices for avoiding potential problems.
By gaining a deeper understanding of object dependencies, you can optimize your PostgreSQL database operations, ensure data integrity, and make informed decisions when making changes to your database.