AWS Database Blog
Amazon Aurora PostgreSQL parameters, Part 4: ANSI compatibility options
Organizations today are planning migrations away from traditional commercial databases, and they don’t want to compromise on performance, availability, or security features along the way. Amazon Aurora is a cloud-native relational database service that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. The PostgreSQL-compatible edition of Aurora delivers up to three times the throughput of standard PostgreSQL running on the same hardware, enabling existing PostgreSQL applications and tools to run without modification. The combination of PostgreSQL compatibility with Aurora enterprise database capabilities makes it an ideal target for commercial database migrations.
Amazon Aurora PostgreSQL-Compatible Edition has enhancements at the engine level that improve performance for highly concurrent OLTP workloads and help bridge the feature gap between commercial and open-source engines. Although the default parameter settings for Aurora PostgreSQL suit most workloads, customers who migrate from commercial engines may need to tune some parameters to meet performance and other non-functional requirements. Even for workloads migrated from PostgreSQL to Aurora PostgreSQL, you may need to revisit some parameter settings because of architectural differences and engine-level optimizations.
In this four-part series, we explain parameters specific to Aurora PostgreSQL. We also delve into certain PostgreSQL database parameters that apply to Aurora PostgreSQL, how they behave differently, and how to set them to leverage or control additional features in Aurora PostgreSQL.
In part one of this series, we discussed the instance memory-related parameters and Query Plan Management parameters that can help you tune Amazon Aurora PostgreSQL. In part two, we discussed parameters related to replication, security, and logging. In part three, we covered Aurora PostgreSQL optimizer parameters that can improve query performance. In this part, we cover parameters that can align Aurora PostgreSQL more closely with American National Standards Institute (ANSI) standards and reduce the effort of migrating from commercial engines.
ANSI accredits committees of standards-developing organizations that publish best practices and standards for database query languages. Most vendors extend SQL to meet their own needs, but they generally base their implementations on the current version of the SQL standard. The international standard (now ISO/IEC 9075) has been revised periodically since its first release in 1986, most recently in 2016.
The PostgreSQL community tries to maintain compatibility with ANSI SQL, but some PostgreSQL behaviors don’t exactly comply with the ANSI specifications. In other cases, although PostgreSQL complies with the ANSI specifications, the syntax it accepts differs slightly from that of commercial engines. Many customers, especially independent software vendors (ISVs), strive to keep their code ANSI compatible to preserve database engine independence for their products and offerings.
Amazon Aurora PostgreSQL adds capabilities that can be helpful for retaining behavior when migrating from other databases, such as Oracle or Microsoft SQL Server. These features were introduced in Aurora PostgreSQL 3.1.0 (compatible with PostgreSQL 11.6) and Aurora PostgreSQL 2.4.0 (compatible with PostgreSQL 10.11) and are controlled by additional parameters. They are also available in newer major version releases, such as Aurora PostgreSQL 4.x (compatible with PostgreSQL 12).
In this post, we cover parameters that control these compatibility behaviors in Aurora PostgreSQL.
ansi_constraint_trigger_ordering
This parameter controls whether Aurora PostgreSQL retains PostgreSQL behavior or complies with the ANSI specifications regarding the run order of user-defined trigger actions and triggers defined for internal constraints. Switching it off reverts to the default PostgreSQL behavior, in which triggers run in alphabetical order.
The pg_settings catalog carries a description of this parameter.
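You can retrieve that description with a catalog query like the following (the exact short_desc wording can vary across Aurora PostgreSQL versions):

```sql
-- Look up the Aurora-specific parameter in pg_settings.
SELECT name, setting, short_desc
FROM pg_settings
WHERE name = 'ansi_constraint_trigger_ordering';
```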
Let’s understand how this parameter affects your query behavior.
PostgreSQL behavior
In PostgreSQL, if two triggers have the same firing criteria (such as AFTER INSERT FOR EACH ROW or BEFORE DELETE FOR EACH ROW), the run order is decided by alphabetical order (pure ASCII sorting): triggers with uppercase names fire first, followed by triggers with lowercase names, each group in alphabetical order. This can be useful if you want to control the order of your triggers. For example, I can define two before-insert triggers on the table pgbench_branches: pgbench_branches_trig_B_I_R_001 and pgbench_branches_trig_B_I_R_010. In this case, the trigger pgbench_branches_trig_B_I_R_001 fires before pgbench_branches_trig_B_I_R_010.
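As a minimal sketch (the shared trigger function and its body are hypothetical, not from the original post), the two triggers could look like this:

```sql
-- Hypothetical trace function that reports which trigger fired.
CREATE OR REPLACE FUNCTION trace_trigger_fn() RETURNS trigger AS $$
BEGIN
    RAISE NOTICE 'fired: %', TG_NAME;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Both triggers share the same firing criteria, so PostgreSQL runs them
-- in name order: ..._001 fires before ..._010. The names are quoted to
-- preserve the mixed case used in the text.
CREATE TRIGGER "pgbench_branches_trig_B_I_R_001"
    BEFORE INSERT ON pgbench_branches
    FOR EACH ROW EXECUTE PROCEDURE trace_trigger_fn();

CREATE TRIGGER "pgbench_branches_trig_B_I_R_010"
    BEFORE INSERT ON pgbench_branches
    FOR EACH ROW EXECUTE PROCEDURE trace_trigger_fn();
```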
ANSI standards
SQL ANSI standards require that triggers be fired in the order in which they’re created. Although this makes sense, it places an additional responsibility on programmers to drop and recreate all the triggers whenever they introduce a new one. To make things easier, some engines implement an additional property, ORDER, that you can specify when defining a trigger. PostgreSQL takes a different approach; it isn’t hard to emulate what other engines offer through additional syntax by following a naming convention, as we discussed.
Internal triggers
PostgreSQL implements internal and user-defined constraints as triggers. For example, even though we haven’t defined any triggers on pgbench_tellers, internal triggers are defined because of a referential integrity constraint (commonly referred to as a foreign key). Let’s look at the triggers currently defined on pgbench_tellers; there are four internal triggers, defined to run a check or action whenever we run data manipulation language (DML) statements against pgbench_tellers.
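One way to see them, sketched here, is to query pg_trigger directly (psql’s \d output hides internal triggers by default):

```sql
-- List all triggers on pgbench_tellers, including internal ones.
SELECT tgname, tgisinternal
FROM pg_trigger
WHERE tgrelid = 'pgbench_tellers'::regclass
ORDER BY tgname;
```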
Now let’s look at the triggers defined on pgbench_branches. These triggers cascade or restrict an action on the child tables when a DML statement runs against pgbench_branches. A set of triggers is defined for each child table, pgbench_tellers and pgbench_accounts.
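Again, the same kind of pg_trigger query (a sketch) lists them:

```sql
-- Internal triggers that enforce the constraint from the parent side.
SELECT tgname, tgisinternal
FROM pg_trigger
WHERE tgrelid = 'pgbench_branches'::regclass
ORDER BY tgname;
```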
Let’s add a new AFTER INSERT trigger on pgbench_tellers that adds rows to pgbench_branches by selecting from pgbench_tellers. We’re not using the NEW variable here to insert the incoming row. Instead, we select rows that exist in pgbench_tellers but whose corresponding branch doesn’t exist in pgbench_branches.
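A sketch of such a trigger follows; the function name, trigger name, and body are illustrative assumptions rather than the original post’s exact code:

```sql
-- Hypothetical trigger function: back-fill any branch ids that exist in
-- pgbench_tellers but are missing from pgbench_branches. Note that it
-- deliberately ignores the NEW row.
CREATE OR REPLACE FUNCTION add_missing_branch_fn() RETURNS trigger AS $$
BEGIN
    INSERT INTO pgbench_branches (bid, bbalance)
    SELECT DISTINCT t.bid, 0
    FROM pgbench_tellers t
    WHERE NOT EXISTS (SELECT 1 FROM pgbench_branches b WHERE b.bid = t.bid);
    RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER pgbench_tellers_trig_a_i_r
    AFTER INSERT ON pgbench_tellers
    FOR EACH ROW EXECUTE PROCEDURE add_missing_branch_fn();
```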
Let’s look at the triggers now defined on pgbench_tellers. In addition to the four internal triggers we saw earlier, the listing contains the user-defined trigger we just created.
Let’s insert a row into pgbench_tellers with a bid that doesn’t yet exist in pgbench_branches.
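For example (the tid and bid values are arbitrary; bid 101 is assumed not to exist yet):

```sql
INSERT INTO pgbench_tellers (tid, bid, tbalance)
VALUES (1001, 101, 0);
```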
The insert fails with a foreign key constraint violation: before the user-defined trigger could fire and insert a row into pgbench_branches, the trigger for the referential integrity constraint (RI_ConstraintTrigger_c_22663) fired and rejected the row.
When multiple triggers meet the same firing criteria, they are fired in alphabetical order. For more information, see Overview of Trigger Behavior.
If we give the trigger a different name, the outcome changes:
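One way (a sketch) is to recreate it under a quoted uppercase name, so that it sorts before RI_ConstraintTrigger_c_22663 in ASCII order:

```sql
ALTER TRIGGER pgbench_tellers_trig_a_i_r ON pgbench_tellers
    RENAME TO "PGBENCH_TELLERS_TRIG_A_I_R";
```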
Now let’s execute our insert statement again:
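Assuming the rename above, the insert now succeeds, because the user-defined trigger creates the missing branch before the constraint trigger checks for it:

```sql
INSERT INTO pgbench_tellers (tid, bid, tbalance)
VALUES (1001, 101, 0);
-- INSERT 0 1
```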
What changed in this case is the name of the trigger. Because the trigger name is now in uppercase and starts with P, it fires before the trigger defined for the referential integrity constraint (RI_ConstraintTrigger_c_22663).
This PostgreSQL behavior isn’t compliant with the ANSI specifications, and it can cause incompatibility when you migrate an application from another relational database.
Effect of the Aurora PostgreSQL parameter
Now let’s change the parameter ansi_constraint_trigger_ordering in the DB cluster parameter group associated with the Aurora instance we’re using for these tests.
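You can make this change on the Amazon RDS console, or with the AWS CLI along these lines (the parameter group name here is a placeholder):

```
aws rds modify-db-cluster-parameter-group \
    --db-cluster-parameter-group-name my-aurora-pg-cluster-params \
    --parameters "ParameterName=ansi_constraint_trigger_ordering,ParameterValue=1,ApplyMethod=immediate"
```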
This parameter makes sure that PostgreSQL follows the ANSI specifications. It ensures that internal constraint triggers are fired first, followed by user-defined constraint triggers, then user-defined triggers.
If you’re using an application that’s compatible with PostgreSQL and prefer to stick to default behavior, you can switch this off in the DB cluster parameter group. Although you can’t change this parameter for a specific transaction or session, a change to this parameter is dynamic and doesn’t require a restart of the DB instance.
ansi_force_foreign_key_checks
This parameter controls whether Aurora PostgreSQL retains PostgreSQL behavior or complies with the ANSI specifications for enforcing foreign key constraints when a cascaded action is defined in the constraint. Switching it off reverts to the default PostgreSQL behavior.
The pg_settings catalog provides a similar description for this parameter; you can retrieve it with the same kind of query shown earlier.
Let’s understand how this parameter affects your query behavior.
ANSI standards
SQL ANSI standards require that any operation cascaded to a child table because of a referential integrity constraint be applied irrespective of trigger actions defined on the child table. Consider a trigger on a child table that fires on delete and is written to skip the deletion and perform some other operation instead. If we delete a row from the parent table, the delete cascades to the child table as well. In this scenario, the trigger must not affect the cascaded delete: the rows in the child table are removed regardless of the behavior the trigger defines.
PostgreSQL behavior
In PostgreSQL, because referential integrity constraints are implemented as AFTER triggers, a cascaded delete or cascaded update on a child table can be skipped as a side effect of a BEFORE trigger on that child table. This leaves the database in an inconsistent state that is hard to debug unless you drop and recreate the foreign key constraint. The inconsistency also makes it hard to trust table metadata (foreign key constraints) for query optimization purposes, such as removing redundant inner joins based on the guarantees of referential integrity constraints.
Let’s see how it works in a practical scenario.
In the following code, we change the foreign key constraints on pgbench_tellers and pgbench_accounts so that any DELETE on the parent table (pgbench_branches) is also cascaded to these child tables.
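A sketch, assuming the database was initialized with pgbench --foreign-keys so that the constraints carry the default names:

```sql
ALTER TABLE pgbench_tellers
    DROP CONSTRAINT pgbench_tellers_bid_fkey,
    ADD CONSTRAINT pgbench_tellers_bid_fkey
        FOREIGN KEY (bid) REFERENCES pgbench_branches (bid)
        ON DELETE CASCADE;

ALTER TABLE pgbench_accounts
    DROP CONSTRAINT pgbench_accounts_bid_fkey,
    ADD CONSTRAINT pgbench_accounts_bid_fkey
        FOREIGN KEY (bid) REFERENCES pgbench_branches (bid)
        ON DELETE CASCADE;
```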
Now suppose we have a requirement that tellers must not be removed from pgbench_tellers while they still hold a balance (that is, while tbalance is greater than 0). Let’s add a trigger on pgbench_tellers to reject such deletes.
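A minimal sketch of such a trigger (the function name, trigger name, and notice text are illustrative):

```sql
CREATE OR REPLACE FUNCTION reject_teller_delete_fn() RETURNS trigger AS $$
BEGIN
    IF OLD.tbalance > 0 THEN
        RAISE NOTICE 'teller % still holds a balance; skipping delete', OLD.tid;
        RETURN NULL;  -- returning NULL from a BEFORE trigger skips the row
    END IF;
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER pgbench_tellers_trig_b_d_r
    BEFORE DELETE ON pgbench_tellers
    FOR EACH ROW EXECUTE PROCEDURE reject_teller_delete_fn();
```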
To test whether the trigger works, we can try to delete a teller that holds a balance greater than 0. In my dataset generated by pgbench, the row with tid=1 in pgbench_tellers is one such row.
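Assuming some pgbench transactions have run so that tid 1 carries a nonzero balance, the delete is silently skipped:

```sql
DELETE FROM pgbench_tellers WHERE tid = 1;
-- DELETE 0: the BEFORE trigger returned NULL, so the row survives
SELECT tid, bid, tbalance FROM pgbench_tellers WHERE tid = 1;
```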
Now let’s delete a row from the parent table, pgbench_branches.
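Assuming the teller with tid 1 belongs to branch bid 1:

```sql
DELETE FROM pgbench_branches WHERE bid = 1;
```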
After this delete, the database is left in an inconsistent state:
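For example, a query like this reveals teller rows whose branch no longer exists:

```sql
-- Orphaned child rows: tellers pointing at a deleted branch.
SELECT t.tid, t.bid, t.tbalance
FROM pgbench_tellers t
LEFT JOIN pgbench_branches b ON b.bid = t.bid
WHERE b.bid IS NULL;
```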
Not only does this violate the foreign key behavior defined by ANSI, it also makes it hard for the database optimizer to perform optimizations such as removing redundant joins. For example, the optimization we discussed in the previous part of this blog series (part 3), with the pgbench_v_teller view, can no longer be applied: skipping the join with pgbench_branches would now produce inconsistent results.
If we drop and try to recreate the foreign key constraint, it fails:
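Recreating it fails while the orphaned rows are still present (constraint name as assumed earlier):

```sql
ALTER TABLE pgbench_tellers DROP CONSTRAINT pgbench_tellers_bid_fkey;
ALTER TABLE pgbench_tellers
    ADD CONSTRAINT pgbench_tellers_bid_fkey
        FOREIGN KEY (bid) REFERENCES pgbench_branches (bid)
        ON DELETE CASCADE;
-- ERROR:  insert or update on table "pgbench_tellers" violates foreign
-- key constraint "pgbench_tellers_bid_fkey"
```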
Let’s fix the data and add a foreign key:
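A sketch of the cleanup, reusing the assumed names from earlier:

```sql
-- Temporarily disable the user trigger; otherwise it would also skip
-- the cleanup delete for tellers that hold a balance.
ALTER TABLE pgbench_tellers DISABLE TRIGGER pgbench_tellers_trig_b_d_r;

-- Remove the orphaned teller rows.
DELETE FROM pgbench_tellers t
WHERE NOT EXISTS (SELECT 1 FROM pgbench_branches b WHERE b.bid = t.bid);

ALTER TABLE pgbench_tellers ENABLE TRIGGER pgbench_tellers_trig_b_d_r;

-- Recreate the foreign key constraint.
ALTER TABLE pgbench_tellers
    ADD CONSTRAINT pgbench_tellers_bid_fkey
        FOREIGN KEY (bid) REFERENCES pgbench_branches (bid)
        ON DELETE CASCADE;
```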
Effect of the Aurora PostgreSQL parameter
Now use the Amazon Relational Database Service (Amazon RDS) console to change the parameter ansi_force_foreign_key_checks in the DB cluster parameter group.
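After the parameter group change propagates to the instance, you can verify the new value from a session:

```sql
SHOW ansi_force_foreign_key_checks;
```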
Let’s run the delete statement again:
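A sketch, assuming branch bid 2 still has tellers with a nonzero balance:

```sql
-- The BEFORE trigger can no longer silently skip the cascaded delete:
-- the statement fails and the delete on the parent is rolled back
-- (the exact error text depends on the Aurora PostgreSQL version).
DELETE FROM pgbench_branches WHERE bid = 2;
```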
When ansi_force_foreign_key_checks is enabled, Aurora PostgreSQL makes sure that the referential integrity constraint is enforced irrespective of the trigger context for user-defined triggers. If a trigger attempts to suppress or skip a cascaded DELETE or cascaded UPDATE, the original action on the parent table is also rolled back.
If you’re using an application that’s compatible with PostgreSQL and prefer to stick to the default behavior, you can switch this off in the DB cluster parameter group. Although you can’t change this parameter for a specific transaction or session, a change to this parameter is dynamic and doesn’t require a restart of the DB instance.
ansi_qualified_update_set_target
This parameter controls whether Aurora PostgreSQL retains PostgreSQL’s default behavior when parsing the column name in the SET clause of an UPDATE statement, or whether it accepts syntax consistent with what Oracle and SQL Server allow.
The pg_settings view describes this parameter as well; you can retrieve the description with the same kind of catalog query shown earlier.
PostgreSQL behavior
The UPDATE syntax, as given in the PostgreSQL documentation, is as follows.
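This synopsis is abridged; the full version in the documentation also includes the WITH clause and additional SET forms:

```
UPDATE [ ONLY ] table_name [ * ] [ [ AS ] alias ]
    SET { column_name = { expression | DEFAULT } |
          ( column_name [, ...] ) = ( { expression | DEFAULT } [, ...] )
        } [, ...]
    [ FROM from_list ]
    [ WHERE condition | WHERE CURRENT OF cursor_name ]
    [ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]
```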
The column_name must be a column of the table named by table_name. The column name can be qualified with a subfield name or array subscript, if needed. Don’t include the table’s name in the specification of a target column; for example, UPDATE table_name SET table_name.col = 1 is invalid.
PostgreSQL throws an exception if an UPDATE statement uses a fully qualified table name, or even a column name with a table name prefix, in the SET clause.
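For example (the error messages shown as comments reflect standard PostgreSQL behavior):

```sql
UPDATE pgbench_tellers
SET pgbench_tellers.tbalance = tbalance + 100
WHERE tid = 1;
-- ERROR:  column "pgbench_tellers" of relation "pgbench_tellers" does not exist

UPDATE public.pgbench_tellers
SET public.pgbench_tellers.tbalance = tbalance + 100
WHERE tid = 1;
-- ERROR:  column "public" of relation "pgbench_tellers" does not exist
```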
As the exception explains, PostgreSQL expects pgbench_tellers (or public) to be a column and tbalance to be a subfield. This behavior can increase the effort required to port application code when migrating from commercial engines like Oracle and SQL Server.
Effect of the Aurora PostgreSQL parameter
The commercial engines allow both syntaxes, and we can make Aurora PostgreSQL behave the same way by setting ansi_qualified_update_set_target to on.
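For example, at the session level:

```sql
SET ansi_qualified_update_set_target = on;

-- The table-qualified SET target is now accepted.
UPDATE pgbench_tellers
SET pgbench_tellers.tbalance = tbalance + 100
WHERE tid = 1;
```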
This can be useful when migrating an application from Oracle or SQL Server. It allows UPDATE statements written with Oracle or SQL Server-compatible syntax to work with Aurora PostgreSQL with little or no change.
You can set this parameter in the DB cluster parameter group or at the session level. A change to this parameter doesn’t require a restart of the DB instance.
Conclusion
Aurora PostgreSQL has several levers by way of parameters that allow for ANSI compliance, which helps a great deal if you’re migrating from other engines to Aurora PostgreSQL.
In this four-part blog series, we covered parameters related to memory and query plan management in part one. In part two, we covered replication, security, and logging parameters. Parts three and four covered parameters that you can use to modify Aurora PostgreSQL behavior to improve query performance and increase adherence to the ANSI standard, which is helpful when migrating applications from other database engines.
AWS continues to iterate on customer feedback and improve Aurora, which offers enterprise-grade features on popular open-source database engines.
About the authors
Sameer Kumar is a Database Specialist Technical Account Manager at Amazon Web Services. He focuses on Amazon RDS, Amazon Aurora, and Amazon DocumentDB. He works with enterprise customers, providing technical assistance on database operational performance and sharing database best practices.
Gopalakrishnan Subramanian is a Database Specialist Solutions Architect at Amazon Web Services. He works with our customers to provide guidance and technical assistance on database projects, helping them improve the value of their solutions when using AWS.