PostgreSQL Weekly News - August 1, 2021

Posted on 2021-08-02 by PWN
PWN

PostgreSQL Weekly News - August 1, 2021

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to [email protected].

Applied Patches

Michaël Paquier pushed:

David Rowley pushed:

Tom Lane pushed:

  • Harden pg_stat_statements tests against CLOBBER_CACHE_ALWAYS. Turns out the buildfarm hasn't been testing this, which will soon change. Julien Rouhaud, per report from me Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/0806d08d4626f0a71a4e83cb399ee965a8aac99b

  • Use the "pg_temp" schema alias in EXPLAIN and related output. This patch causes EXPLAIN output to refer to objects that are in the current session's temp schema with the "pg_temp" schema alias rather than that schema's actual name. This is useful for our own testing purposes since it will stabilize EXPLAIN VERBOSE output for such cases, allowing us to use that in regression tests. It should be less confusing for end users too. Since ruleutils.c needs to change behavior for this, the change also leaks into a few other users of ruleutils.c, for example pg_get_viewdef(). AFAICS that won't cause any problems. We did find that aggressively trying to change this behavior across-the-board would cause issues, but as long as "pg_temp" only appears within generated SQL text, I think it'll be fine. Along the way, make get_namespace_name_or_temp conform to the same API as get_namespace_name, ie that it returns a palloc'd string or NULL. The current behavior hasn't caused any bugs since no callers attempt to pfree the result, but if it gets more widespread usage that could become a problem. Amul Sul, reviewed and extended by me Discussion: https://postgr.es/m/CAAJ_b97W=QaGmag9AhWNbmx3uEYsNkXWL+OVW1_E1D3BtgWvtw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/48c5c9068211e0a04fd9553c8714b2821ed3ad17

  • In event triggers, use "pg_temp" only for our own temp schema. pg_event_trigger_ddl_commands used "pg_temp" to refer to any temp schema, not only that of the current backend. This seems like overreach. It's somewhat unlikely that DDL commands would refer to temp objects of other sessions to begin with, but if they do, "pg_temp" would be a most misleading way to display the action. While this seems like a bug, it's not quite out of the realm of possibility that somebody out there is expecting the current behavior. Hence, fix in HEAD, but don't back-patch. Discussion: https://postgr.es/m/CAAJ_b97W=QaGmag9AhWNbmx3uEYsNkXWL+OVW1_E1D3BtgWvtw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/024515cac50e246d92bbe67e9de4da0f302972ef

  • Stabilize output of new regression test. Commit 48c5c9068 failed to allow for buildfarm animals that force jit = on. I'm surprised that this hasn't come up elsewhere in explain.sql, so turn it off for that whole test script not just the one new test case. Per buildfarm. https://git.postgresql.org/pg/commitdiff/674f6fe8e65a37289432fa373e5d742a36751ae8

  • Fix bugs in polymorphic-argument resolution for multiranges. We failed to deal with an UNKNOWN-type input for anycompatiblemultirange; that should throw an error indicating that we don't know how to resolve the multirange type. We also failed to infer the type of an anycompatiblerange output from an anycompatiblemultirange input or vice versa. Per bug #17066 from Alexander Lakhin. Back-patch to v14 where multiranges were added. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/336ea6e6ff1109e7b83370565e3cb211804fda0c

  • Improve libpq's handling of OOM during error message construction. Commit ffa2e4670 changed libpq so that multiple error reports occurring during one operation (a connection attempt or query) are accumulated in conn->errorMessage, where before new ones usually replaced any prior error. At least in theory, that makes us more vulnerable to running out of memory for the errorMessage buffer. If it did happen, the user would be left with just an empty-string error report, which is pretty unhelpful. We can improve this by relying on pqexpbuffer.c's existing "broken buffer" convention to track whether we've hit OOM for the current operation's error string, and then substituting a constant "out of memory" string in the small number of places where the errorMessage is read out. While at it, apply the same method to similar OOM cases in pqInternalNotice and pqGetErrorNotice3. Back-patch to v14 where ffa2e4670 came in. In principle this could go back further; but in view of the lack of field reports, the hazard seems negligible in older branches. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/514b4c11d24701d2cc90ad75ed787bf1380af673

  • In postgres_fdw, allow CASE expressions to be pushed to the remote server. This is simple enough except for the need to check whether CaseTestExpr nodes have a collation that is not derived from a remote Var. For that, examine the CASE's "arg" expression and then pass that info down into the recursive examination of the WHEN expressions. Alexander Pyhalov, reviewed by Gilles Darold and myself Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/5d44fff01eebfe0c7a118326605864e662a44f46

  • Doc: add a glossary entry for "domain". Anton Voloshin and Jürgen Purtz, reviewed by Laurenz Albe Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/3dddb2a8219d3cfd163a83561d4b4613cd2ea376

  • Use elog, not Assert, to report failure to provide an outer snapshot. As of commit 84f5c2908, executing SQL commands (via SPI or otherwise) requires having either an active Portal, or a caller-established active snapshot. We were simply Assert'ing that that's the case. But we've now had a couple different reports of people testing extensions that didn't meet this requirement, and were confused by the resulting crash. Let's convert the Assert to a test-and-elog, in hopes of making the issue clearer for extension authors. Per gripes from Liu Huailing and RekGRpth. Back-patch to v11, like the prior commit. Discussion: https://postgr.es/m/OSZPR01MB6215671E3C5956A034A080DFBEEC9@OSZPR01MB6215.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/ef12f32c0870b73ab7589ffc178094ac9ba3e84f

  • Doc: alphabetize the regexp_foo() function descriptions in 9.7.3. For no visible reason (other than historical accident no doubt), regexp_replace() was out of order. Re-order to match the way that these functions are listed in 9.4. (That means substring() remains first, because it's SQL-standard and the rest aren't.) I've not touched the text other than to move it. This is just to reduce confusion in the diffs for upcoming additions. https://git.postgresql.org/pg/commitdiff/64a1f225654f8866422010ff28e0d3384ae4c3af

Dean Rasheed pushed:

  • Allow numeric scale to be negative or greater than precision. Formerly, when specifying NUMERIC(precision, scale), the scale had to be in the range [0, precision], which was per SQL spec. This commit extends the range of allowed scales to [-1000, 1000], independent of the precision (whose valid range remains [1, 1000]). A negative scale implies rounding before the decimal point. For example, a column might be declared with a scale of -3 to round values to the nearest thousand. Note that the display scale remains non-negative, so in this case the display scale will be zero, and all digits before the decimal point will be displayed. A scale greater than the precision supports fractional values with zeros immediately after the decimal point. Take the opportunity to tidy up the code that packs, unpacks and validates the contents of a typmod integer, encapsulating it in a small set of new inline functions. Bump the catversion because the allowed contents of atttypmod have changed for numeric columns. This isn't a change that requires a re-initdb, but negative scale values in the typmod would confuse old backends. Dean Rasheed, with additional improvements by Tom Lane. Reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCWdNLgpKihmURF8nfofP0RFtAKJ7ktY6GcZOPnMfUoRqA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/085f931f52494e1f304e35571924efa6fcdc2b44

  • Fix corner-case errors and loss of precision in numeric_power(). This fixes a couple of related problems that arise when raising numbers to very large powers. Firstly, when raising a negative number to a very large integer power, the result should be well-defined, but the previous code would only cope if the exponent was small enough to go through power_var_int(). Otherwise it would throw an internal error, attempting to take the logarithm of a negative number. Fix this by adding suitable handling to the general case in power_var() to cope with negative bases, checking for integer powers there. Next, when raising a (positive or negative) number whose absolute value is slightly less than 1 to a very large power, the result should approach zero as the power is increased. However, in some cases, for sufficiently large powers, this would lose all precision and return 1 instead of 0. This was due to the way that the local_rscale was being calculated for the final full-precision calculation: local_rscale = rscale + (int) val - ln_dweight + 8 The first two terms on the right hand side are meant to give the number of significant digits required in the result ("val" being the estimated result weight). However, this failed to account for the fact that rscale is clipped to a maximum of NUMERIC_MAX_DISPLAY_SCALE (1000), and the result weight might be less then -1000, causing their sum to be negative, leading to a loss of precision. Fix this by forcing the number of significant digits calculated to be nonnegative. It's OK for it to be zero (when the result weight is less than -1000), since the local_rscale value then includes a few extra digits to ensure an accurate result. Finally, add additional underflow checks to exp_var() and power_var(), so that they consistently return zero for cases like this where the result is indistinguishable from zero. Some paths through this code already returned zero in such cases, but others were throwing overflow errors. Dean Rasheed, reviewed by Yugo Nagata. Discussion: http://postgr.es/m/CAEZATCW6Dvq7+3wN3tt5jLj-FyOcUgT5xNoOqce5=6Su0bCR0w@mail.gmail.com https://git.postgresql.org/pg/commitdiff/4dd5ce2fd9b9b2134886fee7fe4cb36d1988adb2

Tomáš Vondra pushed:

Álvaro Herrera pushed:

Bruce Momjian pushed:

Andrew Dunstan pushed:

Fujii Masao pushed:

John Naylor pushed:

Daniel Gustafsson pushed:

Amit Kapila pushed:

Etsuro Fujita pushed:

  • postgres_fdw: Fix handling of pending asynchronous requests. A pending asynchronous request is handled by process_pending_request(), which previously not only processed an in-progress remote query but performed ExecForeignScan() to produce a tuple to return to the local server asynchronously from the result of the remote query. But that led to a server crash when executing a query or led to an "InstrStartNode called twice in a row" or "InstrEndLoop called on running node" failure when doing EXPLAIN ANALYZE of it, in cases where the plan tree for it contained multiple async-capable nodes accessing the same initplan/subplan that contained multiple async-capable nodes scanning the same foreign tables as for the parent async-capable nodes, as reported by Andrey Lepikhov. The reason is that the second step in process_pending_request() invoked when executing the initplan/subplan for one of the parent async-capable nodes caused recursive execution of the initplan/subplan for another of the parent async-capable nodes. To fix, split process_pending_request() into the two steps and postpone the second step until ForeignAsyncConfigureWait() is called for each of the pending asynchronous requests. Also, in ExecAppendAsyncEventWait() we assumed that FDWs would register at least one wait event in a WaitEventSet created there when they were called from ForeignAsyncConfigureWait() in that function, but allow FDWs to register zero wait events in the WaitEventSet; modify ExecAppendAsyncEventWait() to just return in that case. Oversight in commit 27e1f1456. Back-patch to v14 where that commit went in. Andrey Lepikhov and Etsuro Fujita Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/1ec7fca8592178281cd5cdada0f27a340fb813fc

Heikki Linnakangas pushed:

Robert Haas pushed:

  • Remove unnecessary call to ReadCheckpointRecord(). It should always be the case that the last checkpoint record is still readable, because otherwise, a crash would leave us in a situation from which we can't recover. Therefore the test removed by this patch should always succeed. For it to fail, either there has to be a serious bug in the code someplace, or the user has to be manually modifying pg_wal while crash recovery is running. If it's the first one, we should fix the bug. If it's the second one, they should stop, or anyway they're doing so at their own risk. In neither case does a full checkpoint instead of an end-of-recovery record seem like a clear winner. Furthermore, rarely-taken code paths are particularly vulnerable to bugs, so let's simplify by getting rid of this one. Discussion: http://postgr.es/m/CA+TgmoYmw==TOJ6EzYb_vcjyS09NkzrVKSyBKUUyo1zBEaJASA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/1d919de5eb3fffa7cc9479ed6d2915fb89794459

Jeff Davis pushed:

Pending Patches

Bharath Rupireddy sent in another revision of a patch to improve publication error messages by distinguishing the ones for unlogged tables from the ones for temporary tables.

Daniel Gustafsson sent in another revision of a patch to support NSS as a libpq TLS backend.

David Fetter and Greg Nancarrow traded patches to use fewer instructions to find the decimal length of integers for formatting.

John Naylor sent in two more revisions of a patch to add fast paths for validating UTF-8 text.

Justin Pryzby and Bharath Rupireddy traded patches to make it possible to log events in the startup process.

Haiying Tang and Vigneshwaran C traded patches to add schema level granularity to PUBLICATIONs.

Nathan Bossart sent in two more revisions of a patch to clarify initdb --sync-only help message and docs.

Amit Langote sent in another revision of a patch to allow batching of inserts during cross-partition updates.

Andrey V. Lepikhov sent in another revision of a patch to remove unneeded self-joins.

David Rowley sent in another revision of a patch to the the postgres_fdw to add ORDER BY to some remote SQL queries.

Peter Smith sent in another revision of a patch to add row filtering for logical replication.

Daniel Westermann sent in a patch to fix a typo.

Amit Langote sent in a patch to use a separate tupledesc structure for plans in src/backend/executor/nodeModifyTable.c:ExecInsert.

Dagfinn Ilmari Mannsåker sent in a patch to comply with perlcritic by prohibiting map and grep in void context.

Dipesh Pandit sent in another revision of a patch to mitigate a performance-killing directory scan for WAL archiver.

Bertrand Drouvot sent in another revision of a patch to make it possible to do logical decoding on a standby.

Bharath Rupireddy sent in another revision of a patch to use a WaitLatch for {pre, post}_auth_delay.

Daniel Gustafsson and Jacob Champion traded patches to rework the sslfiles Makefile target.

Bruce Momjian sent in four more revisions of a patch to fix some issues with the way interval arithmetic is being handled.

Peter Smith sent in two more revisions of a patch to add prepare API support for streaming transactions.

Pavel Stěhule sent in another revision of a patch to add an --options-file option and supporting machinery to pg_dump.

Pavel Stěhule sent in another revision of a patch to add filtering to pg_dump.

Pavel Stěhule sent in another revision of a patch to return the text value of variable content in the plpgsql debugging API.

Michaël Paquier sent in a patch to document RESTART in CREATE SEQUENCE, while describing its behavior according to START.

Daniel Gustafsson sent in another revision of a patch to emit namespace in the post-copy errmsg.

Melanie Plageman sent in a patch to use pgsr for AIO bitmapheapscan.

Yugo Nagata sent in another revision of a patch to pgbench to prevent it from aborting unnecessarily.

Amit Langote sent in another revision of a patch to rework query relation permission checking by moving the permission checking information from the range table entries to a new node type called RelPermissionInfo, averting many searches through the range tables.

Ronan Dunklau sent in a patch to add a READ_REPLICATION_SLOT command, which is to be used in the context of a physical replication connection, and use same in pg_receivewal to get a better estimate of the start_lsn needed.

Daniil Zakhlystov sent in a patch to add zlib and zstd streaming compression, and use same when implementing libpq compression.

Hayato Kuroda sent in a patch to add the local pid to fallback_application_name.

Masahiko Sawada sent in two more revisions of a patch to add errcontext to errors in applying logical replication changes, add a pg_stat_logical_replication_error statistics view, and add a skip_xid option to ALTER SUBSCRIPTION.

Yura Sokolov sent in a patch to add an integerset2 implementation to bdbench, which is intended to improve dead tuple storage for lazy vacuum.

RGU sent in a patch to gather partial paths for subproblem's topmost scan/join rel.

Dean Rasheed sent in a patch to avert an incorrect division by zero error in to_char.

Prabhat Sahu sent in a patch to add a TAP test for prohibit_wal.

John Naylor sent in another revision of a patch to accelerate tuple sorting for common types.

Daniel Gustafsson sent in another revision of a patch to fix sscanf limits in pg_basebackup and pg_dump, and fix a bug in TOC file error message printing which manifested as sometimes failing to include the name of the file where the error occurred.

Tomáš Vondra sent in another revision of a patch to add operations on sequences to logical replication.

Melanie Plageman sent in another revision of a patch to fix a race condition in parallel hash join batch cleanup, clarify the naming of Parallel Hash Join phases, and make it possible to use parallel hash on full and right outer joins.

Hou Zhijie sent in a patch to remove an unused parameter in maybe_send_schema.

David Rowley and Tomáš Vondra traded patches to use generation context to speed up tuplesorts.

Álvaro Herrera sent in another revision of a patch to avoid creating archive status ".ready" files too early.

Heikki Linnakangas sent in three more revisions of a patch to move code around in StartupXLOG(), split xlog.c into xlog.c and xlogrecovery.c, and move the code to apply one WAL record to a subroutine.

Ajin Cherian sent in a patch to fix a possible failure in 021_twophase tap test by changing the test so that it waits for both subscriptions to catch up before checking the prepared transaction count.

Gilles Darold sent in another revision of a patch to add functions regexp_count, regexp_instr, and regexp_like, and add new parameters to regexp_replace and regexp_substr to make it possible to specify position and occurrence.

Tomáš Vondra sent in another revision of a patch to improve the performance of the slab allocator.

David Rowley sent in another revision of a patch to track non-pruned partitions in RelOptInfo, and allow ordered partition scans in more cases.

Vigneshwaran C sent in another revision of a patch to include the actual datatype used in logical replication message format documentation.