PostgreSQL Weekly News - January 17, 2021

Posted on 2021-01-18 by PWN

Person of the week: https://postgresql.life/post/gunnar_bluth/

PostgreSQL Product News

pspg 4.0.0, a pager designed for PostgreSQL, released. https://github.com/okbob/pspg/releases/tag/4.0.0

DBConvert Studio 2.0, a database migration and synchronization suite that supports PostgreSQL, released. https://dbconvert.com/dbconvert-studio

PostgreSQL Jobs for January

https://archives.postgresql.org/pgsql-jobs/2021-01/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to [email protected].

Applied Patches

Thomas Munro pushed:

Tom Lane pushed:

  • In libpq, always append new error messages to conn->errorMessage. Previously, we had an undisciplined mish-mash of printfPQExpBuffer and appendPQExpBuffer calls to report errors within libpq. This commit establishes a uniform rule that appendPQExpBuffer[Str] should be used. conn->errorMessage is reset only at the start of an application request, and then accumulates messages till we're done. We can remove no less than three different ad-hoc mechanisms that were used to get the effect of concatenation of error messages within a sequence of operations. Although this makes things quite a bit cleaner conceptually, the main reason to do it is to make the world safer for the multiple-target-host feature that was added awhile back. Previously, there were many cases in which an error occurring during an individual host connection attempt would wipe out the record of what had happened during previous attempts. (The reporting is still inadequate, in that it can be hard to tell which host got the failure, but that seems like a matter for a separate commit.) Currently, lo_import and lo_export contain exceptions to the "never use printfPQExpBuffer" rule. If we changed them, we'd risk reporting an incidental lo_close failure before the actual read or write failure, which would be confusing, not least because lo_close happened after the main failure. We could improve this by inventing an internal version of lo_close that doesn't reset the errorMessage; but we'd also need a version of PQfn() that does that, and it didn't quite seem worth the trouble for now. Discussion: https://postgr.es/m/BN6PR05MB3492948E4FD76C156E747E8BC9160@BN6PR05MB3492.namprd05.prod.outlook.com https://git.postgresql.org/pg/commitdiff/ffa2e4670123124b92f037d335a1e844c3782d3f
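The append-only rule described above can be illustrated with a toy model. This is a minimal sketch, not libpq code: `ErrorBuffer`, `try_hosts`, and the `connect` callback are hypothetical names invented for illustration, and the real conn->errorMessage is a C PQExpBuffer.

```python
class ErrorBuffer:
    """Toy stand-in for conn->errorMessage (hypothetical, not libpq's structure)."""

    def __init__(self):
        self.text = ""

    def reset(self):
        # Called only at the start of an application request.
        self.text = ""

    def append(self, message):
        # Similar in spirit to appendPQExpBuffer: earlier text is never discarded.
        self.text += message + "\n"


def try_hosts(hosts, connect, errors):
    """Try each host in order; return the first host that connects, else None."""
    errors.reset()
    for host in hosts:
        try:
            connect(host)
            return host
        except ConnectionError as exc:
            # Appending keeps the record of every earlier failed attempt.
            errors.append(f"{host}: {exc}")
    return None
```

With a buffer that is overwritten on each failure, only the last host's error would survive; with appending, the caller sees one line per attempt.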

  • Allow pg_regress.c wrappers to postprocess test result files. Add an optional callback to regression_main() that, if provided, is invoked on each test output file before we try to compare it to the expected-result file. The main and isolation test programs don't need this (yet). In pg_regress_ecpg, add a filter that eliminates target-host details from "could not connect" error reports. This filter doesn't do anything as of this commit, but it will be needed by the next one. In the long run we might want to provide some more general, perhaps pattern-based, filtering mechanism for test output. For now, this will solve the immediate problem. Discussion: https://postgr.es/m/BN6PR05MB3492948E4FD76C156E747E8BC9160@BN6PR05MB3492.namprd05.prod.outlook.com https://git.postgresql.org/pg/commitdiff/800d93f314b0f7c10193e48b259f87800cb85d84

  • Uniformly identify the target host in libpq connection failure reports. Prefix "could not connect to host-or-socket-path:" to all connection failure cases that occur after the socket() call, and remove the ad-hoc server identity data that was appended to a few of these messages. This should produce much more intelligible error reports in multiple-target-host situations, especially for error cases that are off the beaten track to any degree (because none of those provided any server identity info). As an example of the change, formerly a connection attempt with a bad port number such as "psql -p 12345 -h localhost,/tmp" might produce

        psql: error: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting TCP/IP connections on port 12345?
        could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting TCP/IP connections on port 12345?
        could not connect to server: No such file or directory
            Is the server running locally and accepting connections on Unix domain socket "/tmp/.s.PGSQL.12345"?

    Now it looks like

        psql: error: could not connect to host "localhost" (::1), port 12345: Connection refused
            Is the server running on that host and accepting TCP/IP connections?
        could not connect to host "localhost" (127.0.0.1), port 12345: Connection refused
            Is the server running on that host and accepting TCP/IP connections?
        could not connect to socket "/tmp/.s.PGSQL.12345": No such file or directory
            Is the server running locally and accepting connections on that socket?

    This requires adjusting a couple of regression tests to allow for variation in the contents of a connection failure message. Discussion: https://postgr.es/m/BN6PR05MB3492948E4FD76C156E747E8BC9160@BN6PR05MB3492.namprd05.prod.outlook.com https://git.postgresql.org/pg/commitdiff/52a10224e3cc1d706ba9800695f97cb163b747d5

  • Try next host after a "cannot connect now" failure. If a server returns ERRCODE_CANNOT_CONNECT_NOW, try the next host, if multiple host names have been provided. This allows dealing gracefully with standby servers that might not be in hot standby mode yet. In the wake of the preceding commit, it might be plausible to retry many more error cases than we do now, but I (tgl) am hesitant to move too aggressively on that --- it's not clear it'd be desirable for cases such as bad-password, for example. But this case seems safe enough. Hubert Zhang, reviewed by Takayuki Tsunakawa Discussion: https://postgr.es/m/BN6PR05MB3492948E4FD76C156E747E8BC9160@BN6PR05MB3492.namprd05.prod.outlook.com https://git.postgresql.org/pg/commitdiff/c1d589571c497a952d7fbe40d9828655859d746f
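The retry decision described above might look roughly like the following client-side sketch. `ServerError`, `connect_any`, and the `connect_one` callback are hypothetical names; the only fact taken from PostgreSQL is that ERRCODE_CANNOT_CONNECT_NOW corresponds to SQLSTATE 57P03. The sketch covers server-reported errors only, not network-level failures.

```python
CANNOT_CONNECT_NOW = "57P03"  # SQLSTATE for ERRCODE_CANNOT_CONNECT_NOW


class ServerError(Exception):
    """Hypothetical exception carrying the SQLSTATE a server reported."""

    def __init__(self, sqlstate, message):
        super().__init__(message)
        self.sqlstate = sqlstate


def connect_any(hosts, connect_one):
    """Return (host, connection) for the first host that accepts a connection.

    Only a "cannot connect now" error moves on to the next host; any other
    server error (a bad password, for instance) is raised immediately.
    """
    last_error = None
    for host in hosts:
        try:
            return host, connect_one(host)
        except ServerError as exc:
            if exc.sqlstate != CANNOT_CONNECT_NOW:
                raise
            last_error = exc
    if last_error is None:
        raise ValueError("no hosts given")
    raise last_error
```

Keeping the retryable set to a single SQLSTATE mirrors the caution expressed in the commit message about not retrying cases such as bad-password.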

  • Rethink SQLSTATE code for ERRCODE_IDLE_SESSION_TIMEOUT. Move it to class 57 (Operator Intervention), which seems like a better choice given that from the client's standpoint it behaves a heck of a lot like, e.g., ERRCODE_ADMIN_SHUTDOWN. In a green field I'd put ERRCODE_IDLE_IN_TRANSACTION_SESSION_TIMEOUT here as well. But that's been around for a few years, so it's probably too late to change its SQLSTATE code. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/4edf96846a02693e4416478b3302e5133d2e8e01
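For client code, the practical effect is that a check on the SQLSTATE class now groups the idle-session timeout with other operator interventions. A minimal sketch follows; the helper names are hypothetical, and the 57P05 value for the idle-session timeout is an assumption about the code this commit assigned rather than something stated in the text above.

```python
# SQLSTATE values used for illustration. 57P05 is an assumed value for
# ERRCODE_IDLE_SESSION_TIMEOUT after this change.
EXAMPLE_CODES = {
    "57P01": "admin_shutdown",
    "57P05": "idle_session_timeout",
    "25P03": "idle_in_transaction_session_timeout",
}


def sqlstate_class(sqlstate):
    """Return the two-character class of a five-character SQLSTATE."""
    if len(sqlstate) != 5:
        raise ValueError("SQLSTATE must be five characters")
    return sqlstate[:2]


def is_operator_intervention(sqlstate):
    """True when the code falls in class 57 (Operator Intervention)."""
    return sqlstate_class(sqlstate) == "57"
```

Under these assumptions, `is_operator_intervention` is true for the admin-shutdown and idle-session codes but false for the older idle-in-transaction code, which stays in class 25.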

  • Make pg_dump's table of object-type priorities more maintainable. Wedging a new object type into this table has historically required manually renumbering a lot of existing entries. (Although it appears that some people got lazy and re-used the priority level of an existing object type, even if it wasn't particularly related.) We can let the compiler do the counting by inventing an enum type that lists the desired priority levels in order. Now, if you want to add or remove a priority level, that's a one-liner. This patch is not purely cosmetic, because I split apart the priorities of DO_COLLATION and DO_TRANSFORM, as well as those of DO_ACCESS_METHOD and DO_OPERATOR, which look to me to have been merged out of expediency rather than because it was a good idea. Shell types continue to be sorted interchangeably with full types, and opclasses interchangeably with opfamilies. https://git.postgresql.org/pg/commitdiff/d5ab79d815783fe60062cefc423b54e82fbb92ff
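The "let the compiler do the counting" idea carries over to other languages. The sketch below uses Python's `enum.auto()` as an analogue of a C enum; the member names and their order are a hypothetical subset chosen for illustration, not pg_dump's actual table.

```python
from enum import IntEnum, auto


class Priority(IntEnum):
    # Hypothetical subset of priority levels, listed in the desired order.
    # Inserting or removing a level is a one-line change; values are
    # assigned automatically.
    SCHEMA = auto()
    COLLATION = auto()
    TRANSFORM = auto()  # separate level rather than sharing COLLATION's
    TABLE = auto()
    INDEX = auto()


def sort_objects(objects):
    """Sort (name, Priority) pairs by priority level, then by name."""
    return sorted(objects, key=lambda obj: (obj[1], obj[0]))
```

Because the values come from declaration order, nobody has to renumber existing entries when a new level is wedged into the middle.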

  • Dump ALTER TABLE ... ATTACH PARTITION as a separate ArchiveEntry. Previously, we emitted the ATTACH PARTITION command as part of the child table's ArchiveEntry. This was a poor choice since it complicates restoring the partition as a standalone table; you have to ignore the error from the ATTACH, which isn't even an option when restoring direct-to-database with pg_restore. (pg_restore will issue the whole ArchiveEntry as one PQexec, so that any error rolls back the table creation as well.) Hence, separate it out as its own ArchiveEntry, as indeed we already did for index ATTACH PARTITION commands. Justin Pryzby Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/9a4c0e36fbd671b5e7426a5a0670bdd7ba2714a0

  • Doc: fix description of privileges needed for ALTER PUBLICATION. Adding a table to a publication requires ownership of the table (in addition to ownership of the publication). This was mentioned nowhere. https://git.postgresql.org/pg/commitdiff/cc865c0f319fde22540625e02863f42e9853b3e4

  • pg_dump: label INDEX ATTACH ArchiveEntries with an owner. Although a partitioned index's attachment to its parent doesn't have separate ownership, the ArchiveEntry for it needs to be marked with an owner anyway, to ensure that the ALTER command is run by the appropriate role when restoring with --use-set-session-authorization. Without this, the ALTER will be run by the role that started the restore session, which will usually work but it's formally the wrong thing. Back-patch to v11 where this type of ArchiveEntry was added. In HEAD, add equivalent commentary to the just-added TABLE ATTACH case, which I'd made do the right thing already. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/9eabfe300a22ad3d776dc293265e15379790bd9a

  • Doc: clarify behavior of back-half options in pg_dump. Options that change how the archive data is converted to SQL text are ignored when dumping to archive formats. The documentation previously said "not meaningful", which is not helpful. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/06ed235adeb621a73cafd6ab35fa2405b3177329

  • Disallow a digit as the first character of a variable name in pgbench. The point of this restriction is to avoid trying to substitute variables into timestamp literal values, which may contain strings like '12:34'. There is a good deal more that should be done to reduce pgbench's tendency to substitute where it shouldn't. But this is sufficient to solve the case complained of by Jaime Soler, and it's simple enough to back-patch. Back-patch to v11; before commit 9d36a3866, pgbench had a slightly different definition of what a variable name is, and anyway it seems unwise to change long-stable branches for this. Fabien Coelho Discussion: https://postgr.es/m/alpine.DEB.2.22.394.2006291740420.805678@pseudo https://git.postgresql.org/pg/commitdiff/c21ea4d53e9404279273da800daa49b7b9a5e81e
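The effect of the restriction can be sketched with a small substitution routine. This is not pgbench's parser: `VAR_RE` and `substitute` are hypothetical, and the pattern assumes ASCII letters, digits, and underscores only.

```python
import re

# Assumed rule for this sketch: a colon followed by a name whose first
# character is a letter or underscore, never a digit.
VAR_RE = re.compile(r":([A-Za-z_][A-Za-z0-9_]*)")


def substitute(sql, variables):
    """Replace :name with its value; unknown names are left untouched."""

    def repl(match):
        name = match.group(1)
        if name in variables:
            return str(variables[name])
        return match.group(0)

    return VAR_RE.sub(repl, sql)
```

Because a name cannot start with a digit, the ":34" inside a literal such as '12:34' is never a substitution candidate, even if a variable named "34" were somehow defined.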

  • Doc, more or less: uncomment tutorial example that was fixed long ago. Reverts a portion of commit 344190b7e. Apparently, back in the twentieth century we had some issues with multi-statement SQL functions, but they've worked fine for a long time. Daniel Westermann Discussion: https://postgr.es/m/GVAP278MB04242DCBF5E31F528D53FA18D2A90@GVAP278MB0424.CHEP278.PROD.OUTLOOK.COM https://git.postgresql.org/pg/commitdiff/dce62490818170b6479dfe08a28aae4bcdf7cc2d

  • Run reformat-dat-files to declutter the catalog data files. Things had gotten pretty messy here, apparently mostly but not entirely the fault of the multirange patch. No functional changes. https://git.postgresql.org/pg/commitdiff/8b411b8ff41566a1aa601d1f05aeebbebbdb4a54

  • Mark inet_server_addr() and inet_server_port() as parallel-restricted. These need to be PR because they access the MyProcPort data structure, which doesn't get copied to parallel workers. The very similar functions inet_client_addr() and inet_client_port() are already marked PR, but somebody missed these. Although this is a pre-existing bug, we can't readily fix it in the back branches since we can't force initdb. Given the small usage of these two functions, and the even smaller likelihood that they'd get pushed to a parallel worker anyway, it doesn't seem worth the trouble to suggest that DBAs should fix it manually. Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoAT4aHP0Uxq91qpD7NL009tnUYQe-b14R3MnSVOjtE71g@mail.gmail.com https://git.postgresql.org/pg/commitdiff/5a6f9bce8dabd371bdb4e3db5dda436f7f0a680f

  • pg_dump: label PUBLICATION TABLE ArchiveEntries with an owner. This is the same fix as commit 9eabfe300 applied to INDEX ATTACH entries, but for table-to-publication attachments. As in that case, even though the backend doesn't record "ownership" of the attachment, we still ought to label it in the dump archive with the role name that should run the ALTER PUBLICATION command. The existing behavior causes the ALTER to be done by the original role that started the restore; that will usually work fine, but there may be corner cases where it fails. The bulk of the patch is concerned with changing struct PublicationRelInfo to include a pointer to the associated PublicationInfo object, so that we can get the owner's name out of that when the time comes. While at it, I rewrote getPublicationTables() to do just one query of pg_publication_rel, not one per table. Back-patch to v10 where this code was introduced. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/8e396a773b80c72e5d5a0ca9755dffe043c97a05

  • Improve our heuristic for selecting PG_SYSROOT on macOS. In cases where Xcode is newer than the underlying macOS version, asking xcodebuild for the SDK path will produce a pointer to the SDK shipped with Xcode, which may end up building code that does not work on the underlying macOS version. It appears that in such cases, xcodebuild's answer also fails to match the default behavior of Apple's compiler: assuming one has installed Xcode's "command line tools", there will be an SDK for the OS's own version in /Library/Developer/CommandLineTools, and the compiler will default to using that. This is all pretty poorly documented, but experimentation suggests that "xcrun --show-sdk-path" gives the sysroot path that the compiler is actually using, at least in some cases. Hence, try that first, but revert to xcodebuild if xcrun fails (in very old Xcode, it is missing or lacks the --show-sdk-path switch). Also, "xcrun --show-sdk-path" may give a path that is valid but lacks any OS version identifier. We don't really want that, since most of the motivation for wiring -isysroot into the build flags at all is to ensure that all parts of a PG installation are built against the same SDK, even when considering extensions built later and/or on a different machine. Insist on finding "N.N" in the directory name before accepting the result. (Adding "--sdk macosx" to the xcrun call seems to produce the same answer as xcodebuild, but usually more quickly because it's cached, so we also try that as a fallback.) The core reason why we don't want to use Xcode's default SDK in cases like this is that Apple's technology for introducing new syscalls does not play nice with Autoconf: for example, configure will think that preadv/pwritev exist when using a Big Sur SDK, even when building on an older macOS version where they don't exist. It'd be nice to have a better solution to that problem, but this patch doesn't attempt to fix that. Per report from Sergey Shinderuk. 
Back-patch to all supported versions. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/4823621db312a0597c40686c4c94d47428889fef
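The "insist on N.N in the directory name" check can be sketched as a pure function over candidate paths. `pick_sysroot` is a hypothetical helper; the candidates would come from the xcrun and xcodebuild probes mentioned above, which are not run here, and the example paths in the usage note are illustrative.

```python
import re


def pick_sysroot(candidates):
    """Return the first candidate SDK path whose final directory name
    contains a version of the form N.N, or None if none qualifies."""
    for path in candidates:
        if not path:
            continue  # the probe that produced this candidate failed
        dirname = path.rstrip("/").split("/")[-1]
        if re.search(r"\d+\.\d+", dirname):
            return path
    return None
```

For example, given "/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk" followed by "/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk", the function skips the unversioned path and returns the second.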

  • Add missing array-enlargement logic to test_regex.c. The stanza to report a "partial" match could overrun the initially allocated output array, so it needs its own copy of the array-resizing logic that's in the main loop. I overlooked the need for this in ca8217c10. Per report from Alexander Lakhin. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/0c7d3bb99f72d66ec6ac63aee4c5fe6d683eee86

Amit Kapila pushed:

Álvaro Herrera pushed:

Michaël Paquier pushed:

  • Fix routine name in comment of catcache.c. Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACUDXLAkf_XxQO9tAUtnTNGi3Lmd8fANd+vBJbcHn1HoWA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/fce7d0e6efbef304e81846c75eddf73099628d10

  • Rework refactoring of hex and encoding routines. This commit addresses some issues with c3826f83 that moved the hex decoding routine to src/common/: - The decoding function lacked overflow checks, so when used for security-related features it was an open door to out-of-bound writes if not carefully used that could remain undetected. Like the base64 routines already in src/common/ used by SCRAM, this routine is reworked to check for overflows by having the size of the destination buffer passed as argument, with overflows checked before doing any writes. - The encoding routine was missing. This is moved to src/common/ and it gains the same overflow checks as the decoding part. On failure, the hex routines of src/common/ issue an error as per the discussion done to make them usable by frontend tools, but not by shared libraries. Note that this is why ECPG is left out of this commit, and it still includes a duplicated logic doing hex encoding and decoding. While on it, this commit uses better variable names for the source and destination buffers in the existing escape and base64 routines in encode.c and it makes them more robust to overflow detection. The previous core code issued a FATAL after doing out-of-bound writes if going through the SQL functions, which would be enough to detect problems when working on changes that impacted this area of the code. Instead, an error is issued before doing an out-of-bound write. The hex routines were being directly called for bytea conversions and backup manifests without such sanity checks. The current calls happen to not have any problems, but careless uses of such APIs could easily lead to CVE-class bugs. Author: Bruce Momjian, Michael Paquier Reviewed-by: Sehrope Sarkuni Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/aef8948f38d9f3aa58bf8c2d4c6f62a7a456a9d1
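The "check the destination size before doing any writes" pattern can be shown in a few lines. This is a sketch of the idea only: `hex_encode` here is a hypothetical Python function operating on a caller-supplied `bytearray`, not the C routine in src/common/.

```python
def hex_encode(src, dst):
    """Hex-encode bytes `src` into the preallocated bytearray `dst`.

    The required size is checked before anything is written, so a buffer
    that is too small is left unmodified. Returns the bytes written.
    """
    needed = len(src) * 2
    if needed > len(dst):
        raise OverflowError("destination buffer too small")
    digits = b"0123456789abcdef"
    for i, byte in enumerate(src):
        dst[2 * i] = digits[byte >> 4]        # high nibble
        dst[2 * i + 1] = digits[byte & 0x0F]  # low nibble
    return needed
```

Raising before the first write is the point: the caller gets an error instead of a partially written or overrun buffer.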

  • Fix O(N^2) stat() calls when recycling WAL segments. The counter tracking the last segment number recycled was getting initialized when recycling one single segment, while it should be used across a full cycle of segments recycled to prevent useless checks related to entries already recycled. This performance issue has been introduced by b2a5545, and it was first implemented in 61b86142. No backpatch is done per the lack of field complaints. Reported-by: Andres Freund, Thomas Munro Author: Michael Paquier Reviewed-By: Andres Freund Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/CA+hUKG+DRiF9z1_MU4fWq+RfJMxP7zjoptfcmuCFPeO4JM2iVg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/5ae1572993ae8bf1f6c33a933915c07cc9bc0add
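The difference between resetting the counter per segment and carrying it across the whole cycle can be seen in a toy simulation. This is a sketch under simplified assumptions: `recycle_segments` is hypothetical, a set lookup stands in for stat(), and segment numbering is abstract.

```python
def recycle_segments(old_count, start_segno, keep_counter):
    """Simulate recycling `old_count` old segments into future slots.

    Returns the number of existence probes (stand-ins for stat() calls).
    With keep_counter=False the search restarts at start_segno for every
    segment, re-probing slots that were filled earlier in the same cycle.
    """
    existing = set()  # future segment numbers already installed
    probes = 0
    next_segno = start_segno
    for _ in range(old_count):
        if not keep_counter:
            next_segno = start_segno  # the quadratic behaviour
        while True:
            probes += 1
            if next_segno not in existing:
                break
            next_segno += 1
        existing.add(next_segno)
        next_segno += 1  # the slot just filled need not be probed again
    return probes
```

In this model, recycling 100 segments costs 100 probes when the counter persists and 5050 when it is reset each time.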

  • Remove PG_SHA*_DIGEST_STRING_LENGTH from sha2.h. The last reference to those variables has been removed in aef8948, so this cleans up the code a bit. Discussion: https://postgr.es/m/X//[email protected] https://git.postgresql.org/pg/commitdiff/ccf4e277a4de120a2f08db7e45399d87e1176bda

Heikki Linnakangas pushed:

Magnus Hagander pushed:

Fujii Masao pushed:

  • Log long wait time on recovery conflict when it's resolved. This is a follow-up of the work done in commit 0650ff2303. This commit extends log_recovery_conflict_waits so that a log message is produced also when recovery conflict has already been resolved after deadlock_timeout passes, i.e., when the startup process finishes waiting for recovery conflict after deadlock_timeout. This is useful in investigating how long recovery conflicts prevented the recovery from applying WAL. Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi, Bertrand Drouvot Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/39b03690b529935a3c33024ee68f08e2d347cf4f

  • Ensure that a standby is able to follow a primary on a newer timeline. Commit 709d003fbd refactored WAL-reading code, but accidentally caused WalSndSegmentOpen() to fail to follow a timeline switch while reading from a historic timeline. This issue caused a standby to fail to follow a primary on a newer timeline when WAL archiving is enabled. If there is a timeline switch within the segment, WalSndSegmentOpen() should read from the WAL segment belonging to the new timeline. But previously, since it failed to follow a timeline switch, it tried to read the WAL segment with the old timeline. When WAL archiving is enabled, that WAL segment with the old timeline doesn't exist because it's renamed to .partial. This led the primary to try to read a nonexistent WAL segment, which caused replication to fail with the error "ERROR: requested WAL segment ... has already been removed". This commit fixes WalSndSegmentOpen() so that it's able to follow a timeline switch, to ensure that a standby is able to follow a primary on a newer timeline even when WAL archiving is enabled. This commit also adds a regression test to check whether a standby is able to follow a primary on a newer timeline when WAL archiving is enabled. Back-patch to v13 where the bug was introduced. Reported-by: Kyotaro Horiguchi Author: Kyotaro Horiguchi, tweaked by Fujii Masao Reviewed-by: Alvaro Herrera, Fujii Masao Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/fef5b47f6bfc9bfec619bb2e6e66b027e7ff21a3

  • Improve tab-completion for CLOSE, DECLARE, FETCH and MOVE. This commit makes CLOSE, FETCH and MOVE commands tab-complete the list of cursors. Also this commit makes DECLARE command tab-complete the options. Author: Shinya Kato, Sawada Masahiko, tweaked by Fujii Masao Reviewed-by: Shinya Kato, Sawada Masahiko, Fujii Masao Discussion: https://postgr.es/m/b0e4c5c53ef84c5395524f5056fc71f0@MP-MSGSS-MBX001.msg.nttdata.co.jp https://git.postgresql.org/pg/commitdiff/3f238b882c276a59f5d98224850e5aee2a3fec8c

  • Stabilize timeline switch regression test. Commit fef5b47f6b added the regression test to check whether a standby is able to follow a primary on a newer timeline when WAL archiving is enabled. But the buildfarm member florican reported that this test failed because the requested WAL segment was removed and replication failed. This is a timing issue. Since the test neither uses a replication slot nor sets wal_keep_size, a checkpoint could remove a WAL segment that's still necessary for replication. This commit stabilizes the test by setting wal_keep_size. Back-patch to v13 where the regression test that this commit stabilizes was added. Author: Fujii Masao Discussion: https://postgr.es/m/X//[email protected] https://git.postgresql.org/pg/commitdiff/424d7a9b277c0da5ec638bf6344cda899a2e544a

  • postgres_fdw: Save foreign server OID in connection cache entry. The foreign server OID stored in the connection cache entry is used as a lookup key to directly get the server name. Previously, since the connection cache entry did not have the server OID, postgres_fdw first had to get the server OID from the user mapping before getting the server name. So if the corresponding user mapping was dropped, postgres_fdw could raise the error "cache lookup failed for user mapping" while looking up the user mapping and fail to get the server name even though the server had not been dropped yet. Author: Bharath Rupireddy Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/CALj2ACVRZPUB7ZwqLn-6DY8C_UmPs6084gSpHA92YBv++1AJXA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/5e5f4fcd89c082bba0239e8db1552834b4905c34

  • Fix calculation of how much shared memory is required to store a TOC. Commit ac883ac453 refactored shm_toc_estimate() but changed its calculation of shared memory size for TOC incorrectly. Previously this could cause too much memory to be allocated. Back-patch to v11 where the bug was introduced. Author: Takayuki Tsunakawa Discussion: https://postgr.es/m/TYAPR01MB2990BFB73170E2C4921E2C4DFEA80@TYAPR01MB2990.jpnprd01.prod.outlook.com https://git.postgresql.org/pg/commitdiff/2ad78a87f018260d4474eee63187e1cc73c9b976

Peter Geoghegan pushed:

  • Pass down "logically unchanged index" hint. Add an executor aminsert() hint mechanism that informs index AMs that the incoming index tuple (the tuple that accompanies the hint) is not being inserted by execution of an SQL statement that logically modifies any of the index's key columns. The hint is received by indexes when an UPDATE takes place that does not apply an optimization like heapam's HOT (though only for indexes where all key columns are logically unchanged). Any index tuple that receives the hint on insert is expected to be a duplicate of at least one existing older version that is needed for the same logical row. Related versions will typically be stored on the same index page, at least within index AMs that apply the hint. Recognizing the difference between MVCC version churn duplicates and true logical row duplicates at the index AM level can help with cleanup of garbage index tuples. Cleanup can intelligently target tuples that are likely to be garbage, without wasting too many cycles on less promising tuples/pages (index pages with little or no version churn). This is infrastructure for an upcoming commit that will teach nbtree to perform bottom-up index deletion. No index AM actually applies the hint just yet. Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Victor Yegorov vyegorov@gmail.com Discussion: https://postgr.es/m/CAH2-Wz=CEKFa74EScx_hFVshCOn6AA5T-ajFASTdzipdkLTNQQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/9dc718bdf2b1a574481a45624d42b674332e2903

  • Enhance nbtree index tuple deletion. Teach nbtree and heapam to cooperate in order to eagerly remove duplicate tuples representing dead MVCC versions. This is "bottom-up deletion". Each bottom-up deletion pass is triggered lazily in response to a flood of versions on an nbtree leaf page. This usually involves a "logically unchanged index" hint (these are produced by the executor mechanism added by commit 9dc718bd). The immediate goal of bottom-up index deletion is to avoid "unnecessary" page splits caused entirely by version duplicates. It naturally has an even more useful effect, though: it acts as a backstop against accumulating an excessive number of index tuple versions for any given logical row. Bottom-up index deletion complements what we might now call "top-down index deletion": index vacuuming performed by VACUUM. Bottom-up index deletion responds to the immediate local needs of queries, while leaving it up to autovacuum to perform infrequent clean sweeps of the index. The overall effect is to avoid certain pathological performance issues related to "version churn" from UPDATEs. The previous tableam interface used by index AMs to perform tuple deletion (the table_compute_xid_horizon_for_tuples() function) has been replaced with a new interface that supports certain new requirements. Many (perhaps all) of the capabilities added to nbtree by this commit could also be extended to other index AMs. That is left as work for a later commit. Extend deletion of LP_DEAD-marked index tuples in nbtree by adding logic to consider extra index tuples (that are not LP_DEAD-marked) for deletion in passing. This increases the number of index tuples deleted significantly in many cases. The LP_DEAD deletion process (which is now called "simple deletion" to clearly distinguish it from bottom-up deletion) won't usually need to visit any extra table blocks to check these extra tuples. 
We have to visit the same table blocks anyway to generate a latestRemovedXid value (at least in the common case where the index deletion operation's WAL record needs such a value). Testing has shown that the "extra tuples" simple deletion enhancement increases the number of index tuples deleted with almost any workload that has LP_DEAD bits set in leaf pages. That is, it almost never fails to delete at least a few extra index tuples. It helps most of all in cases that happen to naturally have a lot of delete-safe tuples. It's not uncommon for an individual deletion operation to end up deleting an order of magnitude more index tuples compared to the old naive approach (e.g., custom instrumentation of the patch shows that this happens fairly often when the regression tests are run). Add a further enhancement that augments simple deletion and bottom-up deletion in indexes that make use of deduplication: Teach nbtree's _bt_delitems_delete() function to support granular TID deletion in posting list tuples. It is now possible to delete individual TIDs from posting list tuples provided the TIDs have a tableam block number of a table block that gets visited as part of the deletion process (visiting the table block can be triggered directly or indirectly). Setting the LP_DEAD bit of a posting list tuple is still an all-or-nothing thing, but that matters much less now that deletion only needs to start out with the right general idea about which index tuples are deletable. Bump XLOG_PAGE_MAGIC because xl_btree_delete changed. No bump in BTREE_VERSION, since there are no changes to the on-disk representation of nbtree indexes. Indexes built on PostgreSQL 12 or PostgreSQL 13 will automatically benefit from bottom-up index deletion (i.e. no reindexing required) following a pg_upgrade. The enhancement to simple deletion is available with all B-Tree indexes following a pg_upgrade, no matter what PostgreSQL version the user upgrades from.
Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Heikki Linnakangas hlinnaka@iki.fi Reviewed-By: Victor Yegorov vyegorov@gmail.com Discussion: https://postgr.es/m/CAH2-Wzm+maE3apHB8NOtmM=p-DO65j2V5GzAWCOEEuy3JZgb2g@mail.gmail.com https://git.postgresql.org/pg/commitdiff/d168b666823b6e0bcf60ed19ce24fb5fb91b8ccf
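The granular TID deletion rule described above (individual TIDs can be removed from a posting list only when their table block was visited during the deletion pass) can be modelled very loosely as follows. `prune_posting_list` and the `is_dead` callback are hypothetical; real nbtree code works on page-level structures and WAL records, none of which is represented here.

```python
def prune_posting_list(tids, visited_blocks, is_dead):
    """Split a posting list into (kept, deleted) TIDs.

    `tids` is a list of (block, offset) pairs. A TID is deleted only if its
    table block is among the blocks visited during the deletion pass and the
    visit found the tuple dead; TIDs in unvisited blocks are always kept.
    """
    kept, deleted = [], []
    for tid in tids:
        block, _offset = tid
        if block in visited_blocks and is_dead(tid):
            deleted.append(tid)
        else:
            kept.append(tid)
    return kept, deleted
```

A TID that is dead but sits in a block the pass never visited stays in the posting list, which matches the "provided the TIDs have a tableam block number of a table block that gets visited" condition.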

Tomáš Vondra pushed:

Noah Misch pushed:

  • Fix pg_dump for GRANT OPTION among initial privileges. The context is an object that no longer bears some aclitem that it bore initially. (A user issued REVOKE or GRANT statements upon the object.) pg_dump is forming SQL to reproduce the object ACL. Since initdb creates no ACL bearing GRANT OPTION, reaching this bug requires an extension where the creation script establishes such an ACL. No PGXN extension does that. If an installation did reach the bug, pg_dump would have omitted a semicolon, causing a REVOKE and the next SQL statement to fail. Separately, since the affected code exists to eliminate an entire aclitem, it wants plain REVOKE, not REVOKE GRANT OPTION FOR. Back-patch to 9.6, where commit 23f34fa4ba358671adab16773e79c17c92cbc870 first appeared. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/f713ff7c646e5912e08089de74dacdfaaac3d03b

  • Prevent excess SimpleLruTruncate() deletion. Every core SLRU wraps around. With the exception of pg_notify, the wrap point can fall in the middle of a page. Account for this in the PagePrecedes callback specification and in SimpleLruTruncate()'s use of said callback. Update each callback implementation to fit the new specification. This changes SerialPagePrecedesLogically() from the style of asyncQueuePagePrecedes() to the style of CLOGPagePrecedes(). (Whereas pg_clog and pg_serial share a key space, pg_serial is nothing like pg_notify.) The bug fixed here has the same symptoms and user followup steps as 592a589a04bd456410b853d86bd05faa9432cbbb. Back-patch to 9.5 (all supported versions). Reviewed by Andrey Borodin and (in earlier versions) by Tom Lane. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/6db992833c04c0322f7f34a486adece01651f929

Jeff Davis pushed:

Pending Patches

Andrey V. Lepikhov sent in another revision of a patch to remove unneeded self-joins in a class of places where it is safe to do so.

Tom Lane sent in a patch intended to fix a bug in which a connection string listing multiple hosts failed to fail over when a standby was not in hot standby mode, by fixing some of the retry and error logic for connecting.

David Fetter sent in another revision of a patch to surface popcount to SQL.
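For readers unfamiliar with the term, popcount is the number of bits set to 1 in a value. A minimal Python sketch follows; the function names are hypothetical and this is not the patch's SQL interface.

```python
def popcount(n: int) -> int:
    """Count the 1 bits in a non-negative integer."""
    if n < 0:
        raise ValueError("popcount is defined here for non-negative integers only")
    return bin(n).count("1")


def popcount_bytes(data: bytes) -> int:
    """Count the 1 bits across a byte string."""
    return sum(bin(b).count("1") for b in data)
```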

Andrey V. Lepikhov sent in another revision of a patch to add a bulk insert interface to the FDW API and use same in the PostgreSQL FDW. This should speed up bulk loads to tables with foreign partitions.

Masahiko Sawada and Bharath Rupireddy traded patches to avoid catalogue accesses in conversion_error_callback.

Konstantin Knizhnik and Tomáš Vondra traded patches to implement compression for libpq.

Ian Barwick and Greg Sabino Mullane traded patches to help psql tab-complete functions by including the data types of their arguments.

Mark Dilger sent in another revision of a patch to add contrib module pg_amcheck, a command line interface for running amcheck's verifications against tables and indexes.

Bharath Rupireddy sent in two more revisions of a patch to make it possible to use parallel inserts in CTAS.

Anastasia Lubennikova sent in two more revisions of a patch to set PD_ALL_VISIBLE and visibility map bits in COPY FREEZE.

Masahiko Sawada sent in a patch to implement buffer encryption to make sure the kms patch would be workable with other components using an encryption key managed by kmgr.

Simon Riggs sent in another revision of a patch to implement system-versioned temporal tables.

Ian Barwick sent in a patch to fix has_column_privilege() with attnums and non-existent columns by confirming the existence of a column even if the user has the table-level privilege. Otherwise, the function will happily report that the user has privilege on a dropped or non-existent column if an invalid attnum is provided.
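The ordering of checks described here can be sketched with a toy model. Everything in it is hypothetical: the `Table` class, `has_column_privilege_sketch`, and the privilege sets are invented for illustration and are not PostgreSQL's catalog structures. Returning `None` for a missing column is also an assumption of the sketch. It only shows the idea of confirming that the column exists before letting a table-level privilege answer the question.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Table:
    # Hypothetical stand-in for catalog data: attnum -> "is dropped" flag.
    columns: dict = field(default_factory=dict)
    table_privs: set = field(default_factory=set)     # users with table-level privilege
    column_privs: dict = field(default_factory=dict)  # attnum -> set of users


def has_column_privilege_sketch(table: Table, user: str, attnum: int) -> Optional[bool]:
    """Return None for a missing or dropped column, else True/False."""
    dropped = table.columns.get(attnum)
    if dropped is None or dropped:
        # Check existence first, so a table-level grant cannot
        # "vouch" for a column that is not there.
        return None
    if user in table.table_privs:
        return True
    return user in table.column_privs.get(attnum, set())
```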

Yugo Nagata sent in another revision of a patch to implement incremental view maintenance.

Atsushi Torikoshi sent in another revision of a patch to add the plan type (generic or custom) to pg_stat_statements.

Peter Smith sent in two more revisions of a patch to make it possible to use background workers for tablesync.

Kyotaro HORIGUCHI sent in two more revisions of a patch to make it possible to change the persistence (LOGGED/UNLOGGED) of a table without incurring a heap rewrite.

Atsushi Torikoshi sent in another revision of a patch to make it possible to collect memory contexts of the specified process via a new function, pg_get_target_backend_memory_contexts().

John Naylor sent in a patch to remove references to the now-removed replication_timeout GUC.

Hou Zhijie sent in two more revisions of a patch to add a Nullif case for eval_const_expressions_mutator.

Justin Pryzby sent in another revision of a patch to pg_upgrade to add a test to exercise binary compatibility.

Álvaro Herrera sent in another revision of a patch to set PROC_IN_SAFE_IC during REINDEX CONCURRENTLY.

Tomáš Vondra sent in four more revisions of a patch to add bulk insert for foreign tables.

Li Japin and Bharath Rupireddy traded patches to fix ALTER PUBLICATION...DROP TABLE behaviour by arranging that when an entry is invalidated in rel_sync_cache_publication_cb(), its pubactions are set to false and get_rel_sync_entry() recalculates them.
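The invalidate-then-recompute pattern in this summary can be sketched as follows. This is a toy model with hypothetical names (`RelSyncCacheSketch`, `invalidate_all`, `get_entry`, and the `lookup` callable); it is not the pgoutput code, and only illustrates resetting cached publication actions on invalidation so that the next lookup recalculates them.

```python
from typing import Callable, Dict

ACTIONS = ("insert", "update", "delete", "truncate")


class RelSyncCacheSketch:
    """Toy cache of per-relation publication actions (hypothetical)."""

    def __init__(self, lookup: Callable[[int], Dict[str, bool]]):
        self._lookup = lookup     # recomputes actions from current publications
        self._entries: Dict[int, dict] = {}

    def invalidate_all(self) -> None:
        # On a publication change: clear the actions and mark entries stale.
        for entry in self._entries.values():
            entry["valid"] = False
            entry["pubactions"] = {a: False for a in ACTIONS}

    def get_entry(self, relid: int) -> Dict[str, bool]:
        entry = self._entries.setdefault(
            relid, {"valid": False, "pubactions": {a: False for a in ACTIONS}}
        )
        if not entry["valid"]:
            # Recalculate from scratch rather than OR-ing into stale values.
            entry["pubactions"] = dict(self._lookup(relid))
            entry["valid"] = True
        return entry["pubactions"]
```

In this sketch, a relation dropped from a publication keeps its stale cached actions until `invalidate_all()` runs, after which the next `get_entry()` call reflects the change.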

Takamichi Osumi sent in three more revisions of a patch to add a new wal_level that disables WAL logging, which is designed to make bulk loads faster, with the trade-off of leaving an unrecoverable cluster if it fails midway.

Bruce Momjian sent in three more revisions of a patch to implement key management.

DRU sent in three more revisions of a patch to add documentation about data page checksums, and support checksum enable/disable in a running cluster.

Heikki Linnakangas and Andrey Borodin traded patches to add functions to 'pageinspect' to inspect GiST indexes.

Dilip Kumar sent in another revision of a patch to support custom compression methods for tables.

Yuzuko Hosoya sent in a patch to make it possible to release SPI plans for referential integrity with DISCARD ALL, which will, among other things, reduce the amount of memory used when creating or using foreign keys on tables with many partitions.

Stephen Frost sent in a patch to introduce an obsolete appendix to link old terms to new docs.

Stephen Frost sent in another revision of a patch to use pre-fetching for ANALYZE and bring the details logged for autoanalyze into line with those for autovacuum.

Michaël Paquier and Aleksey Kondratov traded patches to refactor the utility statement options.

Peter Eisentraut sent in another revision of a patch to pageinspect that changes the block number arguments to bigint to avoid possible overflows.

Tomáš Vondra sent in three more revisions of a patch to implement BRIN multi-range indexes.

Heikki Linnakangas sent in two more revisions of a patch to move a few ResourceOwnerEnlarge() calls for safety and clarity, and make resowners more easily extensible by using a single array and hash, rather than one for each type of object.

Kyotaro HORIGUCHI sent in a patch to fix some misuses of RelationNeedsWAL.

Dilip Kumar sent in another revision of a patch to ensure that pg_is_wal_replay_paused waits for recovery to pause.

Kyotaro HORIGUCHI sent in another revision of a patch to move the stats collector's temporary storage from files to shared memory.

Kyotaro HORIGUCHI sent in another revision of a patch to protect syscache from bloating with negative cache entries by adding a CatCache expiration feature.

Pavel Stěhule sent in another revision of a patch to implement schema variables.

Li Japin sent in a patch to fix a typo in a comment on WalSndPrepareWrite.

Simon Riggs sent in a patch to make it possible to change an index's uniqueness without validating it, and a way to do that validation separately.

Takayuki Tsunakawa sent in a patch to fix the size calculation for shmem TOC by changing a couple of incorrect += assignments to =.

Peter Geoghegan sent in a patch to lower vacuum_cost_page_miss's default to 3.

Ian Barwick sent in another revision of a patch to add lock acquisition wait start time to the pg_lock_status function.

Andy Fan sent in a patch to make cost_sort more accurate.

Masahiko Sawada sent in another revision of a patch to make it possible to do transactions involving multiple postgres foreign servers.

Fujii Masao and Bharath Rupireddy traded patches to add a postgres_fdw function to discard cached connections, along with both a postgres_fdw-specific and a system-wide GUC, keep_connections.

Hou Zhijie sent in a patch to remove a stray apostrophe from a comment in reorderbuffer.c.

Álvaro Herrera sent in a patch to have VACUUM ignore processes doing CREATE INDEX CONCURRENTLY (CIC) and REINDEX CONCURRENTLY (RC) when computing the Xid horizon of tuples to remove.

Álvaro Herrera sent in a patch to increase the size of pg_commit_ts buffers.

David Zhang sent in a patch to update the tablespace documentation to keep it consistent with the new table access method option for pgbench.

Iwata Aya sent in another revision of a patch to enable tracing for libpq.

Tomáš Vondra sent in two more revisions of a patch to cover expressions with extended statistics.

Tom Lane sent in a patch to fix a wrong calculation in pull_varnos().

Thomas Munro sent in another revision of a patch to make it possible to get pgbench to delay queries till connections are established.