PostgreSQL Weekly News - January 31, 2021

Posted on 2021-02-01 by PWN
PWN

PostgreSQL Weekly News - January 31, 2021

Crunchy PostgreSQL Operator 4.6.0, a system for deploying and managing open source PostgreSQL clusters on Kubernetes, released. https://access.crunchydata.com/documentation/postgres-operator/latest/releases/4.6.0/

pgAdmin4 4.30, a web- and native GUI control center for PostgreSQL, released. https://www.pgadmin.org/docs/pgadmin4/dev/release_notes_4_30.html

PostgreSQL Product News

PostgreSQL Jobs for January

https://archives.postgresql.org/pgsql-jobs/2021-01/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to [email protected].

Applied Patches

Tom Lane pushed:

  • Update time zone data files to tzdata release 2021a. DST law changes in Russia (Volgograd zone) and South Sudan. Historical corrections for Australia, Bahamas, Belize, Bermuda, Ghana, Israel, Kenya, Nigeria, Palestine, Seychelles, and Vanuatu. Notably, the Australia/Currie zone has been corrected to the point where it is identical to Australia/Hobart. https://git.postgresql.org/pg/commitdiff/c7edf4ac246b67073563354c2808c78868cbac36

  • Make storage/standby.h compile standalone again. This file has failed headerscheck/cpluspluscheck verification since commit 0650ff230, as a result of referencing typedef TimestampTz without including the appropriate header. https://git.postgresql.org/pg/commitdiff/a4b03de589c1df0845e9732da203f505f2eedb6d

  • Doc: improve documentation of pg_proc.protrftypes. Add a "references" link pointing to pg_type, as we have for other arrays of type OIDs. Wordsmith the explanation a bit. Joel Jacobson, additional editing by me Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/951862eda57e5dc8f78c97b3c30fe2032a5562b8

  • Fix broken ruleutils support for function TRANSFORM clauses. I chanced to notice that this dumped core due to a faulty Assert. To add insult to injury, the output has been misformatted since v11. Obviously we need some regression testing here. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/07d46fceb4254b00e79f3d06419cbae13b0ecb5a

  • Don't clobber the calling user's credentials cache in Kerberos test. Embarrassing oversight in this test script, which fortunately is not run by default. Report and patch by Jacob Champion. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/881933f194221abcce07fb134ebe8685e5bb58dd

  • Improve performance of repeated CALLs within plpgsql procedures. This patch essentially is cleaning up technical debt left behind by the original implementation of plpgsql procedures, particularly commit d92bc83c4. That patch (or more precisely, follow-on patches fixing its worst bugs) forced us to re-plan CALL and DO statements each time through, if we're in a non-atomic context. That wasn't for any fundamental reason, but just because use of a saved plan requires having a ResourceOwner to hold a reference count for the plan, and we had no suitable resowner at hand, nor would the available APIs support using one if we did. While it's not that expensive to create a "plan" for CALL/DO, the cycles do add up in repeated executions. This patch therefore makes the following API changes: * GetCachedPlan/ReleaseCachedPlan are modified to let the caller specify which resowner to use to pin the plan, rather than forcing use of CurrentResourceOwner. * spi.c gains a "SPI_execute_plan_extended" entry point that lets callers say which resowner to use to pin the plan. This borrows the idea of an options struct from the recently added SPI_prepare_extended, hopefully allowing future options to be added without more API breaks. This supersedes SPI_execute_plan_with_paramlist (which I've marked deprecated) as well as SPI_execute_plan_with_receiver (which is new in v14, so I just took it out altogether). * I also took the opportunity to remove the crude hack of letting plpgsql reach into SPI private data structures to mark SPI plans as "no_snapshot". It's better to treat that as an option of SPI_prepare_extended. Now, when running a non-atomic procedure or DO block that contains any CALL or DO commands, plpgsql creates a ResourceOwner that will be used to pin the plans of the CALL/DO commands. (In an atomic context, we just use CurrentResourceOwner, as before.) Having done this, we can just save CALL/DO plans normally, whether or not they are used across transaction boundaries. This seems to be good for something like 2X speedup of a CALL of a trivial procedure with a few simple argument expressions. By restricting the creation of an extra ResourceOwner like this, there's essentially zero penalty in cases that can't benefit. Pavel Stehule, with some further hacking by me Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com https://git.postgresql.org/pg/commitdiff/ee895a655ce4341546facd6f23e3e8f2931b96bf

  • Code review for psql's helpSQL() function. The loops to identify word boundaries could access past the end of the input string. Likely that would never result in an actual crash, but it makes valgrind unhappy. The logic to try different numbers of words didn't work when the input has two words but we only have a match to the first, eg "\h with select". (We must "continue" the pass loop, not "break".) The logic to compute nl_count was bizarrely managed, and in at least two code paths could end up calling PageOutput with nl_count = 0, resulting in failing to paginate output that should have been fed to the pager. Also, in v12 and up, the nl_count calculation hadn't been updated to account for the addition of a URL. The PQExpBuffer holding the command syntax details wasn't freed, resulting in a session-lifespan memory leak. While here, improve some comments, choose a more descriptive name for a variable, fix inconsistent datatype choice for another variable. Per bug #16837 from Alexander Lakhin. This code is very old, so back-patch to all supported branches. Kyotaro Horiguchi and Tom Lane Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/f76a85000bba2f1b1c926cbbe525e47b246215f1

  • Suppress compiler warnings from commit ee895a655. For obscure reasons, some buildfarm members are now generating complaints about plpgsql_call_handler's "retval" variable possibly being used uninitialized. It seems no less safe than it was before that commit, but these complaints are (mostly?) new. I trust that initializing the variable where it's declared will be enough to shut that up. I also notice that some compilers are warning about setjmp clobber of the same variable, which is maybe a bit more defensible. Mark it volatile to silence that. Also, rearrange the logic to give procedure_resowner a single point of initialization, in hopes of silencing some setjmp-clobber warnings about that. (Marking it volatile would serve too, but its sibling variables are depending on single assignment, so let's stick with that method.) Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/7292fd8f1c781278021407276474d9188845113d

  • Rethink recently-added SPI interfaces. SPI_execute_with_receiver and SPI_cursor_parse_open_with_paramlist are new in v14 (cf. commit 2f48ede08). Before they can get out the door, let's change their APIs to follow the practice recently established by SPI_prepare_extended etc: shove all optional arguments into a struct that callers are supposed to pre-zero. The hope is to allow future addition of more options without either API breakage or a continuing proliferation of new SPI entry points. With that in mind, choose slightly more generic names for them: SPI_execute_extended and SPI_cursor_parse_open respectively. Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com https://git.postgresql.org/pg/commitdiff/d5a83d79c9f9b660a6a5a77afafe146d3c8c6f46

  • Doc: improve documentation for UNNEST(). Per a user question, spell out that UNNEST() returns array elements in storage order; also provide an example to clarify the behavior for multi-dimensional arrays. While here, also clarify the SELECT reference page's description of WITH ORDINALITY. These details were already given in 7.2.1.4, but a reference page should not omit details. Back-patch to v13; there's not room in the table in older versions. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/662affcfe9e816584e3d8602b3b4005236931bbb

  • Make ecpg's rjulmdy() and rmdyjul() agree with their declarations. We had "short *mdy" in the extern declarations, but "short mdy[3]" in the actual function definitions. Per C99 these are equivalent, but recent versions of gcc have started to issue warnings about the inconsistency. Clean it up before the warnings get any more widespread. Back-patch, in case anyone wants to build older PG versions with bleeding-edge compilers. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/1b242f42ba1e732b202f35265ab1a1614ce7d859

  • Fix hash partition pruning with asymmetric partition sets. perform_pruning_combine_step() was not taught about the number of partition indexes used in hash partitioning; more embarrassingly, get_matching_hash_bounds() also had it wrong. These errors are masked in the common case where all the partitions have the same modulus and no partition is missing. However, with missing or unequal-size partitions, we could erroneously prune some partitions that need to be scanned, leading to silently wrong query answers. While a minimal-footprint fix for this could be to export get_partition_bound_num_indexes and make the incorrect functions use it, I'm of the opinion that that function should never have existed in the first place. It's not reasonable data structure design that PartitionBoundInfoData lacks any explicit record of the length of its indexes[] array. Perhaps that was all right when it could always be assumed equal to ndatums, but something should have been done about it as soon as that stopped being true. Putting in an explicit "nindexes" field makes both partition_bounds_equal() and partition_bounds_copy() simpler, safer, and faster than before, and removes explicit knowledge of the number-of-partition-indexes rules from some other places too. This change also makes get_hash_partition_greatest_modulus obsolete. I left that in place in case any external code uses it, but no core code does anymore. Per bug #16840 from Michał Albrycht. Back-patch to v11 where the hash partitioning code came in. (In the back branches, add the new field at the end of PartitionBoundInfoData to minimize ABI risks.) Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/1d9351a87c9a9e82c7091aab03d9299982670ce0

  • Silence another gcc 11 warning. Per buildfarm and local experimentation, bleeding-edge gcc isn't convinced that the MemSet in reorder_function_arguments() is safe. Shut it up by adding an explicit check that pronargs isn't negative, and by changing MemSet to memset. (It appears that either change is enough to quiet the warning at -O2, but let's do both to be sure.) https://git.postgresql.org/pg/commitdiff/1046dbedde2fc3fe55f007ff3255ab65ab98f858

  • Doc: improve cross-references for SET/SHOW. The corresponding functions set_config and current_setting were mostly not hyperlinked. Clarify their descriptions a tad, too. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/f743a2bbd4e511ad58b6ce01f81841e5e1611474

Amit Kapila pushed:

David Rowley pushed:

  • Fix hypothetical bug in heap backward scans. Both heapgettup() and heapgettup_pagemode() incorrectly set the first page to scan in a backward scan in which the number of pages to scan was specified by heap_setscanlimits(). The code incorrectly started the scan at the end of the relation when startBlk was 0, or otherwise at startBlk - 1, neither of which is correct when only scanning a subset of pages. The fix here checks if heap_setscanlimits() has changed the number of pages to scan and if so we set the first page to scan as the final page in the specified range during backward scans. Proper adjustment of this code was forgotten when heap_setscanlimits() was added in 7516f5259 back in 9.5. However, practice, nowhere in core code performs backward scans after having used heap_setscanlimits(), yet, it is possible an extension uses the heap functions in this way, hence backpatch. An upcoming patch does use heap_setscanlimits() with backward scans, so this must be fixed before that can go in. Author: David Rowley Discussion: https://postgr.es/m/CAApHDvpGc9h0_oVD2CtgBcxCS1N-qDYZSeBRnUh+0CWJA9cMaA@mail.gmail.com Backpatch-through: 9.5, all supported versions https://git.postgresql.org/pg/commitdiff/16dfe253e31f75b60e93acc0c2b5bbf19936c074

Peter Eisentraut pushed:

Robert Haas pushed:

  • Remove CheckpointLock. Up until now, we've held this lock when performing a checkpoint or restartpoint, but commit 076a055acf3c55314de267c62b03191586d79cf6 back in 2004 and commit 7e48b77b1cebb9a43f9fdd6b17128a0ba36132f9 from 2009, taken together, have removed all need for this. In the present code, there's only ever one process entitled to attempt a checkpoint: either the checkpointer, during normal operation, or the postmaster, during single-user operation. So, we don't need the lock. One possible concern in making this change is that it means that a substantial amount of code where HOLD_INTERRUPTS() was previously in effect due to the preceding LWLockAcquire() will now be running without that. This could mean that ProcessInterrupts() gets called in places from which it didn't before. However, this seems unlikely to do very much, because the checkpointer doesn't have any signal mapped to die(), so it's not clear how, for example, ProcDiePending = true could happen in the first place. Similarly with ClientConnectionLost and recovery conflicts. Also, if there are any such problems, we might want to fix them rather than reverting this, since running lots of code with interrupt handling suspended is generally bad. Patch by me, per an inquiry by Amul Sul. Review by Tom Lane and Michael Paquier. Discussion: http://postgr.es/m/CAAJ_b97XnBBfYeSREDJorFsyoD1sHgqnNuCi=02mNQBUMnA=FA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/d18e75664a2fda2e4d5cc433d68e37fc0e9499f2

  • Move StartupCLOG() calls to just after we initialize ShmemVariableCache. Previously, the hot_standby=off code path did this at end of recovery, while the hot_standby=on code path did it at the beginning of recovery. It's better to do this in only one place because (a) it's simpler, (b) StartupCLOG() is trivial so trying to postpone the work isn't useful, and (c) this will make it possible to simplify some other logic. Patch by me, reviewed by Heikki Linnakangas. Discussion: http://postgr.es/m/CA+TgmoZYig9+AQodhF5sRXuKkJ=RgFDugLr3XX_dz_F-p=TwTg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/1f113abdf87cd085dee3927960bb4f70442b7250

  • In clog_redo(), don't set XactCtl->shared->latest_page_number. The comment is no longer accurate, and hasn't been entirely accurate since Hot Standby was introduced. The original idea here was that StartupCLOG() wouldn't be called until the end of recovery and therefore this value would be uninitialized when this code is reached, but Hot Standby made that true only when hot_standby=off, and commit 1f113abdf87cd085dee3927960bb4f70442b7250 means that this value is now always initialized before replay even starts. The original purpose of this code was to bypass the sanity check in SimpleLruTruncate(), which will no longer occur: now, if something is wrong, that sanity check might trip during recovery. That's probably a good thing, because in the current code base latest_page_number should always be initialized and therefore we expect that the sanity check should pass. If it doesn't, something has gone wrong, and complaining about it is appropriate. Patch by me, reviewed by Heikki Linnakangas. Discussion: http://postgr.es/m/CA+TgmoZYig9+AQodhF5sRXuKkJ=RgFDugLr3XX_dz_F-p=TwTg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/0fcc2decd485a61321a3220d8f76cb108b082009

  • In TrimCLOG(), don't reset XactCtl->shared->latest_page_number. Since the CLOG page number is not recorded directly in the checkpoint record, we have to use ShmemVariableCache->nextXid to figure out the latest CLOG page number at the start of recovery. However, as recovery progresses, replay of CLOG/EXTEND records will update our notion of the latest page number, and we should rely on that being accurate rather than recomputing the value based on an updated notion of nextXid. ShmemVariableCache->nextXid is only an approximation during recovery anyway, whereas CLOG/EXTEND records are an authoritative representation of how the SLRU has been updated. Commit 0fcc2decd485a61321a3220d8f76cb108b082009 makes this simplification possible, as before that change clog_redo() might have injected a bogus value here, and we'd want to get rid of that before entering normal running. Patch by me, reviewed by Heikki Linnakangas. Discussion: http://postgr.es/m/CA+TgmoZYig9+AQodhF5sRXuKkJ=RgFDugLr3XX_dz_F-p=TwTg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/69059d3b2f0754c8e661ba479f7121e6631cdf4a

Andres Freund pushed:

Fujii Masao pushed:

  • postgres_fdw: Add functions to discard cached connections. This commit introduces two new functions postgres_fdw_disconnect() and postgres_fdw_disconnect_all(). The former function discards the cached connections to the specified foreign server. The latter discards all the cached connections. If the connection is used in the current transaction, it's not closed and a warning message is emitted. For example, these functions are useful when users want to explicitly close the foreign server connections that are no longer necessary and then to prevent them from eating up the foreign servers connections capacity. Author: Bharath Rupireddy, tweaked a bit by Fujii Masao Reviewed-by: Alexey Kondratov, Zhijie Hou, Zhihong Yu, Fujii Masao Discussion: https://postgr.es/m/CALj2ACVvrp5=AVp2PupEm+nAC8S4buqR3fJMmaCoc7ftT0aD2A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/411ae64997dc3a42d19eda6721c581841ce2cb82

  • postgres_fdw: Stabilize regression test for postgres_fdw_disconnect_all(). The regression test added in commit 411ae64997 caused buildfarm failures. The cause of them was that the order of warning messages output in the test was not stable. To fix this, this commit sets client_min_messages to ERROR temporarily when performing the test generating those warnings. Per buildfarm failures. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/6adc5376dca4ef8b7d591c0ee7338cb9ff660216

  • postgres_fdw: Fix test failure with -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS. The roles created by regression test should have names starting with "regress_", and the test introduced in commit 411ae64997 did not do that. Per buildfarm member longfin. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/0c3fc09fe359a6dc46f1870ceccf60ec60396bc9

  • postgres_fdw: Fix tests for CLOBBER_CACHE_ALWAYS. The regression tests added in commits 708d165ddb and 411ae64997 caused buildfarm failures when CLOBBER_CACHE_ALWAYS was enabled. This commit stabilizes those tests. The foreign server connections established by postgres_fdw behaves differently depending on whether CLOBBER_CACHE_ALWAYS is enabled or not. If it's not enabled, those connections are cached. On the other hand, if it's enabled, when the connections are established outside transaction block, they are not cached (i.e., they are immediately closed at the end of query that established them). So the subsequent postgres_fdw_get_connections() cannot list those connections and postgres_fdw_disconnect() cannot close them (because they are already closed). When the connections are established inside transaction block, they are cached whether CLOBBER_CACHE_ALWAYS was enabled or not. But if it's enabled, they are immediately marked as invalid, otherwise not. This causes the subsequent postgres_fdw_get_connections() to return different result in "valid" column depending on whether CLOBBER_CACHE_ALWAYS was enabled or not. This commit prevents the above differences of behavior from affecting the regression tests. Per buildfarm failure on trilobite. Original patch by Bharath Rupireddy. I (Fujii Masao) extracted the regression test fix from that and revised it a bit. Reported-by: Tom Lane Author: Bharath Rupireddy Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/f77717b2985aa529a185e6988de26b885ca10ddb

Michaël Paquier pushed:

Peter Geoghegan pushed:

  • Fix GiST index deletion assert issue. Avoid calling heap_index_delete_tuples() with an empty deltids array to avoid an assertion failure. This issue was arguably an oversight in commit b5f58cf2, though the failing assert itself was added by my recent commit d168b666. No backpatch, though, since the oversight is harmless in the back branches. Author: Peter Geoghegan pg@bowt.ie Reported-By: Jaime Casanova jcasanov@systemguards.com.ec Discussion: https://postgr.es/m/CAJKUy5jscES84n3puE=sYngyF+zpb4wv8UMtuLnLPv5z=6yyNw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/e42b3c3bd6a9c6233ac4c8a2e9b040367ba2f97c

  • Reduce the default value of vacuum_cost_page_miss. When commit f425b605 introduced cost based vacuum delays back in 2004, the defaults reflected then-current trends in hardware, as well as certain historical limitations in PostgreSQL. There have been enormous improvements in both areas since that time. The cost limit GUC defaults finally became much more representative of current trends following commit cbccac37, which decreased autovacuum_vacuum_cost_delay's default by 10x for PostgreSQL 12 (it went from 20ms to only 2ms). The relative costs have shifted too. This should also be accounted for by the defaults. More specifically, the relative importance of avoiding dirtying pages within VACUUM has greatly increased, primarily due to main memory capacity scaling and trends in flash storage. Within Postgres itself, improvements like sequential access during index vacuuming (at least in nbtree and GiST indexes) have also been contributing factors. To reflect all this, decrease the default of vacuum_cost_page_miss to 2. Since the default of vacuum_cost_page_dirty remains 20, dirtying a page is now considered 10x "costlier" than a page miss by default. Author: Peter Geoghegan pg@bowt.ie Discussion: https://postgr.es/m/CAH2-WzmLPFnkWT8xMjmcsm7YS3+_Qi3iRWAb2+_Bc8UhVyHfuA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/e19594c5c059d2e071b67d87ae84f569a52d2e32

  • Remove unused _bt_delitems_delete() argument. The latestRemovedXid values used by nbtree deletion operations are determined by _bt_delitems_delete()'s caller, so there is no reason to pass a separate heapRel argument. Oversight in commit d168b666823. https://git.postgresql.org/pg/commitdiff/dc43492e46c7145a476cb8ca6200fc8eefe673ef

Andrew Gierth pushed:

  • Don't add bailout adjustment for non-strict deserialize calls. When building aggregate expression steps, strict checks need a bailout jump for when a null value is encountered, so there is a list of steps that require later adjustment. Adding entries to that list for steps that aren't actually strict would be harmless, except that there is an Assert which catches them. This leads to spurious errors on asserts builds, for data sets that trigger parallel aggregation of an aggregate with a non-strict deserialization function (no such aggregates exist in the core system). Repair by not adding the adjustment entry when it's not needed. Backpatch back to 11 where the code was introduced. Per a report from Darafei (Komzpa) of the PostGIS project; analysis and patch by me. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/a3367aa3c4552170004c92369681730d85e384c6

Heikki Linnakangas pushed:

  • Add mbverifystr() functions specific to each encoding. This makes pg_verify_mbstr() function faster, by allowing more efficient encoding-specific implementations. All the implementations included in this commit are pretty naive, they just call the same encoding-specific verifychar functions that were used previously, but that already gives a performance boost because the tight character-at-a-time loop is simpler. Reviewed-by: John Naylor Discussion: https://www.postgresql.org/message-id/[email protected] https://git.postgresql.org/pg/commitdiff/b80e10638e36b9d2f0b39170c613837af2ca2aac

  • Add direct conversion routines between EUC_TW and Big5. Conversions between EUC_TW and Big5 were previously implemented by converting the whole input to MIC first, and then from MIC to the target encoding. Implement functions to convert directly between the two. The reason to do this now is that I'm working on a patch that will change the conversion function signature so that if the input is invalid, we convert as much as we can and return the number of bytes successfully converted. That's not possible if we use an intermediary format, because if an error happens in the intermediary -> final conversion, we lose track of the location of the invalid character in the original input. Avoiding the intermediate step makes the conversions faster, too. Reviewed-by: John Naylor Discussion: https://www.postgresql.org/message-id/b9e3167f-f84b-7aa4-5738-be578a4db924%40iki.fi https://git.postgresql.org/pg/commitdiff/6c5576075b0f93f2235ac8a82290fe3b6e82300d

Álvaro Herrera pushed:

Thomas Munro pushed:

Alexander Korotkov pushed:

  • Document behavior of the .** jsonpath accessor in the lax mode. When the .** jsonpath accessor handles the array, it selects both array and each of its elements. When using lax mode, subsequent accessors automatically unwrap arrays. So, the content of each array element may be selected twice. Even though this behavior is counterintuitive, it's correct because everything works as designed. This commit documents it. Backpatch to 12 where the jsonpath language was introduced. Reported-by: Thomas Kellerer Bug: #16828 Discussion: https://postgr.es/m/16828-2b0229babfad2d8c%40postgresql.org Discussion: https://postgr.es/m/CAPpHfdtS-nNidT%3DEqZbAYOPcnNOWh_sd6skVdu2CAQUGdvpT8Q%40mail.gmail.com Author: Alexandex Korotkov, revised by Tom Lane Reviewed-by: Alvaro Herrera, Thomas Kellerer, Tom Lane Backpatch-through: 12 https://git.postgresql.org/pg/commitdiff/b41645460af563cfd4e4f57f354058cf69ef3b14

  • Fix parsing of complex morphs to tsquery. When to_tsquery() or websearch_to_tsquery() meet a complex morph containing multiple words residing adjacent position, these words are connected with OP_AND operator. That leads to surprising results. For instace, both websearch_to_tsquery('"pg_class pg"') and to_tsquery('pg_class <-> pg') produce '( pg & class ) <-> pg' tsquery. This tsquery requires 'pg' and 'class' words to reside on the same position and doesn't match to to_tsvector('pg_class pg'). It appears to be ridiculous behavior, which needs to be fixed. This commit makes to_tsquery() or websearch_to_tsquery() connect words residing adjacent position with OP_PHRASE. Therefore, now those words are normally chained with other OP_PHRASE operator. The examples of above now produces 'pg <-> class <-> pg' tsquery, which matches to to_tsvector('pg_class pg'). Another effect of this commit is that complex morph word positions now need to match the tsvector even if there is no surrounding OP_PHRASE. This behavior change generally looks like an improvement but making this commit not backpatchable. Reported-by: Barry Pederson Bug: #16592 Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/CAPpHfdv0EzVhf6CWfB1_TTZqXV_2Sn-jSY3zSd7ePH%3D-%2B1V2DQ%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Tom Lane, Neil Chen https://git.postgresql.org/pg/commitdiff/0c4f355c6a5fd437f71349f2f3d5d491382572b7

Noah Misch pushed:

  • Fix CREATE INDEX CONCURRENTLY for simultaneous prepared transactions. In a cluster having used CREATE INDEX CONCURRENTLY while having enabled prepared transactions, queries that use the resulting index can silently fail to find rows. Fix this for future CREATE INDEX CONCURRENTLY by making it wait for prepared transactions like it waits for ordinary transactions. This expands the VirtualTransactionId structure domain to admit prepared transactions. It may be necessary to reindex to recover from past occurrences. Back-patch to 9.5 (all supported versions). Andrey Borodin, reviewed (in earlier versions) by Tom Lane and Michael Paquier. Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/8a54e12a38d1545d249f1402f66c8cde2837d97c

  • Fix error with CREATE PUBLICATION, wal_level=minimal, and new tables. CREATE PUBLICATION has failed spuriously when applied to a permanent relation created or rewritten in the current transaction. Make the same change to another site having the same semantic intent; the second instance has no user-visible consequences. Back-patch to v13, where commit c6b92041d38512a4176ed76ad06f713d2e6c01a8 broke this. Kyotaro Horiguchi Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/360bd2321b1ad9c47466bc485ee5eb2f4515372e

  • Revive "snapshot too old" with wal_level=minimal and SET TABLESPACE. Given a permanent relation rewritten in the current transaction, the old_snapshot_threshold mechanism assumed the relation had never been subject to early pruning. Hence, a query could fail to report "snapshot too old" when the rewrite followed an early truncation. ALTER TABLE SET TABLESPACE is probably the only rewrite mechanism capable of exposing this bug. REINDEX sets indcheckxmin, avoiding the problem. CLUSTER has zeroed page LSNs since before old_snapshot_threshold existed, so old_snapshot_threshold has never cooperated with it. ALTER TABLE ... SET DATA TYPE makes the table look empty to every past snapshot, which is strictly worse. Back-patch to v13, where commit c6b92041d38512a4176ed76ad06f713d2e6c01a8 broke this. Kyotaro Horiguchi and Noah Misch Discussion: https://postgr.es/m/[email protected] https://git.postgresql.org/pg/commitdiff/7da83415e5bc01bba0093f34f4f612b70c70b678

Pending Patches

Amit Langote sent in another revision of a patch to set ForeignScanState.resultRelInfo lazily, set ResultRelInfo.ri_PartitionRoot in all result relations, and initialize result relation information lazily.

Takamichi Osumi sent in another revision of a patch to ensure that archive recovery not miss data.

Amit Langote sent in another revision of a patch to create foreign key triggers in partitioned tables, and enforce foreign keys correctly during cross-partition updates.

Pavel Borisov sent in another revision of a patch to generate HASH and LIST partitions automatically with a new CONFIGURATION directive.

Bucoo sent in another revision of a patch to support parallel union/distinct/aggregate using batch hashagg and sort.

Vigneshwaran C sent in another revision of a patch to identify missing publications from publisher during CREATE/ALTER SUBSCRIPTION operations.

Bertrand Drouvot sent in a patch to add an --extension flag to pg_dump to dump specific extensions.

David Cramer sent in two more revisions of a patch to throw an error and rollback on a failed transaction instead of silently rolling back.

Konstantin Knizhnik sent in two more revisions of a patch to have auto_explain create statistics.

Michaël Paquier and Sehrope Sarkuni traded patches to surface sha1 to SQL.

Takayuki Tsunakawa, Iwata Aya, and Kirk Jamison traded patches to add tracing to libpq.

Masahiro Ikeda sent in three more revisions of a patch to add statistics related to write/sync wal records, and make the wal receiver report WAL statistics.

Masahiko Sawada sent in two more revisions of a patch to choose vacuum strategy before heap and index vacuums, skip btree bulkdelete if the index doesn't grow, and disable index cleanup for anti-wraparound and aggressive vacuums.

Peter Smith and Amit Kapila traded patches to enable tablesync workers to use multiple transactions apiece.

Bruce Momjian sent in four more revisions of a patch to implement key management.

Dilip Kumar sent in three more revisions of a patch to provide a new interface to get the recovery pause status.

David Rowley sent in another revision of a patch to add TID Range Scans to support efficient scanning ranges of TIDs.

Bertrand Drouvot sent in another revision of a patch to implement minimal logical decoding on standbys.

Surafel Temesgen sent in another revision of a patch to implement system-versioned temporal tables.

Jim Finnerty sent in another revision of a patch to implement 64-bit XIDs.

Amit Langote sent in another revision of a patch to overhaul how updates compute new tuples, and revise how inherited updates and deletes are handled.

Li Japin sent in a patch to fix the documentation of ALTER SUBSCRIPTION ... SET PUBLICATION.

Daniel Gustafsson sent in another revision of a patch to support checksum enable/disable in a running cluster.

Takashi Menjo sent in another revision of a patch to make it possible to use NVRAM for WAL buffers.

Masahiko Sawada sent in another revision of a patch to implement transactions involving multiple postgres foreign servers.

Dean Rasheed sent in another revision of a patch to make it possible to create and use extended statistics on expressions.

Amit Langote sent in another revision of a patch to export get_partition_for_tuple() and avoid using SPI for some RI checks.

Denis Laxalde sent in a patch to Set default transactions to read-only at server start in pg_upgrade.

Amit Khandekar sent in another revision of a patch to speed up xor'ing of two gist index signatures for tsvectors by using popcount64() on chunks instead of xor'ing char values, and avoid function pointer dereferencing for pg_popcount32/64() calls.

Konstantin Knizhnik sent in another revision of a patch to extending the auto_explain extension to generate extended statistics in case of bad selectivity estimation, and take into account extended statistics when computing join selectivity.

Michail Nikolaev sent in another revision of a patch to add full support for index LP_DEAD hint bits on standby.

Nathan Bossart sent in another revision of a patch to Avoid marking segments as ready-for-archival too early.

Anastasia Lubennikova sent in another revision of a patch intended to fix a bug that manifested as pg_upgrade fails with non-standard ACL.

Paul Guo sent in a patch to speed up pg_rewind by fsync()ing only the affected files/directories, and using copy_file_range() for file copying.

Kyotaro HORIGUCHI and Heikki Linnakangas traded patches to add catcache expiration.

Heikki Linnakangas and Daniel Gustafsson traded patches to support enabling/disabling page checksums in a running instance.

Mark Rofail sent in two more revisions of a patch to implement foreign key arrays.

Vigneshwaran C sent in three more revisions of a patch to make it possible to print a backtrace of any postgres process using a new function pg_print_backtrace() that is part of the instance in which the function is called.

Michaël Paquier and Daniel Gustafsson traded patches to make it possible to use NSS as a TLS provider for libpq.

Bharath Rupireddy sent in two more revisions of a patch to avoid catalogue accesses in slot_store_error_callback and conversion_error_callback.

Takamichi Osumi sent in a patch to log when/where wal_level is changed to 'minimal' from upper level to invalidate the old backups or make alerts to users.

Amul Sul sent in another revision of a patch to implement wal prohibit state using global barrier.

Greg Nancarrow sent in another revision of a patch to make it possible to use parallel selects in INSERT ... SELECT, and parallel INSERT or SELECT where it's safe to do so.

Justin Pryzby sent in another revision of a patch to make it possible to run CREATE INDEX CONCURRENTLY on a partitioned table.

Andrew Dunstan sent in two revisions of a patch to implement PostgresNodePath, a subclass of PostgresNode which runs commands in the context of a given PostgreSQL install path.

David Rowley sent in another revision of a patch to add a Result Cache executor node.

Nathan Bossart sent in two more revisions of a patch to add a PROCESS_TOAST option to VACUUM.

Paul Martinez sent in a patch to clarify the messages that occur when a replication connection is not allowed.

Jacob Champion sent in a patch to save a role's original authenticated identity for logging.

Mark Dilger sent in three more revisions of a patch to implement a new pg_amcheck contrib extension.

Hou Zhijie sent in another revision of a patch to add a new GUC and corresponding table option enable_parallel_dml.

Marco Montagna sent in a patch to reduce likelihood of fdw prepared statement collisions by prepending a random number to the names generated for same.

Peter Eisentraut sent in another revision of a patch to pg_dump to fix dumping of inherited generated columns.

Antonin Houska sent in another revision of a patch to clean up orphaned files using undo logs.

Dilip Kumar and Justin Pryzby traded patches to implement custom table compression methods.

Álvaro Herrera sent in another revision of a patch to add pg_atomic_monotonic_advance_u64, use atomics in xlog.c, and add barriers.

Fabien COELHO sent in another revision of a patch to add barriers to pgbench.

Thomas Munro sent in two revisions of a patch to use a global barrier to fix DROP TABLESPACE on Windows, and use condition variables for ProcSignalBarriers.

Andrew Dunstan sent in another revision of a patch to allow matching whole DN from a client certificate.

Thomas Munro sent in a patch to hold interrupts while running dsm_detach() callbacks.

Joel Jacobson sent in a patch to improve the documentation of oid columns that can be zero in system views.

Vigneshwaran C sent in another revision of a patch to add schema support for CREATE PUBLICATION.

Tom Lane sent in a patch to record foreign key relationships for the system catalogs.

Euler Taveira de Oliveira sent in another revision of a patch to implement row filtering for logical replication.