PXB-3218 - Merge MySQL 8.3 #1541

Merged
merged 1,008 commits into percona:trunk on Feb 28, 2024

Conversation

satya-bodapati
Contributor

No description provided.

gkodinov and others added 30 commits November 1, 2023 13:56
Add the missing STL headers to allow compiling on VS2022

Change-Id: I2e87cc97fa4365d22ddd18f1d5373c389fda4c86
…error to mtr output.

MTR tests that use run_ndbapitest.inc or run_java.inc were intended to
record the last 512 or 200 lines of test output in the result file on
error, to eventually show up in mtr output.

But mtr only showed the last 20 lines of difference between the test
result file and the pre-recorded test results.

An empty file include/full_result_diff.inc is introduced to be included
in test files that do not want the 20-line limitation on result diff
output.

The file is included by run_ndbapitest.inc or run_java.inc.

This will make it easier to see, for example, why ndb.test_mgmd fails in
PB2 without the need to download the tar file with all test failures, and
for PB2 builds such as gcov which do not provide such a tar file.

Change-Id: I457d0f9430bfe0960ebd60fcbe5eb25e3b71a986
Change-Id: Idefbe610f8bd2f435ed4cc0bac32898cfe9ff2fc
…array_YY

This patch fixes three issues:

     - For hashed set operations we compute the number of chunk files
       as a power of two. If the statistics for the number of rows in
       the left set operand are wrong (in this case -1), we would get
       a calculated number of chunk files of zero. When the bug report
       was made, this led to us not having any chunk files, since the
       ceiling method used (to get a power of two) returned 0 for 0
       input.  This ceiling function has since been changed
       (commit 88d716a - Bug#35813111: Use C++ standard library for
       more bit operations [my_round_up_to_next_power, noclose]), so
       we now get 1 on zero input, and we end up with just one chunk
       file.  This makes the crash seen with the old optimizer go
       away; we would quickly run out of space again and revert to
       de-duplicating via a tmp table index.  This is fixed by sanity
       checking the number-of-rows estimate and making a wild guess at
       the number of rows in the result set, currently set to 8 * the
       rows already in the hash table when we overflow memory (see the
       sketch after this list).

     - The estimate of -1 for the windowed left operand is obviously
       wrong, and we changed the code to propagate the number of rows
       of the child table. The added tests show the new behavior for
       this case.

     - Even if the estimate is not -1 but a reasonable number, fixing
       the above revealed another crash (using the repro given for the
       hypergraph optimizer, but with the old optimizer enabled): we
       ran out of space in the dedicated hash table upon re-reading
       one of the on-disk chunks into the hash table. If the estimate
       is too low, we may end up in a situation where one of the
       chunks just barely fits into the dedicated mem_root when
       processing the left operand. Later, when processing operands
       2..n, we have to re-read the chunks into the hash table in
       preparation for matching with right operand rows. We asserted
       that this should always succeed, since these rows already fit
       for the left operand. However, the order in which the rows are
       entered into the hash table for operand two will in general
       differ from that of operand one. This will in general lead to
       different fragmentation of the dedicated mem_root - notably if
       we have blobs of different sizes as in the repro - and the
       chunk that previously *just* fit may not fit this time around.
       In the example we saw that the last row in the chunk file did
       not fit (row #290 out of 290 rows in the chunk file).  We fix
       this by falling back on the thread's main mem_root if this
       happens. This should be very rare.  Note that in most cases, if
       the estimate is too low, we would run out of space when
       processing the left operand, and fall back on index-based
       de-duplication using a tmp table.
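For illustration, a minimal sketch of the estimate sanity check from the
first bullet, assuming a made-up helper name and rows-per-chunk constant;
std::bit_ceil plays the role of the fixed ceiling function (it returns 1
for input 0):

  #include <algorithm>
  #include <bit>
  #include <cstdint>

  std::uint64_t compute_chunk_file_count(std::int64_t left_rows_estimate,
                                         std::uint64_t rows_in_hash_table) {
    // Guard against bad statistics such as -1: make a wild guess of
    // 8 * the rows that already fit in memory, as described above.
    const std::uint64_t estimate =
        left_rows_estimate < 0
            ? 8 * std::max<std::uint64_t>(rows_in_hash_table, 1)
            : static_cast<std::uint64_t>(left_rows_estimate);
    const std::uint64_t rows_per_chunk = 1024;  // made-up tuning constant
    // Round the chunk count up to the next power of two; never zero.
    return std::bit_ceil(
        std::max<std::uint64_t>(estimate / rows_per_chunk, 1));
  }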

Change-Id: Ie2b299ff6f106df727866b3cd3631b7a273c1c9b
…failed.

This issue involves setting a user variable inside the argument of a
window function, which in turn is evaluated using the window frame
buffer (Window::m_needs_frame_buffering) and is row optimizable
(Window::m_row_optimizable).

Setting of user variables inside expressions is deprecated, in part
since the semantics are hard to define, but this patch avoids the
assert at least.

Normally when evaluating a function argument for a window function,
e.g.  AVG( 2 * c1 ), the function (here multiplication) will be
evaluated just before writing a row to the frame buffer, whereas
functions *containing* a window function will be evaluated later,
i.e. after the value of the window function is ready,
cf. split_sum_func2.

Now, for the first case when creating the window's output and frame
buffer tmp tables, we replace the function (here the multiplication)
with a result field in the window's tmp table,
cf. change_to_use_tmp_fields, so that the window function when
evaluated will pick up its input (the result of the multiplication)
from the window frame buffer. However, there is an explicit exception
for setting of user variables (which is implemented as a function),
cf. this comment in change_to_use_tmp_fields:

  /*
    Replace "@:=<expression>" with "@:=<tmp table column>". Otherwise, we
    would re-evaluate <expression>, and if expression were a subquery,
    this would access already-unlocked tables.
  */

which stems from Bug#11764371 at commit 6a2402b, involving
locking issues when executing a subquery while evaluating the setting
of a user variable, with a join and GROUP BY. The exception mentioned is a
problem for evaluating window functions, because the window function
(AVG) will try to evaluate Item_func_set_user_var when reading from
the frame buffer. But the underlying column is taken from the input
file, not from the output/frame buffer (it is not replaced by an
Item_ref subject to the slice indirection).

In the repro, the first two rows are null. The third row is non-null.
When trying to evaluate the wf for the third row, we invert the first
(null) row, but this time, since the third input row has been read,
the Item_func_set_user_var's argument points into the input buffer
which has a non-null row, whereas AVG expects to invert a null row
(it has counters indicating that rows 1 and 2 are both null) from the
frame buffer. This causes the assert error seen.

The fix is to *not* apply the mentioned exception for setting user
variables when windowing: that way the user variable is only set when
we read the input result set for the windowing action, and the frame
buffer will contain that value. As for the locking issue of
Bug#11764371, I tried windowing on the query mentioned for that bug
(included in the test) and saw no issue.

Change-Id: I75dc538bea55d5888783586d31ced5e9643d6fa3
…tions behind

During a data node restart, the starting node asks nodegroup peers
to copy the details of the current event subscribers to the starting
node, so that it can take over responsibility for part of the event
forwarding and buffering work when it is started.

This implies that the starting node must also monitor for subscribing
API node failures from this point forward.

As the copy of subscribers happens before the API nodes are directly
connected to the starting node, it is necessary for the starting data
node to handle API failures even when those APIs are not directly
connected during restarts.

This was already handled to some extent in QMGR, but appears to have
regressed over time.

This could result in e.g.:
 - Leaked subscribers at SUMA
 - Wasted effort forwarding events to disconnected subscribers
 - Wasted effort forwarding events to reconnected subscribers
 - Crashes / undefined behaviour if reconnected subscribers reuse
   NdbApi object identities.

test_event_mysqld is created to give some coverage of the MySQLD
ha_ndbcluster plugin stack event consumption behaviour during data
node restarts and MySQLD asynchronous disconnects.

  test_event_mysqld -n MySQLDEventsRestarts
  test_event_mysqld -n MySQLDEventsDisconnects
  test_event_mysqld -n MySQLDEventsRestartsDisconnects

The third testcase is invoked from a new MTR testcase to give simple
automated coverage.

  ndb_binlog_testevent_rd

Change-Id: Icef0fd0972fb5646bd27dbf2b9ff194938a30ba9
…FAILREP handling

Background

NdbApi events are buffered in NdbApi to match the rate of production and
rate of consumption by user code.
A maximum buffer size can be set to avoid unbounded memory usage when
the rate is mismatched for a long time.
When the maximum size is reached event buffering stops until the buffer
usage drops below a lower threshold.
A TE_OUT_OF_MEMORY event is added to the buffer to inform the consumer
that there can be missing events.
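For illustration, a minimal sketch of this high/low watermark gating,
with made-up names and structure (not NdbApi internals):

  #include <cstddef>

  class EventBufferGate {
   public:
    EventBufferGate(std::size_t max_bytes, std::size_t resume_bytes)
        : m_max(max_bytes), m_resume(resume_bytes) {}

    // Returns false while the producer is in an out-of-memory gap.
    bool may_buffer(std::size_t current_usage) {
      if (m_gapping) {
        // Stay gapped until usage drops below the lower threshold.
        if (current_usage < m_resume) m_gapping = false;
      } else if (current_usage >= m_max) {
        m_gapping = true;  // caller adds TE_OUT_OF_MEMORY once here
      }
      return !m_gapping;
    }

   private:
    const std::size_t m_max;
    const std::size_t m_resume;
    bool m_gapping = false;
  };

The fix described below changes which event types the gap applies to,
not the gating itself.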

Problem

Testing and experience show some issues with this mechanism, particularly as
it stops the buffering of both 'table data' events (TE_INSERT, TE_UPDATE,
TE_DELETE) and other meta events (TE_DROP_TABLE, TE_NODE_FAILURE,
TE_CLUSTER_FAILURE, ...).  Discarding the meta events results in issues
in NdbApi and in consumer code, as the state of the event stream is not
clear.

One specific example is where the cluster is disconnected while the event
buffer is full - in this case, the local ClusterMgr thread should add
TE_NODE_FAILURE and TE_CLUSTER_FAILURE events to the buffered events for
each active EventOperation, but this was not possible and resulted in
a crash.  Other undesirable behaviours may be possible.

Solution

In an event buffer overload situation, only the table data events
TE_INSERT, TE_UPDATE, TE_DELETE are discarded.  Meta events are buffered
and eventually processed as normal.

Effects

This means that the buffer usage is not necessarily as tightly bounded
as in the original implementation, though the rate of production of meta
events is expected to be minimal.

Additionally it means that from the consumer's point of view, when they
iterate to a position in the event stream that corresponds to an out
of memory situation, they may still observe some meta events 'during'
the out of memory period.  These events will not include data, and
specifically will not include TE_EMPTY_EPOCH events which could be
misleading.

The consumer can therefore safely assume that the out-of-memory situation
persists until it receives TE_EMPTY_EPOCH (if enabled) or data-carrying
events after a TE_OUT_OF_MEMORY event.

An additional effect of this change is that the event API's latest epoch
as returned by getLatestGCI() is no longer frozen while the event buffer
producer is in an epoch gap - it continues to climb.

This behaviour was not documented before as an indication of event buffer
OOM handling, but was used by some existing test_event testcases.  These
are modified to use observations of the buffer fill level instead to
detect when an OOM gap is being buffered.

Testing

Four new tests added to test_event_mysqld:

  test_event_mysqld -n MySQLDEventsEventBufferOverload
  test_event_mysqld -n MySQLDEventsEventBufferOverloadRestarts
  test_event_mysqld -n MySQLDEventsEventBufferOverloadDisconnects
  test_event_mysqld -n MySQLDEventsEventBufferOverloadRestartsDisconnects

New MTR test added invoking

  test_event_mysqld -n MySQLDEventsEventBufferOverloadRestartsDisconnects

  ndb_binlog_testevent_ord

Change-Id: If904fb60be2798b53cc4cb25ea3607c4289739b9
…or drop table

Problem

During NdbApi event buffer overflow, buffering of data and metadata
events is paused, resulting in both kinds of events being discarded.

This means that event subscribers waiting for metadata events
such as TE_DROP_TABLE may never receive them and will time out.

Specifically, MySQL Cluster schema distribution uses the arrival
of a TE_DROP_TABLE event on an NdbEventOperation to know when it
is required and safe to drop the NdbEventOperation.

If this event does not arrive within a bounded time then the
NdbEventOperation is not dropped, leaking resources and potentially
resulting in later problems.

Solution

The fix to
  Bug#35663761 NdbApi fails to find container for latest epoch in NODE_FAILREP handling

modified the NdbApi event buffer overload handling to only pause
the buffering of data events, but continue buffering metadata
events.
This means that in event buffer overload situations,
TE_DROP events will not be lost.

This patch adds a testcase covering this scenario:

  test_event_mysqld -n MySQLDEventsEventBufferOverloadDDL

This test is invoked from MTR as:
  ndb_binlog_testevent_os

Change-Id: I654fd0aa0c83d35956d4d01632f16f610951bd0c
Change-Id: I5488e06be61992c7d817d6febfdbb0b8d84d2e2c
Description:
------------
Each SELECT on a view causes a small memory leak on the
Windows platform.

Analysis:
---------
In case of a SELECT on a view, a Security_context (sctx) is allocated
in the MEM_ROOT. Memory allocated by the members of sctx is
not freed until the MEM_ROOT goes out of scope; as a result,
memory keeps on accumulating. Security_context employed a
mechanism, logout(), to avoid such memory accumulation, but for
it to work correctly we need to own the life cycle of the
sctx members, keeping in mind that this is a frequently
accessed code path, so that we avoid frequent memory
allocations/deallocations.

DB_restrictions is owned by the Security_context (sctx). The
former has an associative container member, std::unordered_map
(m_restrictions), that is allocated on the stack. We
need to clear the memory allocated to this member at the
time of logout(). We had a similar memory growth reported
in Bug#31919448. At that time we used the temporary-object-
and-swap() trick to free the content of m_restrictions.
Unfortunately, that trick worked on Linux but not on
Windows. This is because constructors of associative
containers can throw, which means compilers are free to
allocate memory on the heap in the constructor of
associative containers. As it turned out to be the case on
Windows, std::unordered_map allocates the memory on the
heap. It is not cleared unless the object is destroyed. That
means the swap() trick would not work in such cases, because
memory allocated by the temporary object is transferred to
m_restrictions, hence memory growth is still seen.

We have the following options to fix this situation:

- Refactor the code in the select view area. This may have
  wider repercussions, hence it is not the preferred
  solution.

- Create the m_restrictions map on the heap rather than on the
  stack. This is the preferred option.

Fix:
----
- Change m_restrictions object creation from stack to heap
  (a sketch of the idea follows below).
- Allocate the memory for m_restrictions only if it is
  required.
- Handle the unsafe usage of the partial_revokes APIs.
- Get rid of the swap() trick in DB_restrictions::clear().
- Remove is_not_empty() as we already have is_empty().
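A minimal sketch of the heap-allocation idea behind the first two fix
items; the class below is illustrative, not the actual DB_restrictions
implementation:

  #include <memory>
  #include <string>
  #include <unordered_map>

  class DB_restrictions_sketch {
   public:
    void add(const std::string &db, unsigned long privs) {
      if (!m_restrictions)  // allocate lazily, only when required
        m_restrictions = std::make_unique<
            std::unordered_map<std::string, unsigned long>>();
      (*m_restrictions)[db] |= privs;
    }
    bool is_empty() const {
      return m_restrictions == nullptr || m_restrictions->empty();
    }
    // clear() destroys the map itself, returning all of its heap
    // memory; no swap() trick is needed.
    void clear() { m_restrictions.reset(); }

   private:
    std::unique_ptr<std::unordered_map<std::string, unsigned long>>
        m_restrictions;
  };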

Change-Id: Iee828909d23991db86227f9664eb7f130a3d6d56
Change-Id: I84cfb3b47dfdf4002d1ee9c3c690f8788f84bad2
Change-Id: I2c0899e95d9cb25d4942557351e209244931f77a
The problem is in Item_field::replace_equal_field where we try to
replace fields with equivalent ones in order to push some of
these predicates down: we do not take nullability into
consideration, so some non-nullable fields were replaced with
nullable ones.
The fix is to add a function called allow_replacement which determines
whether a non-nullable field can be replaced with a nullable one
(UNKNOWN results can be treated as false), and otherwise to skip such
changes. A sketch of the check follows below.
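A hedged illustration of the nullability guard; the real
allow_replacement() operates on the server's Field/Item types, so only
the shape of the decision is shown:

  struct FieldInfo {
    bool maybe_null;  // can this field produce SQL NULL?
  };

  // Replacing a non-nullable field with a nullable one could turn a
  // definite TRUE/FALSE predicate into UNKNOWN, so allow that only
  // when the context treats UNKNOWN as false.
  bool allow_replacement(const FieldInfo &orig, const FieldInfo &subst,
                         bool unknown_treated_as_false) {
    if (!orig.maybe_null && subst.maybe_null)
      return unknown_treated_as_false;
    return true;
  }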

Change-Id: Id4370babd6d6e29e36a05ed46e4ca951507670d6
The assertion occurs because error checking was missing in
udf_handler::call_init_func() after evaluating the arguments to
the UDF.

Fixed by adding error checks, along the lines of the sketch below.
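A minimal sketch of the pattern with stand-in types; the real code uses
the server's Item and THD classes, so this only shows the shape of the
fix:

  // Stand-ins for the server's THD (session) and Item (expression).
  struct THD {
    bool error_raised = false;
    bool is_error() const { return error_raised; }
  };
  struct Item {
    long long val_int() { return 0; }  // evaluation may raise an error
  };

  // Check for an error after evaluating each UDF argument and
  // propagate it, instead of asserting later that none occurred.
  bool evaluate_udf_args(Item **args, unsigned arg_count, THD *thd) {
    for (unsigned i = 0; i < arg_count; i++) {
      args[i]->val_int();
      if (thd->is_error()) return true;
    }
    return false;
  }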

Change-Id: Icbbd256bdd4c6b21f4ec2c8b8b8ce856ebe68a47
…r to execute k nearest neighbors (kNN) queries. Works with InnoDB distance scan and the hypergraph optimizer.

Change-Id: I3c25f586b5cd6b1ae5fd74ebb0897071f23336d4
This patch added some abstraction layers between physical libraries
(bundled or system) and their names on disk vs. their names in cmake code:
    Bug#35057542 Create INTERFACE libraries for bundled/system zlib/zstd/lz4

When generating the files mysqlclient.pc (for pkg-config) and
mysql_config (a shell script), we need to go through these abstraction
layers, and produce library names as required by the linker of the
host platform. This is done by the cmake macro EXTRACT_LINK_LIBRARIES.

The fix is to store these explicitly as "-lz" and "-lzstd"
respectively.  The lz4 library is not used by our client library, so
there is no need to store the linker option for that.

Change-Id: Ief70f66900bec2d07fdb3daa88caf60a138c953b
The fix for bug#35865438 'ndb_mgm hangs without --ndb-tls-search-path'
misused the ndb_mgm_get_clusterlog_severity_filter() API.

Change-Id: I0ef255c83b2ff3efaf0ebc1ad502e61e0e483575
This worklog will remove the system variable `slave-rows-search-algorithms`

Change-Id: Ic8264bce904d1102c90eb83cbb48ea78cdeef264
Removed the option.
Removed the tests which use the option.

Change-Id: I5316b2c49eb4cf1dc583d97891e7b43a5c472eca
… missing DDL statements

WL#15497 lets group replication determine whether any session
is running a DDL statement that would make it inadvisable to
change the primary server at this point.

This changeset adds support for additional DDL statements
as well as some DCL statements:

13.1.19 CREATE SPATIAL REFERENCE SYSTEM Statement
13.1.31 DROP SPATIAL REFERENCE SYSTEM Statement

13.1.12 CREATE DATABASE Statement
13.1.2  ALTER DATABASE Statement
13.1.24 DROP DATABASE Statement

13.1.18 CREATE SERVER Statement
13.1.8  ALTER SERVER Statement
13.1.30 DROP SERVER Statement

13.1.5  ALTER INSTANCE Statement
13.1.36 RENAME TABLE Statement

13.1.23 CREATE VIEW Statement
13.1.11 ALTER VIEW Statement
13.1.35 DROP VIEW Statement

13.1.22 CREATE TRIGGER Statement
13.1.34 DROP TRIGGER Statement

13.1.21 CREATE TABLESPACE Statement
13.1.10 ALTER TABLESPACE Statement
13.1.33 DROP TABLESPACE Statement

13.1.17 CREATE PROCEDURE and CREATE FUNCTION Statements
13.1.7  ALTER PROCEDURE Statement
13.1.4  ALTER FUNCTION Statement
13.1.29 DROP PROCEDURE and DROP FUNCTION Statements

13.1.14 CREATE FUNCTION Statement       (SONAME)
13.1.26 DROP FUNCTION Statement         (SONAME)

13.1.13 CREATE EVENT Statement
13.1.3  ALTER EVENT Statement
13.1.25 DROP EVENT Statement

13.1.15 CREATE INDEX Statement          (WL#15497)
13.1.27 DROP INDEX Statement            (WL#15497)

13.1.20 CREATE TABLE Statement          (WL#15497)
13.1.9  ALTER TABLE Statement           (WL#15497)
13.1.37 TRUNCATE TABLE Statement        (WL#15497)
13.1.32 DROP TABLE Statement            (WL#15497)

13.7.1.3 CREATE USER Statement
13.7.1.1 ALTER USER Statement
13.7.1.5 DROP USER Statement

13.7.1.2 CREATE ROLE Statement
13.7.1.9 SET DEFAULT ROLE Statement
13.7.1.11 SET ROLE Statement
13.7.1.4 DROP ROLE Statement

13.7.1.6 GRANT Statement
13.7.1.8 REVOKE Statement

13.7.1.7 RENAME USER Statement
13.7.1.10 SET PASSWORD Statement

Change-Id: Ia472d9ce1b4ac9bfe00d3ec619baa5a32f719857
Change-Id: I3d7820d23a7f4f80136eab131fd9e59e7c9175d9
…er WL#13448

After

  WL#13448 Remove COM_XXX command which are deprecated.

the test:

  ...classic_protocol_reconnect_all_commands

fails as the error-codes changed.

Change
======

- updated the expected error-codes for COM_PROCESSKILL,
  COM_LIST_FIELDS and COM_REFRESH

Change-Id: Icb09aa6253e850353b54a2ba4b90c7c23c9f3965
In the past the pattern:

  std::shared_ptr<void> exit_guard(nullptr,
    [this](void *) { ... });

has been used as adhoc Scope_guard.

Change
======

- use Scope_guard explicitly
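Sketch contrasting the two idioms; create_scope_guard() is assumed to
be the factory from the server's scope_guard header:

  #include <mutex>

  #include "scope_guard.h"  // assumed location of create_scope_guard()

  void demo(std::mutex &m) {
    m.lock();
    // Before: an adhoc guard abusing shared_ptr's custom deleter:
    //   std::shared_ptr<void> exit_guard(nullptr,
    //     [&m](void *) { m.unlock(); });
    // After: the intent is explicit and no shared_ptr machinery is
    // involved; the lambda runs when the guard leaves scope.
    auto exit_guard = create_scope_guard([&m] { m.unlock(); });
    // ... work while the mutex is held ...
  }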

Change-Id: I7593730da0f90a028231b580c443c3d0062b2cf6
In the past:

  std::shared_ptr<void> exit_trigger(nullptr,
    [&](void *) { ... });

has been used as an adhoc scope guard.

In this case, the guard is used around a std::recursive_mutex
for which a std::lock_guard exists.

Change
======

- use a std::lock_guard instead of the adhoc scope-guard
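A minimal sketch of the change; the recursive mutex is a stand-in for
the member the guard protects:

  #include <mutex>

  std::recursive_mutex g_mutex;

  void locked_section() {
    // The dedicated RAII type replaces the shared_ptr-based adhoc
    // guard; the mutex unlocks automatically on scope exit.
    std::lock_guard<std::recursive_mutex> guard(g_mutex);
    // ... critical section ...
  }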

Change-Id: Iae88f4d60db70894620cd3d108056588182c15d0
Additional patch, removing reference to Ubuntu 16 and Ubuntu 18.

Change-Id: I91ca91e91b4609d542d034f60fc1ec9e62ce4a10
mbremyk and others added 25 commits November 27, 2023 09:31
Bug#35337193 Add used columns to iterator-based EXPLAIN FORMAT=JSON.
Bug#36027152 Add schema name to iterator-based EXPLAIN FORMAT=JSON.

Other EXPLAIN JSON format changes:
- Change the "table_name" field to be the name of the base table instead
  of the alias.
- Add an "alias" field to hold the alias that was previously in
  "table_name".
Change-Id: I1c0ac1c09b3f859f7edbca86c19f2fb0351936a2
Change the rhel condition from == 7 to >= 7.

For 8.3+ this condition can be removed since EL6 is not supported.

Add DISABLE_MISSING_PROFILE_WARNING() to libfido2 as it otherwise fails
to compile in the second phase. This was not needed before since
libfido2 does not build on EL7.

Change-Id: Id2bcf30bdcc073e3257a83aa4edc5367c7f27f2f
…flags

Recent versions of Clang have changed their implementation of
std::sort(), and our own 'varlen_sort()' function returns wrong
results.

The result is that some of our .mtr tests using the MRR strategy are
failing.

The fix is to remove usage of std::sort(), and implement our own
sorting algorithm instead.

Change-Id: Iec35400503309c026766d5b2f10b1e32e2e7a319
Approved by: Erlend Dahl <erlend.dahl@oracle.com>
Post-push fix:
storage/innobase/dict/dict0stats.cc:2241:7:
error: ret may be used uninitialized [-Werror=maybe-uninitialized]
 2241 |       if (ret == DB_SUCCESS) {

Change-Id: Id0448b37e9182ef60ff98724cd51177b8e445aa5
 replace_embedded_rollup_references_with_tmp_fields

Regression caused by Bug#35390341 Assertion `m_count > 0 && m_count >
m_frame_null_count' failed.

That issue involved setting a user variable inside the argument of a
window function, which in turn was evaluated using the window frame
buffer (Window::m_needs_frame_buffering) and was row optimizable
(Window::m_row_optimizable). The first solution had an unfortunate
side-effect, as seen in the present bug: here, the window function
doesn't have an argument containing the setting of a user variable; it
is the other way around: the setting of a user variable requires the
value of a window function.

Solution: refine the criterion for when we evaluate the setting of a
user variable before windowing to only include the case of a wf
containing such a setting (Bug#35390341).

Change-Id: Idc6824adf4bd123775a14b92bfe54824acf105c8
To get an impression of the usage of i_s.processlist,
we add status counters to the mysqld server to keep
track of how many times i_s processlist is used in a
query, and the last timestamp for when it was used.

We also add a thread in the health monitor component
to print the variable values to the error log once an
hour, provided the usage is > 0.

This enables us to use log searching in MHS to see
how frequently i_s processlist is used (counter sketch below).
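A rough sketch of the bookkeeping, assuming plain atomics; the real
server exposes these as status variables, and the names here are
illustrative:

  #include <atomic>
  #include <chrono>
  #include <cstdint>

  struct ProcesslistUsage {
    std::atomic<std::uint64_t> count{0};
    std::atomic<std::int64_t> last_used_unix{0};

    // Called whenever a query reads i_s.processlist.
    void record_use() {
      count.fetch_add(1, std::memory_order_relaxed);
      last_used_unix.store(
          std::chrono::duration_cast<std::chrono::seconds>(
              std::chrono::system_clock::now().time_since_epoch())
              .count(),
          std::memory_order_relaxed);
    }
  };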

Change-Id: Ia1060151cd265fd81d37e88c5c9d3d30c703bae9
Bug#36021695 - WL#15951: Missing error message when a pattern is
returning a csv file to load but "dialect": {"format": "parquet"} is
provided

 * Added error code to error messages struct, updated interface
   functions.
 * Added new lakehouse error codes and messages to mysql's
   messages_to_clients.txt
 * Parquet and Avro now use the same codes and messages for common
   errors with CSV.
 * Removed CanLogWarning function and moved the check into AddWarning.
 * All datalake tests have been re-recorded to include new error codes.

Change-Id: Ib17d631cacf843d0ec744ffb2c27d73e842529a5
              found by fuzzing tool
Bug#35779012: SEGV (Item_subselect::print() at
              sql/item_subselect.cc:835) found by fuzzing tool
Bug#35733778: SEGV (Item_subselect::exec() at
              sql/item_subselect.cc:660) found by fuzzing tool
Bug#35738531: Another SEGV (Item_subselect::exec() at
              sql/item_subselect.cc:660) found by fuzzing tool

If an Item_ref object is referenced from multiple
places in a query block, and if the item that it
refers to is also referenced from multiple places,
there is a chance that while removing redundant
expressions, we could end up removing the
referenced item even though it is still being
referred to.
E.g., while resolving the ORDER BY clause, if the
expression is found in the select list, the expression
used in ORDER BY is removed and it starts
using the one in the select list. When this happens,
while removing the redundant expressions from the
query block, if the select expression is an
Item_ref object, on the first visit to this
expression we mark the object as unlinked. On
the second visit, this time because of the
ORDER BY, as the object is marked as unlinked,
it exits the traversal without doing anything. However,
when the item it refers to is visited, it does not
know that the item is still being referred to, so
it ends up deleting the referenced item.

The solution is to decrement the ref_count of
an item without initiating the cleanup
of the underlying item unless it is the last
reference (this necessitated changes to
all implementations of clean_up_after_removal;
see the sketch below).
Along with this we also remove the m_unlinked member
because it is no longer needed. If the underlying
item of an Item_ref object is not abandoned, we
decrement the ref count and stop looking further.
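A minimal sketch of the rule, with a stand-in type; the real logic
lives in the Item classes' clean_up_after_removal implementations:

  #include <cassert>

  struct RefCounted {
    int ref_count = 1;
    void destroy() { /* release resources of the underlying item */ }
  };

  // Only dropping the *last* reference may clean up the underlying
  // item; earlier drops just decrement and stop looking further.
  void release_reference(RefCounted *item) {
    assert(item->ref_count > 0);
    if (--item->ref_count == 0) item->destroy();
  }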

Change-Id: I4ef3aaf92a8c0961a541dae09c766929d93bb64e
Add a getter and a setter to better encapsulate
LEX::using_hypergraph_optimizer.
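A minimal sketch of the encapsulation, assuming the member was
previously a public bool on LEX; only the shape is shown:

  class LEX_sketch {
   public:
    bool using_hypergraph_optimizer() const {
      return m_using_hypergraph_optimizer;
    }
    void set_using_hypergraph_optimizer(bool value) {
      m_using_hypergraph_optimizer = value;
    }

   private:
    bool m_using_hypergraph_optimizer = false;
  };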

Change-Id: I56467ccedbe929fa18961cc37d7990febd77bdc9
Post-push fix: clean up dangling garbage pointer

Change-Id: I1c0ac1c09b3f859f7edbca86c19f2fb0351936a2
…tforms

Bug#36054662 With PGO enabled, build fails to produce commercial-debuginfo RPM on EL8
Bug#36072667 EL8 RPM "libmysqlclient.a" exports no symbols

strip and dwz from /usr/bin used by rpmbuild might break object files
and binaries produced by GCC Toolset on el* platforms.

See for example:
 https://sourceware.org/bugzilla/show_bug.cgi?id=24195

Point to the newer strip and make sure the corresponding dwz tool is
available (set $PATH correctly to let the script use it).

Post-processing in rpmbuild by brp_strip_static_archive runs "strip
-g", which breaks the static archive on el8 even if the newer strip is
used, so this processing is disabled completely.

Change-Id: I681fd2bc3a7556d09fdc9e77357d779fc8c7d336
Testcase failure exposed a regression from the fix to
  Bug#22602898 NDB : CURIOUS STATE OF TC COMMIT_SENT / COMPLETE_SENT TIMEOUT HANDLING

Node failure handling in TC performs one pass through the local active
transactions to find those affected by a node failure.

In this pass, all transactions affected by the node failure are queued
for processing, e.g. rollback, commit, complete, via e.g. the serial
abort/commit or complete protocols.

The exceptions are transactions in transient internal states such as
CS_PREPARE_TO_COMMIT, CS_COMMITTING, CS_COMPLETING, which are then followed
by stable 'wait' states such as CS_COMMIT_SENT, CS_COMPLETE_SENT.
Transactions in these states were handled by doing nothing in the node failure
handling pass, and relying on the timeout handling in the subsequent
stable states to queue transactions for processing.

The fix to Bug#22602898 removed this stable state handling to avoid it
accidentally triggering, but also stopped it from triggering when needed
in this case where node failure handling found a transaction in a transient
state.

This is solved by modifying the CS_COMMIT_SENT and CS_COMPLETE_SENT stable
state handling to also perform node failure processing if a timeout has
occurred for a transaction with a failure number different from the
current latest failure number, as sketched below.

This ensures that all transactions involving the failed node are handled
eventually.
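A rough sketch of the refined timeout rule, with illustrative stand-in
types (the actual states and fields live in the TC block):

  struct TcTransaction {
    enum class State { COMMIT_SENT, COMPLETE_SENT /* ... */ };
    State state;
    unsigned fail_number;  // latest node failure this transaction saw
  };

  void on_stable_state_timeout(TcTransaction &trans,
                               unsigned current_fail_number) {
    if ((trans.state == TcTransaction::State::COMMIT_SENT ||
         trans.state == TcTransaction::State::COMPLETE_SENT) &&
        trans.fail_number != current_fail_number) {
      // Queue the transaction for serial commit/complete handling,
      // as node failure handling would have done had its state not
      // been transient at the time.
    }
  }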

A new testcase testNodeRestart -n TransientStatesNF T1 is added to the
AT testsuite to give coverage.

Change-Id: I0c0d4b6f75a97a3a7ff892cc4eafd2351491a8ff
The root cause is that the mutexes 'theMultiTransporterMutex' and
'clusterMgrThreadMutex' are taken in different orders in the
two respective call chains:

1) ClusterMgr::threadMain() -> lock() -> NdbMutex_Lock(clusterMgrThreadMutex)
   - ::threadMain(), holding clusterMgrThreadMutex -> TransporterFacade::startConnecting()
     - TF::startConnecting -> lockMultiTransporters()  <<<< HANG while holding clusterMgrThreadMutex

2) TransporterRegistry::report_disconnect() -> lockMultiTransporters()
   - ::report_disconnect(), holding theMultiTransporterMutex, -> TransporterFacade::reportDisconnect()
     - TF::reportDisconnect -> ClusterMgr::reportDisconnected()
       - ClusterMgr::reportDisconnected() -> lock()
         - lock() -> NdbMutex_Lock(clusterMgrThreadMutex) <<<< Held by 1)

The patch changes TransporterRegistry::report_disconnect() such that
theMultiTransporterMutex is released before calling reportDisconnect(NodeId).

It should be sufficient to hold theMultiTransporterMutex while
::report_disconnect() checks if we are disconnecting a multi-transporter,
and if all its Trps are in the DISCONNECTED state. When this is
finished we have set up 'ready_to_disconnect' and can release
theMultiTransporterMutex before calling reportDisconnect().
Change-Id: I19be0d9d92184efb8f20a92aa7189b9b85f069bc
…ary engine when we have OUT arguments

Before this patch, RAPID could not handle queries of the type
SELECT ... INTO <list of variables>. The reason for this was that this
kind of query was set up with a special Query_result interceptor
(Query_dumpvar), whereas regular SELECT queries were set up with
Query_result_send, which was substituted on the RAPID side with a
special protocol adapter. However, RAPID has also implemented another
protocol adapter (an Item wrapper), which is used for INSERT INTO
and CREATE TABLE ... SELECT statements.

It was noted that this adapter could also be used for SELECT INTO,
where the Item wrapper is used as an intermediary for Query_dumpvar.

Two new property functions on class Query_result are implemented:
use_protocol_adapter() returns true for Query_result_send and
use_protocol_wrapper() returns true for Query_dumpvar.
An alternative implementation may check these functions to decide when
to use adapters and wrappers, respectively, instead of the original
Query_result classes. The function is_interceptor() is no longer used
and is hence removed.

We identify this new requirement in IsSupportedProtocol(), where
we explicitly allow Query_dumpvar and Query_result_send query results,
and in CreateQEP() where we create the Item wrapper for Query_dumpvar.
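For illustration, a sketch of the two property functions on a
simplified hierarchy (stand-in classes, not the server's actual
declarations):

  class Query_result {
   public:
    virtual ~Query_result() = default;
    virtual bool use_protocol_adapter() const { return false; }
    virtual bool use_protocol_wrapper() const { return false; }
  };

  class Query_result_send : public Query_result {
   public:
    // Regular SELECT: substituted with the protocol adapter by RAPID.
    bool use_protocol_adapter() const override { return true; }
  };

  class Query_dumpvar : public Query_result {
   public:
    // SELECT ... INTO <variables>: goes through the Item wrapper.
    bool use_protocol_wrapper() const override { return true; }
  };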

Notice that this implements SELECT ... INTO both as regular statements,
as prepared statements and as procedure statements.

Notice also that the syntax variants SELECT ... INTO OUTFILE and
SELECT ... INTO DUMPFILE are still not supported.

Most of the changes are in the test suite where we have eliminated the
earlier ER_SECONDARY_ENGINE_PLUGIN error code.

A couple of test lines were removed from the tests rapid.sp and
rapid.cp_sp because they gave different results in dict and varlen
modes.

Change-Id: Icd56cb6fbc32e121a59599a7e5b7d651747804f5
…ith executing queries from stored procedures Bug#35988564: CREATE TEMPORARY TABLE failure after Table_ref::restore_properties

The problem is that when reaching the function RapidOptimize(),
the value of LEX::using_hypergraph_optimizer is not consistent with
the value of the session variable use_old_optimizer.

The problem is a missing setting of LEX::using_hypergraph_optimizer
in the execution of a query. It was only synchronized with the
hypergraph optimizer setting in preparation of a query, which is
sufficient for regular statements which always perform both
preparation and execution, but not for stored procedures that have
separate preparation and execution. The solution is to add this setting.

But this revealed another problem: sometimes execution is out of sync
with the current preparation. An optimization with the hypergraph
optimizer requires that the preparation is also performed with
settings specific to the hypergraph optimizer. This may happen, e.g., if
the value of the session variable use_secondary_engine is switched from
"off" or "on" to "forced", either between a preparation and an execution
(for prepared statements) or between two executions (for prepared
statements and stored procedure statements).

The solution to this is to track the current preparation state versus
the one desired (the optimizer switch setting). It is now checked
that the value of using_hypergraph_optimizer matches the current
optimizer switch setting just after opening tables and before optimizing
such statements; if it does not match, we call ask_to_reprepare().
During optimization we set using_hypergraph_optimizer according to the
optimizer switch.

The test rapid.pfs_secondary has an increased reprepare count because we
now detect more often that an optimization requires a new preparation.

In addition, a test case was added to cover the problem described in
bug#35988564.

Change-Id: Ibf158576ec4cd1edde655d41f7c8bf2813e208ee
produced by my_print_stacktrace

- Add library backtrace in extra/libbacktrace/sha9ae4f4a. Approved as
  LIC#99705/BID#154396.
- Implement stacktrace::{full, simple, pcinfo, syminfo} to encapsulate
  the backtrace state needed for calling the corresponding backtrace_*
  functions - this state needs to be created once in the lifetime of the
  process. The stacktrace namespace is in <backtrace/stacktrace.hpp>.
- Add a convenience library backtrace and the alias ext::backtrace.
- Add the CMake option WITH_EXT_BACKTRACE to control if the library will
  be used, in which case HAVE_EXT_BACKTRACE is defined.
- If HAVE_EXT_BACKTRACE is defined, use <backtrace/stacktrace.hpp> in
  my_print_stacktrace (see the usage sketch below).
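For context, a sketch of how the bundled libbacktrace is typically
driven; backtrace_create_state()/backtrace_full() are the library's
documented C API, while the printing callbacks here are illustrative:

  #include <backtrace.h>

  #include <cstdint>
  #include <cstdio>

  static int print_frame(void *, uintptr_t pc, const char *filename,
                         int lineno, const char *function) {
    fprintf(stderr, "%p %s at %s:%d\n", (void *)pc,
            function ? function : "??", filename ? filename : "??",
            lineno);
    return 0;  // continue walking the stack
  }

  static void print_error(void *, const char *msg, int errnum) {
    fprintf(stderr, "backtrace error: %s (%d)\n", msg, errnum);
  }

  void dump_stack() {
    // The state must be created once in the lifetime of the process,
    // as noted above.
    static backtrace_state *state = backtrace_create_state(
        /*filename=*/nullptr, /*threaded=*/1, print_error, nullptr);
    backtrace_full(state, /*skip=*/1, print_frame, print_error, nullptr);
  }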

Change-Id: I8e0c5fa30b2dd986d42e008b26d9fd1195871177
produced by my_print_stacktrace

Additional patch, enabling stacktrace for non-glibc Linux platforms.

Change-Id: I8ac697e173cb40810fe37b1685e1495fb2660fa7
produced by my_print_stacktrace

Additional patch, enabling stacktrace for Solaris.

Note that man syscall(2) says #include <sys/syscall.h> also on Linux.

Change-Id: Ic9e567b9468e3ca3a0b8f39412099a75710022d4
produced by my_print_stacktrace

Additional patch, enabling stacktrace for freebsd and all Linux
platforms.

Change-Id: Ibc2c83ea7172a2b3af3bd6757ac09ec057c7d1ca
Add an mtr test which is only intended to be run manually:
  ./mtr --no-check-testcases print_stacktrace
Inspect the output in var/log/mysqld.1.err

Change-Id: Ia308592441df0e4a23a18c590df867a15882cbef
Conflict resolutions:

CMakeLists.txt

7079e5e removed Wno-format; the merge conflict is only about disabling
Google unit tests for PXB.

sql_common.h

a32b194 removed MYSQL_FIELD and PXB has changed the #define from
&& !defined(MYSQL_COMPONENT) to || XTRABACKUP

sql/mysqld.h
sql/mysqld.cc
static to non-static in PXB

storage/innobase/log/log0recv.cc
Adjusted recv_scan_log_recs to take the 'to_lsn' parameter

sql/server_component/mysql_command_backend.cc
a32b194 removed MYSQL_FIELD. PXB has #define around this code

storage/innobase/xtrabackup/src/backup_mysql.cc
mysql_stmt_bind_param() has been deprecated, use mysql_stmt_bind_named_param() instead

storage/innobase/xtrabackup/src/xtrabackup.cc
41bc027 changed component_infrastructure_deinit() regarding printing
shutdown component messages. PXB uses the same API; adjusted to the new
signature.
@it-percona-cla

it-percona-cla commented Feb 15, 2024

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 20 committers have signed the CLA.

✅ satya-bodapati
❌ kahatlen
❌ gurusami
❌ dahlerlend
❌ dtcpatricio
❌ mithuncy
❌ bjornmu
❌ weigon
❌ gkodinov
❌ blaudden
❌ ram1048
❌ mbremyk
❌ ssorumgard
❌ trosten
❌ frazerclement
❌ roylyseng
❌ nacarvalho
❌ vinc13e
❌ marcalff
❌ kboortz
You have signed the CLA already but the status is still pending? Let us recheck it.

satya-bodapati self-assigned this Feb 15, 2024

percona-ysorokin (Contributor) left a comment

LGTM with minor comment

include/sql_common.h (review thread resolved)
satya-bodapati merged commit efbc374 into percona:trunk Feb 28, 2024
1 of 3 checks passed