Replace one insert call per row with one insert call for multiple rows #1502
Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @tristanvuong2021)
src/main/kotlin/org/wfanet/measurement/reporting/deploy/v2/postgres/writers/CreateMetrics.kt
line 216 at r2 (raw file):
) val tempMetricsCurIndex = metricsCurIndex
I'm not a fan of this approach. There should be a better way to handle this. I know that the JDBC driver has a reWriteBatchedInserts option that basically does this for you, but I'm not sure if this applies for R2DBC. You could give that a shot.
If we really need to handle it ourselves, we should do the rewriting in the common-jvm statement builder (basically changing how add works).
Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @SanjayVas)
src/main/kotlin/org/wfanet/measurement/reporting/deploy/v2/postgres/writers/CreateMetrics.kt
line 216 at r2 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
I'm not a fan of this approach. There should be a better way to handle this. I know that the JDBC driver has a reWriteBatchedInserts option that basically does this for you, but I'm not sure if this applies for R2DBC. You could give that a shot.
If we really need to handle it ourselves, we should do the rewriting in the common-jvm statement builder (basically changing how add works).
For parameterized statements, this is the only method. For statements that aren't parameterized, there is a batch connection.
Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @tristanvuong2021)
src/main/kotlin/org/wfanet/measurement/reporting/deploy/v2/postgres/writers/CreateMetrics.kt
line 216 at r2 (raw file):
Previously, tristanvuong2021 (Tristan Vuong) wrote…
For parameterized statements, this is the only method. For statements that aren't parameterized, there is a batch connection.
Did you try setting reWriteBatchedInserts in the connection options? For the underlying JDBC driver, that option is intended to basically automatically rewrite the multiple insert statements into a single one.
Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @SanjayVas)
src/main/kotlin/org/wfanet/measurement/reporting/deploy/v2/postgres/writers/CreateMetrics.kt
line 216 at r2 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
Did you try setting reWriteBatchedInserts in the connection options? For the underlying JDBC driver, that option is intended to basically automatically rewrite the multiple insert statements into a single one.
That isn't one of the connection options for postgres r2dbc
Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @tristanvuong2021)
src/main/kotlin/org/wfanet/measurement/reporting/deploy/v2/postgres/writers/CreateMetrics.kt
line 216 at r2 (raw file):
Previously, tristanvuong2021 (Tristan Vuong) wrote…
That isn't one of the connection options for postgres r2dbc
It's not one of the Option constants, but the options are just strings. The question is whether anything actually reads that option in the R2DBC implementation (e.g. does it end up calling some code from the JDBC impl)
Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @tristanvuong2021)
src/main/kotlin/org/wfanet/measurement/reporting/deploy/v2/postgres/writers/CreateMetrics.kt
line 216 at r2 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
It's not one of the Option constants, but the options are just strings. The question is whether anything actually reads that option in the R2DBC implementation (e.g. does it end up calling some code from the JDBC impl)
Ah, perhaps this does indeed not do anything for the R2DBC version. Please reference pgjdbc/r2dbc-postgresql#527 in a comment indicating that we need to do this because R2DBC does not support rewriting batched inserts.
Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @SanjayVas)
src/main/kotlin/org/wfanet/measurement/reporting/deploy/v2/postgres/writers/CreateMetrics.kt
line 216 at r2 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
Ah, perhaps this does indeed not do anything for the R2DBC version. Please reference pgjdbc/r2dbc-postgresql#527 in a comment indicating that we need to do this because R2DBC does not support rewriting batched inserts.
Done.
Reviewed all commit messages.
Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @tristanvuong2021)
src/main/kotlin/org/wfanet/measurement/reporting/deploy/v2/postgres/writers/CreateMetrics.kt
line 59 at r2 (raw file):
class CreateMetrics(private val requests: List<CreateMetricRequest>) : PostgresWriter<List<Metric>>() {
  private data class WeightedMeasurementsAndStatementComponents(
This is still pretty hard to read. Is there any way to factor things out to make this more readable? I'm thinking something like creating a Postgres-specific builder for multi-valued bound statements. Usage example:
multiValueBoundStatement(paramCount = 3,
  """
  INSERT INTO Metrics (Foo, Bar, Baz) VALUES (${MultiValueBoundStatement.PARAM_LIST_PLACEHOLDER})
  """
) {
  for (item in items) {
    addBinding {
      bind(paramIndex = 0, item.foo)
      bind(paramIndex = 1, item.bar)
      bind(paramIndex = 2, item.baz)
    }
  }
}
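One way a builder with this shape could work is sketched below. This is only a rough illustration of the idea in the usage example above: every name in it (`MultiValueStatementBuilder`, `RowBinder`, the placeholder string) is hypothetical and not the common-jvm API, and it returns the SQL text plus a flat parameter list instead of calling R2DBC. It assumes PostgreSQL-style `$n` placeholders and a template that wraps the placeholder in parentheses, as in the example.

```kotlin
// Hypothetical sketch; none of these names are confirmed to exist in common-jvm.
const val PARAM_LIST_PLACEHOLDER = "{{params}}"

/** Collects the values for a single row. */
class RowBinder(private val paramCount: Int) {
  private val values = arrayOfNulls<Any>(paramCount)

  fun bind(paramIndex: Int, value: Any?) {
    require(paramIndex in 0 until paramCount) { "paramIndex out of range: $paramIndex" }
    values[paramIndex] = value
  }

  fun toList(): List<Any?> = values.toList()
}

/** Accumulates rows and expands the placeholder into one tuple per row. */
class MultiValueStatementBuilder(private val paramCount: Int, private val template: String) {
  private val rows = mutableListOf<List<Any?>>()

  fun addBinding(block: RowBinder.() -> Unit) {
    rows.add(RowBinder(paramCount).apply(block).toList())
  }

  fun build(): Pair<String, List<Any?>> {
    require(rows.isNotEmpty()) { "at least one row is required" }
    // Row i uses placeholders $(i * paramCount + 1) through $((i + 1) * paramCount).
    // The template supplies the outermost parentheses, so rows are joined with "), (".
    val paramList =
      rows.indices.joinToString("), (") { row ->
        (1..paramCount).joinToString(", ") { "$" + (row * paramCount + it) }
      }
    return template.replace(PARAM_LIST_PLACEHOLDER, paramList) to rows.flatten()
  }
}

fun multiValueBoundStatement(
  paramCount: Int,
  template: String,
  block: MultiValueStatementBuilder.() -> Unit,
): Pair<String, List<Any?>> = MultiValueStatementBuilder(paramCount, template).apply(block).build()

fun main() {
  val items = listOf(Triple(1, "a", true), Triple(2, "b", false))
  val (sql, params) =
    multiValueBoundStatement(
      paramCount = 3,
      "INSERT INTO Metrics (Foo, Bar, Baz) VALUES ($PARAM_LIST_PLACEHOLDER)",
    ) {
      for (item in items) {
        addBinding {
          bind(paramIndex = 0, item.first)
          bind(paramIndex = 1, item.second)
          bind(paramIndex = 2, item.third)
        }
      }
    }
  println(sql) // INSERT INTO Metrics (Foo, Bar, Baz) VALUES ($1, $2, $3), ($4, $5, $6)
  println(params) // [1, a, true, 2, b, false]
}
```

A real implementation would bind the flattened values onto the driver's statement object rather than returning them; that part is left out because it depends on the statement builder being wrapped.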
Reviewable status: 0 of 6 files reviewed, 1 unresolved discussion (waiting on @SanjayVas)
src/main/kotlin/org/wfanet/measurement/reporting/deploy/v2/postgres/writers/CreateMetrics.kt
line 59 at r2 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
This is still pretty hard to read. Is there any way to factor things out to make this more readable? I'm thinking something like creating a Postgres-specific builder for multi-valued bound statements. Usage example:
multiValueBoundStatement(paramCount = 3,
  """
  INSERT INTO Metrics (Foo, Bar, Baz) VALUES (${MultiValueBoundStatement.PARAM_LIST_PLACEHOLDER})
  """
) {
  for (item in items) {
    addBinding {
      bind(paramIndex = 0, item.foo)
      bind(paramIndex = 1, item.bar)
      bind(paramIndex = 2, item.baz)
    }
  }
}
Done.
Reviewed 6 of 6 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @tristanvuong2021)
MODULE.bazel
line 132 at r3 (raw file):
    repo_name = "wfa_common_jvm",
)
archive_override(
Until you have a version, add a DO_NOT_SUBMIT referencing the source PR.
bd7fe76 to 3f165a1
Reviewable status: 5 of 6 files reviewed, 1 unresolved discussion (waiting on @SanjayVas)
MODULE.bazel
line 132 at r3 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
Until you have a version, add a DO_NOT_SUBMIT referencing the source PR.
Done.
Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @tristanvuong2021)
…-large-insert-per-table
Reviewed 4 of 6 files at r3, 2 of 2 files at r4, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @tristanvuong2021)
The addBinding method executes one insert per row, so performance suffers when the number of rows is large. For 800 metrics, each with one measurement and the same reporting set, insertion takes 4.2 seconds locally. With the change in this PR, insertion takes 1.4 seconds.
The refactoring is done this way because R2DBC doesn't automatically transform the inserts into a single batch insert. See pgjdbc/r2dbc-postgresql#527.
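To make the difference concrete, here is a minimal sketch of the two statement shapes being compared. It is not the code from this PR: the helper names are hypothetical, the table and column names are borrowed from the review example above, and it only produces SQL text with PostgreSQL-style `$n` placeholders.

```kotlin
// Hypothetical helpers for illustration; not the code from this PR.

// One statement per row: each uses placeholders $1..$k and is executed separately.
fun perRowInserts(table: String, columns: List<String>, rowCount: Int): List<String> {
  val placeholders = columns.indices.joinToString(", ") { "$" + (it + 1) }
  val statement = "INSERT INTO $table (${columns.joinToString(", ")}) VALUES ($placeholders)"
  return List(rowCount) { statement }
}

// One statement for all rows: row i uses placeholders $(i * k + 1) through $((i + 1) * k).
fun multiRowInsert(table: String, columns: List<String>, rowCount: Int): String {
  require(rowCount > 0) { "rowCount must be positive" }
  val k = columns.size
  val tuples =
    (0 until rowCount).joinToString(", ") { row ->
      (1..k).joinToString(", ", prefix = "(", postfix = ")") { "$" + (row * k + it) }
    }
  return "INSERT INTO $table (${columns.joinToString(", ")}) VALUES $tuples"
}

fun main() {
  val columns = listOf("Foo", "Bar", "Baz")
  println(perRowInserts("Metrics", columns, rowCount = 800).size) // 800
  println(multiRowInsert("Metrics", columns, rowCount = 2))
  // INSERT INTO Metrics (Foo, Bar, Baz) VALUES ($1, $2, $3), ($4, $5, $6)
}
```

With 800 rows the first form yields 800 statements to execute, while the second yields a single statement carrying 2,400 bound parameters.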