Performance Optimization: Lazy Load Vertex Relations #4343

Merged 1 commit on Apr 26, 2024


ntisseyre
Contributor

  • Optimizes reads of large numbers of vertices by lazy-loading (lazily deserializing) vertex properties or edges.
    Lazy loading is not enabled by default and must be set explicitly on a TransactionBuilder.

  • Leverages the vertex's query cache for multiQuery operations instead of rebuilding the query while loading vertex relations.
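
To illustrate the general idea behind lazy deserialization, here is a minimal, self-contained sketch. It is not the JanusGraph implementation: `LazyValue`, `decode`, and the `decodeCalls` counter are hypothetical names invented for this example, and a UTF-8 string stands in for a serialized relation. The point is only that the raw bytes are kept as-is and decoded at most once, on first access.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Illustrative only: hypothetical names, not JanusGraph code. */
final class LazyLoadSketch {

    /** Holds raw bytes and decodes them at most once, on first access. Not thread-safe. */
    static final class LazyValue<V> {
        private final byte[] raw;
        private final Function<byte[], V> decoder;
        private V value;
        private boolean loaded;

        LazyValue(byte[] raw, Function<byte[], V> decoder) {
            this.raw = raw;
            this.decoder = decoder;
        }

        V get() {
            if (!loaded) {
                value = decoder.apply(raw); // deserialization happens here, not at construction
                loaded = true;
            }
            return value;
        }

        boolean isLoaded() {
            return loaded;
        }
    }

    /** Counts decoder invocations so the saved work is observable. */
    static int decodeCalls = 0;

    /** Stand-in for an expensive deserializer. */
    static String decode(byte[] bytes) {
        decodeCalls++;
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Load" 1000 relations without decoding any of them.
        List<LazyValue<String>> relations = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            byte[] raw = ("value-" + i).getBytes(StandardCharsets.UTF_8);
            relations.add(new LazyValue<>(raw, LazyLoadSketch::decode));
        }
        // Only one relation is actually read, so only one is decoded.
        System.out.println(relations.get(42).get());        // prints "value-42"
        System.out.println("decode calls: " + decodeCalls); // prints "decode calls: 1"
    }
}
```

In this sketch the saving comes entirely from the 999 values that are never read, which matches the description above: the benefit should show up on large reads where only some of the loaded relations are accessed.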

@ntisseyre ntisseyre force-pushed the lazy_relations branch 3 times, most recently from 821111c to dfbb92e Compare March 20, 2024 19:20
@janusgraph-bot janusgraph-bot added the cla: external Externally-managed CLA label Mar 25, 2024
@ntisseyre ntisseyre force-pushed the lazy_relations branch 4 times, most recently from 8a8c0d2 to 1711b2b Compare April 7, 2024 11:46
Member

@porunov porunov left a comment

Thank you @ntisseyre ! LGTM. I have left two nitpicks, but otherwise it looks good.
@JanusGraph/committers merging this PR using lazy consensus in a week.

Member

@porunov porunov left a comment

One more nitpick is the commit message. It says "Initial commit", but it would be better to use a more concrete commit description.

@ntisseyre ntisseyre force-pushed the lazy_relations branch 2 times, most recently from bb98134 to 367df51 Compare April 10, 2024 01:42
Member

@li-boxuan li-boxuan left a comment

I love the idea of lazy-loading but I am not sure if I understand when this is useful and how powerful it is. Is it possible to add a new benchmark under janusgraph-benchmark module?

Could you also please run a full build with your feature turned on by default? You could create a bogus PR to trigger CI runs. I would love to see if that can help us capture bugs.


import java.util.Iterator;

public class JanusGraphLazyProperty<V> extends JanusGraphLazyRelation<V> implements JanusGraphVertexProperty<V> {
Member

Consider: renaming this to JanusGraphLazyVertexProperty. A "property" can be attached to a vertex, an edge, or a vertex property.

Contributor Author

Renamed

* <p>
* When enabled, it can have a boost on large scale read operations, when only certain type of relations are being read.
*
* @return Object with the skip db-cache reads check settings
Member

Wrong comment?

Contributor Author

Fixed

@@ -320,6 +326,7 @@ public void close() {
config.getVertexCacheSize(), effectiveVertexCacheSize, MIN_VERTEX_CACHE_SIZE);
}

relTypeCache = new ConcurrentHashMap<>(30);
Member

If only used by JanusGraphLazyRelation, looks like there's no reason to initialize relTypeCache if this transaction doesn't have lazyLoadRelations() enabled?

Contributor Author

Fixed
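
The thread does not show how this was fixed, so the following is only a sketch of what the suggestion above could look like: allocate the cache only when the transaction has lazy loading enabled. `TxConfig`, `Tx`, `resolveTypeName`, and the `Long`/`String` key and value types are hypothetical stand-ins; only the name `relTypeCache`, the initial capacity of 30, and the `lazyLoadRelations()` check come from the discussion.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: hypothetical classes, not the code from this PR. */
final class ConditionalCacheSketch {

    /** Stand-in for a transaction configuration exposing the lazy-load flag. */
    static final class TxConfig {
        private final boolean lazyLoadRelations;

        TxConfig(boolean lazyLoadRelations) {
            this.lazyLoadRelations = lazyLoadRelations;
        }

        boolean lazyLoadRelations() {
            return lazyLoadRelations;
        }
    }

    /** Stand-in for a transaction that needs the cache only in lazy mode. */
    static final class Tx {
        // Stays null when lazy loading is off, so no map is allocated.
        private final Map<Long, String> relTypeCache;

        Tx(TxConfig config) {
            if (config.lazyLoadRelations()) {
                relTypeCache = new ConcurrentHashMap<>(30);
            } else {
                relTypeCache = null;
            }
        }

        boolean hasRelTypeCache() {
            return relTypeCache != null;
        }

        /** Uses the cache when present, otherwise resolves directly. */
        String resolveTypeName(long typeId) {
            if (relTypeCache == null) {
                return lookup(typeId);
            }
            return relTypeCache.computeIfAbsent(typeId, Tx::lookup);
        }

        /** Hypothetical lookup standing in for a schema read. */
        private static String lookup(long typeId) {
            return "type-" + typeId;
        }
    }

    public static void main(String[] args) {
        Tx eager = new Tx(new TxConfig(false));
        Tx lazy = new Tx(new TxConfig(true));
        System.out.println(eager.hasRelTypeCache()); // prints "false"
        System.out.println(lazy.hasRelTypeCache());  // prints "true"
        System.out.println(lazy.resolveTypeName(7L)); // prints "type-7"
    }
}
```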

docs/basics/transactions.md (outdated review thread, resolved)
@ntisseyre ntisseyre force-pushed the lazy_relations branch 6 times, most recently from 9709a6f to d725e10 Compare April 12, 2024 19:36
Signed-off-by: ntisseyre <ntisseyre@apple.com>
@porunov
Member

porunov commented Apr 23, 2024

Comments explained in #4367 (comment) make sense to me. @li-boxuan will you be able to check if you have any concerns or other comments regarding this PR?

Member

@li-boxuan li-boxuan left a comment

Great contribution, thank you! I would like to see some benchmark numbers or a new benchmark checked-in for perf-related PRs, but since this feature is off by default, I am happy to accept it as it is.

@ntisseyre
Contributor Author

Thank you @li-boxuan !
I want to add a benchmark test myself. I looked at the module, and I think it would be interesting to execute a test with the feature on/off and compare results. Is it possible to configure the test in this way?

@porunov
Member

porunov commented Apr 24, 2024

You just need to add a new param to your test and annotate it. The benchmark runner will detect all the annotated parameters and execute the tests with every combination of the provided values.

Here is an example of a boolean parameter (can be seen in this benchmark test):

@Param({"true", "false"})
boolean fastProperty;

You can create a similar param for lazy-loading feature.
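
To picture what "all combinations of all provided values" means in practice, here is a small plain-Java sketch that enumerates the Cartesian product of parameter values. It does not use the benchmark framework itself. The parameter names `propertyCardinalitySingle` and `verticesAmount` are taken from the report columns, while `lazyLoadRelations` is a hypothetical name for the new parameter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: enumerates parameter combinations, independent of any benchmark framework. */
final class ParamCombinationsSketch {

    /** Returns one map per combination, i.e. the Cartesian product of all value lists. */
    static List<Map<String, String>> combinations(Map<String, List<String>> params) {
        List<Map<String, String>> result = new ArrayList<>();
        result.add(new LinkedHashMap<>()); // start with a single empty combination
        for (Map.Entry<String, List<String>> param : params.entrySet()) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> partial : result) {
                for (String value : param.getValue()) {
                    Map<String, String> extended = new LinkedHashMap<>(partial);
                    extended.put(param.getKey(), value);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        params.put("propertyCardinalitySingle", Arrays.asList("true", "false"));
        params.put("verticesAmount", Arrays.asList("5000", "50000"));
        System.out.println(combinations(params).size()); // prints "4"

        // Hypothetical new on/off parameter for the lazy-loading feature.
        params.put("lazyLoadRelations", Arrays.asList("true", "false"));
        System.out.println(combinations(params).size()); // prints "8"
    }
}
```

Each benchmark method runs once per combination, so a new two-valued parameter doubles the number of runs for every method in the class that declares it.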

A quick thing you can do in the PR #4367 :

You can see the following report under that CI job:

# Run complete. Total time: 00:56:36

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                                                                                                          (fanoutFactor)  (propertyCardinalitySingle)  (verticesAmount)  Mode  Cnt      Score      Error  Units
CQLMultiQueryBenchmark.getAdjacentVerticesLocalCounts                                                                         100                          N/A               N/A  avgt    5    223.567 ±  442.426  ms/op
CQLMultiQueryBenchmark.getAdjacentVerticesLocalCounts                                                                         500                          N/A               N/A  avgt    5   8459.878 ±  647.477  ms/op
CQLMultiQueryBenchmark.getAllElementsTraversedFromOuterVertex                                                                 100                          N/A               N/A  avgt    5    157.624 ±   37.750  ms/op
CQLMultiQueryBenchmark.getAllElementsTraversedFromOuterVertex                                                                 500                          N/A               N/A  avgt    5   8190.692 ±  610.894  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingEmitRepeatSteps                                                                    100                          N/A               N/A  avgt    5    317.631 ±   92.030  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingEmitRepeatSteps                                                                    500                          N/A               N/A  avgt    5  16468.902 ±  426.340  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingRepeatUntilSteps                                                                   100                          N/A               N/A  avgt    5    180.623 ±   48.971  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingRepeatUntilSteps                                                                   500                          N/A               N/A  avgt    5   8827.256 ±  275.217  ms/op
CQLMultiQueryBenchmark.getIdToOutVerticesProjection                                                                           100                          N/A               N/A  avgt    5     11.525 ±    4.256  ms/op
CQLMultiQueryBenchmark.getIdToOutVerticesProjection                                                                           500                          N/A               N/A  avgt    5    240.265 ±   30.486  ms/op
CQLMultiQueryBenchmark.getLabels                                                                                              100                          N/A               N/A  avgt    5    128.353 ±   74.876  ms/op
CQLMultiQueryBenchmark.getLabels                                                                                              500                          N/A               N/A  avgt    5   6946.586 ±  354.581  ms/op
CQLMultiQueryBenchmark.getNames                                                                                               100                          N/A               N/A  avgt    5    158.185 ±   78.300  ms/op
CQLMultiQueryBenchmark.getNames                                                                                               500                          N/A               N/A  avgt    5   8266.414 ±  224.934  ms/op
CQLMultiQueryBenchmark.getNeighborNames                                                                                       100                          N/A               N/A  avgt    5    165.204 ±   49.715  ms/op
CQLMultiQueryBenchmark.getNeighborNames                                                                                       500                          N/A               N/A  avgt    5   7983.727 ±  240.637  ms/op
CQLMultiQueryBenchmark.getVerticesFilteredByAndStep                                                                           100                          N/A               N/A  avgt    5     19.570 ±    4.763  ms/op
CQLMultiQueryBenchmark.getVerticesFilteredByAndStep                                                                           500                          N/A               N/A  avgt    5    416.075 ±   35.813  ms/op
CQLMultiQueryBenchmark.getVerticesFromMultiNestedRepeatStepStartingFromSingleVertex                                           100                          N/A               N/A  avgt    5    317.255 ±   56.425  ms/op
CQLMultiQueryBenchmark.getVerticesFromMultiNestedRepeatStepStartingFromSingleVertex                                           500                          N/A               N/A  avgt    5  11904.330 ±  676.284  ms/op
CQLMultiQueryBenchmark.getVerticesWithCoalesceUsage                                                                           100                          N/A               N/A  avgt    5     17.015 ±    5.919  ms/op
CQLMultiQueryBenchmark.getVerticesWithCoalesceUsage                                                                           500                          N/A               N/A  avgt    5    350.154 ±   55.172  ms/op
CQLMultiQueryBenchmark.getVerticesWithDoubleUnion                                                                             100                          N/A               N/A  avgt    5     18.984 ±    5.438  ms/op
CQLMultiQueryBenchmark.getVerticesWithDoubleUnion                                                                             500                          N/A               N/A  avgt    5    360.418 ±   33.777  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                         true              5000  avgt    5    182.429 ±   44.259  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                         true             50000  avgt    5   2340.425 ±  206.067  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                        false              5000  avgt    5    180.499 ±   40.487  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                        false             50000  avgt    5   2287.420 ±  106.430  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                         true              5000  avgt    5    150.357 ±   45.516  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                         true             50000  avgt    5   1906.077 ±  268.831  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                        false              5000  avgt    5    144.797 ±   31.274  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                        false             50000  avgt    5   1869.584 ±  218.428  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                         true              5000  avgt    5    257.977 ±   83.060  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                         true             50000  avgt    5   3119.575 ±  248.486  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                        false              5000  avgt    5    649.581 ±  117.665  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                        false             50000  avgt    5   9750.344 ±  231.916  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                         true              5000  avgt    5    508.997 ±  200.694  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                         true             50000  avgt    5   5529.369 ±  222.520  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                        false              5000  avgt    5    891.095 ±  142.167  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                        false             50000  avgt    5  13073.927 ± 1140.664  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                         true              5000  avgt    5    211.978 ±   60.261  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                         true             50000  avgt    5   3087.432 ±  276.040  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                        false              5000  avgt    5    646.969 ±  166.361  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                        false             50000  avgt    5   9834.445 ±  483.247  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                         true              5000  avgt    5    149.255 ±   81.601  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                         true             50000  avgt    5   1826.016 ±  230.274  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                        false              5000  avgt    5    233.800 ±   57.818  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                        false             50000  avgt    5   3350.430 ±  332.899  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                         true              5000  avgt    5   1709.161 ±   98.722  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                         true             50000  avgt    5  18500.760 ±  382.153  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                        false              5000  avgt    5   2812.449 ±   73.314  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                        false             50000  avgt    5  32993.734 ±  888.912  ms/op

Some things to notice:

  • GitHub Actions often assigns jobs to different pods with different resources, so a comparison between the previous report and the new report may not be accurate. For this reason, it's recommended to run the benchmark locally, on your own server or laptop, where the available resources are the same between runs.
  • Adding a single parameter with 2 states multiplies the total number of test runs by 2. Thus, I don't recommend doing so in this PR, because it would double the benchmark workload. Instead, it's better to add a new test class (similar to the existing tests) with that parameter and with test cases written specifically to evaluate this feature.
  • Previously, to verify that new optimizations or features didn't introduce any regressions, I would run the benchmark on the latest commit of the master branch, then switch to the feature branch and run the tests again. I did this on a local laptop with no other heavy processes running. After that, I would post the two benchmark reports in comments under the relevant PR, or create a GitHub Gist with all the reports. This made it possible to verify that the feature doesn't bring any regressions to the existing benchmarks. Again, the GitHub Actions job does the same, but it's not accurate because the available resources differ between benchmark runs.
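
When comparing two such reports by hand, one rough heuristic is to check whether the `Score ± Error` intervals of the same benchmark overlap. The sketch below assumes the report layout shown above (a line ending in `Score ± Error Units`); the class and method names are hypothetical, and interval overlap is only a crude screen, not a statistical test. The sample lines are rows from the report above, used purely as parser input: they differ in `propertyCardinalitySingle`, and they are not a master-versus-feature-branch comparison.

```java
/** Illustrative only: a crude interval-overlap check for two report lines. */
final class BenchmarkCompareSketch {

    /** One parsed report row: benchmark name, score, and error margin. */
    static final class Result {
        final String name;
        final double score;
        final double error;

        Result(String name, double score, double error) {
            this.name = name;
            this.score = score;
            this.error = error;
        }

        double low() {
            return score - error;
        }

        double high() {
            return score + error;
        }
    }

    /** Parses a row, assuming it ends with: Score, the plus-minus sign, Error, Units. */
    static Result parse(String line) {
        String[] t = line.trim().split("\\s+");
        double error = Double.parseDouble(t[t.length - 2]);
        double score = Double.parseDouble(t[t.length - 4]);
        return new Result(t[0], score, error);
    }

    /** For time-per-operation scores, lower is better. */
    static String compare(Result baseline, Result candidate) {
        if (candidate.low() > baseline.high()) {
            return "slower";
        }
        if (candidate.high() < baseline.low()) {
            return "faster";
        }
        return "inconclusive"; // intervals overlap, so the difference may be noise
    }

    public static void main(String[] args) {
        String prefix = "CQLMultiQueryMultiSlicesBenchmark.";
        Result a = parse(prefix + "getValuesMultiplePropertiesWithUnlimitedBatch N/A false 5000 avgt 5 646.969 \u00b1 166.361 ms/op");
        Result b = parse(prefix + "getValuesMultiplePropertiesWithUnlimitedBatch N/A true 5000 avgt 5 211.978 \u00b1 60.261 ms/op");
        System.out.println(compare(a, b)); // prints "faster"

        Result c = parse(prefix + "getValuesAllPropertiesWithUnlimitedBatch N/A false 5000 avgt 5 144.797 \u00b1 31.274 ms/op");
        Result d = parse(prefix + "getValuesAllPropertiesWithUnlimitedBatch N/A true 5000 avgt 5 150.357 \u00b1 45.516 ms/op");
        System.out.println(compare(c, d)); // prints "inconclusive"
    }
}
```

Rows where the error is larger than the score itself, such as `getAdjacentVerticesLocalCounts` at 223.567 ± 442.426 ms/op, will almost always come out as inconclusive under this check.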

For reference:
JanusGraph Benchmark CI: https://github.com/JanusGraph/janusgraph/blob/master/.github/workflows/ci-benchmark.yml
Benchmark main method (as you can see, some tests are skipped in CI by default because they are too heavy and take too long to run; however, they can be executed by passing the relevant command-line arguments):

public static void main(String[] args) throws RunnerException, IOException, InterruptedException {

@ntisseyre
Contributor Author

Great contribution, thank you! I would like to see some benchmark numbers or a new benchmark checked-in for perf-related PRs, but since this feature is off by default, I am happy to accept it as it is.

Thank you @li-boxuan ! I want to add a benchmark test myself. I looked at the module, and I think it would be interesting to execute a test with the feature on/off and compare results. Is it possible to configure the test in this way?

You should just add a new param into you test and annotate it. The benchmark test will detect all the annotated parameters and will execute tests with all combinations of all provided values.

Here is an example of a boolean parameter (can be seen in this benchmark test):

@Param({"true", "false"})
boolean fastProperty;

You can create a similar param for lazy-loading feature.

A quick thing you can do in the PR #4367 :

You can see the following report under that CI job:

# Run complete. Total time: 00:56:36

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                                                                                                          (fanoutFactor)  (propertyCardinalitySingle)  (verticesAmount)  Mode  Cnt      Score      Error  Units
CQLMultiQueryBenchmark.getAdjacentVerticesLocalCounts                                                                         100                          N/A               N/A  avgt    5    223.567 ±  442.426  ms/op
CQLMultiQueryBenchmark.getAdjacentVerticesLocalCounts                                                                         500                          N/A               N/A  avgt    5   8459.878 ±  647.477  ms/op
CQLMultiQueryBenchmark.getAllElementsTraversedFromOuterVertex                                                                 100                          N/A               N/A  avgt    5    157.624 ±   37.750  ms/op
CQLMultiQueryBenchmark.getAllElementsTraversedFromOuterVertex                                                                 500                          N/A               N/A  avgt    5   8190.692 ±  610.894  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingEmitRepeatSteps                                                                    100                          N/A               N/A  avgt    5    317.631 ±   92.030  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingEmitRepeatSteps                                                                    500                          N/A               N/A  avgt    5  16468.902 ±  426.340  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingRepeatUntilSteps                                                                   100                          N/A               N/A  avgt    5    180.623 ±   48.971  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingRepeatUntilSteps                                                                   500                          N/A               N/A  avgt    5   8827.256 ±  275.217  ms/op
CQLMultiQueryBenchmark.getIdToOutVerticesProjection                                                                           100                          N/A               N/A  avgt    5     11.525 ±    4.256  ms/op
CQLMultiQueryBenchmark.getIdToOutVerticesProjection                                                                           500                          N/A               N/A  avgt    5    240.265 ±   30.486  ms/op
CQLMultiQueryBenchmark.getLabels                                                                                              100                          N/A               N/A  avgt    5    128.353 ±   74.876  ms/op
CQLMultiQueryBenchmark.getLabels                                                                                              500                          N/A               N/A  avgt    5   6946.586 ±  354.581  ms/op
CQLMultiQueryBenchmark.getNames                                                                                               100                          N/A               N/A  avgt    5    158.185 ±   78.300  ms/op
CQLMultiQueryBenchmark.getNames                                                                                               500                          N/A               N/A  avgt    5   8266.414 ±  224.934  ms/op
CQLMultiQueryBenchmark.getNeighborNames                                                                                       100                          N/A               N/A  avgt    5    165.204 ±   49.715  ms/op
CQLMultiQueryBenchmark.getNeighborNames                                                                                       500                          N/A               N/A  avgt    5   7983.727 ±  240.637  ms/op
CQLMultiQueryBenchmark.getVerticesFilteredByAndStep                                                                           100                          N/A               N/A  avgt    5     19.570 ±    4.763  ms/op
CQLMultiQueryBenchmark.getVerticesFilteredByAndStep                                                                           500                          N/A               N/A  avgt    5    416.075 ±   35.813  ms/op
CQLMultiQueryBenchmark.getVerticesFromMultiNestedRepeatStepStartingFromSingleVertex                                           100                          N/A               N/A  avgt    5    317.255 ±   56.425  ms/op
CQLMultiQueryBenchmark.getVerticesFromMultiNestedRepeatStepStartingFromSingleVertex                                           500                          N/A               N/A  avgt    5  11904.330 ±  676.284  ms/op
CQLMultiQueryBenchmark.getVerticesWithCoalesceUsage                                                                           100                          N/A               N/A  avgt    5     17.015 ±    5.919  ms/op
CQLMultiQueryBenchmark.getVerticesWithCoalesceUsage                                                                           500                          N/A               N/A  avgt    5    350.154 ±   55.172  ms/op
CQLMultiQueryBenchmark.getVerticesWithDoubleUnion                                                                             100                          N/A               N/A  avgt    5     18.984 ±    5.438  ms/op
CQLMultiQueryBenchmark.getVerticesWithDoubleUnion                                                                             500                          N/A               N/A  avgt    5    360.418 ±   33.777  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                         true              5000  avgt    5    182.429 ±   44.259  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                         true             50000  avgt    5   2340.425 ±  206.067  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                        false              5000  avgt    5    180.499 ±   40.487  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                        false             50000  avgt    5   2287.420 ±  106.430  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                         true              5000  avgt    5    150.357 ±   45.516  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                         true             50000  avgt    5   1906.077 ±  268.831  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                        false              5000  avgt    5    144.797 ±   31.274  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                        false             50000  avgt    5   1869.584 ±  218.428  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                         true              5000  avgt    5    257.977 ±   83.060  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                         true             50000  avgt    5   3119.575 ±  248.486  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                        false              5000  avgt    5    649.581 ±  117.665  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                        false             50000  avgt    5   9750.344 ±  231.916  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                         true              5000  avgt    5    508.997 ±  200.694  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                         true             50000  avgt    5   5529.369 ±  222.520  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                        false              5000  avgt    5    891.095 ±  142.167  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                        false             50000  avgt    5  13073.927 ± 1140.664  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                         true              5000  avgt    5    211.978 ±   60.261  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                         true             50000  avgt    5   3087.432 ±  276.040  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                        false              5000  avgt    5    646.969 ±  166.361  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                        false             50000  avgt    5   9834.445 ±  483.247  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                         true              5000  avgt    5    149.255 ±   81.601  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                         true             50000  avgt    5   1826.016 ±  230.274  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                        false              5000  avgt    5    233.800 ±   57.818  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                        false             50000  avgt    5   3350.430 ±  332.899  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                         true              5000  avgt    5   1709.161 ±   98.722  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                         true             50000  avgt    5  18500.760 ±  382.153  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                        false              5000  avgt    5   2812.449 ±   73.314  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                        false             50000  avgt    5  32993.734 ±  888.912  ms/op

Some things to notice:

  • GitHub Actions often assigns jobs to different pods with different resources, so a comparison between the previous report and the new report may not be accurate. For this reason, it's recommended to run the benchmark locally on your own server or laptop, where the available resources are the same between runs.
  • Adding a single parameter with 2 states multiplies the number of test runs by 2. Thus, I don't recommend doing so in this PR, because it would double the amount of benchmark execution. Instead, it's better to add a new test class (similar to the existing tests) with that parameter and with test cases written specifically to evaluate this feature.
  • Previously, to verify that new optimizations / features didn't introduce any regressions, I simply ran the benchmark on the latest commit from the master branch, then switched to the feature branch and ran the tests again. I did this on a local laptop with no other heavy processes active. After that, I uploaded the two benchmark reports in comments under the relevant PR or created a GitHub Gist with all the reports. This made it possible to verify that the feature doesn't bring any regressions to existing benchmarks. Again, the GitHub Actions job does the same, but it's not accurate because the available resources differ between benchmark runs.
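
To make the "multiplied by 2" point concrete, here is a minimal sketch in plain Java (no JMH dependency) that counts how many parameter combinations a benchmark method is run with. The class and method names (`ParamCombinations`, `countRuns`) are hypothetical; the parameter names and values are taken from the report above, and `isLazyLoad` stands in for the proposed new parameter.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JanusGraph or JMH.
final class ParamCombinations {

    // Each benchmark method runs once per combination of parameter values,
    // so the number of runs is the product of the value counts.
    static long countRuns(Map<String, List<String>> params) {
        long runs = 1;
        for (List<String> values : params.values()) {
            runs *= values.size();
        }
        return runs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        params.put("propertyCardinalitySingle", List.of("true", "false"));
        params.put("verticesAmount", List.of("5000", "50000"));
        System.out.println(countRuns(params)); // prints 4

        // Adding one more two-state parameter doubles the number of runs.
        params.put("isLazyLoad", List.of("true", "false"));
        System.out.println(countRuns(params)); // prints 8
    }
}
```

This is why a separate benchmark class with its own small parameter set is cheaper than adding the parameter to every existing benchmark.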

For reference:
JanusGraph Benchmark CI: https://github.com/JanusGraph/janusgraph/blob/master/.github/workflows/ci-benchmark.yml
Benchmark main method (as you can see, some tests are skipped in the CI by default because they are too heavy and take too long to execute; however, it's possible to run them by passing the relevant command-line arguments):

public static void main(String[] args) throws RunnerException, IOException, InterruptedException {

Thank you @porunov !
I have added a benchmarking test here 60c7698#diff-8ce3c56c221176d06101a8cd771e4a0781620b696407b27d6a97a0405442a603R37
cc @li-boxuan

@porunov
Member

porunov commented Apr 25, 2024

It was not triggered due to failed checkstyle:

Error:  /home/runner/work/janusgraph/janusgraph/janusgraph-benchmark/src/main/java/org/janusgraph/LazyLoadBenchmark.java:19:27: Using the '.*' form of import should be avoided - org.janusgraph.core.*. [AvoidStarImport]
Error:  /home/runner/work/janusgraph/janusgraph/janusgraph-benchmark/src/main/java/org/janusgraph/LazyLoadBenchmark.java:24:35: Using the '.*' form of import should be avoided - org.openjdk.jmh.annotations.*. [AvoidStarImport]

@ntisseyre
Contributor Author

LazyLoadBenchmark

Oh, thanks! Fixed

@porunov
Member

porunov commented Apr 25, 2024

Hmm. Seems OOM.

# Benchmark: org.janusgraph.LazyLoadBenchmark.getProperties
# Parameters: (isLazyLoad = false, verticesAmount = 5000)
Iteration   1: 173.570 ms/op
Iteration   2: 168.559 ms/op
Iteration   3: 169.724 ms/op
Iteration   4: 167.155 ms/op
Iteration   5: 169.652 ms/op
# Parameters: (isLazyLoad = false, verticesAmount = 100000)
java.lang.OutOfMemoryError: GC overhead limit exceeded

So it seems that no benchmark test with isLazyLoad = true was ever executed. You could reduce verticesAmount or run the test locally. That said, it's strange that the OOM happened with only 100k vertices.

@ntisseyre
Contributor Author

I have reduced it to only 5k vertices and set isLazyLoad=true to be executed first.
When I ran it locally, it executed about 2 times faster than with isLazyLoad=false.
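
As a rough illustration of why deferring deserialization can pay off when only part of the loaded data is read, here is a minimal sketch in plain Java. It is not the JanusGraph implementation: `LazyValueDemo`, `LazyValue`, and `decodeCallsWhenReading` are hypothetical names, and a UTF-8 string decode stands in for real relation deserialization.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration only; JanusGraph's actual classes differ.
final class LazyValueDemo {

    /** Keeps the raw bytes and decodes them on first access (not thread-safe). */
    static final class LazyValue<T> {
        private final byte[] raw;
        private final Function<byte[], T> decoder;
        private T value;
        private boolean decoded;

        LazyValue(byte[] raw, Function<byte[], T> decoder) {
            this.raw = raw;
            this.decoder = decoder;
        }

        T get() {
            if (!decoded) {
                value = decoder.apply(raw); // pay the decoding cost only once
                decoded = true;
            }
            return value;
        }
    }

    /** Stores `stored` values, reads the first `read` of them twice, returns the decode count. */
    static int decodeCallsWhenReading(int stored, int read) {
        AtomicInteger decodeCalls = new AtomicInteger();
        List<LazyValue<String>> values = new ArrayList<>();
        for (int i = 0; i < stored; i++) {
            byte[] raw = ("value-" + i).getBytes(StandardCharsets.UTF_8);
            values.add(new LazyValue<>(raw, bytes -> {
                decodeCalls.incrementAndGet();
                return new String(bytes, StandardCharsets.UTF_8);
            }));
        }
        for (int i = 0; i < read; i++) {
            values.get(i).get();
            values.get(i).get(); // second access reuses the cached value
        }
        return decodeCalls.get();
    }

    public static void main(String[] args) {
        // 1000 values are held, but only the 10 that are read get decoded.
        System.out.println(decodeCallsWhenReading(1000, 10)); // prints 10
    }
}
```

An eager loader would decode all 1000 values up front, so the saving depends on how much of the loaded data a query actually touches.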

@porunov
Member

porunov commented Apr 26, 2024

Below are the tests I executed on my local laptop with minimal parallel processes running.

Master branch basic benchmark tests:

Benchmark                                   (hardMaxLimit)  (numberOfVertices)  (size)  (useSmartLimit)  Mode  Cnt     Score      Error  Units
GraphCentricQueryBenchmark.getVertices              100000                 N/A   10000             true  avgt    5     7.279 ±    1.083  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A   10000            false  avgt    5     5.077 ±    0.143  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A  250000             true  avgt    5   245.352 ±   36.570  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A  250000            false  avgt    5   255.863 ±   23.198  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A   10000             true  avgt    5     7.259 ±    0.359  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A   10000            false  avgt    5     5.248 ±    0.223  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A  250000             true  avgt    5   230.278 ±   20.577  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A  250000            false  avgt    5   143.663 ±   13.211  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A                1000     N/A              N/A  avgt    5   388.187 ±  483.345  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A               10000     N/A              N/A  avgt    5  1387.198 ± 1485.462  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A              100000     N/A              N/A  avgt    5  7293.442 ± 4041.513  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A                1000     N/A              N/A  avgt    5     0.949 ±    0.095  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A               10000     N/A              N/A  avgt    5    12.261 ±    0.404  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A              100000     N/A              N/A  avgt    5   195.550 ±    8.741  ms/op
MgmtOlapJobBenchmark.runClearIndex                     N/A                 N/A   10000              N/A  avgt    5   214.964 ±    1.516  ms/op
MgmtOlapJobBenchmark.runReindex                        N/A                 N/A   10000              N/A  avgt    5   255.134 ±    1.426  ms/op

Current PR basic benchmark tests:

Benchmark                                   (hardMaxLimit)  (numberOfVertices)  (size)  (useSmartLimit)  Mode  Cnt     Score      Error  Units
GraphCentricQueryBenchmark.getVertices              100000                 N/A   10000             true  avgt    5     7.510 ±    0.443  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A   10000            false  avgt    5     5.428 ±    0.280  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A  250000             true  avgt    5   221.762 ±   20.090  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A  250000            false  avgt    5   266.333 ±   11.015  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A   10000             true  avgt    5     7.496 ±    0.283  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A   10000            false  avgt    5     5.147 ±    0.298  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A  250000             true  avgt    5   231.363 ±    9.417  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A  250000            false  avgt    5   150.624 ±    8.143  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A                1000     N/A              N/A  avgt    5   381.277 ±  488.346  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A               10000     N/A              N/A  avgt    5  1363.017 ± 1539.981  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A              100000     N/A              N/A  avgt    5  7071.629 ± 2560.979  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A                1000     N/A              N/A  avgt    5     0.966 ±    0.106  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A               10000     N/A              N/A  avgt    5    11.893 ±    0.592  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A              100000     N/A              N/A  avgt    5   208.851 ±   69.922  ms/op
MgmtOlapJobBenchmark.runClearIndex                     N/A                 N/A   10000              N/A  avgt    5   215.058 ±    1.976  ms/op
MgmtOlapJobBenchmark.runReindex                        N/A                 N/A   10000              N/A  avgt    5   254.814 ±    2.692  ms/op

The benchmark test introduced in #4367:

Benchmark                        (isLazyLoad)  (verticesAmount)  Mode  Cnt    Score   Error  Units
LazyLoadBenchmark.getProperties          true              5000  avgt    5   75.577 ± 4.859  ms/op
LazyLoadBenchmark.getProperties         false              5000  avgt    5  156.141 ± 7.985  ms/op

Conclusion:
With the default transaction behavior, performance on this PR and on the latest commit in the master branch is the same. No regression was detected.
The added benchmark test LazyLoadBenchmark shows a use case where performance improved about 2 times with lazy loading enabled in transactions.
I would prefer adding the LazyLoadBenchmark test to the main codebase as well, but I think we can do it later via a separate PR. Merging this PR. Thank you @ntisseyre for this awesome optimization!
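
For readers who want to repeat this kind of comparison, here is a minimal sketch in plain Java that treats each reported `Score ± Error` as an interval and checks whether two intervals overlap, then computes the speedup ratio. The class and method names are hypothetical, the numbers are copied from the reports above, and overlapping intervals are only a rough sanity check, not a rigorous statistical test.

```java
import java.util.Locale;

// Hypothetical helper for eyeballing JMH reports; not part of JanusGraph or JMH.
final class BenchmarkCompare {

    /** One report row: average score and reported error, both in ms/op. */
    static final class Result {
        final double score;
        final double error;

        Result(double score, double error) {
            this.score = score;
            this.error = error;
        }

        double low() { return score - error; }
        double high() { return score + error; }
    }

    /** True when the two score ± error intervals share at least one point. */
    static boolean intervalsOverlap(Result a, Result b) {
        return a.low() <= b.high() && b.low() <= a.high();
    }

    /** How many times faster the candidate is than the baseline (lower ms/op is better). */
    static double speedup(Result baseline, Result candidate) {
        return baseline.score / candidate.score;
    }

    public static void main(String[] args) {
        // MgmtOlapJobBenchmark.runReindex: master vs. this PR.
        Result masterReindex = new Result(255.134, 1.426);
        Result prReindex = new Result(254.814, 2.692);
        System.out.println(intervalsOverlap(masterReindex, prReindex)); // prints true

        // LazyLoadBenchmark.getProperties: isLazyLoad=false vs. isLazyLoad=true.
        Result lazyOff = new Result(156.141, 7.985);
        Result lazyOn = new Result(75.577, 4.859);
        System.out.println(intervalsOverlap(lazyOff, lazyOn)); // prints false
        System.out.println(String.format(Locale.ROOT, "%.2fx", speedup(lazyOff, lazyOn))); // prints 2.07x
    }
}
```

Overlapping intervals for the existing benchmarks are consistent with "no regression detected", while the non-overlapping LazyLoadBenchmark rows are consistent with the roughly 2x improvement noted in the conclusion.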

@porunov porunov merged commit eed8756 into JanusGraph:master Apr 26, 2024
174 checks passed
@porunov porunov added this to the Release v1.1.0 milestone Apr 26, 2024
Labels
cla: external Externally-managed CLA kind/performance