Performance Optimization: Lazy Load Vertex Relations #4343

ntisseyre · 2024-03-20T19:09:43Z

Optimizes reads of large numbers of vertices by lazy-loading (lazily deserializing) vertex properties or edges.
Lazy-load is not enabled by default and must be set explicitly on a TransactionBuilder.
Leverages the vertex's query-cache for multiQuery operations instead of rebuilding the query while loading vertex relations.
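
To illustrate the idea of lazy deserialization described above, here is a minimal, self-contained sketch. It is not the code from this PR: `LazyRelationSketch`, its raw byte payload, and the UTF-8 decoding step are hypothetical stand-ins for JanusGraph's relation parsing. It only shows that the decode runs on first access, so a relation that is never read is never deserialized.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of lazy deserialization; not JanusGraph's actual classes.
final class LazyRelationSketch {
    // Counts decode calls so the laziness is observable.
    static final AtomicInteger DECODE_CALLS = new AtomicInteger();

    private final byte[] raw;   // serialized form as read from storage
    private String value;       // decoded form, filled on first access
    private boolean loaded;

    LazyRelationSketch(byte[] raw) {
        this.raw = raw;
    }

    // Decodes the payload on the first call and reuses the result afterwards.
    synchronized String value() {
        if (!loaded) {
            DECODE_CALLS.incrementAndGet();
            value = new String(raw, StandardCharsets.UTF_8); // stand-in for real parsing
            loaded = true;
        }
        return value;
    }
}
```

Constructing many such objects costs little beyond holding the bytes; only the ones whose `value()` is called pay for decoding, which is where a read that touches only certain relation types could save work.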

porunov

Thank you @ntisseyre ! LGTM. I have left two nitpicks, but otherwise it looks good.
@JanusGraph/committers merging this PR using lazy consensus in a week.

janusgraph-core/src/main/java/org/janusgraph/core/JanusGraphLazyRelation.java

janusgraph-core/src/main/java/org/janusgraph/graphdb/transaction/TransactionConfiguration.java

porunov

One more nitpick is the commit message. It says "Initial commit", but it would be better to use a more concrete commit description.

li-boxuan

I love the idea of lazy-loading but I am not sure if I understand when this is useful and how powerful it is. Is it possible to add a new benchmark under janusgraph-benchmark module?

Could you also please run a full build with your feature turned on by default? You could create a bogus PR to trigger CI runs. I would love to see if that can help us capture bugs.

li-boxuan · 2024-04-10T04:44:44Z

janusgraph-core/src/main/java/org/janusgraph/core/JanusGraphLazyProperty.java

+
+import java.util.Iterator;
+
+public class JanusGraphLazyProperty<V> extends JanusGraphLazyRelation<V> implements JanusGraphVertexProperty<V> {


Consider: renaming this to JanusGraphLazyVertexProperty. A "property" can be attached to a vertex, an edge, or a vertex property.

li-boxuan · 2024-04-10T04:59:50Z

janusgraph-core/src/main/java/org/janusgraph/core/TransactionBuilder.java

+     * <p>
+     * When enabled, it can have a boost on large scale read operations, when only certain type of relations are being read.
+     *
+     * @return Object with the skip db-cache reads check settings


Wrong comment?

li-boxuan · 2024-04-10T05:04:12Z

janusgraph-core/src/main/java/org/janusgraph/graphdb/transaction/StandardJanusGraphTx.java

@@ -320,6 +326,7 @@ public void close() {
                    config.getVertexCacheSize(), effectiveVertexCacheSize, MIN_VERTEX_CACHE_SIZE);
        }

+        relTypeCache = new ConcurrentHashMap<>(30);


If only used by JanusGraphLazyRelation, looks like there's no reason to initialize relTypeCache if this transaction doesn't have lazyLoadRelations() enabled?
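
The suggestion above could look roughly like the following sketch. `TxSketch`, its constructor flag, and the `Long`/`String` map types are hypothetical; only the `relTypeCache` name, the `ConcurrentHashMap<>(30)` initialization, and the lazy-load setting come from the diff and the comment. The real transaction class may need a different guard.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a transaction; not StandardJanusGraphTx itself.
final class TxSketch {
    // Key and value types are assumptions made for this sketch only.
    private final Map<Long, String> relTypeCache;

    TxSketch(boolean lazyLoadRelations) {
        // Allocate the cache only when the lazy-load feature can actually use it.
        if (lazyLoadRelations) {
            this.relTypeCache = new ConcurrentHashMap<>(30);
        } else {
            this.relTypeCache = null;
        }
    }

    boolean hasRelTypeCache() {
        return relTypeCache != null;
    }

    // Caches a type name by id; only valid when lazy loading is enabled.
    String cacheType(long typeId, String typeName) {
        if (relTypeCache == null) {
            throw new IllegalStateException("lazy loading is disabled for this transaction");
        }
        return relTypeCache.computeIfAbsent(typeId, id -> typeName);
    }
}
```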

docs/basics/transactions.md

janusgraph-core/src/main/java/org/janusgraph/core/JanusGraphLazyRelation.java

Signed-off-by: ntisseyre <ntisseyre@apple.com>

porunov · 2024-04-23T14:45:41Z

Comments explained in #4367 (comment) make sense to me. @li-boxuan will you be able to check if you have any concerns or other comments regarding this PR?

li-boxuan

Great contribution, thank you! I would like to see some benchmark numbers or a new benchmark checked-in for perf-related PRs, but since this feature is off by default, I am happy to accept it as it is.

ntisseyre · 2024-04-24T16:13:09Z

Thank you @li-boxuan !
I want to add a benchmark test myself. I looked at the module, and I think it would be interesting to execute a test with the feature on/off and compare results. Is it possible to configure the test in this way?

porunov · 2024-04-24T20:52:49Z

You should just add a new param to your test and annotate it. The benchmark run will detect all annotated parameters and execute the tests with all combinations of the provided values.

Here is an example of a boolean parameter (can be seen in this benchmark test):

@Param({"true", "false"})
boolean fastProperty;

You can create a similar param for the lazy-loading feature.
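
To make "all combinations of all provided values" concrete, here is a small sketch that enumerates parameter combinations with plain Java. It does not use JMH: `ParamCombinations` is a hypothetical helper, and `lazyLoadRelations` is an assumed name for the new parameter. The other parameter names and values are taken from the report columns in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that mimics how parameter values combine; it does not use JMH.
final class ParamCombinations {
    // Returns every combination of the given parameter values (Cartesian product).
    static List<Map<String, String>> expand(Map<String, List<String>> params) {
        List<Map<String, String>> combos = new ArrayList<>();
        combos.add(new LinkedHashMap<>());
        for (Map.Entry<String, List<String>> param : params.entrySet()) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> partial : combos) {
                for (String value : param.getValue()) {
                    Map<String, String> extended = new LinkedHashMap<>(partial);
                    extended.put(param.getKey(), value);
                    next.add(extended);
                }
            }
            combos = next;
        }
        return combos;
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        params.put("propertyCardinalitySingle", Arrays.asList("true", "false"));
        params.put("verticesAmount", Arrays.asList("5000", "50000"));
        System.out.println(expand(params).size()); // 4 combinations

        // "lazyLoadRelations" is a hypothetical parameter name for the new feature.
        params.put("lazyLoadRelations", Arrays.asList("true", "false"));
        System.out.println(expand(params).size()); // 8 combinations
    }
}
```

Each extra two-valued parameter doubles the number of runs per benchmark method, which is worth keeping in mind when deciding where to add it.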

A quick thing you can do in PR #4367:

Add the above parameter to all existing JanusGraph benchmark tests ( https://github.com/JanusGraph/janusgraph/tree/master/janusgraph-benchmark/src/main/java/org/janusgraph )
Each test has a getConfiguration method where you can use the parameter values when creating the JanusGraph configuration. Just set lazy loading to true or false there.
Push the changes to the testing branch and the benchmark process will start (GitHub Actions CI). After that, the report should be available at the end of the step called Run mvn verify --projects janusgraph-benchmark under the Performance regression check job. Here is an example of the performance regression check of your branch (i.e. this PR): https://github.com/ntisseyre/janusgraph/actions/runs/8681353435/job/23803868577

You can see the following report under that CI job:

# Run complete. Total time: 00:56:36

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                                                                                                          (fanoutFactor)  (propertyCardinalitySingle)  (verticesAmount)  Mode  Cnt      Score      Error  Units
CQLMultiQueryBenchmark.getAdjacentVerticesLocalCounts                                                                         100                          N/A               N/A  avgt    5    223.567 ±  442.426  ms/op
CQLMultiQueryBenchmark.getAdjacentVerticesLocalCounts                                                                         500                          N/A               N/A  avgt    5   8459.878 ±  647.477  ms/op
CQLMultiQueryBenchmark.getAllElementsTraversedFromOuterVertex                                                                 100                          N/A               N/A  avgt    5    157.624 ±   37.750  ms/op
CQLMultiQueryBenchmark.getAllElementsTraversedFromOuterVertex                                                                 500                          N/A               N/A  avgt    5   8190.692 ±  610.894  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingEmitRepeatSteps                                                                    100                          N/A               N/A  avgt    5    317.631 ±   92.030  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingEmitRepeatSteps                                                                    500                          N/A               N/A  avgt    5  16468.902 ±  426.340  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingRepeatUntilSteps                                                                   100                          N/A               N/A  avgt    5    180.623 ±   48.971  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingRepeatUntilSteps                                                                   500                          N/A               N/A  avgt    5   8827.256 ±  275.217  ms/op
CQLMultiQueryBenchmark.getIdToOutVerticesProjection                                                                           100                          N/A               N/A  avgt    5     11.525 ±    4.256  ms/op
CQLMultiQueryBenchmark.getIdToOutVerticesProjection                                                                           500                          N/A               N/A  avgt    5    240.265 ±   30.486  ms/op
CQLMultiQueryBenchmark.getLabels                                                                                              100                          N/A               N/A  avgt    5    128.353 ±   74.876  ms/op
CQLMultiQueryBenchmark.getLabels                                                                                              500                          N/A               N/A  avgt    5   6946.586 ±  354.581  ms/op
CQLMultiQueryBenchmark.getNames                                                                                               100                          N/A               N/A  avgt    5    158.185 ±   78.300  ms/op
CQLMultiQueryBenchmark.getNames                                                                                               500                          N/A               N/A  avgt    5   8266.414 ±  224.934  ms/op
CQLMultiQueryBenchmark.getNeighborNames                                                                                       100                          N/A               N/A  avgt    5    165.204 ±   49.715  ms/op
CQLMultiQueryBenchmark.getNeighborNames                                                                                       500                          N/A               N/A  avgt    5   7983.727 ±  240.637  ms/op
CQLMultiQueryBenchmark.getVerticesFilteredByAndStep                                                                           100                          N/A               N/A  avgt    5     19.570 ±    4.763  ms/op
CQLMultiQueryBenchmark.getVerticesFilteredByAndStep                                                                           500                          N/A               N/A  avgt    5    416.075 ±   35.813  ms/op
CQLMultiQueryBenchmark.getVerticesFromMultiNestedRepeatStepStartingFromSingleVertex                                           100                          N/A               N/A  avgt    5    317.255 ±   56.425  ms/op
CQLMultiQueryBenchmark.getVerticesFromMultiNestedRepeatStepStartingFromSingleVertex                                           500                          N/A               N/A  avgt    5  11904.330 ±  676.284  ms/op
CQLMultiQueryBenchmark.getVerticesWithCoalesceUsage                                                                           100                          N/A               N/A  avgt    5     17.015 ±    5.919  ms/op
CQLMultiQueryBenchmark.getVerticesWithCoalesceUsage                                                                           500                          N/A               N/A  avgt    5    350.154 ±   55.172  ms/op
CQLMultiQueryBenchmark.getVerticesWithDoubleUnion                                                                             100                          N/A               N/A  avgt    5     18.984 ±    5.438  ms/op
CQLMultiQueryBenchmark.getVerticesWithDoubleUnion                                                                             500                          N/A               N/A  avgt    5    360.418 ±   33.777  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                         true              5000  avgt    5    182.429 ±   44.259  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                         true             50000  avgt    5   2340.425 ±  206.067  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                        false              5000  avgt    5    180.499 ±   40.487  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                        false             50000  avgt    5   2287.420 ±  106.430  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                         true              5000  avgt    5    150.357 ±   45.516  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                         true             50000  avgt    5   1906.077 ±  268.831  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                        false              5000  avgt    5    144.797 ±   31.274  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                        false             50000  avgt    5   1869.584 ±  218.428  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                         true              5000  avgt    5    257.977 ±   83.060  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                         true             50000  avgt    5   3119.575 ±  248.486  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                        false              5000  avgt    5    649.581 ±  117.665  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                        false             50000  avgt    5   9750.344 ±  231.916  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                         true              5000  avgt    5    508.997 ±  200.694  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                         true             50000  avgt    5   5529.369 ±  222.520  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                        false              5000  avgt    5    891.095 ±  142.167  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                        false             50000  avgt    5  13073.927 ± 1140.664  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                         true              5000  avgt    5    211.978 ±   60.261  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                         true             50000  avgt    5   3087.432 ±  276.040  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                        false              5000  avgt    5    646.969 ±  166.361  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                        false             50000  avgt    5   9834.445 ±  483.247  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                         true              5000  avgt    5    149.255 ±   81.601  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                         true             50000  avgt    5   1826.016 ±  230.274  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                        false              5000  avgt    5    233.800 ±   57.818  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                        false             50000  avgt    5   3350.430 ±  332.899  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                         true              5000  avgt    5   1709.161 ±   98.722  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                         true             50000  avgt    5  18500.760 ±  382.153  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                        false              5000  avgt    5   2812.449 ±   73.314  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                        false             50000  avgt    5  32993.734 ±  888.912  ms/op

Some things to notice:

GitHub Actions often assigns jobs to different pods with different resources, so a comparison between the previous report and the new report may not be accurate. For this reason, it's recommended to run the benchmark locally on your own server or laptop, where the available resources are the same between runs.
Adding a single parameter with 2 states doubles the number of benchmark runs. Thus, I don't recommend doing so in this PR, because it would double the amount of benchmark work. Instead, it's better to simply add a new test class (similar to the existing tests) with the new parameter and test cases made specifically for evaluating this feature.
Previously, to verify that new optimizations / features don't introduce any regressions, I simply ran the benchmark on the latest commit from the master branch, then switched back to the feature branch and ran the tests again. I did this on a local laptop with no other heavy active processes. After that, I uploaded the two benchmark reports in comments under the relevant PR or created a GitHub Gist with all the reports. This made it possible to verify that the feature doesn't bring any regressions to the existing benchmarks. Again, the GitHub Actions job does the same, but it's not accurate due to the different resources available between benchmark runs.
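
When comparing two reports by hand, one rough heuristic is to check whether the "Score ± Error" intervals of the two rows overlap before reading anything into the difference. The sketch below assumes that heuristic; `ScoreInterval` is a hypothetical helper, not part of JMH or JanusGraph. In a real comparison the two rows would come from the master run and the feature-branch run.

```java
// Hypothetical helper for reading "Score ± Error" rows; not part of JMH or JanusGraph.
final class ScoreInterval {
    final double score;
    final double error;

    ScoreInterval(double score, double error) {
        this.score = score;
        this.error = error;
    }

    double low() {
        return score - error;
    }

    double high() {
        return score + error;
    }

    // True when the two intervals share at least one point.
    boolean overlaps(ScoreInterval other) {
        return low() <= other.high() && other.low() <= high();
    }
}
```

As sample input only, take two pairs of rows from the report above: `150.357 ± 45.516` and `144.797 ± 31.274` overlap, so that difference should not be read as a change, while `1709.161 ± 98.722` and `2812.449 ± 73.314` do not overlap. Overlap is a coarse filter rather than a statistical test, so a non-overlapping pair still deserves a rerun under the same conditions.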

For reference:
JanusGraph Benchmark CI: https://github.com/JanusGraph/janusgraph/blob/master/.github/workflows/ci-benchmark.yml
Benchmark main method (as can be seen there, some tests are skipped in CI by default because they are too heavy and take too long to execute. However, it's possible to run them by passing the relevant command-line arguments):

janusgraph/janusgraph-benchmark/src/main/java/org/janusgraph/BenchmarkRunner.java

Line 95 in 1c53402

    
           public static void main(String[] args) throws RunnerException, IOException, InterruptedException {

ntisseyre · 2024-04-24T23:03:01Z

Great contribution, thank you! I would like to see some benchmark numbers or a new benchmark checked-in for perf-related PRs, but since this feature is off by default, I am happy to accept it as it is.

Thank you @li-boxuan ! I want to add a benchmark test myself. I looked at the module, and I think it would be interesting to execute a test with the feature on/off and compare results. Is it possible to configure the test in this way?

You should just add a new param into you test and annotate it. The benchmark test will detect all the annotated parameters and will execute tests with all combinations of all provided values.

Here is an example of a boolean parameter (can be seen in this benchmark test):

@Param({"true", "false"})
boolean fastProperty;

You can create a similar param for lazy-loading feature.

A quick thing you can do in the PR #4367 :

Add the above parameter into all existing JanusGraph tests ( https://github.com/JanusGraph/janusgraph/tree/master/janusgraph-benchmark/src/main/java/org/janusgraph )
Each test has getConfiguration method where you can reuse your values for JanusGraph configuration creation based on the available parameters. Just set the lazy loading to true or false there.
Push changes into the testing branch and the benchmark process will start (GitHub Action CI). After that the report should be available the end of step called Run mvn verify --projects janusgraph-benchmark under the Performance regression check job. Here is an example of the performance regression check of your branch (i.e. this PR): https://github.com/ntisseyre/janusgraph/actions/runs/8681353435/job/23803868577

You can see the following report under that CI job:

# Run complete. Total time: 00:56:36

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                                                                                                          (fanoutFactor)  (propertyCardinalitySingle)  (verticesAmount)  Mode  Cnt      Score      Error  Units
CQLMultiQueryBenchmark.getAdjacentVerticesLocalCounts                                                                         100                          N/A               N/A  avgt    5    223.567 ±  442.426  ms/op
CQLMultiQueryBenchmark.getAdjacentVerticesLocalCounts                                                                         500                          N/A               N/A  avgt    5   8459.878 ±  647.477  ms/op
CQLMultiQueryBenchmark.getAllElementsTraversedFromOuterVertex                                                                 100                          N/A               N/A  avgt    5    157.624 ±   37.750  ms/op
CQLMultiQueryBenchmark.getAllElementsTraversedFromOuterVertex                                                                 500                          N/A               N/A  avgt    5   8190.692 ±  610.894  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingEmitRepeatSteps                                                                    100                          N/A               N/A  avgt    5    317.631 ±   92.030  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingEmitRepeatSteps                                                                    500                          N/A               N/A  avgt    5  16468.902 ±  426.340  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingRepeatUntilSteps                                                                   100                          N/A               N/A  avgt    5    180.623 ±   48.971  ms/op
CQLMultiQueryBenchmark.getElementsWithUsingRepeatUntilSteps                                                                   500                          N/A               N/A  avgt    5   8827.256 ±  275.217  ms/op
CQLMultiQueryBenchmark.getIdToOutVerticesProjection                                                                           100                          N/A               N/A  avgt    5     11.525 ±    4.256  ms/op
CQLMultiQueryBenchmark.getIdToOutVerticesProjection                                                                           500                          N/A               N/A  avgt    5    240.265 ±   30.486  ms/op
CQLMultiQueryBenchmark.getLabels                                                                                              100                          N/A               N/A  avgt    5    128.353 ±   74.876  ms/op
CQLMultiQueryBenchmark.getLabels                                                                                              500                          N/A               N/A  avgt    5   6946.586 ±  354.581  ms/op
CQLMultiQueryBenchmark.getNames                                                                                               100                          N/A               N/A  avgt    5    158.185 ±   78.300  ms/op
CQLMultiQueryBenchmark.getNames                                                                                               500                          N/A               N/A  avgt    5   8266.414 ±  224.934  ms/op
CQLMultiQueryBenchmark.getNeighborNames                                                                                       100                          N/A               N/A  avgt    5    165.204 ±   49.715  ms/op
CQLMultiQueryBenchmark.getNeighborNames                                                                                       500                          N/A               N/A  avgt    5   7983.727 ±  240.637  ms/op
CQLMultiQueryBenchmark.getVerticesFilteredByAndStep                                                                           100                          N/A               N/A  avgt    5     19.570 ±    4.763  ms/op
CQLMultiQueryBenchmark.getVerticesFilteredByAndStep                                                                           500                          N/A               N/A  avgt    5    416.075 ±   35.813  ms/op
CQLMultiQueryBenchmark.getVerticesFromMultiNestedRepeatStepStartingFromSingleVertex                                           100                          N/A               N/A  avgt    5    317.255 ±   56.425  ms/op
CQLMultiQueryBenchmark.getVerticesFromMultiNestedRepeatStepStartingFromSingleVertex                                           500                          N/A               N/A  avgt    5  11904.330 ±  676.284  ms/op
CQLMultiQueryBenchmark.getVerticesWithCoalesceUsage                                                                           100                          N/A               N/A  avgt    5     17.015 ±    5.919  ms/op
CQLMultiQueryBenchmark.getVerticesWithCoalesceUsage                                                                           500                          N/A               N/A  avgt    5    350.154 ±   55.172  ms/op
CQLMultiQueryBenchmark.getVerticesWithDoubleUnion                                                                             100                          N/A               N/A  avgt    5     18.984 ±    5.438  ms/op
CQLMultiQueryBenchmark.getVerticesWithDoubleUnion                                                                             500                          N/A               N/A  avgt    5    360.418 ±   33.777  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                         true              5000  avgt    5    182.429 ±   44.259  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                         true             50000  avgt    5   2340.425 ±  206.067  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                        false              5000  avgt    5    180.499 ±   40.487  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                  N/A                        false             50000  avgt    5   2287.420 ±  106.430  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                         true              5000  avgt    5    150.357 ±   45.516  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                         true             50000  avgt    5   1906.077 ±  268.831  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                        false              5000  avgt    5    144.797 ±   31.274  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch                                                    N/A                        false             50000  avgt    5   1869.584 ±  218.428  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                         true              5000  avgt    5    257.977 ±   83.060  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                         true             50000  avgt    5   3119.575 ±  248.486  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                        false              5000  avgt    5    649.581 ±  117.665  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection             N/A                        false             50000  avgt    5   9750.344 ±  231.916  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                         true              5000  avgt    5    508.997 ±  200.694  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                         true             50000  avgt    5   5529.369 ±  222.520  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                        false              5000  avgt    5    891.095 ±  142.167  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch                                                   N/A                        false             50000  avgt    5  13073.927 ± 1140.664  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                         true              5000  avgt    5    211.978 ±   60.261  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                         true             50000  avgt    5   3087.432 ±  276.040  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                        false              5000  avgt    5    646.969 ±  166.361  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch                                               N/A                        false             50000  avgt    5   9834.445 ±  483.247  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                         true              5000  avgt    5    149.255 ±   81.601  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                         true             50000  avgt    5   1826.016 ±  230.274  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                        false              5000  avgt    5    233.800 ±   57.818  ms/op
CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection                N/A                        false             50000  avgt    5   3350.430 ±  332.899  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                         true              5000  avgt    5   1709.161 ±   98.722  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                         true             50000  avgt    5  18500.760 ±  382.153  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                        false              5000  avgt    5   2812.449 ±   73.314  ms/op
CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching                                                             N/A                        false             50000  avgt    5  32993.734 ±  888.912  ms/op

Some things to notice:

GitHub Actions often assigns work to different pods with different resources, so a comparison between the previous report and the new report may not be accurate. For this reason, it's recommended to execute the benchmark locally, on your own server or a laptop, where the available resources are the same between runs.
Adding a single parameter with 2 states doubles the total number of tests. I therefore don't recommend doing so in this PR, because it would double the number of benchmark executions. Instead, it's better to simply add a new test class (similar to the existing tests) with that parameter and with test cases made specifically for evaluating this feature.
Previously, to verify that new optimizations / features don't introduce any regressions, I simply executed the benchmark on the latest commit from the master branch, then switched back to the feature branch and executed the tests again. I did this on a local laptop with no other heavy processes running. After that I uploaded the two benchmark reports as comments under the relevant PR, or created a GitHub Gist with all the reports. This allowed me to verify that the feature doesn't bring any regressions to the existing benchmarks. Again, the GitHub Actions job does the same thing, but it's not accurate because the resources available differ between benchmark runs.
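To make the doubling effect concrete, here is a minimal sketch in plain Java (it does not use JMH itself) that enumerates every combination of benchmark parameter values. The class and method names are hypothetical; the parameter names are taken from the reports in this thread, and the assumption is only that the benchmark runs once per combination, as described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JanusGraph or JMH.
public class ParamCombinations {

    // Builds every combination of the given parameter values (a cartesian product).
    static List<Map<String, String>> combinations(Map<String, List<String>> params) {
        List<Map<String, String>> result = new ArrayList<>();
        result.add(new LinkedHashMap<>());
        for (Map.Entry<String, List<String>> param : params.entrySet()) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> partial : result) {
                for (String value : param.getValue()) {
                    // Extend each partial combination with one value of the current parameter.
                    Map<String, String> extended = new LinkedHashMap<>(partial);
                    extended.put(param.getKey(), value);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        params.put("propertyCardinalitySingle", Arrays.asList("true", "false"));
        params.put("verticesAmount", Arrays.asList("5000", "50000"));
        System.out.println(combinations(params).size()); // prints 4

        // One extra two-state parameter doubles the number of runs.
        params.put("isLazyLoad", Arrays.asList("true", "false"));
        System.out.println(combinations(params).size()); // prints 8
    }
}
```

Every benchmark method in a class is multiplied by the number of combinations, which is why a dedicated test class keeps the extra cost limited to the new feature.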

For reference:
JanusGraph Benchmark CI: https://github.com/JanusGraph/janusgraph/blob/master/.github/workflows/ci-benchmark.yml
Benchmark main method (as you can see, some tests are skipped in the CI by default because they are too heavy and take too long to execute; however, it's possible to execute them by passing the relevant command-line arguments):

janusgraph/janusgraph-benchmark/src/main/java/org/janusgraph/BenchmarkRunner.java

Line 95 in 1c53402

    
           public static void main(String[] args) throws RunnerException, IOException, InterruptedException {

Thank you @porunov !
I have added a benchmarking test here 60c7698#diff-8ce3c56c221176d06101a8cd771e4a0781620b696407b27d6a97a0405442a603R37
cc @li-boxuan

porunov · 2024-04-25T14:15:27Z

It was not triggered due to failed checkstyle:

Error:  /home/runner/work/janusgraph/janusgraph/janusgraph-benchmark/src/main/java/org/janusgraph/LazyLoadBenchmark.java:19:27: Using the '.*' form of import should be avoided - org.janusgraph.core.*. [AvoidStarImport]
Error:  /home/runner/work/janusgraph/janusgraph/janusgraph-benchmark/src/main/java/org/janusgraph/LazyLoadBenchmark.java:24:35: Using the '.*' form of import should be avoided - org.openjdk.jmh.annotations.*. [AvoidStarImport]
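For readers unfamiliar with the AvoidStarImport rule, the fix is to replace each wildcard import with the explicit classes actually used. The sketch below shows the pattern with JDK classes only; which classes from org.janusgraph.core and org.openjdk.jmh.annotations LazyLoadBenchmark needs is not shown in this thread, so those are deliberately not guessed here. The class name is hypothetical.

```java
// Rejected by the check: import java.util.*;
// Accepted: one import per class that is actually used.
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, unrelated to the real LazyLoadBenchmark.
public class ExplicitImports {

    // Uses only the two explicitly imported types.
    static List<String> importedTypes() {
        List<String> names = new ArrayList<>();
        names.add(ArrayList.class.getName());
        names.add(List.class.getName());
        return names;
    }

    public static void main(String[] args) {
        System.out.println(importedTypes());
    }
}
```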

ntisseyre · 2024-04-25T14:48:24Z

LazyLoadBenchmark

Oh, thanks! Fixed

porunov · 2024-04-25T23:48:35Z

Hmm. Seems OOM.

# Benchmark: org.janusgraph.LazyLoadBenchmark.getProperties
# Parameters: (isLazyLoad = false, verticesAmount = 5000)
Iteration   1: 173.570 ms/op
Iteration   2: 168.559 ms/op
Iteration   3: 169.724 ms/op
Iteration   4: 167.155 ms/op
Iteration   5: 169.652 ms/op

# Parameters: (isLazyLoad = false, verticesAmount = 100000)
java.lang.OutOfMemoryError: GC overhead limit exceeded

So it seems it never executed any benchmark test with isLazyLoad = true. I guess you could reduce verticesAmount or execute the test locally. That said, it's strange that an OOM happened with only 100k vertices.

ntisseyre · 2024-04-26T12:39:49Z

I have reduced it to only 5k vertices and put isLazyLoad=true first so that it is executed first.
When I ran it locally, I observed that it executes about 2 times faster compared to isLazyLoad=false.
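As a rough illustration of why lazy deserialization can help when only some values are read, here is a minimal, generic sketch in plain Java. It is not the JanusGraph implementation: LazyValue, its decoder function, and the sample data are all hypothetical, and the sketch is not thread-safe.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of lazy deserialization; not JanusGraph code.
public class LazyValueSketch {

    // Keeps the raw bytes and decodes them only on first access.
    static final class LazyValue<T> {
        private final byte[] raw;
        private final Function<byte[], T> decoder;
        private T value;
        private boolean decoded;

        LazyValue(byte[] raw, Function<byte[], T> decoder) {
            this.raw = raw;
            this.decoder = decoder;
        }

        T get() {
            if (!decoded) {
                value = decoder.apply(raw); // deserialization cost is paid here, once
                decoded = true;
            }
            return value;
        }
    }

    public static void main(String[] args) {
        AtomicInteger decodeCalls = new AtomicInteger();
        Function<byte[], String> decoder = bytes -> {
            decodeCalls.incrementAndGet();
            return new String(bytes, StandardCharsets.UTF_8);
        };

        // Two values are loaded, but only one is ever read.
        LazyValue<String> name = new LazyValue<>("alice".getBytes(StandardCharsets.UTF_8), decoder);
        LazyValue<String> bio = new LazyValue<>("unused".getBytes(StandardCharsets.UTF_8), decoder);

        System.out.println(name.get());
        name.get(); // second access reuses the decoded value
        System.out.println(decodeCalls.get()); // prints 1: bio was never decoded
    }
}
```

The saving comes from the values that are never read; if every value is read, a lazy wrapper only adds a small overhead.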

porunov · 2024-04-26T17:59:47Z

Below are the tests I executed on my local laptop with minimal parallel processes running.

Master branch basic benchmark tests:

Benchmark                                   (hardMaxLimit)  (numberOfVertices)  (size)  (useSmartLimit)  Mode  Cnt     Score      Error  Units
GraphCentricQueryBenchmark.getVertices              100000                 N/A   10000             true  avgt    5     7.279 ±    1.083  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A   10000            false  avgt    5     5.077 ±    0.143  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A  250000             true  avgt    5   245.352 ±   36.570  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A  250000            false  avgt    5   255.863 ±   23.198  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A   10000             true  avgt    5     7.259 ±    0.359  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A   10000            false  avgt    5     5.248 ±    0.223  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A  250000             true  avgt    5   230.278 ±   20.577  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A  250000            false  avgt    5   143.663 ±   13.211  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A                1000     N/A              N/A  avgt    5   388.187 ±  483.345  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A               10000     N/A              N/A  avgt    5  1387.198 ± 1485.462  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A              100000     N/A              N/A  avgt    5  7293.442 ± 4041.513  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A                1000     N/A              N/A  avgt    5     0.949 ±    0.095  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A               10000     N/A              N/A  avgt    5    12.261 ±    0.404  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A              100000     N/A              N/A  avgt    5   195.550 ±    8.741  ms/op
MgmtOlapJobBenchmark.runClearIndex                     N/A                 N/A   10000              N/A  avgt    5   214.964 ±    1.516  ms/op
MgmtOlapJobBenchmark.runReindex                        N/A                 N/A   10000              N/A  avgt    5   255.134 ±    1.426  ms/op

Current PR basic benchmark tests:

Benchmark                                   (hardMaxLimit)  (numberOfVertices)  (size)  (useSmartLimit)  Mode  Cnt     Score      Error  Units
GraphCentricQueryBenchmark.getVertices              100000                 N/A   10000             true  avgt    5     7.510 ±    0.443  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A   10000            false  avgt    5     5.428 ±    0.280  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A  250000             true  avgt    5   221.762 ±   20.090  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A  250000            false  avgt    5   266.333 ±   11.015  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A   10000             true  avgt    5     7.496 ±    0.283  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A   10000            false  avgt    5     5.147 ±    0.298  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A  250000             true  avgt    5   231.363 ±    9.417  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A  250000            false  avgt    5   150.624 ±    8.143  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A                1000     N/A              N/A  avgt    5   381.277 ±  488.346  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A               10000     N/A              N/A  avgt    5  1363.017 ± 1539.981  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A              100000     N/A              N/A  avgt    5  7071.629 ± 2560.979  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A                1000     N/A              N/A  avgt    5     0.966 ±    0.106  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A               10000     N/A              N/A  avgt    5    11.893 ±    0.592  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A              100000     N/A              N/A  avgt    5   208.851 ±   69.922  ms/op
MgmtOlapJobBenchmark.runClearIndex                     N/A                 N/A   10000              N/A  avgt    5   215.058 ±    1.976  ms/op
MgmtOlapJobBenchmark.runReindex                        N/A                 N/A   10000              N/A  avgt    5   254.814 ±    2.692  ms/op

The benchmark test introduced in #4367:

Benchmark                        (isLazyLoad)  (verticesAmount)  Mode  Cnt    Score   Error  Units
LazyLoadBenchmark.getProperties          true              5000  avgt    5   75.577 ± 4.859  ms/op
LazyLoadBenchmark.getProperties         false              5000  avgt    5  156.141 ± 7.985  ms/op

Conclusion:
With the default transaction behavior, performance is the same between this PR and the latest commit in the master branch. No regression was detected.
The added benchmark test LazyLoadBenchmark shows a use case where performance roughly doubles with lazy loading enabled in transactions.
I would prefer adding the LazyLoadBenchmark test into the main codebase as well, but I think we can do it later via a separate PR. Merging this PR. Thank you @ntisseyre for this awesome optimization!
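The two claims above can be checked against the reported numbers with a small sketch. The class and method names are hypothetical, and the overlap check on "score ± error" ranges is only a rough heuristic for "no visible difference", not a formal statistical test.

```java
import java.util.Locale;

// Hypothetical helper for reading JMH-style "score ± error" rows.
public class BenchmarkCompare {

    // How many times faster the candidate is than the baseline (both in ms/op).
    static double speedup(double baselineMs, double candidateMs) {
        return baselineMs / candidateMs;
    }

    // True when the ranges [score - error, score + error] of two runs intersect.
    static boolean intervalsOverlap(double scoreA, double errA, double scoreB, double errB) {
        return scoreA - errA <= scoreB + errB && scoreB - errB <= scoreA + errA;
    }

    public static void main(String[] args) {
        // LazyLoadBenchmark.getProperties: isLazyLoad=false vs isLazyLoad=true.
        System.out.println(String.format(Locale.ROOT, "%.2f", speedup(156.141, 75.577))); // prints 2.07
        System.out.println(intervalsOverlap(156.141, 7.985, 75.577, 4.859)); // prints false

        // MgmtOlapJobBenchmark.runReindex: master vs this PR.
        System.out.println(intervalsOverlap(255.134, 1.426, 254.814, 2.692)); // prints true
    }
}
```

The lazy-load ranges do not overlap, which is consistent with a real speedup, while the master and PR ranges for runReindex do overlap, which is consistent with no regression.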

ntisseyre force-pushed the lazy_relations branch 3 times, most recently from 821111c to dfbb92e Compare March 20, 2024 19:20

janusgraph-bot added the cla: external Externally-managed CLA label Mar 25, 2024

ntisseyre force-pushed the lazy_relations branch 4 times, most recently from 8a8c0d2 to 1711b2b Compare April 7, 2024 11:46

porunov approved these changes Apr 9, 2024

porunov reviewed Apr 9, 2024

ntisseyre force-pushed the lazy_relations branch 2 times, most recently from bb98134 to 367df51 Compare April 10, 2024 01:42

li-boxuan reviewed Apr 10, 2024

ntisseyre force-pushed the lazy_relations branch 6 times, most recently from 9709a6f to d725e10 Compare April 12, 2024 19:36

Lazy load relations and cached query optimization

cd52b37

Signed-off-by: ntisseyre <ntisseyre@apple.com>

ntisseyre force-pushed the lazy_relations branch from d725e10 to cd52b37 Compare April 14, 2024 22:32

porunov mentioned this pull request Apr 23, 2024

Lazy relations test [cql-tests][tp-tests] #4367

Closed

li-boxuan approved these changes Apr 24, 2024

porunov merged commit eed8756 into JanusGraph:master Apr 26, 2024
174 checks passed

porunov added this to the Release v1.1.0 milestone Apr 26, 2024

porunov added the kind/performance label Apr 26, 2024
