Add support for lazy loading and imports of some expensive subpackages and modules to speed up Perun startup time #259

JiriPavela · 2024-09-30T12:08:41Z

This PR implements the mechanism proposed in #223 for the most expensive modules that were being imported during perun --version, perun import, and perun showdiff.

tfiedor

If I am reading this right, view is not fully lazy loaded, and it was quite a source of the problems (with holoviews/bokeh being quite huge). Is that WIP?
Can you show some numbers? Like running perun help, perun status, perun init and maybe perun showdiff before and after? Just a few runs and a few numbers so we can compare how much we have saved, because it did come with a cost (worse readability, bigger complexity, and I fear it might be harder for IDEs to suggest completions).

Well done anyway.

tfiedor · 2024-10-06T11:50:00Z

perun/postprocess/regression_analysis/run.py

@@ -94,7 +100,7 @@ def store_model_counts(analysis: list[dict[str, Any]]) -> None:
 @click.option(
    "--regression_models",
    "-r",
-    type=click.Choice(regression_models.get_supported_models()),
+    type=click.Choice(perun.utils.structs.postprocess_public.get_supported_models()),


Why the full name? This seems like an automatic change.

Great, thanks for the catch. Fixed.

tfiedor · 2024-10-06T11:53:11Z

perun/utils/common/cli_kit.py

@@ -832,6 +834,8 @@ def set_optimization(_: click.Context, param: click.Argument, value: str) -> str
    :param value: value of the parameter
    :return: the value
    """
+    from perun.collect.trace.optimizations.optimization import Optimization


Is this intended? Again, it looks automatic, and imports in functions are shunned by the linting.

This is actually intended. Optimization is a global object in the optimizations module and it is too complicated to refactor it right now, as many more modules and packages would have to be lazy imported. So this is a temporary solution, similarly to how many other packages do import their nested modules in lazy_get_cli_commands, e.g., view, collect, etc.

Add fixme there then, so we do not forget.

There are many more places in the codebase where local imports were introduced before we decided to adopt the lazy-loader approach, so this is something I definitely intend to cover in subsequent PRs regardless. However, for peace of mind, I added a TODO comment there.
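
For readers unfamiliar with the pattern discussed in this thread, here is a minimal sketch of a function-local (deferred) import. It is not Perun code: `set_color_space` and `is_loaded` are hypothetical names, and the stdlib module `colorsys` merely stands in for an expensive module such as the optimizations one.

```python
import sys


def is_loaded(module_name: str) -> bool:
    """Report whether a module has already been imported in this process."""
    return module_name in sys.modules


def set_color_space(value: str) -> str:
    """Validate `value` against names defined in a module we avoid importing at startup."""
    # TODO: replace with a package-level lazy import once the module is refactored
    import colorsys  # stand-in for an expensive module; imported on the first call only

    if not hasattr(colorsys, f"rgb_to_{value}"):
        raise ValueError(f"unsupported color space: {value}")
    return value
```

The import cost is paid only when the callback actually runs, which is why commands that never reach it start faster. The trade-off is the one raised above: linters tend to flag imports that are not at module level.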

tfiedor · 2024-10-06T11:54:40Z

perun/utils/structs/check_public.py

@@ -0,0 +1,29 @@
+from __future__ import annotations


_public sounds a little bit weird. Maybe just call it check_structs, etc., since we have common_structs and hence it will be uniform?

tfiedor · 2024-10-06T11:56:50Z

MANIFEST.in

@@ -0,0 +1,2 @@
+recursive-include perun *.pyi
+include perun/py.typed


What is this py.typed file? Is it some stub? Can we add a comment on why it is here and why it is empty? Could it contain a comment like # Empty file needed for lazy_loading?

This is a requirement of PEP 561 and is also described in the mypy documentation. Packages that distribute both runtime and type stub files (.pyi files) need to contain a py.typed file as well to indicate support for type hints. The MANIFEST.in file is then needed so that the sdist distribution includes the .pyi and py.typed files. As the PEP does not specify what the py.typed file should contain and it is easy enough to find an explanation for the file online, I'd just keep it empty.
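
As a rough illustration of what the two MANIFEST.in lines are meant to guarantee, the sketch below checks a package directory for the py.typed marker and lists the .pyi stubs it ships. The helper names `typing_files` and `demo` and the temporary directory layout are assumptions made for this example only; they are not part of Perun.

```python
import tempfile
from pathlib import Path


def typing_files(package_dir):
    """Return (marker_present, stub_paths) for a package directory."""
    root = Path(package_dir)
    marker_present = (root / "py.typed").is_file()
    # Collect stub files recursively, as `recursive-include perun *.pyi` would
    stubs = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.pyi"))
    return marker_present, stubs


def demo():
    """Build a throwaway package tree and inspect it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "pkg"
        (pkg / "check").mkdir(parents=True)
        (pkg / "py.typed").touch()  # empty marker file, as permitted by PEP 561
        (pkg / "check" / "__init__.pyi").touch()
        return typing_files(pkg)
```

Running a check like this against an unpacked sdist is one way to confirm that the marker and stubs were actually packaged.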

JiriPavela · 2024-10-06T19:40:31Z

> If I am reading this right, view is not fully lazy loaded, and it was quite a source of the problems (with holoviews/bokeh being quite huge). Is that WIP?

Right now, view modules are being lazy loaded the naive way, i.e., by local imports in lazy_get_cli_commands. As this PR is meant to be more of a hotfix, I didn't touch code that already achieved the same goal, albeit in another way. Subsequent PRs will aim to port the entire codebase to lazy_loader approach.

> Can you show some numbers? Like running perun help, perun status, perun init and maybe perun showdiff before and after? Just a few runs and a few numbers so we can compare how much we have saved, because it did come with a cost (worse readability, bigger complexity, and I fear it might be harder for IDEs to suggest completions).

I agree with the worse readability and bigger complexity. That is sadly the price of Python not having native support for lazy loading. IDEs, IntelliSense, and autocomplete should not be affected; that's what the __getattr__, __dir__, __all__ = lazy.attach_stub(... line in __init__.py files is there for. Nonetheless, some IDEs still run into trouble with it, but imports in the form of from perun import check as check usually solve it (see similar bugs with some other packages), although pylint complains about such import statements.
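
To make the role of the `__getattr__`, `__dir__`, `__all__` triple more concrete, here is a hand-rolled, stdlib-only sketch of the underlying mechanism (module-level `__getattr__` and `__dir__` from PEP 562). It is not the lazy_loader implementation and not Perun code; `make_lazy_namespace`, the `_loaded` bookkeeping list, and the attribute names used in the test are hypothetical.

```python
import importlib
import types


def make_lazy_namespace(name, targets):
    """Build a module whose attributes are imported on first access.

    `targets` maps an attribute name to a fully qualified module name.
    """
    module = types.ModuleType(name)
    loaded = []  # records which attributes triggered a real import

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails
        if attr not in targets:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        value = importlib.import_module(targets[attr])
        loaded.append(attr)
        setattr(module, attr, value)  # cache, so later lookups bypass __getattr__
        return value

    def __dir__():
        # Advertise lazy attributes so dir() and tooling can see them
        return sorted(targets)

    module.__getattr__ = __getattr__
    module.__dir__ = __dir__
    module.__all__ = sorted(targets)
    module._loaded = loaded
    return module
```

Nothing is imported when the namespace is created; the first attribute access performs the import and caches the result. Static tools cannot see names that exist only at runtime, which is the gap the .pyi stubs are there to fill.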

As for the numbers, I measured five runs of different commands and chose the median values:

perun --version: 2.414s vs 0.287s (8.4x speedup)
perun --help: 2.404s vs 0.294s (8.2x speedup)
perun init: 2.416s vs 0.296s (8.2x speedup)
perun import: 2.504s vs 0.539s (4.6x speedup)
perun showdiff (nontrivial input that contains a lot of processing): 4.478s vs 2.503s (1.8x speedup)
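
For anyone who wants to reproduce this kind of measurement, a minimal sketch of the procedure (several runs per command, median wall-clock time, ratio as speedup) could look as follows. The helper names `median_runtime` and `speedup` are assumptions for this example; the exact method used for the numbers above is not specified beyond "five runs, median".

```python
import statistics
import subprocess
import time


def median_runtime(command, runs=5):
    """Run `command` several times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def speedup(before, after):
    """Ratio of the old runtime to the new runtime."""
    return before / after
```

For example, `median_runtime(["perun", "--version"])` would time the installed CLI, and `speedup(2.414, 0.287)` reproduces the roughly 8.4x figure above. To see which imports dominate, CPython's `-X importtime` option prints a per-module breakdown.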

JiriPavela added 2 commits September 30, 2024 14:30

Remove dependency on collect command in CLI

3fb27d9

Prototype implementation of lazy loading for check/detection_kit.py

8d537d3

JiriPavela force-pushed the lazy-loading branch from dba5714 to 8d537d3 Compare September 30, 2024 12:30

JiriPavela added 2 commits October 5, 2024 23:05

Optimize Perun startup and import times

b780b0d

Fix documentation generator

adeb7a5

JiriPavela changed the title ~~Add support for lazy loading and imports of expensive subpackages and modules~~ Add support for lazy loading and imports of some expensive subpackages and modules to speed up Perun startup time Oct 5, 2024

Remove commented out profiling code

adc5387

JiriPavela marked this pull request as ready for review October 5, 2024 21:40

JiriPavela requested a review from tfiedor October 5, 2024 21:40

JiriPavela mentioned this pull request Oct 6, 2024

Fix empty cli stats record #260

Merged

tfiedor approved these changes Oct 6, 2024


Rename _public struct modules to _structs

c123e13

JiriPavela merged commit 3a24630 into Perfexionists:devel Oct 6, 2024
17 checks passed
