[Enh]: Allow for kwargs in LazyFrame.collect #1042

Open
FBruzzesi opened this issue Sep 22, 2024 · 1 comment
Labels
enhancement (New feature or request), needs discussion

Comments

@FBruzzesi
Member

We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?

No response

Please describe the purpose of the new feature or describe the problem to solve.

Allow passing collect arguments through to the underlying backend method call. For instance, with the current implementation it is not possible to run polars with its streaming engine.

Ideally this should be as flexible as possible and not necessarily follow the polars API, because each backend's collect-like function accepts different arguments.

Suggest a solution if possible.

I suggest two possible implementations:

  • sklearn-like: pass arguments with a convention such as polars__streaming, polars__engine, dask__optimize_graph and so on

  • engine specific dict:

     def compute(
         self,
         *,
         polars_kwargs: dict[str, Any] | None = None,
         dask_kwargs: dict[str, Any] | None = None,
         <engine_kwargs>: dict[str, Any] | None = None,
         ...
         ):
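To make the two options concrete, here is a minimal sketch of how each convention could route arguments to the active backend. The helper names `split_backend_kwargs` and `kwargs_for` are hypothetical and not part of Narwhals; the parameter names are only the examples mentioned above, and nothing is forwarded to a real backend here.

```python
from __future__ import annotations

from typing import Any


def split_backend_kwargs(
    kwargs: dict[str, Any], sep: str = "__"
) -> dict[str, dict[str, Any]]:
    """Option 1 (sklearn-like): group '<backend>__<param>' keys per backend.

    Hypothetical helper, shown only to illustrate the naming convention.
    """
    grouped: dict[str, dict[str, Any]] = {}
    for key, value in kwargs.items():
        backend, found, param = key.partition(sep)
        if not found or not backend or not param:
            raise TypeError(f"expected '<backend>{sep}<param>', got {key!r}")
        grouped.setdefault(backend, {})[param] = value
    return grouped


def kwargs_for(
    backend: str,
    *,
    polars_kwargs: dict[str, Any] | None = None,
    dask_kwargs: dict[str, Any] | None = None,
) -> dict[str, Any]:
    """Option 2 (engine-specific dicts): pick the dict for the active backend.

    Dicts meant for other backends are ignored. Hypothetical helper.
    """
    per_backend = {"polars": polars_kwargs, "dask": dask_kwargs}
    return dict(per_backend.get(backend) or {})


# Option 1: one flat namespace, split by prefix.
grouped = split_backend_kwargs(
    {"polars__streaming": True, "dask__optimize_graph": False}
)

# Option 2: one explicit dict per engine.
selected = kwargs_for("polars", polars_kwargs={"streaming": True})
```

In both cases the result for the active backend would then be unpacked into that backend's own collect-like call. The first option keeps a single flat signature but relies on string parsing; the second is more explicit and easier to type-check, at the cost of one parameter per engine.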

If you have tried alternatives, please describe them below.

No response

Additional information that may help us understand your needs.

No response

FBruzzesi added the enhancement and needs discussion labels Sep 22, 2024
@MarcoGorelli
Member

MarcoGorelli commented Sep 22, 2024

Yup, agreed. For duckdb, for example, there's a variety of formats you may want to collect into (pyarrow, pandas, python, ...)
