
[DO NOT MERGE] [SNOW-1491199] merge phase0 server side #2136

Draft · wants to merge 123 commits into base: main

Conversation

@sfc-gh-lspiegelberg (Contributor) commented Aug 21, 2024

Captures work for server-side snowpark phase0.

sfc-gh-azwiegincew and others added 30 commits May 3, 2024 15:17
…: table(), filter() (#1468)

A very basic initial attempt at serializing the AST.

I'm trying to maintain a parallel codebase for phases 0 and 1 for now,
since it would be a shame to do this work twice. Once we complete and
ship phase 0, we'll be able to drastically simplify the phase 1 client.

Unlike what I mentioned before, this implementation doesn't flush
dependencies of eagerly evaluated expressions. Instead, any client-side
value is appended to the pending batch. This is simpler to implement and
will likely work well, although we may need to do some dependency
analysis on the server to ensure we don't issue unnecessary queries.
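
A minimal, illustrative sketch of that batching idea (toy names, not the actual implementation):

```
# Illustrative sketch: lazy operations append their AST statements to a
# pending batch; an eager evaluation sends the whole batch with the next
# query instead of flushing dependencies individually.
pending_batch = []

def record_statement(stmt):
    # Every client-side operation appends its AST statement.
    pending_batch.append(stmt)

def flush_batch():
    # Called when a query is issued; the server receives the full batch
    # and can decide which statements actually need to run.
    batch = list(pending_batch)
    pending_batch.clear()
    return batch

record_statement("table('T')")
record_statement("filter(col('A') > 1)")
assert flush_batch() == ["table('T')", "filter(col('A') > 1)"]
assert flush_batch() == []
```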
Updates our server branch with recent snowpark changes.

1. Which Jira issue is this PR addressing? Make sure that there is an
accompanying issue to your PR.


   Fixes SNOW-0

2. Fill out the following pre-review checklist:

   - [ ] I am adding a new automated test(s) to verify correctness of my new code
   - [ ] I am adding new logging messages
   - [ ] I am adding a new telemetry message
   - [ ] I am adding new credentials
   - [x] I am adding a new dependency

3. Please describe how your code solves the related issue.

Update `ast_pb2.py` (already present in the repository).
Add the `setuptools` dependencies required for development.
Include the module path for `ast_pb2.py` in the manifest, so that the
file makes it into the Snowpark wheel.
Run `update-from-devvm.sh` from within `src/snowflake/snowpark/_internal/proto/` with a running local devvm to update the proto file on the thin client.
….py (#1766)

Modifies `setup.py` to use the latest HEAD of https://github.com/snowflakedb/snowflake-connector-python/tree/server-side-snowpark, which includes the connector changes (most notably, adding the `_dataframe_ast` field for phase 0).

To update your local dev environment, run:
```
pip uninstall snowflake-connector-python -y
python -m pip install --no-cache -e ".[development,pandas]"
```
Running the pip command should show `git clone` in the logs.
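
Optionally, you can sanity-check which connector got installed from Python (the exact version string will differ depending on the branch state):

```
# Confirm the editable install picked up the connector; a source install
# from the server-side-snowpark branch will typically report a version
# different from the latest PyPI release.
import snowflake.connector
print(snowflake.connector.__version__)
```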
…frameAST field. (#1794)

Vendors snowflake-vcrpy from https://github.com/Snowflake-Labs/snowflake-vcrpy (installing it directly did not work, hence the vendoring) with custom Snowflake changes to track requests in the vendored urllib3 within the Snowflake Python connector.

Adds the decorator `check_ast_encode_invoked` (applied with `autouse=True` to all tests), which checks that every query sent contains the `dataframeAst` property for phase 0, and errors out with traceback information whenever tests need to be fixed or APIs that need to be encoded within the AST are missing.
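
A hedged sketch of what such an autouse check might look like (the `recorded_queries` fixture and `query.body` attribute are hypothetical stand-ins for the vendored vcrpy plumbing):

```
import pytest

@pytest.fixture(autouse=True)
def check_ast_encode_invoked(recorded_queries):  # hypothetical fixture
    yield  # run the test, then inspect the queries it issued
    for query in recorded_queries:
        # Fail with the offending query so missing AST encodings for
        # phase 0 APIs surface immediately, with a usable traceback.
        assert "dataframeAst" in query.body, (
            f"query sent without dataframeAst property: {query.body[:200]}"
        )
```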
… with Python 3.8 (#1796)

Temporarily removes Modin tests, as Modin is incompatible with Python 3.8.
Excludes the protobuf file in pre-commit (flake8), as generated protobuf code does not adhere to the enforced coding standards, so a protobuf update would otherwise fail pre-commit.
…ad of sbt (#1811)

The convenience script `update-unparser.sh` currently uses the sbt build. With the introduction of the Bazel Scala target dependencies (i.e., IR changes), switch to using them to build an updated unparser, so that IR/protobuf updates are reflected.
sfc-gh-evandenberg and others added 6 commits October 23, 2024 11:11
…bility and readability (#2464)

Fixes SNOW-1738538: update test expectations to textproto AST for better stability and readability.
Fixes doctests for phase0 with `--ast-enabled` by switching the runner to macOS and fixing up various `logging.<func>` usages, switching them to `_logger` as in the rest of the code base. Doctests do not pass at the moment under Ubuntu (error: returning <BLANKLINE>, interference with the logging module, cf. JIRA); Windows is untested.

Other:
- Warn about a missing unparser.jar file only if `--update-expectations` is passed.
- Use file-specific logger instead of global loggers.
…st ast (#2497)

Fixes multi-AST eval validation when decoding the expectation test AST.
…st_dataframe.py (#2498)

Fixes several tests to reduce test failures for the merge gate with `session.ast_enabled=True`. With this PR, all tests in `test_dataframe.py` pass except the interval-related ones.

Details:
- Adds `ast_id` to the shallow copy protocol for DataFrame.
- Modify APIs so checks happen before AST encoding to pass negative
tests.
- Add missing `ast_stmt` to `select(table_function(...))`
- Fix filter test.
- Fix overflow issue when using `datetime.time` due to timezone
encoding.
- Support the stdlib array type (`array.array`) by mapping it to a list (see the sketch after this list).
- Fix na functions to carry out checks for `subset` parameter.
- Explicitly type check for `col` and `column` as expected in existing
tests.
- Add `NotImplementedError` for interval type in `make_interval`.
- Fix wrong usage of Expression in `function.col(...)`, use `Column`
instead in `apply_in_pandas`.
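
A minimal sketch of the stdlib array handling mentioned in the list above (`_normalize_value` is an illustrative name, not the actual helper):

```
import array

def _normalize_value(value):
    # AST encoding has no array.array representation, so stdlib arrays
    # are mapped to plain lists before encoding.
    if isinstance(value, array.array):
        return list(value)
    return value

assert _normalize_value(array.array("i", [1, 2, 3])) == [1, 2, 3]
```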
Addresses PR feedback for dataframe.py file.
Comment on lines +89 to 91
@publicapi
def mode(self, save_mode: str, _emit_ast: bool = True) -> "DataFrameWriter":
"""Set the save mode of this :class:`DataFrameWriter`.

FYI, some more public APIs were added: `.partition_by`, `option`, `options`

@@ -829,6 +964,9 @@ def create_map(*cols: Union[ColumnOrName, Iterable[ColumnOrName]]) -> Column:
-----------------------
<BLANKLINE>
"""

@sfc-gh-aalam (Contributor) commented Oct 29, 2024

The type hint is correct, although we should probably support Tuple instead of Set. It allows the following:

create_map(col_key, col_value)
create_map([col_key, col_value])
create_map((col_key, col_value))
create_map(col_key1, col_value1, col_key2, col_value2)

Comment on lines +980 to +981
if isinstance(cols, set):
cols = tuple(sorted(list(cols)))
@sfc-gh-aalam (Contributor) commented Oct 29, 2024

let's not do this. This will break the behavior

def create_map(*cols: Union[ColumnOrName, Iterable[ColumnOrName]]) -> Column:
@publicapi
def create_map(
*cols: Union[ColumnOrName, List[ColumnOrName], Set[ColumnOrName]],

Suggested change
*cols: Union[ColumnOrName, List[ColumnOrName], Set[ColumnOrName]],
*cols: Union[ColumnOrName, List[ColumnOrName], Tuple[ColumnOrName]],

Comment on lines +2835 to +2844
b = (
lit(base, _emit_ast=_emit_ast)
if isinstance(base, (int, float))
else _to_col_if_str(base, "log")
)
arg = (
lit(x, _emit_ast=_emit_ast)
if isinstance(x, (int, float))
else _to_col_if_str(x, "log")
)

shouldn't _emit_ast=False in all these cases?
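
A self-contained sketch of the convention the reviewer is pointing at (illustrative toy names, not the Snowpark API): the top-level public function records its own AST entry, so inner `lit` calls should pass `_emit_ast=False` to avoid recording the literals a second time.

```
ast_log = []

def lit(value, _emit_ast=True):
    if _emit_ast:
        ast_log.append(("lit", value))
    return ("lit", value)

def log(base, x, _emit_ast=True):
    if _emit_ast:
        ast_log.append(("log", base, x))
    # Inner helpers suppress emission; the top-level entry above is the
    # only one recorded for this expression.
    b = lit(base, _emit_ast=False)
    arg = lit(x, _emit_ast=False)
    return ("expr", b, arg)

log(10, 100)
assert ast_log == [("log", 10, 100)]
```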

Comment on lines +2868 to +2875
lit(left, _emit_ast=_emit_ast)
if isinstance(left, (int, float))
else _to_col_if_str(left, "pow")
)
power = (
lit(right) if isinstance(right, (int, float)) else _to_col_if_str(right, "pow")
lit(right, _emit_ast=_emit_ast)
if isinstance(right, (int, float))
else _to_col_if_str(right, "pow")

_emit_ast=False?

Comment on lines +3019 to 3023
p = pos if isinstance(pos, Column) else lit(pos, _emit_ast=_emit_ast)
length = len if isinstance(len, Column) else lit(len, _emit_ast=_emit_ast)
return builtin("substring", _emit_ast=_emit_ast)(s, p, length)



same comment

Comment on lines +3068 to +3070
strtok_array = builtin("strtok_to_array", _emit_ast=False)(s, delim)
c = builtin("array_to_string", _emit_ast=False)(
builtin("array_slice", _emit_ast=False)(

more of the same

@@ -2713,11 +3113,15 @@ def regexp_count(
pos = lit(position)

params = [lit(p) for p in parameters]
return builtin(sql_func_name)(sub, pat, pos, *params)
return builtin(sql_func_name, _emit_ast=_emit_ast)(sub, pat, pos, *params)

same

…ner (#2524)

Reduces CI merge gate failures for the AST-enabled gate. The temp auto cleaner is buggy when AST is enabled; for tests like test_dataframe_copy_into, tables are deleted prematurely. For now, disable the auto temp cleaner whenever `ast_enabled` is true in the session. The temp auto cleaner should be moved server-side anyway.

This reduces phase0 ast_enabled test failures from 95 to 39 on our merge gate.
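
A hedged sketch of the guard described above, assuming the session exposes the cleaner toggle as `auto_clean_up_temp_table_enabled` (the attribute name is an assumption; check session.py):

```
# Disable the temp table auto cleaner whenever AST emission is on, so
# tables referenced by batched statements are not dropped prematurely.
if session.ast_enabled:
    session.auto_clean_up_temp_table_enabled = False
```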

session = snowflake.snowpark.session._get_sandbox_conditional_active_session(
session
# Same as udtf, but pandas_udtf has kwargs input_names, max_batch_size in addition to plain udtf.

thanks


class QueryListener:
@abstractmethod
def _notify(self, query_record: QueryRecord, *args, **kwargs) -> None:

args is not really used now. Is it only there for future-proofing? Do you foresee args being used anytime soon?
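
For illustration, a minimal listener built on this abstract class (`PrintingListener` is hypothetical); the unused `*args`/`**kwargs` slots would presumably carry future payloads such as the dataframe AST:

```
from abc import ABC, abstractmethod
from snowflake.snowpark.query_history import QueryRecord

class QueryListener(ABC):
    @abstractmethod
    def _notify(self, query_record: QueryRecord, *args, **kwargs) -> None:
        ...

class PrintingListener(QueryListener):
    def _notify(self, query_record: QueryRecord, *args, **kwargs) -> None:
        # Only the record is consumed today; extra arguments are ignored.
        print(query_record.query_id, query_record.sql_text)
```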

Comment on lines +122 to +127
self._ast = with_src_position(proto.SpGroupingSets())
set_list, self._ast.sets.variadic = parse_positional_args_to_list_variadic(
*sets
)
for s in set_list:
build_expr_from_python_val(self._ast.sets.args.add(), s)

this should be param protected?
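
A sketch of the param protection being asked about, assuming a session-level `ast_enabled` flag is reachable here (that gating condition is an assumption):

```
# Only build the AST node when AST emission is enabled for the session.
if session.ast_enabled:
    self._ast = with_src_position(proto.SpGroupingSets())
    set_list, self._ast.sets.variadic = parse_positional_args_to_list_variadic(*sets)
    for s in set_list:
        build_expr_from_python_val(self._ast.sets.args.add(), s)
else:
    self._ast = None
```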

@@ -450,79 +529,150 @@ def pivot(
------------------------------
<BLANKLINE>
"""
self._df, pc, pivot_values, default_on_null = prepare_pivot_arguments(

self._df, pc, pivot_values, pivot_default_on_null = prepare_pivot_arguments(

I know we made an earlier effort to replace `self._df` with `self._dataframe`. Shall we do it here as well?

@@ -68,6 +74,9 @@ def __init__(
execute_as: typing.Literal["caller", "owner"] = "owner",
anonymous_sp_sql: Optional[str] = None,
packages: Optional[List[Union[str, ModuleType]]] = None,
_ast: Optional[proto.StoredProcedure] = None,
_ast_id: Optional[int] = None,
_stmt: Optional[proto.Assign] = None,

should this be called `_ast_stmt` as done in some other places? Or is it different?

@@ -408,7 +445,7 @@ class StoredProcedureRegistration:
-------------
<BLANKLINE>

Example 9
Example 11

thank you. I've been meaning to open a small PR to fix this but have been too lazy to do so.

)


class StoredProcedureProfiler:

might need to cover this file as well?

Comment on lines -118 to +130
self._clause = UpdateMergeExpression(
self._condition_expr,
{Column(k)._expression: Column._to_expr(v) for k, v in assignments.items()},
)
self._clause = UpdateMergeExpression(self._condition_expr, assignments)

I don't understand these changes. Are they unrelated to the AST changes?

Comment on lines -225 to +242
keys = [Column(k)._expression for k in assignments.keys()]
values = [Column._to_expr(v) for v in assignments.values()]
keys = list(assignments.keys())
values = list(assignments.values())
else:
keys = []
values = [Column._to_expr(v) for v in assignments]
values = list(assignments)

same comment here.

Comment on lines +19 to +21
# TODO: connector installed_pandas is broken. If pyarrow is not installed, but pandas is this function returns the wrong answer.
# The core issue is that in the connector detection of both pandas/arrow are mixed, which is wrong.
# from snowflake.connector.options import installed_pandas, pandas

this is a good observation. I'll create a JIRA to replace `from snowflake.connector.options import installed_pandas` with `from snowflake.snowpark._internal.utils import installed_pandas`
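
A minimal sketch of what such a snowpark-local helper could look like, assuming importlib-based detection independent of pyarrow (hypothetical, not the actual `_internal.utils` implementation):

```
from importlib.util import find_spec

# Detect pandas on its own, rather than via the connector's combined
# pandas/pyarrow check, which returns the wrong answer when pyarrow is
# missing but pandas is installed.
installed_pandas = find_spec("pandas") is not None
```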

@@ -41,12 +48,26 @@ def __repr__(self):
def is_primitive(self):
return True

def _fill_ast(self, ast: proto.SpDataType) -> None:

I'm guessing that this would not raise a module-not-found error since we are packaging proto.SpDataType with Snowpark now. I mentioned this to @sfc-gh-lspiegelberg offline, but we should add a merge gate without proto and tzlocal installed to ensure tests work correctly when AST is disabled. This can be added to the daily precommit, similar to https://github.com/snowflakedb/snowpark-python/actions/runs/11609891507/job/32328205853

Comment on lines +272 to +273
if isinstance(start, WindowRelativePosition):
if start == WindowRelativePosition.CURRENT_ROW:

@sfc-gh-jdu can you review this part. I know you added additional window features last quarter.

Comment on lines +2237 to +2255
if isinstance(self._conn, MockServerConnection):
if self._conn._suppress_not_implemented_error:

# TODO: Snowpark does not allow empty dataframes (no schema, no data). Have a dummy schema here.
ans = self.createDataFrame(
[],
schema=StructType([StructField("row", IntegerType())]),
_emit_ast=False,
)
if _emit_ast:
ans._ast_id = stmt.var_id.bitfield1
return ans
else:
# TODO: Implement table_function properly in local testing mode.
# self._conn.log_not_supported_error(
# external_feature_name="Session.table_function",
# raise_error=NotImplementedError,
# )
pass

I'm not sure what is going on here. Can someone elaborate on this change?

Labels: DO-NOT-MERGE, local testing