Add CLI (#4)
* Working initial implementation

With tests

Remove unneeded steps

Comment?

Uncomment

Add more black configuration

Fix test command

Fix mkdir test

Update release workflows

Update tag convention

Fix coverage and ruff source

Add developer notes

Fix coverage configuration

Use a counter for data for shorter file paths

Fix lint

Remove extraneous colon

Install zip for Windows

Add more tests

Use sys.executable for full path to python

Badges

Rename to repzip as a working name

Add codecov.yml configuration

* Rename to rpzip

* Finish renaming to rpzip

* Swap to ruff formatter; fix lint

* Pin below pytest 8 for pytest-cases compatibility

* More README polish

* Update changelog

---------

Co-authored-by: Jay Qi <jayqi@users.noreply.github.com>
jayqi and jayqi authored Jan 28, 2024
1 parent 84c6f1f commit fe6275d
Showing 13 changed files with 489 additions and 52 deletions.
54 changes: 54 additions & 0 deletions .github/workflows/release-cli.yml
@@ -0,0 +1,54 @@
name: release

on:
  push:
    tags:
      - "cli-v*"

jobs:
  build:
    name: Publish CLI release
    runs-on: "ubuntu-latest"
    defaults:
      run:
        working-directory: ./cli

    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"

      - name: Install hatch
        run: |
          pip install hatch

      - name: Check that versions match
        id: version
        run: |
          echo "Release tag: [${{ github.ref_name }}]"
          PACKAGE_VERSION=$(hatch run rpzip --version)
          echo "Package version: [$PACKAGE_VERSION]"
          [ "${{ github.ref_name }}" == "cli-v$PACKAGE_VERSION" ] || { exit 1; }
          echo "::set-output name=major_minor_version::v${PACKAGE_VERSION%.*}"

      - name: Build package
        run: |
          hatch build

      - name: Publish to Test PyPI
        uses: pypa/gh-action-pypi-publish@v1.3.0
        with:
          user: ${{ secrets.PYPI_TEST_USERNAME }}
          password: ${{ secrets.PYPI_TEST_PASSWORD }}
          repository_url: https://test.pypi.org/legacy/
          skip_existing: true

      - name: Publish to Production PyPI
        uses: pypa/gh-action-pypi-publish@v1.3.0
        with:
          user: ${{ secrets.PYPI_PROD_USERNAME }}
          password: ${{ secrets.PYPI_PROD_PASSWORD }}
          skip_existing: false
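
The "Check that versions match" step compares the pushed tag against the version the package reports, then derives a major.minor string with the shell expansion `${PACKAGE_VERSION%.*}`. A minimal Python sketch of that logic follows. The function name `check_release_tag` and its `prefix` parameter are hypothetical and only for illustration; the default prefix `cli-v` is an assumption based on the workflow's tag filter.

```python
def check_release_tag(tag: str, package_version: str, prefix: str = "cli-v") -> str:
    """Return 'vMAJOR.MINOR' if the tag matches the package version.

    Raises ValueError on a mismatch, mirroring the workflow's `exit 1`.
    """
    expected = f"{prefix}{package_version}"
    if tag != expected:
        raise ValueError(f"tag {tag!r} does not match expected {expected!r}")
    # Same effect as ${PACKAGE_VERSION%.*}: drop the shortest trailing
    # ".something" component, if there is one.
    major_minor = package_version.rsplit(".", 1)[0]
    return f"v{major_minor}"


print(check_release_tag("cli-v0.1.0", "0.1.0"))  # v0.1
```

Passing `prefix="v"` would model the equivalent check for library tags such as `v0.3.0`.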
Original file line number Diff line number Diff line change
@@ -1,13 +1,13 @@
-name: release
+name: release-lib

 on:
-  release:
-    types:
-      - published
+  push:
+    tags:
+      - "v*"

 jobs:
   build:
-    name: Build and publish new release
+    name: Publish library release
     runs-on: "ubuntu-latest"

     steps:
48 changes: 15 additions & 33 deletions .github/workflows/tests.yml
@@ -57,6 +57,12 @@ jobs:
         run: |
           pip install hatch

+      - name: Install zip on Windows
+        if: matrix.os == 'windows-latest'
+        run: |
+          choco install zip
+
       - name: Run tests
         run: |
           hatch run tests.py${{ matrix.python-version }}:test
@@ -89,36 +95,12 @@ jobs:
           .venv-sdist/$PYTHON_BIN -m pip install dist/repro_zipfile-*.tar.gz --force-reinstall
           .venv-sdist/$PYTHON_BIN -c "from repro_zipfile import ReproducibleZipFile"

-      # - name: Test building documentation
-      #   run: |
-      #     hatch run docs:build
-      #   if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.10'
-
-      # - name: Deploy site preview to Netlify
-      #   if: |
-      #     matrix.os == 'ubuntu-latest' && matrix.python-version == '3.10'
-      #     && github.event.pull_request != null
-      #   uses: nwtgck/actions-netlify@v1.1
-      #   with:
-      #     publish-dir: "./site"
-      #     production-deploy: false
-      #     github-token: ${{ secrets.GITHUB_TOKEN }}
-      #     deploy-message: "Deploy from GitHub Actions"
-      #     enable-pull-request-comment: true
-      #     enable-commit-comment: false
-      #     overwrites-pull-request-comment: true
-      #     alias: deploy-preview-${{ github.event.number }}
-      #   env:
-      #     NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN }}
-      #     NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }}
-      #   timeout-minutes: 1
-
-  # notify:
-  #   name: Notify failed build
-  #   needs: [code-quality, tests]
-  #   if: failure() && github.event.pull_request == null
-  #   runs-on: ubuntu-latest
-  #   steps:
-  #     - uses: jayqi/failed-build-issue-action@v1
-  #       with:
-  #         github-token: ${{ secrets.GITHUB_TOKEN }}
+  notify:
+    name: Notify failed build
+    needs: [code-quality, tests]
+    if: failure() && github.event.pull_request == null
+    runs-on: ubuntu-latest
+    steps:
+      - uses: jayqi/failed-build-issue-action@v1
+        with:
+          github-token: ${{ secrets.GITHUB_TOKEN }}
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -1,4 +1,8 @@
-# Changelog
+# Changelog — repro-zipfile

+## v0.3.0 (2024-01-27)
+
+- Added a `cli` installation extra for installing the rpzip package, which includes a command-line program
+
 ## v0.2.0 (2024-01-08)

20 changes: 20 additions & 0 deletions CONTRIBUTING.md
@@ -10,6 +10,12 @@ Please file an issue in the [issue tracker](https://github.com/drivendataorg/rep

 This project uses [Hatch](https://github.com/pypa/hatch) as its project management tool.

+### Directory structure
+
+This is a monorepo containing both the repro-zipfile library package and the rpzip CLI package. The root of the repository contains files relevant to the library package, and the CLI package is in the subdirectory `cli/`.
+
+Tests for both packages are combined in `tests/`.
+
 ### Tests

 To run tests in your current environment, you should install from source with the `tests` extra to additionally install test dependencies (pytest). Then, use pytest to run the tests.
@@ -58,3 +64,17 @@ hatch run typecheck
 ### Configuring IDEs with the Virtual Environment

 The default hatch environment is configured to be located in `./venv/`. To configure your IDE to use it, point it at that environment's Python interpreter located at `./venv/bin/python`.
+
+### Releases and publishing to PyPI
+
+Building and publishing the packages is done using GitHub Actions CI. There are two workflows:
+
+- `release-lib` — for the repro-zipfile library package
+- `release-cli` — for the rpzip CLI package
+
+Each package should be released independently.
+
+To trigger a release, publish a release through the GitHub web UI. The tag naming scheme determines which release workflow is triggered:
+
+- `v*` (e.g., `v0.1.0`) to publish repro-zipfile
+- `cli-v*` (e.g., `cli-v0.1.0`) to publish rpzip
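
The tag scheme above can be sketched as a small lookup that reports which workflow a given tag would trigger. This is an illustration only: `TAG_PATTERNS` and `workflows_for_tag` are hypothetical names, and Python's `fnmatch` is used as a stand-in for GitHub Actions' own tag filter matching, which follows somewhat different glob rules.

```python
from fnmatch import fnmatchcase
from typing import List

# Tag patterns from the two release workflows' `on.push.tags` filters.
TAG_PATTERNS = {
    "release-lib": "v*",
    "release-cli": "cli-v*",
}


def workflows_for_tag(tag: str) -> List[str]:
    """Return the names of the workflows whose tag pattern matches `tag`."""
    return [name for name, pattern in TAG_PATTERNS.items() if fnmatchcase(tag, pattern)]


print(workflows_for_tag("v0.1.0"))      # ['release-lib']
print(workflows_for_tag("cli-v0.1.0"))  # ['release-cli']
```

Because `cli-v0.1.0` does not start with `v`, the two patterns never match the same tag, which is what lets each package be released independently.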
35 changes: 32 additions & 3 deletions README.md
@@ -7,11 +7,13 @@
 [![tests](https://github.com/drivendataorg/repro-zipfile/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/drivendataorg/repro-zipfile/actions/workflows/tests.yml?query=branch%3Amain)
 [![codecov](https://codecov.io/gh/drivendataorg/repro-zipfile/branch/main/graph/badge.svg)](https://codecov.io/gh/drivendataorg/repro-zipfile)

-**A tiny, zero-dependency replacement for Python's `zipfile.ZipFile` for creating reproducible/deterministic ZIP archives.**
+**A tiny, zero-dependency replacement for Python's `zipfile.ZipFile` library for creating reproducible/deterministic ZIP archives.**

 "Reproducible" or "deterministic" in this context means that the binary content of the ZIP archive is identical if you add files with identical binary content in the same order. It means you can reliably check equality of the contents of two ZIP archives by simply comparing checksums of the archive using a hash function like MD5 or SHA-256.

-This Python package provides a `ReproducibleZipFile` class that works exactly like [`zipfile.ZipFile`](https://docs.python.org/3/library/zipfile.html#zipfile-objects) from the Python standard library, except that all files written to the archive have their last-modified timestamps set to a fixed value.
+This Python package provides a `ReproducibleZipFile` class that works exactly like [`zipfile.ZipFile`](https://docs.python.org/3/library/zipfile.html#zipfile-objects) from the Python standard library, except that certain file metadata are set to fixed values. See the ["How does repro-zipfile work?" section](#how-does-repro-zipfile-work) below for details.
+
+You can also optionally install a command-line program, **rpzip**. See the ["rpzip command line program"](#rpzip-command-line-program) section further below.

 ## Installation

@@ -45,7 +47,34 @@ Note that files must be written to the archive in the same order to reproduce an

 See [`examples/usage.py`](./examples/usage.py) for an example script that you can run, and [`examples/demo_vs_zipfile.py`](./examples/demo_vs_zipfile.py) for a demonstration in contrast with the standard library's zipfile module.

-For more advanced usage, such as customizing the fixed metadata values, see the following section.
+For more advanced usage, such as customizing the fixed metadata values, see the subsections under ["How does repro-zipfile work?"](#how-does-repro-zipfile-work).

+## rpzip command-line program
+
+[![PyPI](https://img.shields.io/pypi/v/rpzip.svg)](https://pypi.org/project/rpzip/)
+
+You can optionally install a lightweight command-line program, **rpzip**. This includes an additional dependency on the [typer](https://typer.tiangolo.com/) CLI framework. You can install it either directly or using the `cli` extra with repro-zipfile:
+
+```bash
+pip install rpzip
+# or
+pip install repro-zipfile[cli]
+```
+
+rpzip is designed to be a partial drop-in replacement for the ubiquitous [zip](https://linux.die.net/man/1/zip) program. Use `rpzip --help` to see the documentation. Here are some usage examples:
+
+```bash
+# Archive a single file
+rpzip archive.zip examples/data.txt
+# Archive multiple files
+rpzip archive.zip examples/data.txt README.md
+# Archive multiple files with a shell glob
+rpzip archive.zip examples/*.py
+# Archive a directory recursively
+rpzip -r archive.zip examples
+```
+
+In addition to fixing file metadata as repro-zipfile does, rpzip always sorts the paths being written.
+
 ## How does repro-zipfile work?

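
The README's definition of "reproducible" (identical bytes for identical inputs written in the same order, so that checksums can be compared) can be illustrated with the standard library alone. The sketch below is not repro-zipfile's implementation: the helper `build_archive`, the fixed timestamp, and the fixed permission bits are assumptions chosen for the example.

```python
import hashlib
import io
import zipfile

# Assumed fixed metadata values for this sketch only.
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)
FIXED_MODE = 0o644


def build_archive(files: dict) -> bytes:
    """Write name -> bytes pairs into an in-memory ZIP with fixed metadata."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name, data in files.items():
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = FIXED_MODE << 16  # Unix permission bits
            zf.writestr(info, data)
    return buffer.getvalue()


files = {"a.txt": b"hello", "b.txt": b"world"}
first = hashlib.sha256(build_archive(files)).hexdigest()
second = hashlib.sha256(build_archive(files)).hexdigest()
print(first == second)  # True

# Same contents written in a different order give different bytes.
reordered = dict(reversed(list(files.items())))
print(hashlib.sha256(build_archive(reordered)).hexdigest() == first)  # False
```

The second comparison is why write order matters for reproducing an archive.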
5 changes: 5 additions & 0 deletions cli/CHANGELOG.md
@@ -0,0 +1,5 @@
# Changelog — rpzip

## v0.1.0 (2024-01-27)

Initial release! 🎉
20 changes: 20 additions & 0 deletions cli/LICENSE
@@ -0,0 +1,20 @@
MIT License

Copyright (c) 2023 DrivenData Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the “Software”), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
14 changes: 14 additions & 0 deletions cli/README.md
@@ -0,0 +1,14 @@
# rpzip — a CLI backed by repro-zipfile

[![PyPI](https://img.shields.io/pypi/v/rpzip.svg)](https://pypi.org/project/rpzip/)
[![Supported Python versions](https://img.shields.io/pypi/pyversions/rpzip)](https://pypi.org/project/rpzip/)
[![tests](https://github.com/drivendataorg/repro-zipfile/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/drivendataorg/repro-zipfile/actions/workflows/tests.yml?query=branch%3Amain)
[![codecov](https://codecov.io/gh/drivendataorg/repro-zipfile/branch/main/graph/badge.svg)](https://codecov.io/gh/drivendataorg/repro-zipfile)

**A lightweight command-line program for creating reproducible/deterministic ZIP archives.**

"Reproducible" or "deterministic" in this context means that the binary content of the ZIP archive is identical if you add files with identical binary content in the same order. It means you can reliably check equality of the contents of two ZIP archives by simply comparing checksums of the archive using a hash function like MD5 or SHA-256.

This package provides a command-line program named **rpzip**. It is designed as a partial drop-in replacement for the ubiquitous [zip](https://linux.die.net/man/1/zip) program and implements a commonly used subset of zip's interface.

For further documentation, see the ["rpzip command line program"](https://github.com/drivendataorg/repro-zipfile#rpzip-command-line-program) section of the repro-zipfile README.
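
Since a reproducible archive depends on write order, a command-line tool has to turn its input arguments into a stable list of paths before writing. The following is a rough sketch of one way to do that, including recursion into directories. It is not rpzip's actual code: `collect_paths` is a hypothetical helper, and whether directory entries themselves are included is an assumption.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def collect_paths(inputs: Iterable[str], recursive: bool = False) -> List[str]:
    """Expand inputs into a sorted, de-duplicated list of POSIX-style paths."""
    found = set()
    for item in inputs:
        path = Path(item)
        found.add(path.as_posix())
        if recursive and path.is_dir():
            # Walk the directory tree; rglob order is arbitrary, so sort later.
            for child in path.rglob("*"):
                found.add(child.as_posix())
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "examples"
    (root / "sub").mkdir(parents=True)
    for name in ["b.txt", "a.txt", "sub/c.txt"]:
        (root / name).write_text("x")
    paths = collect_paths([str(root)], recursive=True)
    relative = [Path(p).relative_to(root).as_posix() for p in paths]
    print(relative)  # ['.', 'a.txt', 'b.txt', 'sub', 'sub/c.txt']
```

Sorting after collection means the result does not depend on the order in which the filesystem or the shell happened to list the files.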
45 changes: 45 additions & 0 deletions cli/pyproject.toml
@@ -0,0 +1,45 @@
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "rpzip"
dynamic = ["version"]
description = "A lightweight command-line program for creating reproducible/deterministic ZIP archives."
readme = "README.md"
requires-python = ">=3.8"
license = "MIT"
keywords = ["zipfile", "zip", "reproducible", "deterministic", "cli"]
authors = [{ name = "DrivenData", email = "info@drivendata.org" }]
classifiers = [
"Intended Audience :: Developers",
"Intended Audience :: Science/Research",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
"Programming Language :: Python",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Topic :: System :: Archiving",
"Topic :: System :: Archiving :: Compression",
"Topic :: System :: Archiving :: Packaging",
]
dependencies = ["repro-zipfile", "typer>=0.9.0", "typing_extensions>=3.9 ; python_version < '3.9'"]

[project.scripts]
rpzip = "rpzip:app"

[project.urls]
Documentation = "https://github.com/drivendataorg/repro-zipfile#readme"
Issues = "https://github.com/drivendataorg/repro-zipfile/issues"
Source = "https://github.com/drivendataorg/repro-zipfile/tree/main/cli"

[tool.hatch.version]
path = "rpzip.py"

## TOOLS ##

[tool.black]
line-length = 99
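
The `[project.scripts]` entry `rpzip = "rpzip:app"` uses the standard `module:attribute` entry point syntax: installers generate an `rpzip` executable that imports the `rpzip` module and calls its `app` object. A minimal sketch of how such a string is resolved is given here. `load_entry_point` is a hypothetical helper, and `json:dumps` stands in for `rpzip:app` so the example runs without rpzip installed.

```python
import importlib


def load_entry_point(spec: str):
    """Resolve a 'module:attribute' string to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    # The attribute part may be dotted, e.g. "package.module:Class.method".
    for attr in attr_path.split("."):
        if attr:
            obj = getattr(obj, attr)
    return obj


dumps = load_entry_point("json:dumps")
print(dumps({"ok": True}))  # {"ok": true}
```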