alan-turing-institute/arc-selective-forgetting

Selective Forgetting

This project builds on the TOFU paper and codebase (Maini, Feng, Schwarzschild et al., 2024), which published a dataset for benchmarking approximate unlearning methods, that is, techniques for removing knowledge of a concept from a large language model.

We explore two research questions:

  • Does the granularity of the concepts being forgotten affect the quality of forgetting that can be achieved? By granularity we mean a concept's position in a hierarchy of concepts: for example, a book is written by an author, an author writes multiple books, and a publisher publishes multiple authors.

  • Does removing a relationship between two entities reduce the performance of the model on unrelated questions about those entities?

To address these questions, we create a new TOFU-inspired question-answer dataset containing entities of different types (publishers, authors, books), in which each question-answer pair is labelled with the entities it refers to.
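To illustrate how entity labels make it possible to forget at different granularities, here is a minimal sketch. It is not the arcsf API: the `QAPair` class, the `split_forget_retain` function, the entity IDs, and the toy questions are all hypothetical, chosen only to show the idea of splitting a labelled dataset into forget and retain sets.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class QAPair:
    """Hypothetical question-answer record labelled with the entities it refers to."""
    question: str
    answer: str
    # Maps entity type ("publisher", "author", "book") to an entity ID.
    entities: Dict[str, str] = field(default_factory=dict)


def split_forget_retain(
    pairs: List[QAPair], entity_type: str, entity_id: str
) -> Tuple[List[QAPair], List[QAPair]]:
    """Split pairs into those that refer to the target entity and the rest."""
    forget = [p for p in pairs if p.entities.get(entity_type) == entity_id]
    retain = [p for p in pairs if p.entities.get(entity_type) != entity_id]
    return forget, retain


# Toy data (invented for illustration): one publisher with two authors,
# plus a second publisher.
pairs = [
    QAPair("Who wrote Book A?", "Author X",
           {"publisher": "pub_1", "author": "author_x", "book": "book_a"}),
    QAPair("What genre is Book B?", "Mystery",
           {"publisher": "pub_1", "author": "author_x", "book": "book_b"}),
    QAPair("Where was Author Y born?", "Lisbon",
           {"publisher": "pub_1", "author": "author_y"}),
    QAPair("When was Publisher 2 founded?", "1990",
           {"publisher": "pub_2"}),
]

# The forget set grows as the target moves up the hierarchy.
for entity_type, entity_id in [("book", "book_a"),
                               ("author", "author_x"),
                               ("publisher", "pub_1")]:
    forget, retain = split_forget_retain(pairs, entity_type, entity_id)
    print(f"{entity_type}={entity_id}: forget {len(forget)}, retain {len(retain)}")
```

In this toy setup, targeting a book selects a single pair, while targeting its publisher also selects every pair about that publisher's authors and books. That difference in scope is what the first research question is about.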

Installation/Development

To install the dependencies and the arcsf library, run pip install . in your preferred virtual environment.

We developed the code with Poetry:

  1. Install dependencies with Poetry:

    poetry install

  2. Install pre-commit hooks:

    poetry run pre-commit install --install-hooks

Usage

See the READMEs for the config files and the scripts.

About

ARC project repository for Selective Forgetting
