Current status of the Obsoletes metadata? #154

Open
pganssle opened this issue May 26, 2018 · 10 comments

@pganssle
Member

From a question on the PyPA IRC, I looked into how one would specify that one project obsoletes another one (such as when the PyPI name changes but the package provides the same importable name). It seems that setuptools supports the obsoletes keyword and will successfully produce a PKG-INFO with the right information. You can use the following example to test this:

Steps to reproduce:

1. Create initial packages:

mkdir -p test_obsoletes/testpkg test_obsoleted/testpkg
touch test_obsoletes/setup.py test_obsoletes/testpkg/__init__.py
touch test_obsoleted/setup.py test_obsoleted/testpkg/__init__.py

Result:

$ tree
.
├── test_obsoleted
│   ├── setup.py
│   └── testpkg
│       └── __init__.py
└── test_obsoletes
    ├── setup.py
    └── testpkg
        └── __init__.py

2. Populate the setup.py:

test_obsoletes/setup.py:

from setuptools import setup, find_packages

setup(name='testpkg',
      packages=find_packages(),
      version='0.0.2',
      obsoletes=['testpkg0']
      )

test_obsoleted/setup.py:

from setuptools import setup, find_packages


setup(name='testpkg0',
      packages=find_packages(),
      version='0.0.1'
      )

3. Check the metadata

Looks like the obsoletes keyword properly produces the Obsoletes metadata:

$ cd test_obsoletes
$ python setup.py sdist
...
$ cat testpkg.egg-info/PKG-INFO 
Metadata-Version: 1.1
Name: testpkg
Version: 0.0.2
Summary: UNKNOWN
Home-page: UNKNOWN
Author: UNKNOWN
Author-email: UNKNOWN
License: UNKNOWN
Description: UNKNOWN
Platform: UNKNOWN
Obsoletes: testpkg0
$ cd ..
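
The same check can be done programmatically rather than with cat. The following is a minimal sketch, assuming only that PKG-INFO uses the RFC 822 style header layout shown above. The helper name read_obsoletes and the trimmed SAMPLE text are illustrative and not part of any packaging tool.

```python
from email.parser import Parser


def read_obsoletes(pkg_info_text):
    """Return the Obsoletes and Obsoletes-Dist entries found in PKG-INFO text.

    Hypothetical helper for illustration only.
    """
    msg = Parser().parsestr(pkg_info_text)
    # Both fields are multiple-use, so collect every occurrence with get_all.
    return {
        "Obsoletes": msg.get_all("Obsoletes") or [],
        "Obsoletes-Dist": msg.get_all("Obsoletes-Dist") or [],
    }


# Trimmed copy of the PKG-INFO produced in the step above.
SAMPLE = """\
Metadata-Version: 1.1
Name: testpkg
Version: 0.0.2
Obsoletes: testpkg0
"""

print(read_obsoletes(SAMPLE))
# {'Obsoletes': ['testpkg0'], 'Obsoletes-Dist': []}
```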

4. Make a virtualenv

python3.6 -m virtualenv temp
source temp/bin/activate

5. Install the obsoleted package

pip install ./test_obsoleted

Result:

$ pip freeze
testpkg0==0.0.1

6. Install the obsoleting package:

pip install ./test_obsoletes

Result:

$ pip freeze
testpkg==0.0.2
testpkg0==0.0.1

The current version of the distutils documentation indicates that you can use the obsoletes keyword, and indeed this does produce the right metadata, but pip doesn't actually do anything with that information. I am guessing that this is because obsoletes is deprecated, but is there no replacement for this? Is there no way to say, "You should not have both this package and that package installed at the same time"?

The options I see:

  1. pip starts respecting Obsoletes and will uninstall packages obsoleted by what is currently being installed. This may cause things to break, so there may need to be a temporary workaround to prevent conflicts from arising.
  2. pip starts issuing warnings when an obsoleted package is present ("Hey this obsoletes X which you also have installed. You probably want to do pip uninstall X"). Not sure how well this will work if the obsoleted and obsoleting packages provide the same files.
  3. If a replacement mechanism is already in place, document it. Make it pretty clear in distutils and setuptools what the new way is, and start issuing deprecation warnings when obsoletes is found in a setup.py.
  4. If a replacement mechanism is not in place, develop it and then GOTO 3.
  5. If no replacement mechanism exists and none is desired, again, make it very clear in the documentation and with deprecation warnings that this keyword does nothing and have some clear explanation for what you are supposed to do in the situation that Obsoletes was developed for.
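
To make option 2 concrete, here is a rough sketch of a warning-only check that could run outside pip. It is not something pip does. It assumes that Obsoletes and Obsoletes-Dist entries name PyPI projects (the thread later notes that Obsoletes was meant for importable names), that the installed metadata still carries those fields, and that version specifiers can be ignored. All function names here are hypothetical.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a project name the way PEP 503 does for comparisons."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_name(entry):
    """Extract the project name from an entry such as 'Gorgon (<1.0)'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", entry)
    return normalize(match.group(1)) if match else None


def find_obsoleted(declared):
    """Return (obsoleting, obsoleted) pairs where both projects are installed.

    `declared` maps each installed project name to the list of
    Obsoletes / Obsoletes-Dist entries found in its metadata.
    """
    installed = {normalize(name) for name in declared}
    pairs = []
    for name, entries in declared.items():
        for entry in entries:
            target = project_name(entry)
            if target in installed and target != normalize(name):
                pairs.append((name, target))
    return sorted(pairs)


def scan_environment():
    """Build the `declared` mapping from the current environment."""
    declared = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name")
        if name:
            declared[name] = (meta.get_all("Obsoletes") or []) + (
                meta.get_all("Obsoletes-Dist") or []
            )
    return declared
```

Calling find_obsoleted(scan_environment()) would yield the pairs to warn about, for example ("testpkg", "testpkg0") in the reproduction above, provided the Obsoletes field survives into the installed metadata.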
@pganssle
Member Author

I'm noticing that Obsoletes-Dist is part of the Core Metadata specification. So presumably we're looking at option 3?

@ncoghlan
Member

ncoghlan commented May 27, 2018

It's Option 4 - the problem we have with Obsoletes-Dist is that it means that a hostile fork can claim that it obsoletes the existing project even when the maintainers of the original project disagree.

The proposed replacement is an Obsoleted-By field that goes on the original project, thus ensuring it can only be set with the cooperation of the original authors.

There isn't a currently active doc that discusses that, but earlier versions of PEP 426 had draft definitions for new Provides and Obsoleted-By fields that attempted to handle the fact that PyPI isn't a curated repo and hence arbitrary packages hosted there need to be assumed to be potentially hostile: https://github.com/python/peps/blob/918d676de1375aa3d6f16c3ee3c1117887e68714/pep-0426.txt#L863

(Marking completely defunct projects as obsolete would need to be handled via a different mechanism that didn't require a new release of the obsolete project, but I figure if we ever do that it will be better handled as an online service, potentially even part of Warehouse, with an explicit governance process akin to PEP 541's handling of name reassignments)

@pganssle
Member Author

pganssle commented May 27, 2018

> It's Option 4 - the problem we have with Obsoletes-Dist is that it means that a hostile fork can claim that it obsoletes the existing project even when the maintainers of the original project disagree.

> The proposed replacement is an Obsoleted-By field that goes on the original project, thus ensuring it can only be set with the cooperation of the original authors.

This situation already exists, because right now what happens is that if two packages provide the same importable name, pip will just overwrite the files with whichever one was installed (or uninstalled) most recently. You can test this with my original example by putting print('Obsoleted') and print('Obsoletes') in the respective __init__.py files. You get:

$ pip install ./test_obsoletes
...
$ python -c 'import testpkg'
Obsoletes
$ pip install ./test_obsoleted
...
$ python -c 'import testpkg'
Obsoleted
$ pip uninstall testpkg0
...
$ pip freeze
testpkg==0.0.2
$ python -c 'import testpkg'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'testpkg'

The only difference that Obsoletes or Obsoletes-Dist would make is that the obsoleted packages are deleted before they are replaced. Alternatively, it could stop the installation of the obsoleting package and say, "The thing you obsolete is in this distribution, uninstall it first or don't install this one."

The Obsoleted-By keyword is also unnecessary, because the motivating problem is that package mylib used to depend on pyldap, which was the maintained fork of python-ldap. python-ldap and pyldap then merged, and pyldap >= 3.0 is just an empty package that depends on python-ldap. Both packages provide the ldap importable name, IIRC. Now mylib wants to change its dependency to be on python-ldap, but when they do so, anyone upgrading will still have the old dependency in their distribution and nothing would trigger it to get upgraded to version 3.0.

So, basically, you can already do an "obsoleted by", and you can also silently do the dangerous thing of overwriting another package's importable name. What you can't do is do the responsible thing and say, "Hey actually this package is intended to be an upgrade of that other package, which is why they're not intended to be installed side-by-side."

The idea that you could just "obsolete" some random package was also my first concern, but in the end I think it's relatively minor. njs on #pypa (my guess was that that's @njsmith but I could be wrong) had a more compelling alternative, which is to widen the field to just Conflicts-Dist, since "I obsolete X" is just a very specific reason that your package conflicts with X. There are other reasons why two packages shouldn't both be installed at the same time, and it's probably worth declaring ahead of time and possibly enforcing at install time (though that last bit definitely needs to be overridable).

@njsmith
Member

njsmith commented May 27, 2018

There was some discussion of this in #pypa today that seems relevant enough to paste here:

[18:07:24] <dstufft> obsoletes is like requires and provides
[18:07:31] <dstufft> it is used to specify import-able names
[18:07:34] <dstufft> not names on PyPI
[18:09:55] <pganssle> dstufft: Is it used for anything? It seems deprecated.
[18:10:01] <dstufft> it is not
[18:10:06] <dstufft> neither is obsoletes-dist
[18:10:23] <dstufft> obsoletes-dist isn't used because it flows in the wrong direction
[18:11:03] <dstufft> e.g. I shouldn't be able to say Django is obsoleted by my random package
[18:33:37] <pganssle> dstufft: I dunno, that was my thought at first, too, but I'm not sure it's a *big* problem.
[18:34:08] <pganssle> What happens if you just have a package that just provides a `django` package now?
[18:35:27] <pganssle> And there's a pretty significant real use case for this, which is people who create drop-in replacements that are intended to displace other software, particularly forks where the original maintainers are unresponsive.
[18:36:21] <pganssle> I am not sure it would be terrible if `ruamel.yaml` had been able to provide just a `yaml` package and declare that it obsoletes `pyyaml`.
[18:37:02] <pganssle> I think right now the conflicting packages fail silently, whereas this would fail at install time if you had drop-in-replacement packages.
[18:38:10] <pganssle> I do think either way it makes sense for it to be able to go the other way, where `pyyaml` can say, "Actually use `ruamel.yaml`", but that leads to the same problem that Xel was discussing above.
[18:38:51] <pganssle> Namely that if all your dependencies explicitly depend on the obsoleting package, the obsoleted package will end up orphaned.
[22:23:13] <njs> pganssle: 'Conflicts-Dist' metadata would definitely be useful to track packages that don't work together (possibly with particular versions) -- there are a number of cases where this happens, beyond forks
[22:23:38] <njs> unfortunately pip needs a resolver before a Conflicts-Dist field can do anything useful
[22:23:43] <njs> but eventually that will happen
[22:24:10] <njs> and I feel like it's unclear what semantics people expect from Obsoletes-Dist, beyond what Conflicts-Dist would do?
[23:34:50] <pganssle> Who wrote the two separate PEPs that created Obsoletes and Obsoletes-Dist?
[23:35:09] <pganssle> I think Conflicts-Dist would probably work fine.
[23:38:19] <njs> AFAIK the non-Dist versions of these fields mostly come from Long Long Ago, when there was no pip or pypi or anything and the metadata spec was mostly based on guesses and good intentions
[23:38:50] <njs> and then the -Dist versions are from an era where people tried to clean up some of the stuff from the previous era, but didn't necessarily rethink it all from scratch
[23:44:02] <pganssle> In any case, yeah, I think opening it up to Conflicts is probably useful in some general sense, but Obsoletes already exists and it has a pretty clear usage.
[23:44:18] <pganssle> It's a bit weird that it's in the spec but the metadata isn't actually used anywhere by anything.
[23:45:04] <njs> it's clear when you might tag something as Obsoletes:, but it's not at all clear what tools should do in response to that tag :-)
[23:46:05] <njs> because it's not making a specific technical claim, it's making a kind of social organizational claim: "we think downstream users should consider switching to this other thing"
[23:52:13] <Xel> getting closer to what a distribution does ;)
[23:53:25] <njs> (a priori, it would even make sense to use the Obsoletes: tag for two packages that can be installed at the same time, but where one is the old deprecated one and people should be switching to the other) 
[00:02:10] <pganssle> Sure, that's fair. Some sort of warning maybe?
[00:02:33] <pganssle> "Hey we notice you have X installed, but Y has marked itself as obsoleting X."
[00:03:14] <pganssle> Obviously that could get annoying. "Hey we notice you have a car, but bicycles, Segways and boosted boards have marked themselves as obsoleting cars."

@ncoghlan
Member

For the current ability to silently corrupt an installation, I think that's a missing feature in pip: if files from a project already exist on disk, but the corresponding install metadata is missing, then pip should refuse to proceed with the installation until you move the other files out of the way or pass a --force option to make the install happen anyway. (It could potentially also keep a reverse-lookup cache somewhere, mapping installed files back to the project that owns them, such that it could put the conflicting project's name in the error message when the file was previously laid down by pip)
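
The reverse-lookup idea can be sketched with the file lists that installers already record. This is only an illustration under stated assumptions: each distribution exposes its recorded files through importlib.metadata, paths are compared as recorded (relative to the install location), and all distributions live in the same site directory. The function names are hypothetical and nothing here reflects how pip is implemented.

```python
from collections import defaultdict
from importlib import metadata


def index_owners(file_lists):
    """Map each recorded path to the set of projects that claim it."""
    owners = defaultdict(set)
    for project, paths in file_lists.items():
        for path in paths:
            owners[str(path)].add(project)
    return owners


def find_collisions(file_lists):
    """Return {path: [owners]} for paths claimed by more than one project."""
    return {
        path: sorted(projects)
        for path, projects in index_owners(file_lists).items()
        if len(projects) > 1
    }


def installed_file_lists():
    """Collect the recorded file list of every installed distribution."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        # dist.files is None when no RECORD or equivalent file list exists.
        if name and dist.files is not None:
            result[name] = [str(path) for path in dist.files]
    return result
```

With the two test packages installed side by side, find_collisions(installed_file_lists()) would be expected to report testpkg/__init__.py as claimed by both testpkg and testpkg0. An installer could consult the same index before writing files and name the owning project in its error message.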

I do think a Conflicts-Dist to straight up prevent co-installation could be an improvement on the status quo.

On some other points:

@pganssle
Member Author

> For the current ability to silently corrupt an installation, I think that's a missing feature in pip

Yes, this is a whole can of worms to open, honestly. It probably needs to be opened at some point, but this causes a lot of headaches at my day job, where we use DPKG, which, sensibly, prevents two packages from installing the same file in the same location. Since pip doesn't do this, a lot of namespace packages (at least those still supporting Python 2, which is probably most of them) are structured such that more than one of the packages in the namespace provides an __init__.py. There are other headaches as well, like resolving what happens when you do pip2 install twine and pip3 install twine and both try to install an entry point into /usr/bin/twine.

In the end something reasonable has to happen and it might involve warnings and reference counts or something, but I think Obsoletes-Dist and Conflicts-Dist are somewhat simpler to handle and can be fixed sooner than that.

I would propose that we work out a plan for Conflicts-Dist and have Obsoletes-Dist be a deprecated special case of Conflicts-Dist. pip should require a --force (or possibly --choose-conflicts or something) in order to allow side-by-side installation of two packages, one of which is obsoleted by the other. Once we have a metadata version 2.2 that includes Conflicts-Dist, setuptools will start issuing deprecation warnings for the obsoletes keyword (or obsoletes_dist or whatever we go with) in favor of a conflicts or conflicts_dist keyword.
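
As a rough sketch of the install-time behaviour proposed here: Conflicts-Dist does not exist in any metadata version, so the field, the ConflictError class, the check_conflicts function and its force parameter are all hypothetical. Matching is by lowercased name only, with no version specifiers.

```python
import warnings


class ConflictError(Exception):
    """Raised when a candidate conflicts with an installed project."""


def check_conflicts(name, conflicts_dist=(), obsoletes_dist=(), installed=(), force=False):
    """Refuse side-by-side installation unless force is set.

    Obsoletes-Dist entries are handled as a deprecated special case of the
    hypothetical Conflicts-Dist field. Returns the clashing project names.
    """
    if obsoletes_dist:
        warnings.warn(
            "Obsoletes-Dist is deprecated; treating it as Conflicts-Dist",
            DeprecationWarning,
            stacklevel=2,
        )
    declared = {entry.lower() for entry in (*conflicts_dist, *obsoletes_dist)}
    clashes = sorted(declared & {entry.lower() for entry in installed})
    if clashes and not force:
        raise ConflictError(
            f"{name} conflicts with installed project(s): {', '.join(clashes)}; "
            "uninstall them first or pass force=True"
        )
    return clashes
```

For the reproduction above, check_conflicts("testpkg", obsoletes_dist=["testpkg0"], installed=["testpkg0"]) would raise ConflictError, while the same call with force=True would return ["testpkg0"] and let the installation proceed.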

@pganssle
Member Author

pganssle commented May 27, 2018

Though I guess @njsmith's point in the chat still needs to be addressed: "Obsoletes" might not mean that the packages conflict; it might just be that one is deprecated in favor of the other. That may weigh in favor of "Obsoletes" being just a warning until "Conflicts" comes around. That said, I'm not really sure why you need to mark deprecation at the packaging metadata level. That can easily be handled at runtime, on websites, etc. The main reason to include it in the packaging metadata is when it's actually a problem to install them at the same time, which would argue in favor of the "special case of Conflicts" approach.

@dan-blanchard

dan-blanchard commented Oct 1, 2018

Is there a currently supported mechanism that can be used to handle package renames like with the msgpack library (see their current README) where you can prevent msgpack-python from being installed if msgpack is installed? As a maintainer of a package that relies on msgpack, I'm trying to figure out a way I can update my setup.py so that people will automatically get switched from msgpack-python to msgpack when they install my package.

@timwhite

timwhite commented Apr 2, 2019

+1 for Conflicts-Dist

@timthelion

Conflicts-Dist is a good solution. It would have saved me hours of work debugging an error that was caused by pipenv intermittently installing either jsonfield or django-jsonfield, depending on the changing order of operations while installing dependencies.
https://stackoverflow.com/questions/57959565/how-can-i-ban-a-package-from-being-added-to-pipenv-lock-and-installed-by-pipenv/57982959#57982959

navytux added a commit to navytux/ZODB that referenced this issue Nov 30, 2020
…atibility with ZODB4 and ZODB5

This depend-only eggs have to be installed manually for
forward-compatibility with ZODB4 and ZODB5, so that e.g. installing
any egg that depends on 'ZODB' (not 'ZODB3') will not pull in ZODB5.
Example of those eggs are zodbtools, zodburi, ...

Unfortunately there is no way for this to work automatically - for
example there is 'provides' and 'provides-dist' setup keywords and
corresponding metadata, but no Python package manager actually supports
that:

    https://packaging.python.org/specifications/core-metadata/#rarely-used-fields
    https://www.python.org/dev/peps/pep-0314/#provides-multiple-use
    https://www.python.org/dev/peps/pep-0345/#provides-dist-multiple

    pypa/packaging-problems#154   (obsoletes/provides not supported)