Spacy 2.0 fails to label plural proper nouns as NNPS #1254

Closed
anna-hope opened this issue Aug 10, 2017 · 4 comments
Labels
lang / en (English language data and models), models (Issues related to the statistical models), perf / accuracy (Performance: accuracy)

Comments


anna-hope commented Aug 10, 2017

Sample input:

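>>> import spacy
>>> nlp = spacy.load("en_core_web_sm")  # one of the models listed below
>>> doc = nlp("Would this work with 5 iPhones?")
>>> doc[5].tag_  # token 5 is "iPhones"; expected NNPS
'NNP'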

The same happens with every other plural proper noun I have tested. spaCy 1.9 correctly tags "iPhones" as NNPS.

Info about spaCy

  • spaCy version: 2.0.0a7
  • Platform: Linux-4.4.0-43-Microsoft-x86_64-with-debian-stretch-sid
  • Python version: 3.6.1
  • Models: en_core_web_sm, en_default
ines added the performance, 🌙 nightly, lang / en, and models labels on Oct 4, 2017
Member

ines commented Oct 27, 2017

Thanks for the report! I just tested it with the latest en_core_web_sm on the version currently on develop, and I'm getting NNS. That's still wrong, but this time in the other direction: a plural common noun instead of a singular proper noun.

We're about to push some changes to the parser and are currently training new models, so once they're ready, I'll check again and see how they perform on your example.

Member

ines commented Nov 3, 2017

Update: tried it with the new models and v2.0.0a18 and still getting NNS for your example.

However, for the latest version, we've also updated the training process and the examples, so you might be able to achieve pretty good results by updating the tagger with a few correct NNPS examples. See here for the example code: https://github.com/explosion/spaCy/blob/develop/examples/training/train_tagger.py
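As a rough illustration of what "a few correct NNPS examples" could look like, here is a minimal sketch. It assumes the `(text, {"tags": [...]})` training-data shape and the spaCy v2-style `nlp.update(texts, annotations, sgd=..., losses=...)` call used by the linked script. The sentences, the tags, and the helpers `rough_tokens`, `misaligned`, and `update_tagger` are hypothetical and not part of spaCy. The regex tokenizer is only a stand-in for checking that each example has one tag per token; spaCy's real tokenizer may split text differently. How the optimizer is obtained depends on the spaCy version, so it is passed in rather than created here.

```python
import random
import re

# Hand-labelled examples in the (text, {"tags": [...]}) shape assumed above.
# The sentences and Penn Treebank tags are illustrative, not from the issue.
TRAIN_DATA = [
    ("Would this work with 5 iPhones?",
     {"tags": ["MD", "DT", "VB", "IN", "CD", "NNPS", "."]}),
    ("The Yankees beat the Dodgers.",
     {"tags": ["DT", "NNPS", "VBD", "DT", "NNPS", "."]}),
]


def rough_tokens(text):
    """Rough stand-in for tokenization: word runs and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def misaligned(train_data):
    """Return the texts whose tag count differs from the rough token count."""
    return [text for text, ann in train_data
            if len(rough_tokens(text)) != len(ann["tags"])]


def update_tagger(nlp, optimizer, train_data, n_iter=10, seed=0):
    """Run n_iter shuffled passes of nlp.update over the examples.

    `nlp` is assumed to be a loaded spaCy v2 pipeline and `optimizer` the
    object returned by its training setup; both are supplied by the caller.
    """
    rng = random.Random(seed)
    data = list(train_data)
    losses = {}
    for _ in range(n_iter):
        rng.shuffle(data)
        for text, annotations in data:
            nlp.update([text], [annotations], sgd=optimizer, losses=losses)
    return losses
```

Running `misaligned(TRAIN_DATA)` before training is a cheap way to catch examples where a tag was dropped or added by mistake. Whether a handful of examples is enough to shift the tagger toward NNPS would need to be checked against held-out sentences.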

ines removed the 🌙 nightly label on Nov 9, 2017
ines added the perf / accuracy label and removed the performance label on Aug 15, 2018
Member

ines commented Dec 14, 2018

Merging this with #3052. We've now added a master thread for incorrect predictions and related reports – see the issue for more details.

ines closed this as completed on Dec 14, 2018

lock bot locked the thread as resolved and limited conversation to collaborators on Jan 13, 2019