Preset span suggester KeyError on training #12861

Closed
FaediMichele opened this issue Jul 26, 2023 · 3 comments
Labels
feat / spancat Feature: Span Categorizer

Comments

FaediMichele commented Jul 26, 2023

I want a spancat component that categorizes spans of text suggested by another component. I want to train the spancat on the gold-standard spans, so I'm only interested in the classification part and not in the span suggester. To do this I have training and validation DocBin files; each doc has the span group 'sc' and each span has its own label. In the config file I only replaced the default ngram suggester with spacy.preset_spans_suggester.v1.

[...] 
[components.spancat.suggester]
@misc = "spacy.preset_spans_suggester.v1"
spans_key = "sc"
[...]

As soon as training starts, it crashes in:
[...]/spacy/pipeline/spancat.py", line 123, in preset_spans_suggester
if doc.spans[spans_key]:

So I tried to debug the preset_spans_suggester function by printing the spans that the doc contains, and it prints an empty dict, even though the input data is correct.

EDIT:
I came to the conclusion that preset_spans_suggester is not meant for this. I think I need a component that copies the SpanGroup from the "reference" to the "predicted" document of each Example; then I can use preset_spans_suggester. Am I right?

@shadeMe shadeMe added the feat / spancat Feature: Span Categorizer label Jul 26, 2023
Contributor

shadeMe commented Jul 27, 2023

Have you added the component that suggests the spans to the annotating_components list in the config? That will save its annotations in the predicted documents during training.
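
For reference, a minimal sketch of the config change this reply describes. It assumes the upstream component is registered in the pipeline under the hypothetical name span_setter (that name does not appear in this thread; substitute the real one). Only the relevant keys are shown, and the suggester block is the one from the original post:

```ini
[nlp]
# "span_setter" is a hypothetical name for the component that writes
# doc.spans["sc"]; it has to come before spancat in the pipeline.
pipeline = ["span_setter","spancat"]

[components.spancat.suggester]
@misc = "spacy.preset_spans_suggester.v1"
spans_key = "sc"

[training]
# Components listed here also set their annotations on the predicted
# docs during training, so the suggester can read doc.spans["sc"].
annotating_components = ["span_setter"]
```

With the default corpus reader, the predicted docs are, as far as I understand, built from the raw text only, so without an annotating component they carry no 'sc' span group. That would be consistent with the empty dict reported in the original post.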

@adrianeboyd adrianeboyd added the more-info-needed This issue needs more information label Jul 31, 2023
Contributor

github-actions bot commented Aug 4, 2023

This issue has been automatically closed because there has been no response to a request for more information from the original author. With only the information that is currently in the issue, there's not enough information to take action. If you're the original author, feel free to reopen the issue if you have or find the answers needed to investigate further.

@github-actions github-actions bot closed this as completed Aug 4, 2023
@github-actions github-actions bot removed the more-info-needed This issue needs more information label Aug 4, 2023
Contributor

github-actions bot commented Sep 4, 2023

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 4, 2023