spacy spancat pipeline performance improvement above textcat #11646
-
How much has the SpanCategorizer improved your models? I am curious. I have been using textcat to categorize text with a recall of about 85%, and I wonder how much of a difference applying a span categorizer could make. I am trying to predict whether a question in a questionnaire will elicit confidential (personally identifiable) information, such as a name, telephone number, address, or social security number. Some questions can be very long, and then the textcat gets confused. I expect that being able to catch key terms should improve the prediction, but I wonder how much improvement others have seen in their models. Many thanks for your answers!
-
Let me link #11663, since it's related. As mentioned there, using NER/spancat annotations as input to textcat is possible, but not very likely to help. #10470 covers this approach and links to some other Discussions on the issue. That said, if you do try it we'd love to hear about how it goes!
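To make the "catch key terms" idea from the question concrete, here is a minimal rule-based sketch using spaCy's `Matcher`. It is not the trained spancat component; the cue patterns and the `pii_cues` helper are hypothetical stand-ins for the spans a trained span categorizer would predict, just to show how span hits could be surfaced from a long question before (or alongside) a textcat decision.

```python
import spacy
from spacy.matcher import Matcher

# Blank English pipeline: we only need the tokenizer for this sketch.
nlp = spacy.blank("en")

matcher = Matcher(nlp.vocab)
# Hypothetical PII cue patterns; a trained spancat would learn these
# instead of relying on a hand-written list.
matcher.add("PII_CUE", [
    [{"LOWER": "name"}],
    [{"LOWER": "telephone"}],
    [{"LOWER": "address"}],
    [{"LOWER": {"IN": ["ssn", "social"]}}],
])

def pii_cues(text: str) -> list[str]:
    """Return the matched cue spans found in the question text."""
    doc = nlp(text)
    return [doc[start:end].text for _, start, end in matcher(doc)]

question = "Please enter your full name, telephone number, and home address."
print(pii_cues(question))  # the PII cue terms found in the question
```

A long question that buries one of these cues in unrelated text would still trigger a match, which is the intuition behind expecting span-level signals to help where a document-level textcat gets confused.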