Trying to understand spancat vs NER use-cases #8930

thalishsajeed · 2021-08-11T09:34:52Z

I'm trying to understand situations where I am better off using spancat vs NER.

Description from the docs: "Pipeline component for labeling potentially overlapping spans of text."

Are overlapping entities the only scenario where I should opt for spancat?

Let's say my use case is the WORK_OF_ART entity. Currently, if a span such as "Harry Potter and the Chamber of Secrets" is identified as a WORK_OF_ART, "Harry Potter" no longer gets identified as a name.

Say I am currently using two separate NER models to identify "Harry Potter and the Chamber of Secrets" as WORK_OF_ART and "Harry Potter" as PERSON. Would I be able to train a single spancat model that does this task better than NER?

I'm having trouble establishing boundaries on when I should not use spancat, and why spancat is not a replacement for NER.

Answered by polm

polm · 2021-08-11T13:51:58Z

When you use the default NER model, it has the constraint that a single token can't have more than one label, so it learns tradeoffs between different labels. In contrast, the spancat component can't make that assumption, so it has more limited information to draw from.

To give a concrete if somewhat contrived example, consider these sentences.

1. John lives in XXX.
2. John lives at XXX.

In 1, XXX could be a GPE (country, state, city) or a LOC (non-GPE location). In 2 it would not be a GPE ("John lives at Spain" is not acceptable) but could be a LOC ("the North Pole").

In spancat these associations would have to be learned separately for each label type, since the fact that "lives at" is followed by a LOC doesn't rule out a GPE (multiple labels are possible and the decisions are independent). But in the default NER model, when it sees that "lives at" is often followed by a LOC, it automatically takes part of the probability space away from GPE and other labels (because it has to). Because this is a single step in the basic NER model, there are fewer invalid or useless states for it to get stuck in.

The result of this is that generally the basic NER model should have better accuracy. Because of that I would generally recommend the NER model instead of spancat unless you specifically need some of the spancat features.
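To make the "probability space" argument concrete, here is a minimal sketch in plain Python. It is not how spaCy implements either component (the default NER is transition-based rather than a per-token softmax); the label scores are made-up numbers, and `softmax` and `sigmoid_each` are illustrative helpers. It only contrasts mutually exclusive scoring with independent per-label scoring.

```python
import math


def softmax(logits):
    """Mutually exclusive labels: probabilities compete and sum to 1."""
    m = max(logits.values())
    exps = {label: math.exp(score - m) for label, score in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def sigmoid_each(logits):
    """Independent labels: each label gets its own yes/no probability."""
    return {label: 1 / (1 + math.exp(-score)) for label, score in logits.items()}


# Hypothetical scores for XXX, before and after the model has learned
# that "lives at" is usually followed by a LOC. Only the LOC score changes.
before = {"GPE": 1.0, "LOC": 1.0, "O": 0.0}
after = {"GPE": 1.0, "LOC": 3.0, "O": 0.0}

excl_before, excl_after = softmax(before), softmax(after)
ind_before, ind_after = sigmoid_each(before), sigmoid_each(after)

# Exclusive scoring: more LOC evidence lowers GPE as a side effect.
print("exclusive GPE:", excl_before["GPE"], "->", excl_after["GPE"])
# Independent scoring: GPE is untouched and has to be learned separately.
print("independent GPE:", ind_before["GPE"], "->", ind_after["GPE"])
```

With exclusive labels, evidence for LOC is automatically evidence against GPE; with independent decisions the GPE score stays where it was.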

5 replies

thalishsajeed Aug 11, 2021
Author

Thanks, which leads me to ask: what would be a scenario where one might say that spancat is the better workflow to follow?

polm Aug 12, 2021

You should use spancat if you need the features it provides, like overlapping spans or per-span scores, or if you need to specify candidate spans directly.
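As a rough illustration of the overlapping-spans point, the sketch below uses plain Python rather than spaCy itself. The `(start, end, label)` tuples and the `to_token_labels` helper are hypothetical; they stand in for the difference between one-label-per-token annotations (what `doc.ents` allows) and a free list of spans (what a span group in `doc.spans` allows).

```python
tokens = "Harry Potter and the Chamber of Secrets is a novel".split()

# (start, end_exclusive, label) over token indices; the two spans overlap.
spans = [(0, 7, "WORK_OF_ART"), (0, 2, "PERSON")]


def to_token_labels(n_tokens, spans):
    """One label per token, NER style: fail if two spans claim a token."""
    labels = ["O"] * n_tokens
    for start, end, label in spans:
        for i in range(start, end):
            if labels[i] != "O":
                raise ValueError(f"token {i} is already labeled {labels[i]}")
            labels[i] = label
    return labels


try:
    to_token_labels(len(tokens), spans)
    fits_ner_style = True
except ValueError:
    fits_ner_style = False

# A plain list of spans has no such restriction.
span_texts = {label: " ".join(tokens[start:end]) for start, end, label in spans}

print(fits_ner_style)
print(span_texts)
```

The nested "Harry Potter" inside the title cannot be expressed when every token carries exactly one label, but both spans sit side by side in a span list.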

thalishsajeed Aug 12, 2021
Author

Hmm. Do you think there's a way to use spancat for targeted sentiment analysis / entity-level sentiment? For example, maybe I can pass already-detected NER entities as the candidate spans, and spancat has to categorize each one as NEGATIVE, NEUTRAL or POSITIVE. I'm linking a parallel thread I have started in case you want more details. I'd be interested in hearing your thoughts.

Link - #8931

polm Aug 12, 2021

I saw #8931. I think Thomas's advice is good and you should follow it. If you want to do it with spancat, you can do that by passing a window around your entities or something.

Spancat might work better than splitting + using textcat because it can use the sentence as a whole for reference, but it might not make much difference. You can try both things.
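One way to read "a window around your entities" is sketched below in plain Python. The `window_around` helper, the window size of 3 tokens, and the example sentence and entity offsets are all assumptions for illustration; in a real pipeline the resulting ranges would be handed to spancat as candidate spans through a custom suggester.

```python
def window_around(start, end, n_tokens, size=3):
    """Widen a token span by `size` tokens on each side, clipped to the text."""
    return max(0, start - size), min(n_tokens, end + size)


tokens = "The staff at Acme were rude but the food at Bistro was great".split()

# Hypothetical entities already found by NER: (start, end_exclusive, label).
entities = [(3, 4, "ORG"), (10, 11, "ORG")]

candidates = [window_around(s, e, len(tokens), size=3) for s, e, _ in entities]

for (s, e), (es, ee, _) in zip(candidates, entities):
    # Each window is what a sentiment label would be predicted for.
    print(" ".join(tokens[es:ee]), "->", " ".join(tokens[s:e]))
```

Each candidate then gets its own sentiment label, while the model can still see the rest of the sentence as context.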

In general, when you ask whether X or Y is better in machine learning, the best thing is to try both approaches and measure how well they do.
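For the "measure" part, a minimal sketch of exact-match span scoring in plain Python follows. The gold and predicted spans are invented for illustration, and `prf` is a hypothetical helper, not a spaCy function; a real comparison would score both pipelines on the same held-out data.

```python
def prf(gold, predicted):
    """Exact-match precision, recall and F1 over (start, end, label) spans."""
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Invented gold annotations and outputs of two hypothetical approaches.
gold = {(0, 7, "WORK_OF_ART"), (0, 2, "PERSON")}
pred_a = {(0, 7, "WORK_OF_ART")}
pred_b = {(0, 7, "WORK_OF_ART"), (0, 2, "PERSON"), (4, 5, "LOC")}

scores_a = prf(gold, pred_a)
scores_b = prf(gold, pred_b)
print("approach A:", scores_a)
print("approach B:", scores_b)
```

Comparing the two score tuples on the same data is the simplest version of trying both and seeing which does better.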

thalishsajeed Aug 12, 2021
Author

Thanks!
