SpanCategorizer: SPAN context vs content #8947
-
When training SpanCategorizer, is there any way of controlling training to emphasize span context over span content? A disclaimer first: I admit very limited knowledge of how spaCy's neural networks work under the hood. From my observation of spaCy's NER (and now SpanCat) behavior, training seems to take into account BOTH the surrounding 'window' (the context) AND the labelled NER/span 'value' (the content). For illustration, suppose the model is trained on a few sentences in which the labelled PLACE entities are 'London' and 'Paris'.
Then the PLACE prediction for the sentence 'It is not too cold in London in winter' may return 'London' because training has 'taught' the model that London (the content) is a PLACE, regardless of the context. In my case, I need to de-emphasize the entity/span content: depending on the context, 'London' above may be a PLACE or carry a totally different entity/span label. The knowledge that London frequently happened to be a PLACE is counter-productive.
-
There is no flag or value you can modify to change this behavior; the model decides how to represent context and token information based on the training data.
What you can do is augment your training data by replacing your labeled entities with other words. If you use common nouns then the model may be able to learn to rely on context more, though I think the effect might not be very significant.
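To make that concrete, here is a minimal sketch of the substitution idea in plain Python. The (start, end, label) tuple format, the augment_example helper, and the replacement list are illustrative assumptions, not spaCy's training-data format; in practice you would apply something like this before building your Example objects, or wrap it in a custom augmenter callback in the training config.

```python
import random

# Hypothetical pool of common-noun replacements; adjust per label.
REPLACEMENTS = ["city", "place", "area", "region", "town"]

def augment_example(text, spans):
    """Replace each labelled span with a random common noun.

    spans: list of (start, end, label) character offsets into text,
    assumed non-overlapping.
    """
    new_text = text
    new_spans = []
    offset = 0  # cumulative length change from earlier replacements
    for start, end, label in sorted(spans, key=lambda s: s[0]):
        replacement = random.choice(REPLACEMENTS)
        start, end = start + offset, end + offset
        new_text = new_text[:start] + replacement + new_text[end:]
        new_spans.append((start, start + len(replacement), label))
        offset += len(replacement) - (end - start)
    return new_text, new_spans

print(augment_example("It is not too cold in London in winter",
                      [(22, 28, "PLACE")]))
# e.g. ('It is not too cold in town in winter', [(22, 26, 'PLACE')])
```

Keeping the original examples alongside the augmented copies usually works better than replacing them outright.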
If you're feeling very adventurous you could try detecting entities, masking them by replacing them with a filler token like XXX, and seeing how that's labelled, to avoid any influence from token identity. But I'm not sure I'd recommend that.
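For reference, a hedged sketch of that probe. The pipeline name en_core_web_sm and the use of doc.ents for detection are my assumptions for illustration; with a spancat pipeline you would read the predicted spans from doc.spans instead.

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed installed model

text = "It is not too cold in London in winter"
doc = nlp(text)

# Replace each detected entity with the filler token XXX, working right
# to left so the character offsets of earlier entities stay valid.
masked = text
for ent in sorted(doc.ents, key=lambda e: e.start_char, reverse=True):
    masked = masked[:ent.start_char] + "XXX" + masked[ent.end_char:]

print("original:", [(e.text, e.label_) for e in doc.ents])
print("masked  :", [(e.text, e.label_) for e in nlp(masked).ents])
```

If the masked run still predicts PLACE for XXX, the model is leaning on context; if the prediction disappears, it was leaning on token identity.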
Are you doing something where you label roles and not just entities? Like if you have "Alice sold Bob a car" and "Bob bought a car from Alice" and you're labelling not PERSON but BUYER and SELLER? That's called Semantic Role Labelling, and it's very similar to NER but harder. You might want to look at the research for that.