
NER Overfitting word position in sentence #9998


Sorry you're having trouble with this; we have never seen a report of this before. I have worked on a similar model and observed the same pattern in the data, though it was long enough ago that I was using a CRF or something like it.

The spaCy NER model does not explicitly encode position as a parameter, so it's hard to point at one thing as the cause. My best guesses are:

  1. Because the tok2vec uses a CNN, if your signal tokens are always on the left edge, that will bias the model: even with convolutions, those tokens will always reach the network through the same weights.
  2. Because the NER model is transition-based, it is overfitting the NULL → BRAND transition at the start.

You might be able to modif…
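One way to counteract both effects is to augment the training data so the entity does not always sit at the start of the sentence. This is a minimal sketch, assuming your annotations use spaCy-style character-offset spans `(start, end, label)`; the `prefixes` list and the `prepend_prefix` helper are hypothetical examples, not part of spaCy's API.

```python
import random

def prepend_prefix(text, entities, prefixes, rng=random):
    """Prepend a random filler phrase and shift entity offsets to match.

    `entities` is a list of (start, end, label) character spans, as used
    in spaCy's training data. `prefixes` is a domain-specific list of
    filler phrases you supply yourself.
    """
    prefix = rng.choice(prefixes) + " "
    shift = len(prefix)
    # Shift every span right by the length of the new prefix.
    shifted = [(start + shift, end + shift, label)
               for start, end, label in entities]
    return prefix + text, shifted

# Example: a BRAND entity that always starts at character 0.
text = "Acme Widgets released a new product."
ents = [(0, 4, "BRAND")]
aug_text, aug_ents = prepend_prefix(text, ents,
                                    ["Yesterday,", "In other news,"])
# The span still covers the same surface string after the shift.
start, end, label = aug_ents[0]
assert aug_text[start:end] == "Acme"
```

Mixing augmented copies like this into the training set means the NULL → BRAND transition and the left-edge CNN weights stop being reliable shortcuts, so the model has to learn the entity itself.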

Answer selected by svlandeg