Overfitted model #9236
-
Hello, I have trained a new model. It does relatively well on data it has already seen but poorly on new data, so it has a problem generalizing. I suspect this is because of overfitting. Here are my data debug and training results:
The accuracy of 99% suggests overfitting. What are the common methods in spaCy for improving generalization or reducing overfitting?
Replies: 1 comment 2 replies
-
The most common cause of overfitting is a tiny dataset, but that doesn't appear to affect you, so there are some other things you'll have to investigate.
One is that you have a lot of misaligned tokens; you should look into why that's happening. Another is that you have significant overlap between your training and dev sets. That won't cause overfitting directly, but it does make your dev scores look better than they really are, so you'll want to fix it.
Putting aside those problems, what kind of generalization issues is your model having specifically? Is it sensitive to case changes or something? In that case you'll want to look at data augmentation, which is an easy and important way to build robustness.
Another thing you can do is check what your labelled examples look like. You have a lot of training data, but how many different words are actually under a given label? Maybe you don't have enough variation in your training data for some reason. Without seeing it, it's hard to say more about this. I think this is especially likely given that you're getting 99 F1 even on your dev set.
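To make those checks concrete, here is a rough sketch in plain Python. It assumes your examples are available as `(text, [(start, end, label), ...])` tuples with character offsets; that format, the function names, and the toy data are all hypothetical, so adapt them to however your corpus is stored. The lowercasing step is only a stand-in to show the idea of augmentation; spaCy also ships built-in augmenters that can be set in the training config.

```python
from collections import defaultdict


def find_overlap(train, dev):
    """Return dev texts that also appear verbatim in the training set."""
    train_texts = {text for text, _ in train}
    return [text for text, _ in dev if text in train_texts]


def surface_forms_per_label(examples):
    """Map each label to the set of distinct lowercased strings annotated with it."""
    forms = defaultdict(set)
    for text, spans in examples:
        for start, end, label in spans:
            forms[label].add(text[start:end].lower())
    return dict(forms)


def lowercase_copies(examples):
    """Naive augmentation: append a lowercased copy of each example.

    Copies are only added when lowercasing changes the text and keeps its
    length, so the character offsets of the spans stay valid.
    """
    augmented = list(examples)
    for text, spans in examples:
        lowered = text.lower()
        if lowered != text and len(lowered) == len(text):
            augmented.append((lowered, spans))
    return augmented


# Toy data in the assumed (text, spans) format.
train = [
    ("Apple opened a store in Berlin", [(0, 5, "ORG"), (24, 30, "GPE")]),
    ("Apple hired staff in Paris", [(0, 5, "ORG"), (21, 26, "GPE")]),
]
dev = [
    ("Apple opened a store in Berlin", [(0, 5, "ORG"), (24, 30, "GPE")]),
    ("Google opened an office in Rome", [(0, 6, "ORG"), (27, 31, "GPE")]),
]

overlap = find_overlap(train, dev)        # dev texts leaked from train
forms = surface_forms_per_label(train)    # distinct strings per label
augmented = lowercase_copies(train)       # originals plus lowercased copies
```

If a label with thousands of annotations turns out to have only a handful of distinct strings, the model has probably memorized those strings rather than learned the context around them, which would fit a near-perfect dev score combined with poor results on new data.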