add simplenet architecture #1679
base: main
Conversation
- added simplenet.py to timm/models
- added simplenet.md to docs/models
- added an entry to docs/models.md
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
@Coderx7 thanks for the PR, looks like a decent lightweight model, but the big stack of layers in a sequential doesn't really line up with other timm models; it makes it hard to support many default features like feature extraction at strided stage boundaries, layer grouping, block-based grad checkpointing, etc. Any chance you could organize the net into stem + stages[blocks[]]?
@rwightman my pleasure. I tried to follow your vgg implementation and implemented everything that was there.
@Coderx7 RexNet is probably the simplest example, ResNetV2 and RegNet are decent examples as well...
I also just refactored Levit to use stages (for feat extraction support), and it's similar to this net in that there aren't strided convs, but a 'downsample' layer that'd be at the start of strided stages.
So looking at the net layout, two possible structures stand out:
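The two specific structures proposed aren't preserved in this thread, but a rough sketch of the stem + stages[blocks[]] layout being asked for might look like the following. All class names, widths, and depths here are invented for illustration; the downsample sits at the start of each strided stage since the net has no strided convs:

```python
import torch.nn as nn

class ConvBlock(nn.Module):
    """3x3 conv + BN + ReLU, the only block type in this plain CNN."""
    def __init__(self, in_chs, out_chs):
        super().__init__()
        self.conv = nn.Conv2d(in_chs, out_chs, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(out_chs)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class Stage(nn.Module):
    """blocks[] at one stride; a pooling 'downsample' leads each strided stage."""
    def __init__(self, in_chs, out_chs, depth, downsample=True):
        super().__init__()
        self.downsample = nn.MaxPool2d(2, 2) if downsample else nn.Identity()
        self.blocks = nn.Sequential(*[
            ConvBlock(in_chs if i == 0 else out_chs, out_chs) for i in range(depth)])

    def forward(self, x):
        return self.blocks(self.downsample(x))

class SimpleNet(nn.Module):
    """stem + stages[blocks[]] + head, mirroring other timm models."""
    def __init__(self, num_classes=1000):
        super().__init__()
        self.stem = ConvBlock(3, 64)
        self.stages = nn.Sequential(
            Stage(64, 128, depth=3),
            Stage(128, 256, depth=3),
            Stage(256, 512, depth=2),
        )
        self.head = nn.Linear(512, num_classes)

    def forward(self, x):
        x = self.stages(self.stem(x))
        return self.head(x.mean((2, 3)))  # global average pool
```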
@rwightman Thanks a lot for the examples. I guess I'll give rexnet a try and hopefully get it refactored soon.
@rwightman :
@Coderx7 structure looks nice for conversion. I usually write a fn for filtering/remapping the checkpoint. See:
Mapping a purely linear 0..num_model_layers to stages is going to be a bit of fun, probably need to use regex, finding a rule that you can increment stage_idx on (i.e. every time out dim changes). Last ditch is just to iterate both state dicts together like the levit example and assume they line up (they should), and assert that the num elements matches...
that checkpoint filter should be passed to the builder, i.e. https://github.com/rwightman/pytorch-image-models/blob/main/timm/models/levit.py#L765
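As a rough illustration of the 'last ditch' remapping described above (iterate both state dicts in parallel, assume they line up, assert the counts match), a minimal sketch, not the PR's actual filter:

```python
def checkpoint_filter_fn(state_dict, model):
    """Remap an old flat checkpoint onto the refactored module names.

    Assumes both state dicts enumerate the same tensors in the same order,
    per the 'iterate both state dicts together' suggestion above.
    """
    model_sd = model.state_dict()
    assert len(state_dict) == len(model_sd), 'num elements must match'
    out = {}
    for (old_k, old_v), (new_k, new_v) in zip(state_dict.items(), model_sd.items()):
        assert old_v.shape == new_v.shape, f'{old_k} -> {new_k} shape mismatch'
        out[new_k] = old_v
    return out
```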
@rwightman I got the checkpoint working, however for some reason when I try the feature extraction it fails.
What should I specify as the module name in the feature_info list? What is it looking for?
feature_info:
and this is the
@Coderx7 feature info should be filled with the module name of the 'deepest' layer for a given stride, so usually the nn.Module before a downsample layer. In this case, you'd want the last block before each pooling layer.
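For reference, the feature_info entries might look roughly like this. Channel counts and module paths here are invented; in the model this list would be assigned to self.feature_info in __init__. Each entry names the deepest module at a given reduction (the last block before the next downsample), and reduction doubles at every strided/pooled layer:

```python
feature_info = [
    dict(num_chs=64,  reduction=2,  module='stages.0.blocks.2'),
    dict(num_chs=128, reduction=4,  module='stages.1.blocks.2'),
    dict(num_chs=256, reduction=8,  module='stages.2.blocks.2'),
    dict(num_chs=512, reduction=16, module='stages.3.blocks.1'),
    dict(num_chs=512, reduction=32, module='stages.4.blocks.1'),
]
```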
@rwightman Thanks, but there are two things here: first, I believe I did just that but still got the same error anyway! I'll give that another try and see how it goes.
In the model create helper you should enable flatten_sequential and ensure the default # of out indices matches the net.
Most models have some sort of pattern and systematic spacing between the strided layers, so I figured that'd be the same here for the configs; I realize they could be put anywhere but it doesn't seem that useful to have no depth between strides. The concept of the stage is essentially to encapsulate the layers at the same stride, and sometimes there are stages w/o any stride but a different width, conv type (depthwise vs not), or other trait in common with all layer repeats in the stage.
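A sketch of what the create helper with flatten_sequential enabled might look like, following the pattern other timm models (e.g. vgg.py, levit.py) use. SimpleNet and checkpoint_filter_fn are the assumed names from the sketches above, and the import path for build_model_with_cfg has moved between timm versions (timm.models.helpers in older releases):

```python
from timm.models._builder import build_model_with_cfg

def _create_simplenet(variant, pretrained=False, **kwargs):
    # default # of out indices should match the number of feature_info entries
    out_indices = kwargs.pop('out_indices', (0, 1, 2, 3, 4))
    return build_model_with_cfg(
        SimpleNet, variant, pretrained,
        pretrained_filter_fn=checkpoint_filter_fn,
        feature_cfg=dict(flatten_sequential=True, out_indices=out_indices),
        **kwargs,
    )
```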
@rwightman Thanks a lot. That's a fair point, however this was never meant to scale that way; it was designed with something completely different in mind. It was meant to show how one could maximize a network's performance under constraints (fixed param count, depth and basic operators) while keeping everything simple and not resorting to any complex strategies. That said, thankfully I seem to have pretty much done everything, and the only remaining issue is that the last stage has a bigger feature map size (thus smaller stride) than its previous counterpart, which timm seems to have issues with.
How should I be handling this other than merging the last two stages?
@rwightman Would you kindly have a look here and tell me what to do for the last part? Thanks.
@Coderx7 reduction is spatial reduction (from the input image size), it's only complained about if it decreases; it's not used directly by timm but some downstream users want to know it for calculating interpolation ratios. If you look at the rexnet example it should *=2 every time there is a strided layer, the majority of imagenet networks are stride 32. num chs does not have any restrictions for increasing/decreasing, although...
@rwightman I thought the idea was to provide feature maps of different sizes for downstream usage, not to capture only the strides of 2 per se. If the check `assert 'reduction' in fi and fi['reduction'] >= prev_reduction` is not disabled, this won't work.
So which option should I take so I can hopefully finish this up?
@rwightman I'd really appreciate it if you could kindly have a look and decide on the next step so I can finalize the changes accordingly and have it finished.
@Coderx7 sorry, I have a lot on my plate right now, wrapping up a few things before I'm on vacation for a bit. I'm going to have to leave this one hanging for a while as I don't think we're on the same page. The net is simple as per its name and I didn't see any merging, or upscaling, or anything that could result in a feature map increasing in size; it's reducing by 2 at each downscale. I feel we're lost in semantics.
@rwightman Out of the last three conv layers, two (2048 and 256) have kernel size 1 but use padding of 1. That causes the feature map size to increase from 7x7 (after the downsampling) to 9x9 (after the first conv1x1), and the next conv1x1 layer increases that to 11x11, thus causing the effective reduction to vary that way.
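That growth is easy to verify with a couple of standalone PyTorch lines (channel counts taken from the comment above):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 256, 7, 7)
conv_a = nn.Conv2d(256, 2048, kernel_size=1, padding=1)
conv_b = nn.Conv2d(2048, 256, kernel_size=1, padding=1)

y = conv_a(x)
print(y.shape)          # torch.Size([1, 2048, 9, 9])   7x7  -> 9x9
print(conv_b(y).shape)  # torch.Size([1, 256, 11, 11])  9x9  -> 11x11
```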
@rwightman May I ask if your vacation is over and if we can hopefully get this last step worked out?
@Coderx7 been trying to get on top of my own tasks since getting back. I looked at this a bit more, and I'm not really liking the padding issue that is the reason for the expanding dim... having a padding of 1 for a 1x1 conv makes zero sense to me. It's adding data to the signal path that's not meaningful. So I'm hesitant to add it altogether with quirks like that present...
@rwightman Thanks, I really appreciate it, knowing how busy your schedule is.
@Coderx7 in deep learning it would seem almost any extra activations (or parameters) can/will be used to improve the loss in optimization, but I'd argue these aren't particularly useful ones (and are possibly harmful for segmentation/obj detection as they'd add a 'border' effect at the feature level). They get blended back into the signal via the subsequent 3x3 conv. I did test these, and per the goal of running faster, the extra padding does have a measurable speed impact (not significant, but there). The rest of the net is fine, simple as per the name, which isn't bad to have in timm as simple nets can be the best option for some tasks. If the padding issue is fixed (padding == kernel_size//2 should do fine for this net) and the model retrained, I'd definitely include it with the tweaks mentioned. Do you have hparams for these? I have two idle 2x Titan RTX machines right now, I could put them to work if you push any outstanding changes re the arch to this PR.
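A tiny sketch of the suggested fix; make_conv is a hypothetical helper, but the padding = kernel_size // 2 rule gives 'same' padding for the 3x3 convs and zero padding for the 1x1s:

```python
import torch
import torch.nn as nn

def make_conv(in_chs, out_chs, kernel_size):
    # padding = kernel_size // 2 keeps spatial size for odd kernels,
    # and gives padding=0 for kernel_size=1 (no artificial border growth)
    return nn.Conv2d(in_chs, out_chs, kernel_size, padding=kernel_size // 2)

x = torch.randn(1, 256, 7, 7)
print(make_conv(256, 2048, 1)(x).shape)  # torch.Size([1, 2048, 7, 7]) - no growth
print(make_conv(256, 256, 3)(x).shape)   # torch.Size([1, 256, 7, 7])  - 'same'
```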
The last two 1x1 convs now use no padding; this is done to bring the architecture in line with timm's standards. Because of this change the pretrained weights are no longer valid and the model needs to be retrained.
These are the updated ImageNet pretrained weights with improved accuracy.
@rwightman Hi, hope you are doing great.
@rwightman It's been a few months since my last changes, could you kindly tell me if everything is OK or if I'm missing something here?
This pull request adds the SimpleNet architecture. SimpleNetV1 is a 2016 architecture comprised of only the most basic operators, forming a plain CNN. It outperformed many deeper and more complex architectures such as VGGNet, ResNet, etc. on several benchmark datasets. These are its results on the ImageNet dataset.
- added simplenet.py to timm/models
- added simplenet.md to docs/models
- added an entry to docs/models.md
Here is some more information concerning how they perform, taken from our official pytorch repository:
SimpleNet performs very decently: it outperforms VGGNet, variants of ResNet and MobileNets (1-3),
and it's pretty fast as well! And it's all using a plain old CNN!
Here's an example of a benchmark run on small variants of simplenet and some other known architectures such as mobilenets.
Small variants of simplenet consistently achieve high performance/accuracy:
And this is a sample for larger models:
Note:
These benchmarks were run on a PC with a GTX 1080, PyTorch 1.11, fp32 and NCHW configuration.
I hope this is useful for the community.