Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce SSL "bug?" #149

Closed
krancour opened this issue Mar 21, 2016 · 19 comments
Closed

Enforce SSL "bug?" #149

krancour opened this issue Mar 21, 2016 · 19 comments

Comments

@krancour
Copy link
Contributor

I'm not certain this is a bug, but I thought I'd open it up to discussion.

Currently, the choice to enforce the use of HTTPS (redirect if proto is HTTP), is made at the platform / router level. I have #148 open to track a possible enhancement that makes that configurable on an app-by-app basis. Regardless of whether this were configured router-wide or on an app-by-app basis...

If nginx.ssl.enforce: "true", but no cert is available for a given domain, regardless of that is a subdomain of the platform domain, or a "custom" domain, then that app has no vhost listening on 443. The end result is that the request falls to the default vhost and a 404 is returned.

I'm not clear whether this is a bug (maybe enforcing HTTPS should only happen if there's a cert available for the given domain?) or if this is really just the expected behavior... i.e. "Hey... you asked me to enforce HTTPS... you gave me no cert to use... that is a hard failure."

Unless anyone has a very strong opinion that this requires code changes, my approach to this is going to be to clarify the behavior in router's documentation.

@krancour krancour added the bug label Mar 21, 2016
@krancour krancour self-assigned this Mar 21, 2016
@krancour
Copy link
Contributor Author

After further thought...

I am of the opinion that enforcing HTTPS should not be contingent upon cert availability... that would make it too easy for an app admin (in the Google operational model) to circumvent the wishes of a cluster admin by simply not providing a cert for his/her custom domain.

@helgi I am interested in your opinion on this one.

@helgi
Copy link
Contributor

helgi commented Mar 22, 2016

enforce = true + no cert for domain = should explode. Falling back to HTTP in that scenario is breaking the enforce contract / configuration and could lead to some annoying to track down bugs, such as cert not being located (for whatever reason) and then the site is downgraded down to HTTP without anyone realising for a while

@krancour
Copy link
Contributor Author

Ok. So I think this isn't a bug, but does call for a little bit of doc to clarify this behavior. Retagging accordingly.

@krancour krancour added needs docs and removed bug labels Mar 22, 2016
@sstarcher
Copy link

So I just ran into this and it was a little painful to figure out. This was not the behavior of Deis v1 as far as I can remember.

@krancour
Copy link
Contributor Author

@sstarcher, the behavior of v1.x was that if an app with a custom / vanity domain didn't have its own cert, the router / platform cert got used instead. And this would always result in a cert warning.

https://github.com/deis/deis/blob/master/router/rootfs/etc/confd/templates/nginx.conf#L211-L225

imo, how the v2 router handles this same scenario is an improvement. If an app with a custom / vanity domain doesn't have a cert, then it's not listening on 443 at all. (To me, this makes more sense than simply falling back to an incorrect cert.) Therefore, any request to it using HTTPS falls to the default vhost and the user gets both a cert warning (as they also would have in v1) and, if they ignore that warning, a 404. imo, 404 is the correct response for such a scenario because the end user has requested a URL that really does not exist simply by virtue of the correct cert not having been provided.

The more I contemplate this issue, the more I feel the current behavior is exactly what one should expect. As I'd noted some time ago, if anything, this just needs to be documented. What I'm not clear on is where this doc addition would be most effective. @sstarcher since you ran into some of this recently... do you have any thoughts as to where a mention of this would have saved you some grief?

I am contemplating here:

https://github.com/deis/router/blob/master/README.md#ssl

And here:

https://github.com/deis/workflow/blob/master/src/applications/ssl-certificates.md

Something along the lines of:

Please note that an application's domains, including custom ("vanity") domains never listen for HTTPS traffic on port 443 unless an appropriate certificate has been provided. HTTPS requests to such domains will be handled by the router's default virtual host and will result in both a certificate warning and a 404. This can be corrected by providing the appropriate certificate(s).

@sstarcher
Copy link

@krancour I relied on the behavior from v1. We have a few domains that we do not have certs for, but I still redirect over SSL and yes they certainly get cert warnings. If this is the expected behavior for Deis v2 I guess I could respecify my platform cert for everything I don't have a cert for.

Basically Deis v2 if a you configure SSL and the apps name is not the prefix of the domain SSL will fail.
App named public and an additional domain of hello.com is set.
public.a.b - works
public.wha.who - works
hello.com - 404

@krancour
Copy link
Contributor Author

public.a.b - works
public.wha.who - works

But it's really, "works," I think. There are definitely cert warnings in those cases. From an end-user's perspective-- that's broken.

fwiw, I wish those didn't work (or even "work"). It's a quirk of a regex being used to match URLs if the platform domain has not been defined because we wanted an easy install experience targeted toward first-time users, wherein providing any specific configuration was 100% optional and would work even if skipped. This forced the router's router.deis.io/nginx.platformDomain annotation to be optional, with the aforementioned regex being used in its absence.

I cannot see it as being reasonable for any vhost to listen on 443 if it doesn't have an adequate certificate. Deliberately using the wrong cert doesn't seem like it really improves anything.

@sstarcher
Copy link

sstarcher commented May 27, 2016

@krancour That reasoning makes sense, but with the current setup of Deis and how it handles certs and domains I think you are going to make it very error prone for users in that case. I don't know about most users, but I would rather serve a site with the wrong cert and falling back to the platform cert instead of serving a 404.

If Deis had inbuilt error checking when enforce HTTPS was turned on for adding domains without attached certs giving the user an easy way to be aware of the mistake that would go a long way.

This is even more so the case for people coming from Deis v1.

@krancour
Copy link
Contributor Author

krancour commented May 27, 2016

I don't know about most users, but I would rather serve a site with the wrong cert and falling back to the platform cert instead of serving a 404.

This is probably, largely, a matter of perspective. I'm deferring to the principle of least surprise. I think it's less surprising that SSL doesn't work when a appropriate cert hasn't been provided, than for it to "work" using the wrong cert.

@sstarcher
Copy link

My understanding is that was the reason for the existence of the platform cert was as a fallback. So I would classify using the platform cert as the least surprise. Even more so since the platform cert is still used for anything that starts with the app name. Currently an exception is being made for vanity domains that is not applied to APP_NAME.* domains.

@sstarcher
Copy link

I will say for my use case it would also cause me pains if the current functionality of APP_NAME.* domains were to break and require explicit SSL certs.

Internal our primary cluster is *.cluster.private which references either *.green.private or *.blue.private
So if the cluster is *.green.private. My app APP will resolve correctly over SSL for APP.cluster.private and APP.green.private. When we migrate to a new version we never upgrade in place instead we create a second cluster called *.blue.private allowing us to smoke test the system.

After that we load over addresses from *.cluster.private to start using *.green.private. If you were to fully enforce what your talking about it would break my entire workflow.

@krancour
Copy link
Contributor Author

krancour commented May 27, 2016

My understanding is that was the reason for the existence of the platform cert was as a fallback.

I don't know that I'd classify it as such. The platform cert (which is assumed to be a wildcard cert) has always been to provide SSL to the default domains for each app. So if your platform domain is example.com and your platform cert is for *.example.com, that cert is meant to provide SSL for the dozens or hundreds of apps that may be addressable as subdomains of example.com. As we've discussed, in v1, domains lacking a proper cert may have "fallen back" to using the platform cert, but I don't believe that quirk was by design. The likely reason it was acceptable, was because it prevented the request from falling to the default vhost (which in v1, iirc, would have been whichever occurs first alphabetically) and the request, thus, being serviced by the wrong application. Obviously, no one would have wanted that behavior. v2 isn't prone to that issue since we have explicitly implemented a default vhost to act as a "junk drawer" that serves 404s for requests not matched by any other vhost.

Remove yourself from the v1 context for a moment and try to imagine you are brand new to Workflow. Approach armed only with what you know about SSL. If you haven't provided a cert for foo.com, wouldn't HTTPS not working for foo.com be a more reasonable expectation than it (inexplicably) "working," albeit with the completely wrong certificate (*.example.com)?

Another thing to keep in mind here. By having things not work when SSL isn't configured properly, it encourages users to incur the effort to configure SSL properly. Cert warnings are a disservice to users. If a user has to ignore a cert warning to access one of your sites under normal conditions, they won't know when there's actually a problem with their secure connection to your site.

@krancour
Copy link
Contributor Author

The obvious rebuttal to my last argument, of course, would be for non-production apps-- development or staging environments, etc.-- where the end users are developers who understand the cert warning and accompanying risks, but in such cases, vanity domains probably aren't required.

@sstarcher
Copy link

@krancour I would agree with your previous statement if platform domain was a required entity. I could certainly be in the minority.

@krancour
Copy link
Contributor Author

Like I said... I do still wish it were required that we set that. I'm actually going to add mention of it to this doc. I feel that splits the difference between something that should be set and not requiring it to be set so as to preserve the easy installation experience.

@sstarcher
Copy link

@krancour making the platform domain required would certainly destroy my workflow ;)

@krancour
Copy link
Contributor Author

It won't be. But I also don't really understand how it is your workflow (whether in v1 or v2) depends on all the ambiguity that results from the things we've discussed.

@sstarcher
Copy link

@krancour as long as the platform domain is not required I can work around the not falling through to the platform cert. I'll just have to explicitly attach a few domains I normally let fall to the platform cert.

@Cryptophobia
Copy link

This issue was moved to teamhephy/router#21

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants