This project uses Ansible to automate bootstrapping a Kubernetes cluster around FreeIPA infrastructure on DigitalOcean cloud instances running CoreOS Container Linux. Containerized services, including email and web, may be run atop this platform.
FreeIPA provides core services essential to clusters, such as DNS, user/host authentication/authorization, and SSL certificate authorities. A Kubernetes cluster built on this infrastructure orchestrates various containerized services, simplifying service management across the cluster. These components run in CoreOS Container Linux, a minimal OS that can be treated as a low-maintenance cluster node appliance. The OS in turn runs on DigitalOcean droplets for inexpensive, automated provisioning in the cloud. Provisioning and configuration management are automated at all levels with Ansible, and a full cluster may be bootstrapped and running within an hour by issuing a few simple commands.
This project aims to automate setup of a basic Internet site with self-hosted email, web and phone systems. Services are redundant where possible to increase reliability. The components are chosen to scale up, but the scaled-down, three-node bootstrap configuration is complete and meant to be ready for production.
Challenges:
- Bootstrapping CoreOS before FreeIPA DNS and CA are available
- Running etcd and Kubernetes on SSL certs without the IP SANs that all guides call for
Although the goal is production readiness, this is unfinished and experimental work. It is not represented to be fit for any purpose. Readers are strongly advised to contemplate the risks before using this work in any critical scenario.
At present, the Ansible playbooks can complete the following tasks for a 3-node cluster with no manual intervention. The end result is a running Kubernetes cluster, still with a few minor known issues.
FIXME refer to below instead of these bullets
- Provision DigitalOcean cloud servers
  - Configure CoreOS image with ignition
  - Add and configure block storage for filesystems and swap
  - Other basic OS configuration
- Deploy etcd and flanneld with temporary DNS and CA
  - Set up temporary `dnsmasq` container with etcd SRV records
  - Set up temporary `cfssl` CA with etcd TLS certificates
  - Start etcd cluster
  - Configure and run flanneld
  - Configure `etcdctl` client on nodes
- Install FreeIPA server, replicas and clients
  - Add and configure block storage for IPA data
  - Configure and run FreeIPA server install
  - Migrate to FreeIPA DNS
  - Configure and run FreeIPA replica and client installs
  - Harden publicly exposed services
  - Basic DNS and other configuration
- Configure Docker TLS for remote control
  - IPA: Create CA and install monitored service certificates
  - Configure Docker TLS port with client cert authentication
  - Create client cert
  - Restart docker service with TLS configuration
- Migrate etcd to FreeIPA certs
  - IPA: Create CA and install monitored service certificates
  - Set up cluster DNS SRV records
- Configure and start kubernetes
  - IPA: Create CA and install monitored service certificates
  - Template kube-system pod manifests, kubelet service, kube configuration, etc. for API server and other nodes, with inter-node TLS and etcd TLS endpoints
  - Start kubelet service, wait for API availability, and check for pod creation
  - Install and configure `kubectl` client with TLS client certs
  - Install and configure the k8s dns and dashboard add-ons
- Initial site setup (only run once):
    # Copy `hosts.yaml` template and edit
    cp hosts.yaml.example hosts.yaml
    # You will certainly want to edit:
    # - domain_name and kerberos_realm
    # - network_prefix: each host, first 2 octets MUST BE unique
    # You might want to edit:
    # - host names: host1 etc.
    # - additional hosts: minimum 3 hosts
    # - defaults: size_id (default 1gb), region_id (nyc1)
    $EDITOR hosts.yaml
    # Optionally build container if not pulling from docker hub
    ./container -b
    # Start shell in container; the following `ansible-*` commands
    # should all be run in the container
    ./container
    # Set up password vault once; will prompt for DigitalOcean
    # token, FreeIPA admin and directory passwords
    ansible-playbook playbooks/init-site.yaml
- Install whole cluster in one command:
    ansible-playbook playbooks/site.yaml
- Delete a node or the whole cluster:
    # Destroy host1
    ansible-playbook playbooks/destroy.yaml -e confirm=host -l host1
    # Destroy whole cluster
    ansible-playbook playbooks/destroy.yaml -e confirm=all
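The uniqueness requirement on `network_prefix` is easy to violate when adding hosts. A minimal sketch of a pre-flight check follows. It assumes the prefixes have already been read into a plain mapping of host name to prefix string; that mapping shape and the function name `duplicate_prefixes` are hypothetical and not the actual `hosts.yaml` schema.

```python
from collections import defaultdict


def duplicate_prefixes(prefixes):
    """Return {first-two-octets: [hosts]} for prefixes used by more than one host.

    `prefixes` maps host name -> network prefix string such as "10.1"
    or "10.1.0.0" (hypothetical shape; adapt to the real hosts.yaml).
    """
    by_prefix = defaultdict(list)
    for host, prefix in sorted(prefixes.items()):
        # Only the first two octets have to be unique per host.
        key = ".".join(prefix.split(".")[:2])
        by_prefix[key].append(host)
    return {key: hosts for key, hosts in by_prefix.items() if len(hosts) > 1}


if __name__ == "__main__":
    example = {"host1": "10.1", "host2": "10.2", "host3": "10.1"}
    print(duplicate_prefixes(example))
```

An empty result means every host has its own prefix; any entry lists the hosts that collide.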
Getting the IPA server onto a calico network from the start will require a restructure.
Done:
- If bootstrapping:
  - Gen and deploy certs with `cfssl`
  - Point dns at `dnsmasq` in container with `SRV` records
- If not bootstrapping:
  - Gen and deploy certs with IPA
- Bring up etcd cluster with calico
- Point dns at IPA servers
- Provision IPA
TODO:
- Add `SRV` records, etc.
- Re-gen certs with certmonger
- Clean up bootstrapping
- Migrate onto new certs
- Misc commands:
    # Re-collect facts about host
    ansible host1 -m setup \
        -e ansible_python_interpreter=/home/core/bin/python \
        -e ansible_ssh_user=core
    # List all variables for a host
    ansible host1 -m debug -a "var=hostvars[inventory_hostname]"
    # Also vars, environment, group_names, groups
- Ansible ipa_* tests:
  - Testing against live IPA server
        env \
            PYTHONPATH=$(pwd)/lib \
            IPA_HOST=host1.example.com \
            IPA_USER=admin \
            IPA_PASS=mysecretpw \
            IPA_DOMAIN=example.com \
            IPA_NSRECORD=host1.example.com. \
            nosetests -v test/units/modules/identity/ipa/
- Run nosetests in ansible repo
    PYTHONPATH=$(pwd)/lib nosetests -v test/units/modules/identity/ipa/
Run `etcdctl` with SSL:
cd /media/state/etcd
etcdctl --endpoint=https://$(hostname):2379 \
--ca-file=ca.pem --cert-file=cert.pem --key-file=key.pem \
cluster-health
Run shell in running container:
ssh -t core@$HOST0 docker exec -it ipa bash
FreeIPA information commands
# List all servers and replicas
ipa-replica-manage list
# List agreements for a server
ipa-replica-manage list host1.example.com
Querying LDAP needs SASL auth mech explicitly defined
docker exec -it ipa ldapsearch -H ldaps://host1.example.com -Y GSSAPI
Run shell in `ipaclient` container, ready to run emacs
docker exec -it --detach-keys ctrl-^ ipaclient env TERM=screen bash
Misc. kubernetes commands
kubectl config *
kubectl cluster-info
kubectl get cs
kubectl --namespace=kube-system get pods
kubectl --namespace=kube-system describe pods kube-dns-v20-d7crm
kubectl --namespace=kube-system logs kube-dns-v20-d7crm kubedns
kubectl --namespace=kube-system replace --force -f var/k8s/dns-addon.yaml
kubectl --namespace=kube-system delete pods kube-dns-v20-d7crm
kubectl --namespace=kube-system port-forward kubernetes-dashboard-v1.6.0-xcgh7 9090
kubectl --namespace=kube-system exec kube-dns-v20-jh9sb -c kubedns -- nslookup host1
kubectl --namespace=kube-system describe pods kube-dns-v20-jh9sb
At the lowest level, Ansible automates bootstrapping.
Ansible's enormous collection of modules handles 90% of our needs. Custom modules cover the missing 10%; they primarily handle FreeIPA object classes that the roles in this repo use extensively, such as the SSL-related objects CA, CA ACL, certificate and service, and also DNS zones and records. There is also a parted module copied from upstream with bugfixes.
A number of filter plugins, some for specific purposes and some general, simplify playbooks. Some of them turn out to duplicate existing functionality in Ansible, however, and should be factored out.
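For readers unfamiliar with filter plugins, the general shape is sketched below. Ansible loads filters from a class named `FilterModule` whose `filters()` method returns a name-to-callable mapping; the `to_fqdn` filter itself is a hypothetical illustration and is not one of the plugins shipped in this repo.

```python
def to_fqdn(hostname, domain):
    """Append `domain` to a short host name, leaving FQDNs untouched."""
    hostname = hostname.rstrip(".")
    domain = domain.strip(".")
    if hostname == domain or hostname.endswith("." + domain):
        return hostname
    return "{}.{}".format(hostname, domain)


class FilterModule(object):
    """Ansible discovers filters through a class with this name."""

    def filters(self):
        # Keys are the filter names usable in Jinja2 templates.
        return {"to_fqdn": to_fqdn}
```

Saved in a `filter_plugins/` directory next to the playbooks, such a filter could be used in a template as `{{ inventory_hostname | to_fqdn(domain_name) }}`.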
Documentation used during development:
- Glossary of Play, Role, Block, Task directives
- Local actions on stackoverflow, incl. nice syntax
- And CoreOS Container Linux:
- And Docker:
  - Manage Docker with Ansible
  - Docker connection (something similar was merged into Ansible)
- Plugin development:
  - Local facts, set in `ansible.cfg`: `fact_path = /home/centos/ansible_facts.d`
  - Providing cached facts from modules
- This online YAML parser is very helpful
The first step is to provision DigitalOcean droplets with the CoreOS image.
- DigitalOcean API
- Python DigitalOcean API
- And Ansible:
- And CoreOS:
In this configuration, etcd requires DNS SRV records for initial cluster discovery and TLS certificates for communication. Running atop etcd, Calico and its libnetwork plugin manage the Docker networks that containers attach to.
FreeIPA provides the DNS service and TLS certificate authority for
etcd, but requires Calico for a fixed IP routable across cluster
nodes, introducing a chicken-and-egg problem. This is overcome by
bootstrapping etcd with a temporary `dnsmasq` DNS service configured
with cluster discovery SRV records, and a `cfssl` TLS CA to generate
temporary SSL certificates. Once the etcd cluster is initialized, the
DNS service may be torn down: SRV records are no longer needed for
discovery, and `/etc/hosts` takes the place of A records. In later
stages, once FreeIPA is installed, permanent etcd certificates are
generated with a dedicated sub-CA, then installed, monitored and
automatically renewed with certmonger.
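The discovery records for the temporary DNS service can be sketched as follows. The record names `_etcd-server-ssl._tcp` and `_etcd-client-ssl._tcp` and the default ports 2380 and 2379 are the ones etcd's DNS discovery uses, and `srv-host` is the dnsmasq option for SRV records. The function name and the host list are assumptions for illustration; the project's actual dnsmasq template may differ.

```python
def etcd_srv_lines(domain, hosts, peer_port=2380, client_port=2379):
    """Build dnsmasq `srv-host` lines for TLS etcd DNS discovery.

    `hosts` are short host names; each gets one peer and one client record.
    """
    services = (
        ("_etcd-server-ssl", peer_port),    # peer (server-to-server) traffic
        ("_etcd-client-ssl", client_port),  # client traffic
    )
    lines = []
    for service, port in services:
        for host in hosts:
            lines.append(
                "srv-host={svc}._tcp.{dom},{host}.{dom},{port}".format(
                    svc=service, dom=domain, host=host, port=port
                )
            )
    return lines


if __name__ == "__main__":
    for line in etcd_srv_lines("example.com", ["host1", "host2", "host3"]):
        print(line)
```

With records like these in place, each etcd member can be started with only the discovery domain rather than a hard-coded peer list.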
With certificates installed, nodes join the etcd cluster [FIXME]
- Basic configuration:
- SSL on CoreOS:
- Temporary SSL certs and DNS service
  - Generate self-signed certificates with `cfssl`
  - Provide DNS SRV records with `dnsmasq`
  - `dnsmasq` manual
- Flannel networking:
With basic CoreOS clustering in place, first the FreeIPA server and then the replicas and clients may be installed in Docker containers.
FreeIPA consists of many microservices tied together with a web UI and a complex installer. Installation is non-trivial, but there is an official project to containerize FreeIPA server and replicas.
Not all cluster nodes need to run a FreeIPA server. Nodes without a server instead run only the certmonger service, needed to manage local service SSL certificates issued by the remote IPA server. Because the FreeIPA community has not reached consensus on providing an official client container with the certmonger service, one is created from a customized fork for use in this project.
The FreeIPA DNS service is used internally by the cluster, and so the
internal service IP must be fixed and routable across the cluster.
Routing internal IPs across nodes is possible with flannel, but there
is no way to guarantee a fixed container IP on the main `docker0`
network. Instead, a separate Docker 'ipa' network is managed by a
separate flanneld instance, configured with reservations to guarantee
the network address remains fixed on a node, and configured with
30-bit CIDRs to guarantee the container IP remains fixed within the
network.
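The reason a 30-bit prefix pins the container address can be checked with a little arithmetic: such a network has only two usable host addresses. A minimal sketch using Python's standard `ipaddress` module follows. The `10.1.0.0` addresses are hypothetical, and the assumption that the bridge takes the first address, leaving the second for the single container, is Docker's usual behaviour rather than something stated by this project.

```python
import ipaddress


def usable_hosts(cidr):
    """Return the usable host addresses of a network as strings."""
    return [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]


if __name__ == "__main__":
    # A /30 leaves exactly two usable addresses (network and broadcast excluded).
    print(usable_hosts("10.1.0.0/30"))
    # A /24, by contrast, leaves 254 candidates for the container.
    print(len(usable_hosts("10.1.0.0/24")))
```

With one address consumed by the bridge, the only remaining address is the one the container must receive.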
Internally, the CoreOS host will also be enrolled in the FreeIPA domain with SSSD.
FreeIPA services will also be exposed on the Internet for remote clients of the domain. Services therefore must be hardened, for example by disabling DNS recursion and disabling anonymous LDAP queries.
The FreeIPA container does not run inside Kubernetes. It is unknown whether Kubernetes can start, either during initial bootstrapping or after a normal node reboot, while DNS services are not yet available. In any case, at the time of writing the FreeIPA container cannot run in k8s, because shared PID space changes in Kubernetes 1.7 prevent systemd from starting. (This may be fixed in version 1.8; FQDN-based TLS is broken in version 1.6.)
- Docker:
- FreeIPA man-pages:
- Docs:
  - RHEL7 IdM Guide
  - RHEL7 system auth guide certmonger
  - RHEL6 replication docs
  - NSS `certutil`
  - FreeIPA behind SSL proxy
Once CoreOS clustering and FreeIPA are running, Kubernetes may be installed.
Installation follows the (abandoned?) CoreOS documentation, except that TLS certificates identify hosts by FQDN in the subject CN, because FreeIPA cannot create TLS certificates with IP address SANs. This is an unusual use case, but it appears to work (see below).
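One practical consequence of FQDN-only certificates is that every client must address servers by name: an endpoint given as an IP address cannot pass hostname verification against a certificate that carries no IP SANs. A minimal sketch of a check for this follows. The function name and the sample URLs are hypothetical, and the check only inspects URLs; it does not contact any server.

```python
import ipaddress
from urllib.parse import urlsplit


def endpoints_using_ip(urls):
    """Return the URLs whose host part is an IP literal rather than a name."""
    flagged = []
    for url in urls:
        host = urlsplit(url).hostname
        if host is None:
            continue  # not a URL with a host part; nothing to check
        try:
            ipaddress.ip_address(host)
        except ValueError:
            continue  # a DNS name, which FQDN-only certificates can match
        flagged.append(url)
    return flagged


if __name__ == "__main__":
    endpoints = ["https://host1.example.com:2379", "https://10.0.0.1:6443"]
    print(endpoints_using_ip(endpoints))
```

Running such a check over templated etcd and API server endpoints before deployment would catch addresses that will fail TLS verification.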
FIXME Ansible management of k8s resources
Previous versions of CoreOS used fleet for container orchestration, but fleet is now deprecated in favor of Kubernetes.
- Kubernetes and CoreOS
  - CoreOS Kubernetes docs
  - Kubernetes CoreOS docs
  - Kubernetes is replacing fleet in CoreOS
- Kubernetes and Ansible
  - Kubernetes module in Ansible
  - Ansible examples incl. etcd2, docker in kubernetes
  - Other projects to set up Kubernetes on CoreOS with Ansible
    - GH thesamet/ansible-kubernetes-coreos
    - GH sebiwi/kubernetes-coreos; adds Vagrant
    - GH deimosfr/ansible-coreos-kubernetes; for "production usage"
- Kubernetes and FreeIPA
  - Kubernetes TLS certs without IP SANs may work
  - FreeIPA won't issue IP SANs
HAProxy for load balancing, but really for reverse-proxying multiple web services on a single IP.
Postfix, Dovecot
- port.direct Harbor integrates K8s and FreeIPA (and others)
- A paper from Tremolo Security about a K8s and FreeIPA integration
DNS recursion is disabled in named.conf and in the IPA config, I
thought, but now it's recursing publicly again. This happened when
the manual iptables rules were removed; now external DNS queries
appear to be coming from the `br-ipa` router address, an internal
`10.0.0.0/8` address.
Also, the `DOCKER-ISOLATION` chain, which blocks packets between the
`br-ipa` and `docker0` bridges, is going to cause problems for
containers on `docker0` attempting to access the local DNS server.
There appears to be no way of fixing this until Docker 17.06, which
supports a `DOCKER-USER` iptables chain. Right now, Docker inserts
the isolation chain at the top of the forward chain every time it
restarts, and maybe even more often.
The solution will inevitably be to restore the manual iptables rules
for the `br-ipa` network.
...Or, use calico?
Create docker network using `docker network create --driver calico --ipam-driver calico-ipam`, then specify container IP:
https://docs.projectcalico.org/v2.5/getting-started/docker/tutorials/ipam
To do this, use the `dockerd --cluster-store` stuff.
https://docs.projectcalico.org/v2.5/getting-started/docker/installation/requirements
How to run the calico container:
https://docs.projectcalico.org/v2.5/getting-started/docker/installation/manual
- FIXME The current `playbooks/roles/calico-deploy/templates/calico-rkt.service.j2` needs to be fixed with the right resolv.conf or something.
Later, integrate with k8s:
https://docs.projectcalico.org/v2.5/getting-started/kubernetes/
- Container IP address issues:
  - nsupdate `incorrect section name` error
  - `update_server_ip_address` rationale
  - `ipa-server-install-options` issues; includes mention of `incorrect section name` problem and `update_server_ip_address` function
    - Points to BZ 1377973, about `--ip-address=$IP` where `$IP` is not configured on any container interface; apparently fixed in v. 4.5
- Container FreeIPA v. 4.5 support
- Script debugging options PR #156
- PR for client-mode
Verify that top-level client certs can't auth against services configured with sub-CAs: Docker, etcd, k8s
These should be added to automation:
- Connect replica servers to eliminate host1 as SPOF
    ipa-replica-manage connect host2.example.com host3.example.com
- Create ipa sidekick `/etc/resolv.conf` service to install FreeIPA/Google DNS servers at start/stop
DO isn't the cheapest anymore, and Dogtag struggles in a $10/month 1gb droplet.
- Scaleway: For 30% less money, get 300% more CPU and RAM and 150% more disk, and 300% more disk for the same money. Apparently no Ansible modules, but there is an API and a (read-only?) Python module.
- OVH VPS: For 30% less money, get 50% less CPU, 300% more RAM, 60% less disk, in NA
Possibilities:
- Kubernetes Logging Agent For Elasticsearch add-on
- Kubernetes Logging Using Elasticsearch and Kibana docs
- Deis.com blog, Kubernetes Logging With Elasticsearch and Kibana
- Security recommendations say don't schedule pods on k8s master
- However, this is possible; see master isolation docs
- Enrol CoreOS in IPA
- This probably won't work at all right now.
http://docs.ansible.com/ansible/latest/playbooks_loops.html#looping-over-the-inventory
docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' ipa
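The same lookup can be done from a script by parsing the JSON that `docker inspect` prints without `--format`. A minimal sketch follows, assuming the usual output layout of a JSON list with one object per container; the function name and the sample address are hypothetical.

```python
import json


def container_ips(inspect_json):
    """Map network name -> IP address from `docker inspect <container>` output."""
    container = json.loads(inspect_json)[0]  # one entry per inspected container
    networks = container["NetworkSettings"]["Networks"]
    return {name: net["IPAddress"] for name, net in networks.items()}


if __name__ == "__main__":
    sample = '[{"NetworkSettings": {"Networks": {"ipa": {"IPAddress": "10.1.0.2"}}}}]'
    print(container_ips(sample))
```

Unlike the Go template, this keeps the network names, which helps when a container is attached to more than one network.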
See the `--cluster-store` argument for doing something in etcd
https://docs.docker.com/engine/reference/commandline/dockerd/