doc: Add documentation on how to set up a dedicated Ceph cluster network
Signed-off-by: Gabriel Mougard <gabriel.mougard@canonical.com>
gabrielmougard committed Apr 16, 2024
1 parent 9cee6c1 commit 85b44a1
Showing 10 changed files with 187 additions and 1 deletion.
4 changes: 4 additions & 0 deletions doc/.wordlist.txt
@@ -37,3 +37,7 @@ Jira
VM
YAML
CephFS
disaggregated
subnets
GbE
QSFP
12 changes: 12 additions & 0 deletions doc/explanation/microcloud.rst
@@ -64,6 +64,18 @@ MicroCloud will still be usable, but you will see some limitations:
As a result of this, network forwarding works at a basic level only, and external addresses must be forwarded to a specific cluster member and don't fail over.
- There is no support for hardware acceleration, load balancers, or ACL functionality within the local network.

Dedicated networks for Ceph
~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can set up dedicated networks for Ceph to separate public and internal Ceph traffic.

- You can use one dedicated network for the Ceph public traffic and another dedicated network for the Ceph internal (cluster) traffic. This is a fully disaggregated Ceph network setup (see the example layout below).
  In the fully disaggregated setup, each of your cluster members must have at least two additional network interfaces: one with an IP address in the subnet you specified for the Ceph public network, and one with an IP address in the subnet you specified for the Ceph internal network.
- Alternatively, you can use a single dedicated network for both the public and the internal Ceph traffic. This is a partially disaggregated Ceph network setup.
  In the partially disaggregated setup, each of your cluster members must have at least one additional network interface with an IP address in the subnet you specified for the dedicated Ceph network.
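
For example, in a fully disaggregated setup, each cluster member might use an interface layout like the following (the interface names and subnets are purely illustrative)::

eth0  10.0.0.X/24   MicroCloud internal traffic
eth1  10.0.2.X/24   Ceph public traffic
eth2  10.0.1.X/24   Ceph internal (cluster) traffic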

See :ref:`howto-ceph-networking` for how to set up dedicated networks for Ceph.

Storage
-------

155 changes: 155 additions & 0 deletions doc/how-to/ceph_networking.rst
@@ -0,0 +1,155 @@
.. _howto-ceph-networking:

How to configure Ceph networking
================================

When running :command:`microcloud init`, you are asked if you want to provide custom subnets for the Ceph cluster.
Here are the questions you will be asked:

- ``What subnet (either IPv4 or IPv6 CIDR notation) would you like your Ceph public traffic on? [default: 203.0.113.0/24]: <answer>``
- ``What subnet (either IPv4 or IPv6 CIDR notation) would you like your Ceph internal traffic on? [default: 203.0.113.0/24]: <answer>``

You can skip both questions (press ``Enter``) to use the default value, which is the subnet used for the internal MicroCloud traffic.
This is referred to as a *usual* Ceph networking setup.

.. figure:: /images/ceph_network_usual_setup.png
   :alt: All the Ceph traffic is on the same network interface
   :align: center

However, you might sometimes want to use different network interfaces for different types of Ceph traffic.
Imagine you have machines with network interfaces tailored for high-throughput, low-latency data transfer, such as 100 GbE+ QSFP links, and others better suited for management traffic, such as 1 GbE or 10 GbE links.

In this case, it is probably ideal to put your Ceph internal (cluster) traffic on the high-throughput network interface and the Ceph public traffic on the management network interface. This is referred to as a *fully disaggregated* Ceph networking setup.

.. figure:: /images/ceph_network_full_setup.png
   :alt: Each type of Ceph traffic is on a different network interface, tailored for its usage.
   :align: center

You could also decide to put both types of traffic on the same high-throughput, low-latency network interface. This is referred to as a *partially disaggregated* Ceph networking setup.

.. figure:: /images/ceph_network_partial_setup.png
   :alt: Both the Ceph public and internal traffic are on the same high throughput network interface.
   :align: center

To use a fully or partially disaggregated Ceph networking setup with your MicroCloud, specify the corresponding subnets during the MicroCloud initialisation process.
The following instructions build on the :ref:`get-started` tutorial and show how you can test setting up a MicroCloud with disaggregated Ceph networking inside a LXD setup.

1. Create the dedicated networks for Ceph:
a. Just as you created an uplink network so that the MicroCloud cluster members could have external connectivity, you must create a dedicated network for the Ceph cluster members to communicate with each other. Let's call it ``cephbr0``::

lxc network create cephbr0

b. Create a second network. Let's call it ``cephbr1``::

lxc network create cephbr1

c. Enter the following commands to find out the assigned IPv4 and IPv6 addresses for the networks and note them down::

lxc network get cephbr0 ipv4.address
lxc network get cephbr0 ipv6.address
lxc network get cephbr1 ipv4.address
lxc network get cephbr1 ipv6.address
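
Alternatively, when creating the two networks in steps 1.a and 1.b, you can set their subnets explicitly so that they match the addresses used in the examples below. This is a sketch that assumes the ``10.0.1.0/24`` and ``10.0.2.0/24`` subnets are unused in your environment::

lxc network create cephbr0 ipv4.address=10.0.1.1/24 ipv4.nat=true ipv6.address=none
lxc network create cephbr1 ipv4.address=10.0.2.1/24 ipv4.nat=true ipv6.address=none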

2. Create the network interfaces that will be used for the Ceph networking setup for each VM:
a. Add a network device for the ``cephbr0`` and ``cephbr1`` networks to each VM::

lxc config device add micro1 eth2 nic network=cephbr0 name=eth2
lxc config device add micro2 eth2 nic network=cephbr0 name=eth2
lxc config device add micro3 eth2 nic network=cephbr0 name=eth2
lxc config device add micro4 eth2 nic network=cephbr0 name=eth2
lxc config device add micro1 eth3 nic network=cephbr1 name=eth3
lxc config device add micro2 eth3 nic network=cephbr1 name=eth3
lxc config device add micro3 eth3 nic network=cephbr1 name=eth3
lxc config device add micro4 eth3 nic network=cephbr1 name=eth3
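
You can verify that the devices were added as expected. For example, for ``micro1``::

lxc config device show micro1

The output should list ``eth2`` attached to ``cephbr0`` and ``eth3`` attached to ``cephbr1``.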

3. Now, just like in the tutorial, start the VMs.
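
For example, you can start all four VMs with a single command::

lxc start micro1 micro2 micro3 micro4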

4. On each VM, bring the network interfaces up and give them an IP address within their network subnet:
a. For the ``cephbr0`` network, do the following for each VM::

# If the `cephbr0` gateway address is `10.0.1.1/24`, the subnet is `10.0.1.0/24`
ip link set enp7s0 up
# Replace `X` with a number between 2 and 254 that is unique for each VM
ip addr add 10.0.1.X/24 dev enp7s0

b. Do the same for ``cephbr1`` on each VM::

# If the `cephbr1` gateway address is `10.0.2.1/24`, the subnet is `10.0.2.0/24`
ip link set enp8s0 up
# Replace `X` with a number between 2 and 254 that is unique for each VM
ip addr add 10.0.2.X/24 dev enp8s0
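
Note that addresses added with ``ip addr`` are not persistent across reboots. If you want the configuration to survive a reboot, you could describe the interfaces with Netplan inside each VM instead. The following is a sketch for the first VM (the file name is arbitrary, and the addresses must be unique per member)::

# /etc/netplan/99-ceph.yaml
network:
  version: 2
  ethernets:
    enp7s0:
      addresses: [10.0.1.2/24]
    enp8s0:
      addresses: [10.0.2.2/24]

Apply the configuration with ``netplan apply``.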

5. Now, you can start the MicroCloud initialisation process and provide the subnets you noted down in step 1.c when asked for the Ceph networking subnets.

6. We will use ``cephbr0`` for the Ceph internal traffic and ``cephbr1`` for the Ceph public traffic. In real life, you would choose the fast subnet for the internal Ceph traffic::

What subnet (either IPv4 or IPv6 CIDR notation) would you like your Ceph public traffic on? [default: 203.0.113.0/24]: 10.0.2.0/24

Interface "enp7s0" ("10.0.2.3") detected on cluster member "micro2"
Interface "enp7s0" ("10.0.2.4") detected on cluster member "micro3"
Interface "enp7s0" ("10.0.2.2") detected on cluster member "micro1"

What subnet (either IPv4 or IPv6 CIDR notation) would you like your Ceph internal traffic on? [default: 203.0.113.0/24]: 10.0.1.0/24

Interface "enp7s0" ("10.0.1.3") detected on cluster member "micro2"
Interface "enp7s0" ("10.0.1.4") detected on cluster member "micro3"
Interface "enp7s0" ("10.0.1.2") detected on cluster member "micro1"

7. The MicroCloud initialisation process will now continue as usual and the Ceph cluster will be configured with the networking setup you provided.
8. You can inspect the Ceph network setup:
a. Inspect the Ceph configuration:

.. terminal::
:input: microceph.ceph config dump
:user: root
:host: micro1
:scroll:

WHO MASK LEVEL OPTION VALUE RO
global advanced cluster_network 10.0.1.0/24 *
global advanced public_network 10.0.2.0/24 *
global advanced osd_pool_default_crush_rule 2

b. Generate Ceph-related network traffic by launching an instance on the Ceph-backed ``remote`` storage pool:

.. terminal::
:input: lxc launch ubuntu:22.04 u5 -s remote
:user: root
:host: micro1
:scroll:

Creating u5
Starting u5

c. At the same time, observe the Ceph traffic on the ``enp7s0`` interface (Ceph internal traffic) or the ``enp8s0`` interface (Ceph public traffic) of any cluster member using ``tcpdump``:

.. terminal::
:input: tcpdump -i enp7s0
:user: root
:host: micro2
:scroll:

17:48:48.600971 IP 10.0.1.4.6804 > micro1.48746: Flags [P.], seq 329386555:329422755, ack 245889462, win 24576, options [nop,nop,TS val 3552095031 ecr 3647909539], length 36200
17:48:48.600971 IP 10.0.1.4.6804 > micro1.48746: Flags [P.], seq 329422755:329451715, ack 245889462, win 24576, options [nop,nop,TS val 3552095031 ecr 3647909563], length 28960
17:48:48.601012 IP micro1.48746 > 10.0.1.4.6804: Flags [.], ack 329386555, win 24317, options [nop,nop,TS val 3647909564 ecr 3552095031], length 0
17:48:48.601089 IP 10.0.1.4.6804 > micro1.48746: Flags [P.], seq 329451715:329516875, ack 245889462, win 24576, options [nop,nop,TS val 3552095031 ecr 3647909563], length 65160
17:48:48.601089 IP 10.0.1.4.6804 > micro1.48746: Flags [P.], seq 329516875:329582035, ack 245889462, win 24576, options [nop,nop,TS val 3552095031 ecr 3647909563], length 65160
17:48:48.601089 IP 10.0.1.4.6804 > micro1.48746: Flags [P.], seq 329582035:329624764, ack 245889462, win 24576, options [nop,nop,TS val 3552095031 ecr 3647909563], length 42729
17:48:48.601204 IP micro1.48746 > 10.0.1.4.6804: Flags [.], ack 329624764, win 23357, options [nop,nop,TS val 3647909564 ecr 3552095031], length 0
17:48:48.601206 IP 10.0.1.4.6803 > micro1.33328: Flags [P.], seq 938255:938512, ack 359644195, win 24576, options [nop,nop,TS val 3552095031 ecr 3647909540], length 257
17:48:48.601310 IP micro1.48746 > 10.0.1.4.6804: Flags [P.], seq 245889462:245889506, ack 329624764, win 24576, options [nop,nop,TS val 3647909564 ecr 3552095031], length 44
17:48:48.602839 IP micro1.48746 > 10.0.1.4.6804: Flags [P.], seq 245889506:245889707, ack 329624764, win 24576, options [nop,nop,TS val 3647909566 ecr 3552095031], length 201
17:48:48.602947 IP 10.0.1.4.6804 > micro1.48746: Flags [.], ack 245889707, win 24576, options [nop,nop,TS val 3552095033 ecr 3647909564], length 0
17:48:48.602975 IP 10.0.1.4.6804 > micro1.48746: Flags [P.], seq 329624764:329624808, ack 245889707, win 24576, options [nop,nop,TS val 3552095033 ecr 3647909564], length 44
17:48:48.603028 IP 10.0.1.4.6803 > micro1.33328: Flags [P.], seq 938512:938811, ack 359644195, win 24576, options [nop,nop,TS val 3552095033 ecr 3647909540], length 299
17:48:48.603053 IP micro1.33328 > 10.0.1.4.6803: Flags [.], ack 938811, win 1886, options [nop,nop,TS val 3647909566 ecr 3552095031], length 0
17:48:48.604594 IP micro1.33328 > 10.0.1.4.6803: Flags [P.], seq 359644195:359709355, ack 938811, win 1886, options [nop,nop,TS val 3647909568 ecr 3552095031], length 65160
17:48:48.604644 IP micro1.33328 > 10.0.1.4.6803: Flags [P.], seq 359709355:359774515, ack 938811, win 1886, options [nop,nop,TS val 3647909568 ecr 3552095031], length 65160
17:48:48.604688 IP micro1.33328 > 10.0.1.4.6803: Flags [P.], seq 359774515:359839675, ack 938811, win 1886, options [nop,nop,TS val 3647909568 ecr 3552095031], length 65160
17:48:48.604733 IP micro1.33328 > 10.0.1.4.6803: Flags [P.], seq 359839675:359904835, ack 938811, win 1886, options [nop,nop,TS val 3647909568 ecr 3552095031], length 65160
17:48:48.604751 IP 10.0.1.4.6803 > micro1.33328: Flags [.], ack 359709355, win 24317, options [nop,nop,TS val 3552095035 ecr 3647909568], length 0
17:48:48.604757 IP micro1.33328 > 10.0.1.4.6803: Flags [P.], seq 359904835:359910746, ack 938811, win 1886, options [nop,nop,TS val 3647909568 ecr 3552095035], length 5911
17:48:48.604797 IP micro1.33328 > 10.0.1.4.6803: Flags [P.], seq 359910746:359975906, ack 938811, win 1886, options [nop,nop,TS val 3647909568 ecr 3552095035], length 65160
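
Once the initialisation has completed, you can also confirm that the Ceph cluster is healthy with the configured networks. A quick check, using the same ``microceph.ceph`` alias as above::

microceph.ceph -s

Look for ``HEALTH_OK`` and the expected number of OSDs in the output.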

This example showed how to set up a MicroCloud with a fully disaggregated Ceph networking setup. For a partially disaggregated setup, provide the same subnet for both the Ceph public and internal traffic.
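
If you initialise MicroCloud with a preseed file instead of answering the interactive questions, the same fully disaggregated setup is expressed through the optional ``ceph`` section, using the subnets from this example::

ceph:
  internal_network: 10.0.1.0/24
  public_network: 10.0.2.0/24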
1 change: 1 addition & 0 deletions doc/how-to/index.rst
@@ -11,6 +11,7 @@ These how-to guides cover key operations and processes in MicroCloud.
Install MicroCloud </how-to/install>
Manage the snaps </how-to/snaps>
Initialise MicroCloud </how-to/initialise>
Configure Ceph networking </how-to/ceph_networking>
Add a machine </how-to/add_machine>
Get support </how-to/support>
Contribute to MicroCloud </how-to/contribute>
2 changes: 2 additions & 0 deletions doc/how-to/initialise.rst
@@ -73,6 +73,8 @@ Complete the following steps to initialise MicroCloud:
Wiping a disk will destroy all data on it.

#. You can choose to optionally set up a CephFS distributed file system.
#. Select either an IPv4 or IPv6 CIDR subnet for the Ceph public traffic. You can leave it empty to use the default value, which is the MicroCloud internal network (see :ref:`howto-ceph-networking` for how to configure it).
#. Select either an IPv4 or IPv6 CIDR subnet for the Ceph internal traffic. You can leave it empty to use the default value, which is the MicroCloud internal network (see :ref:`howto-ceph-networking` for how to configure it).
#. Select whether you want to set up distributed networking (using MicroOVN).

If you choose ``yes``, configure the distributed networking:
5 changes: 5 additions & 0 deletions doc/how-to/preseed.yaml
@@ -24,6 +24,11 @@ systems:
- name: micro04
ovn_uplink_interface: eth1

# `ceph` is optional and represents the Ceph global configuration
ceph:
internal_network: 10.0.1.0/24
public_network: 10.0.2.0/24

# `ovn` is optional and represents the OVN & uplink network configuration for LXD.
ovn:
ipv4_gateway: 192.0.2.1/24
Binary file added doc/images/ceph_network_full_setup.png
Binary file added doc/images/ceph_network_partial_setup.png
Binary file added doc/images/ceph_network_usual_setup.png
9 changes: 8 additions & 1 deletion doc/tutorial/get_started.rst
@@ -272,6 +272,8 @@ Complete the following steps:
#. Select all listed disks (these should be ``remote1``, ``remote2``, and ``remote3``).
#. You don't need to wipe any disks (because we just created them).
#. Select ``yes`` to optionally configure the CephFS distributed file system.
#. Leave the answer empty (press ``Enter``) when asked for an IPv4 or IPv6 CIDR subnet for the Ceph public network.
#. Leave the answer empty (press ``Enter``) when asked for an IPv4 or IPv6 CIDR subnet for the Ceph internal network.
#. Select ``yes`` to configure distributed networking.
#. Select all listed network interfaces (these should be ``enp6s0`` on the four different VMs).
#. Specify the IPv4 address that you noted down for your ``microbr0`` network as the IPv4 gateway.
@@ -385,6 +387,11 @@ See the full initialisation process here:
Using 1 disk(s) on "micro3" for remote storage pool

Would you like to set up CephFS remote storage? (yes/no) [default=yes]: yes
Configure a dedicated Ceph network? (yes/no) [default=no]: yes
Choose either an IPv4 or IPv6 subnet (CIDR notation) to describe your Ceph dedicated cluster: 192.168.0.0/24
Interface "enp7s0" ("192.168.0.3") detected on cluster member "micro2"
Interface "enp7s0" ("192.168.0.4") detected on cluster member "micro3"
Interface "enp7s0" ("192.168.0.2") detected on cluster member "micro1"
Configure distributed networking? (yes/no) [default=yes]: yes
Select an available interface per system to provide external connectivity for distributed network(s):
Space to select; enter to confirm; type to filter results.
@@ -540,7 +547,7 @@ You can now inspect your cluster setup.
total space: 29.67GiB
used by: {}

#. Inspect the network setup:
#. Inspect the OVN network setup:

.. terminal::
:input: lxc network list