MicroCeph Remote Replication (2/3): RBD Mirroring (#437)
# Description
Adds implementation for enabling/disabling rbd mirroring (remote
replication) to a configured MicroCeph remote cluster.

## Type of change

Please delete options that are not relevant.

- [ ] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] CleanCode (Code refactor, test updates, does not introduce
functional changes)
- [X] Documentation update (Contains Doc change)

## How Has This Been Tested?
- [X] CI Tests
- [X] Unit Tests

## Contributor's Checklist

Please check that you have:

- [ ] self-reviewed the code in this PR.
- [X] added code comments, particularly in hard-to-understand areas.
- [X] updated the user documentation with corresponding changes.
- [X] added tests to verify effectiveness of this change.

---------

Signed-off-by: Utkarsh Bhatt <utkarsh.bhatt@canonical.com>
UtkarshBhatthere authored Oct 17, 2024
1 parent e32ad49 commit f1e0421
Showing 35 changed files with 2,826 additions and 60 deletions.
15 changes: 15 additions & 0 deletions .github/workflows/tests.yml
@@ -706,5 +706,20 @@ jobs:
- name: Verify Remote authentication
run: ~/actionutils.sh remote_perform_remote_ops_check

- name: Enable RBD Mirror Daemon
run: ~/actionutils.sh remote_enable_rbd_mirror_daemon

- name: Configure RBD mirror
run: ~/actionutils.sh remote_configure_rbd_mirroring

- name: Wait for RBD mirror to sync images
run: ~/actionutils.sh remote_wait_for_secondary_to_sync

- name: Verify RBD mirror
run: ~/actionutils.sh remote_verify_rbd_mirroring

- name: Disable RBD mirror
run: ~/actionutils.sh remote_disable_rbd_mirroring

- name: Verify Remote removal
run: ~/actionutils.sh remote_remove_and_verify
96 changes: 96 additions & 0 deletions docs/how-to/configure-rbd-mirroring.rst
@@ -0,0 +1,96 @@
==================================
Configure RBD remote replication
==================================

MicroCeph supports asynchronously replicating (mirroring) RBD images to a remote cluster.

An operator can enable replication for an individual RBD image or for a whole pool. Enabling it on a pool enables it for every image in that pool.

Prerequisites
--------------
1. A primary and a secondary MicroCeph cluster, named "primary_cluster" and "secondary_cluster" in this guide.
2. primary_cluster has imported the configuration of secondary_cluster and vice versa. Refer to :doc:`import remote <./import-remote-cluster>`.
3. Both clusters have two RBD pools: pool_one and pool_two.
4. Both pools on primary_cluster contain two images each (image_one and image_two), while the pools on secondary_cluster are empty.

Enable RBD remote replication
-------------------------------

An operator can enable replication for an RBD pool that is present on both clusters as follows:

.. code-block:: none

   sudo microceph remote replication rbd enable pool_one --remote secondary_cluster

Here, pool_one is the name of the RBD pool and secondary_cluster is the name of the imported remote cluster.

Check RBD remote replication status
------------------------------------

The command above enables replication for all the images in pool_one. The replication status can be checked as follows:

.. code-block:: none

   sudo microceph remote replication rbd status pool_one
   +------------------------+----------------------+
   |        SUMMARY         |        HEALTH        |
   +-------------+----------+-------------+--------+
   | Name        | pool_one | Replication | OK     |
   | Mode        | pool     | Daemon      | OK     |
   | Image Count | 2        | Image       | OK     |
   +-------------+----------+-------------+--------+
   +-------------------+-----------+--------------------------------------+
   |    REMOTE NAME    | DIRECTION |                 UUID                 |
   +-------------------+-----------+--------------------------------------+
   | secondary_cluster | rx-tx     | f25af3c3-f405-4159-a5c4-220c01d27507 |
   +-------------------+-----------+--------------------------------------+

The status shows that two images in the pool have mirroring enabled.

Listing all RBD remote replication images
------------------------------------------

An operator can list all the images that have replication (mirroring) enabled as follows:

.. code-block:: none

   sudo microceph remote replication rbd list
   +-----------+------------+------------+---------------------+
   | POOL NAME | IMAGE NAME | IS PRIMARY |  LAST LOCAL UPDATE  |
   +-----------+------------+------------+---------------------+
   | pool_one  | image_one  | true       | 2024-10-08 13:54:49 |
   | pool_one  | image_two  | true       | 2024-10-08 13:55:19 |
   | pool_two  | image_one  | true       | 2024-10-08 13:55:12 |
   | pool_two  | image_two  | true       | 2024-10-08 13:55:07 |
   +-----------+------------+------------+---------------------+

Disabling RBD remote replication
---------------------------------

To stop replicating a resource, disable replication for it. A single image ($pool/$image) or
a whole pool ($pool) can be disabled with a single command, as follows.

Disable pool replication:

.. code-block:: none

   sudo microceph remote replication rbd disable pool_one
   sudo microceph remote replication rbd list
   +-----------+------------+------------+---------------------+
   | POOL NAME | IMAGE NAME | IS PRIMARY |  LAST LOCAL UPDATE  |
   +-----------+------------+------------+---------------------+
   | pool_two  | image_one  | true       | 2024-10-08 13:55:12 |
   | pool_two  | image_two  | true       | 2024-10-08 13:55:07 |
   +-----------+------------+------------+---------------------+

Disable image replication:

.. code-block:: none

   sudo microceph remote replication rbd disable pool_two/image_two
   sudo microceph remote replication rbd list
   +-----------+------------+------------+---------------------+
   | POOL NAME | IMAGE NAME | IS PRIMARY |  LAST LOCAL UPDATE  |
   +-----------+------------+------------+---------------------+
   | pool_two  | image_one  | true       | 2024-10-08 13:55:12 |
   +-----------+------------+------------+---------------------+
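
The ``$pool`` and ``$pool/$image`` resource forms used in this guide can be told apart by splitting on the first ``/``. The Go sketch below illustrates that idea only. ``splitResource`` is a hypothetical helper written for this example and is not taken from the MicroCeph codebase, which may parse resources differently.

```go
package main

import (
	"fmt"
	"strings"
)

// splitResource is a hypothetical helper (not part of MicroCeph). It splits a
// resource string of the form "pool" or "pool/image" at the first "/".
// image is empty when the resource names a whole pool.
func splitResource(resource string) (pool, image string, err error) {
	pool, image, hasImage := strings.Cut(resource, "/")
	if pool == "" {
		return "", "", fmt.Errorf("resource %q has no pool name", resource)
	}
	if hasImage && image == "" {
		return "", "", fmt.Errorf("resource %q has an empty image name", resource)
	}
	return pool, image, nil
}

func main() {
	for _, r := range []string{"pool_one", "pool_two/image_two"} {
		pool, image, err := splitResource(r)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("pool=%q image=%q\n", pool, image)
	}
}
```

An empty image name in the result means the whole pool is addressed, which matches the two forms accepted by the ``disable`` command shown above.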
12 changes: 11 additions & 1 deletion docs/how-to/index.rst
@@ -43,8 +43,18 @@ migrate services and more.
change-log-level
migrate-auto-services
remove-disk
import-remote-cluster

Managing a remote cluster
-------------------------

Make MicroCeph aware of a remote cluster and configure remote replication for
RBD pools and images.

.. toctree::
   :maxdepth: 1

   import-remote-cluster
   configure-rbd-mirroring

Upgrading your cluster
----------------------
98 changes: 98 additions & 0 deletions docs/reference/commands/remote-replication-rbd.rst
@@ -0,0 +1,98 @@
=============================
``remote replication rbd``
=============================

Usage:

.. code-block:: none

   microceph remote replication rbd [command]

Available commands:

.. code-block:: none

   configure   Configure remote replication parameters for RBD resource (Pool or Image)
   disable     Disable remote replication for RBD resource (Pool or Image)
   enable      Enable remote replication for RBD resource (Pool or Image)
   list        List all configured remotes replication pairs.
   status      Show RBD resource (Pool or Image) replication status

Global options:

.. code-block:: none

   -d, --debug       Show all debug messages
   -h, --help        Print help
       --state-dir   Path to store state information
   -v, --verbose     Show all information messages
       --version     Print version number

``enable``
----------

Enable remote replication for RBD resource (Pool or Image)

Usage:

.. code-block:: none

   microceph remote replication rbd enable <resource> [flags]

Flags:

.. code-block:: none

   --remote string      remote MicroCeph cluster name
   --schedule string    snapshot schedule in days, hours, or minutes using d, h, m suffix respectively
   --skip-auto-enable   do not auto enable rbd mirroring for all images in the pool.
   --type string        'journal' or 'snapshot', defaults to journal (default "journal")

``status``
----------

Show RBD resource (Pool or Image) replication status

Usage:

.. code-block:: none

   microceph remote replication rbd status <resource> [flags]

Flags:

.. code-block:: none

   --json   output as json string

``list``
----------

List all configured remotes replication pairs.

Usage:

.. code-block:: none

   microceph remote replication rbd list [flags]

Flags:

.. code-block:: none

   --json          output as json string
   --pool string   RBD pool name

``disable``
------------

Disable remote replication for RBD resource (Pool or Image)

Usage:

.. code-block:: none

   microceph remote replication rbd disable <resource> [flags]

Flags:

.. code-block:: none

   --force   forcefully disable replication for rbd resource
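
The ``--schedule`` flag of ``enable`` is described as a value in days, hours, or minutes with a ``d``, ``h``, or ``m`` suffix. The Go sketch below shows one way such a value could be checked before it is passed to the CLI. The pattern (a positive integer followed by one suffix) and the ``validSchedule`` helper are assumptions made for this example; the formats MicroCeph actually accepts may be broader or narrower.

```go
package main

import (
	"fmt"
	"regexp"
)

// schedulePattern is an assumption made for this sketch: a positive integer
// followed by exactly one of the suffixes d, h or m.
var schedulePattern = regexp.MustCompile(`^[1-9][0-9]*[dhm]$`)

// validSchedule is a hypothetical helper, not a MicroCeph function.
// It reports whether s matches schedulePattern.
func validSchedule(s string) bool {
	return schedulePattern.MatchString(s)
}

func main() {
	for _, s := range []string{"1d", "12h", "30m", "10s", "h"} {
		fmt.Printf("%-4s valid=%t\n", s, validSchedule(s))
	}
}
```

Under these assumptions ``1d``, ``12h``, and ``30m`` pass, while ``10s`` (unsupported suffix) and ``h`` (no number) are rejected.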
145 changes: 145 additions & 0 deletions microceph/api/ops_replication.go
@@ -0,0 +1,145 @@
package api

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"

	"github.com/canonical/lxd/lxd/response"
	"github.com/canonical/lxd/shared/logger"
	"github.com/canonical/microceph/microceph/api/types"
	"github.com/canonical/microceph/microceph/ceph"
	"github.com/canonical/microceph/microceph/interfaces"
	"github.com/canonical/microcluster/v2/rest"
	"github.com/canonical/microcluster/v2/state"
	"github.com/gorilla/mux"
)

// Top level ops API
var opsCmd = rest.Endpoint{
	Path: "ops",
}

// replication ops API
var opsReplicationCmd = rest.Endpoint{
	Path: "ops/replication/",
}

// List Replications
var opsReplicationWorkloadCmd = rest.Endpoint{
	Path: "ops/replication/{wl}",
	Get:  rest.EndpointAction{Handler: getOpsReplicationWorkload, ProxyTarget: false},
}

// CRUD Replication
var opsReplicationResourceCmd = rest.Endpoint{
	Path:   "ops/replication/{wl}/{name}",
	Get:    rest.EndpointAction{Handler: getOpsReplicationResource, ProxyTarget: false},
	Post:   rest.EndpointAction{Handler: postOpsReplicationResource, ProxyTarget: false},
	Put:    rest.EndpointAction{Handler: putOpsReplicationResource, ProxyTarget: false},
	Delete: rest.EndpointAction{Handler: deleteOpsReplicationResource, ProxyTarget: false},
}

// getOpsReplicationWorkload handles the list operation.
func getOpsReplicationWorkload(s state.State, r *http.Request) response.Response {
	return cmdOpsReplication(s, r, types.ListReplicationRequest)
}

// getOpsReplicationResource handles the status operation for a certain resource.
func getOpsReplicationResource(s state.State, r *http.Request) response.Response {
	return cmdOpsReplication(s, r, types.StatusReplicationRequest)
}

// postOpsReplicationResource handles replication enablement for the requested resource.
func postOpsReplicationResource(s state.State, r *http.Request) response.Response {
	return cmdOpsReplication(s, r, types.EnableReplicationRequest)
}

// putOpsReplicationResource handles configuration of the requested resource.
func putOpsReplicationResource(s state.State, r *http.Request) response.Response {
	return cmdOpsReplication(s, r, types.ConfigureReplicationRequest)
}

// deleteOpsReplicationResource handles replication disablement for the requested resource.
func deleteOpsReplicationResource(s state.State, r *http.Request) response.Response {
	return cmdOpsReplication(s, r, types.DisableReplicationRequest)
}

// cmdOpsReplication is the common handler for all requests on the replication endpoint.
func cmdOpsReplication(s state.State, r *http.Request, patchRequest types.ReplicationRequestType) response.Response {
	// Get workload name from API
	wl, err := url.PathUnescape(mux.Vars(r)["wl"])
	if err != nil {
		logger.Errorf("REP: %v", err.Error())
		return response.InternalError(err)
	}

	// Get resource name from API
	resource, err := url.PathUnescape(mux.Vars(r)["name"])
	if err != nil {
		logger.Errorf("REP: %v", err.Error())
		return response.InternalError(err)
	}

	// Populate the replication request with the information needed for RESTfulness.
	var req types.ReplicationRequest
	if wl == string(types.RbdWorkload) {
		var data types.RbdReplicationRequest
		err := json.NewDecoder(r.Body).Decode(&data)
		if err != nil {
			logger.Errorf("REP: failed to decode request data: %v", err.Error())
			return response.InternalError(err)
		}

		// Carry RbdReplicationRequest in the interface object.
		data.SetAPIObjectId(resource)
		// Patch request type.
		if len(patchRequest) != 0 {
			data.RequestType = patchRequest
		}

		req = data
	} else {
		return response.SmartError(fmt.Errorf("unknown workload %s, resource %s", wl, resource))
	}

	return handleReplicationRequest(s, r.Context(), req)
}

// handleReplicationRequest parses the replication request and feeds it to the corresponding state machine.
func handleReplicationRequest(s state.State, ctx context.Context, req types.ReplicationRequest) response.Response {
	// Fetch replication handler
	wl := string(req.GetWorkloadType())
	rh := ceph.GetReplicationHandler(wl)
	if rh == nil {
		return response.SmartError(fmt.Errorf("no replication handler for %s workload", wl))
	}

	// Populate resource info
	err := rh.PreFill(ctx, req)
	if err != nil {
		return response.SmartError(err)
	}

	// Get FSM
	repFsm := ceph.GetReplicationStateMachine(rh.GetResourceState())

	var resp string
	event := req.GetWorkloadRequestType()
	// Each event is provided with the replication handler, a response object, and the state.
	err = repFsm.FireCtx(ctx, event, rh, &resp, interfaces.CephState{State: s})
	if err != nil {
		return response.SmartError(err)
	}

	logger.Debugf("REPFSM: Check FSM response: %s", resp)

	// resp is the empty string when the FSM produced no output.
	return response.SyncResponse(true, resp)
}
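
The resource endpoint above takes the resource name as a single path segment (`{name}`) and passes it through `url.PathUnescape`, so a client addressing an image such as `pool_two/image_two` would need to escape the `/`. The standalone sketch below illustrates that round trip and the HTTP method to operation mapping of the handlers, using only the Go standard library. `resourcePath`, `requestTypeFor`, the `"rbd"` workload string, and the operation labels are assumptions chosen for this example; they are not the MicroCeph client code or the values of the `types.*ReplicationRequest` constants.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// requestTypeFor mirrors the method-to-operation mapping of the resource
// handlers above. The returned strings are labels chosen for this sketch,
// not the values of the types.*ReplicationRequest constants.
func requestTypeFor(method string) (string, bool) {
	switch method {
	case http.MethodGet:
		return "status", true
	case http.MethodPost:
		return "enable", true
	case http.MethodPut:
		return "configure", true
	case http.MethodDelete:
		return "disable", true
	}
	return "", false
}

// resourcePath is a hypothetical helper that builds the endpoint path for a
// workload and resource. url.PathEscape encodes "/" as "%2F", so a
// "pool/image" resource stays a single path segment.
func resourcePath(workload, resource string) string {
	return "ops/replication/" + url.PathEscape(workload) + "/" + url.PathEscape(resource)
}

func main() {
	// "rbd" as the workload name is an assumption based on the CLI command.
	fmt.Println(resourcePath("rbd", "pool_two/image_two"))
	// prints: ops/replication/rbd/pool_two%2Fimage_two

	// The server side recovers the original name with url.PathUnescape.
	name, err := url.PathUnescape("pool_two%2Fimage_two")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
	// prints: pool_two/image_two

	if op, ok := requestTypeFor(http.MethodDelete); ok {
		fmt.Println("DELETE maps to", op)
	}
}
```

Keeping the resource in one escaped segment lets a single route pattern serve both pool and image resources, at the cost of requiring every client to escape the name consistently.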