Cluster health checks

This section contains a list of checks to run on your OpenShift 3.9+ source and 4.x target clusters before migration. The purpose of these checks is to detect issues that might affect the migration process.

This list is not comprehensive, and passing these checks does not guarantee a successful migration. We recommend contacting the support team before migrating a cluster from OpenShift 3 to 4, especially if the cluster is in a production environment.

General health

Source cluster

You can perform the following health checks on an OpenShift 3.9+ source cluster:

  • Check that the OpenShift version is supported by the migration tool.
  • Verify that all nodes in the OpenShift cluster have an active OpenShift Container Platform subscription. An active subscription avoids delays if you need to contact support.
  • Install and configure Prometheus cluster monitoring. Prometheus provides a detailed view of the health of the cluster components.
  • Check the node status to verify that all nodes are in a Ready state:

    $ oc get nodes
    
  • Check the persistent volumes (PVs), looking for the items listed after the command:

    $ oc get pv
    
    • Mounted PVs
    • Unmounted PVs
    • Abnormal configurations
    • PVs stuck in terminating state
  • Check for pods whose status is not Running or Completed. Filter out the healthy states rather than searching for errors, because a failing pod does not always display an error state:

    $ oc get pods --all-namespaces | grep -Ev 'Running|Completed'
    
  • Check for pods with a high restart count. Even if they are in a Running state, a high restart count might indicate underlying problems.

    # Print namespace/name for Running pods that have a container with a restartCount above 3
    $ oc get pods --all-namespaces --field-selector=status.phase=Running -o json | jq -r '.items[] | select(any(.status.containerStatuses[]; .restartCount > 3)) | "\(.metadata.namespace)/\(.metadata.name)"'
    
  • Check the health of the etcd cluster.
  • Check the network connectivity between master hosts.
  • Check the API service status.
  • Check that the cluster certificates are not close to expiration and will be valid for the duration of the migration process. You can use the easy-mode Ansible playbook to check the certificates.
  • Check for pending certificate signing requests:

    $ oc get csr
    
  • Check that time synchronization is consistent across the whole cluster.
  • Check that the internal container image registry is healthy and that images can be read from and written to it.
  • Check that the internal container image registry uses a supported storage type.
  • Check that applications are not using deprecated Kubernetes API references. MTC will warn you about any resources using deprecated Kubernetes API references.
  • Check that all nodes in the cluster have a high entropy value.
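
The node, PV, and CSR checks above can be narrowed to only the rows that need attention. The following is a minimal sketch, not an official tool. It assumes the default column layout of oc get with --no-headers (STATUS in column 2 for nodes, STATUS in column 5 for PVs, CONDITION in the last column for CSRs), and the function names are hypothetical helpers that are not part of oc.

```shell
# Hypothetical helpers: each one reads `oc get ... --no-headers` output on stdin.
# The column positions are assumptions about the default table layout.

# Print nodes whose STATUS (column 2) is anything other than exactly "Ready".
# A cordoned node ("Ready,SchedulingDisabled") is reported as well.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Print PVs whose STATUS (column 5) is neither Bound nor Available,
# for example Released, Failed, or Terminating.
problem_pvs() {
  awk '$5 != "Bound" && $5 != "Available" { print $1, $5 }'
}

# Print CSRs whose CONDITION (last column) is Pending.
pending_csrs() {
  awk '$NF == "Pending" { print $1 }'
}
```

Example usage, assuming the functions are defined in the current shell:

    $ oc get nodes --no-headers | not_ready_nodes
    $ oc get pv --no-headers | problem_pvs
    $ oc get csr --no-headers | pending_csrs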

Target cluster

You can perform the following health checks on an OpenShift 4.x target cluster:

  • Check that the cluster has access to external services required by the applications by verifying network connectivity and proper permissions.

    Examples of external services include databases, source code repositories, container image registries, and CI/CD tools.

  • Check that external applications, services, and appliances that use services provided by the target cluster have access and proper permissions.
  • Verify that all internal container image dependencies are met.

    If an application requires an image that is not in the application namespace, check that the image exists. For example, an application that uses the php:7.1 base image from the PHP image stream on an OpenShift 3.11 cluster will not work on an OpenShift 4.x cluster because that particular version is not included in the PHP image stream for OpenShift 4. See migration prerequisites for a list of image stream tags that have been removed from OpenShift 4.2.

    You can manually update the image stream tag of internal images from OpenShift 3 to 4 with Podman.
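
To illustrate the image dependency check, the sketch below reports which required image stream tags are absent from a list of available tags. It is only an illustration: missing_istags is a hypothetical helper, and it assumes that the available tags arrive on stdin one per line in stream:tag form, and that the shared image streams live in the openshift namespace.

```shell
# Hypothetical helper: reads available image stream tags (one "stream:tag"
# per line) on stdin and prints each required tag, given as an argument,
# that is not in that list.
missing_istags() {
  istags_available=$(cat)
  for istag_required in "$@"; do
    # -F: fixed string, -x: match the whole line, -q: quiet
    printf '%s\n' "$istags_available" | grep -Fxq -- "$istag_required" ||
      printf '%s\n' "$istag_required"
  done
}
```

Example usage on the target cluster, assuming the helper is defined in the current shell:

    $ oc get istag -n openshift --no-headers -o custom-columns=NAME:.metadata.name | missing_istags php:7.1 nodejs:10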

Resource capacity

  • The clusters require additional memory, CPUs, and storage in order to run a migration on top of normal workloads. Actual resource requirements depend on the number of Kubernetes resources being migrated in a single migration plan.
  • Check that the OpenShift 3.9+ source cluster meets the minimum hardware requirements for an OpenShift installation.
  • Check that the OpenShift 4.x target cluster meets the minimum hardware requirements for the specific platform and installation method. For example, a bare metal installation has specific minimum resource requirements.
  • Verify that the OpenShift 4.x target cluster contains storage classes for the same types (block, file, object) as the OpenShift 3.9+ source cluster.
  • Check the available bandwidth between the source and target clusters. A bandwidth of less than 10 Gbps is not recommended.
  • If you are migrating more than 20 TB, check that the target cluster and the replication repository have sufficient storage.
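
To put the bandwidth and data volume figures above in perspective, the following sketch estimates a best-case transfer time. The helper name transfer_hours is hypothetical, and the calculation assumes decimal units, a fully sustained link rate, and no protocol or storage overhead, so real transfers can be expected to take longer.

```shell
# Hypothetical helper: estimate the hours needed to copy DATA_TB terabytes
# over a link of LINK_GBPS gigabits per second.
# Assumptions: 1 TB = 8000 gigabits, the link rate is fully sustained, and
# all overhead is ignored. This is a lower bound, not a prediction.
transfer_hours() {
  awk -v tb="$1" -v gbps="$2" 'BEGIN { printf "%.1f\n", tb * 8000 / gbps / 3600 }'
}

transfer_hours 20 10   # prints 4.4
transfer_hours 20 1    # prints 44.4
```

Under these assumptions, 20 TB needs about 4.4 hours at 10 Gbps but almost two days at 1 Gbps.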

Performance

  • Check cluster compute and memory utilization:

    $ oc adm top node

  • Check the average response time of API calls in the source cluster. A response time of less than 50 ms is recommended.
  • Check etcd disk performance on the source and target clusters with fio.
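
The utilization check can be reduced to the nodes that are already heavily loaded. This is a rough sketch: busy_nodes is a hypothetical helper, the threshold is an illustrative value chosen by the caller rather than a documented limit, and the column order NAME, CPU(cores), CPU%, MEMORY(bytes), MEMORY% is assumed.

```shell
# Hypothetical helper: reads `oc adm top node` output, including its header
# line, on stdin. Prints the name, CPU%, and MEMORY% of every node whose
# CPU% (column 3) or MEMORY% (column 5) exceeds the threshold in $1.
busy_nodes() {
  # Adding 0 converts a value such as "91%" to the number 91.
  awk -v limit="$1" 'NR > 1 && ($3 + 0 > limit || $5 + 0 > limit) { print $1, $3, $5 }'
}
```

Example usage, assuming the helper is defined in the current shell:

    $ oc adm top node | busy_nodes 80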

Additional checks

  • Review the migration considerations in the OpenShift documentation.
  • Verify the identity provider on the source and target clusters.
  • Verify the network visibility between namespaces on the OpenShift 4.x target cluster, especially if the OpenShift 3.9+ source cluster uses the multitenant network plugin.
  • OpenShift 4.x uses the networkpolicy network plugin, which is open by default: all pods and services are accessible from any project until you define NetworkPolicy objects that restrict access.
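
If the source cluster relied on multitenant isolation, similar behavior can be approximated on the target cluster with a NetworkPolicy object in each project. The sketch below is one possible approach, not a prescribed migration step. The helper name same_namespace_policy and the policy name allow-same-namespace are assumptions made for illustration.

```shell
# Hypothetical helper: print a NetworkPolicy manifest for the namespace in $1
# that admits ingress only from pods in the same namespace.
same_namespace_policy() {
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: $1
spec:
  podSelector: {}
  ingress:
  - from:
    - podSelector: {}
EOF
}
```

Example usage, assuming the helper is defined in the current shell and my-project is a placeholder project name:

    $ same_namespace_policy my-project | oc apply -f -

Because this policy blocks ingress from every other namespace, exposed routes would likely need an additional rule that admits traffic from the namespace hosting the ingress router.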