Cloudera Enterprise 5.14.x

Upgrading to CDH 5.x Using Parcels

Minimum Required Role: Cluster Administrator (also provided by Full Administrator)

This topic describes how to upgrade CDH from any version of CDH 5.x to a higher version of CDH 5.x using Cloudera Manager and parcels. If the CDH 5 cluster you are upgrading was installed using packages, you can upgrade it using parcels, and the upgraded version of CDH will then use parcels for future upgrades or changes. You can also migrate your cluster from using packages to using parcels before starting the upgrade. The minor version of Cloudera Manager you use to perform the upgrade must be equal to or greater than the CDH minor version. To upgrade Cloudera Manager, see Overview of Upgrading Cloudera Manager.
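
The version requirement above can be checked with a small shell helper before starting. This is a minimal sketch that assumes both version strings have the form 5.x.y; the function names minor_of and cm_can_upgrade_to are hypothetical and are not part of Cloudera Manager.

```shell
#!/bin/sh
# Hypothetical helper: print the minor version (second dot-separated field)
# of a version string such as "5.14.1".
minor_of() {
    echo "$1" | cut -d. -f2
}

# Succeeds (exit status 0) when the Cloudera Manager minor version is equal
# to or greater than the target CDH minor version.
# Assumption: both versions belong to the same major release (5.x).
cm_can_upgrade_to() {
    cm_version=$1
    cdh_version=$2
    [ "$(minor_of "$cm_version")" -ge "$(minor_of "$cdh_version")" ]
}

# Example with illustrative version numbers.
if cm_can_upgrade_to "5.14.0" "5.13.2"; then
    echo "Cloudera Manager 5.14.0 can upgrade the cluster to CDH 5.13.2"
else
    echo "Upgrade Cloudera Manager first"
fi
```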

The upgrade procedure described in this topic requires cluster downtime. If the cluster was installed using parcels, has a Cloudera Enterprise license, and has HDFS high availability enabled, you can perform a rolling upgrade that does not require cluster downtime.

  Note: If you are upgrading to a maintenance version of CDH, skip any steps that are labeled [Not required for CDH maintenance release upgrades.].

The version numbers for maintenance releases differ only in the third digit, for example when upgrading from CDH 5.8.0 to CDH 5.8.2. See Maintenance Version Upgrades.
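
To decide whether the labeled steps can be skipped, the rule above (only the third digit differs) can be expressed as a short shell check. This is an illustrative sketch; is_maintenance_upgrade is a hypothetical name, and the version strings are assumed to have three dot-separated fields.

```shell
#!/bin/sh
# Succeeds when the two versions share the same first two digits
# (for example 5.8) and differ only after that, as in 5.8.0 -> 5.8.2.
is_maintenance_upgrade() {
    from=$1
    to=$2
    from_prefix=$(echo "$from" | cut -d. -f1,2)
    to_prefix=$(echo "$to" | cut -d. -f1,2)
    [ "$from_prefix" = "$to_prefix" ] && [ "$from" != "$to" ]
}

# Example with the versions mentioned in the text.
if is_maintenance_upgrade "5.8.0" "5.8.2"; then
    echo "Maintenance upgrade: skip the steps labeled as not required"
fi
```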

To upgrade CDH using parcels:

Step 1: Collect Upgrade Information

Before starting an upgrade, collect the following information:
  1. Host credentials. You must have SSH access and be able to log in using a root account or an account that has password-less sudo permission.
  2. The version of Cloudera Manager used in your cluster. Go to Support > About.
  3. The version of the JDK deployed in the cluster. Go to Support > About.
  4. The version of CDH. The CDH version number displays next to the cluster name on the Home page.
  5. Whether the cluster was installed using parcels or packages. This information displays next to the CDH version on the Home page of Cloudera Manager.
  6. The services enabled in your cluster. Go to Clusters > Cluster name.
  7. Operating system type and version. Go to Hosts and click on a hostname in the list. The operating system type and version displays in the Distribution row in the Details section.
  8. Database information for the databases used by Sqoop, Oozie, Hue, Hive Metastore, and Sentry Server (this information is required only if these services are enabled in the cluster).
    Gather the following information:
    • Type of database (PostgreSQL, Embedded PostgreSQL, MySQL, MariaDB, or Oracle)
    • Hostnames of the databases
    • Credentials for the databases
    To locate database information:
    • Sqoop, Oozie, and Hue – Go to Cluster Name > Configuration > Database Settings.
    • Hive Metastore – Go to the Hive service, select Configuration, and select the Hive Metastore Database category.
    • Sentry – Go to the Sentry service, select Configuration, and select the Sentry Server Database category.

Step 2: Complete Pre-Upgrade Steps

Step 3: Stop Cluster Services

Step 4: Back up the HDFS Metadata on the NameNode

[Not required for CDH maintenance release upgrades.]

The steps in this section are only required for the following upgrades:
  • CDH 5.0 or 5.1 to 5.2 or higher
  • CDH 5.2 or 5.3 to 5.4 or higher
  1. Go to the HDFS service.
  2. Click the Configuration tab.
  3. In the Search field, search for "NameNode Data Directories" and note the value.
  4. On the active NameNode host, back up the directory listed in the NameNode Data Directories property. If more than one directory is listed, back up only one of them, because each directory is a complete copy. For example, if the NameNode data directory is /data/dfs/nn, do the following as root:
    # cd /data/dfs/nn
    # tar -cvf /root/nn_backup_data.tar .

    You should see output like this:

    ./
    ./current/
    ./current/fsimage
    ./current/fstime
    ./current/VERSION
    ./current/edits
    ./image/
    ./image/fsimage

    If a file with the extension lock exists in the NameNode data directory, the NameNode most likely is still running. Repeat the steps, beginning with shutting down the NameNode role.
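
The backup and the lock-file check can be combined so that the archive is never created while the NameNode appears to be running. The following is a sketch only: backup_nn_dir is a hypothetical function name, the archive path must be absolute and outside the data directory, and the check simply looks for any file whose name ends in .lock.

```shell
#!/bin/sh
# Back up one NameNode data directory to a tar archive.
# Refuses to run if a *.lock file is present, which suggests that the
# NameNode is still running.
backup_nn_dir() {
    nn_dir=$1     # for example /data/dfs/nn
    archive=$2    # for example /root/nn_backup_data.tar (absolute path)

    if [ ! -d "$nn_dir" ]; then
        echo "Not a directory: $nn_dir" >&2
        return 1
    fi

    # Look for lock files anywhere under the data directory.
    if [ -n "$(find "$nn_dir" -name '*.lock' -print | head -n 1)" ]; then
        echo "Lock file found in $nn_dir; stop the NameNode role first." >&2
        return 1
    fi

    # Same layout as the manual commands: paths are relative to the directory.
    ( cd "$nn_dir" && tar -cf "$archive" . )
}

# Usage (as root, on the active NameNode host):
#   backup_nn_dir /data/dfs/nn /root/nn_backup_data.tar
```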

Step 5: Back Up Databases

  Note: Backing up databases requires that you stop some services, which may make them unavailable during backup.
Back up the databases for any of the following services that are deployed in your cluster:

Table 1. Service Databases to Back Up

  Service                             Where to find database information
  Sqoop                               Go to Clusters > Cluster Name > Sqoop service > Configuration and select the Database category.
  Hue                                 Go to Clusters > Cluster Name > Hue service > Configuration and select the Database category.
  Oozie                               Go to Clusters > Cluster Name > Oozie service > Configuration and select the Database category.
  Cloudera Navigator Audit Server     Go to Clusters > Cloudera Management Service > Configuration and select the Database category.
  Cloudera Navigator Metadata Server  Go to Clusters > Cloudera Management Service > Configuration and select the Database category.
  Activity Monitor                    Go to Clusters > Cloudera Management Service > Configuration and select the Database category.
  Reports Manager                     Go to Clusters > Cloudera Management Service > Configuration and select the Database category.
  Sentry Server                       Go to Clusters > Cluster Name > Sentry service > Configuration and select the Sentry Server Database category.
  Hive Metastore                      Go to Clusters > Cluster Name > Hive service > Configuration and select the Hive Metastore Database category.

To back up the databases:
  1. If not already stopped, stop the service:
    1. On the Home > Status tab, click the drop-down menu to the right of the service name and select Stop.
    2. Click Stop in the next screen to confirm. When you see a Finished status, the service has stopped.
  2. Back up the database. See Backing Up Databases for detailed instructions for each supported type of database.
  3. Restart the service:
    1. On the Home > Status tab, click the drop-down menu to the right of the service name and select Start.
    2. Click Start in the next screen to confirm. When you see a Finished status, the service has started.
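
For the database backup itself, the exact commands depend on the database type and are covered in Backing Up Databases. As a rough sketch only, the helper below prints (without running) a typical dump command for MySQL, MariaDB, or PostgreSQL. The function name print_dump_command, the output path under /tmp, and the PostgreSQL port 5432 are assumptions for illustration; Oracle and the embedded PostgreSQL database are not covered.

```shell
#!/bin/sh
# Print (do not run) a dump command for the given database.
# db_type: mysql | mariadb | postgresql
print_dump_command() {
    db_type=$1
    db_host=$2
    db_user=$3
    db_name=$4
    out_file="/tmp/${db_name}-backup.sql"   # hypothetical output location

    case "$db_type" in
        mysql|mariadb)
            # -p with no value makes mysqldump prompt for the password.
            echo "mysqldump -h $db_host -u $db_user -p $db_name > $out_file"
            ;;
        postgresql)
            # Port 5432 is the PostgreSQL default; adjust if yours differs.
            echo "pg_dump -h $db_host -p 5432 -U $db_user $db_name > $out_file"
            ;;
        *)
            echo "Unsupported database type: $db_type" >&2
            return 1
            ;;
    esac
}

# Example with made-up host, user, and database names.
print_dump_command mysql db01.example.com hive metastore
```

Printing the command first makes it possible to review the host, user, and output path before running anything against a production database.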

Step 6: Run the Upgrade Wizard

  Note: If Cloudera Manager detects a failure while upgrading CDH, Cloudera Manager displays a dialog box where you can create a diagnostic bundle to send to Cloudera Support so they can help you recover from the failure. The cluster name and time duration fields are pre-populated to capture the correct data.
  1. If your cluster has Kudu 1.4.0 (or lower) installed, deactivate the existing Kudu parcel. Starting with Kudu 1.5.0 / CDH 5.13, Kudu is part of the CDH parcel and does not need to be installed separately.
  2. If your cluster has Spark 2.0 or Spark 2.1 installed and you want to upgrade to CDH 5.13 or higher, you must upgrade to Spark 2.1 release 2 or later before upgrading CDH. To install these versions of Spark, do the following before running the CDH Upgrade Wizard:
    1. Install the Custom Service Descriptor (CSD) file. See
    2. Download, distribute, and activate the Parcel for the version of Spark that you are installing:
      • Spark 2.1 release 2: The parcel name includes "cloudera2".
      • Spark 2.2 release 1: The parcel name includes "cloudera1".
      See Managing Parcels.
  3. From the Home > Status tab, click the drop-down menu next to the cluster name and select Upgrade Cluster.

    The Getting Started page of the upgrade wizard displays.

  4. If the option to pick between packages and parcels displays, select Use Parcels.
  5. In the Choose CDH Version (Parcels) field, select the CDH version. If no qualifying parcels are listed, or you want to upgrade to a different version, click the Modify the Remote Parcel Repository URLs link to go to the configuration page for Remote Parcel Repository URLs and add the appropriate URL to the configuration. See Parcel Configuration Settings for information about entering the correct URL for parcel repositories. Click Continue.
  6. If you previously installed the GPLEXTRAS parcel, download and distribute the version of the GPLEXTRAS parcel that matches the version of CDH that you are upgrading to.
  7. Read the notices for steps you must complete before upgrading, click the Yes, I ... checkboxes after completing the steps, and click Continue. If you downloaded a new version of the GPLEXTRAS parcel, the Upgrade Wizard displays a message that the GPLEXTRAS parcel conflicts with the version of the CDH parcel.

    Select the option to resolve the conflicts automatically and click Continue.

    Cloudera Manager deactivates the old version of the GPLEXTRAS parcel, activates the new version and verifies that all hosts have the correct software installed.

  8. Click Continue.

    The Host Inspector runs and displays the CDH version on the hosts.

  9. Click Continue.

    The Choose Upgrade Procedure screen displays the available types of upgrades:

    • Full Cluster Restart - Cloudera Manager performs all service upgrades and restarts the cluster.
    • Manual upgrade - Cloudera Manager configures the cluster to the specified CDH version but performs no upgrades or service restarts. Manually upgrading is difficult and for advanced users only. To perform a manual upgrade:
      1. Select the Let me upgrade the cluster checkbox.
      2. Click Continue.
      3. See Performing Upgrade Wizard Actions Manually for the required steps.
  10. Select Full Cluster Restart.
  11. Click Continue.

    The Upgrade Cluster Command screen displays the result of the commands run by the wizard as it shuts down all services, activates the new parcel, upgrades services, deploys client configuration files, and restarts services. If any of the steps fail, correct any reported errors and click the Retry button. If you click the Abort button, the Retry button at the top right is enabled.

    Click Retry to retry the step and continue the wizard, or click the Cloudera Manager logo to return to the Home > Status tab and manually perform the failed step and all following steps.

  12. Click Continue.

    The wizard reports the result of the upgrade.

    If your cluster was previously installed or upgraded using packages, the wizard may indicate that some services cannot start because their parcels are not available. To download the required parcels:
    1. In another browser tab, open the Cloudera Manager Admin Console.
    2. Select Hosts > Parcels.
    3. Locate the row containing the missing parcel and click the button to Download, Distribute, and then Activate the parcel.
    4. Return to the upgrade wizard and click the Retry button.

      The Upgrade Wizard continues upgrading the cluster.

  13. Click Finish to return to the Home page.

Step 7: Recover from Failed Steps or Perform a Manual Upgrade

The actions performed by the upgrade wizard are listed in Performing Upgrade Wizard Actions Manually. If any of the steps in the Upgrade Cluster Command screen fail, complete the steps as described in that section before proceeding.

Step 8: Remove the Previous CDH Version Packages and Refresh Symlinks

Step 9: Finalize the HDFS Metadata Upgrade

[Not required for CDH maintenance release upgrades.]

The steps in this section are only required for the following upgrades:
  • CDH 5.0 or 5.1 to 5.2 or higher
  • CDH 5.2 or 5.3 to 5.4 or higher
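
The two upgrade ranges listed above can be written as a small check. This is a sketch under the assumption that both versions are CDH 5.x strings of the form 5.x.y; needs_hdfs_metadata_steps is a hypothetical name.

```shell
#!/bin/sh
# Succeeds when the upgrade path falls into one of the listed ranges:
#   CDH 5.0 or 5.1 -> 5.2 or higher
#   CDH 5.2 or 5.3 -> 5.4 or higher
needs_hdfs_metadata_steps() {
    from_minor=$(echo "$1" | cut -d. -f2)
    to_minor=$(echo "$2" | cut -d. -f2)
    case "$from_minor" in
        0|1) [ "$to_minor" -ge 2 ] ;;
        2|3) [ "$to_minor" -ge 4 ] ;;
        *)   return 1 ;;
    esac
}

# Example with illustrative versions.
if needs_hdfs_metadata_steps "5.3.0" "5.14.0"; then
    echo "HDFS metadata backup and finalize steps are required"
fi
```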

To determine if you can finalize, run important workloads and ensure that they are successful. Once you have finalized the upgrade, you cannot roll back to a previous version of HDFS without using backups. Verifying that you are ready to finalize the upgrade can take a long time.

Make sure you have enough free disk space, keeping in mind that the following behavior continues until the upgrade is finalized:
  • Deleting files does not free up disk space.
  • Using the balancer causes all moved replicas to be duplicated.
  • All on-disk data representing the NameNode's metadata is retained, which could more than double the amount of space required on the NameNode and JournalNode disks.
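
Free space on the relevant disks can be checked from the shell while the upgrade is not yet finalized. The following sketch uses df in POSIX output mode; the function names available_kb and has_free_space, the example path, and the threshold are assumptions, and the right threshold depends on your cluster.

```shell
#!/bin/sh
# Print the available space, in KB, on the filesystem that holds the path.
# df -P guarantees one line per filesystem; column 4 is "Available".
available_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeeds when at least required_kb kilobytes are available.
has_free_space() {
    path=$1
    required_kb=$2
    [ "$(available_kb "$path")" -ge "$required_kb" ]
}

# Usage, with a hypothetical data directory and a 50 GB threshold:
#   has_free_space /data/dfs/nn 52428800 || echo "Low disk space on /data/dfs/nn"
```
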
To finalize the metadata upgrade:
  1. Go to the HDFS service.
  2. Click the Instances tab.
  3. Select the NameNode instance. If you have enabled high availability for HDFS, select NameNode (Active).
  4. Select Actions > Finalize Metadata Upgrade and click Finalize Metadata Upgrade to confirm.

Step 10: Exit Maintenance Mode

If you entered maintenance mode during this upgrade, exit maintenance mode.

Step 11: Clear Browser Cache (Hue only)

If you have enabled the Hue service in your upgraded cluster, users may need to clear the cache in their Web browsers before accessing Hue.

Page generated March 9, 2018.