How To Migrate Linux Servers Part 2 - Transfer Core Data

Introduction

There are many scenarios in which you might have to move your data and operating requirements from one server to another. You may need to implement your solutions in a new datacenter, upgrade to a larger machine, or transition to new hardware or a new VPS provider.

Whatever your reasons, there is a lot to consider when migrating from one system to another. Getting functionally equivalent configurations can be difficult if you are not working with a configuration management solution such as Chef, Puppet, or Ansible. You need to not only transfer data, but also configure your services to operate in the same way on a new machine.

In the last article, you prepped your server for data migration. At this point, your target and source system should be able to communicate — the target system should have SSH access to the source system. You should also have a list of software and services that you need to transfer, including version numbers. In this guide, you’ll continue where you left off and begin the actual migration to your new server.

Note: The general idea here is to transfer all of the relevant pieces of information while leaving the target system as clean as possible. Some migration strategies might clone a root partition outright, or start a copy operation at the root of the source machine while only manually excluding a few files that you know will cause conflicts. However, migrating large pieces of system data onto a live operating system can cause unpredictable results, or needlessly clutter your new system with files that are no longer relevant to your operational requirements. This tutorial will instead migrate selectively — by inclusion, rather than exclusion — to create a better end result.

Step 1 – Creating a Migration Script

This tutorial and the next one focus on building up a Bash migration script as you go along. You will be using a number of low-level Linux tools, and rather than running them all interactively, your goal should be to end up with a reproducible set of steps that captures the relevant parts of your server configuration.

As you write this script, you should be able to run it iteratively. Most of the tools used in this tutorial, such as rsync, only transfer data that has changed since the last run, so you can safely repeat commands without redoing work that has already been done. Because you configured SSH to enable connecting to the original (source) machine from the new (target) server, you should be working from the target server throughout this tutorial.

You can create this script in your home directory, using nano or your favorite text editor:

  nano ~/sync.sh

On the first line of the file, add a script heading, also known as a shebang. This tells the system which interpreter to use when the script is run directly. #!/bin/bash means the script will be run by the bash shell, which is installed by default on most Linux distributions.

~/sync.sh
#!/bin/bash

Save and close the file for now. If you are using nano, press Ctrl+X, then when prompted, Y and then Enter.

Back on the command line, make the script executable by using chmod:

  chmod 700 ~/sync.sh

For an in-depth overview of how chmod and Linux permissions work, you can refer to An Introduction to Linux Permissions.

After adding the shebang and making the script executable, you can run it by calling it directly:

  ~/sync.sh

It will not produce any output yet, as the script is empty. You should test the script regularly through the rest of this tutorial as needed. As in the prior tutorial in this series, you may need to run it with sudo permissions, depending on the steps that you add to the script.
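
As the script grows, it may help to have it stop at the first failing command and announce what it is doing. The following is one possible skeleton and is not part of the original script: the SOURCE_HOST variable and the log function are hypothetical names introduced here for illustration.

```shell
#!/bin/bash
# Sketch of a more defensive sync.sh skeleton.
# SOURCE_HOST and log() are illustrative names, not part of the tutorial.

# Exit on the first failed command, on use of an unset variable,
# and when any command in a pipeline fails.
set -euo pipefail

# Hostname (or SSH alias) of the original server; can be overridden from the environment.
SOURCE_HOST="${SOURCE_HOST:-source_server}"

# Print a timestamped message so repeated runs are easier to follow.
log() {
    printf '[%s] %s\n' "$(date '+%H:%M:%S')" "$*"
}

log "Starting migration from ${SOURCE_HOST}"
# Migration steps get added here as you work through the tutorial.
log "Migration steps finished"
```

With set -e in place, a failed step stops the run instead of letting later steps operate on incomplete data, which suits a script that is meant to be re-run until it completes cleanly.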

Step 2 – Installing Needed Programs and Services

The first step you’ll add to your migration script will be to restore the packages that you marked for migration in the previous tutorial.

Add Additional Repositories

Before doing that, you’ll want to connect to your original (source) server again in a separate terminal, to check whether you’ve installed software from any third-party repositories. If so, you won’t be able to reinstall those packages in your new environment without first configuring those additional package sources.

In Ubuntu and Debian environments, you can see if alternative repositories are present on your source system by investigating a few locations:

  cat /etc/apt/sources.list
Output
…
## Uncomment the following two lines to add software from Canonical's
## 'partner' repository.
## This software is not part of Ubuntu, but is offered by Canonical and the
## respective vendors as a service to Ubuntu users.
# deb http://archive.canonical.com/ubuntu impish partner
# deb-src http://archive.canonical.com/ubuntu impish partner
deb http://security.ubuntu.com/ubuntu impish-security main restricted
# deb-src http://security.ubuntu.com/ubuntu impish-security main restricted
deb http://security.ubuntu.com/ubuntu impish-security universe
# deb-src http://security.ubuntu.com/ubuntu impish-security universe
deb http://security.ubuntu.com/ubuntu impish-security multiverse
# deb-src http://security.ubuntu.com/ubuntu impish-security multiverse

This is the main package source list. Because it’s a single file, you can use cat to output its contents. If every uncommented deb line points to a ubuntu.com address (or a debian.org address on Debian), then you probably haven’t added any third-party repositories to this file. Additional repositories can also be listed in the sources.list.d directory:

  ls /etc/apt/sources.list.d
Output
droplet-agent.list elastic-7.x.list nodesource.list

If this directory is not empty, you can cat the individual files to check each of the repositories:

  cat /etc/apt/sources.list.d/elastic-7.x.list
Output
deb https://artifacts.elastic.co/packages/7.x/apt stable main

This will tell you the URL of the repository that you’ll need to re-add to your target machine. In most cases, you can do that with the add-apt-repository command:

  sudo add-apt-repository repo_url
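
To gather every active third-party entry on a Debian or Ubuntu source system in one pass, the manual checks can be combined into a small filter. This is a rough sketch that assumes the one-line .list format shown in the output above; list_third_party_repos is a hypothetical helper name, and the list of "official" domains it excludes is an assumption you may need to adjust.

```shell
# Print every active (uncommented) "deb" or "deb-src" entry in the given files
# that does not point at an official ubuntu.com or debian.org mirror.
# list_third_party_repos is a hypothetical helper name.
list_third_party_repos() {
    grep -hE '^[[:space:]]*deb(-src)?[[:space:]]' "$@" 2>/dev/null \
        | grep -vE 'ubuntu\.com|debian\.org' \
        || true   # finding nothing is not an error
}

# Run this on the source server:
list_third_party_repos /etc/apt/sources.list /etc/apt/sources.list.d/*.list
```

Each line it prints is a repository entry that would need to be re-added on the target machine.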

On RHEL, Rocky, or Fedora Linux, you can instead use dnf to list the repositories configured for the server:

  dnf repolist enabled

You can then add additional repositories to your target system by using dnf config-manager:

  sudo dnf config-manager --add-repo repo_url

If you make any changes to your source list, add them as comments at the top of your migration script back on your target system. This way, if you have to start from a fresh install, you will know what procedures need to happen before attempting a new migration.

  nano ~/sync.sh
~/sync.sh
#!/bin/bash

#############
# Prep Steps
#############

# Add additional repositories to /etc/apt/sources.list
#       deb http://example.repo.com/linux/deb stable main non-free

Then, save and close the file.

Specifying Version Constraints and Installing

You now have your package sources updated on your target machine to match your source machine.

On Ubuntu or Debian machines, you can now install the versions of the software that you need on your target machine by typing:

  sudo apt update
  sudo apt install package_name=version_number

If the version of the package you are trying to match is more than a few months old, it may have been removed from the official repositories. In this case, you could try to hunt down the older version of the .deb packages (for example, by browsing older upstream repositories, or third-party PPAs) and their dependencies and install them manually with:

  sudo dpkg -i package.deb

However, you should do this very sparingly to avoid creating a situation where you have too many packages with version mismatches. If older versions of software are not readily available, test the newest available releases first to see if they still meet your needs, to avoid imposing out-of-date requirements.
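
If the package list from the previous tutorial is kept in a file, the pinned install command can be generated rather than typed by hand. The following is a minimal sketch, assuming a hypothetical file with one "name version" pair per line; the pinned_packages function name and the file layout are assumptions, and the versions are the ones used as examples elsewhere in this tutorial.

```shell
# Turn "name version" lines into "name=version" arguments for apt.
# Blank lines and lines starting with # are skipped.
pinned_packages() {
    awk '!/^[[:space:]]*#/ && NF >= 2 { printf "%s=%s\n", $1, $2 }' "$1"
}

# Example input file (hypothetical name and layout).
pkg_list="$(mktemp)"
cat > "$pkg_list" <<'EOF'
# name        version
apache2       2.2.22-1ubuntu1.4
php5          5.3.10-1ubuntu3.9
EOF

# Show the command that would be run, without executing it.
echo sudo apt install $(pinned_packages "$pkg_list")
```

Before installing, apt-cache policy package_name shows which versions the configured repositories still offer, which is a quick way to find out whether a pinned version is still available.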

For RHEL, Rocky, or Fedora Linux, you can install specific versions of software by typing:

  sudo dnf install package_name-version_number

If you need to hunt down rpm files that have been removed from the repository in favor of newer versions, you can install a downloaded file with dnf:

  sudo dnf install package_name.rpm

Again, keep track of what operations you are performing here. You can include them as comments in the script you are creating:

  nano ~/sync.sh
~/sync.sh
#!/bin/bash

#############
# Prep Steps
#############

# Add additional repositories to /etc/apt/sources.list
#       deb http://example.repo.com/linux/deb stable main non-free

# Install necessary software and versions
#       apt-get update
#       apt-get install apache2=2.2.22-1ubuntu1.4 mysql-server=5.5.35-0ubuntu0.12.04.2 libapache2-mod-auth-mysql=4.3.9-13ubuntu3 php5-mysql=5.3.10-1ubuntu3.9 php5=5.3.10-1ubuntu3.9 libapache2-mod-php5=5.3.10-1ubuntu3.9 php5-mcrypt=5.3.5-0ubuntu1

Again, save and close the file.

Step 3 – Start Transferring Data

The actual transfer of data is usually not the most labor-intensive part of the migration, but it can be the most time-consuming. If you are migrating a server with a lot of data, it is probably a good idea to start transferring data sooner rather than later.

Rsync is a powerful tool that provides a wide array of options for replicating files and directories across many different environments, with built-in checksum validation and other features. Identify any directories whose data you want to transfer, and add rsync commands to your migration script.

A sample rsync command looks like this:

  rsync -azvP --progress source_server:/path/to/directory/to/transfer /path/to/local/directory

-azvP is a typical set of Rsync options. Here is what each of them does:

  • a enables “Archive Mode” for this copy operation, which preserves file modification times, owners, and so on. It is also the equivalent of providing each of the -rlptgoD options individually (yes, really). Notably, the -r option tells Rsync to recurse into subdirectories to copy nested files and folders as well. This option is common to many other copy operations, such as cp and scp.
  • z compresses data during the transfer itself, if possible. This is useful for any transfers over slow connections, especially when transferring data that compresses very effectively, like logs and other text.
  • v enables verbose mode, so you can read more details of your transfer while it is in progress.
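  • P is shorthand for --partial --progress. It tells Rsync to retain partial copies of any files that do not transfer completely, so that transfers can be resumed later, and to display progress for each file. Because -P already includes --progress, the separate --progress flag in the sample command is redundant but harmless.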

You can find out more about how to create appropriate rsync commands by reading this article. In some cases, you may have to create the parent directories leading up to your target destination prior to running rsync.
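
Two details of rsync path handling are easy to get wrong: whether the source path ends in a slash, and whether the destination's parent directories exist. The sketch below illustrates both locally in a scratch directory, so it does not touch the source server. All of the paths are placeholders created only for the demonstration.

```shell
# Build a small scratch tree to copy from.
demo="$(mktemp -d)"
mkdir -p "$demo/src/site1"
echo "hello" > "$demo/src/site1/index.html"

# Create the destination parents first; by default rsync does not
# create missing parent directories for you.
mkdir -p "$demo/dest_a/var/www" "$demo/dest_b/var/www/site1"

if command -v rsync >/dev/null 2>&1; then
    # No trailing slash on the source: the directory itself is copied,
    # producing dest_a/var/www/site1/index.html
    rsync -a "$demo/src/site1" "$demo/dest_a/var/www/"

    # Trailing slash on the source: only the directory's contents are copied,
    # producing dest_b/var/www/site1/index.html
    rsync -a "$demo/src/site1/" "$demo/dest_b/var/www/site1/"

    # List the copied files to compare the two layouts.
    find "$demo/dest_a" "$demo/dest_b" -type f
fi
```

Both forms end up with the same layout here; which one you use mainly determines whether the directory name has to be repeated on the destination side.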

With the addition of rsync commands, your sync script might now look like this:

~/sync.sh
#!/bin/bash

#############
# Prep Steps
#############

# Add additional repositories to /etc/apt/sources.list
#       deb http://example.repo.com/linux/deb stable main non-free

# Install necessary software and versions
#       apt-get update
#       apt-get install apache2=2.2.22-1ubuntu1.4 mysql-server=5.5.35-0ubuntu0.12.04.2 libapache2-mod-auth-mysql=4.3.9-13ubuntu3 php5-mysql=5.3.10-1ubuntu3.9 php5=5.3.10-1ubuntu3.9 libapache2-mod-php5=5.3.10-1ubuntu3.9 php5-mcrypt=5.3.5-0ubuntu1

#############
# File Transfer
#############


# Rsync web root
rsync -azvP --progress source_server:/var/www/site1 /var/www/

# Rsync home directories
. . .

Remember that these commands can be re-run and will not transfer any new data unless the source files have changed, so you can add to this script incrementally as you are testing it. Be conservative and iterative about which directories you include.
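
As the list of directories grows, the repeated options can be factored into a small helper. This is one possible arrangement rather than part of the original script; sync_dir, SOURCE_HOST, RSYNC_OPTS, and DRY_RUN are hypothetical names. Setting DRY_RUN=1 adds rsync's --dry-run option, which reports what would be transferred without copying any files.

```shell
SOURCE_HOST="source_server"     # same placeholder hostname used in this tutorial
RSYNC_OPTS=(-azP)               # archive, compress, keep partial files and show progress

# DRY_RUN=1 ./sync.sh previews what rsync would transfer.
if [ "${DRY_RUN:-0}" = "1" ]; then
    RSYNC_OPTS+=(--dry-run)
fi

# sync_dir REMOTE_PATH LOCAL_PARENT
# Copies REMOTE_PATH from the source server into LOCAL_PARENT,
# creating LOCAL_PARENT first if it does not exist.
sync_dir() {
    local remote_path="$1" local_parent="$2"
    mkdir -p "$local_parent"
    rsync "${RSYNC_OPTS[@]}" "${SOURCE_HOST}:${remote_path}" "${local_parent}/"
}

# Example call, left commented out because it needs SSH access to the source server:
# sync_dir /var/www/site1 /var/www
```

A dry run is a cheap way to check a newly added directory before letting the script write to the target system.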

Migrating Databases and other Non-File Data

Note that you cannot necessarily copy all of your data using rsync without any additional prep. Many applications, such as databases, store their data across multiple files on your filesystem in internal formats that are optimized for their own access patterns and that may change while the service is running. These files are generally not meant to be accessed or copied as-is; a database exposes data through a query interface instead.

Fortunately, almost all applications that implement their own storage will include some mechanism of exporting and importing data into ordinary files, so that they can be copied as normal during migrations like this. For example, if you are using MySQL, you can review How to Import and Export Databases. You can then transfer these exports across servers using rsync or scp.
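
For MySQL specifically, the export can be run on the source server over SSH and written straight to a compressed file on the target. The following is a minimal sketch, assuming mysqldump is installed on the source server and can authenticate without prompting (for example, through a ~/.my.cnf file there). The dump_remote_mysql function name and the output path are hypothetical.

```shell
# Make a failed dump visible even though gzip itself succeeds.
set -o pipefail

# dump_remote_mysql HOST OUTFILE
# Runs mysqldump on HOST over SSH and stores the gzip-compressed SQL in OUTFILE.
dump_remote_mysql() {
    local host="$1" outfile="$2"
    # --single-transaction takes a consistent snapshot of InnoDB tables
    # without locking them for the duration of the dump.
    ssh "$host" "mysqldump --single-transaction --all-databases" | gzip > "$outfile"
}

# Example call, commented out because it needs SSH access to the source server:
# dump_remote_mysql source_server ~/all-databases.sql.gz
#
# Restoring on the target might then look like:
# gunzip < ~/all-databases.sql.gz | sudo mysql
```

Because the dump is a complete export rather than an incremental copy, each run of the script transfers it in full, so it may be worth running this step separately from the rsync steps on large databases.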

Step 4 – Modifying Configuration Files

Although some software will resume working gracefully after transferring the relevant configuration details and data from the original server, many configurations will need to be modified.

This presents a slight problem for the syncing script. If you run the script to sync your data, and then modify the values to reflect the correct information for its new destination, these changes will be wiped out the next time you run the script. To solve this problem, you can add additional steps to the script which will modify that data in place after transferring it.

Linux includes a number of core utilities that are very useful for this kind of text scripting. Two of these are sed and awk. In general, sed is more straightforward to use if you are making modifications to unstructured text using regular expressions, and awk is more useful for more complex parsing of formatted text or tabular data. Beyond this tutorial, you can also learn more about using sed, or learn more about using awk.

This way, your sync script can perform sed or awk commands immediately after rsync, so that your files are automatically modified as needed after being transferred.

sed syntax looks like this:

  sed -i 's/string_to_match/string_to_replace_it_with/g' file_to_edit

The -i flag means that the file will be modified in place rather than creating a separate output file. The leading s is sed’s substitute command, and the trailing g (global) flag replaces every match on a line rather than only the first. You can also use regular expressions within the string_to_match. Try adding a sed command to your sync.sh:

~/sync.sh
rsync -avz --progress source_server:/etc/mysql/* /etc/mysql/

# Change socket to '/mysqld/mysqld.sock'
sed -i 's/\/var\/run\/mysqld\/mysqld.sock/\/mysqld\/mysqld.sock/g' /etc/mysql/my.cnf

This will change every instance of /var/run/mysqld/mysqld.sock in /etc/mysql/my.cnf to /mysqld/mysqld.sock. The \ character is used to precede the / characters because they would otherwise be parsed as the end of your sed expression. This is known as escaping special characters. Make sure that your sed commands come after the rsync commands.
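
sed also accepts other delimiter characters in the s command, which avoids escaping every slash in a path. The sketch below applies the same substitution with | as the delimiter to a scratch file rather than the real /etc/mysql/my.cnf; the file contents are a made-up one-line example. It relies on the behavior of GNU sed's -i flag, as found on Linux.

```shell
# Scratch file standing in for my.cnf (contents are illustrative only).
conf="$(mktemp)"
printf 'socket = /var/run/mysqld/mysqld.sock\n' > "$conf"

# Same replacement as above, using | instead of / as the delimiter.
# The dot is escaped so that it matches a literal "." rather than any character.
sed -i 's|/var/run/mysqld/mysqld\.sock|/mysqld/mysqld.sock|g' "$conf"

cat "$conf"   # prints: socket = /mysqld/mysqld.sock
```

Trying a substitution on a scratch copy like this is a low-risk way to confirm the pattern before adding it to the sync script.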

You can use awk for formatted text the same way you used sed for unstructured text. For instance, each line of the /etc/shadow file is divided into fields delimited by the colon (:) character, which looks like this:

/etc/shadow
vault:!:18941::::::
stunnel4:!:18968:0:99999:7:::
sammy:$6$bscTWIVxvy.KhkO8$NJNhpABhJoybG/vDiRzQ2y9OFEd6XtqgCUG4qkuWfld97VEXH8/jUtc7oMtOC34V47VE7HjdpMMv37Aftdb7C/:18981:0:99999:7:::

You could use awk to remove the data from the second “column” (i.e., between the first and second : character), like so:

  awk 'BEGIN { OFS=FS=":"; } $1=="root" { $2=""; } { print; }' /etc/shadow > shadow.tmp && cat shadow.tmp > /etc/shadow && rm shadow.tmp

This command tells awk to use : as both the input and the output field delimiter. It then specifies that if column 1 is equal to “root”, then column 2 should be set to an empty string. Unlike sed, awk doesn’t directly support editing files in place, so this script performs the equivalent steps of writing to a temporary file and then copying its contents back over the original. Using cat for that last step, rather than mv, keeps the ownership and restrictive permissions of /etc/shadow intact instead of replacing them with those of the newly created temporary file.
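
Because /etc/shadow is a sensitive file, it can be worth rehearsing an awk edit on a scratch copy first. The sketch below runs the same field edit against a temporary file containing made-up entries in the same colon-delimited format; the "examplehash" value is a placeholder, not a real password hash.

```shell
# Scratch copy with made-up entries in /etc/shadow format (not real hashes).
shadow_copy="$(mktemp)"
cat > "$shadow_copy" <<'EOF'
root:examplehash:18941:0:99999:7:::
stunnel4:!:18968:0:99999:7:::
EOF

# Write the edited output to a temporary file, then copy its contents back
# so that the original file keeps its ownership and mode.
tmp="$(mktemp)"
awk 'BEGIN { OFS=FS=":" } $1=="root" { $2="" } { print }' "$shadow_copy" > "$tmp" \
    && cat "$tmp" > "$shadow_copy" \
    && rm "$tmp"

cat "$shadow_copy"
```

Only the root line changes: its second field is emptied, and every other line is printed exactly as it was read.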

Note: While sed is still very broadly popular due to the flexibility of working with regular expressions, awk is considered somewhat arcane by modern standards, and its syntax can be challenging to learn. If you are working with comma-delimited files, consider using a more modern tool such as csvkit.

You can always add comments to your migration script (on lines preceded by #) to document in-progress fixes or changes to your files.

Conclusion

You should now have all the information you need to migrate your application environments and your data to your new server. You should also have good, reproducible documentation for this process should you ever need to redeploy your stack onto a new system.

In the final tutorial in this series, you’ll review how to transfer and test any lingering system services on your new server.

Tutorial Series: How To Migrate to a New Linux Server

Migrating to a new server can be a complex and involved task. Not only do you have to transfer the data itself to a new location, you also have to replicate the service environment and ensure that your components interact as you expect them to.

In this series, we will take you through the steps needed to migrate an existing installation to a new server. Follow along to start developing your migration plan.

About the authors

Senior DevOps Technical Writer


2 Comments


Just wondering what happens if I need to move the new files, for example, I've moved everything one day before making the DNS changes and the day I will change the DNS I want to copy just the new files, how do I do that?

Hi, is it going to work on SUSE SAP servers with licensing? We need to migrate the server from an old hypervisor to a new one.
