
HowTo: Use TWSC COS to back up, synchronize and restore data

This document explains how to use COS to back up, synchronize, and restore data of Virtual Compute Service (VCS) and Container Compute Service (CCS). Because VCS instances and containers run different operating systems (Linux and Windows), the procedure is shown separately for two third-party tools: s3cmd and Duplicati.

Preparation

  • Create VCS instances or Interactive Containers.
  • Create COS buckets to back up or synchronize your data.
  • Connect to your VCS instance or Interactive container, and obtain the folder/file path to be backed up/synchronized.

s3cmd: applicable to VCS Linux instances and CCS containers

Install and set up s3cmd

1. Install pip (For CCS container, please go to step 2)

sudo apt install python-pip

2. Download and install s3cmd

sudo pip install s3cmd

3. Set parameters

s3cmd --configure

The system then displays the following prompt.

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.
Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.

4. Enter COS Access Key and Secret Key

Please enter your COS Access key and Secret key

Access Key: <YOUR-ACCESS-KEY>
Secret Key: <YOUR-SECRET-KEY>

5. Default Region: press the Enter key to accept the default

Default Region [US]:

6. Endpoint: enter the TWSC COS endpoint, cos.twcc.ai

Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]: cos.twcc.ai

7. DNS-style access point: also enter cos.twcc.ai

Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: cos.twcc.ai

8. (Optional) Set up an encryption password

If you need data encryption, enter an encryption password; otherwise, press the Enter key to skip this step.

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:

9. (Optional) Path to GPG program: press the Enter key to accept the default

Path to GPG program [/usr/bin/gpg]:

10. Use HTTPS protocol: enter Yes (case sensitive!)

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]: Yes

11. HTTP Proxy server name: press the Enter key to skip

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:

12. Double check the settings

New settings:
Access Key: XXXXXXXXXXXXXXX
Secret Key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Default Region: US
S3 Endpoint: cos.twcc.ai
DNS-style bucket+hostname:port template for accessing a bucket: cos.twcc.ai
Encryption password:
Path to GPG program: /usr/bin/gpg
Use HTTPS protocol: True
HTTP Proxy server name:
HTTP Proxy server port: 0

13. Test access: enter n

Test access with supplied credentials? [Y/n] n

14. Enter y to save the settings

After you enter y, the path and file name of the configuration file are displayed: /home/<system account>/.s3cfg

Save settings? [y/N] y
Configuration saved to '/home/<system account>/.s3cfg'

15. Modify certificate settings to avoid backup/synchronization errors

Replace <system account> with the system account shown in step 14.

sed -i 's/check_ssl_certificate = True/check_ssl_certificate = False/g' /home/<system account>/.s3cfg
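
The interactive steps above can also be captured in a script, which may help when several instances need the same setup. The following is a minimal sketch rather than an official TWSC tool: write_s3cfg is a hypothetical helper name, and the file it writes contains only the values chosen in steps 4 to 15, on the assumption that s3cmd uses its built-in defaults for any option not listed.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal s3cmd configuration without the interactive wizard.
# write_s3cfg is a hypothetical helper, not part of s3cmd.
# Usage: write_s3cfg ACCESS_KEY SECRET_KEY [CONFIG_PATH]

write_s3cfg() {
    local access_key="$1"
    local secret_key="$2"
    local cfg_path="${3:-$HOME/.s3cfg}"

    if [ -z "$access_key" ] || [ -z "$secret_key" ]; then
        echo "usage: write_s3cfg ACCESS_KEY SECRET_KEY [CONFIG_PATH]" >&2
        return 1
    fi

    # The file holds credentials, so create it readable by the owner only.
    (
        umask 077
        cat > "$cfg_path" <<EOF
[default]
access_key = $access_key
secret_key = $secret_key
host_base = cos.twcc.ai
host_bucket = cos.twcc.ai
use_https = True
check_ssl_certificate = False
EOF
    ) || return 1

    # Tighten permissions in case the file already existed.
    chmod 600 "$cfg_path"
}
```

For example, write_s3cfg "<YOUR-ACCESS-KEY>" "<YOUR-SECRET-KEY>" writes ~/.s3cfg, after which s3cmd ls should list your buckets if the keys are valid.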

Back up/synchronize data to COS

Once s3cmd is configured, you can save data from VCS instances and containers to COS buckets. There are two methods: backup and synchronization.

Backup: Files are copied to COS and stay there even if they are later deleted locally. If data is deleted or lost by mistake, you can restore it from COS at any time.
Synchronization: The data in COS is kept consistent with the data in the VCS instance/container. If a file is deleted on the instance/container, the copy in COS is deleted at the next synchronization.
  • Back up data

    • Back up now

      s3cmd sync /<LOCAL_DIR> s3://<DEST_BUCKET>
    • Set the backup schedule, for example, regularly back up files to COS every day at 2AM

      Enter the following command to edit the crontab:

      crontab -e 

      Add the following line to the crontab file (the -c option points to the configuration file saved in step 14):

      0 2 * * * /usr/local/bin/s3cmd -c /home/<system account>/.s3cfg sync /<LOCAL_DIR> s3://<DEST_BUCKET>
  • Synchronize data

    • Synchronize now

      s3cmd sync --delete-removed /<LOCAL_DIR> s3://<DEST_BUCKET>
    • Set the synchronization schedule, for example, regularly synchronize files to COS every day at 2AM

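      Enter the following command to edit the crontab:

      crontab -e  # Add the following content

      Add the following line to the crontab file (the full path to s3cmd is given because cron may not search /usr/local/bin):

      0 2 * * * /usr/local/bin/s3cmd -c /home/<system account>/.s3cfg sync --delete-removed /<LOCAL_DIR> s3://<DEST_BUCKET>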

      Schedule format: * * * * * (Minute Hour Day-of-month Month Day-of-week)

      Name               Allowed value (integer)
      Minute             0 - 59, or * (no designation)
      Hour               0 - 23, or * (no designation)
      Day of the month   1 - 31, or * (no designation)
      Month              1 - 12, or * (no designation)
      Day of the week    0 - 7 (0 and 7 both mean Sunday), or * (no designation)

      If you need to back up multiple versions of files, please enable the Bucket Versioning function. Refer to Enable or Disable Versioning for a Bucket from S3 Browser.
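
To illustrate how the schedule fields fit together, here is a minimal sketch of a helper that builds a daily crontab entry and rejects minute and hour values outside the allowed ranges. cron_line is a hypothetical name introduced for this example; it is not part of cron or s3cmd, and it only covers the "every day at a fixed time" case.

```shell
#!/usr/bin/env bash
# Sketch: build a crontab line that runs a command every day at HOUR:MINUTE.
# cron_line is a hypothetical helper, not part of cron or s3cmd.
# Usage: cron_line MINUTE HOUR COMMAND [ARG...]

cron_line() {
    if [ "$#" -lt 3 ]; then
        echo "usage: cron_line MINUTE HOUR COMMAND [ARG...]" >&2
        return 1
    fi

    local minute="$1"
    local hour="$2"
    local value
    shift 2

    # Both fields must be plain non-negative integers.
    for value in "$minute" "$hour"; do
        case "$value" in
            ''|*[!0-9]*)
                echo "cron_line: not a non-negative integer: $value" >&2
                return 1
                ;;
        esac
    done

    # Allowed ranges: minute 0 - 59, hour 0 - 23.
    if [ "$minute" -gt 59 ] || [ "$hour" -gt 23 ]; then
        echo "cron_line: minute must be 0 - 59 and hour 0 - 23" >&2
        return 1
    fi

    # Day of the month, month, and day of the week stay "*" (every day).
    printf '%s %s * * * %s\n' "$minute" "$hour" "$*"
}

# To append the generated line to the current user's crontab, one option is:
#   ( crontab -l 2>/dev/null; cron_line 0 2 /usr/local/bin/s3cmd sync /<LOCAL_DIR> s3://<DEST_BUCKET> ) | crontab -
```

Validating the fields before installing the entry avoids a schedule that cron rejects or that silently never runs.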

Recover backed up/synchronized files from COS

If the data of VCS instances/containers is deleted by mistake or lost, you can recover it from COS. There are two methods: restore from backup and restore from synchronization.

Restore from backup: All data backed up in COS is restored to the local machine. Local files that do not exist in COS are kept.
Restore from synchronization: The local directory is made identical to the data in COS. Where the two sides differ, the COS data takes precedence, and local files that do not exist in COS are deleted.
  • Restore from backup
    s3cmd sync s3://<DEST_BUCKET>/ /<LOCAL_DIR>/ 
  • Restore from synchronization
    s3cmd sync --delete-removed s3://<DEST_BUCKET>/ /<LOCAL_DIR>/
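
Because restoring with --delete-removed removes local files that are not in COS, it can be worth previewing the operation first. The sketch below wraps s3cmd's --dry-run option, which lists what would be transferred or deleted without changing anything. preview_restore and the S3CMD environment variable are hypothetical names introduced for this example; S3CMD only lets you point the helper at a specific s3cmd binary.

```shell
#!/usr/bin/env bash
# Sketch: preview a "restore from synchronization" without changing any files.
# preview_restore and the S3CMD variable are hypothetical names for this example.
# Usage: preview_restore BUCKET LOCAL_DIR

preview_restore() {
    local bucket="$1"
    local local_dir="$2"

    if [ -z "$bucket" ] || [ -z "$local_dir" ]; then
        echo "usage: preview_restore BUCKET LOCAL_DIR" >&2
        return 1
    fi

    # --dry-run only reports the planned downloads and deletions.
    # "${local_dir%/}/" makes sure the local path ends with exactly one slash.
    "${S3CMD:-s3cmd}" sync --dry-run --delete-removed \
        "s3://${bucket}/" "${local_dir%/}/"
}
```

If the listed deletions look right, run the same s3cmd sync command without --dry-run to perform the restore.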

Other COS commands

In addition to backing up, synchronizing and recovering data, s3cmd can also perform general operations on COS buckets:

  • List all buckets
s3cmd ls
  • Create a bucket
s3cmd mb s3://<BUCKET_NAME>
  • Upload a file
s3cmd put <LOCAL_FILE> s3://<BUCKET_NAME>
  • List all files in the bucket
s3cmd ls s3://<BUCKET_NAME>
  • Delete a file in the bucket
s3cmd rm s3://<BUCKET_NAME>/<FILE_NAME>
  • Delete a large number of files in the bucket (e.g., all files under the folder gpu-burn)
s3cmd del s3://<BUCKET_NAME>/gpu-burn/*
  • Delete a folder in the bucket
s3cmd del --recursive s3://<BUCKET_NAME>/<FOLDER_NAME>/
  • Delete a bucket
s3cmd rb s3://<BUCKET_NAME>
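
These commands can be combined in scripts. As a minimal sketch, the helper below uploads one file with s3cmd put and then uses s3cmd ls to confirm that the object is listed in the bucket. upload_and_verify and the S3CMD environment variable are hypothetical names introduced here, and the check is a simple text match on the listing, so treat it as an illustration rather than a robust integrity check.

```shell
#!/usr/bin/env bash
# Sketch: upload a file to a COS bucket and confirm that it is listed afterwards.
# upload_and_verify and the S3CMD variable are hypothetical names for this example.
# Usage: upload_and_verify LOCAL_FILE BUCKET_NAME

upload_and_verify() {
    local file="$1"
    local bucket="$2"
    local s3cmd_bin="${S3CMD:-s3cmd}"
    local name
    local target

    if [ ! -f "$file" ]; then
        echo "upload_and_verify: no such file: $file" >&2
        return 1
    fi

    name="$(basename "$file")"
    target="s3://${bucket}/${name}"

    # Upload the file; stop if s3cmd reports an error.
    "$s3cmd_bin" put "$file" "$target" || return 1

    # s3cmd ls prints one line per object, including its s3:// URI.
    if "$s3cmd_bin" ls "$target" | grep -qF -- "$target"; then
        echo "verified: $target"
    else
        echo "upload_and_verify: $target not found after upload" >&2
        return 1
    fi
}
```

For example, upload_and_verify /<LOCAL_DIR>/<FILE_NAME> <BUCKET_NAME> prints a "verified" line on success and returns a non-zero status otherwise, which makes it usable in cron jobs or larger scripts.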

Duplicati: applicable to VCS Windows instances

Download and install Duplicati

Back up data to COS

After the installation, you can back up data from VCS Windows instances to COS buckets.

1. Add backup settings

Click + Add backup, and select Configure a new backup.

Set the backup name, select AES-256 encryption, built-in as the encryption method, and set the passphrase.

2. Set the backup destination

  • Storage Type: S3 compatible
  • Use SSL: please tick the checkbox
  • Server: select Custom server url, and enter cos.twcc.ai
  • Bucket name: enter the bucket name you wish to store the backups
  • AWS Access ID/Key: enter COS Access Key and Secret Key
    • Public COS: the connection information is on the SERVICES > Cloud Object Storage (COS) > Third-party Software page.
    • Private COS: the connection information is on the Private COS > Third-party Software page.

  • After entering the information, click Test Connection. If the settings are correct, the system displays the message Connection worked!

3. Set the backup source data

  • Select the folder or file you want to back up from the local computer

4. Set automatic backup schedule

For example, run the backup automatically at 1 PM every day.

5. Set backup options and save the settings

Keep the default value for the remote volume size, or select a smaller size if your Internet connection is slow.

Select the backup retention according to your needs: keep all backups, delete backups that are older than a specific time, keep a specific number of backups, smart backup retention, or custom backup retention.

  • Click Save to save the settings. The Home page then displays the next scheduled backup and the backup configuration.

  • If you need to back up immediately, please click Run now

  • If you don’t need to back up data anymore, click Delete to delete the configuration

Restore backup data from COS

1. Set the restore configuration

Click Restore, and select the backup configuration you want to restore

2. Select files

Select the backup version, and select the files and folders you want to restore

3. Set the restore options

Set the options according to your needs, and click Restore to start the restoration

When the restoration completes, the following message is displayed: Your files and folders have been restored successfully.