Using Globus to Move and Share Data

Research data has seen explosive growth with ever-increasing instrument resolution and the advent of “big data” methods such as deep learning. To share these large datasets, Einstein has implemented several different tools which apply to different scenarios.

This document outlines the use of Globus, a tool for easily moving large files and automating file transfers.
Below are instructions for:

  1. Setting up your account to use the tools
  2. Exchanging files with other Einstein employees
  3. Exchanging files with people at other institutions

What is Globus?

Globus enables users to exchange files through a simple file explorer interface. It is designed to handle large files efficiently but is useful for data of any size. Globus consists of a browser-based file manager, with GridFTP and other tools working in the background to move data and schedule transfers.

Globus transfers the data securely directly from the source to the destination, monitors the progress, and validates that the data has been copied correctly. Once the transfer has been initiated you do not need to watch over the process. Globus will navigate any network interruptions and resume the transfer where it left off.
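The validation step can be pictured as a checksum comparison between the source and the copy. The sketch below is not Globus's own implementation. It is a minimal stand-alone illustration, using only Python's standard library, of how a copied file can be checked against its source. The function names `file_checksum` and `copy_is_intact` are hypothetical.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks so that
    large files never have to fit in memory at once."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_is_intact(source, destination):
    """Return True when both files have identical checksums."""
    return file_checksum(source) == file_checksum(destination)
```

With Globus you do not need to run a check like this yourself; it is shown only to make the idea of validating a copy concrete.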

Globus organizes data into collections. Collections are pre-defined data sources that can be at any institution that uses Globus. To transfer data using Globus, you must have permission to access both the source and destination collections, and that permission must be set up ahead of time. For example, if you have one collection defined on data.einstein and another on your personal or lab workstation, you can use Globus to move data between them without mounting data.einstein on your workstation.

For detailed information about Globus see Globus Documentation.


Who Can Use Globus?

Everyone with either an einsteinmed.edu or montefiore.org email address can use Globus.

The Globus web application, https://app.globus.org, uses the Einstein campus Active Directory to identify allowed users. Once you are on the app, you can see any collections that you have permission to access. To create a collection on data.einstein, create a service request with Information Technology at Einstein.

Setting Up Your Account on Globus


To get started, navigate to https://app.globus.org, where you will be prompted to log in. Select Albert Einstein College of Med from the drop-down menu:

You will then be prompted to enter your credentials:

You will then be presented with the File Manager page:

Click on the Search area of the Collection menu and type in the name of your collection. Select the two desired collections, and Globus will list their contents in the file browser. You can then select files and folders to be transferred from one area to the other. Note that either collection can act as the source or the destination.

Creating and Accessing Collections

Data collections are central to Globus. You can set up your own personal collection by installing Globus Connect Personal on your system. This establishes your system as a Globus endpoint, and you can select a certain area of your local disk for sharing. Note that only people to whom you give permission can access your data. If you are only moving data between two of your own systems, you grant permission only to yourself.

If a collaborator has given permission to you to see their collection, you can access it through the file browser.
To create a collection on a computer for which you do not have admin privileges, such as data.einstein, create a service request with Information Technology at Einstein.

  • Example 1: Remote Work
    You may want to transfer files back and forth to campus systems without being on the campus network, for example, between your laptop and the campus data lake, data.einstein.
    To accomplish this, create a collection on your personal computer by following the instructions at Globus Connect Personal, and request that a “globus” folder be created under your home directory on data.einstein and shared out on Globus as a named collection. Once this is set up, you would be able to move data back and forth between the two systems in an asynchronous manner. You would not need to be able to see data.einstein from your local machine or even be connected to the campus network.
  • Example 2: Sharing Data with Outside Collaborators
    To share data with an outside collaborator, first determine where the data resides or is to be placed. You can use your Globus Connect Personal setup for this, or a collection on data.einstein. The external user would have to give you permission to read or read/write to their collection as required, and you would do the same. If you are only downloading information, then the external user does not need access to your collection at all. As above, to create a collection on campus, create a service request with Information Technology at Einstein.
  • Example 3: Moving Data from an Instrument to Central Storage
    Many departments have instruments which are constantly producing data. This data is stored locally on a system directly connected to the instrument, and then must be moved somewhere for processing. The Information Technology at Einstein engineering team can work with you to set up a Globus endpoint on the local system and have it transfer data to another endpoint such as data.einstein.
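
For readers curious what an automated instrument transfer might look like as a script, Globus publishes a Python SDK (`globus_sdk`). The following is a rough sketch only, not the configuration that Information Technology at Einstein deploys. The collection IDs, paths, and the helper names `plan_items` and `submit_instrument_transfer` are hypothetical placeholders. It assumes a recent `globus_sdk` release in which `TransferData` can be built without passing a client, and it assumes you already hold a valid transfer access token.

```python
from pathlib import PurePosixPath


def plan_items(file_names, source_dir, dest_dir):
    """Pair each instrument output file with its destination path."""
    src = PurePosixPath(source_dir)
    dst = PurePosixPath(dest_dir)
    return [(str(src / name), str(dst / name)) for name in file_names]


def submit_instrument_transfer(access_token, source_id, dest_id, items,
                               label="instrument upload"):
    """Submit one Globus transfer task and return its task ID.

    source_id and dest_id are the IDs of the two collections (hypothetical
    values must be supplied by the caller).
    """
    # Imported here so plan_items can be used without the SDK installed.
    import globus_sdk

    client = globus_sdk.TransferClient(
        authorizer=globus_sdk.AccessTokenAuthorizer(access_token)
    )
    task = globus_sdk.TransferData(
        source_endpoint=source_id,
        destination_endpoint=dest_id,
        label=label,
        sync_level="checksum",  # copy only files whose checksums differ
    )
    for source_path, dest_path in items:
        task.add_item(source_path, dest_path)
    return client.submit_transfer(task)["task_id"]
```

A script like this could be run on a schedule on the instrument workstation; once the task is submitted, Globus carries out the transfer on its own and the task can be monitored in the web application.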


Resources and Asking for Help

The primary resource available online is the HPC website:

HPC3.0 User Guide

Here you will find information on the cluster, training videos, and documentation on using the cluster and data storage.
For user support go to:


Our intent is that you will use this document as a guide. While we work diligently to prepare accurate documentation, the steps we have outlined and the screenshots we have provided above will not precisely replicate each person’s experience, because technology evolves and varies with device hardware, software version, and device customizations or configurations. If this guide is inadequate for your needs, please open a support request with us and we will help you.