Gaining Access to Sequencing Data on the HPC
To access your sequencing data on the NYU HPC cluster, you will need to:
- Create an HPC account by visiting the NYU High Performance Computing Wiki and following the account creation procedures.
- Submit a request using the Biology Computation Support Form to join the CGSB Linux working group and gain access to your lab's sequencing results directory.
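Once both steps are complete, a quick check from a login shell can confirm that the permissions have taken effect. The following is a minimal sketch, not an official tool: `check_read_access` is a hypothetical helper name, and the directory you pass in is whichever lab directory you were granted.

```shell
# check_read_access DIR
# Hypothetical helper: reports whether DIR exists and can be listed by you.
check_read_access() {
  local dir="$1"
  # -d: is a directory, -r: readable, -x: can be entered
  if [ -d "$dir" ] && [ -r "$dir" ] && [ -x "$dir" ]; then
    echo "OK: read access to $dir"
  else
    echo "No read access to $dir (the request may still be pending)" >&2
    return 1
  fi
}
```

For example, `check_read_access /projects/rps/cgsb/gencore/out/` tests the fastq delivery location; substitute your own lab's directory as needed.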
Data Policy and Retention
- Demultiplexed and raw lane fastq files are transferred to lab directories at /projects/rps/cgsb on the HPC.
- Lab owners receive read access to fastq files in this location.
- Data in lab directories is backed up and is not subject to deletion.
- Raw and processed sequencing directories are archived and retained for a minimum of five years.
- Raw sequencing directories are available upon request.
- Lab shares are kept for up to three years after the PI's departure from CGSB.
HPC Best Practices
- Run jobs and save output in your personal scratch directory: /scratch/netID/my-project/job-xyz/
- Store Slurm scripts in job or project directories to enable parameter verification and reproducibility.
- Keep personal scripts (Python, executables) in your home folder: /home/netID/
- Reference scripts from your home directory using the $HOME variable in Slurm submissions.
- Request unavailable software packages from hpc@nyu.edu; check existing modules via module avail.
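The practices above can be sketched as a small helper that creates a job directory and writes the Slurm script into it, so the script stays next to the output it produced. This is an illustration under stated assumptions: `write_job_script`, `run.slurm`, `my_analysis.py`, and the resource values are all hypothetical placeholders, and any `module load` line depends on what `module avail` lists on the cluster.

```shell
# write_job_script JOB_DIR
# Hypothetical helper: creates JOB_DIR and writes a Slurm script into it.
write_job_script() {
  local job_dir="$1"
  mkdir -p "$job_dir" || return 1
  # Quoted 'EOF' keeps $HOME literal so it is expanded when the job runs.
  cat > "$job_dir/run.slurm" <<'EOF'
#!/bin/bash
#SBATCH --job-name=job-xyz
#SBATCH --time=01:00:00
#SBATCH --mem=4G
#SBATCH --cpus-per-task=1
#SBATCH --output=slurm-%j.out

# Load software here; pick names from the output of `module avail`.
module purge

# Personal script kept in the home directory (placeholder name).
# Slurm starts the job in the directory where sbatch was run,
# so results.txt lands in the job directory.
python "$HOME/scripts/my_analysis.py" > results.txt
EOF
}
```

On the cluster this might be used as `write_job_script /scratch/netID/my-project/job-xyz`, followed by `cd /scratch/netID/my-project/job-xyz` and `sbatch run.slurm`, with netID replaced by your own.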
HPC Important Locations
| Directory | Path | Quota | Flushing | Backup | Purpose |
|---|---|---|---|---|---|
| GenCore Fastq Delivery | /projects/rps/cgsb/gencore/out/ | - | Protected | Yes | Sequencing output files |
| Lab Share | /projects/cgsb/ | Subject to charges | Protected | Cloud | Lab collaboration and results sharing |
| Personal Scratch | /scratch/netID/ | 5TB | 60-day deletion | No | Active analysis work |
| Personal Home | /home/netID/ | 50GB | Protected | Yes | Custom scripts and tools |
| Personal Archive | /archive/netID/ | 2TB | Protected | Yes | Completed archived analyses |
| Shared Genomes | /projects/work/cgsb/genomes | - | Protected | Yes | Common genomic datasets |
To establish a shared lab directory, please submit a Lab Share Directory Form.
For shared genome resources, see the Shared Genome Resource documentation.
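Because personal scratch is flushed after 60 days and is not backed up, it can help to list files that are approaching that age before they are removed. The sketch below is one possible approach, not the cluster's own tooling: `list_stale_files` is a hypothetical helper, the 50-day default is an arbitrary safety margin, and it assumes modification time is a reasonable proxy for how the flush policy measures age.

```shell
# list_stale_files DIR [DAYS]
# Hypothetical helper: prints regular files under DIR whose last
# modification is more than DAYS days old (default 50).
list_stale_files() {
  local dir="$1"
  local days="${2:-50}"
  # -mtime +N matches files modified more than N whole days ago.
  find "$dir" -type f -mtime +"$days" -print
}
```

For example, `list_stale_files /scratch/netID/` (with your own netID) prints candidates to copy to /archive/netID/ or a lab share before they are flushed.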