Nextflow & nf-core on NYU HPC

All Nextflow and nf-core pipelines have been successfully configured for use on the HPC Cluster at New York University. The configuration applies required and recommended options in order to have e...

Non-fungible tokens (NFTs) for academic publications?

Being rejected from the preprint server, bioRxiv, seemed like a new low for me. But, I was heartened to know that I was in good company. At least an explanation was provided by the bioRxiv team tha...

Streamlined RNA-Seq Analysis Using Nextflow

UPDATED: April 16, 2024

JBrowse Genome Browser

During the summer of 2020, the Ghedin and Gresham labs at New York University sequenced several SARS-CoV-2 isolates from clinical samples acquired in New York City. To visualize and share the data ...

Variant Calling Pipeline using GATK4

Introduction

Single Cell RNA-Seq Allows For An Unprecedented Look At Plant Root Meristem Cell Identity

In the Kenneth Birnbaum Lab, we are interested in understanding how the plant root is able to grow continuously over the plant’s life and maintain its specific root structure (Fig.1). More specific...

Beginner's Guide to Bioinformatics Tools for Analyzing Microbiome Data

Next-generation sequencing technologies have made sequencing fast and inexpensive, and are increasingly used to study microbial communities. RNA-seq metatranscriptome and WGS metagen...

Three Useful Nextflow Patterns Every Computational Biologist Should Know

In this article I’ll go over three Nextflow patterns I frequently use to make development of Nextflow data processing pipelines easier and faster. I use each of these in most of my workflows, so th...

HighPrep PCR Beads as an AMPureXP Alternative

Comparing HighPrep PCR and AMPureXP for cleanup and size selection

Gene Set Enrichment Analysis in Minutes with the NASQAR Web App

Gene Set Enrichment Analysis (GSEA) is a common method to analyze RNA-Seq data that determines whether a predefined set of genes (for example, those in a GO term or KEGG pathway) show statis...

Analyze your Data Faster with NASQAR: Nucleic Acid SeQuence Analysis Resource

The bioinformatics team at the NYU Center for Genomics and Systems Biology in Abu Dhabi and New York has recently developed NASQAR (Nucleic Acid SeQuence Analysis Resource), a web-based platform p...

GPU-Accelerated MinION Basecalling On the HPC

I recently helped the Rockman lab basecall their MinION sequencing data on the Prince HPC, leveraging the power of the GPUs available there. This allowed us to bring the total time required for bas...

How To Find Out What Barcodes Are In Your Undetermined Reads

Sometimes after demultiplexing there exists a high number of undetermined reads, i.e. reads which were not assigned to any library based on the barcodes provided. This is most often the result of i...

Beginner's Guide: What is OpenStack?

OpenStack, a project originally started by NASA and Rackspace, is an open source cloud computing platform that enables users to access and control pools of compute, storage, and networking resource...

reform: Modify Reference Sequence and Annotation Files Quickly and Reproducibly

Update 7/21/2021: reform has officially been published as an NFT. Read about this experiment in scientific publishing here. Access the reform publication (PDF) here.

Next-Generation Sequencing Analysis Resources

The NYU Center for Genomics and Systems Biology in New York and Abu Dhabi has developed a new website with resources for mastering NGS analysis: https://learn.gencore.bio.nyu.edu/

Building an Analysis Pipeline for HPC using Python

In this post we will build a pipeline for the HPC using Python 3. We will begin by building the foundation for a pipeline in Python in part 1, and then use that to build a simple NGS analysis pipel...

Explore the New Shared Genome Resource

Save time and resources with the local CGSB repository of commonly used genomic data sets. Data is obtained from Ensembl and NCBI. New versions/releases will be added periodically or upon request. ...

Remote Desktop Connection to Prince

Connect to Prince using a remote desktop to analyze your data in RStudio, visualize in IGV, and interact with other GUI applications on the HPC.

Salmon & kallisto: Rapid Transcript Quantification for RNA-Seq Data

Salmon and kallisto might sound like a tasty entree from a hip Tribeca restaurant, but the duo are in fact a pair of next-generation applications for rapid transcript quantification. They represent...

Node to Joy: Maximize Your Performance on the HPC

In this post we’ll discuss maximizing your performance on the HPC. This entry is aimed at experienced HPC users; for new users, please see Getting Started on the HPC.

Variant Calling Pipeline: FastQ to Annotated SNPs in Hours

Note: This pipeline guide has been superseded by the updated Variant Calling Pipeline using GATK4 post.

Start the New Year Off Right: How to Choose the Right Sequencer

A new year means new sequencing projects, but how do you know which sequencer is right for your project? There are many factors that go into choosing which sequencing platform and machine will fit ...