Services

We provide NGS computational analysis for TCI-affiliated faculty members. The BiNGS staff advises on experimental design and protocols related to NGS, performs data analysis, and assists in data interpretation, integration, and visualization.

We offer two tiers of services:

1) Standard Computational Analysis

Step 1: Project design meeting. Upon receiving a project request we will schedule a meeting to discuss the details. We recommend having the initial meeting before you start experimenting so we can advise on best experimental practices, available technologies, data analysis, and statistical considerations, so that the data generated will be meaningful and appropriate for addressing the scientific questions from a computational perspective. We also accept projects starting from NGS data that has already been generated.

Step 2: Standard analysis. All data sets will go through a rigorous QC evaluation followed by alignment to the appropriate genome. Depending on the data type, we will perform a basic analysis that gives the researcher a first look at the results (e.g. for RNA-seq data we will generate differential expression tables and plots to visualize the data; for ChIP-seq data we will generate pileup files for visualization on a genome browser and call significant peaks). For each project, we will produce a report containing a complete and detailed description of the results and computational methods (e.g. QC reports, protocols used for read trimming/mapping, methods of read filtering, methods of alignment and peak calling, and data normalization techniques).

Step 3: Follow-up meeting (optional). Upon completion of data analysis, we will request a second meeting to discuss the results and any additional analyses that are within the scope of our standard analysis. Investigators who want a more customized analysis after the standard analysis is complete can request one as described below.

2) Customized Computational Analysis

Upon request, the BiNGS Shared Resource Facility will provide customized analyses. For such projects, the facility will assign a bioinformatician who will take a ‘deeper dive’ into the data, working closely with the investigator to address specific hypotheses (e.g. clustering, dimensionality reduction, data integration with publicly available datasets, network analysis, enrichment analysis, and more sophisticated and customized data visualization). The payment structure for such projects will be evaluated on a case-by-case basis.

Acknowledgement

We ask that our work be acknowledged in publications and presentations supported by BiNGS. Please also consider including our bioinformaticians in the author list in cases where they contributed significantly.

Please acknowledge us with the following statement:
“This work was supported in part through the Bioinformatics for Next Generation Sequencing (BiNGS) shared resource facility within the Tisch Cancer Institute at the Icahn School of Medicine at Mount Sinai. The development of this shared resource is partially supported by the NCI P30 Cancer Center support grant.”

Epigenomics

Chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing (ChIP-seq) is a method used to analyze protein interactions with DNA and identify the binding sites of DNA-associated proteins.

Standard analysis:

  • Evaluation of read quality and alignment statistics
  • Sample normalization using internal controls or computational methods
  • A link to a UCSC genome browser session for all normalized datasets
  • Peak files for all significantly enriched regions
  • Assessment of sample similarity (PCA plot)
  • Annotation of genomic distribution (e.g. promoters, gene bodies)
  • Gene Set Enrichment Analysis (GSEA)
  • Gene-Ontology terms and pathway enrichment analysis
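
As an illustration of the pathway enrichment item above, one common formulation is an over-representation test based on the hypergeometric distribution. The Python sketch below is only an assumption about how such a test can be computed; the source does not state which method or tool the facility uses, and the function name and the numbers in the usage line are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: total number of genes considered
    n_in_set:     genes annotated to the GO term or pathway
    n_selected:   genes of interest (e.g. genes near called peaks)
    n_overlap:    genes of interest that are also in the set
    """
    max_overlap = min(n_in_set, n_selected)
    # Number of ways to draw n_selected genes from the background.
    total = comb(n_background, n_selected)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 genes, 200 in the term, 500 selected, 15 shared.
p = enrichment_p_value(20000, 200, 500, 15)
```

In practice the p-values from many terms would also need multiple-testing correction, which this sketch leaves out.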

Customized analysis:

  • Motif discovery
  • Differential peaks across multiple samples
  • Data integration (e.g. RNA-seq, ATAC-seq)
  • Characterization of chromatin states
  • Enhancer and Super Enhancer identification and gene association
  • Identification of alternative promoters
  • Alignment to repetitive sequences and enrichment quantification
  • Data integration with publicly available resources (e.g. ENCODE, TCGA)
  • Publication quality figures

Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) is a technique used to assess genome-wide chromatin accessibility, which is a strong indicator of the activity of functional DNA sequences.

Standard analysis:

  • Evaluation of read quality and alignment statistics
  • Sample normalization using computational methods
  • A link to a UCSC genome browser session for all normalized datasets
  • Peak files for all significantly accessible regions
  • Assessment of sample similarity (PCA plot)
  • Annotation of genomic distribution (e.g. promoters, gene bodies)
  • Gene Set Enrichment Analysis (GSEA)
  • Gene-Ontology terms and pathway enrichment analysis

Customized analysis:

  • Motif discovery
  • Differential peaks across multiple samples
  • Quantification of differential accessibility across multiple samples
  • Data integration (e.g. RNA-seq, ChIP-seq)
  • Association of intergenic accessible regions with genes
  • Data integration with publicly available resources (e.g. ENCODE, TCGA)
  • Publication quality figures

Cleavage Under Targets and Release Using Nuclease (CUT&RUN) combines antibody-targeted controlled cleavage by micrococcal nuclease with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. CUT&Tag is a related method that improves on CUT&RUN by using the Tn5 transposase for DNA tagmentation.

Standard analysis:

  • Evaluation of read quality and alignment statistics
  • Sample normalization using internal controls or computational methods
  • A link to a UCSC genome browser session for all normalized datasets
  • Peak files for all significantly enriched regions
  • Assessment of sample similarity (PCA plot)
  • Gene Set Enrichment Analysis (GSEA)
  • Gene-Ontology terms and pathway enrichment analysis

Customized analysis:

  • Motif discovery
  • Differential peaks across multiple samples
  • Data integration (e.g. RNA-seq, ATAC-seq)
  • Characterization of chromatin states
  • Enhancer and Super Enhancer identification and gene association
  • Identification of alternative promoters
  • Alignment to repetitive sequences and enrichment quantification
  • Data integration with publicly available resources (e.g. ENCODE, TCGA)
  • Publication quality figures

Single-cell chromatin accessibility sequencing is a powerful technology for characterizing the heterogeneity of the genome-wide epigenetic regulatory landscape in complex tissues.

Standard analysis:

  • Evaluation of read quality and alignment statistics
  • Sample normalization using computational methods
  • A link to a UCSC genome browser session for all normalized datasets
  • Peak-Barcode Matrix
  • Assessment of sample similarity (PCA plot)
  • Annotation of genomic distribution (e.g. promoters, gene bodies)
  • Gene Set Enrichment Analysis (GSEA)
  • Gene-Ontology terms and pathway enrichment analysis

Customized analysis:

  • Dimensionality Reduction, Clustering and t-SNE Projection
  • TF motif analysis, TF enrichment for each cell, differential footprinting analysis
  • cis-interaction prediction
  • Cell type assignment based on chromatin accessibility signals of known cell type marker genes
  • Quantification of differential accessibility across multiple samples
  • Data integration (e.g. RNA-seq, ChIP-seq)
  • Association of intergenic accessible regions with genes
  • Data integration with publicly available resources (e.g. ENCODE, TCGA)
  • Publication quality figures

Hi-C is a method that uses high-throughput sequencing to map chromatin conformation in an all-against-all manner across the entire genome. HiChIP combines Hi-C with ChIP-seq to detect interactions mediated by a protein of interest.

Standard analysis:

  • Evaluation of read quality and alignment statistics
  • A link to a UCSC genome browser session for all normalized datasets
  • Loop calls for the significant interactions
  • Compartment and TAD calls

Customized analysis:

  • Differential loops across multiple samples
  • Data integration (e.g. RNA-seq, ATAC-seq)
  • Enhancer and Super Enhancer identification and gene association
  • Publication quality figures

Transcriptomics

RNA-seq uses next-generation sequencing to reveal the presence and quantity of RNA in a biological sample at a given moment, capturing the continuously changing cellular transcriptome. RNA-seq provides normalized expression levels and allows the identification of differential gene expression patterns between samples, alternative splicing events, and usage of alternative promoters.

Standard analysis:

  • Evaluation of read quality and alignment statistics
  • Sample normalization using internal controls or computational methods
  • A link to a UCSC genome browser session for all normalized datasets
  • Normalized expression values (TPM)
  • Assessment of sample similarity (PCA plot)
  • Differential gene expression
  • Gene Set Enrichment Analysis (GSEA)
  • Gene-Ontology terms and pathway enrichment analysis
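
For readers unfamiliar with the TPM values listed above, the following Python sketch shows the standard transcripts-per-million calculation: counts are first divided by gene length in kilobases, then rescaled so that each sample sums to one million. It is a minimal illustration under assumed inputs (a list of raw counts and a matching list of gene lengths in base pairs); the function name is hypothetical and the source does not describe the facility's actual pipeline.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to transcripts per million."""
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: reads per kilobase, which corrects for gene length.
    rpk = [count / (length / 1000) for count, length in zip(counts, lengths_bp)]
    # Step 2: rescale so the values for the sample sum to one million.
    scale = sum(rpk)
    if scale == 0:
        return [0.0] * len(counts)
    return [value / scale * 1e6 for value in rpk]


# Hypothetical toy sample with three genes.
values = tpm([100, 300, 50], [2000, 3000, 500])
```

Because every sample is scaled to the same total, TPM values are comparable across samples as relative abundances, although differential expression testing is usually performed on raw counts with a dedicated statistical model.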

Customized analysis:

  • Gene expression modules
  • Clustering
  • Motif discovery
  • Data integration (e.g. ATAC-seq, ChIP-seq)
  • Data integration with publicly available resources (e.g. ENCODE, TCGA)
  • Publication quality figures

Alternative splicing is a process that enables mRNA to direct synthesis of different isoforms that may have different cellular functions or properties.

Standard analysis:

  • Evaluation of read quality and alignment statistics
  • Sample normalization using internal controls or computational methods
  • A link to a UCSC genome browser session for all normalized datasets
  • rMATS output report
  • Quantification of differential splicing events (SE, RI, A5’SS, A3’SS and MXE)
  • Gene Set Enrichment Analysis (GSEA)
  • Gene-Ontology terms and pathway enrichment analysis

Customized analysis:

  • Shapiro plots of 5’SS strength and motif
  • GO and Pathway analysis
  • Publication quality figures

Alternative promoters are one of the main transcriptional regulatory mechanisms that play a central role in determining the set of expressed transcripts as well as their expression levels in a cell.

Standard analysis:

  • Evaluation of read quality and alignment statistics
  • Sample normalization using computational methods
  • A link to a UCSC genome browser session for all normalized datasets
  • Assessment of sample similarity (PCA plot)
  • Heatmap of promoter activity estimates
  • Identification of alternative promoter usage across conditions
  • Gene Set Enrichment Analysis (GSEA)
  • Gene-Ontology terms and pathway enrichment analysis

Customized analysis:

  • Motif discovery
  • Data integration (e.g. ChIP-seq, ATAC-seq)
  • Data integration with publicly available resources (e.g. ENCODE, TCGA)
  • Publication quality figures

With the advent of next-generation sequencing, transcriptomic characterization of patient cohorts has become increasingly valuable. Landmark cancer genomic datasets such as The Cancer Genome Atlas (TCGA) have molecularly characterized thousands of matched normal/cancer samples spanning most cancer types. Transcriptomic analysis of public cancer datasets allows the characterization of differential gene expression, enrichment of specific gene sets, and survival in the context of specific cancer types, mutations, and/or expression of specific genes.

Standard analysis:

  • Differential gene expression analysis (possibly in the context of specific mutation/cancer type)
  • GO, Pathway analysis, Gene Set Enrichment Analysis (GSEA)
  • Clustering of samples through dimensionality reduction
  • Survival correlations
  • Clinical features correlations
  • Publication quality figures
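
To make the survival correlation item above more concrete, the sketch below computes a Kaplan-Meier survival curve in plain Python. This is an assumption about one typical building block of survival analysis, not a description of the facility's workflow; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with an observed event.

    times:  follow-up time for each patient
    events: True if the event (e.g. death) was observed, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after t.
        at_risk -= removed
    return curve


# Hypothetical cohort: the patient with time 2 is censored.
curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
```

Comparing such curves between groups, for example patients with high versus low expression of a gene, would typically also involve a log-rank test or a Cox model, which are outside the scope of this sketch.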

Genomics

A diverse array of DNA sequencing strategies is currently available, ranging from high read-depth cancer gene panel sequencing to whole genome sequencing. These approaches allow precise genetic variant detection as a fundamental step toward understanding disease causes, evolution, and response to treatment.

Standard analysis:

  • Evaluation of read quality and alignment statistics
  • Duplicate read removal, base recalibration and other best practices
  • Germline short variant discovery (SNPs and indels)
  • Somatic short variant discovery (SNVs and indels)
  • Germline copy number variant (CNV) and structural variant (SV) discovery
  • Somatic copy number variant (CNV) and structural variant (SV) discovery

Customized analysis:

  • Multiple caller consensus integration and reporting
  • Variant annotation and interpretation of pathogenicity
  • Mutational burden and mutational signatures analyses (somatic)
  • Cohort integration and visualization of variants (e.g. oncoprint, lollipop plots)
  • Sub-clonality and tumor purity estimation
  • Tumor evolution studies for longitudinal sample sets
  • Cohort integration and visualization of CNVs (e.g. GISTIC)
  • Multitrack visualizations using Circos plots
  • Analysis of complex genomic rearrangements (e.g. chromothripsis) (WGS)
  • Integrative analyses with RNA-seq (eQTL)
  • Integration of non-coding variants with transcriptomic and epigenetic data (WGS)
  • Publication quality figures
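
As a small illustration of the mutational burden item in the list above, tumor mutational burden is commonly reported as the number of somatic mutations per megabase of sequenced, callable territory. The Python sketch below assumes that definition; the function name and the example numbers are hypothetical, and which variant types are counted in practice depends on the assay and is not specified in the source.

```python
def tumor_mutational_burden(n_somatic_mutations, callable_bases):
    """Return somatic mutations per megabase of callable sequence."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    # Convert the callable territory from bases to megabases.
    callable_mb = callable_bases / 1e6
    return n_somatic_mutations / callable_mb


# Hypothetical exome: 300 somatic mutations over 30 Mb of callable sequence.
tmb = tumor_mutational_burden(300, 30_000_000)
```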