NanoString Gene Expression RCC Methods

Normalization

ROSALIND® follows the default nCounter® Advanced Analysis protocol of dividing counts within a lane by the geometric mean of the normalizer probes from the same lane. Housekeeping probes used for normalization are selected with the geNorm algorithm as implemented in the Bioconductor package NormqPCR. For more information on the normalization methods, see page 41 of the nCounter® Advanced Analysis 2.0 User Manual.
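
The per-lane division described above can be illustrated with a minimal Python sketch. It is not ROSALIND's implementation: the probe names, counts, and function names are hypothetical, the normalizer probes are assumed to have been chosen already (for example by geNorm), and all counts are assumed to be strictly positive.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def normalize_lane(counts, normalizer_probes):
    """Divide every count in one lane by the geometric mean
    of that lane's normalizer probes."""
    factor = geometric_mean([counts[p] for p in normalizer_probes])
    return {probe: count / factor for probe, count in counts.items()}

# Hypothetical lane; probe names and counts are invented for illustration.
lane = {"HK1": 100.0, "HK2": 400.0, "GENE_A": 50.0}
normalized = normalize_lane(lane, ["HK1", "HK2"])
```

Because each lane is divided by its own normalizer geometric mean, differences in overall signal between lanes are removed before samples are compared.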

Multi-Lot and PlexSet Calibration

When a user has more than one lot of the same panel, or is working with PlexSet data, reference or calibration samples can be defined in ROSALIND during experiment setup. These samples are used to quantify and adjust for variability in probe efficiency across batches or lanes. Calibration factors are calculated on a per-lot basis and are applied as a multiplier to all probes in that lot.
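
As a rough illustration of a per-lot multiplicative factor, the Python sketch below assumes the factor is the ratio of the calibration sample's geometric mean count in a reference lot to its geometric mean count in the lot being calibrated. That definition, the function names, and the example counts are all assumptions made for illustration; the exact calculation used by ROSALIND is not described here.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def lot_calibration_factor(reference_lot_counts, this_lot_counts):
    """Assumed definition: ratio of the calibration sample's geometric
    mean in the reference lot to its geometric mean in this lot."""
    return geometric_mean(reference_lot_counts) / geometric_mean(this_lot_counts)

def calibrate(sample_counts, factor):
    """Multiply every probe count in a sample by its lot's factor."""
    return {probe: count * factor for probe, count in sample_counts.items()}

# Hypothetical counts for the same calibration sample run on two lots.
reference_lot = [100.0, 400.0]
lot_b = [50.0, 200.0]
factor = lot_calibration_factor(reference_lot, lot_b)
calibrated = calibrate({"GENE_A": 30.0}, factor)
```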

Differential Expression

ROSALIND follows the nCounter® Advanced Analysis protocol to identify targets that show significantly increased or decreased expression. Differential expression is calculated based on user-specified groups. In ROSALIND, users can set up comparisons based on sample attributes or by selecting specific samples for each comparison of interest. Effects of confounding variables or covariates can optionally be assessed with ROSALIND Covariate Correction. Fold changes and p-values are calculated using the fast method as described on page 50 of the nCounter® Advanced Analysis 2.0 User Manual. P-value adjustment is performed using the Benjamini-Hochberg method of estimating false discovery rates (FDR).
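
The Benjamini-Hochberg adjustment mentioned above can be sketched in a few lines of Python. This is the standard step-up procedure, not ROSALIND's code; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four probes.
p_adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values monotone.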

The Gene Expression workflow includes a thresholding step that follows the Advanced Analysis protocol. Background thresholds are calculated by taking the 97.5th percentile of the negative controls. A probe is removed from the analysis if more than half of the samples do not meet their sample-specific thresholds. For probes that have been pruned, ROSALIND reports log2FoldChange as 0 and pValue and pAdj as 1.
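
The pruning rule above can be sketched as follows. Several details are assumptions made for illustration: each sample's threshold is computed from that sample's own negative controls, the percentile uses linear interpolation, and a sample "meets" the threshold when its count is at or above it. The data and function names are hypothetical.

```python
def percentile(values, q):
    """Linearly interpolated percentile, q in [0, 100].
    The interpolation rule is an assumption."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

def pruned_probes(counts, negative_controls):
    """counts: {probe: {sample: count}};
    negative_controls: {sample: [negative control counts]}."""
    thresholds = {s: percentile(v, 97.5) for s, v in negative_controls.items()}
    pruned = set()
    for probe, per_sample in counts.items():
        below = sum(1 for s, c in per_sample.items() if c < thresholds[s])
        # Prune when more than half of the samples fall below threshold.
        if below > len(per_sample) / 2:
            pruned.add(probe)
    return pruned

# Hypothetical data for three samples and two probes.
negatives = {"S1": [2, 4, 6, 8], "S2": [1, 3, 5, 7], "S3": [2, 2, 2, 10]}
counts = {
    "GENE_A": {"S1": 50, "S2": 40, "S3": 60},
    "GENE_B": {"S1": 3, "S2": 9, "S3": 4},
}
pruned = pruned_probes(counts, negatives)

# Pruned probes receive the placeholder statistics described above.
placeholder = {p: {"log2FoldChange": 0.0, "pValue": 1.0, "pAdj": 1.0} for p in pruned}
```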

Cell Type Profiler

The abundance of various cell populations is calculated in ROSALIND using the Cell Type Profiling Module. The method quantifies cell populations using marker genes that are expressed stably and specifically in given cell types. ROSALIND filters the Cell Type Profiling results to retain only scores with a p-value less than or equal to 0.05. Details of the Cell Type Profiling algorithm can be found on page 84 of the nCounter® Advanced Analysis 2.0 User Manual.

Pathway Analysis

Gene Set Analysis (GSA) is incorporated into ROSALIND to summarize the global significance score and the directed global significance score. GSA summarizes the change in regulation within each defined gene set relative to the baseline. More information on GSA algorithm details can be found on page 56 of the nCounter® Advanced Analysis 2.0 User Manual.
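
For orientation, the two scores can be sketched in Python under the assumption that they are defined from per-gene t-statistics as follows: the global significance score is the square root of the mean squared t-statistic, and the directed score applies the same calculation to signed squares and keeps the resulting sign. These definitions are not stated in this document and should be checked against the user manual; the function names and example values are hypothetical.

```python
import math

def global_significance(t_stats):
    """Assumed definition: square root of the mean squared t-statistic."""
    return math.sqrt(sum(t * t for t in t_stats) / len(t_stats))

def directed_global_significance(t_stats):
    """Assumed definition: signed square root of the mean of sign(t) * t^2."""
    u = sum(math.copysign(t * t, t) for t in t_stats) / len(t_stats)
    return math.copysign(math.sqrt(abs(u)), u)

# Hypothetical t-statistics for the genes in one gene set.
t_stats = [2.0, -1.0, 2.0]
undirected = global_significance(t_stats)
directed = directed_global_significance(t_stats)
```

Under these assumed definitions, the undirected score measures how much the set changes overall, while the directed score is pulled toward zero when up-regulated and down-regulated genes offset each other.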

Additional ROSALIND enrichment is calculated for gene sets defined in multiple databases, including WikiPathways, REACTOME, MSigDB, and Gene Ontology. Enrichment is calculated using a hypergeometric distribution algorithm relative to the background set of genes, that is, all genes in the panel. FDR adjustment and p-Elim pruning scores are provided when appropriate.
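
A hypergeometric enrichment test of this kind can be sketched as an upper-tail probability. The Python example below is a generic illustration rather than ROSALIND's implementation; the function name, parameter names, and numbers are hypothetical, and it assumes the list being tested is the set of differentially expressed genes.

```python
from math import comb

def hypergeom_enrichment_p(panel_size, set_in_panel, de_total, de_in_set):
    """Probability of seeing at least de_in_set gene-set members among
    de_total genes drawn from a panel of panel_size genes, of which
    set_in_panel belong to the gene set."""
    upper = min(set_in_panel, de_total)
    total = comb(panel_size, de_total)
    tail = sum(
        comb(set_in_panel, k) * comb(panel_size - set_in_panel, de_total - k)
        for k in range(de_in_set, upper + 1)
    )
    return tail / total

# Hypothetical case: a 20-gene panel, 5 genes in the pathway,
# 4 differentially expressed genes, 3 of them in the pathway.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
```

Using the panel as the background matters: testing against the whole genome would overstate enrichment for pathways that the panel was designed to cover.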