
Welcome to GeoScript™ Hub

GeoScript Hub enables easy access to the latest spatial biology analytic capabilities through NanoString™ validated R-scripts. These R-scripts can be used directly within GeoMx® DSP Data Analysis Suite, or they can be incorporated into your own environment. The GeoMx user community is also encouraged to share their own developments; however, these will not be validated by the NanoString team.

NanoString-contributed scripts are developed by NanoString’s biostatistics and bioinformatics teams to support your GeoMx DSP research goals. Each script is intended to provide a single, simple function, such as an individual heatmap plot, and is meant to supplement the analytics provided in the GeoMx Data Analysis software. These scripts are hosted in a public GitHub repository and can be accessed here. Many scripts have corresponding R packages that can be incorporated into your own pipeline or development environment. These packages can be accessed in Bioconductor.

Now Available

Recently added to GeoScript Hub

Evaluate Normalization for Protein: This script was designed for data from the GeoMx protein assay. It will also work with data from the GeoMx RNA assay.

RNA Negative Normalization: This script normalizes the observed RNA data using information from the negative probes in the assay.

SpatialDecon (v1.1 updated April 2021): This script was designed for data from GeoMx high-plex RNA assays, such as the Cancer Transcriptome Assay and Whole Transcriptome Assay.

Dimension Reduction (v1.1 updated March 2021): Performs dimension reduction analysis of the segments within the study.

Volcano Plot: The Labeled Volcano Plot DSP DA script creates publication-ready labeled volcano plots based on the researchers’ input and statistical study results. The script also creates a table of tagged genes.

Cell-Type Contouring: The Cell-Type Contouring ImageJ script enables contouring around a morphology marker-based segment mask.

Example Datasets and Tutorials

Dataset 1: Exploring Whole Transcriptome across the Kidney

This dataset consists of three normal tissue samples and four samples with diabetic kidney disease. High-resolution slide images and corresponding spatially resolved gene expression data for selected regions of interest (ROIs) on the slides are available.

ROI gene expression was detected using the Human Whole Transcriptome Atlas (WTA) from NanoString, a panel used to collect data from 18,000+ human protein-coding genes on the GeoMx DSP. After WTA expression tags are collected from the ROIs, they are sequenced on an Illumina sequencer and processed in the GeoMx DSP Data Analysis Suite to link gene identity with expression tag. Gene expression data are provided as raw counts, processed counts post quality control, and normalized counts.

Download annotated dataset

Intended Use

Whether you are a discovery or translational researcher, the GeoMx DSP is the most flexible spatial solution designed to conform to your ever-changing research needs. All scripts and packages are for research use only.

Validation Process

NanoString developed scripts pass an internal verification process to ensure that the script complies with the documented requirements and correctly performs the functions for which it is intended.

Every script is verified using GeoMx DSP data and analyzed via the Custom Script tool inside the DSP Data Analysis Suite. Each script is required to have an intended use document that describes how to use it, and every script is beta tested with customers to gather usability feedback.

If you have questions about a script, please contact geomxsupport@nanostring.com.

Script vs Package

Script: A script provides a simple and singular functionality using data from DSP Data Analysis Suite. Simple and singular indicates functionality such as a graph or plot, a table, or a single statistical test without the aid of another script. These scripts may use packages that are already installed on the DSP Data Analysis Suite system but cannot use packages that are not already installed.

Package: A package is a collection of R functions, data sets, and help files within a single distributable object. These functions are meant to work together to produce an analysis workflow or pipeline or possibly support the creation of workflows or pipelines in other packages. These functions are designed to work within R in general and are not guaranteed to work with the output from DSP Data Analysis Suite without some data manipulation at the beginning of the pipeline.

Bioconductor

Bioconductor is an open-source platform that provides tools developed in the R statistical programming language for the analysis of high-throughput genomic data. With the January 2021 release of Bioconductor, NanoString-developed R packages will be available. These R packages can be downloaded from Bioconductor and incorporated into your own pipelines and development environment for the analysis of data from the GeoMx DSP.

Visit bioconductor.org

Available on GeoScript Hub

Evaluate Normalization for Protein

This script was designed for data from the GeoMx® protein assay. It will also work with data from the GeoMx® RNA assay. However, in this application, some plots will not be relevant as they are uniquely applicable to protein data. If the nCounter readout is used to count probes, then only ERCC-normalized data should be run through this script.

Import & Run

Import

  • Sign into GeoMx® DSP Interface
    Pro Tip: The following steps are performed in the GeoMx® DSP Analysis Suite and can be performed remotely if networked.
  • First, ensure that the correct dataset is selected; a custom script is run on the dataset, slides, and targets that are selected.
  • Click the Custom scripts tab
  • Click Manage
  • Under Manage, click Add “ + ”
  • Load the evaluate_normalization__options.R script
  • Click Save (in blue)
  • Add a name
  • Click Add

NOTE: Once a script is loaded, it is available for every data analysis study. To run an existing script, simply follow the Run steps below.

Run

  • Click the Run tab under Custom scripts
  • Select the script so it is highlighted in green
  • Click Run; a dialog box will appear stating “Script executed successfully!”
  • Click the Dataset Summary tab
  • Select Attachments
  • Click Save

Visualize

  • First, we will look at the QC plots for housekeepers and negative control IgGs.
  • Our motivating theory is simple: if several probes all accurately measure signal strength, they should be highly correlated with each other. More precisely, the log-ratios between them should have low SDs (this latter criterion is similar in spirit to the geNorm algorithm).
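
The criterion above can be illustrated with a minimal R sketch. The probe names and counts below are made up for illustration, and `sd_log2_ratio` is a hypothetical helper, not a function from the script itself. It computes the SD of the log2-ratio for every pair of probes; probes that track the same signal should give small values.

```r
# Toy counts: rows are segments, columns are probes (hypothetical values).
counts <- cbind(
  probe_a = c(10, 20, 40, 80),
  probe_b = c(11, 19, 42, 78),
  probe_c = c(30, 12, 25, 200)
)

# SD of the log2-ratio for every pair of probes.
sd_log2_ratio <- function(mat) {
  log_mat <- log2(mat)
  probes <- colnames(mat)
  out <- matrix(0, length(probes), length(probes),
                dimnames = list(probes, probes))
  for (i in probes) {
    for (j in probes) {
      out[i, j] <- sd(log_mat[, i] - log_mat[, j])
    }
  }
  out
}

sd_mat <- sd_log2_ratio(counts)
print(round(sd_mat, 2))
```

In this toy data, probe_a and probe_b move together across segments, so their log2-ratio SD is much smaller than that of either probe against probe_c.
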

Modify

Modify by Setting User Parameters

Below is a setting that can be easily adjusted by the user in the script:

  • plot_factor – this allows the user to color by their annotation factor of interest for the QC plots.

Intended Use

About the GeoMx Protein nCounter Normalization Script

The Evaluate-Normalization Options Script is designed for data from the GeoMx nCounter protein assay. It is also compatible with data from the GeoMx nCounter RNA assay. Only ERCC-normalized data should be run through this script.

This script does the following:

  • Automatically identifies relevant variables from your segment annotations for plotting
  • Arbitrarily assigns colors to the identified segment annotations
  • Computes multiple normalization factors (negative control IgGs, housekeepers, area, nuclei) for comparison
  • Produces a plot used to QC the negative control IgGs
  • Produces a plot used to QC the housekeeping proteins
  • Produces a plot used to QC all of the normalization factors computed above
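
As a rough sketch of how such normalization factors might be computed and compared, the R snippet below takes the geometric mean of the negative control IgGs and of the housekeepers in each segment, then measures how well two factors agree using the SD of their log2-ratio. The probe names, counts, areas, and the helpers `geomean` and `factor_sd` are all assumptions made for illustration; the script's own implementation may differ.

```r
geomean <- function(x) exp(mean(log(x)))

# Toy data: rows are segments (hypothetical values and probe names).
neg_igg <- cbind(IgG1 = c(8, 16, 32), IgG2 = c(10, 18, 30))
hk      <- cbind(HK1 = c(100, 210, 390), HK2 = c(50, 95, 205))
area    <- c(1000, 2500, 3000)

# One normalization factor per segment for each approach.
factors <- cbind(
  neg_geomean = apply(neg_igg, 1, geomean),
  hk_geomean  = apply(hk, 1, geomean),
  area        = area
)

# Agreement between two factors: SD of their log2-ratio across segments.
factor_sd <- function(a, b) sd(log2(factors[, a] / factors[, b]))

print(factor_sd("neg_geomean", "hk_geomean"))
print(factor_sd("neg_geomean", "area"))
```

A small SD means the two factors would rescale the segments in nearly the same way; a large SD means that normalizing with one leaves the other unaccounted for.
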

Plot Examples

Plot QC-ing the IgGs

Here, we see good concordance amongst the IgGs, confirming they all can be used. Numbers in the lower-left panels show the SD of the log2-ratios between IgGs. Importantly, we do not see a tendency for one IgG to be offset, suggesting there’s no between-slide bias in calculation of background.

Housekeepers

Now let us look at the same plot drawn for the housekeepers, shown below:

Above, we see a tendency for the blue-colored slide to over-express S6. Housekeeper normalization might be better without this protein, though the offset of the red points in the middle-right cluster casts some doubt on GAPDH as well.

Housekeepers cont.

Finally, let us look below at the overall agreement of the housekeeper factors:

Observations and conclusions we can make:

  • The IgGs and the housekeepers agree nicely, suggesting that if we normalize using one of them, the other will leave little artifactual signal in the data. If these factors diverged strongly, we would know that normalization with one of them would fail to account for the other, leaving an artifact in the data that must be accounted for in downstream analysis.
  • Area and nuclei are highly consistent with each other (SD log2 ratio of just 0.31).
  • Area and nuclei diverge somewhat from the probe-based normalization factors Neg geomean and HK geomean. This suggests that signal strength is not purely a result of area/cell count, or alternatively, that the neg and HK geomeans are noisy metrics.
  • The concordance of Negs/HKs suggests their performance is adequate, leading to the conclusion that area/nuclei are noisy measurements of signal strength in this data.

QC plot

This package/script also produces a QC plot for protein expression:

The above plot helps us identify proteins with no useful signal. For example:

  • LAG3 hovers around background in all segments and should probably be excluded from analysis.
  • PD-L1 is mostly near-background, but it has meaningfully high signal in a handful of segments.
  • CD40L seems to have lower background than the negative controls. But its long range, and especially the existence of points well above background, suggests this protein has interpretable data.

RNA Negative Normalization

This script was designed for data from the GeoMx DSP and is intended to support NGS RNA analyses including those consisting of multiple panels and custom kits.

This script normalizes the observed RNA data using information from the negative probes in the assay.

Import & Run

Import

  • Sign into GeoMx® DSP Interface
    Pro Tip: The following steps are performed in the GeoMx® DSP Analysis Suite and can be performed remotely if networked.
  • First, ensure that the correct dataset is selected; a custom script is run on the dataset, slides, and targets that are selected.
  • Click the Custom scripts tab
  • Click Manage
  • Under Manage, click Add “ + ”
  • Load the RNANegativeNormalization.R script
  • Click Save (in blue)
  • Add a name
  • Click Add

NOTE: Once a script is loaded, it is available for every data analysis study. To run an existing script, simply follow the Run steps below.

Run

  • Click the Run tab under Custom scripts
  • Select the script so it is highlighted in green
  • Click Run; a dialog box will appear stating “Script executed successfully!”
  • Click the Dataset Summary tab
  • Select Attachments
  • Click Save

Visualize

Normalized results will show up as a new dataset in the DSP-DA and can be downloaded as Excel exports.

Modify

Optional: Some Custom scripts can be modified for further functionality.

  • Click Manage, select the script of interest
  • Modify the script as desired
  • Click Save
  • At this point, the script is modified and can be run as in the RUN tab.

NOTE: For additional information, please refer to the documentation specific to the script of interest; the additional options supported (size, color, font) differ for each script.

Intended Use

About RNA Negative Normalization Script

This script was designed for data from the GeoMx DSP and is intended to support NGS RNA analyses including those consisting of multiple panels and custom kits.

This script does the following:

  • Identifies negatives and associates them with the corresponding panel
  • Computes the negative normalization factor for each segment for each panel
    • Segment negative count divided by the geometric mean of all segment negative counts
  • Normalizes results for each panel by the corresponding negative normalization factor
    • Divide counts by the corresponding negative normalization factor
  • Returns the negative normalized results as the target count matrix
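
The steps above can be sketched in a few lines of R for a single panel. The target names, segment names, and counts are invented for illustration, and this is only a sketch of the arithmetic described, not the code of RNANegativeNormalization.R.

```r
geomean <- function(x) exp(mean(log(x)))

# Toy counts: rows are targets, columns are segments (hypothetical values).
counts <- rbind(
  GeneA    = c(100, 400, 50),
  GeneB    = c(20, 80, 10),
  NegProbe = c(4, 16, 2)
)
colnames(counts) <- c("seg1", "seg2", "seg3")

# Factor per segment: negative count / geometric mean of all negative counts.
neg <- counts["NegProbe", ]
norm_factor <- neg / geomean(neg)

# Divide each segment's counts by that segment's factor.
normalized <- sweep(counts, 2, norm_factor, "/")
print(round(normalized, 2))
```

After normalization, the negative probe has the same value in every segment (the geometric mean of the original negative counts). With several panels, the same steps would be repeated on each panel's targets using that panel's negatives.
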

SpatialDecon

This script (v1.1 updated April 2021) was designed for data from GeoMx high-plex RNA assays, such as the Cancer Transcriptome Assay and Whole Transcriptome Assay. It estimates the abundance of mixed cell types within each AOI in an experiment.

Import & Run

Import

  • Sign into GeoMx® DSP Interface
    Pro Tip: The following steps are performed in the GeoMx® DSP Analysis Suite and can be performed remotely if networked.
  • First, ensure that the correct dataset is selected; a custom script is run on the dataset, slides, and targets that are selected.
  • Click the Custom scripts tab
  • Click Manage
  • Under Manage, click Add “ + ”
  • Load the SpatialDecon_plugin.R script
  • Click Save (in blue)
  • Add a name
  • Click Add

NOTE: Once a script is loaded, it is available for every data analysis study. To run an existing script, simply follow the Run steps below.

Run

  • Click the Run tab under Custom scripts
  • Select the script so it is highlighted in green
  • Click Run; a dialog box will appear stating “Script executed successfully!”
  • Click the Dataset Summary tab
  • Select Attachments
  • Click Save

Second: Cell Profile Matrix

  • A “cell profile matrix.” This is a .csv file giving the expected expression profiles of each cell type in your dataset. Many such matrices can be found in the Cell Profile Library.
  • For tumor immune deconvolution, use the file “safeTME-for-tumor-immune.csv,” provided along with the script code. To use a custom matrix, make sure it matches the format of the matrices referenced above.

Visualize

The deconvolution algorithm outputs:

  • Data Tables (Excel)
  • Cell Abundance Scores Heatmap Plot
  • Cell Proportions Plot
  • Scaled Abundance Scores Heatmap Plot
  • Cell Abundance Scores Barplot
  • Cell Proportions Barplot

Modify

  • Click Manage, select the script of interest
  • Modify the script as instructed: The script accepts five “arguments” that you can set by modifying the top of the script’s code. Instructions for how to use these arguments are in-line in the script’s R code. The arguments are:
    • cell_profile_filename: the .csv file containing the cell profile matrix. This is the name of whatever cell profile matrix .csv file you’ve uploaded to the DSP DA.
    • pure_tumor_column_name: If you have tumor data with ROIs segmented into tumor and microenvironment, you can use this argument to specify which AOIs are almost pure tumor cells. The algorithm will use this information to fit a tumor cell profile and append it to the cell profile matrix. This optional step leads to slightly more accurate immune cell abundance estimates. To use this argument, enter the name of a column in the segmentAnnotations. The code will look for entries in that column that say “tumor.”
    • variables_to_plot: Enter column names of any segmentAnnotations variables you’d like to appear in plots. Use column names without special characters such as dashes or spaces, and begin column names with letters, not a number.
    • custom_annotation_coloring: Allows you to define custom coloring for the “variables_to_plot.” To enable, set this argument to TRUE, and modify the example syntax provided in the code. Any typos here will cause the script to error out.
    • hmcols: Specify a color gradient for heatmaps.
  • Click Save
  • At this point, the script is modified and can be run as in RUN.
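
Because typos in these arguments cause the script to error out, it can help to check the values before saving. The sketch below assumes the argument names listed above; the annotation column names are hypothetical, and `is_safe_column_name` is an illustrative helper that is not part of the script. Its pattern treats letters, digits, underscores, and periods as safe, which is an assumption about what counts as a special character.

```r
# Argument names come from the list above; the values are illustrative only.
cell_profile_filename      <- "safeTME-for-tumor-immune.csv"
pure_tumor_column_name     <- "SegmentType"   # hypothetical annotation column
variables_to_plot          <- c("SlideName", "Segment-Type", "2ndFactor")
custom_annotation_coloring <- FALSE

# Hypothetical helper: names must start with a letter and contain
# no dashes, spaces, or other special characters.
is_safe_column_name <- function(x) grepl("^[A-Za-z][A-Za-z0-9_.]*$", x)

ok <- is_safe_column_name(variables_to_plot)
print(variables_to_plot[!ok])  # names to rename before running the script
```

Here "Segment-Type" is flagged because of the dash and "2ndFactor" because it begins with a number.
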

NOTE: For additional information, please refer to the documentation specific to the desired script; the additional options supported (size, color, font) differ for each script.

Intended Use

About the SpatialDecon Script

The SpatialDecon script was designed for data from GeoMx high-plex RNA assays, such as the CTA and WTA. It estimates the abundance of mixed cell types within each AOI in an experiment.

This is a guide to running the SpatialDecon DSP DA script and interpreting the resulting plots.

  • A complete description of the SpatialDecon algorithm is at bioRxiv
  • Click the link for the SpatialDecon algorithm at NanoString’s GitHub page

IMPORTANT: Please use the cell profile matrix that represents the tissue type of interest, as this will affect the cell abundance/proportion scores. The default matrix below is for Solid Tumor GeoMx data.

Plot Examples

Tables

Data tables output

  • Abundance scores tab: Gives the estimated abundance of each cell type in each AOI. These cell abundance scores are interpreted on the same scale as the normalized data: they give abundance of each cell type scaled by whatever quantity you used to normalize the data, e.g. cell abundance per unit of area, or cell abundance per unit of total expression if Q3 normalization was used.
  • Proportions tab: Gives the proportion of each cell type in each AOI. Only “fitted” cells are included in this calculation. For example, if tumor cells are present in the sample but the cell profile matrix only includes immune cells, then proportions will ignore the presence of tumor cells.
  • Scaled abundance scores tab: The same data as the “abundance scores” tab, but scaled so that each cell type has a maximum value of 1.
  • Segment annotations tab: gives the segment annotation data. Rows in this tab are aligned to columns in the other tabs
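
To make the relationship between the tabs concrete, here is a small R sketch that derives proportions and scaled abundance scores from a toy abundance matrix. The cell types, AOI names, and values are invented, and the arithmetic is an assumption based on the descriptions above (proportions sum to 1 over the fitted cell types in each AOI; scaled scores have a maximum of 1 per cell type), not the script's actual code.

```r
# Toy abundance scores: rows are cell types, columns are AOIs (hypothetical).
abundance <- rbind(
  T.cells     = c(2, 6, 0),
  macrophages = c(6, 3, 4),
  fibroblasts = c(2, 1, 4)
)
colnames(abundance) <- c("AOI1", "AOI2", "AOI3")

# Proportions: each AOI's scores divided by that AOI's total over fitted cells.
proportions <- sweep(abundance, 2, colSums(abundance), "/")

# Scaled abundance: each cell type divided by its maximum across AOIs.
scaled <- sweep(abundance, 1, apply(abundance, 1, max), "/")

print(round(proportions, 2))
print(round(scaled, 2))
```

Because only the cell types in the matrix enter the column totals, any cell type missing from the cell profile matrix is ignored by the proportions, as noted above.
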

Plot Cell Abundance Scores Heatmap

Cell abundance scores heatmap: Shows the results recorded in the “abundance scores” tab of the data tables output. Example below.

Cell Proportions

Cell proportions: Shows the results recorded in the “proportions” tab of the data tables output. Example below. This color scheme is activated by using the “viridis option B” option in the hmcols argument.

Scaled Abundance Scores Heatmap

Scaled abundance scores heatmap: Shows the results recorded in the “scaled abundance scores” tab of the data tables output. Example below.

Cell Abundance Scores Barplot

Cell abundance scores barplot: See below for an example. Each column shows the cell abundances from a single AOI, with color denoting cell type.

Cell Proportions Barplot

Cell proportions barplot: See below for an example. Each column shows the cell proportions from a single AOI, with color denoting cell type.

Dimension Reduction

The Dimension Reduction (v1.1 updated March 2021) DSP DA script was designed for data from the GeoMx DSP high-plex RNA assays, such as the CTA NGS readout application, but can be used with other assays.

This script does the following:

  1. Performs dimension reduction analysis as specified by the user on the segments within a study. Dimension reduction analysis options include:
    • PCA
    • tSNE
    • UMAP
  2. Generates scatter plots of the resulting reduced dimensions. Users can control the color, shape, point size, font, font size, and file type. Plots are produced 6” tall by 8” wide, at 300 DPI (if applicable).
  3. If a PCA is being shown, it will also graph the cumulative proportion of variance explained by each principal component, up to the first 15 components.
  4. Saves an updated annotation sheet as an Excel spreadsheet to allow for re-graphing with external software. If a PCA is graphed, all principal components, loadings, and variance estimates are captured within the spreadsheet as well.

Import & Run

Import

  • Sign into GeoMx® DSP Interface
    Pro Tip: The following steps are performed in the GeoMx® DSP Analysis Suite and can be performed remotely if networked.
  • First, ensure that the correct dataset is selected; a custom script is run on the dataset, slides, and targets that are selected.
  • Click the Custom scripts tab
  • Click Manage
  • Under Manage, click Add “ + ”
  • Load the DimReduction.R script
  • Click Save (in blue)
  • Add a name
  • Click Add

NOTE: Once a script is loaded, it is available for every data analysis study. To run an existing script, simply follow the Run steps below.

Run

  • Click the Run tab under Custom scripts
  • Select the script so it is highlighted in green
  • Click Run; a dialog box will appear stating “Script executed successfully!”
  • Click the Dataset Summary tab
  • Select Attachments
  • Click Save

Visualize

Example 1: a UMAP with color based on PanCK segmentation, with PanCK+ in green and PanCK- in cyan. Shape should be based on the slide name.

  • plot_type = "UMAP"
  • color_by = "SegmentName" # tag, factor, or target
  • shape_by = "SegmentName" # tag, or factor
  • plot_colors = list("green3", "cyan2")
  • color_levels = c("PanCK-pos", "PanCK-neg")

Resulting graph:

Modify

Setting User Parameters:
There are a few settings that can be adjusted easily by the user at the top of the script. These include:

  • plot_type – set this to either “PCA,” “UMAP,” or “tSNE” based on user preference. No other values or methods are currently supported

Plotting parameters:

  • color_by – set this to the name (column name) of an annotation tag, factor within the segment annotations, or the target’s display name. For example, you may have a factor named “Location” that you want to visualize. This may be used with the color parameter. Alternatively, “VEGFA” can be used to color points by the continuous expression of VEGFA.
  • shape_by – set this to the name of an annotation tag or factor. Target names may not be used for shape.
  • plot_font – is a list, which includes family and size. Family can be set to ‘sans,’ ‘serif,’ or ‘mono’ to use Helvetica, Times New Roman, or Courier New fonts. Additional fonts may be supported as well, but not all fonts are available. Size is relative; increasing the number (default = 15) shall increase font size relative to the plot size for all labels on the plots.
  • save_as – is a string that defines the file format in which the graphs are saved. If PDF is selected (default), graphs are saved as a multipage PDF, which is especially useful for PCA analysis, which outputs 4 graphs. PNG or SVG formats are most appropriate for inclusion in print documents and use a resolution of 300 dpi. This may be edited further down in the script if a higher resolution is needed: search for “dpi” to change that variable. Similarly, if a different size is needed, search for “width” or “height” within the script.

Controls for colors: [See below for examples]

  • plot_colors – is a list which should contain either:
    • colors that can be recognized by R. These should be either named colors (e.g. “orange2”) or hexadecimal colors (“#ABABAB”). At least 1 color is needed for each unique entry in the color_by column. See below for a cheat sheet of all named R colors.
    • Alternatively, if you do not want to specify all the colors, you may set the first color to the name of a color palette from the palettes listed below. For example, “Dark2” or “Set3”.
  • color_levels – is a list of the annotation tag/factor levels that should be matched to the colors defined in plot_colors. They will be used in the same order as the levels defined in plot_colors. Additional notes:
    • If coloring by a target, set this to “High,” “Mid,” and “Low.” “Mid” is optional if you only want to specify the colors to be used with the minimum and the maximum values. “Mid” defines the color for the median value of a target.
    • If you have more values in your annotation tag than defined, the script will use “Set1” below to add new colors to the palette to represent the values. You do not have to define all of the levels within the tag or factor, but undefined levels will be added in alphanumeric order
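
As an illustration of how plot_colors and color_levels pair up when coloring by a target, the base R sketch below matches each level to its color by position and builds a Low to Mid to High gradient. This is only a sketch of the idea under the parameter values shown; it is not how DimReduction.R is necessarily implemented, and it spaces the colors evenly rather than anchoring "Mid" at the median expression value.

```r
plot_colors  <- list("yellow2", "black", "magenta2")
color_levels <- c("High", "Mid", "Low")

# Pair each level with its color by position.
level_colors <- setNames(unlist(plot_colors), color_levels)

# Build a Low -> Mid -> High gradient and sample 5 colors from it.
ramp <- colorRampPalette(unname(level_colors[c("Low", "Mid", "High")]))
gradient <- ramp(5)
print(gradient)
```

The first color in the gradient corresponds to "Low" (magenta2), the middle one to "Mid" (black), and the last to "High" (yellow2).
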

Plot Examples

Colors and Palette Options

Palette Options: Names to the left of each palette represent the text that can be used in the first entry in plot_colors. Any of the values shown here may be used. Note that palettes with light colors may be harder to distinguish on the graph.

Example PCA

  • plot_type = "PCA"
  • color_by = "CD68" # tag, factor, or target
  • shape_by = "SegmentName" # tag, or factor
  • plot_colors = list("yellow2", "black", "magenta2")
  • color_levels = c("High", "Mid", "Low")

Resulting graph:

Example 2

  • plot_type = "PCA"
  • color_by = "CD68"
  • size_by = "PTPRC"
  • shape_by = "SegmentName"
  • plot_colors = list("cyan3", "black", "green3")
  • color_levels = c("High", "Mid", "Low")

Resulting graph:

PCA

Principal component analysis (PCA) is a method for reducing high-dimensional data to a lower-dimensional space. It iteratively identifies the linear component that explains the most variation in a dataset, through a process called singular value decomposition. The process is performed on scaled, log2-transformed expression data: after capturing each dimension, it looks for the orthogonal axis of variation that explains the next largest amount of variation within the dataset. These principal components (PCs) can then be used to visualize samples in a much smaller sample space, as well as to understand the amount of variation that the analysis has captured for any given number of components. Samples that appear closer on a given principal component have similar aggregate expression of the genes that comprise that component. An example is shown below of the two graphs output by this method:


In this example, we observe 2 strong clusters that separate on both PC1 and PC2 according to the segmentation strategy used, where PanCK- regions (stroma) are on the left of the graph, and PanCK+ regions are on the top right. Color denotes a regional factor used to categorize ROIs as they were selected, so we can further explore within-cluster distributions, such as the fact that immune-high tumor PanCK+ ROIs separate from normal colon PanCK+ ROIs. The graph on the right is also output, and it shows the cumulative variance explained by each of the components measured. While we only output the first 3 PCs in the annotation data, you can see from this graph that additional PCs explain smaller and smaller variances in expression. If your dataset is particularly similar or diverse, more or less of the variance will be explained by the first PC. The variance explained is also shown as a percentage on the axes of the graphs.
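
For readers who want to reproduce this kind of analysis outside the DSP Data Analysis Suite, a minimal base R sketch is given below. It runs `prcomp` (which uses singular value decomposition) on scaled, log2-transformed toy data and computes the cumulative proportion of variance explained. The segment names, gene names, and counts are invented, and the script's own code may differ in its details.

```r
# Toy expression matrix: rows are segments, columns are targets (hypothetical).
expr <- rbind(
  seg1 = c(120, 15, 300, 40, 8),
  seg2 = c(100, 18, 280, 55, 10),
  seg3 = c(30, 90, 310, 35, 40),
  seg4 = c(25, 110, 290, 60, 45),
  seg5 = c(60, 50, 150, 20, 90),
  seg6 = c(70, 45, 160, 25, 70)
)
colnames(expr) <- paste0("gene", 1:5)

# PCA on centered, scaled, log2-transformed data.
pca <- prcomp(log2(expr), center = TRUE, scale. = TRUE)

# Proportion of variance explained by each PC, and the running total.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Scores for the first three PCs, one row per segment.
scores <- pca$x[, 1:3]
print(round(cum_var, 3))
```

Plotting `scores[, 1]` against `scores[, 2]` gives the kind of scatter plot described above, and `cum_var` gives the cumulative variance curve.
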

tSNE

tSNE (t-Distributed Stochastic Neighbor Embedding) is a method to cluster samples based on expression that is not linearly or orthogonally constrained like the PCA plot. However, it is a stochastic method, and cannot be used to estimate where a new sample would fit within the defined clusters. As such, it is useful for data exploration, but less so for defining characteristics that may be shared in a new dataset.

Reference: L.J.P. van der Maaten and G.E. Hinton. Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008.

An example graph is shown below:

In this particular example, tSNE identifies 3 clusters of samples: 1 that is based on PanCK- segments, and two separate clusters of PanCK+ ROIs. While not visualized here, these clusters may be patient driven, as diseased or cancerous tissue samples tend to be less closely related than adjacent normal tissues from the same tissue. To visualize this, you can set the color or shape to the scan name or patient ID.

UMAP

UMAP (Uniform Manifold Approximation and Projection) is a dimension reduction method developed to provide a reproducible way of graphing samples in a non-linearly constrained fashion. It has been heavily used by the single- For more information about this method see the reference below:

Reference: McInnes, L, Healy, J, UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction, ArXiv e-prints 1802.03426, 2018

An example UMAP produced for a set of samples is shown below:

In this particular example, color is being defined by the target “VEGFA”, shape is being set by the factor “SegmentName”, and the color palette used is “Set1”. Here we observe 3 clusters. As with tSNE, two clusters are PanCK-positive, which show higher expression of VEGFA than the PanCK-negative segments, which would be expected in the colorectal cancer setting.

Excel File

In addition to graphs, the script will also output an Excel spreadsheet file with new data columns depending on what variables were used. If a tSNE or UMAP was graphed, the spreadsheet will only have one tab (Segment Annotations).

Within segment annotations at the end of the table you will find the following new data columns:

  • If color is set to a target name it shall be included in the table, with the log2 count values shown for the target based on the active data frame selected
  • UMAP or tSNE is selected 2 new columns (Dim1 and DimPAT2) will be added. If PCA is selected Dim3 will also be added. Dim1&2 represent the graphed values, Dim3 is added in case users are interested in graphing additional PCs.

If a PCA is used, the Excel file will contain 3 additional tabs:

  • Principal Components (All) – a table with the principal component scores for all segments for all calculated components during the analysis
  • PC Loadings – These show the loading weights for each principal component. The loadings are sorted based on the first principal component, with the highest absolute value loadings shown at the top
  • Variance Estimates – This shows the standard deviation & proportion of variance explained for each PC as well as the cumulative proportion of variance explained
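
As a rough illustration of how the three Variance Estimates columns relate to one another, the sketch below derives the proportion and cumulative proportion of variance from per-component standard deviations. It is written in Python purely for illustration and is not part of the R script; the function name and the input values are hypothetical, and it assumes the standard deviations of all calculated components are supplied.

```python
from itertools import accumulate


def variance_estimates(std_devs):
    """Return (proportions, cumulative proportions) of variance per PC."""
    # The variance of each component is its standard deviation squared.
    variances = [sd ** 2 for sd in std_devs]
    total = sum(variances)
    proportions = [v / total for v in variances]
    # Running sum of the proportions, ending at 1 for the last component.
    cumulative = list(accumulate(proportions))
    return proportions, cumulative


# Hypothetical standard deviations for three principal components.
props, cum = variance_estimates([2.0, 1.0, 1.0])
```

With these made-up inputs, the first component accounts for 4 of the 6 units of total variance, so its proportion is about 0.67.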

Volcano Plot

(tutorial video coming soon)

The Labeled Volcano Plot DSP DA script creates publication-ready labeled volcano plots based on the researchers’ input and statistical study results. The script also creates a table of tagged genes.

More graph examples are located in the EXAMPLES tab

Import & Run

Import

  • Sign in to the GeoMx® DSP interface
    Pro Tip: The following steps are performed in the GeoMx® DSP Analysis Suite and can be performed remotely if networked.
  • First, ensure that the correct dataset is selected; a custom script runs on the dataset, slides, and selected targets.
  • Then click the Custom scripts tab
  • Run a volcano plot statistical test
  • Export the results to a .xlsx file
    Next, load the script and the extra input file exported from the DSP DA. Convert the .xlsx file to a tab-delimited .txt file before running the script.
  • Click Manage
  • Under Manage, click Add “ + ”
  • Load the LabeledVolcanoPlot.R script
  • Click Add “ + ”
  • Select VOLCANO PLOT.txt
  • Add the desired parameters of the script.
  • Click Save

Run

  • Click the Run tab under Custom scripts
  • Select the script so it is highlighted in green
  • Click Run; a dialog box will appear stating “Script executed successfully!”
  • Click the Dataset Summary tab
  • Select Attachments
  • Click Save

Visualize

  • After running a statistical test and creating a DSP DA plot, export the results to a .xlsx file.
  • The statistical test results are listed under the most recent dataset test run.

Modify

  • Click Manage, select the script of interest and adjust the parameters.
  • Select the “+” button to add the LabeledVolcanoPlot.R file to the script.
  • Select the “+” button to add the VOLCANO PLOT.txt
  • Modify the script by editing the top lines
  • Select Save

Intended Use

About the Labeled Volcano Plot DSP DA Script

The Labeled Volcano Plot DSP DA script supports the GeoMx nCounter (protein or RNA) or GeoMx NGS (CTA) readout applications. The script creates publication-ready labeled volcano plots based on user input and statistical test results. The script also generates a table with the tagged genes.

Plot Examples


User Parameters Settings

Modifying the User Parameters Settings:

The twenty-one settings are adjustable by the user at the top of the plug-in script:

Files

  1. de_results_filename: (String) Name of the tab-delimited file you have uploaded to the DSP DA.
  2. output_format: (String) Desired output format for the volcano plot figure. Output options: PNG, JPG, TIFF, SVG, PDF, and BMP

Labeling
Be sure to add the labels to the results file from the DSP DA volcano plot run.

  1. plot_title: (String) Title for figure
  2. negative_label: (String) Matching negative (left) x-axis label to the volcano plot in the DSP DA
  3. positive_label: (String) Matching positive (right) x-axis label to the volcano plot in the DSP DA
  4. show_legend: (Boolean) A color legend appears
  5. n_genes: (Numeric) Number of top genes by pvalue/fdr to label on the figure. Ignored if gene_list is set.
  6. gene_list: (String) List of specific genes to label on the figure regardless of their position. When set, it takes precedence over n_genes.
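
The precedence between n_genes and gene_list can be sketched as follows. This is a minimal Python illustration of one plausible reading of the two settings, not the R script's actual logic; the function name, the (gene, p-value) input layout, and the decision to skip listed genes that are absent from the results are all assumptions.

```python
def genes_to_label(results, n_genes=None, gene_list=None):
    """Choose which genes to label.

    `results` is a list of (gene, pvalue) pairs (hypothetical layout).
    """
    if gene_list:
        # gene_list takes precedence: label these genes wherever they fall,
        # skipping any name that is not present in the results.
        present = {gene for gene, _ in results}
        return [gene for gene in gene_list if gene in present]
    # Otherwise label the n_genes targets with the smallest p-values.
    ranked = sorted(results, key=lambda row: row[1])
    return [gene for gene, _ in ranked[: n_genes or 0]]


# Made-up results for three targets.
demo = [("A2M", 0.5), ("GLUL", 0.01), ("SPIB", 0.2)]
top_two = genes_to_label(demo, n_genes=2)
forced = genes_to_label(demo, n_genes=2, gene_list=["A2M", "TLX1"])
```

Here `top_two` holds the two targets with the smallest p-values, while `forced` ignores n_genes entirely and keeps only the listed gene that exists in the results.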

User Parameters Settings Continued

Thresholds
A line on the plot marks each threshold that is set. If no line is desired, set the corresponding threshold to NULL.
  1. pval_thresh: (Numeric) p-value threshold on the y-axis
  2. fdr_thresh: (Numeric) false discovery rate threshold on the y-axis
  3. fc_thresh: (Numeric) log2 fold change cutoff on x-axis.
  4. label_fc: (Boolean) Whether genes below the FC threshold should be labeled if they are also above the significance threshold

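
To make the interaction between the significance and fold-change thresholds concrete, here is a minimal Python sketch that sorts a single target into one of three groups. It is not taken from the R script: the function names, the group names, the strict/non-strict comparisons, and the use of -log10(p) for the y-axis position (the usual volcano plot convention) are assumptions. An FDR threshold would be handled analogously.

```python
import math


def classify(log2_fc, pvalue, pval_thresh, fc_thresh):
    """Return 'significant', 'below_fc', or 'default' for one target."""
    passes_p = pvalue < pval_thresh          # assumed strict comparison
    passes_fc = abs(log2_fc) >= fc_thresh    # cutoff applies in both directions
    if passes_p and passes_fc:
        return "significant"
    if passes_p:
        # Significant, but the fold change is under fc_thresh.
        return "below_fc"
    return "default"


def y_position(pvalue):
    """Assumed y-axis transform: smaller p-values plot higher."""
    return -math.log10(pvalue)


# Thresholds match Example 1 on this page (pval_thresh = 0.05, fc_thresh = 0.5).
strong = classify(1.2, 0.001, pval_thresh=0.05, fc_thresh=0.5)
weak_fc = classify(0.2, 0.001, pval_thresh=0.05, fc_thresh=0.5)
not_sig = classify(1.2, 0.5, pval_thresh=0.05, fc_thresh=0.5)
```

In this reading, the 'below_fc' group corresponds to the points that fc_color and label_fc refer to.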
Fonts

  1. font_size: (Numeric) Font size on the figure
  2. label_size: (Numeric) Size of the font for the gene labels
  3. font_family: (String) Font family for all text on the figure
    • Options: serif, sans, mono

Plot Size

  1. plot_width: (Numeric) Width of the saved figure in inches
  2. plot_height: (Numeric) Height of saved figure in inches
Coloring
Colors that R can recognize should be either named colors (e.g., “orange2”) or hexadecimal colors (“#ABABAB”). See below for a cheat sheet of all named R colors.
  1. default_color: (String) Color of points that are neither in a target group nor above the significance threshold
  2. fc_color: (String) Color of points below fc_thresh but above significance threshold(s); change to same as default to not call out these targets
  3. target_groups: (String) Gene target groups to color in the plot. Label target groups in the VOLCANO PLOT.xlsx file. All genes in a given target group are colored no matter where they fall in the figure. If no groups are assigned (NULL), the targets above the pval/fdr threshold are colored.
  4. color_options: (String) List of colors to use in the figure. Must have at least the number of target_groups.
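
Before running the script, the color settings can be sanity-checked along the lines below. This Python sketch is illustrative only: the helper names are hypothetical, the set of named colors is a tiny stand-in for R's full list, and the pattern assumes six- or eight-digit hexadecimal colors (#RRGGBB or #RRGGBBAA).

```python
import re

# Six hex digits, optionally followed by two more for transparency.
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}([0-9A-Fa-f]{2})?$")


def invalid_colors(color_options, named_colors):
    """Return the entries that are neither a known name nor a hex color."""
    return [
        color
        for color in color_options
        if color not in named_colors and not HEX_COLOR.match(color)
    ]


def enough_colors(color_options, target_groups):
    """color_options must supply at least one color per target group."""
    return len(color_options) >= len(target_groups or [])


# Stand-in subset of named R colors, for illustration only.
known = {"orange2", "grey65", "steelblue"}
bad = invalid_colors(["orange2", "#ABABAB", "#12", "notacolor"], known)
ok = enough_colors(["orange2", "#ABABAB"], ["Hemostasis", "DNA Repair"])
```

Passing `None` for target_groups mirrors the NULL setting, in which case any number of colors is sufficient.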

Named R Colors Chart

Example Parameter Set-up

The LabeledVolcanoPlot script outputs a typical volcano plot figure with log2 fold change on the x-axis and pvalue or FDR on the y-axis for each target. A table of labeled genes in the figure is also output.

Example figures with different input arguments.

Example 1:
  • n_genes = 25
  • fdr_thresh = 0.01
  • pval_thresh = 0.05
  • fc_thresh = 0.5
  • label_fc = FALSE
  • target_groups = NULL

Visualize and Interpret Examples cont.

Example 2:
  • gene_list = c("IL2RG", "GLUL", "SPIB", "C2", "A2M", "MLNR", "TLX1", "FAM180B")
  • target_groups = c("Hemostasis", "DNA Repair")
  • pval_thresh = NULL
  • fc_thresh = 0.5
  • fdr_thresh = 0.01

Cell-Type Contouring

(tutorial video coming soon)

The Cell-Type Contouring ImageJ script enables contouring around a morphology marker-based segment mask.

The script creates three segment masks to upload into the GeoMx Control Center: i) a cell-type-specific mask based on a fluorescent marker of interest, ii) a geometric ring contour mask around the cell-type mask, and iii) a residual mask covering the remaining area in the ROI. The script also creates one visual reference image that displays the masks together, overlaid on the ROI, so they can be checked.
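
The relationship between the three masks can be sketched on a tiny binary grid. This is a simplified Python illustration, not the ImageJ script's implementation: the function names are hypothetical, the ring width is measured in pixels rather than microns, a square neighbourhood is used for the dilation, the ROI is assumed to be the whole rectangular grid, and the erosion step is omitted.

```python
def dilate(mask, radius):
    """Grow a binary mask by `radius` pixels using a square neighbourhood."""
    rows, cols = len(mask), len(mask[0])
    grown = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                # Switch on every pixel within `radius` of this mask pixel.
                for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                        grown[rr][cc] = 1
    return grown


def contour_masks(cell_mask, ring_width):
    """Return (cell, ring, residual) masks that partition the grid."""
    grown = dilate(cell_mask, ring_width)
    # Ring: pixels added by the dilation that are not in the cell mask.
    ring = [[int(g and not m) for g, m in zip(g_row, m_row)]
            for g_row, m_row in zip(grown, cell_mask)]
    # Residual: everything outside the grown mask.
    residual = [[1 - g for g in g_row] for g_row in grown]
    return cell_mask, ring, residual


# A 5 x 5 ROI with a single "cell" pixel in the centre.
cell = [[0] * 5 for _ in range(5)]
cell[2][2] = 1
cell_m, ring_m, residual_m = contour_masks(cell, ring_width=1)
```

Every pixel ends up in exactly one of the three masks, which is the property the reference image lets you check visually.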

Masked ROI

Export

  1. Be sure ImageJ (downloaded from the NIH website) is installed on a computer that is networked to the GeoMx® DSP. If the computer is not networked, you can instead move the necessary files (e.g., masks) via USB and upload them directly on the GeoMx® DSP.
  2. From the external computer, open ImageJ. Then, open Chrome and access your GeoMx® workspace remotely:
    1. Browse to https://geomx-#### where #### is the GeoMx® instrument number; find this under Administration tab >> Network >> Machine Name.
  3. In the GeoMx® Scan Workspace, create or select a geometric ROI. The segmentation masks you create in subsequent steps will be confined within this ROI.
  4. The ROIs are listed in the left panel of the workspace. Click the “Export image” button for the ROI. The ROI will be saved as a stacked TIFF image.

Run

  1. In ImageJ, please drag & drop the ROI image file into the ImageJ workspace by directly placing it on the ImageJ toolbar.
  2. Next, similarly, drag & drop the first ImageJ script into the ImageJ workspace.
  3. Enter values for the three user inputs:
    1. erode_margin = ____; determines how close adjacent cell-type masks can be (e.g., a value of 2 corresponds to 2 microns of erosion)
    2. channel_A = ____; determines the fluorescent channel to be used (1: FITC channel; 2: Cy3 channel; 3: Texas Red channel; 4: Cy5 channel)
    3. ringWidth = ____; determines the ring dimensions (e.g., a value of 10 corresponds to a 10-micron ring width)
  4. In the script window select Macros >> Run Macro.
  5. Three pop-ups will appear: i) your tunable image mask, ii) your thresholding slider window, and iii) the script instructions:
    1. Use the threshold slider or the leftmost dropdown menu to select a thresholding preference; an algorithm selected from the dropdown ultimately selects a threshold value.
      • Commonly used are the Li or Default algorithms.
    2. After thresholding, click Apply
    3. Lastly, click OK

    Numerous windows will appear as the script runs; do not resize, change, or adjust any of them. Proceed directly to the first step under Import.


Import

  1. Masks are saved directly to the folder that contains the ROI image file.
  2. Return to the GeoMx® DSP interface. Click “Import” under the ROI.
  3. Import masks by navigating to the individual mask files, highlighting all of them, and clicking Open. The masks will appear on your scan with ROI details. Please ensure that the masks are named appropriately as per the GeoMx Control Center pop-up when selecting Import.
  4. (Optional) To change the collection order, drag individual masks up and down the “Segment List.”
  5. Proceed with other ROIs, then finalize ROIs and proceed to Collection.

Intended Use

The Cell-Type Contouring script is designed to support ROI selection in any GeoMx assay or experiment where contours on a cell-type-specific segment are required.

This ImageJ script does the following:

  1. Creates a cell-type-specific mask within a given ROI based on a fluorescent marker of interest
    1. the user can select the fluorescent channel and tune their mask thresholding
  2. Creates a geometric ring contour mask around the developed cell-type mask
    1. the user can define the ring distance
  3. Creates a third mask that contains the remaining area in the given ROI
  4. Outputs three masks to upload into the GeoMx Control Center and one visual reference image displaying the masks together, overlaid on the ROI.