Introduction

This notebook shows you how to plot all netDx predictor results in one function call. The output generated includes:

  • Summary predictor performance: Plot of average performance (AUROC/AUPR)
  • Detailed predictor performance: AUROC/AUPR curves for all data splits
  • Detailed statistics: Network scores and patient predictions in all splits
  • Themes of selected features: A network visualization of major themes in predictive variables (an EnrichmentMap)
  • Integrated patient similarity network: A network visualization of patient similarity using only feature-selected variables

Software Requirements

This example assumes that you have all the Cytoscape- and CyREST-related dependencies required to install netDx v1.0.

Note: If Cytoscape is not running, this example will not work!
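
Because every later step depends on a live Cytoscape session, it can help to test the connection before starting. The sketch below is a minimal check, assuming CyREST is listening at its default address, http://localhost:1234/v1 (the address that appears in the export URLs in the output further down). The helper name cytoscapeIsUp is hypothetical and not part of netDx; it uses only base R.

```r
# Hypothetical helper (not part of netDx): returns TRUE if a CyREST
# server answers at baseUrl, FALSE otherwise.
cytoscapeIsUp <- function(baseUrl = "http://localhost:1234/v1") {
  versionUrl <- paste0(baseUrl, "/version")
  tryCatch({
    # readLines() opens the URL, reads the reply, and closes the connection
    reply <- suppressWarnings(readLines(versionUrl, warn = FALSE))
    length(reply) > 0
  }, error = function(e) FALSE)
}

if (!cytoscapeIsUp()) {
  message("Cytoscape does not appear to be running; start it before continuing.")
}
```

If Cytoscape uses a different port on your machine, pass the corresponding base URL as the argument.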

The following code loads each required dependency, installing it first if it is missing:

#httr
tryCatch(expr = { library(httr)}, 
          error = function(e) { install.packages("httr")}, finally = library(httr))

#RJSONIO
tryCatch(expr = { library(RJSONIO)}, 
          error = function(e) { install.packages("RJSONIO")}, finally = library(RJSONIO))

#r2cytoscape
tryCatch(expr = { library(r2cytoscape)}, 
          error = function(e) { devtools::install_github('cytoscape/cytoscape-automation/for-scripters/R/r2cytoscape')}, finally = library(r2cytoscape))
## Loading required package: XML
## Warning: package 'XML' was built under R version 3.3.2
# EasycyRest
tryCatch(expr = { library(EasycyRest); detach(package:EasycyRest,unload=TRUE)}, 
          error = function(e) { devtools::install_github('BaderLab/Easycyrest/EasycyRest@0.1')}, finally = {})
## 
## Attaching package: 'EasycyRest'
## The following objects are masked from 'package:r2cytoscape':
## 
##     createNetwork, createStyle
## Skipping install of 'EasycyRest' from a github remote, the SHA1 (3f474b6e) has not changed since last install.
##   Use `force = TRUE` to force installation
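
The tryCatch pattern above attaches each package as a side effect of testing for it. As an alternative sketch, requireNamespace() can report which packages are missing without attaching anything. The helper name missingPackages is hypothetical; the package list is taken from the code above.

```r
# Hypothetical helper: names of the packages in pkgs that are not installed.
missingPackages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("httr", "RJSONIO", "r2cytoscape", "EasycyRest")
toInstall <- missingPackages(needed)
if (length(toInstall) > 0) {
  message("Not installed: ", paste(toInstall, collapse = ", "))
}
```

This only reports what is missing; installation can then be done deliberately, from CRAN or GitHub as appropriate for each package.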

Set up

suppressWarnings(suppressMessages(require(netDx)))
suppressWarnings(suppressMessages(require(netDx.examples)))

Load data for plotting

In this example, we use data from The Cancer Genome Atlas (http://cancergenome.nih.gov/), downloaded from the PanCancer Survival project (https://www.synapse.org/#!Synapse:syn1710282). Following Yuan et al. (2014) (Refs 1-2), we use gene expression profiles from renal clear cell carcinoma tumours to classify patients as having poor or good survival. The dataset consists of 150 tumours; here we work only with the gene expression profiles.

phenoFile <- sprintf("%s/extdata/KIRC_pheno.rda",path.package("netDx.examples"))
lnames <- load(phenoFile)
head(pheno)
##             ID age grade     stage STATUS_INT     STATUS
## 1 TCGA-AK-3428  62    G2 Stage III          1 SURVIVEYES
## 2 TCGA-AK-3434  72    G2   Stage I          1 SURVIVEYES
## 3 TCGA-B0-4688  46    G4  Stage IV          0  SURVIVENO
## 4 TCGA-B0-4690  65    G3  Stage IV          0  SURVIVENO
## 5 TCGA-B0-4691  55    G3  Stage IV          0  SURVIVENO
## 6 TCGA-B0-4693  72    G4 Stage III          0  SURVIVENO
pathFile <- sprintf("%s/extdata/Human_160124_AllPathways.gmt",
           path.package("netDx.examples"))
pathwayList <- readPathways(pathFile)
## ---------------------------------------
## File: Human_160124_AllPathways.gmt
## 
## Read 2760 pathways in total, internal list has 2712 entries
##  FILTER: sets with num genes in [10, 500]
##    => 911 pathways excluded
##    => 1801 left
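
The pathway file is in GMT format, in which each line holds a gene-set name, a description, and then the member genes, all separated by tabs. The base-R sketch below illustrates that layout together with a size filter like the [10, 500] range reported above, using a smaller cutoff so the toy example stays short. It illustrates the file format only and is not the netDx implementation; the function name parseGmtLines and the toy gene sets are made up.

```r
# Hypothetical parser for GMT-formatted lines (not the netDx code).
parseGmtLines <- function(gmtLines, minSize = 10, maxSize = 500) {
  fields <- strsplit(gmtLines, "\t", fixed = TRUE)
  sets <- lapply(fields, function(f) f[-(1:2)])   # drop name and description
  names(sets) <- vapply(fields, function(f) f[1], character(1))
  setSize <- lengths(sets)
  sets[setSize >= minSize & setSize <= maxSize]   # keep sets within the size range
}

toyGmt <- c("PATHWAY_A\tdescription A\tHMOX1\tSYT4\tAKAP12",
            "PATHWAY_B\tdescription B\tZNF121")
toySets <- parseGmtLines(toyGmt, minSize = 2, maxSize = 500)
names(toySets)
```

With minSize = 2, the single-gene PATHWAY_B is excluded and only PATHWAY_A remains.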

Next, restrict each pathway to the genes measured in this dataset. For this example, the names of the measured genes have been stored in a file; in practice you would take them from the corresponding input table.

xpr_genes <- sprintf("%s/extdata/EMap_input/genenames.txt",
      path.package("netDx.examples"))
xpr_genes <- read.delim(xpr_genes,header=FALSE,as.is=TRUE)[,1]
head(xpr_genes)
## [1] "ZNF121"  "OR2J3"   "HMOX1"   "SYT4"    "GPR137C" "AKAP12"
pathwayList <- lapply(pathwayList, function(x) x[which(x %in% xpr_genes)])
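
When the gene names come from an expression table instead of a stored file, the same filter applies. The sketch below assumes a matrix with one row per gene and one column per patient, which is an assumption about your input and not something this notebook loads; toyXpr and toyPathways are made-up data. It also drops pathways that are left without any measured genes.

```r
# Toy expression matrix: rows are genes, columns are patients (assumed layout).
toyXpr <- matrix(rnorm(12), nrow = 4,
                 dimnames = list(c("HMOX1", "SYT4", "AKAP12", "ZNF121"),
                                 c("P1", "P2", "P3")))
measuredGenes <- rownames(toyXpr)

toyPathways <- list(PATHWAY_A = c("HMOX1", "SYT4", "TP53"),
                    PATHWAY_B = c("BRCA1", "BRCA2"))

# Keep only measured genes, then discard pathways that became empty.
filtered <- lapply(toyPathways, function(x) x[x %in% measuredGenes])
filtered <- filtered[lengths(filtered) > 0]
str(filtered)
```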

Generate results

inDir <- sprintf("%s/extdata/KIRC_output",
    path.package("netDx.examples"))
out <- plotAllResults(pheno, inDir,outDir=sprintf("%s/plots",getwd()),
               fsCutoff=10,fsPctPass=0.7,pathwaySet=pathwayList)
## * Plotting average and detailed performance
## Single directory provided, retrieving prediction files
## SURVIVEYES
##  Single directory provided, retrieving CV score files
## Got 100 iterations
## * Computing consensus
## SURVIVENO
##  Single directory provided, retrieving CV score files
## Got 100 iterations
## * Computing consensus
## * pathwaySet provided; generating EnrichmentMaps
##       apiVersion cytoscapeVersion 
##             "v1"          "3.5.1" 
## * Applying AutoAnnotate
## * Importing node attributes
## * Creating or applying style
## POST-ing style
## * Final cleanup
## [1] "http://localhost:1234/v1/commands/view/export?OutputFile=/Users/shraddhapai/Documents/Software/netDx/examples/plots/EMap/EnrichmentMap_SURVIVEYES.png"
##       apiVersion cytoscapeVersion 
##             "v1"          "3.5.1" 
## * Applying AutoAnnotate
## * Importing node attributes
## * Creating or applying style
## * Final cleanup
## [1] "http://localhost:1234/v1/commands/view/export?OutputFile=/Users/shraddhapai/Documents/Software/netDx/examples/plots/EMap/EnrichmentMap_SURVIVENO.png"
## * Generating overall patient similarity view
## 2 classes: { SURVIVEYES,SURVIVENO }
## * Creating style
## POST-ing style
## Group SURVIVEYES
## Group SURVIVENO
## * Computing aggregate net
## 
## Writing aggregate PSN
## Loading required package: reshape2

## 
##  12234 pairs have no edges (counts directed edges)
##  Sparsity = 10266/11175 (92 %)
## * Creating network in Cytoscape
## * Create network URL
## Network ID is : 214 
## * Applying layout
## * Applying style
## * Exporting to PNG

The EnrichmentMaps and integrated PSN should now be present in Cytoscape, as well as PNG files in the output directory.
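
One way to confirm that the export step worked is to check for the expected image files. The sketch below builds the file names from the class labels and the paths shown in the log above; the helper name expectedPngs is hypothetical.

```r
# Hypothetical helper: paths of the PNG files this example is expected to write.
expectedPngs <- function(outDir, classes) {
  c(file.path(outDir, "EMap", sprintf("EnrichmentMap_%s.png", classes)),
    file.path(outDir, "PSN", "outputPDN.png"))
}

pngFiles <- expectedPngs(file.path(getwd(), "plots"),
                         c("SURVIVEYES", "SURVIVENO"))
# One row per expected file, with a flag showing whether it is on disk.
data.frame(file = basename(pngFiles), exists = file.exists(pngFiles))
```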

Look at results

Feature scores:

  • featScores/<class>_featScores.txt: Table of feature scores for all splits of nested CV.
  • featScores/<class>_FeatSel_cutoff<fsCutoff>_pct<fsPctPass>.txt: Features selected for this class
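
The fsCutoff and fsPctPass arguments appear in the name of the selected-feature file. A plausible reading, which is an assumption here and should be checked against the netDx documentation, is that a feature is selected when it scores at least fsCutoff in at least a fraction fsPctPass of the splits. The sketch below applies that rule to a made-up score matrix; selectFeatures is a hypothetical helper, not a netDx function.

```r
# Hypothetical consensus rule (assumed, not taken from the netDx source):
# keep features scoring >= fsCutoff in >= fsPctPass of the splits.
selectFeatures <- function(scores, fsCutoff = 10, fsPctPass = 0.7) {
  pctPass <- rowMeans(scores >= fsCutoff, na.rm = TRUE)
  rownames(scores)[pctPass >= fsPctPass]
}

# Toy scores: 3 features (rows) across 4 splits (columns).
toyScores <- rbind(PATHWAY_A = c(10, 10, 9, 10),   # passes in 3/4 = 0.75
                   PATHWAY_B = c(10, 8, 7, 10),    # passes in 2/4 = 0.50
                   PATHWAY_C = c(10, 10, 10, 10))  # passes in 4/4 = 1.00
selectFeatures(toyScores)
```

Under this rule, PATHWAY_A and PATHWAY_C are kept and PATHWAY_B is dropped.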

PSN-related data:

  • PSN/outputPDN.png: Integrated patient dissimilarity network
  • PSN/aggregateNet_filterEdgeWt0.00_MEAN.txt: Aggregate PSN from combining all feature-selected nets
  • PSN/predictor_prunedNet_top0.20.txt: Pruned dissimilarity network that is the input to the visualization in Cytoscape
  • PSN/pool/*txt: All interaction nets that were input for the aggregate PSN
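
The denominator in the "Sparsity = 10266/11175 (92 %)" line of the log matches the number of distinct patient pairs among the 150 tumours. The short calculation below reproduces the two numbers. What exactly the numerator counts is not stated in this notebook, so only the arithmetic is shown.

```r
nPatients <- 150
possiblePairs <- choose(nPatients, 2)   # 150 * 149 / 2 = 11175
reported <- 10266                       # numerator printed in the log
pct <- round(100 * reported / possiblePairs)
cat(sprintf("%d/%d (%d %%)\n",
            as.integer(reported), as.integer(possiblePairs), as.integer(pct)))
```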

Enrichment Map:

  • EMap/EnrichmentMap_<class>.png: View of EnrichmentMap for each class
  • EMap/<class>_<yymmdd>.gmt: GMT file that serves as input for EnrichmentMap in Cytoscape
  • EMap/<class>_nodeAttrs_<yymmdd>.txt: Node attribute table with “maxScore” column that is mapped to node fill in Cytoscape.

dir(sprintf("%s/plots",getwd()),recursive=TRUE)
##  [1] "EMap/EnrichmentMap_SURVIVENO.png"                  
##  [2] "EMap/EnrichmentMap_SURVIVEYES.png"                 
##  [3] "EMap/SURVIVENO_170913.gmt"                         
##  [4] "EMap/SURVIVENO_nodeAttrs_170913.txt"               
##  [5] "EMap/SURVIVEYES_170913.gmt"                        
##  [6] "EMap/SURVIVEYES_nodeAttrs_170913.txt"              
##  [7] "PSN/aggregateNet_filterEdgeWt0.00_MEAN.txt"        
##  [8] "PSN/outputPDN.png"                                 
##  [9] "PSN/pool/1.SURVIVENO.1.1.txt"                      
## [10] "PSN/pool/1.SURVIVENO.1.13.txt"                     
## [11] "PSN/pool/1.SURVIVENO.1.15.txt"                     
## [12] "PSN/pool/1.SURVIVENO.1.18.txt"                     
## [13] "PSN/pool/1.SURVIVENO.1.3.txt"                      
## [14] "PSN/pool/1.SURVIVEYES.1.23.txt"                    
## [15] "PSN/pool/1.SURVIVEYES.1.31.txt"                    
## [16] "PSN/pool/netInfo.txt"                              
## [17] "PSN/predictor_prunedNet_top0.20.txt"               
## [18] "featScores/SURVIVENO_FeatSel_cutoff10_pct0.70.txt" 
## [19] "featScores/SURVIVENO_featScores.txt"               
## [20] "featScores/SURVIVEYES_FeatSel_cutoff10_pct0.70.txt"
## [21] "featScores/SURVIVEYES_featScores.txt"

sessionInfo

sessionInfo()
## R version 3.3.1 (2016-06-21)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.10.5 (Yosemite)
## 
## locale:
## [1] C/C/C/C/C/en_CA.UTF-8
## 
## attached base packages:
## [1] stats4    parallel  stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] reshape2_1.4.2       netDx.examples_0.1   netDx_0.94          
##  [4] RColorBrewer_1.1-2   pracma_2.0.7         ROCR_1.0-7          
##  [7] gplots_3.0.1         GenomicRanges_1.26.4 GenomeInfoDb_1.10.3 
## [10] IRanges_2.8.2        S4Vectors_0.12.2     BiocGenerics_0.20.0 
## [13] combinat_0.0-8       doParallel_1.0.10    iterators_1.0.8     
## [16] foreach_1.4.3        bigmemory_4.5.19     bigmemory.sri_0.1.3 
## [19] EasycyRest_0.1       r2cytoscape_0.0.3    XML_3.98-1.9        
## [22] RJSONIO_1.3-0        httr_1.3.1          
## 
## loaded via a namespace (and not attached):
##  [1] gtools_3.5.0       colorspace_1.3-2   htmltools_0.3.6   
##  [4] yaml_2.1.14        rlang_0.1.2        withr_2.0.0       
##  [7] plyr_1.8.4         stringr_1.2.0      zlibbioc_1.20.0   
## [10] munsell_0.4.3      gtable_0.2.0       devtools_1.13.3   
## [13] caTools_1.17.1     codetools_0.2-15   memoise_1.1.0     
## [16] evaluate_0.10.1    knitr_1.17         curl_2.8.1        
## [19] Rcpp_0.12.12       KernSmooth_2.23-15 backports_1.1.0   
## [22] scales_0.4.1       gdata_2.18.0       XVector_0.14.1    
## [25] ggplot2_2.2.1      digest_0.6.12      stringi_1.1.5     
## [28] rprojroot_1.2      grid_3.3.1         quadprog_1.5-5    
## [31] tools_3.3.1        bitops_1.0-6       magrittr_1.5      
## [34] RCurl_1.95-4.8     lazyeval_0.2.0     tibble_1.3.3      
## [37] pkgconfig_2.0.1    rmarkdown_1.6      R6_2.2.2          
## [40] igraph_1.1.2       git2r_0.19.0

References

  1. Yuan, Y. et al. (2014). Assessing the clinical utility of cancer genomic and proteomic data across tumor types. Nat Biotechnol 32, 644-52.
  2. The Cancer Genome Atlas Research Network (2013). Comprehensive molecular characterization of clear cell renal cell carcinoma. Nature 499, 43-9.
  3. Merico, D., Isserlin, R. & Bader, G.D. (2011). Visualizing gene-set enrichment results using the Cytoscape plug-in enrichment map. Methods Mol Biol 781, 257-77.
  4. Kucera, M., Isserlin, R., Arkhangorodsky, A. & Bader, G.D. (2016). AutoAnnotate: A Cytoscape app for summarizing networks with semantic annotations. F1000Res 5, 1717.