Dev by escauley · Pull Request #71 · NIDAP-Community/DSPWorkflow · GitHub
Merged

Dev #71

66 commits
2c59b63
Update Helper script and Action file
ruiheesi Mar 7, 2023
377a144
Update Helper script for select normalized
ruiheesi Mar 7, 2023
143e624
Updates to Description and deleted test files
escauley Apr 7, 2023
4a0172f
Merge branch 'dev' of https://github.com/NIDAP-Community/DSPWorkflow …
escauley Apr 7, 2023
3a4534d
Updated formatting in DESCRIPTION and redocumented NAMESPACE
escauley Apr 7, 2023
71acf0a
format update for DESCRIPTION
escauley Apr 7, 2023
65eb588
Author info adjusted to one line each
escauley Apr 7, 2023
72721e3
Update syntax error
ruiheesi Apr 7, 2023
3e39d1e
Merge pull request #39 from NIDAP-Community/update_description
ruiheesi Apr 7, 2023
ba45cbe
Updating maintainer roles
escauley Apr 7, 2023
158f578
Merge branch 'dev' of https://github.com/NIDAP-Community/DSPWorkflow …
escauley Apr 7, 2023
053b6ea
update library section
bianjh-cloud Apr 10, 2023
ad69385
Updated NSCLC integration test working directory
escauley Apr 10, 2023
8531469
test commit for spatial decon branch
escauley Apr 10, 2023
e376b2e
Merge branch 'violinPlot' of https://github.com/NIDAP-Community/DSPWo…
bianjh-cloud Apr 10, 2023
65cd48c
update library of spatial deconv
bianjh-cloud Apr 10, 2023
3679d67
Updates to integration markdowns
escauley Apr 11, 2023
9083b9a
Updated
Apr 12, 2023
5f1a430
Updated
Apr 12, 2023
7ccb6e2
Merge branch 'dev' into Filtering
ChadAHighfill Apr 12, 2023
bf75a32
changed DEG testing to create temp figures
maggiecam Apr 13, 2023
feb642d
Merge branch 'diffExpr' of https://github.com/NIDAP-Community/DSPWork…
maggiecam Apr 13, 2023
a9a883a
new snapshots for DEG test
maggiecam Apr 13, 2023
037f8aa
fix aes inheritance for add.points
ammichalowski Apr 15, 2023
f825f74
Merge pull request #48 from NIDAP-Community/DimReduct
ammichalowski Apr 15, 2023
7b2e447
Updates to description file, mouse int test
escauley Apr 18, 2023
2338d9a
Merge branch 'dev' of https://github.com/NIDAP-Community/DSPWorkflow …
escauley Apr 18, 2023
95c2294
Error checking for dcc download, updates for test dataset parameters
escauley Apr 18, 2023
f5b0c42
Added placeholder files in downloaded fixtures folders
escauley Apr 19, 2023
efd67a3
Added check for max num of cores available
escauley Apr 19, 2023
343ecaf
Updated diff exp helper for snap folder path
escauley Apr 19, 2023
2afd535
Merge branch 'dev' into Filtering
escauley Apr 20, 2023
5e3b904
Updates for NSCLC filtering
escauley Apr 20, 2023
8b7d22e
NSCLC fixes for Filtering
escauley Apr 20, 2023
99f2ac1
Merge branch 'dev' into Filtering
escauley Apr 20, 2023
128cf85
Merge pull request #54 from NIDAP-Community/Filtering
escauley Apr 20, 2023
7c711af
Updated fixtures for NSCLC test dataet and other cleaned up old fixtures
escauley Apr 20, 2023
589c82d
Merge branch 'dev' of https://github.com/NIDAP-Community/DSPWorkflow …
escauley Apr 20, 2023
31116fd
Merge branch 'dev' into violinPlot
escauley Apr 20, 2023
76e75a4
Merge pull request #40 from NIDAP-Community/violinPlot
escauley Apr 20, 2023
0b140de
Merge branch 'dev' into spatialDeconvolution
escauley Apr 20, 2023
7fe5fc3
Merge pull request #55 from NIDAP-Community/spatialDeconvolution
escauley Apr 20, 2023
228786d
fixed path on snap files
escauley Apr 20, 2023
278529d
testing skipping snapshot testing on github ci
escauley Apr 20, 2023
f945185
Added skip on ci function for snapshot testing
escauley Apr 20, 2023
92ac7b0
Added skip on ci for NSCLC snapshot
escauley Apr 20, 2023
a787581
Fixed merge conflicts for diff exp and dev
escauley Apr 20, 2023
d1ef2fa
Merge pull request #59 from NIDAP-Community/diffExpr
escauley Apr 20, 2023
2def830
Added complex heatmap package and updated NAMESPACE
escauley Apr 20, 2023
937eef9
Merge branch 'dev' of https://github.com/NIDAP-Community/DSPWorkflow …
escauley Apr 20, 2023
93ed521
Updated library for spatial decon and rewrote NAMESPACE
escauley Apr 20, 2023
c223f08
Added ngeoMean function import
escauley Apr 20, 2023
d932d44
Updated for lowercamalcase
Apr 21, 2023
5ce7711
Updated for lowercamalcase
Apr 21, 2023
031e94e
Merge pull request #63 from NIDAP-Community/Filtering
ChadAHighfill Apr 21, 2023
00d2098
Merge pull request #64 from NIDAP-Community/GeoMxNorm
ChadAHighfill Apr 21, 2023
3a7a118
Update gitflow-R-action.yml
ruiheesi Apr 21, 2023
fb2a804
Merge pull request #65 from NIDAP-Community/Update_Github_Action
ruiheesi Apr 21, 2023
694fb61
qcProc clean warnings 1
ammichalowski Apr 22, 2023
016d89b
Update gitflow-R-action.yml
escauley Apr 24, 2023
f302908
add output table in return description
ammichalowski Apr 24, 2023
a88347f
Merge pull request #67 from NIDAP-Community/qcProc
ammichalowski Apr 24, 2023
cae3ff0
ttheme_default
maggiecam Apr 24, 2023
ee4d318
Merge pull request #69 from NIDAP-Community/diffexpr_fix
maggiecam Apr 24, 2023
b4261aa
modified spatDeconv helper
bianjh-cloud Apr 24, 2023
13a9ebc
Merge pull request #70 from NIDAP-Community/spatDeconv2
bianjh-cloud Apr 24, 2023
2 changes: 1 addition & 1 deletion .github/workflows/gitflow-R-action.yml
30 changes: 19 additions & 11 deletions DESCRIPTION
@@ -1,10 +1,20 @@
 Package: DSPWorkflow
-Title: What the Package Does (One Line, Title Case)
-Version: 1.0.0.0
-Authors@R:
-    person("First", "Last", , "first.last@example.com", role = c("aut", "cre"),
-    comment = c(ORCID = "YOUR-ORCID-ID"))
-Description: The DSP Workflow addresses a growing need to streamline the analysis of Spatial Transcriptomics data produced from Digital Spatial Profiling Technology (NanoString). It can be run in a docker container, and for biologists, in user-friendly web-based interactive notebooks (NIDAP, Palantir Foundry).
+Title: A Workflow for Analyzing Digital Spatial Profiling RNA Data
+Version: 0.9.2.0
+Authors@R: c(person("Rui", "He", email = "rui.he@nih.gov", role = "aut"),
+    person("Maggie", "Cam", email = "maggie.cam@nih.gov", role = "aut", comment = c(ORCID = "0000-0001-8190-9766")),
+    person("Ned", "Cauley", email = "ned.cauley@nih.gov", role = c("aut", "cre"), comment = c(ORCID = "0000-0002-8968-6621")),
+    person("Jing", "Bian", email = "bianjh@nih.gov", role = "aut", comment = c(ORCID = "0000-0001-7109-716X")),
+    person("Difei", "Wang", email = "difei.wang2@nih.gov", role = "aut", comment = c(ORCID = "0000-0003-4088-3859")),
+    person("Chad", "Highfill", email = "chad.highfill@nih.gov", role = "aut", comment = c(ORCID = "0000-0003-0046-3593")))
+Description: A set of functions for analyzing RNA data from the spatial
+    transcriptomics approach Digital Spatial Profiling (Nanostring). The user
+    provides read count data and annotations, and the package outputs
+    normalized differential expression of genes and further visualizations and
+    analysis based on user input. It can be run in a docker container and in
+    user-friendly web-based interactive notebooks (NIDAP, Palantir Foundry).
+URL: https://github.com/NIDAP-Community/DSPWorkflow
+BugReports: https://github.com/NIDAP-Community/DSPWorkflow/issues
 License: MIT + file LICENSE
 Encoding: UTF-8
 Roxygen: list(markdown = TRUE)
@@ -14,7 +24,6 @@ Suggests:
 Depends:
     R (>= 3.6)
 Imports:
-    backports (>= 1.4.1),
     Biobase (>= 2.54.0),
     BiocGenerics (>= 0.40.0),
     cowplot (>= 1.1.1),
@@ -23,21 +32,20 @@ Imports:
ggforce (>= 0.3.4),
ggplot2 (>= 3.3.6),
gridExtra (>= 2.3),
grid (>= 4.1.3),
gtable (>= 0.3.0),
knitr (>= 1.40),
NanoStringNCTools (>= 1.2.0),
patchwork (>= 1.1.2),
reshape2 (>= 1.4.4),
Rmpfr (>= 0.8-9),
Rtsne (>= 0.16),
scales (>= 1.2.1),
stats (>= 4.1.3),
SpatialDecon (>= 1.4.3),
tibble (>= 3.1.8),
tidyr (>= 1.2.1),
tidyverse (>= 1.3.2),
umap (>= 0.2.9.0),
pheatmap (>= 1.0.12),
stringr,
magrittr
magrittr (>= 2.0.3),
ComplexHeatmap (>= 2.10.0)
Config/testthat/edition: 3
19 changes: 11 additions & 8 deletions NAMESPACE
@@ -3,19 +3,15 @@
export(diffExpr)
export(dimReduct)
export(filtering)
export(geomxnorm)
export(geomxNorm)
export(heatMap)
export(qcProc)
export(spatialDeconvolution)
export(studyDesign)
export(violinPlot)
import(GeomxTools)
import(Biobase)
import(NanoStringNCTools)
import(SpatialDecon)
import(ggplot2)
import(gridExtra)
import(pheatmap)
import(stats)
importFrom(Biobase,assayDataElement)
importFrom(Biobase,exprs)
importFrom(Biobase,fData)
@@ -26,8 +22,10 @@ importFrom(BiocGenerics,annotation)
 importFrom(BiocGenerics,colnames)
 importFrom(BiocGenerics,rbind)
 importFrom(BiocGenerics,rownames)
+importFrom(ComplexHeatmap,pheatmap)
 importFrom(GeomxTools,aggregateCounts)
 importFrom(GeomxTools,mixedModelDE)
+importFrom(GeomxTools,ngeoMean)
 importFrom(GeomxTools,normalize)
 importFrom(GeomxTools,readNanoStringGeoMxSet)
 importFrom(GeomxTools,setBioProbeQCFlags)
@@ -38,6 +36,9 @@ importFrom(NanoStringNCTools,esBy)
 importFrom(NanoStringNCTools,negativeControlSubset)
 importFrom(NanoStringNCTools,sData)
 importFrom(Rtsne,Rtsne)
+importFrom(SpatialDecon,create_profile_matrix)
+importFrom(SpatialDecon,derive_GeoMx_background)
+importFrom(SpatialDecon,spatialdecon)
 importFrom(cowplot,plot_grid)
 importFrom(dplyr,arrange)
 importFrom(dplyr,count)
@@ -72,27 +73,29 @@ importFrom(ggplot2,scale_y_continuous)
importFrom(ggplot2,theme)
importFrom(ggplot2,theme_bw)
importFrom(ggplot2,theme_classic)
importFrom(graphics,boxplot)
importFrom(grid,gpar)
importFrom(grid,grid.draw)
importFrom(grid,grid.newpage)
importFrom(grid,grobHeight)
importFrom(grid,textGrob)
importFrom(gridExtra,arrangeGrob)
importFrom(gridExtra,grid.arrange)
importFrom(gridExtra,tableGrob)
importFrom(gridExtra,ttheme_default)
importFrom(gtable,gtable_add_grob)
importFrom(gtable,gtable_add_rows)
importFrom(knitr,kable)
importFrom(magrittr,"%>%")
importFrom(parallel,detectCores)
importFrom(patchwork,guide_area)
importFrom(patchwork,patchworkGrob)
importFrom(patchwork,plot_annotation)
importFrom(patchwork,plot_layout)
importFrom(patchwork,wrap_elements)
importFrom(patchwork,wrap_plots)
importFrom(pheatmap,pheatmap)
importFrom(reshape2,melt)
importFrom(scales,percent)
importFrom(stats,as.formula)
importFrom(stats,p.adjust)
importFrom(stats,prcomp)
importFrom(stats,quantile)
16 changes: 15 additions & 1 deletion R/differential_expression_analysis.R
@@ -46,10 +46,11 @@
 #' @importFrom grid grid.newpage textGrob gpar grobHeight grid.draw
 #' @importFrom gtable gtable_add_rows gtable_add_grob
 #' @importFrom tibble rownames_to_column
-#' @importFrom gridExtra tableGrob
+#' @importFrom gridExtra tableGrob ttheme_default
 #' @importFrom BiocGenerics rownames colnames rbind
 #' @importFrom magrittr %>%
 #' @importFrom Biobase pData assayDataElement
+#' @importFrom parallel detectCores
 #' @export
 #'
 #' @return a list containing mixed model output data frame, grid tables for
@@ -71,6 +72,19 @@ diffExpr <- function(object,
 pval.lim.1 = 0.05,
 pval.lim.2 = 0.01) {
 
+# Check the number of cores available for the current machine
+available.cores <- detectCores()
+
+if (n.cores > available.cores) {
+print(paste0("The number of cores selected is greater than the number of available cores, reducing number of cores to maximum of ", available.cores))
+n.cores <- available.cores
+}
+
+# Adjust the number of cores selected within the machine's range
+
+
+
+
 testClass <- testRegion <- Gene <- Subset <- NULL
 
 # convert test variables to factors after checking input
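The core-count check added to `diffExpr` above can be isolated into a small helper for testing. The following is a minimal sketch, not code from this PR: `clampCores` is a hypothetical name, it uses `message()` rather than `print()`, and it adds a fallback for the case where `parallel::detectCores()` returns `NA`, which the PR's version does not handle.

```r
# Hypothetical helper (not part of DSPWorkflow): cap a requested core count
# at what the machine reports, and never go below 1.
clampCores <- function(n.cores, available.cores = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to a single core.
  if (is.na(available.cores) || available.cores < 1) {
    available.cores <- 1L
  }
  if (n.cores > available.cores) {
    message("Requested ", n.cores, " cores but only ", available.cores,
            " are available; using ", available.cores, ".")
    n.cores <- available.cores
  }
  max(1L, as.integer(n.cores))
}

# Passing available.cores explicitly makes the behaviour reproducible.
clampCores(64, available.cores = 8)
clampCores(2, available.cores = 8)
```

Taking `available.cores` as an argument with a default keeps the helper testable on CI runners whose core count is not known in advance.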
47 changes: 24 additions & 23 deletions R/filtering.R
@@ -22,7 +22,7 @@
 # loq.cutoff 2 is recommended, loq.min 2 is recommend,
 # cut.segment = remove segments with less than 10% of the genes detected; .05-.1 recommended,
 # goi = goi (genes of interest). Must be a vector of genes (i.e c("PDCD1", "CD274")),
-filtering <- function(object, pkc.file, loq.cutoff, loq.min, cut.segment, goi) {
+filtering <- function(object, loq.cutoff, loq.min, cut.segment, goi) {
 
 if(class(object)[1] != "NanoStringGeoMxSet"){
 stop(paste0("Error: You have the wrong data class, must be NanoStringGeoMxSet" ))
@@ -41,7 +41,8 @@ filtering <- function(object, pkc.file, loq.cutoff, loq.min, cut.segment, goi) {
 stop(paste0("Error: You have the wrong data class, must be numeric" ))
 }
 # Define Modules
-pkc.file <- pkc.file
+#pkc.file <- pkc.file
+pkc.file <- annotation(object)
 if(class(pkc.file)[1] != "character"){
 stop(paste0("Error: You have the wrong data class, must be character" ))
 }
@@ -65,22 +66,22 @@ filtering <- function(object, pkc.file, loq.cutoff, loq.min, cut.segment, goi) {
 pData(object)$loq <- loq
 
 ## 4.5.0 Filtering
-loq_mat <- c()
+loq.mat <- c()
 for(module in modules) {
 ind <- fData(object)$Module == module
-mat_i <- t(esApply(object[ind, ], MARGIN = 1,
+mat.i <- t(esApply(object[ind, ], MARGIN = 1,
 FUN = function(x) {
 x > loq[, module]
 }))
-loq_mat <- rbind(loq_mat, mat_i)
+loq.mat <- rbind(loq.mat, mat.i)
 }
 # ensure ordering since this is stored outside of the geomxSet
-loq_mat <- loq_mat[fData(object)$TargetName, ]
+loq.mat <- loq.mat[fData(object)$TargetName, ]
 
 ##4.5.1S egment Gene Detection
 # Save detection rate information to pheno data
 pData(object)$GenesDetected <-
-colSums(loq_mat, na.rm = TRUE)
+colSums(loq.mat, na.rm = TRUE)
 pData(object)$GeneDetectionRate <-
 pData(object)$GenesDetected / nrow(object)
 
@@ -114,17 +115,17 @@ filtering <- function(object, pkc.file, loq.cutoff, loq.min, cut.segment, goi) {
 
 # select the annotations we want to show, use `` to surround column names with
 # spaces or special symbols
-count_mat <- count(pData(object), `slide name`, class, region, segment)
+count.mat <- count(pData(object), `slide name`, class, region, segment)
 if(class(object)[1] != "NanoStringGeoMxSet"){
 stop(paste0("Error: You have the wrong data class, must be NanoStringGeoMxSet" ))
 }
 # simplify the slide names
-count_mat$`slide name` <- gsub("disease", "d", gsub("normal", "n", count_mat$`slide name`))
+count.mat$`slide name` <- gsub("disease", "d", gsub("normal", "n", count.mat$`slide name`))
 # gather the data and plot in order: class, slide name, region, segment
-test_gr <- gather_set_data(count_mat, 1:4)
-test_gr$x <-factor(test_gr$x, levels = c("class", "slide name", "region", "segment"))
+test.gr <- gather_set_data(count.mat, 1:4)
+test.gr$x <-factor(test.gr$x, levels = c("class", "slide name", "region", "segment"))
 # plot Sankey
-sankey.plot<- ggplot(test_gr, aes(x, id = id, split = y, value = n)) +
+sankey.plot<- ggplot(test.gr, aes(x, id = id, split = y, value = n)) +
 geom_parallel_sets(aes(fill = region), alpha = 0.5, axis.width = 0.1) +
 geom_parallel_sets_axes(axis.width = 0.2) +
 geom_parallel_sets_labels(color = "white", size = 5) +
@@ -143,8 +144,8 @@ filtering <- function(object, pkc.file, loq.cutoff, loq.min, cut.segment, goi) {
 
 ##4.5.2 Gene Detection Rate
 # Calculate detection rate:
-loq_mat <- loq_mat[, colnames(object)]
-fData(object)$DetectedSegments <- rowSums(loq_mat, na.rm = TRUE)
+loq.mat <- loq.mat[, colnames(object)]
+fData(object)$DetectedSegments <- rowSums(loq.mat, na.rm = TRUE)
 fData(object)$DetectionRate <-
 fData(object)$DetectedSegments / nrow(pData(object))
 
@@ -153,20 +154,20 @@ filtering <- function(object, pkc.file, loq.cutoff, loq.min, cut.segment, goi) {
 if(class(goi)[1] != "character"){
 stop(paste0("Error: You have the wrong data class, must be character vector" ))
 }
-goi_df <- data.frame(Gene = goi,
+goi.df <- data.frame(Gene = goi,
 Number = fData(object)[goi, "DetectedSegments"],
 DetectionRate = percent(fData(object)[goi, "DetectionRate"]))
 
 ## 4.5.3 Gene Filtering
 # Plot detection rate:
-plot_detect <- data.frame(Freq = c(1, 5, 10, 20, 30, 50))
-plot_detect$Number <-
+plot.detect <- data.frame(Freq = c(1, 5, 10, 20, 30, 50))
+plot.detect$Number <-
 unlist(lapply(c(0.01, 0.05, 0.1, 0.2, 0.3, 0.5),
 function(x) {sum(fData(object)$DetectionRate >= x)}))
-plot_detect$Rate <- plot_detect$Number / nrow(fData(object))
-rownames(plot_detect) <- plot_detect$Freq
+plot.detect$Rate <- plot.detect$Number / nrow(fData(object))
+rownames(plot.detect) <- plot.detect$Freq
 
-genes.detected.plot <- ggplot(plot_detect, aes(x = as.factor(Freq), y = Rate, fill = Rate)) +
+genes.detected.plot <- ggplot(plot.detect, aes(x = as.factor(Freq), y = Rate, fill = Rate)) +
 geom_bar(stat = "identity") +
 geom_text(aes(label = formatC(Number, format = "d", big.mark = ",")),
 vjust = 1.6, color = "black", size = 4) +
@@ -182,10 +183,10 @@ filtering <- function(object, pkc.file, loq.cutoff, loq.min, cut.segment, goi) {
 
 # Subset to target genes detected in at least 10% of the samples.
 # Also manually include the negative control probe, for downstream use
-negativeProbefData <- subset(fData(object), CodeClass == "Negative")
-neg_probes <- unique(negativeProbefData$TargetName)
+negative.probe.fData <- subset(fData(object), CodeClass == "Negative")
+neg.probes <- unique(negative.probe.fData$TargetName)
 object <- object[fData(object)$DetectionRate >= 0.1 |
-fData(object)$TargetName %in% neg_probes, ]
+fData(object)$TargetName %in% neg.probes, ]
 
 # retain only detected genes of interest
 goi <- goi[goi %in% rownames(object)]
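The detection logic that `filtering` applies to a `NanoStringGeoMxSet` can be illustrated on a plain matrix. This is a rough sketch in base R under stated assumptions, not the package's implementation: the counts, the per-segment LOQ values, and the target names are invented toy data, and a single module is assumed, so the per-module loop in the diff collapses to one comparison.

```r
# Toy stand-in for the expression matrix: 4 targets (rows) x 3 segments (columns).
# All values are made up for illustration.
counts <- matrix(
  c(10,  1,  8,
     2,  1,  1,
    30, 25, 40,
     1,  2,  1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC", "NegProbe"),
                  c("seg1", "seg2", "seg3"))
)

# Hypothetical per-segment limit of quantification (LOQ).
loq <- c(seg1 = 4, seg2 = 4, seg3 = 4)

# TRUE where a target's count exceeds the LOQ of its segment.
loq.mat <- sweep(counts, 2, loq, FUN = ">")

# Per-segment summary: number and fraction of targets detected.
genes.detected <- colSums(loq.mat, na.rm = TRUE)
gene.detection.rate <- genes.detected / nrow(counts)

# Per-target summary: number and fraction of segments in which it is detected.
detected.segments <- rowSums(loq.mat, na.rm = TRUE)
detection.rate <- detected.segments / ncol(counts)

# Keep targets detected in at least 10% of segments, plus negative probes.
neg.probes <- "NegProbe"
kept <- counts[detection.rate >= 0.1 | rownames(counts) %in% neg.probes, ]
rownames(kept)
```

In this toy data `GeneB` is never above the LOQ and is dropped, while `NegProbe` is also never detected but is retained because it is listed as a negative control, mirroring the `neg.probes` condition in the diff.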
4 changes: 2 additions & 2 deletions R/heatmap.R
@@ -52,11 +52,11 @@
 #'
 #' @importFrom NanoStringNCTools assayDataApply
 #' @importFrom Biobase assayDataElement
-#' @importFrom pheatmap pheatmap
+#' @importFrom ComplexHeatmap pheatmap
 #'
 #' @export
 #'
-#' @return A list containing the plot genes data matrix, and the heatmap plot.
+#' @return A list containing the plot genes data matrix, and the heatmap plot
 ##
 heatMap <- function(
 object,
16 changes: 7 additions & 9 deletions R/normalization.R