This vignette serves to replicate the examples presented in the spatialHeatmap manuscript. As the pertinent functionalities have been introduced in the spatialHeatmap package's vignettes, this document offers concise explanations of each relevant functionality. Some code chunks are not evaluated when buiding this vignette due to long computation time.

To build this vignette, download the folder containing the source code, then run rmarkdown::render('manuscript_examples.Rmd') in the downloaded folder.

1 Getting started

The required packages and helper functions are loaded. To reduce runtime, intermediate results can be cached under ~/.cache/shm.

# R: 4.2.0; Bioconductor: 3.15
library(spatialHeatmap); library(SummarizedExperiment); library(ExpressionAtlas); library(ggplot2)
library(biomaRt) # v2.54.0 
library(enrichplot); library(clusterProfiler) # v4.6.0
source('function/fun.R') # Helper functions.  
cache.pa <- '~/.cache/shm'

2 Spatial Enrichment

The Spatial Enrichment (SpEn) functionality is designed to identify groups of biomolecules (e.g. RNAs, proteins, metabolites) that are particularly abundant or enriched in certain spatial regions, such as tissue-specific transcripts. The chosen mouse RNA-Seq data compares tissue level gene expression across mammalian species (Merkin et al. 2012). The following searches the Expression Atlas for expression data from 'kidney' and 'Mus musculus'.

all.mus <- read_cache(cache.pa, 'all.mus') # Retrieve data from cache.
if (is.null(all.mus)) { # Save downloaded data to cache if it is not cached.
  all.mus <- searchAtlasExperiments(properties="kidney", species="Mus musculus")
  save_cache(dir=cache.pa, overwrite=TRUE, all.mus)
}

Among the many matching entries, accession 'E-MTAB-2801' is downloaded.

all.mus[7, ]
## DataFrame with 1 row and 4 columns
##     Accession      Species                  Type
##   <character>  <character>           <character>
## 1 E-MTAB-2801 Mus musculus RNA-seq of coding RNA
##                    Title
##              <character>
## 1 Strand-specific RNA-..
rse.mus <- read_cache(cache.pa, 'rse.mus') # Read data from cache.
if (is.null(rse.mus)) { # Save downloaded data to cache if it is not cached.
  rse.mus <- getAtlasData('E-MTAB-2801')[[1]][[1]]
  save_cache(dir=cache.pa, overwrite=TRUE, rse.mus)
}
colData(rse.mus)[1:3, ]
## DataFrame with 3 rows and 4 columns
##           AtlasAssayGroup     organism organism_part      strain
##               <character>  <character>   <character> <character>
## SRR594393              g7 Mus musculus         brain      DBA/2J
## SRR594394             g21 Mus musculus         colon      DBA/2J
## SRR594395             g13 Mus musculus         heart      DBA/2J

A targets file is loaded to the colData slot of rse.mus.

mus.tar <- system.file('extdata/shinyApp/data/target_mouse.txt', package='spatialHeatmap')
target.mus <- read.table(mus.tar, header=TRUE, row.names=1, sep='\t') # Importing
colData(rse.mus) <- DataFrame(target.mus) # Loading
target.mus[1:3, ]
##           AtlasAssayGroup     organism organism_part strain
## SRR594393              g7 Mus musculus         brain DBA.2J
## SRR594394             g21 Mus musculus         colon DBA.2J
## SRR594395             g13 Mus musculus         heart DBA.2J

Five spatial features ('brain', 'liver', 'lung', 'colon', and 'kidney') and three experiment variables (mouse strains 'DBA.2J', 'C57BL.6', and 'CD1') are subsetted. The com.by argument specifies whether the enrichment will be performed on spatial features (ft) or experiment variables (var). In the following, com.by is set ft, so SpEn is performed for spatial features and the variables under each spatial feature are treated as replicates.

data.sub.mus <- sf_var(data=rse.mus, feature='organism_part', ft.sel=c('brain', 'lung', 'colon', 'kidney', 'liver'), variable='strain', var.sel=c('DBA.2J', 'C57BL.6', 'CD1'), com.by='ft')

The subsetted data is filtered to remove unreliable expression values.

data.sub.mus.fil <- filter_data(data=data.sub.mus, pOA=c(0.5, 15), CV=c(0.8, 100), verbose=FALSE)

The Ensembl gene ids are converted to UniProt symbols.

data.sub.mus.fil <- cvt_mart(input=data.sub.mus.fil, biomart='ensembl', dataset="mmusculus_gene_ensembl", mirror='www', filters='ensembl_gene_id', attributes=c("ensembl_gene_id", "uniprot_gn_symbol", "entrezgene_id", "go_id", "description"), target='uniprot_gn_symbol', use.cache=TRUE, cache.na='ann.mus', cache.dir='~/.cache/shm')
ann.mus <- rowData(data.sub.mus.fil)

SpEn is performed for each spatial feature with 'edgeR'. The count data are internally normalized by the 'TMM' method from 'edgeR'. The enrichment results are selected by log2 fold change (log2.fc) and FDR (fdr). The outliers argument specifies a number of outliers allowed in references.

enr.res <- spatial_enrich(data.sub.mus.fil, method=c('edgeR'), norm='TMM', log2.fc=1, fdr=0.05, outliers=1)

The overlap of enriched biomolecules (type='up') across spatial features are presented in an UpSet plot (plot='upset').

ovl_enrich(enr.res, type='up', plot='upset', font.size=5, venn.arg=list(), order.by='freq')
Overlap of enriched biomolecules across spatial features.

Figure 1: Overlap of enriched biomolecules across spatial features

The enriched and depleted genes in liver are queried from the enrichment results. In the 'type' column, 'up' and 'down' refer to 'enriched' and 'depleted' respectively, while the 'total' column shows the total reference features excluding outliers.

en.liver <- query_enrich(enr.res, 'liver')
## Done!
up.liver <- subset(rowData(en.liver), type=='up' & total==4)
up.liver[1:3, 1:3]
## DataFrame with 3 rows and 3 columns
##              type     total      method
##       <character> <numeric> <character>
## F10            up         4       edgeR
## Itih1          up         4       edgeR
## Alb            up         4       edgeR
dn.liver <- subset(rowData(en.liver), type=='down' & total==4)
dn.liver[1:3, 1:3]
## DataFrame with 3 rows and 3 columns
##               type     total      method
##        <character> <numeric> <character>
## Mgat3         down         4       edgeR
## Osbpl6        down         4       edgeR
## Slc1a1        down         4       edgeR

The mouse brain aSVG is imported.

svg.mus.pa <- system.file("extdata/shinyApp/data", "mus_musculus.male.svg", package="spatialHeatmap")
svg.mus <- read_svg(svg.mus.pa)
cord <- svg.mus[, 2][[1]]; cord[1, 4] <- 0
attribute(svg.mus)[[1]] <- cord

One enriched (Itih1) and one depleted (Mgat3) gene in mouse liver are chosen to plot enrichment spatial heatmaps (SHMs).

# Aggregate replicates across spatial features by mean.
en.liver <- aggr_rep(en.liver, sam.factor='com.by', con.factor=NULL, aggr='mean')
dat.enrich <- SPHM(svg=svg.mus, bulk=en.liver)
shm(data=dat.enrich, ID=c('Itih1', 'Mgat3'), legend.r=1, legend.nrow=3, sub.title.size=12, ncol=2, bar.width=0.09, lay.shm='con')
## ggplots/grobs: mus_musculus.male.svg ... 
## ggplot: Itih1, con  
## ggplot: Mgat3, con  
## Legend plot ... 
## CPU cores: 1 
## Converting "ggplot" to "grob" ... 
## Itih1_con_1  Mgat3_con_1  
## Converting "ggplot" to "grob" ... 
## 
## Plot layout ... 
## Plot layout ...
## Done!
Enrichment SHMs. Top row: SHMs of spatially-enriched gene. Bottom row: SHMs of spatially-depleted gene.

Figure 2: Enrichment SHMs
Top row: SHMs of spatially-enriched gene. Bottom row: SHMs of spatially-depleted gene.

3 Data Mining

SHMs are suitable for comparing assay profiles among small number of biomolecules (e.g. genes) across spatial features. To also support analysis routines of larger number of biomolecules, spatialHeatmap integrates functionalities for identifying groups of biomolecules with similar and/or dissimilar assay profiles, and subsequently analyzing the results with data mining methods that scale to larger numbers of biomolecules than SHMs, such as hierarchical clustering and network analysis methods.

3.1 Hierarcical clustering

The replicates of organs in the assay data are aggregated by mean. The query gene used in this example is 'Ugp2' that is spatially enriched in mouse liver. After subsetting the aggregated assay matrix to the 15% most similar members from a correlation-based search (including the query), hierarchical clustering is performed. The clustering results are presented in the form of a matrix heatmap (Figure 3). In Figure 3, the rows and columns are sorted by the dendrograms obtained from the hierarchical clustering step, and the query gene is indicated by a black line.

# Aggregate replicates.
mus.nor.aggr <- aggr_rep(enr.res$data, sam.factor='organism_part', con.factor=NULL, aggr='mean')
# Subset assay data.
sub.mat <- submatrix(data=mus.nor.aggr, ID='Ugp2', p=0.15)
dim(sub.mat)
## [1] 1376    5
mhm.res <- matrix_hm(ID='Ugp2', data=sub.mat, scale='row', angleCol=80, angleRow=35, cexRow=1, cexCol=1.2, margin=c(10, 6), static=TRUE, arg.lis1=list(offsetRow=0.01, offsetCol=0.01))
Matrix Heatmap. Rows are genes and columns are samples. The query gene is    tagged by black lines.

Figure 3: Matrix Heatmap
Rows are genes and columns are samples. The query gene is tagged by black lines.

The row dendrograms are cut at the height h=3.685629 to obtain the cluster containing the query gene.

# Cut row dendrograms to obtain the cluster.  
cl <- cut_dendro(dendro=mhm.res$rowDendrogram, h=3.685629, target='Ugp2'); cl
##  [1] "Abcd3"    "Acat1"    "Aco1"     "Aldh3a2"  "Cers2"    "Chchd10" 
##  [7] "Cluh"     "Dbi"      "Dcaf11"   "Eif4ebp2" "Entpd5"   "Gstm1"   
## [13] "Herpud1"  "Hsd17b4"  "Idh1"     "Lonp2"    "Mia2"     "Prdx5"   
## [19] "Rxra"     "Sdc4"     "Slc25a39" "Sod1"     "Ugp2"
# Entrez gene ids of the cluster. 
df.entr <- subset(ann.mus, uniprot_gn_symbol %in% cl)
entriz <- unique(df.entr$entrezgene_id); entriz
##  [1] 216558  11671  76893  20181  20971  68066  13688  74148 338320
## [10]  12499  28199  20655  15488  54683  15926  13167  19299  11428
## [19]  64209 110446  66887 103172  14862
# Export the cluster containing the query gene. 
# rownames(df.entr) <- seq_len(nrow(df.entr))
# write.table(df.entr[, c('ensembl_gene_id', 'uniprot_gn_symbol', 'entrezgene_id')], 'cluster_hc.xlsx', col.names=T, row.names=T, sep='\t')

KEGG enrichment is performed on the cluster containing the query. The Peroxisome pathway (mmu04146) is the most highly enriched category (Figure 4). Peroxisomes are involved in many important metabolic processes including lipid metabolism in hepatocytes (Knebel et al. 2015). This is in alignment with the fact that the query gene 'Ugp2' is spatially enriched in liver.

kk <- enrichKEGG(gene = entriz, keyType='kegg', organism = 'mmu', pvalueCutoff = 0.05, pAdjustMethod='BH', qvalueCutoff=0.05)
## Reading KEGG annotation online: "https://rest.kegg.jp/link/mmu/pathway"...
## Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/mmu"...
df.kk <- as.data.frame(kk)
barplot(kk, showCategory=5)
KEGG enrichment result in the hierarchical clustering example.

Figure 4: KEGG enrichment result in the hierarchical clustering example

# browseKEGG(kk, 'mmu04146')

3.2 Network Analysis

The objective of network analysis is to identify network clusters that can be visualized in the form of network graphs. The network clusters represent groups of genes sharing highly similar expression profiles. Briefly, to run this analysis, a global network is constructed for the genes stored in the assay matrix that is subsequently partitioned into network clusters.

The same input assay matrix used in the hierarchical clustering example are used in the network analysis. The spatially-enriched gene 'Serpina1b' in the liver is chosen as the query for this analysis. After subsetting the assay matrix to the 15% most similar members from a correlation-based search (including the query), network analysis is performed. This analysis produces results in the form of an adjacency matrix and cluster assignments. The adjacency matrix contains adjacency values (expression similarities) for each pair of genes, while the cluster assignments are provided at four different levels of stringency (0, 1, 2, and 3). Higher stringency levels result in more clusters with smaller sizes. Within each version of the cluster assignment, a value of '0' indicates that a gene is not assigned to any cluster.

# Subset the assay data.  
sub.mat <- submatrix(data=mus.nor.aggr, ID='Serpina1b', p=0.15)
dim(sub.mat)
## [1] 1376    5
# Identify network clusters.
adj.mod <- adj_mod(data=sub.mat)
## ..connectivity..
## ..matrix multiplication (system BLAS)..
## ..normalization..
## ..done.
##  ..done.
##  ..done.
##  ..done.
##  ..done.
# Adjacency matrix. 
adj.mod[['adj']][1:3, 1:3]
##                     Slc22a18 ENSMUSG00000108207 ENSMUSG00000115867
## Slc22a18           1.0000000          0.5358801          0.1034367
## ENSMUSG00000108207 0.5358801          1.0000000          0.1974021
## ENSMUSG00000115867 0.1034367          0.1974021          1.0000000
# Four versions of network clusters. 
adj.mod[['mod']][1:3, ] 
##                     0  1  2  3
## Slc22a18           12  9 36 24
## ENSMUSG00000108207 27 26 20  7
## ENSMUSG00000115867  0  0  0  0

The cluster containing the query gene is presented in a network graph (Figure 5). Nodes and edges in the graph represent genes and adjacency values between genes, respectively. The node size is proportional to the connectivity which is the sum of a gene's adjacency values with all other genes in the same cluster.

net <- network(ID='Serpina1b', data=sub.mat, adj.mod=adj.mod, ds=3, adj.min=0, vertex.label.cex=1, vertex.cex=5, static=TRUE)
Network graph. Node size denotes gene connectivity while edge thickness stands for expression similarity.

Figure 5: Network graph
Node size denotes gene connectivity while edge thickness stands for expression similarity.

Entrez gene ids are obtained for the cluster.

# Cluster containing the query. 
cl <- rownames(net); cl
## [1] "Arhgef37"           "Slc25a47"           "Pfkfb1"            
## [4] "ENSMUSG00000092021" "Atf5"               "Gck"               
## [7] "Klkb1"              "Serpina1b"          "Vtn"
df.entr.net <- subset(ann.mus, uniprot_gn_symbol %in% cl)
# Export the cluster. 
# rownames(df.entr.net) <- seq_len(nrow(df.entr.net))
# write.table(df.entr.net[, c('ensembl_gene_id', 'uniprot_gn_symbol', 'entrezgene_id')], 'module.xlsx', col.names=T, row.names=T, sep='\t')
# Entrez gene ids of the cluster. 
entriz <- unique(df.entr.net$entrezgene_id); entriz
## [1]  22370  18639 107503 103988 328967 104910  20701  16621

KEGG enrichment is performed on the cluster containing the query gene (Figure 6). The Complement and Coagulation Cascades pathway (mmu04610) is most significantly enriched in this cluster. This finding aligns with the fact that the query gene 'Serpina1b' is enriched in liver, as the liver is known to express a significant portion of complement proteins (Thorgersen et al. 2019).

kk <- enrichKEGG(gene = entriz, keyType='kegg', organism = 'mmu', pvalueCutoff = 0.05, pAdjustMethod='BH', qvalueCutoff=0.05)
df.kk <- as.data.frame(kk)
barplot(kk, showCategory=5)
KEGG enrichment result in the network analysis example.

Figure 6: KEGG enrichment result in the network analysis example

# browseKEGG(kk, 'mmu04146')

4 Use Case

spatialHeatmap offers functionalities for analyzing spatial assay data with SHMs, and optionally extending them with large-scale data mining routines. The following use case example demonstrates the advantages of these utilities within a discovery workflow, and the input bioassay data are the same as the SpEn example (Merkin et al. 2012). A typical analysis starts with a biomolecule or feature of interest. In this example, the brain is used as the query spatial feature, and the same SpEn process is performed as the SpEn example. To prevent redundancy, the enrichment results (enr.res) from the SpEn example are used directly.

The enrichment results are queried by brain. Among the enriched genes in brain, 'Grik3' is chosen as the query gene. This gene is interesting since it codes for a well-known excitatory neurotransmitter receptor that is involved in the formation of various neurodegenerative diseases and cancers (Stepulak et al. 2009; Xiao et al. 2019).

en.brain <- query_enrich(enr.res, 'brain')
## Done!
up.brain <- subset(rowData(en.brain), type=='up' & total==4) 
up.brain['Grik3', 1:3]
## DataFrame with 1 row and 3 columns
##              type     total      method
##       <character> <numeric> <character>
## Grik3          up         4       edgeR

After subsetting the assay matrix to the 15% most similar members from a correlation-based search (including 'Grik3'), hierarchical clustering is performed. The clustering results are presented in the form of a matrix heatmap (Figure 3). In Figure 3, the rows and columns are sorted by the dendrograms obtained from the hierarchical clustering step, and the query gene is indicated by a black line.

sub.mat <- submatrix(data=mus.nor.aggr, ID='Grik3', p=0.15)
dim(sub.mat)
## [1] 1376    5
mhm.res <- matrix_hm(ID='Grik3', data=sub.mat, scale='row', angleCol=80, angleRow=35, cexRow=0.8, cexCol=1.2, margin=c(10, 6), static=TRUE, arg.lis1=list(offsetRow=0.01, offsetCol=0.01))
Matrix Heatmap in the use case. Rows are genes and columns are samples. The query gene is tagged by black lines.

Figure 7: Matrix Heatmap in the use case
Rows are genes and columns are samples. The query gene is tagged by black lines.

The row dendrograms are cut at the height h=7.149385 to obtain the cluster containing the query gene ('Grik3' cluster).

# Cut row dendrograms to obtain the cluster.  
cl <- cut_dendro(dendro=mhm.res$rowDendrogram, h=7.149385, target='Grik3'); cl
##  [1] "Acsl6"              "Ca11"               "Cacng7"            
##  [4] "Cadm3"              "Caskin1"            "Cdk5r1"            
##  [7] "Chrm1"              "Clstn3"             "Cnih2"             
## [10] "Dlgap1"             "ENSMUSG00000048388" "ENSMUSG00000072769"
## [13] "ENSMUSG00000097451" "Fgf12"              "Gnal"              
## [16] "Grik3"              "Lrrc4c"             "Lrrtm2"            
## [19] "Mag"                "Mast1"              "Nfasc"             
## [22] "Nrxn1"              "Ntm"                "Ppfia2"            
## [25] "Rasl10b"            "Rbfox3"             "Reep2"             
## [28] "Rims3"              "Rnf165"             "Rnf227"            
## [31] "Scn8a"              "Sez6l2"             "Stx1b"             
## [34] "Tro"                "Unc5a"              "Wasf1"
# Entrez gene ids of the cluster. 
df.entr <- subset(ann.mus, uniprot_gn_symbol %in% cl)
entriz <- unique(df.entr$entrezgene_id); entriz
##  [1]  14807  12348 224997  94332 232370  83767 216739 276952  14167
## [10]  20273  18189  14680  12794  56191 225743  52897 107448 269116
## [19] 233878  56216  12669 242662 268932  17136 225362  80515  12569
## [28] 241568  56527 327814 235106  81904 107065
# Export the cluster containing the query gene. 
# rownames(df.entr) <- seq_len(nrow(df.entr))
# write.table(df.entr[, c('ensembl_gene_id', 'uniprot_gn_symbol', 'entrezgene_id')], 'cluster_hc.xlsx', col.names=T, row.names=T, sep='\t')

To functionally annotate the obtained gene cluster, KEGG and GO enrichments are performed. The result shows that the 'Grik3' cluster is enriched in genes involved in cell adhesion (Figure 8) and synapse-related (Figure 9) activities, respectively. This agrees well with the fact that 'Grik3' is a well-characterized neurotransmitter receptor in the plasma membrane.

kk <- enrichKEGG(gene = entriz, keyType='kegg', organism = 'mmu', pvalueCutoff = 0.05, pAdjustMethod='BH', qvalueCutoff=0.05)
df.kk <- as.data.frame(kk)
barplot(kk, showCategory=5)
KEGG enrichment result in the use case.

Figure 8: KEGG enrichment result in the use case

go <- enrichGO(gene = entriz, keyType = "ENTREZID", OrgDb = 'org.Mm.eg.db', ont = 'ALL', pAdjustMethod = 'BH', pvalueCutoff  = 0.05, qvalueCutoff  = 0.05, readable = TRUE)
## 
df.go <- as.data.frame(go)
barplot(go, showCategory=5)
GO enrichment result in the use case.

Figure 9: GO enrichment result in the use case

SHM plotting is used to inspect the spatial expression patterns of the genes in the 'Grik3' cluster (Figure 10). The results indicate that the genes of the 'Grik3' cluster are higher expressed in brain than other tissues. In addition, six of these genes are also relatively high expressed in liver tissue. In contrast to this, all other genes of the same cluster are exclusively enriched in brain. For clearity, two representative genes 'Stx1b' and 'Grik3' are shown in Figure 10.

# Aggregate replicates across spatial features by mean.
en.brain <- aggr_rep(en.brain, sam.factor='com.by', con.factor=NULL, aggr='mean')
dat.enrich <- SPHM(svg=svg.mus, bulk=en.brain)
shm(data=dat.enrich, ID=c('Stx1b', 'Grik3'), legend.r=1, legend.nrow=3, sub.title.size=12, ncol=2, bar.width=0.09, lay.shm='con')
## ggplots/grobs: mus_musculus.male.svg ... 
## ggplot: Stx1b, con  
## ggplot: Grik3, con  
## Legend plot ... 
## CPU cores: 1 
## Converting "ggplot" to "grob" ... 
## Stx1b_con_1  Grik3_con_1  
## Converting "ggplot" to "grob" ... 
## 
## Plot layout ... 
## Plot layout ...
## Done!
Spatial heatmaps in the use case.

Figure 10: Spatial heatmaps in the use case

5 Co-visualization

The co-visualization feature integrates tissue with single-cell data by co-visualizing them in composite plots that combine SHMs with embedding plots of high-dimensional data. The resulting spatial context information is important for gaining insights into the tissue-level organization of single cell data or vice versa. The following examples showcase several important functionalities of the co-visualization tool. The data used in these examples includes single-cell (Ortiz et al. 2020) and bulk (Vacher et al. 2021) RNA-seq data from multiple tissues of mouse brain. The tissues represented in both bulk and single cell data are cerebral cortex (isocortex), hippocampus, hypothalamus, and cerebellum.

The bulk and single-cell data of mouse brain are downloaded and imported. Due to long computation time on the real single-cell data, most of the following code in this section are not evaluated when building this vignette.

set.seed(50) # Obtain reproducible results. 
# Load help functions.
source('https://raw.githubusercontent.com/jianhaizhang/cocluster_data/master/function/bulk_dat.R')
source('https://raw.githubusercontent.com/jianhaizhang/cocluster_data/master/function/scell_dat.R')

# Download the bulk data file (bulk_mouse_brain.xls) at https://github.com/jianhaizhang/cocluster_data/tree/master/validate/mouse_brain, and then proceed to import the downloaded file.
blk.mus.brain <- blk_dat_mus('bulk_mouse_brain.xls')
blk.mus.brain[1:3, ]

# Download single-cell data files (GSE147747_expr_raw_counts_table.tsv, GSE147747_meta_table.tsv) at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE147747, and then proceed to import the dowloaded files. 
sc.mus.brain <- sc_dat_mus_brain(sc.pa= 'GSE147747_expr_raw_counts_table.tsv', meta.pa='GSE147747_meta_table.tsv') 
sc.mus.brain[1:3, 1:5]

Import the aSVG of mouse brain.

svg.mus.brain <- read_svg(system.file("extdata/shinyApp/data", "mus_musculus.brain.svg", package="spatialHeatmap"))
tail(svg.mus.brain[, 2][[1]], 3)[, c('feature', 'id')]
## # A tibble: 3 × 2
##   feature              id            
##   <chr>                <chr>         
## 1 hypothalamus         UBERON_0001898
## 2 nose                 UBERON_0000004
## 3 corpora.quadrigemina UBERON_0002259

Convert bulk tissue identifiers to spatial features in the aSVG.

cvt_vecter <- spatialHeatmap:::cvt_vector
colnames(blk.mus.brain) <- unname(cvt_vecter(from=c('CERE', 'HIPP', 'CERE.CORTEX', 'HYPOTHA'), to=c('cerebellum', 'hippocampus', 'cerebral.cortex', 'hypothalamus'), target=colnames(blk.mus.brain)))
blk.mus.brain[1:2, ]
blk <- SummarizedExperiment(assays=list(counts=blk.mus.brain), colData=DataFrame(bulk=colnames(blk.mus.brain)))

Genes in bulk data are filtered according to expression counts \(\ge\) 5 at a proportion of \(\ge\) 0.2 across bulk samples and coefficient of variance between 0.2 and 100.

blk.fil <- filter_data(data=blk, pOA=c(0.2, 5), CV=c(0.2, 100)); dim(blk.fil)

The single cell data are reduced to genes with expression values \(\ge\) 1 (cutoff) across \(\ge\) 7% cells and to cells with expression values \(\ge\) 1 (cutoff) across \(\ge\) 5% genes. After that, bulk and single-data are reduced to common genes between them.

blk.sc <- filter_cell(sce=sc.mus.brain, bulk=blk.fil, cutoff=1, p.in.cell=0.7, p.in.gen=0.5); blk.sc

SpEn is performed across all tissues in the bulk data.

# Separate bulk data from single-cell data.
bulk <- blk.sc$bulk
# All tissues are selected for SpEn.
blk.sub <- sf_var(data=bulk, feature='bulk', ft.sel=colnames(bulk), variable=NULL, var.sel=NULL, com.by='ft')
# SpEn.
enr.res <- spatial_enrich(blk.sub, method=c('edgeR'), norm='TMM', log2.fc=1, fdr=0.05, outliers=0)
# Enrichment results across tissues.
ovl_enrich(enr.res, type='up', plot='upset', font.size=5, venn.arg=list(), order.by='freq')

As an example, the enriched and depleted genes of hypothalamus are queried from the enrichment results.

en.hypo <- query_enrich(enr.res, 'hypothalamus')
# Enriched.
up.hypo <- subset(rowData(en.hypo), type=='up' & total==3)
dim(up.hypo); up.hypo[1:2, 1:3]
# Depleted.
dn.hypo <- subset(rowData(en.hypo), type=='down' & total==3)
dim(dn.hypo); dn.hypo[1:2, 1:3]

Prior to co-visualizing bulk and single-cell data, these two data types are jointly normalized.

blk.sc.nor <- read_cache(cache.pa, 'blk.sc.nor')
if (is.null(blk.sc.nor)) {
  blk.sc.nor <- norm_cell(sce=blk.sc$cell, bulk=blk.sc$bulk, com=FALSE)
  save_cache(dir=cache.pa, overwrite=TRUE, blk.sc.nor)
}

Bulk and single-cell data are separated, then tissue replicates are aggregated by mean, and dimensionalities are reduced for single-cell data using PCA, UMAP, and TSNE.

# Aggregated tissue replicates.
blk.aggr <- aggr_rep(data=blk.sc.nor$bulk, assay.na='logcounts', sam.factor='sample', aggr='mean')
# Reduce dimensionalities for single-cell data. 
cell <- reduce_dim(blk.sc.nor$cell)

Bulk tissues 'cerebral.cortex', 'hypothalamus', 'hippocampus', and 'cerebellum' are matched with corresponding cells labeled 'isocort', 'hypotha', 'hipp', and 'cere', respectively.

match2cell <- list(cerebral.cortex=c('isocort'), hypothalamus=c('hypotha'), hippocampus='hipp', cerebellum='cere')

The spatially enriched (marker) gene 'Slc6a11' in hypothalamus is chosen for co-visualizing the bulk and single-cell data using the 'cell-by-value' coloring, where each cell and tissue is colored independently accoring expression values of 'Slc6a11'. In this example, cells with high expression levels (red in the left embedding plot) can be associated with hypothalamus based on the expression similarity of 'Slc6a11' in the SHM (Figure 11). Additionally, cells from different tissues (known from available cell annotations) with similar expression of 'Slc6a11' may represent the same or similar cell types present in different source tissues (e.g. blood vessels).

dat.sep <- SPHM(svg=svg.mus.brain, bulk=blk.aggr, cell=cell, match=match2cell)
gg <- covis(data=dat.sep, ID='Slc6a11', sam.factor='sample', con.factor=NULL, dimred='TSNE', cell.group='sample', tar.bulk=names(match2cell), bar.width=0.07, dim.lgd.nrow=2, dim.lgd.text.size=12, height=0.7, legend.r=0.2, legend.key.size=0.02, legend.text.size=12, legend.nrow=5, col.idp=TRUE, vari.cell=NULL, h=0.35, dim.axis.font.size=8, sub.title.size=12) 
Co-visualization: single cell classification via marker biomolecules.

Figure 11: Co-visualization: single cell classification via marker biomolecules

The SpEn analysis of bulk data might fail to detect biomolecules that are only differentially expressed in sub-features of a chosen tissue. The co-visualization functionality can be used to overcome this limitation. Figure 12 shows a co-visualization for gene 'AI413582' that is not detected as spatially enriched in any of the assayed tissues. The embedding plot with 'cell-by-value' coloring reveals that this gene is only selectively expressed in a sub-population of cerebral cortex (isocortex) cells.

Identify non-spatially enriched genes.

en.cort <- query_enrich(enr.res, 'cerebral.cortex')
up.cort <- subset(rowData(en.cort), type=='up' & total==3)
dim(up.cort); up.cort[1:2, 1:3]
# Non-spatially enriched genes.
ids <- setdiff(rownames(blk.aggr), rownames(en.cort)); ids[1:3]
gg <- covis(data=dat.sep, ID='AI413583', sam.factor='sample', con.factor=NULL, dimred='TSNE', cell.group='sample', tar.bulk=names(match2cell), bar.width=0.07, dim.lgd.nrow=2, dim.lgd.text.size=12, height=0.7, legend.r=0.15, legend.key.size=0.02, legend.text.size=12, legend.nrow=5, col.idp=TRUE, vari.cell=NULL, h=0.35, dim.axis.font.size=8, sub.title.size=12) 
Co-visualization: identifiying sub-feature selective genes.

Figure 12: Co-visualization: identifiying sub-feature selective genes

5.1 Spatially Resolved Single-Cell Data

Spatially resolved single-cell (SRSC) assay data (e.g. RNA-seq) preserve the location of cells within the corresponding source features (e.g. tissues). To plot this data, spatialHeatmap projects (overlays) the spatially resolved single-cell data onto the corresponding features in anatomical images. The resulting plots are referred to as single-cell SHMs (scSHM, Figure 13). The following example demonstrates the co-visualization of SRSC (10X Visium) and bulk (Vacher et al. 2021) RNA-seq data of the sagittal mouse brain, where the latter is from the same study at Figure 11.

The SRSC data are loaded through the Seurat and SeuratData packages (Satija et al. 2015; Butler et al. 2018; Stuart et al. 2019; Hao et al. 2021).

# devtools::install_github('satijalab/seurat-data') 
library(Seurat); library(SeuratData);
# InstallData("stxBrain")
set.seed(50)
brain <- LoadData("stxBrain", type = "anterior1")

The bulk and SRSC data are jointly normalized.

nor.lis <- norm_srsc(cell=brain, assay='Spatial', bulk=bulk)
srt.sc <- nor.lis$cell; blk.sp <- nor.lis$bulk

After separated from single-cell data, the tissue replicates in the bulk data are averaged.

blk.aggr.sp <- aggr_rep(data=blk.sp, assay.na='logcounts', sam.factor='sample', aggr='mean')

After separated from bulk data, dimensionalities in SRSC data are reduced, then clustering is performed on the reduced dimensions.

# Reduce dimensionalities.
srt.sc <- RunPCA(srt.sc, assay = "SCT", verbose = FALSE)
srt.sc <- RunUMAP(srt.sc, assay = "SCT", dims = 1:5)
srt.sc <- RunTSNE(srt.sc, assay = "SCT", reduction = "pca", dims = 1:5)
# Cluster cells.
srt.sc <- FindNeighbors(srt.sc, reduction = "pca", dims = 1:30)
srt.sc <- FindClusters(srt.sc, verbose = FALSE)
srt.sc$seurat_clusters <- paste0('clus', srt.sc$seurat_clusters)

The mouse brain aSVG specific to SRSC data is imported. The coordinates of spatial spots in the SRSC data are rotated by 90 degrees to align with the anatomical strucutures in the aSVG.

svg.mus <- read_svg('img/mus_musculus.brain_sp1.svg', srsc=TRUE)
angle(svg.mus)[[1]] <- angle(svg.mus)[[1]] + 90

Different cell shapes are assigned different cell clusters.

grp <- unique(srt.sc$seurat_clusters)
sp <- setNames(c(0, 2:14, 19:25, 32:127)[seq_along(grp)], grp)
sp[c('clus2', 'clus3', 'clus4', 'clus7')] <- 15:18 

Figure 13 is the co-visualization of SRSC and bulk data with the 'cell-by-value' coloring. The gene 'Efhd2' is spatially enriched in cerebral cortex, which is detected by SpEn (SHM on the right in Figure 13). The cluster labels are utilized as the cell group labels (cell.group='seurat_clusters'). The 'cerebral.cortex' tissue aligns anatomically with cell clusters denoted as 'clus3', 'clus4', 'clus7', and 'clus2'. This alignment is defined by a list called lis.match. These four clusters are shown in the middle scSHM in Figure 13, where each cell is positioned according to its spatial coordinates. These cells are also presented in the embedding plot (on the left in Figure 13). The grey dots in Figure 13 denote cells belonging to other clusters. The scSHM reveals that 'Efhd2' has high and low expression levels in sub-features of cerebral cortex, respectively. These sub-features roughly form two clusters in the embedding plot. This result may indicate that there are potentially novel cell types present in the cerebral cortex.

# Association between "cerebral.cortex" and clusters.
lis.match <- list(cerebral.cortex=c('clus3', 'clus4', 'clus7', 'clus2')) 
dat.srsc <- SPHM(svg=svg.mus, bulk=blk.aggr.sp, cell=srt.sc, match=lis.match)
covis(data=dat.srsc, ID='Efhd2', assay.na='logcounts', dimred='TSNE', cell.group='seurat_clusters', tar.bulk=c('cerebral.cortex'), col.idp=TRUE, bar.width=0.07, dim.lgd.nrow=2, dim.lgd.text.size=8, legend.r=0.1, legend.key.size=0.013, legend.text.size=9, legend.nrow=3, h=0.6, profile=TRUE, ncol=3, vjust=5, dim.lgd.key.size=3, size.r=0.97, dim.axis.font.size=8, size.pt=1.5, shape=sp, lgd.plots.size=c(0.35, 0.25, 0.35), verbose=FALSE)
Co-visualization: spatially resolved single-cell data.

Figure 13: Co-visualization: spatially resolved single-cell data

5.2 Visualizing Bulk Deconvolution Results

Many deconvolution algorithms have been developed to infer cellular compositions in bulk data (Newman et al. 2015; Newman et al. 2019; Racle et al. 2017; Gong and Szustakowski 2013; Wang et al. 2019) that can be visualized using the co-visualization module. Specifically, the cell type proportions estimated by the deconvolution methods will be used to calculate an expression matrix from the bulk data where rows are genes and columns are inferred cell types. Next, the cell type expression matrix and the bulk assay data will be used for co-visualization (Figure 14. This utility is showcased using MuSiC (Wang et al. 2019) for deconvoluting bulk RNA-Seq data from human pancreatic islets (Fadista et al. 2014) with matching single cell data (Segerstolpe et al. 2016).

The bulk and single-cell data of human pancreatic islets are downloaded and imported.

library(MuSiC); library(SingleCellExperiment)
set.seed(50)
# Bulk data: https://xuranw.github.io/MuSiC/data/GSE50244bulkeset.rds
blk.eset <- readRDS('data/GSE50244bulkeset.rds')
blk.de <- exprs(blk.eset); blk.de[1:2, 1:3]
##         Sub1 Sub2 Sub3
## DDX11L1    0    0    1
## WASH7P    31   14   21
# Single-cell data: https://xuranw.github.io/MuSiC/data/EMTABsce_healthy.rds
sc.de <- readRDS('data/EMTABsce_healthy.rds'); colData(sc.de)[1:2, ]
## DataFrame with 2 rows and 4 columns
##         sampleID SubjectName cellTypeID cellType
##        <numeric>    <factor>  <numeric> <factor>
## AZ_A10         1   Non T2D 1          5    delta
## AZ_A11         1   Non T2D 1          2    alpha

Bulk deconvolutio is performed with MuSiC (Wang et al. 2019).

# Deconvolution.
res.p <- music_prop(bulk.mtx = blk.de, sc.sce = sc.de, clusters = 'cellType', samples = 'sampleID', select.ct = c('alpha', 'beta', 'delta', 'gamma', 'acinar', 'ductal'), verbose = FALSE)
names(res.p)
## [1] "Est.prop.weighted" "Est.prop.allgene"  "Weight.gene"      
## [4] "r.squared.full"    "Var.prop"
# Deconvolution results.
prop <- res.p$Est.prop.weighted; prop[1:3, ]
##          alpha      beta       delta        gamma     acinar
## Sub1 0.2079168 0.1746030 0.002203054 0.0015887041 0.09597685
## Sub2 0.6964173 0.1197928 0.003683287 0.0007471486 0.03078049
## Sub3 0.1913485 0.3521117 0.032037770 0.0003117540 0.19143687
##         ductal
## Sub1 0.5177117
## Sub2 0.1485789
## Sub3 0.2327533

By using the inferred cell type proportions, the expression matrix of inferred cell types are calculated from the bulk data of one subject (Sub1). The chosen bulk data are filtered according to expression counts \(\ge\) 5. The expression matrix of inferred cell types is stored in a SingleCellExperiment object.

# Filter the chosen bulk data. 
blk.sel <- subset(data.frame(blk.de), Sub1 >= 5)[, 'Sub1', drop=FALSE]
# Calculat expression matrix of inferred cell types.
res.de <- lapply(X=seq_len(ncol(blk.sel)), bulk=blk.sel, FUN=function(i, bulk) {
  lis0 <- lapply(seq_len(nrow(bulk)), function(r) { 
    round(data.frame(bulk)[r, i, drop=TRUE] * prop[i, ])
  }); dat <- do.call('rbind', lis0)
  rownames(dat) <- rownames(bulk); dat
})
# Store the expression matrix of cell types in a SingleCellExperiment object.
res.sc <- res.de[[1]]; res.sc[1:2, ]
##            alpha beta delta gamma acinar ductal
## WASH7P         6    5     0     0      3     16
## AL627309.1     6    5     0     0      3     15
res.sc <- SingleCellExperiment(list(counts=res.sc)) 

Filter the cell-type expression matrix, jointly normalize the cell and bulk data, and separate the two data types.

# Filter cell-type expression data.
blk.sc.de <- filter_cell(sce=res.sc, bulk=blk.sel, cutoff=1, p.in.cell=0.1, p.in.gen=0.5)
# Jointly normalize cell and bulk data. 
blk.se.de.nor <- norm_data(data=cbind(blk.sc.de$bulk, blk.sc.de$cell), norm.fun='CNF', log2.trans=TRUE)
assayNames(blk.se.de.nor) <- 'logcounts'
# Separate bulk from cell data.
blk.de.nor <- blk.se.de.nor[, seq_len(ncol(blk.sc.de$bulk))]
colnames(blk.de.nor) <- 'pancreas.islet'
sc.de.nor <- blk.se.de.nor[, seq_len(ncol(blk.sc.de$cell)) + ncol(blk.de.nor)]

Include essential information in the colData of the cell-type data, such as the corresponding bulk tissue, cell types, and cell type proportions. After that reduce dimensionalities for the cell-type data.

# Include essential information
sc.de.nor$bulk <- 'pancreas.islet'
sc.de.nor$cell.type <- colnames(sc.de.nor)
sc.de.nor$prop <- round(prop['Sub1', sc.de.nor$cell.type], 3)
# Assign unique shapes to each cell type.
cell.all <- unique(sc.de.dim$cell.type)
sp <- setNames(c(15, 17:19, 12, 8), cell.all)
# Reduce dimensionalities.
sc.de.dim <- runTSNE(sc.de.nor)
## Warning: useNames = NA is deprecated. Instead, specify either
## useNames = TRUE or useNames = TRUE.

Import the aSVG of human pancreatic islets.

islet <- read_svg('img/pancreas.svg')

Figure 14 shows co-visualization of bulk and inferred cell-type data of human pancreatic islets for gene 'AP006222.2', where 'cell-by-value' coloring is used. The estimated cellular proportions are shown as decimal numbers next to each cell type. The result indicates that pancreatic islets consist of two rare (yellow) and four abundant cell types, and that gene 'AP006222.2' is selectively expressed in the six cell types.

dat.decon <- SPHM(svg=islet, bulk=blk.de.nor, cell=sc.de.dim)
covis(data=dat.decon, ID='AP006222.2', dimred='TSNE', tar.cell=unique(sc.de.dim$cell.type), cell.group='cell.type', dim.lgd.text.size=10, dim.lgd.key.size=5, dim.lgd.nrow=2, bar.width=0.08, legend.nrow=1, h=0.5, col.idp=TRUE, legend.r=0.2, profile=TRUE, decon=TRUE, size.pt=5, size.lab.pt=3, vjust.lab.pt=0, hjust.lab.pt=0.5, shape=sp, verbose=FALSE)
## Converting "ggplot" to "grob" ... 
## dim_AP006222.2_con_1  
## Converting "ggplot" to "grob" ... 
## 
Co-visualization: visualizing bulk deconvolution results.

Figure 14: Co-visualization: visualizing bulk deconvolution results

6 Version Informaion

sessionInfo()
## R version 4.2.0 (2022-04-22)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets 
## [7] methods   base     
## 
## other attached packages:
##  [1] knitr_1.43                  stxBrain.SeuratData_0.1.1  
##  [3] SeuratData_0.2.2            SeuratObject_4.1.3         
##  [5] Seurat_4.3.0.1              cowplot_1.1.1              
##  [7] reshape2_1.4.4              scater_1.26.1              
##  [9] scuttle_1.8.4               SingleCellExperiment_1.20.1
## [11] MuSiC_1.0.0                 TOAST_1.12.0               
## [13] quadprog_1.5-8              EpiDISH_2.14.1             
## [15] nnls_1.4                    clusterProfiler_4.6.2      
## [17] enrichplot_1.18.4           biomaRt_2.54.1             
## [19] ggplot2_3.4.2               ExpressionAtlas_1.26.0     
## [21] BiocStyle_2.26.0            jsonlite_1.8.7             
## [23] RCurl_1.98-1.12             xml2_1.3.5                 
## [25] limma_3.54.2                SummarizedExperiment_1.28.0
## [27] Biobase_2.58.0              GenomicRanges_1.50.2       
## [29] GenomeInfoDb_1.34.9         IRanges_2.32.0             
## [31] S4Vectors_0.36.2            BiocGenerics_0.44.0        
## [33] MatrixGenerics_1.10.0       matrixStats_1.0.0          
## [35] spatialHeatmap_2.7.6        nvimcom_0.9-92             
## 
## loaded via a namespace (and not attached):
##   [1] rappdirs_0.3.3            SparseM_1.81             
##   [3] scattermore_1.2           GGally_2.1.2             
##   [5] coda_0.19-4               tidyr_1.3.0              
##   [7] bit64_4.0.5               irlba_2.3.5.1            
##   [9] DelayedArray_0.24.0       data.table_1.14.8        
##  [11] KEGGREST_1.38.0           doParallel_1.0.17        
##  [13] generics_0.1.3            ScaledMatrix_1.6.0       
##  [15] RSQLite_2.3.1             shadowtext_0.1.2         
##  [17] RANN_2.6.1                future_1.33.0            
##  [19] proxy_0.4-27              shinytoastr_2.1.1        
##  [21] bit_4.0.5                 spatstat.data_3.0-1      
##  [23] httpuv_1.6.11             assertthat_0.2.1         
##  [25] viridis_0.6.4             xfun_0.39                
##  [27] hms_1.1.3                 jquerylib_0.1.4          
##  [29] evaluate_0.21             promises_1.2.0.1         
##  [31] fansi_1.0.4               progress_1.2.2           
##  [33] caTools_1.18.2            dbplyr_2.3.3             
##  [35] htmlwidgets_1.6.2         igraph_1.5.0.1           
##  [37] DBI_1.1.3                 mcmc_0.9-7               
##  [39] reshape_0.8.9             spatstat.geom_3.2-4      
##  [41] purrr_1.0.1               ellipsis_0.3.2           
##  [43] dplyr_1.1.2               bookdown_0.34            
##  [45] annotate_1.76.0           MCMCpack_1.6-3           
##  [47] deldir_1.0-9              locfdr_1.1-8             
##  [49] sparseMatrixStats_1.10.0  vctrs_0.6.3              
##  [51] quantreg_5.96             ROCR_1.0-11              
##  [53] abind_1.4-5               cachem_1.0.8             
##  [55] withr_2.5.0               ggforce_0.4.1            
##  [57] HDO.db_0.99.1             progressr_0.13.0         
##  [59] sctransform_0.3.5         treeio_1.22.0            
##  [61] prettyunits_1.1.1         scran_1.26.2             
##  [63] goftest_1.2-3             cluster_2.1.4            
##  [65] DOSE_3.24.2               ape_5.7-1                
##  [67] grImport_0.9-7            lazyeval_0.2.2           
##  [69] crayon_1.5.2              spatstat.explore_3.2-1   
##  [71] genefilter_1.80.3         edgeR_3.40.2             
##  [73] pkgconfig_2.0.3           labeling_0.4.2           
##  [75] tweenr_2.0.2              nlme_3.1-162             
##  [77] vipor_0.4.5               globals_0.16.2           
##  [79] rlang_1.1.1               miniUI_0.1.1.1           
##  [81] lifecycle_1.0.3           MatrixModels_0.5-2       
##  [83] downloader_0.4            filelock_1.0.2           
##  [85] BiocFileCache_2.6.1       rsvd_1.0.5               
##  [87] rsvg_2.4.0                polyclip_1.10-4          
##  [89] lmtest_0.9-40             Matrix_1.6-0             
##  [91] aplot_0.1.10              zoo_1.8-12               
##  [93] beeswarm_0.4.0            ggridges_0.5.4           
##  [95] png_0.1-8                 viridisLite_0.4.2        
##  [97] bitops_1.0-7              shinydashboard_0.7.2     
##  [99] gson_0.1.0                KernSmooth_2.23-22       
## [101] Biostrings_2.66.0         blob_1.2.4               
## [103] DelayedMatrixStats_1.20.0 stringr_1.5.0            
## [105] qvalue_2.30.0             spatstat.random_3.1-5    
## [107] parallelly_1.36.0         gridGraphics_0.5-1       
## [109] shinyAce_0.4.2            beachmat_2.14.2          
## [111] scales_1.2.1              ica_1.0-3                
## [113] memoise_2.0.1             magrittr_2.0.3           
## [115] plyr_1.8.8                gplots_3.1.3             
## [117] zlibbioc_1.44.0           compiler_4.2.0           
## [119] scatterpie_0.2.1          dqrng_0.3.0              
## [121] RColorBrewer_1.1-3        fitdistrplus_1.1-11      
## [123] cli_3.6.1                 XVector_0.38.0           
## [125] listenv_0.9.0             pbapply_1.7-2            
## [127] patchwork_1.1.2           spsComps_0.3.3.0         
## [129] MASS_7.3-60               tidyselect_1.2.0         
## [131] stringi_1.7.12            highr_0.10               
## [133] yaml_2.3.7                GOSemSim_2.24.0          
## [135] BiocSingular_1.14.0       locfit_1.5-9.8           
## [137] ggrepel_0.9.3             grid_4.2.0               
## [139] sass_0.4.7                fastmatch_1.1-3          
## [141] tools_4.2.0               future.apply_1.11.0      
## [143] parallel_4.2.0            bluster_1.8.0            
## [145] foreach_1.5.2             metapod_1.6.0            
## [147] gridExtra_2.3             farver_2.1.1             
## [149] Rtsne_0.16                ggraph_2.1.0             
## [151] digest_0.6.33             BiocManager_1.30.21.1    
## [153] shiny_1.7.4.1             Rcpp_1.0.11              
## [155] later_1.3.1               RcppAnnoy_0.0.21         
## [157] httr_1.4.6                AnnotationDbi_1.60.2     
## [159] colorspace_2.1-0          tensor_1.5               
## [161] reticulate_1.30           XML_3.99-0.14            
## [163] splines_4.2.0             uwot_0.1.16              
## [165] yulab.utils_0.0.6         statmod_1.5.0            
## [167] spatstat.utils_3.0-3      tidytree_0.4.4           
## [169] graphlayouts_1.0.0        sp_2.0-0                 
## [171] ggplotify_0.1.1           plotly_4.10.2            
## [173] xtable_1.8-4              ggtree_3.6.2             
## [175] tidygraph_1.2.3           corpcor_1.6.10           
## [177] ggfun_0.1.1               R6_2.5.1                 
## [179] pillar_1.9.0              htmltools_0.5.5          
## [181] mime_0.12                 glue_1.6.2               
## [183] fastmap_1.1.1             BiocParallel_1.32.6      
## [185] BiocNeighbors_1.16.0      class_7.3-22             
## [187] codetools_0.2-19          fgsea_1.24.0             
## [189] utf8_1.2.3                spatstat.sparse_3.0-2    
## [191] lattice_0.21-8            bslib_0.5.0              
## [193] tibble_3.2.1              curl_5.0.1               
## [195] ggbeeswarm_0.7.2          leiden_0.4.3             
## [197] gtools_3.9.4              magick_2.7.4             
## [199] GO.db_3.16.0              survival_3.5-5           
## [201] rmarkdown_2.23            munsell_0.5.0            
## [203] e1071_1.7-13              GenomeInfoDbData_1.2.9   
## [205] iterators_1.0.14          gtable_0.3.3

References

Butler, Andrew, Paul Hoffman, Peter Smibert, Efthymia Papalexi, and Rahul Satija. 2018. “Integrating Single-Cell Transcriptomic Data Across Different Conditions, Technologies, and Species.” Nature Biotechnology 36: 411–20. doi:10.1038/nbt.4096.

Fadista, João, Petter Vikman, Emilia Ottosson Laakso, Inês Guerra Mollet, Jonathan Lou Esguerra, Jalal Taneera, Petter Storm, et al. 2014. “Global Genomic and Transcriptomic Analysis of Human Pancreatic Islets Reveals Novel Genes Influencing Glucose Metabolism.” Proc. Natl. Acad. Sci. U. S. A. 111 (38): 13924–9.

Gong, Ting, and Joseph D Szustakowski. 2013. “DeconRNASeq: A Statistical Framework for Deconvolution of Heterogeneous Tissue Samples Based on mRNA-Seq Data.” Bioinformatics 29 (8): 1083–5.

Hao, Yuhan, Stephanie Hao, Erica Andersen-Nissen, William M. Mauck III, Shiwei Zheng, Andrew Butler, Maddie J. Lee, et al. 2021. “Integrated Analysis of Multimodal Single-Cell Data.” Cell. doi:10.1016/j.cell.2021.04.048.

Knebel, Birgit, Sonja Hartwig, Jutta Haas, Stefan Lehr, Simon Goeddeke, Franciscus Susanto, Lothar Bohne, et al. 2015. “Peroxisomes Compensate Hepatic Lipid Overflow in Mice with Fatty Liver.” Biochim. Biophys. Acta 1851 (7): 965–76.

Merkin, Jason, Caitlin Russell, Ping Chen, and Christopher B Burge. 2012. “Evolutionary Dynamics of Gene and Isoform Regulation in Mammalian Tissues.” Science 338 (6114): 1593–9.

Newman, Aaron M, Chih Long Liu, Michael R Green, Andrew J Gentles, Weiguo Feng, Yue Xu, Chuong D Hoang, Maximilian Diehn, and Ash A Alizadeh. 2015. “Robust Enumeration of Cell Subsets from Tissue Expression Profiles.” Nat. Methods 12 (5): 453–57.

Newman, Aaron M, Chloé B Steen, Chih Long Liu, Andrew J Gentles, Aadel A Chaudhuri, Florian Scherer, Michael S Khodadoust, et al. 2019. “Determining Cell Type Abundance and Expression from Bulk Tissues with Digital Cytometry.” Nat. Biotechnol. 37 (7). Nature Publishing Group: 773–82.

Ortiz, Cantin, Jose Fernandez Navarro, Aleksandra Jurek, Antje Märtin, Joakim Lundeberg, and Konstantinos Meletis. 2020. “Molecular Atlas of the Adult Mouse Brain.” Science Advances 6 (26). American Association for the Advancement of Science: eabb3446.

Racle, Julien, Kaat de Jonge, Petra Baumgaertner, Daniel E Speiser, and David Gfeller. 2017. “Simultaneous Enumeration of Cancer and Immune Cell Types from Bulk Tumor Gene Expression Data.” Elife 6 (November). eLife Sciences Publications, Ltd: e26476.

Satija, Rahul, Jeffrey A Farrell, David Gennert, Alexander F Schier, and Aviv Regev. 2015. “Spatial Reconstruction of Single-Cell Gene Expression Data.” Nature Biotechnology 33: 495–502. doi:10.1038/nbt.3192.

Segerstolpe, Åsa, Athanasia Palasantza, Pernilla Eliasson, Eva-Marie Andersson, Anne-Christine Andréasson, Xiaoyan Sun, Simone Picelli, et al. 2016. “Single-Cell Transcriptome Profiling of Human Pancreatic Islets in Health and Type 2 Diabetes.” Cell Metab. 24 (4): 593–607.

Stepulak, Andrzej, Hella Luksch, Christine Gebhardt, Ortrud Uckermann, Jenny Marzahn, Marco Sifringer, Wojciech Rzeski, et al. 2009. “Expression of Glutamate Receptor Subunits in Human Cancers.” Histochem. Cell Biol. 132 (4): 435–45.

Stuart, Tim, Andrew Butler, Paul Hoffman, Christoph Hafemeister, Efthymia Papalexi, William M Mauck III, Yuhan Hao, Marlon Stoeckius, Peter Smibert, and Rahul Satija. 2019. “Comprehensive Integration of Single-Cell Data.” Cell 177: 1888–1902. doi:10.1016/j.cell.2019.05.031.

Thorgersen, Ebbe Billmann, Andreas Barratt-Due, Håkon Haugaa, Morten Harboe, Søren Erik Pischke, Per H Nilsson, and Tom Eirik Mollnes. 2019. “The Role of Complement in Liver Injury, Regeneration, and Transplantation.” Hepatology 70 (2): 725–36.

Vacher, Claire-Marie, Helene Lacaille, Jiaqi J O’Reilly, Jacquelyn Salzbank, Dana Bakalar, Sonia Sebaoui, Philippe Liere, et al. 2021. “Placental Endocrine Function Shapes Cerebellar Development and Social Behavior.” Nat. Neurosci. 24 (10). Nature Publishing Group: 1392–1401.

Wang, Xuran, Jihwan Park, Katalin Susztak, Nancy R Zhang, and Mingyao Li. 2019. “Bulk Tissue Cell Type Deconvolution with Multi-Subject Single-Cell Expression Reference.” Nat. Commun. 10 (1). Nature Publishing Group: 1–9.

Xiao, Bin, Zhenzhan Kuang, Weiyun Zhang, Jianfeng Hang, Lidan Chen, Ting Lei, Yongyin He, et al. 2019. “Glutamate Ionotropic Receptor Kainate Type Subunit 3 (Grik3) Promotes Epithelial-Mesenchymal Transition in Breast Cancer Cells by Regulating SPDEF/Cdh1 Signaling.” Mol. Carcinog. 58 (7): 1314–23.