Tutorial 3: Cross-modality representation of spatial RNA-ADT co-profiling human lymph node data generated by 10x Genomics
The human lymph node dataset, generated with the 10x Genomics Visium spatial transcriptomics and proteomics co-profiling technology, was downloaded from the Zenodo repository.
[1]:
import warnings
import pandas as pd
import numpy as np
import scanpy as sc
import episcanpy.api as epi
from PRESENT import PRESENT_function
warnings.filterwarnings("ignore")
sc.set_figure_params(dpi=80, figsize=(4,4), facecolor="white")
[2]:
metadata = pd.read_csv("Data/annotation.csv", index_col=0)
adata_adt = sc.read_h5ad("Data/adata_ADT.h5ad")
adata_adt.obs["anno"] = metadata.loc[adata_adt.obs_names, "manual-anno"]
print(adata_adt)
adata_rna = sc.read_h5ad("Data/adata_RNA.h5ad")
adata_rna.obs["anno"] = metadata.loc[adata_rna.obs_names, "manual-anno"]
print(adata_rna)
AnnData object with n_obs × n_vars = 3484 × 31
    obs: 'anno'
    var: 'gene_ids', 'feature_types', 'genome'
    obsm: 'spatial'
AnnData object with n_obs × n_vars = 3484 × 18085
    obs: 'anno'
    var: 'gene_ids', 'feature_types', 'genome'
    obsm: 'spatial'
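Before running the model, it can be worth confirming that the two modalities and the annotation table refer to the same barcodes, because `metadata.loc[...]` raises a `KeyError` when a requested barcode is absent from the table. The following is a minimal sketch of such a check; the helper `check_barcodes` and the toy barcodes are hypothetical and are not part of PRESENT or of the dataset.

```python
import pandas as pd


def check_barcodes(rna_names, adt_names, meta_names):
    """Return (barcodes shared by RNA and ADT, shared barcodes lacking an annotation).

    Hypothetical helper for illustration only; it is not part of PRESENT.
    """
    rna = pd.Index(rna_names)
    adt = pd.Index(adt_names)
    meta = pd.Index(meta_names)
    shared = rna.intersection(adt)          # spots profiled in both modalities
    missing_anno = shared.difference(meta)  # shared spots with no annotation row
    return shared, missing_anno


# Toy barcodes standing in for adata_rna.obs_names, adata_adt.obs_names
# and metadata.index (made-up values, not taken from the dataset).
rna_names = ["AAAC-1", "AAAG-1", "AACT-1"]
adt_names = ["AAAG-1", "AACT-1", "AAGG-1"]
meta_names = ["AAAC-1", "AAAG-1"]

shared, missing_anno = check_barcodes(rna_names, adt_names, meta_names)
print(f"{len(shared)} shared barcodes, {len(missing_anno)} without annotation")
```

In this notebook the call would be `check_barcodes(adata_rna.obs_names, adata_adt.obs_names, metadata.index)`; an empty second result means every shared spot has an annotation row.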
Run the PRESENT model
[3]:
adata = PRESENT_function(
    spatial_key = "spatial", ## obsm key holding the spatial coordinates of spots
    adata_rna = adata_rna, ## Raw RNA count matrix of spots in anndata.AnnData format
    gene_min_cells = 1, ## Minimum number of cells in which a gene must be expressed to pass filtering
    num_hvg = 3000, ## Number of highly variable genes to select for RNA data
    adata_adt = adata_adt, ## Raw ADT count matrix of spots in anndata.AnnData format
    protein_min_cells = 1, ## Minimum number of cells in which a protein must be expressed to pass filtering
    nclusters = adata_adt.obs["anno"].nunique(), ## Number of clusters, set here to the number of annotation labels
    device = "cuda" ## Device used for training: "cuda" or "cpu"
)
print(adata)
Loading data and parameters...
UserWarning:/home/lizhen/miniconda3/envs/PRESENT/lib/python3.9/site-packages/anndata/_core/anndata.py:1840: Variable names are not unique. To make them unique, call `.var_names_make_unique`.
Input data has been loaded
Computing METIS partitioning...
Done!
Model training: 44%|███▉ | 44/100 [00:25<00:32, 1.73it/s, NLL_loss=0.249, BNN_loss=0.322, MSE_loss=0.354, IOA_loss=0.0235, ES counter=20, ES patience=20]
Early stop the training process
ImplicitModificationWarning:/home/lizhen/miniconda3/envs/PRESENT/lib/python3.9/site-packages/anndata/_core/anndata.py:121: Transforming to str index.
TqdmWarning:/home/lizhen/miniconda3/envs/PRESENT/lib/python3.9/site-packages/tqdm/auto.py:21: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
Succeed to find 10 clusters at resolution 0.938
AnnData object with n_obs × n_vars = 3483 × 50
    obs: 'anno', 'n_genes', 'leiden', 'LeidenClusters'
    uns: 'neighbors', 'leiden'
    obsm: 'spatial', 'embeddings'
    obsp: 'distances', 'connectivities'
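The log line "Succeed to find 10 clusters at resolution 0.938" suggests that a Leiden resolution was searched until the requested number of clusters was reached. How PRESENT performs this search is not shown here; the sketch below illustrates one common approach, bisection over the resolution. The function `find_resolution`, its bounds, and the toy `toy_count` callable are all assumptions made for illustration.

```python
def find_resolution(count_clusters, target, low=0.1, high=2.0, max_steps=30):
    """Bisect on the resolution until count_clusters(resolution) == target.

    Assumes the cluster count does not decrease as the resolution grows,
    which holds only approximately for Leiden clustering.
    Returns the resolution found, or None if the search fails.
    """
    for _ in range(max_steps):
        mid = (low + high) / 2
        n = count_clusters(mid)
        if n == target:
            return mid
        if n < target:
            low = mid   # too few clusters: raise the resolution
        else:
            high = mid  # too many clusters: lower the resolution
    return None


def toy_count(resolution):
    """Toy stand-in for a clustering call; cluster count grows with resolution."""
    return int(resolution * 10) + 1


res = find_resolution(toy_count, target=10)
print(res, toy_count(res))
```

With scanpy, `count_clusters` could wrap `sc.tl.leiden(adata, resolution=r)` followed by `adata.obs["leiden"].nunique()`. Because Leiden is stochastic and not strictly monotone in the resolution, the search can fail, which is why the sketch returns `None` after `max_steps`.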
Visualization
[4]:
sc.pp.neighbors(adata, use_rep="embeddings")
sc.tl.umap(adata)
sc.pl.umap(adata, color=["anno", "LeidenClusters"], wspace=0.5)
[5]:
sc.pl.embedding(adata, basis="spatial", color=["anno", "LeidenClusters"], wspace=0.5)
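The plots give a visual comparison of the manual annotation and the Leiden clusters. As a quantitative complement, agreement between the two label sets can be summarized with the adjusted Rand index (ARI) and normalized mutual information (NMI) from scikit-learn. This is a minimal sketch: the helper `score_clustering` and the toy table are hypothetical, and dropping unannotated spots is an assumption about how missing labels should be handled.

```python
import pandas as pd
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score


def score_clustering(obs, truth_key="anno", cluster_key="LeidenClusters"):
    """Compute ARI and NMI between two label columns of an obs-like DataFrame.

    Hypothetical helper for illustration; spots with a missing label are dropped.
    """
    labelled = obs[[truth_key, cluster_key]].dropna()
    truth = labelled[truth_key].astype(str)
    pred = labelled[cluster_key].astype(str)
    return {
        "ARI": adjusted_rand_score(truth, pred),
        "NMI": normalized_mutual_info_score(truth, pred),
        "n_spots": len(labelled),
    }


# Toy table standing in for adata.obs (made-up labels, not from the dataset).
toy_obs = pd.DataFrame({
    "anno": ["A", "A", "B", "B", None],
    "LeidenClusters": ["0", "0", "1", "1", "1"],
})
scores = score_clustering(toy_obs)
print(scores)
```

In this notebook the call would be `score_clustering(adata.obs)`. Both scores reach 1 for identical partitions, regardless of how the cluster labels are named.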