Beyond scRNAseq: neuronal recordings

In this tutorial we are showing the applicability of scFates on other high-dimensional datasets, such as single neuronal acitvity recordings, in order to extract population dynamics.

The dataset and preprocessing

In this example we will use data from Peyrache et al, which is a study on mammalian head direction cell recordings. Here is the figure 1 that summaries the location and data recorded:

image0

The data used for this tutoral includes recordings from 22 neurons located in the anterodorsal thalamic nucleus (ADn) of a single mouse that was awake and foraging in an open environment, with measured head angles.

Preprocessing

Following Chaudhuri et al. method, spike times were converted into time-varying rates. Firing rates were estimated by convolving the spike times with a Gaussian kernel of standard deviation 100 ms. The rates were then replaced by their square root to stabilize the variance. For each timepoint is the related measured angle.

Load libraries and data

[1]:
import pandas as pd
import numpy as np
import palantir
import anndata
import warnings
# ignore all future warnings
warnings.filterwarnings("ignore")
import scFates as scf
import scanpy as sc
import scanpy.external as sce
import seaborn as sns
import matplotlib.colors as col
import seaborn
seaborn.reset_orig()
%matplotlib inline

N = 256
flat_huslmap = col.ListedColormap(sns.color_palette('husl',N))

angle_u = 'https://github.com/LouisFaure/scFates_notebooks/raw/main/data/angles.csv.gz'
counts_u = 'https://github.com/LouisFaure/scFates_notebooks/raw/main/data/counts.csv.gz'
findfont: Font family ['Raleway'] not found. Falling back to DejaVu Sans.
[2]:
counts=pd.read_csv(counts_u,header=None)
counts.index=counts.index.astype(str)
counts.columns=counts.columns.astype(str)
counts.shape
[2]:
(22841, 22)
[3]:
adata=anndata.AnnData(counts)
adata.obs['angles']=np.degrees(pd.read_csv(angle_u, header=None).values)
[4]:
adata.obs['time']=np.arange(0,adata.shape[0]*0.05,0.05)

Dimensionality reduction

PCA

[5]:
sc.pp.pca(adata)
[6]:
sc.set_figure_params()
sc.pl.pca(adata,color="angles",cmap=flat_huslmap)
_images/Beyond_scRNAseq_10_0.png

Multi-scale diffusion maps

[7]:
sce.tl.palantir(adata)
adata
Determing nearest neighbor graph...
[7]:
AnnData object with n_obs × n_vars = 22841 × 22
    obs: 'angles', 'time'
    uns: 'pca', 'palantir_EigenValues'
    obsm: 'X_pca', 'X_palantir_diff_comp', 'X_palantir_multiscale'
    varm: 'PCs'
    layers: 'palantir_imp'
    obsp: 'palantir_diff_op'
[8]:
adata.obsm["X_palantir_multiscale"].shape
[8]:
(22841, 7)
[9]:
sc.pl.embedding(adata,basis='X_palantir_multiscale',color='angles',cmap=flat_huslmap)
_images/Beyond_scRNAseq_14_0.png

Show several dimensions of multiscale diffusion space

[10]:
import matplotlib.pyplot as plt
fig, axs = plt.subplots(2,2,figsize=(8,8))
axs=axs.ravel()
for d in np.arange(2,6):
    sc.pl.embedding(adata,basis='X_palantir_multiscale',
                    title="dims %d, %d"%(d,d+1),
                    dimensions=(int(d),int(d+1)),
                    color='angles',cmap=flat_huslmap,
                    show=False,ax=axs[d-2],frameon=False)

    # complex way to remove the side colorbar
    axs[d-2].set_box_aspect(aspect=1)
    fig = axs[d-2].get_gridspec().figure
    cbar = np.argwhere(
        ["colorbar" in a.get_label() for a in fig.get_axes()]
    ).ravel()
    if len(cbar) > 0:
        fig.get_axes()[cbar[0]].remove()
_images/Beyond_scRNAseq_16_0.png

Principal circle fitting

[11]:
scf.tl.circle(adata,Nodes=30,use_rep="X_palantir_multiscale",device="gpu")
inferring a principal circle --> parameters used
    30 principal points, mu = 0.1, lambda = 0.01
    there are 1 non assigned nodes
    finished (0:00:16) --> added
    .uns['epg'] dictionnary containing inferred elastic circle generated from elpigraph.
    .obsm['X_R'] hard assignment of cells to principal points.
    .uns['graph']['B'] adjacency matrix of the principal points.
    .uns['graph']['F'], coordinates of principal points in representation space.
[12]:
scf.pl.graph(adata,basis='palantir_multiscale')
_images/Beyond_scRNAseq_19_0.png

Converting to soft assigment matrix

ElPiGraph lead to a hard assignment matrix \(R\), rendering it impossible to perform multiple mappings to account for variability, as the assignment to nodes are either 0 or 1:

[13]:
adata.obs["R"]=adata.obsm["X_R"][:,25]
sc.pl.embedding(adata,basis='X_palantir_multiscale',color='R')
_images/Beyond_scRNAseq_21_0.png

To alleviate this limitation, we can apply one iteration of SimplePPT algorithm, generating a soft assignment matrix \(R\), with continuous values between 0 and 1:

[14]:
scf.tl.convert_to_soft(adata,sigma=1,lam=100)
Converting R into soft assignment matrix
    finished (0:00:00) --> updated
    .obsm['X_R'] converted soft assignment of cells to principal points.
    .uns['graph']['F'] coordinates of principal points in representation space.
[15]:
adata.obs["R"]=adata.obsm["X_R"][:,25]
sc.pl.embedding(adata,basis='X_palantir_multiscale',color='R')
_images/Beyond_scRNAseq_24_0.png
[16]:
scf.pl.graph(adata,basis='palantir_multiscale')
_images/Beyond_scRNAseq_25_0.png

Projecting activities over pseudotime

Select root node characterized by activity at minimum angle

[17]:
scf.tl.root(adata,root="angles",min_val=True)
automatic root selection using angles values
node 27 selected as a root --> added
    .uns['graph']['root'] selected root.
    .uns['graph']['pp_info'] for each PP, its distance vs root and segment assignment.
    .uns['graph']['pp_seg'] segments network information.

To enable multiple mappings, the circle is considered as a trajectory composed of two segments from the selected root and clostest node converging to the furthest node.

[18]:
scf.tl.pseudotime(adata,n_jobs=50,n_map=50,seed=42)
projecting cells onto the principal graph
    mappings: 100%|██████████| 50/50 [00:58<00:00,  1.16s/it]
    finished (0:01:13) --> added
    .obs['edge'] assigned edge.
    .obs['t'] pseudotime value.
    .obs['seg'] segment of the tree assigned.
    .obs['milestones'] milestone assigned.
    .uns['pseudotime_list'] list of cell projection from all mappings.
[19]:
sc.pl.scatter(adata,x='angles',y='t',color="seg",palette=["tab:blue","Tab:red"])
sc.pl.embedding(adata,color="seg",basis='palantir_multiscale')
_images/Beyond_scRNAseq_31_0.png
_images/Beyond_scRNAseq_31_1.png

Unroll the circle to get a pseudotime value for all parts of the cicle

To obtain a pseudotime value that differs between the two segments, one can unroll the circle into a single curved trajectory starting from the root and ending to its closest node.

[20]:
scf.tl.unroll_circle(adata)
--> updated
    .obs['t'] assigned pseudotime value.
    .obs['seg'] assigned segment of the tree.
    .uns['graph']['root'] selected root.
    .uns['graph']['pp_info'] for each PP, its distance vs root and segment assignment.
    .uns['graph']['pp_seg'] segments network information.
[21]:
sc.pl.scatter(adata,x='angles',y='t')
_images/Beyond_scRNAseq_34_0.png

GAM fit of single neuron activities

[22]:
scf.tl.fit(adata,features=adata.var_names,n_jobs=10)
fit features associated with the trajectory
    single mapping : 100%|██████████| 22/22 [00:25<00:00,  1.15s/it]
    finished (adata subsetted to keep only fitted features!) (0:00:25) --> added
    .layers['fitted'], fitted features on the trajectory for all mappings.
    .raw, unfiltered data.
[23]:
adata.uns["graph"]["milestones"]
[23]:
{'10': 10, '24': 24, '27': 27}
[24]:
scf.pl.single_trend(adata,'3',basis='palantir_multiscale',frameon=False,
                    color_exp="k",alpha_expr=.04,ylab='activity')
_images/Beyond_scRNAseq_38_0.png
[25]:
g=scf.pl.trends(adata,plot_emb=False,highlight_features=[],ordering="max",
              fig_heigth=6,pseudo_cmap=flat_huslmap,ord_thre=.95,return_genes=True)
_images/Beyond_scRNAseq_39_0.png
[26]:
scf.pl.matrix(adata,features=g,nbins=20,
              cmap="RdBu_r",annot_top=False,
              colorbar_title="max\nnormalized\nactivity")
_images/Beyond_scRNAseq_40_0.png