Differential Methylation analysis and functional enrichment
In [1]:
Copied!
import pandas as pd
import numpy as np
import scanpy as sc
import scMethtools as scm
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import scanpy as sc
import scMethtools as scm
import matplotlib.pyplot as plt
#### #### # # ##### # # ##### #### #### # #### # # # ## ## # # # # # # # # # # #### # # ## # # ###### # # # # # # #### # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #### #### # # # # # # #### #### ###### #### Version: 1.0.0, Tutorials: https://ngdc.cncb.ac.cn/methbank/scm
In [5]:
Copied!
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
Import Data and settings¶
In [3]:
Copied!
#import anndata as ad
# adata = ad.read('/xtdisk/methbank_baoym/zongwt/single/data/GSE56789/scbs/after/promoters.h5ad')
adata = scm.read('/xtdisk/methbank_baoym/zongwt/single/data/GSE56789/scbs/after/promoters.h5ad')
#import anndata as ad
# adata = ad.read('/xtdisk/methbank_baoym/zongwt/single/data/GSE56789/scbs/after/promoters.h5ad')
adata = scm.read('/xtdisk/methbank_baoym/zongwt/single/data/GSE56789/scbs/after/promoters.h5ad')
/p300s/baoym_group/zongwt/zongwt/software/miniconda3/envs/py39/lib/python3.9/site-packages/anndata/__init__.py:51: FutureWarning: `anndata.read` is deprecated, use `anndata.read_h5ad` instead. `ad.read` will be removed in mid 2024. warnings.warn(
In [2]:
Copied!
#gloabel rcParams parameters
scm.settings.set_figure_params(dpi=150,figsize=[3,3],fontsize=8,font='Arial')
#gloabel rcParams parameters
scm.settings.set_figure_params(dpi=150,figsize=[3,3],fontsize=8,font='Arial')
- Differential analysis requires known cell sample labels. For methylation data, low-throughput datasets (e.g., those obtained using mouth pipetting) typically have known labels, such as sampling site and cell type.
- For high-throughput methods based on microfluidics and cell barcoding, cell labels may sometimes be unknown, requiring clustering and even cell type annotation beforehand.
- In the following example, we use the existing labels in the demo data for differential identification. These labels do not represent the true cell types but are used for comparison purposes only.
In [4]:
Copied!
scm.pl.set_colors(adata, 'Treatment', palette=scm.pl.palette(), n_colors=5)
scm.pl.set_colors(adata, 'Treatment', palette=scm.pl.palette(), n_colors=5)
Out[4]:
AnnData object with n_obs × n_vars = 45 × 38794 obs: 'cell_id', 'cell_name', 'sites', 'meth', 'n_total', 'global_meth_level', 'Series', 'condition', 'sample', 'Run', 'Organism', 'Cell_ID', 'Source_name', 'Cell_type', 'total_cell_type', 'subgroup', 'Development_stage', 'Strain', 'Tissue', 'Cell_line', 'Genotype', 'Treatment', 'Age', 'Sex', 'Race', 'other', 'Disease', 'CenterName', 'avgLength', 'LibraryStrategy', 'LibraryLayout', 'Platform', 'Model', 'Total_bases.Gb', 'Total_reads', 'Reads_After_trim', 'Uniq_mapping.reads', 'Mapping_ratio', 'Conversion_ratio', 'TotalCpGs(1X)', 'CpGsTotalUnique(1X)', 'CpGsMeanCov(1X)', 'CpGsMethRatio(1X)', 'CpGsMethReadRatio(1X)', 'TotalCpGs(3X)', 'CpGsTotalUnique(3X)', 'CpGsMeanCov(3X)', 'CpGsMethRatio(3X)', 'CpGsMethReadRatio(3X)', 'TotalCpGs(5X)', 'CpGsTotalUnique(5X)', 'CpGsMeanCov(5X)', 'CpGsMethRatio(5X)', 'CpGsMethReadRatio(5X)', 'Label', 'n_features', 'pct_peaks', 'leiden' var: 'chromosome', 'start', 'end', 'covered_cell', 'var', 'sr_var', 'mean', 'pct_cell', 'feature_select_sr', 'feature_select_var', 'Accession', 'Gene', 'Distance' uns: 'Cell_type_colors', 'Label_colors', 'Treatment_colors', 'dendrogram_leiden', 'label_color', 'leiden', 'leiden_colors', 'motif_analysis', 'neighbors', 'relative_pca_var_ratios', 'treatment', 'umap', 'workdir' obsm: 'X_tsne', 'X_umap', 'relative_pca' layers: 'relative' obsp: 'connectivities', 'distances'
In [6]:
Copied!
#scm.pl.set_colors(adata, 'Treatment', palette=scm.pl.palette(), n_colors=5)
scm.pl.grouped_value_boxplot(adata,color_by='Treatment',value_column='global_meth_level')
#scm.pl.set_colors(adata, 'Treatment', palette=scm.pl.palette(), n_colors=5)
scm.pl.grouped_value_boxplot(adata,color_by='Treatment',value_column='global_meth_level')
Get colors from adata.unsTreatment_colors ...
/p300s/baoym_group/zongwt/zongwt/software/miniconda3/envs/py39/lib/python3.9/site-packages/scMethtools/plotting/basic.py:233: UserWarning: Numpy array is not a supported type for `palette`. Please convert your palette to a list. This will become an error in v0.14 boxplot = sns.boxplot(x=color_by, y=value_column, data=adata.obs, /p300s/baoym_group/zongwt/zongwt/software/miniconda3/envs/py39/lib/python3.9/site-packages/scMethtools/plotting/basic.py:233: UserWarning: The palette list has more values (5) than needed (3), which may not be intended. boxplot = sns.boxplot(x=color_by, y=value_column, data=adata.obs, /p300s/baoym_group/zongwt/zongwt/software/miniconda3/envs/py39/lib/python3.9/site-packages/scMethtools/plotting/basic.py:238: UserWarning: Numpy array is not a supported type for `palette`. Please convert your palette to a list. This will become an error in v0.14 sns.stripplot(x=color_by, y=value_column, data=adata.obs, /p300s/baoym_group/zongwt/zongwt/software/miniconda3/envs/py39/lib/python3.9/site-packages/scMethtools/plotting/basic.py:238: UserWarning: The palette list has more values (5) than needed (3), which may not be intended. sns.stripplot(x=color_by, y=value_column, data=adata.obs,
Out[6]:
(<Figure size 450x450 with 1 Axes>, <Axes: title={'center': 'Grouped Boxplot of global_meth_level by Treatment'}, xlabel='Treatment', ylabel='global_meth_level'>)
Differentially methylated region calling¶
one vs one compare¶
- We can also do pairwise comparisons between two groups. For instance, clusters 1 & 2 are both mESC cells but different treatment.
In [8]:
Copied!
scm.dm.mdiff_pairwise(adata, group_by='Treatment', cluster='2i', reference='serum/LIF', key_added = "Treatment")
scm.dm.mdiff_pairwise(adata, group_by='Treatment', cluster='2i', reference='serum/LIF', key_added = "Treatment")
... Running differential methylation analysis between 2i and reference serum/LIF
In [9]:
Copied!
df_pairwise = sc.get.rank_genes_groups_df(adata,group=None,key='Treatment')
df_pairwise = sc.get.rank_genes_groups_df(adata,group=None,key='Treatment')
In [10]:
Copied!
df_pairwise.head()
df_pairwise.head()
Out[10]:
names | scores | logfoldchanges | pvals | pvals_adj | |
---|---|---|---|---|---|
0 | chr12:69360258-69362258 | 2.028904 | 1.587244 | 0.058618 | 1.0 |
1 | chr16:64850652-64852652 | 2.011422 | 1.679350 | 0.065555 | 1.0 |
2 | chr12:69360339-69362339 | 1.927094 | 1.426012 | 0.068754 | 1.0 |
3 | chr12:69360481-69362481 | 1.801351 | 1.302340 | 0.086583 | 1.0 |
4 | chr11:20542253-20544253 | 1.738439 | 2.164821 | 0.106372 | 1.0 |
In [12]:
Copied!
scm.pl.plot_volcano(adata,key='Treatment', group='2i',log2fc_min=-10, show_name=False)
scm.pl.plot_volcano(adata,key='Treatment', group='2i',log2fc_min=-10, show_name=False)
- It can be observed that mESCs treated with 2i exhibit differential hypomethylation compared to those treated with serum/LIF, with nearly all DMRs showing hypomethylation.
one vs rest compare¶
- We can also do comparisons of individual clusters on one vs many clusters to find specific DMR
In [13]:
Copied!
scm.dm.mdiff_specific(adata,group_by='Treatment', key_added='Treatment_all')
scm.dm.mdiff_specific(adata,group_by='Treatment', key_added='Treatment_all')
... Running differential methylation analysis between ['2i', 'Unknown', 'serum/LIF'] and reference rest
In [14]:
Copied!
df = sc.get.rank_genes_groups_df(adata,key='Treatment_all',group=None)
df = sc.get.rank_genes_groups_df(adata,key='Treatment_all',group=None)
In [15]:
Copied!
df.head()
df.head()
Out[15]:
group | names | scores | logfoldchanges | pvals | pvals_adj | |
---|---|---|---|---|---|---|
0 | 2i | chr3:96219168-96221168 | 2.810465 | NaN | 0.004947 | 0.009682 |
1 | 2i | chr11:20542253-20544253 | 2.784799 | 2.394727 | 0.005356 | 0.010361 |
2 | 2i | chr12:56535908-56537908 | 2.707800 | NaN | 0.006773 | 0.012659 |
3 | 2i | chr3:96218865-96220865 | 2.463970 | NaN | 0.013741 | 0.023136 |
4 | 2i | chr12:69360258-69362258 | 2.220139 | 1.572072 | 0.026409 | 0.040364 |
In [16]:
Copied!
df['group'].unique().tolist()
df['group'].unique().tolist()
Out[16]:
['2i', 'Unknown', 'serum/LIF']
- plot marker for cluster
In [25]:
Copied!
scm.pl.plot_marker(adata,top=10,basis='umap',key_added='Treatment_all',cluster='2i', size=30,ncols=5)
scm.pl.plot_marker(adata,top=10,basis='umap',key_added='Treatment_all',cluster='2i', size=30,ncols=5)
In [28]:
Copied!
var_names = adata.uns['Treatment_all']['names']['2i'].tolist()
sc.pl.stacked_violin(adata, var_names[:20], groupby='Treatment',save='test_stack.png',cmap='Greens')
var_names = adata.uns['Treatment_all']['names']['2i'].tolist()
sc.pl.stacked_violin(adata, var_names[:20], groupby='Treatment',save='test_stack.png',cmap='Greens')
/p300s/baoym_group/zongwt/zongwt/software/miniconda3/envs/py39/lib/python3.9/site-packages/scanpy/plotting/_stacked_violin.py:461: UserWarning: Numpy array is not a supported type for `palette`. Please convert your palette to a list. This will become an error in v0.14 row_ax = sns.violinplot( /p300s/baoym_group/zongwt/zongwt/software/miniconda3/envs/py39/lib/python3.9/site-packages/scanpy/plotting/_stacked_violin.py:461: UserWarning: Numpy array is not a supported type for `palette`. Please convert your palette to a list. This will become an error in v0.14 row_ax = sns.violinplot( /p300s/baoym_group/zongwt/zongwt/software/miniconda3/envs/py39/lib/python3.9/site-packages/scanpy/plotting/_stacked_violin.py:461: UserWarning: Numpy array is not a supported type for `palette`. Please convert your palette to a list. This will become an error in v0.14 row_ax = sns.violinplot(
WARNING: saving figure to file figures/stacked_violin_test_stack.png
In [31]:
Copied!
scm.pl.heatmap(adata, var_names=var_names[:50],swap_axes=True,annotation=['Cell_type','Treatment','Label'],groupby='leiden',cmap="GnBu",show_gene_labels=True,dendrogram=True)
scm.pl.heatmap(adata, var_names=var_names[:50],swap_axes=True,annotation=['Cell_type','Treatment','Label'],groupby='leiden',cmap="GnBu",show_gene_labels=True,dendrogram=True)
dendrogram_key by dendrogram_leiden Get colors from adata.unsCell_type_colors ... Get colors from adata.unsTreatment_colors ... Get colors from adata.unsLabel_colors ...
--------------------------------------------------------------------------- AttributeError Traceback (most recent call last) Cell In[31], line 1 ----> 1 scm.pl.heatmap(adata, var_names=var_names[:50],swap_axes=True,annotation=['Cell_type','Treatment','Label'],groupby='leiden',cmap="GnBu",show_gene_labels=True,dendrogram=True) File /p300s/baoym_group/zongwt/zongwt/software/miniconda3/envs/py39/lib/python3.9/site-packages/scMethtools/plotting/heatmap.py:698, in heatmap(adata, var_names, groupby, annotation, use_raw, log, num_categories, dendrogram, gene_symbols, var_group_positions, var_group_labels, var_group_rotation, layer, standard_scale, swap_axes, show_gene_labels, show, save, figsize, vmin, vmax, vcenter, norm, **kwds) 695 if var_group_positions is not None and len(var_group_positions) > 0: 696 return_ax_dict["gene_groups_ax"] = gene_groups_ax --> 698 _utils.savefig_or_show("heatmap", show=show, save=save) 699 show = settings.autoshow if show is None else show 700 if show: AttributeError: module 'scanpy._utils' has no attribute 'savefig_or_show'
Gene Annotation of Differential Regions¶
- To investigate the potential functions of the differential regions, DMRs can be annotated to the nearest gene.
In [ ]:
Copied!
#数据下载
#!wget ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_19/gencode.v19.annotation.gtf.gz -O data/data_tutorial_buenrostro/gencode.v19.annotation.gtf.gz
gtf_file='/xtdisk/methbank_baoym/zongwt/single/genome/mouse/Mus_musculus.GRCm39.111.chr.gtf'
#数据下载
#!wget ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_19/gencode.v19.annotation.gtf.gz -O data/data_tutorial_buenrostro/gencode.v19.annotation.gtf.gz
gtf_file='/xtdisk/methbank_baoym/zongwt/single/genome/mouse/Mus_musculus.GRCm39.111.chr.gtf'
- The annotation process may take a few minutes.
In [56]:
Copied!
reference = scm.dm.parse_gtf(gtf_file)
scm.dm.annotation(adata,ref=reference,gene_type='protein_coding')
reference = scm.dm.parse_gtf(gtf_file)
scm.dm.annotation(adata,ref=reference,gene_type='protein_coding')
... Loading gene references ... protein_coding gene will be annotated. ... Done ... Loading regions info ... Overlapping genes with regions ...Overlapping finish
In [19]:
Copied!
adata.var.head()
adata.var.head()
Out[19]:
chromosome | start | end | covered_cell | var | sr_var | mean | pct_cell | feature_select_sr | feature_select_var | Accession | Gene | Distance | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
index | |||||||||||||
chr1:58391898-58393898 | chr1 | 58391898 | 58393898 | 42 | 0.005050 | 0.001915 | 0.030952 | 0.933333 | False | False | ENSMUSG00000079554 | Aox2 | 0 |
chr1:87189607-87191607 | chr1 | 87189607 | 87191607 | 43 | 0.119925 | 0.093893 | 0.453605 | 0.955556 | False | False | ENSMUSG00000026255 | Efhd1 | 478 |
chr1:91610406-91612406 | chr1 | 91610406 | 91612406 | 43 | 0.115410 | 0.091781 | 0.349442 | 0.955556 | False | False | ENSMUSG00000007805 | Twist2 | 116777 |
chr1:180356649-180358649 | chr1 | 180356649 | 180358649 | 43 | 0.173246 | 0.121766 | 0.501581 | 0.955556 | True | True | ENSMUSG00000026496 | Parp1 | 37840 |
chr1:123755558-123757558 | chr1 | 123755558 | 123757558 | 33 | 0.171086 | 0.069639 | 0.566970 | 0.733333 | False | False | ENSMUSG00000036815 | Dpp10 | 0 |
In [20]:
Copied!
scm.pl.umap(adata,frameon=False, color=['Safb2','Ccdc69'],gene_symbols='Gene',size=50,cmap='Greens')
scm.pl.umap(adata,frameon=False, color=['Safb2','Ccdc69'],gene_symbols='Gene',size=50,cmap='Greens')
Note: Multiple regions are matched for Gene, mean methylation value will be plotted Note: Multiple regions are matched for Gene, mean methylation value will be plotted
Functional Enrichment analysis¶
In [21]:
Copied!
df_anno = sc.get.rank_genes_groups_df(adata,key='Treatment_all',group=None, gene_symbols='Gene')
df_anno = sc.get.rank_genes_groups_df(adata,key='Treatment_all',group=None, gene_symbols='Gene')
In [22]:
Copied!
df_anno.head()
df_anno.head()
Out[22]:
group | names | scores | logfoldchanges | pvals | pvals_adj | Gene | |
---|---|---|---|---|---|---|---|
0 | 2i | chr3:96219168-96221168 | 2.810465 | NaN | 0.004947 | 0.009682 | Fcgr1 |
1 | 2i | chr11:20542253-20544253 | 2.784799 | 2.394727 | 0.005356 | 0.010361 | Sertad2 |
2 | 2i | chr12:56535908-56537908 | 2.707800 | NaN | 0.006773 | 0.012659 | Nkx2-1 |
3 | 2i | chr3:96218865-96220865 | 2.463970 | NaN | 0.013741 | 0.023136 | Fcgr1 |
4 | 2i | chr12:69360258-69362258 | 2.220139 | 1.572072 | 0.026409 | 0.040364 | Nemf |
In [33]:
Copied!
#filter
gene_list = df_anno['Gene'].unique().to_list()
#filter
gene_list = df_anno['Gene'].unique().to_list()
In [42]:
Copied!
gene_list = scm.get.get_dmr_genes(adata,key_added='Treatment_all',direction='up',groups='2i')
gene_list = scm.get.get_dmr_genes(adata,key_added='Treatment_all',direction='up',groups='2i')
In [46]:
Copied!
gene_list['2i'][:5] #gene name should be upper for enrich
gene_list['2i'][:5] #gene name should be upper for enrich
Out[46]:
['SERTAD2', 'NEMF', 'NEMF', 'ESRP1', 'NRG3']
Functional enrichment analysis requires corresponding gene and pathway annotation files. If the runtime environment has internet access, online analysis can be performed by specifying library files.
Built-in libraries:
- GO_Biological_Process_2018.gmt
- HDSigDB_Human_2021.gmt
- KEGG_2019_Human.gmt
- WikiPathways_2024_Human.gmt
- GO_Biological_Process_2025.gmt
- HDSigDB_Mouse_2021.gmt
- KEGG_2019_Mouse.gmt
- WikiPathways_2024_Mouse.gmt
If the runtime environment does not have internet access, a limited set of built-in libraries can be used.
Users can also download the required libraries and specify the file path via parameters for analysis, but the files must be in .gmt format.
- download .gmt file wget http://maayanlab.cloud/Enrichr/geneSetLibrary?mode=text&libraryName=KEGG_2021_Human -O KEGG_2021_Human.gmt
In [50]:
Copied!
pathways = scm.dm.enrich(gene_list['2i'], gene_sets='/xtdisk/methbank_baoym/zongwt/single/genome/enrich_library/WikiPathways_2024_Mouse.gmt')
pathways = scm.dm.enrich(gene_list['2i'], gene_sets='/xtdisk/methbank_baoym/zongwt/single/genome/enrich_library/WikiPathways_2024_Mouse.gmt')
2025-03-26 13:47:06,848 [INFO] User defined gene sets is given: /xtdisk/methbank_baoym/zongwt/single/genome/enrich_library/WikiPathways_2024_Mouse.gmt 2025-03-26 13:47:06,854 [INFO] Run: WikiPathways_2024_Mouse.gmt 2025-03-26 13:47:06,883 [INFO] Background is not set! Use all 4494 genes in WikiPathways_2024_Mouse.gmt. 2025-03-26 13:47:06,936 [INFO] Done.
In [51]:
Copied!
scm.pl.plot_enrichment(pathways,figsize=(8,4),dpi=200,palette='Reds',save='test_enrich.png')
scm.pl.plot_enrichment(pathways,figsize=(8,4),dpi=200,palette='Reds',save='test_enrich.png')
Saving figure to : ./figures/test_enrich.png
In [ ]:
Copied!
pathways = scm.dm.enrich(gene_list['2i'], gene_sets='GO_Biological_Process_2018')
pathways = scm.dm.enrich(gene_list['2i'], gene_sets='GO_Biological_Process_2018')
Motif Analysis¶
- Transcription factor binding motifs are often enriched with cis-regulatory elements, which can reveal potential regulatory mechanisms. The first step in studying these DNA motifs is to scan for their occurrences within genomic regions.
- First, we need the FASTA sequence of the organism's genome, which can be downloaded from the UCSC repository.
In [ ]:
Copied!
!mkdir -p data
!wget https://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/mm10.fa.gz -O data/mm10.fa.gz
!cd data/ && gzip -d -f mm10.fa.gz
!mkdir -p data
!wget https://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/mm10.fa.gz -O data/mm10.fa.gz
!cd data/ && gzip -d -f mm10.fa.gz
1. Motif Scan¶
- Using Position Weight Matrix (PWM) to predict transcription factor (TF) binding sites in DNA sequences.
- for backgroung regions, we use all regions passed filter before, you can specified your own background regions
In [58]:
Copied!
all_regions = scm.get.get_var(adata)
all_regions = scm.get.get_var(adata)
In [59]:
Copied!
len(all_regions)
len(all_regions)
Out[59]:
38794
In [ ]:
Copied!
scm.dm.motif_scan(adata, regions=all_regions, genome='/p300s/baoym_group/zongwt/zongwt/genome/mm10/mm10.fa')
scm.dm.motif_scan(adata, regions=all_regions, genome='/p300s/baoym_group/zongwt/zongwt/genome/mm10/mm10.fa')
- It takes approximately a few tens of minutes to scan motifs across the whole genome regions.
2. Motif Enrichment¶
- The enrichment of transcription factors (TFs) in target regions (DMRs) is statistically analyzed to infer whether TFs regulate specific genes.
In [124]:
Copied!
motif_result = motif_enrichment(adata,key_added='Treatment_all',direction='up',groups=None)
motif_result = motif_enrichment(adata,key_added='Treatment_all',direction='up',groups=None)
Calculating enrichment for group 2i ...
Finding enrichments: 100%|██████████| 1646/1646 [00:01<00:00, 1011.17it/s]
Calculating enrichment for group Unknown ...
Finding enrichments: 100%|██████████| 1646/1646 [00:01<00:00, 916.20it/s]
Calculating enrichment for group serum/LIF ...
Finding enrichments: 100%|██████████| 1646/1646 [00:01<00:00, 834.87it/s]
In [120]:
Copied!
motif_result
motif_result
Out[120]:
2i | Unknown | serum/LIF | |
---|---|---|---|
EN1 | 9.960413e-01 | 9.863188e-01 | 9.996782e-01 |
GFI1 | 1.000000e+00 | 1.000000e+00 | 8.308049e-01 |
TEAD1 | 9.999999e-01 | 9.999987e-01 | 9.999472e-01 |
OVO | 1.000000e+00 | 1.000000e+00 | 1.000000e+00 |
VSX1 | 1.000000e+00 | 1.000000e+00 | 1.000000e+00 |
... | ... | ... | ... |
KLF17 | 1.474806e-39 | 1.946404e-29 | 5.903653e-13 |
MSGN1 | 1.000000e+00 | 1.000000e+00 | 9.999996e-01 |
OVOL1 | 7.372146e-01 | 5.817805e-01 | 3.373687e-01 |
FOXF1 | 1.000000e+00 | 1.000000e+00 | 9.999999e-01 |
PTF1A.VAR.3 | 6.600707e-01 | 2.905419e-01 | 9.999988e-01 |
1646 rows × 3 columns
In [125]:
Copied!
scm.pl.plot_motif(motif_result,top_n=5)
scm.pl.plot_motif(motif_result,top_n=5)
Out[125]:
(<Figure size 450x450 with 2 Axes>, <Axes: title={'center': 'Motif Enrichment Heatmap'}, xlabel='Group', ylabel='Motif'>)
Methylation Pattern for DMR¶
- Do not use too long regions,this process is memory consumed
In [144]:
Copied!
gtf_file = "/xtdisk/methbank_baoym/zongwt/single/genome/mouse/Mus_musculus.GRCm39.111.chr.gtf"
scm.pl.profile(adata,
npz_file="/xtdisk/methbank_baoym/zongwt/single/data/GSE56789/sscm/chr7.npz",
region="7:97591403-97593402",
annotation_file=gtf_file,
clusters=["2i"],group_by="Treatment",
sample=None)
gtf_file = "/xtdisk/methbank_baoym/zongwt/single/genome/mouse/Mus_musculus.GRCm39.111.chr.gtf"
scm.pl.profile(adata,
npz_file="/xtdisk/methbank_baoym/zongwt/single/data/GSE56789/sscm/chr7.npz",
region="7:97591403-97593402",
annotation_file=gtf_file,
clusters=["2i"],group_by="Treatment",
sample=None)
2i shape: 12 Others shape: 33 ['Gene Annotation', 'Profile', '2i', 'Others']
In [ ]:
Copied!