cellranger count output does not give all genes.
2.5 years ago
Adem80 • 0

Dear all,

I recently started using Cell Ranger from 10x Genomics for scRNA-seq data, having previously used my own pipeline (with STAR alignment) for Smart-seq2 data, to produce the count matrix for downstream analysis with Seurat or Scanpy.

I have, however, run into a strange issue: not all genes appear in the features.csv file. For example, the gene Clec7a is missing, which seems strange to me. I then checked the total number of features in features.csv and found approximately 32,000, far fewer than the roughly 53,000 I used to get with my own pipeline. This suggests that the genome annotation files (GTF) differ.

Does 10x filter the genome annotation file in some way that reduces the number of genes? Is there an option to control this?

Note: I use the same reference genome version in my own pipeline as the 10x one.

2.5 years ago

I assume you are talking about mouse.

Here's the entry for that gene in the GTF I downloaded from Ensembl:

6 ensembl_havana gene 129438554 129449742 . - . gene_id "ENSMUSG00000079293"; gene_version "12"; gene_name "Clec7a"; gene_source "ensembl_havana"; gene_biotype "polymorphic_pseudogene";

See also the Ensembl entry, and look at the "gene type" in the summary:

http://uswest.ensembl.org/Mus_musculus/Gene/Summary?db=core;g=ENSMUSG00000079293;r=6:129438554-129449742

If you filter the GTF the way 10x Genomics suggests, this gene is going to be filtered out, because its gene_biotype is "polymorphic_pseudogene". I don't know why the gene is labeled that way when it has transcripts that are counted as protein coding.
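
To check this on your own annotation, here is a minimal Python sketch that reads the `gene` records of a GTF and reports whether each gene's biotype would pass a biotype allowlist. The `KEPT_BIOTYPES` set is an assumption for illustration only: it is not the complete list, and the biotypes 10x keeps may differ between reference releases, so compare it against the build notes for your reference. The function names are hypothetical.

```python
import re

# ASSUMPTION: illustrative, incomplete allowlist. The real list used for the
# 10x prebuilt references (including IG/TR biotypes) should be taken from the
# build notes of the reference version you use.
KEPT_BIOTYPES = {"protein_coding", "lncRNA", "lincRNA", "antisense"}

# Matches pairs such as: gene_name "Clec7a"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(attr_field):
    """Turn the GTF attribute column into a dict of key -> value."""
    return dict(ATTR_RE.findall(attr_field))


def gene_biotypes(gtf_lines):
    """Map gene_name (or gene_id if unnamed) -> gene_biotype for 'gene' records."""
    result = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header/comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # skip malformed lines and non-gene features
        attrs = parse_attributes(fields[8])
        name = attrs.get("gene_name", attrs.get("gene_id"))
        result[name] = attrs.get("gene_biotype")
    return result


def survives_filter(biotype, kept=KEPT_BIOTYPES):
    """True if a gene with this biotype would pass the allowlist."""
    return biotype in kept


# The Clec7a record quoted above, written with the tab separators GTF requires.
EXAMPLE = (
    "6\tensembl_havana\tgene\t129438554\t129449742\t.\t-\t.\t"
    'gene_id "ENSMUSG00000079293"; gene_version "12"; gene_name "Clec7a"; '
    'gene_source "ensembl_havana"; gene_biotype "polymorphic_pseudogene";'
)

for gene, biotype in gene_biotypes([EXAMPLE]).items():
    print(gene, biotype, survives_filter(biotype))
```

To run it on a whole annotation, pass an open file handle instead of the one-line list, for example `gene_biotypes(open("genes.gtf"))`, where `genes.gtf` is a placeholder path. Tallying the biotypes of the genes that fail the check should show where the gap between the two feature counts comes from.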


That's really helpful!

In that case, I will need to build a new reference with "cellranger mkref" from an unfiltered GTF to get around this problem.
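
For reference, a minimal sketch of what that rebuild could look like, written in Python so the command can be inspected before it is run. The genome name and file paths are placeholders, and the function name is hypothetical; `--genome`, `--fasta` and `--genes` are the standard `cellranger mkref` arguments. The sketch only prints the command and does not execute it.

```python
import shlex


def mkref_command(genome_name, fasta_path, gtf_path):
    """Build the argument list for a `cellranger mkref` call."""
    return [
        "cellranger",
        "mkref",
        f"--genome={genome_name}",  # name of the output reference folder
        f"--fasta={fasta_path}",    # genome FASTA matching the GTF
        f"--genes={gtf_path}",      # unfiltered GTF, so no biotype is dropped
    ]


# ASSUMPTION: placeholder name and paths, replace with your own files.
cmd = mkref_command("mm10_unfiltered", "genome.fa", "genes.gtf")
print(shlex.join(cmd))
```

Once the printed command looks right, it can be pasted into a shell or passed to `subprocess.run(cmd, check=True)` on a machine where `cellranger` is installed.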
