Question: No XIST in Ensembl cdna.fa
0
4 weeks ago, WouterDeCoster (21k, Belgium) wrote:

I noticed that Salmon didn't count XIST in my RNA-seq data, and tracked it down to the Ensembl transcriptome FASTA that was used to build the index. I'm not sure if I'm being stupid or something is wrong, but I can't find XIST (ENSG00000229807) in the transcriptome file Homo_sapiens.GRCh38.cdna.all.fa.gz, although it is present in the GTF file Homo_sapiens.GRCh38.89.gtf.gz.

ftp to the Homo_sapiens.GRCh38.cdna.all.fa.gz
ftp to the Homo_sapiens.GRCh38.89.gtf.gz

Does anyone have an idea what I'm missing or what's wrong?

Thanks!

3
4 weeks ago, Satyajeet Khare (1.0k, Pune, India) wrote:

This page says the cDNA FASTA file contains "cDNA sequences for protein-coding genes", whereas this page says it is a file containing "cDNA sequences for Ensembl or ab initio predicted genes". Can you check whether all non-coding RNAs are absent, or only XIST?
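One way to run that check is to tally the gene biotypes listed in the FASTA headers. The sketch below assumes the headers carry a `gene_biotype:` field, as Ensembl cDNA and ncRNA FASTA headers usually do; the function names are made up for illustration and are not part of any Ensembl or Salmon tool.

```python
import gzip
import re
from collections import Counter

# Assumption: headers look roughly like
# ">ENST... cdna chromosome:... gene:ENSG... gene_biotype:protein_coding ..."
BIOTYPE_RE = re.compile(r"gene_biotype:(\S+)")


def count_gene_biotypes(lines):
    """Count the gene_biotype value of every FASTA header in an iterable of lines."""
    counts = Counter()
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        match = BIOTYPE_RE.search(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


def count_biotypes_in_fasta(path):
    """Same tally, read from a gzipped FASTA file on disk."""
    with gzip.open(path, "rt") as handle:
        return count_gene_biotypes(handle)
```

Calling `count_biotypes_in_fasta("Homo_sapiens.GRCh38.cdna.all.fa.gz")` and looking for non-coding biotypes in the result should show whether the whole class is missing or only individual genes.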

1

Thanks! That seems to be it... HOTAIR is also not present. Not what I expected!
But XIST is present in Homo_sapiens.GRCh38.ncrna.fa.gz. I wonder if it would be appropriate to just concatenate the ncrna and cdna FASTA files to get a more complete reference.
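If concatenating turns out to be reasonable, a minimal sketch of the merge could look like this. It assumes the first whitespace-delimited token of each header is the transcript ID and drops any record whose ID has already been written, on the assumption that a duplicate name in the combined file would be unwanted; `merge_fastas` is a hypothetical helper, not an existing tool.

```python
import gzip


def merge_fastas(input_paths, output_path):
    """Concatenate gzipped FASTA files, keeping the first record seen for each ID.

    Returns the number of records written.
    """
    seen = set()
    written = 0
    with gzip.open(output_path, "wt") as out:
        for path in input_paths:
            keep = False  # nothing is written until the first header of a file
            with gzip.open(path, "rt") as handle:
                for line in handle:
                    if line.startswith(">"):
                        parts = line[1:].split()
                        record_id = parts[0] if parts else ""
                        keep = record_id not in seen
                        if keep:
                            seen.add(record_id)
                            written += 1
                    if keep:
                        # Guard against a missing newline at the end of a file
                        out.write(line if line.endswith("\n") else line + "\n")
    return written
```

For the files in this thread the call would be something like `merge_fastas(["Homo_sapiens.GRCh38.cdna.all.fa.gz", "Homo_sapiens.GRCh38.ncrna.fa.gz"], "combined.fa.gz")`, where the output name is arbitrary.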

For me, cDNA was "transcript-coding DNA" and not necessarily "protein-coding DNA".

"I'm not sure if I'm being stupid"

Hypothesis confirmed...

written 4 weeks ago by WouterDeCoster (21k)