SARS-CoV-2 genome on NCBI website
11 weeks ago

Why is it that when you download a FASTA file for a specific SARS-CoV-2 genome from the NCBI website, the sequence is written as DNA and not RNA? Is there a convention for that?

SARS-CoV-2 NCBI RNA DNA • 386 views
11 weeks ago

I don't have a link that explains the rationale, but most RNA sequences in databases are represented as DNA. It makes life a lot easier when comparing and matching RNA, DNA and protein sequences.

11 weeks ago

Why does the FASTA sequence for coronavirus look like DNA, not RNA?

The reason is simple: we rarely sequence RNA directly, because RNA is unstable and easily degraded by RNases. Instead, the genome is reverse transcribed, either by targeted reverse transcription or by random amplification, and thus converted to cDNA. cDNA is stable and is essentially a DNA copy of the RNA.

The cDNA is either sequenced directly or further amplified by PCR and then sequenced. Hence the sequence we observe is that of the cDNA rather than the RNA, so we see thymine rather than uracil, and that is how it is reported.


Still, I'd say that is not a fully satisfying explanation.

At the end of sequencing we could just as well write the result back as RNA, since we know the original molecule was RNA; the rest is just protocol. We are not measuring bases either, but fluorescence etc., yet we are not calling the sequence "red".
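
To illustrate how mechanical that conversion is, here is a minimal Python sketch that rewrites a DNA-notation FASTA record in RNA notation by swapping T for U. The function names and the example record are made up for illustration; this is not an NCBI tool, and it assumes the sequence is the genomic (positive) strand as deposited.

```python
def to_rna(seq: str) -> str:
    """Rewrite a DNA-notation sequence in RNA notation (T -> U, t -> u)."""
    return seq.translate(str.maketrans("Tt", "Uu"))


def fasta_to_rna(fasta_text: str) -> str:
    """Convert only the sequence lines of FASTA text.

    Header lines (starting with '>') are left untouched so that
    accession numbers or descriptions containing 'T' are not altered.
    """
    converted = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            converted.append(line)
        else:
            converted.append(to_rna(line))
    return "\n".join(converted)


# Hypothetical record, not a real database entry.
example = ">hypothetical_record Test sequence\nATGCTTAG\nttgaca"
print(fasta_to_rna(example))
```

Since the mapping is one-to-one, no information is lost either way, which is part of why storing everything in DNA notation is a workable convention.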


I am not 100% certain, but as far as I know GenBank stores all sequences in DNA notation. So even for RNA genomes the sequence is represented as its DNA counterpart, while the molecule type is recorded in the LOCUS line, e.g. for Chikungunya virus:

LOCUS       NC_004162              11826 bp **ss-RNA**     linear   VRL 29-OCT-2018
DEFINITION  Chikungunya virus, complete genome.
ACCESSION   NC_004162
VERSION     NC_004162.2
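
As a rough sketch of how one might read that molecule type programmatically, the Python snippet below pulls the field that follows `bp` out of a LOCUS line. The function name is hypothetical, and the whitespace-splitting approach is an assumption that fits the line quoted above; it is not a full GenBank parser, and records with a different layout may need a proper library.

```python
def locus_molecule_type(locus_line: str) -> str:
    """Return the molecule type field (e.g. 'ss-RNA') from a GenBank LOCUS line.

    Assumes whitespace-separated fields with the molecule type directly
    after the 'bp' unit, as in the record quoted in this thread.
    """
    tokens = locus_line.split()
    if not tokens or tokens[0] != "LOCUS" or "bp" not in tokens:
        raise ValueError("not a nucleotide LOCUS line")
    position = tokens.index("bp") + 1
    if position >= len(tokens):
        raise ValueError("LOCUS line has no molecule type field")
    # Strip any emphasis asterisks that were added when the record was pasted.
    return tokens[position].strip("*")


line = "LOCUS       NC_004162              11826 bp ss-RNA     linear   VRL 29-OCT-2018"
molecule = locus_molecule_type(line)
print(molecule)
print("RNA genome" if "RNA" in molecule else "DNA genome")
```

A check like this lets a script decide whether to convert T to U for display, without changing how the sequence itself is stored.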