6.3 years ago
sangram_keshari
Please correct me if I am wrong.
I want to align my set of RNA-seq reads to the Arabidopsis thaliana genome. Where would I get the FASTA file for this?
I looked on the [Ensembl FTP][1] and downloaded the file Arabidopsis_thaliana.TAIR10.dna.toplevel.fa.gz
Is this the correct genome file to start the alignment with, or should I get it from somewhere else, like TAIR? On TAIR it is also confusing which file to download.
Can anyone please help me with this? Thank you :)
I guess Ensembl data is also treated as standard?
Primary sequence data comes from the genome project. Ensembl may have additional annotation but otherwise everything should be the same.
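If you want to check that for yourself, here is a minimal Python sketch that compares two genome FASTA files by sequence content rather than by header. The function name `sequence_digests` is just something I made up for illustration. It assumes a plain `dna` (or soft-masked) file, since a hard-masked file would have Ns in place of repeats and give different checksums. Sequence names may differ between sources, so compare the (length, checksum) values rather than the names.

```python
import gzip
import hashlib


def sequence_digests(path):
    """Return {sequence_name: (length, md5_hex)} for a plain or gzipped FASTA file.

    Line wrapping and upper/lower case are ignored, so only the actual
    sequence content affects the checksum.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    digests = {}
    name, length, md5 = None, 0, None
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Store the record we just finished before starting a new one.
                if name is not None:
                    digests[name] = (length, md5.hexdigest())
                # Keep only the first word of the header as the sequence name.
                name = line[1:].split()[0]
                length, md5 = 0, hashlib.md5()
            elif name is not None:
                seq = line.upper()
                length += len(seq)
                md5.update(seq.encode("ascii"))
    if name is not None:
        digests[name] = (length, md5.hexdigest())
    return digests
```

Run it on both downloads and compare `sorted(result.values())` from each; if the lists match, the two files contain the same sequences even if the headers are named differently.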
You could also use Araport to get the sequence and annotations. You would need to build your own aligner indexes from them.
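As a rough sketch of the index-building step: the thread does not say which aligner is being used, so the example below assumes HISAT2 and its `hisat2-build` tool purely for illustration. The helper names (`decompress_fasta`, `hisat2_build_command`, `build_index`) and the thread count are my own placeholders. Other aligners have their own index commands, so adapt accordingly.

```python
import gzip
import shutil
import subprocess
from pathlib import Path


def decompress_fasta(gz_path):
    """Write an uncompressed copy of a .gz FASTA next to it and return its path."""
    gz_path = Path(gz_path)
    fasta_path = gz_path.with_suffix("")  # drops the trailing ".gz"
    with gzip.open(gz_path, "rb") as src, open(fasta_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return fasta_path


def hisat2_build_command(fasta_path, index_prefix, threads=4):
    """Assemble the hisat2-build call: reference FASTA in, index basename out."""
    return ["hisat2-build", "-p", str(threads), str(fasta_path), str(index_prefix)]


def build_index(gz_path, index_prefix, threads=4):
    """Decompress the genome and run hisat2-build (assumes HISAT2 is installed)."""
    fasta_path = decompress_fasta(gz_path)
    cmd = hisat2_build_command(fasta_path, index_prefix, threads)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("hisat2-build not found on PATH")
    subprocess.run(cmd, check=True)
    return cmd
```

For example, `build_index("Arabidopsis_thaliana.TAIR10.dna.toplevel.fa.gz", "tair10_index")` would decompress the Ensembl file and write the index files with the (made-up) prefix `tair10_index`.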