Question

Arabidopsis thaliana whole genome

0

6.3 years ago

sangram_keshari ▴ 260

Please correct me if I am wrong.

I want to align my set of RNA-seq reads to the Arabidopsis thaliana genome. Where can I get the FASTA file for it?

I looked on the [Ensembl FTP][1] site and downloaded the file Arabidopsis_thaliana.TAIR10.dna.toplevel.fa.gz.

Is this the correct genome file to start the alignment with, or should I get it from somewhere else, such as TAIR? On TAIR it is also confusing which file to download.

Can anyone please help me with this? Thank you :)

RNA-Seq alignment genome • 3.0k views

score 1 · Answer 1 · 2017-12-21

1

6.3 years ago

GenoMax 141k

Get the sequence/annotation/aligner index bundle from the Illumina iGenomes site.

0

I guess the Ensembl data is also treated as a standard source?

Reply • 6.3 years ago by sangram_keshari ▴ 260

1

Primary sequence data comes from the genome project. Ensembl may have additional annotation but otherwise everything should be the same.
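
If you want to check that two downloads really carry the same sequence, one rough way is to compare per-sequence checksums. The sketch below is only an illustration: the function names are made up for this example, and it assumes plain or gzipped FASTA files. It uppercases the sequence so soft-masking does not matter, and it compares checksums rather than names, since different sites may label the same chromosome differently.

```python
import gzip
import hashlib


def open_fasta(path):
    """Open a FASTA file as text, transparently handling .gz files."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def sequence_checksums(path):
    """Return {sequence_name: MD5 of the uppercased sequence} for one FASTA."""
    checksums = {}
    name, digest = None, None
    with open_fasta(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Store the previous record before starting a new one.
                if name is not None:
                    checksums[name] = digest.hexdigest()
                # Use only the first word of the header as the name.
                parts = line[1:].split()
                name = parts[0] if parts else ""
                digest = hashlib.md5()
            elif name is not None:
                # Feed the sequence line by line so line wrapping is irrelevant.
                digest.update(line.upper().encode("ascii"))
    if name is not None:
        checksums[name] = digest.hexdigest()
    return checksums


def same_sequences(path_a, path_b):
    """True if both files hold the same set of sequences, ignoring names."""
    a = sequence_checksums(path_a)
    b = sequence_checksums(path_b)
    return sorted(a.values()) == sorted(b.values())
```

Calling `same_sequences()` on the Ensembl file and on a copy from another source would tell you whether they differ in anything other than header names or masking case. A hard-masked file (repeats replaced by N) would show up as different.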

You could also use Araport to get the sequence/annotations. You would need to build your own indexes.
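
As a rough sketch of what building your own index could look like: the example below assumes HISAT2 as the aligner (the thread does not name one) and that `hisat2-build` is installed and on your PATH. The helper names and the index base name are hypothetical. It decompresses the downloaded FASTA first, then runs `hisat2-build` with the reference FASTA and an output base name.

```python
import gzip
import shutil
import subprocess
from pathlib import Path


def decompress_fasta(gz_path):
    """Write an uncompressed copy next to the .gz file and return its path."""
    gz_path = Path(gz_path)
    if gz_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {gz_path}")
    fasta_path = gz_path.with_suffix("")  # strips only the trailing ".gz"
    if not fasta_path.exists():
        with gzip.open(gz_path, "rb") as src, open(fasta_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return fasta_path


def hisat2_build_command(fasta_path, index_base, threads=4):
    """Return the argument list: hisat2-build -p <threads> <fasta> <base>."""
    return ["hisat2-build", "-p", str(threads), str(fasta_path), str(index_base)]


def build_index(gz_path, index_base, threads=4):
    """Decompress the genome and run hisat2-build (must be installed)."""
    fasta_path = decompress_fasta(gz_path)
    subprocess.run(hisat2_build_command(fasta_path, index_base, threads), check=True)
```

For the file from the question, this would be something like `build_index("Arabidopsis_thaliana.TAIR10.dna.toplevel.fa.gz", "tair10")`, where `tair10` is just an example base name. Other aligners such as STAR have their own index-building step with different options.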

Reply • 6.3 years ago by GenoMax 141k