Question: Where Can I Download All Exons Of The Human Genome In Fasta Format (One Big File!) ?
3
7.3 years ago by
mariusorion50 wrote:

Hi to all,

Where can I download all exons of the human genome in FASTA format (one big file!)?

Thanks.

genome fasta exon human download • 9.7k views
modified 3.9 years ago by Alejandro Jimenez Sanchez120 • written 7.3 years ago by mariusorion50

Thank you very much to all. Thank you Malachi, this is what I am looking for.

written 7.3 years ago by mariusorion50

Not an answer... This should go as a comment. Thanks!

written 7.3 years ago by Josh Herr5.7k

Here - ftp://ftp.ebi.ac.uk/pub/databases/astd/current_release/human/

modified 6 months ago by RamRS27k • written 4.5 years ago by hamzakhanvit10
6
7.3 years ago by
Washington University School of Medicine, St. Louis, USA
Malachi Griffith18k wrote:

Using Ensembl BioMart:

  1. Go to the BioMart website.
  2. Choose database: Ensembl Genes 70
  3. Choose dataset: Homo sapiens
  4. Click 'Attributes' then select the 'Sequences' option
  5. Expand the sequences pane and select the 'Exon sequences' option
  6. Expand the 'Header information' pane. Select the info you want to associate with each sequence record (e.g. Ensembl Gene ID, Ensembl Transcript ID, Ensembl Exon ID).
  7. Click the 'Results' button.
  8. Make sure the example output looks good, select the 'Compressed File' option and hit the 'Go' button.
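
For anyone who would rather script this than click through the web form, the same selection can in principle be written as a BioMart XML query. The following is only a minimal sketch: the dataset name `hsapiens_gene_ensembl`, the attribute names (`ensembl_gene_id`, `ensembl_transcript_id`, `ensembl_exon_id`, `gene_exon`), their order, and the service URL are assumptions that do not come from the steps above, and the function names are hypothetical. The 'XML' button on the BioMart results page shows the exact query for your own selection, so check against that. Nothing is downloaded here; the code only builds the query text and a request URL.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumed service endpoint; verify it against the BioMart site you are using.
MART_URL = "https://www.ensembl.org/biomart/martservice"


def build_exon_query(
    dataset="hsapiens_gene_ensembl",  # assumed dataset name for Homo sapiens
    header_attributes=("ensembl_gene_id", "ensembl_transcript_id", "ensembl_exon_id"),
    sequence_attribute="gene_exon",  # assumed name of the 'Exon sequences' option
):
    """Return a BioMart XML query asking for exon sequences in FASTA format."""
    query = ET.Element(
        "Query",
        {
            "virtualSchemaName": "default",
            "formatter": "FASTA",
            "header": "0",
            "uniqueRows": "1",
            "datasetConfigVersion": "0.6",
        },
    )
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Header fields first, then the sequence attribute (order is an assumption).
    for name in tuple(header_attributes) + (sequence_attribute,):
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


def build_url(xml, base=MART_URL):
    """URL-encode the XML query into a GET request URL."""
    return base + "?" + urllib.parse.urlencode({"query": xml})
```

Note that the public site serves the current Ensembl release rather than a specific older one such as Ensembl Genes 70, and a whole-genome sequence request may be slow or time out, so the web export described above may still be the more reliable route.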

The download took a while, but eventually I got a file 'mart_export.txt.gz'. Its contents look like this:

>ENSG00000264715|ENST00000578347|ENSE00002715610
AGGAGTGACCAGAAGACAAGAGTGCGAGCCTTCTGTTATGCCCAGACAGGGCCACCAGAGGGCTCCTTGGTCTAGTGGTAACGCCA 
>ENSG00000265161|ENST00000580394|ENSE00002716056
AGTAGAGATGGGGTTTCACCATGTTGGCCAGGCTGGTCTCAAACTCCTGACCTCAGGTGATCCATCCACC
>ENSG00000200917|ENST00000364047|ENSE00001438810
GCACATACATATACTAAAATTGGAACAATACAGAGAAGATTAGCATAGCCCCTGCGCAAGGATGACATGCAAATTCGTGAAGTGTTCCATATTAAA
...
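
A small sketch of how records like these could be read back in Python, assuming the header fields are separated by `|` in the order gene ID, transcript ID, exon ID, as in the sample above. The function names are hypothetical, and the sketch uses only the standard library.

```python
import gzip


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_header(header):
    """Split 'gene|transcript|exon' into a dict (field order is an assumption)."""
    gene_id, transcript_id, exon_id = header.split("|")
    return {"gene": gene_id, "transcript": transcript_id, "exon": exon_id}


# Example usage with the downloaded file:
# with gzip.open("mart_export.txt.gz", "rt") as fh:
#     for header, seq in read_fasta(fh):
#         ids = split_header(header)
#         print(ids["exon"], len(seq))
```

Because the header includes the transcript ID, an exon shared by several transcripts may appear more than once; keying a dict on the exon ID is one way to keep a single copy.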
modified 6 months ago by RamRS27k • written 7.3 years ago by Malachi Griffith18k

Thank you Malachi, this is what I am looking for :)

written 7.3 years ago by mariusorion50
3
7.3 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum129k wrote:

get the coordinates:

Exon coordinates of hg19 genome download

and fetch the sequences:

How to get the sequence of a genomic region from UCSC?

Batch fetching fasta sequences from bed file
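
As a rough illustration of the two-step approach (coordinates first, then sequences), here is a minimal sketch that cuts exon sequences out of a genome using BED intervals. It assumes the genome is already loaded as a dict of chromosome name to sequence string, that the BED lines use the standard 0-based, half-open coordinates, and that column 6, when present, holds the strand. All function names are hypothetical. For a full human genome a dedicated tool such as `bedtools getfasta` would be the more practical choice.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_bed(lines):
    """Yield (chrom, start, end, name, strand) from BED-formatted lines."""
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Fall back to a coordinate-based name and the '+' strand if absent.
        name = fields[3] if len(fields) > 3 else f"{chrom}:{start}-{end}"
        strand = fields[5] if len(fields) > 5 else "+"
        yield chrom, start, end, name, strand


def extract_fasta(genome, bed_lines):
    """Yield FASTA records for each BED interval found in `genome`."""
    for chrom, start, end, name, strand in parse_bed(bed_lines):
        # BED is 0-based and half-open, which matches Python slicing directly.
        seq = genome[chrom][start:end]
        if strand == "-":
            seq = reverse_complement(seq)
        yield f">{name}\n{seq}"
```

With a toy genome `{"chrT": "AACCGGTTAC"}`, the line `chrT 2 6 exon1 0 +` would select positions 2 to 5 of the string, and a `-` strand interval would be reverse-complemented before output.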

written 7.3 years ago by Pierre Lindenbaum129k
2
7.3 years ago by
United States
aindap120 wrote:

Have a look at the UCSC Table Browser: position base query

written 7.3 years ago by aindap120

The page does not work

written 7.3 years ago by mariusorion50
0
4.5 years ago by
hamzakhanvit10 wrote:

Here - ftp://ftp.ebi.ac.uk/pub/databases/astd/current_release/human/

modified 6 months ago by RamRS27k • written 4.5 years ago by hamzakhanvit10
0
3.9 years ago by
Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK

If anyone is looking for the same but for GRCh37 here is the link.

written 3.9 years ago by Alejandro Jimenez Sanchez120