Question: How To Get Ref.Fasta
Zhshqzyc (470) wrote:

Hi,

I want to use the following samtools command:

samtools faidx <ref.fasta> [region1 [...]]

My question: where can I get ref.fasta, or is there a command to create it? I already have a BAM file.

Thanks.

sequence samtools • 7.2k views
Written 7.6 years ago by Zhshqzyc (470)

Which genome was used to create your BAM file? By that I mean, to which genome were the reads aligned?

Reply written 7.6 years ago by iw9oel_ad (6.0k)

Human genome. dbGaP phenotype release

Reply written 7.6 years ago by Zhshqzyc (470)

Which human genome? hg18, hg19, or another one? Normally you can download hgXX as individual chromosomes and merge them into a single hgXX.fasta, which is your ref.fasta.

Reply written 7.6 years ago by Mdeng (510)

hg18 genome. Where can I download it?

Reply written 7.6 years ago by Zhshqzyc (470)

Just a warning: if you already have a BAM file, the reads have already been mapped, so the reference file must have been available at that point. You should try to retrieve that exact reference file, because if you download a different one you will end up with nomenclature or position errors that won't be easy to deal with.

Reply written 7.6 years ago by Jorge Amigo (11k)
Mdeng (510) wrote:

Get it here:

http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/

Download hg18.2bit, which is hg18 in a compact binary encoding. You can convert it to FASTA format with twoBitToFa, which you can download here:

http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/
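
A rough sketch of that conversion, assuming a 64-bit Linux machine with wget installed. Keep in mind that twoBitToFa is a compiled executable, not a script, so it will look like garbage in a text editor and has to be made executable before it will run. The function name convert_2bit and the output name hg18.fasta are just illustrative choices.

```shell
# Sketch only: nothing is downloaded until convert_2bit is called.
UCSC=http://hgdownload.cse.ucsc.edu

convert_2bit() {
    # Fetch the 2bit genome and the converter (a binary executable, not text).
    wget "$UCSC/goldenPath/hg18/bigZips/hg18.2bit" || return 1
    wget "$UCSC/admin/exe/linux.x86_64/twoBitToFa" || return 1
    chmod +x twoBitToFa
    # Usage: twoBitToFa input.2bit output.fa
    ./twoBitToFa hg18.2bit hg18.fasta
}

# To run the download and conversion: convert_2bit
```

The resulting hg18.fasta can then be passed to samtools faidx.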

//Edit:

Ok, then you can download the single chromosomes here:

http://hgdownload.cse.ucsc.edu/goldenPath/hg18/chromosomes/

Skip the ones with "random" in the name and take just chr1.fa.gz through chrY.fa.gz (which ones you need depends on your use case; you may want to ask whoever did the alignment).

After that, unzip them and merge them. See here for how this works:

http://biostar.stackexchange.com/questions/9743/fasta-file-vs-fa-file

.fa is the same as .fasta
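
The unzip-and-merge step could be sketched like this, assuming the chrN.fa.gz files are already in one directory. The function name merge_chromosomes and the directory name in the example are hypothetical. The chromosome list and its order are also an assumption; ideally they should match the @SQ lines in your BAM header.

```shell
# merge_chromosomes DIR OUT
# Concatenate DIR/chr1.fa.gz ... DIR/chrY.fa.gz into OUT in a fixed order.
# Files that are not in the list (for example the *_random ones) are ignored.
merge_chromosomes() {
    dir=$1
    out=$2
    : > "$out"                        # start with an empty output file
    for c in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y; do
        f="$dir/chr$c.fa.gz"
        if [ -f "$f" ]; then
            gzip -dc "$f" >> "$out"   # decompress to stdout and append
        else
            echo "warning: $f not found, skipped" >&2
        fi
    done
}

# Example: merge_chromosomes ./hg18_chromosomes hg18.fasta
```

Listing the chromosomes explicitly, rather than using a chr*.fa.gz glob, keeps the order numeric (a glob would put chr10 before chr2) and leaves out the random contigs.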

Modified 7.6 years ago • written 7.6 years ago by Mdeng (510)

mdeng, many many thanks.

Reply written 7.6 years ago by Zhshqzyc (470)

Let us know if it works. If not, try the link you posted. It also seems to be hg18 in FASTA format, so you would not have to convert formats.

Reply written 7.6 years ago by Mdeng (510)

twoBitToFa is a corrupt text file. If you have a direct download for hg18.fasta, please let me know. Thanks.

Reply written 7.6 years ago by Zhshqzyc (470)

I edited my post...

Reply written 7.6 years ago by Mdeng (510)
Drio (910) wrote:

The BAM file will not contain the reference genome (if that is what you are asking). Check the header:

samtools view -H my.bam

You may find some information about the exact version that was used to align the data. If you can't find anything I'd suggest you contact the person that generated the alignments.
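
As a sketch of how to pull that information out, the helper below prints the distinct values of one tag from the @SQ lines of a header. The name sq_tag is made up for illustration, and my.bam is a placeholder. The UR and AS tags (reference URI and assembly name) are optional in the SAM format, so they may simply be absent.

```shell
# sq_tag TAG
# Read SAM header text on stdin and print the distinct values of TAG
# (e.g. UR, AS, M5) found on @SQ lines. Header fields are tab-separated.
sq_tag() {
    awk -F '\t' -v tag="$1" '
        /^@SQ/ {
            for (i = 2; i <= NF; i++)
                if (index($i, tag ":") == 1)
                    print substr($i, length(tag) + 2)
        }' | sort -u
}

# With samtools installed:
#   samtools view -H my.bam | sq_tag UR   # where the reference came from
#   samtools view -H my.bam | sq_tag AS   # assembly identifier
```

If UR prints a single URL, that is most likely the file to download.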

Written 7.6 years ago by Drio (910)

There is a header. I found something like this:

@SQ SN:chr1_random LN:1663265 AS:HG18 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly18.fasta M5:cc05cb1554258add2eb62e88c0746394 SP:Homo sapiens

So should I download this file as the reference FASTA?

Reply written 7.6 years ago by Zhshqzyc (470)

Yes, that's exactly what you want to do.
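
Once the file is downloaded and indexed, one way to double-check that it really matches the BAM is to compare the contig names and lengths in the header against the .fai index. This is only a sketch: check_contigs and the file names header.txt and my.bam are hypothetical, and it assumes you have saved the header to a text file first.

```shell
# check_contigs HEADER FAI
# HEADER: output of `samtools view -H my.bam` saved to a file.
# FAI:    index written by `samtools faidx` (name and length are columns 1-2).
# Prints every @SQ contig whose name or length does not match the index
# and returns non-zero if any mismatch was found.
check_contigs() {
    awk -F '\t' '
        FILENAME == ARGV[1] { len[$1] = $2; next }   # first file: the .fai
        /^@SQ/ {
            sn = ""; ln = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) sn = substr($i, 4)
                if ($i ~ /^LN:/) ln = substr($i, 4)
            }
            if (!(sn in len))       { print "missing: " sn; bad = 1 }
            else if (len[sn] != ln) { print "length differs: " sn; bad = 1 }
        }
        END { exit bad }
    ' "$2" "$1"
}

# Typical use, with samtools installed:
#   samtools faidx Homo_sapiens_assembly18.fasta
#   samtools view -H my.bam > header.txt
#   check_contigs header.txt Homo_sapiens_assembly18.fasta.fai && echo "reference matches"
```

No output and a zero exit status mean every contig in the BAM header is present in the reference with the same length.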

Reply written 7.6 years ago by Chris Miller (20k)