Question: BAM to individual chromosome fasta/fastq
Asked 4.7 years ago by win (India):

Hi all,

I have a paired-end BAM file that I want to convert to individual per-chromosome FASTA or FASTQ files. We tried EMBOSS 6.6, but it did not work. Can someone please explain how this can be accomplished?

Thanks.

bam fasta • 2.3k views
modified 4.7 years ago by David Langenberger • written 4.7 years ago by win
Answer, 4.7 years ago by David Langenberger (Germany):

I'm not completely sure I understood your question correctly, but the following command writes all reads that were mapped to chr1 to a FASTQ file. The BAM has to be coordinate-sorted and indexed for the region query to work. To do the same for the other chromosomes, read the header, loop over the chromosome names, and replace 'chr1' with a variable.

samtools view -ub INPUT.bam chr1 | samtools bam2fq - > chr1.fastq
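
The loop over the header could look roughly like the sketch below. It assumes INPUT.bam is coordinate-sorted and indexed, and that the reference names are safe to use as file names. The function names list_chromosomes and split_bam_by_chromosome are made up for illustration; the samtools calls are the same ones as in the command above. Note that a read whose mate mapped to a different chromosome ends up in a different file than its mate.

```shell
#!/usr/bin/env bash
# Sketch only: one FASTQ per reference sequence named in the BAM header.
# Assumption: the BAM is coordinate-sorted and indexed (samtools index INPUT.bam).

# Read SAM header text on stdin and print the SN: value of every @SQ line.
list_chromosomes() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }'
}

# Hypothetical helper: write <chrom>.fastq for each chromosome in the BAM.
split_bam_by_chromosome() {
    local bam=$1 chrom
    samtools view -H "$bam" | list_chromosomes | while read -r chrom; do
        samtools view -ub "$bam" "$chrom" | samtools bam2fq - > "$chrom.fastq"
    done
}

# Usage: split_bam_by_chromosome INPUT.bam
```

Nothing runs until split_bam_by_chromosome is called, so the header parsing can be checked on its own first, e.g. with samtools view -H INPUT.bam | list_chromosomes.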

Just a follow-up question. The BAM is aligned to a particular reference. If we generate individual chromosome BAM files, convert each one to FASTQ, and then align each chromosome's FASTQ to another reference, I am going to mess things up, since the mapping could differ: a sequence that mapped to chromosome 1 in one reference could map to a different chromosome in another.

Reply written 4.7 years ago by win

Well, that depends... are you talking about different species? Within one species, the assemblies should not differ that much. If you map your reads against hg18 and hg19, for example, the best hits should show up on the same chromosomes. If you map them against hg19 and ce10, they will map to different chromosomes. ;)

Reply written 4.7 years ago by David Langenberger

It's going to be the same species, human. Even if some sequences were remapped to another chromosome, the overall effect would be the same once I merged all the new BAMs.

Reply written 4.7 years ago by win

I would highly recommend mapping the reads against the complete genome and not chromosome by chromosome. Correctly merging all the BAMs is a pain in the a...
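
For the whole-genome route, one possible way to get paired FASTQ files out of the complete BAM for remapping is sketched below. This is an assumption about the workflow, not something spelled out in this thread: it relies on a reasonably recent samtools that provides collate and fastq, and the function name bam_to_paired_fastq and the _R1/_R2 output names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: turn a whole paired-end BAM back into two FASTQ files
# so the reads can be remapped against a complete reference genome.

# Hypothetical helper: bam_to_paired_fastq <in.bam> <output prefix>
bam_to_paired_fastq() {
    local bam=$1 prefix=$2
    # collate groups mates next to each other; fastq then splits them
    # into read 1 and read 2 files and discards singletons and other reads.
    samtools collate -u -O "$bam" |
        samtools fastq -1 "${prefix}_R1.fastq" -2 "${prefix}_R2.fastq" \
            -0 /dev/null -s /dev/null -n -
}

# Usage: bam_to_paired_fastq INPUT.bam sample
```

The two resulting files can then be given to the aligner of your choice together with the full reference, which avoids the per-chromosome merge entirely.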

 

Reply written 4.7 years ago by David Langenberger