How to work with a BAM file that has an inaccessible reference genome
11 months ago
Ak ▴ 60

I have come across a BAM file of a parasite genome and am trying to identify variants in it.

As I could not find the reference genome that was used for the alignment, I was thinking of converting the BAM file to FASTQ format, with the reads split into R1, R2 and singletons, and then aligning them to the reference genome that I have.

But I am wondering if this is feasible? Could anyone give some advice? Thanks.

Genome fastq alignment Reference bam • 486 views

@ Ak Why did you delete this post?

I noticed that there are actually a lot of similar questions out there (e.g. how to convert BAM to FASTQ); I had just asked mine in a more roundabout way. So I figured it was redundant and that I might as well delete it.

In that case, add an answer with links to one or more posts that you found useful and maybe add some text about how you found these posts. Then, accept that answer. You have learned how to use the forum better, and that knowledge could be useful to others.

10 months ago
Ak ▴ 60

Thanks all for the help and advice. I eventually used samtools fastq to extract the reads, after first running samtools collate so that reads with the same name are grouped together:

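# Group reads by name so that mates are adjacent (-u: uncompressed output)
samtools collate -u -o input.collate.bam input.bam
# -1/-2: first/second mates; -s: singletons; -0: reads flagged as both or neither READ1/READ2; -n: leave read names as they are (no /1 or /2 suffix)
samtools fastq -1 paired1.fq -2 paired2.fq -0 unmapped.fq -s singletons.fq -n input.collate.bam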
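
Before realigning, it may be worth checking that the two paired files are still in sync, since aligners generally expect the same number of reads, in the same order, in both files. The following is only a rough sketch of such a check, not part of what I actually ran: the function names count_fastq_records and check_pairs_in_sync are made up for illustration, and it assumes uncompressed FASTQ with exactly four lines per record, which is the layout samtools fastq writes.

```shell
# Hypothetical helper: count records in an uncompressed FASTQ file,
# assuming exactly 4 lines per record (no wrapped sequence lines).
count_fastq_records() {
  awk 'END { print NR / 4 }' "$1"
}

# Hypothetical helper: print both counts and return success only if the
# two mate files contain the same number of records.
check_pairs_in_sync() {
  local n1 n2
  n1=$(count_fastq_records "$1")
  n2=$(count_fastq_records "$2")
  echo "R1=$n1 R2=$n2"
  [ "$n1" -eq "$n2" ]
}
```

For the files above this would be called as `check_pairs_in_sync paired1.fq paired2.fq`. It only compares record counts, so it would not catch mates that are present but in a different order.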