Flagstat interpretation from a BAM file
7.3 years ago
fi1d18 ★ 4.1k

Sorry friends,

I have a BAM file, but my adviser believes that, for some reason, the results do not seem significant. I ran samtools flagstat on it (output below):

5541714 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
5541714 + 0 mapped (100.00%:-nan%)
0 + 0 paired in sequencing
0 + 0 properly paired (-nan%:-nan%)
0 + 0 with itself and mate mapped
0 + 0 singletons (-nan%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

Could you please tell me how good the alignment is?

Thank you

bam samtools flagstat • 3.3k views

Well, based only on this flagstat output, the alignment looks perfect: 100% of the reads are mapped. I suspect that this is a post-processed BAM file (not the raw alignment output), so we cannot tell you much about the alignment from it. Also, what do you mean by not significant? Significant in terms of what? Are you doing exome sequencing? RNA sequencing? Is it paired-end data?
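
To make that reading of the numbers concrete, here is a minimal sketch that parses the flagstat text posted above and reports the mapped fraction and the apparent library layout. The helper names (`parse_flagstat`, `lookup`, `summarize`) are made up for illustration, and the sketch assumes the plain-text flagstat layout shown in the question.

```python
import re

# The flagstat output from the question, pasted as plain text.
FLAGSTAT_TEXT = """\
5541714 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
5541714 + 0 mapped (100.00%:-nan%)
0 + 0 paired in sequencing
0 + 0 properly paired (-nan%:-nan%)
0 + 0 with itself and mate mapped
0 + 0 singletons (-nan%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
"""

# Each line has the form "<QC-passed> + <QC-failed> <description>".
LINE_RE = re.compile(r"^(\d+) \+ (\d+) (.+)$")


def parse_flagstat(text):
    """Return {description: qc_passed_count} for every flagstat line."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            passed, _failed, label = match.groups()
            counts[label] = int(passed)
    return counts


def lookup(counts, prefix):
    """Find the count whose description starts with the given prefix."""
    for label, value in counts.items():
        if label.startswith(prefix):
            return value
    raise KeyError(prefix)


def summarize(counts):
    total = lookup(counts, "in total")
    mapped = lookup(counts, "mapped")
    paired = lookup(counts, "paired in sequencing")
    return {
        "total": total,
        "unmapped": total - mapped,
        "mapped_fraction": mapped / total if total else float("nan"),
        # No reads flagged as paired suggests single-end data (an inference,
        # not something flagstat states directly).
        "layout": "paired-end" if paired > 0 else "single-end",
    }


summary = summarize(parse_flagstat(FLAGSTAT_TEXT))
print(summary)
```

Zero unmapped reads is what hints that unmapped reads were filtered out at some point; flagstat on its own says nothing about mismatches or multi-mapping.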

Thanks Sam,

I am doing ribo-seq, and this BAM file was produced from the footprints. But when I trimmed the adapters (Illumina TruSeq Universal Adapter), the fraction of trimmed reads was very low. My supervisor then said the results may be affected by bad trimming or multi-alignment and asked me to check the bowtie2 output. From Biostars I learned that I can check the alignment quality with flagstat.

Now I don't know whether the reads aligned properly or not.
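
On the multi-alignment concern: flagstat does not report it, but the alignment records themselves carry some hints. The sketch below counts, from SAM-formatted text lines, how many mapped reads carry bowtie2's `XS:i` tag (written when a second-best alignment was found) and how many have a very low MAPQ. The records here are hypothetical examples built in the script, the function names are invented, and the `low_mapq=1` cutoff is an assumption rather than an established threshold; real input would be the text output of something like `samtools view`.

```python
from collections import Counter


def sam_record(name, flag, mapq, tags=()):
    """Build a hypothetical 30-nt SAM record (for illustration only)."""
    unmapped = bool(flag & 0x4)
    rname, pos, cigar = ("*", "0", "*") if unmapped else ("chr1", "100", "30M")
    return "\t".join(
        [name, str(flag), rname, pos, str(mapq), cigar, "*", "0", "0",
         "A" * 30, "I" * 30, *tags]
    )


EXAMPLE_SAM_LINES = [
    "@HD\tVN:1.0\tSO:unsorted",
    sam_record("r1", 0, 42, ["AS:i:0"]),             # unique hit
    sam_record("r2", 0, 1, ["AS:i:0", "XS:i:0"]),    # equally good second hit
    sam_record("r3", 16, 0, ["AS:i:-5", "XS:i:-5"]), # equally good second hit
    sam_record("r4", 4, 0),                          # unmapped
]


def classify_alignments(sam_lines, low_mapq=1):
    """Count mapped reads, reads with a second-best hit, and low-MAPQ reads."""
    summary = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, mapq, tags = int(fields[1]), int(fields[4]), fields[11:]
        if flag & 0x4:
            summary["unmapped"] += 1
            continue
        if flag & 0x900:
            continue  # ignore secondary and supplementary records
        summary["mapped"] += 1
        if any(tag.startswith("XS:i:") for tag in tags):
            summary["with_second_best_hit"] += 1
        if mapq <= low_mapq:
            summary["low_mapq"] += 1
    return summary


print(classify_alignments(EXAMPLE_SAM_LINES))
```

A large share of reads with `XS:i` and low MAPQ would suggest many footprints map to more than one place, which is worth knowing before interpreting downstream results.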

So, to be clear, what you have done is:

1. Trimming
2. Alignment (bowtie2)
3. Flagstat

Have you run FastQC to see what the read quality looks like? Also, when you say the rate of trimmed reads was too low, do you mean that only a few reads needed to be trimmed?

Thanks,

I mean that in the trimming results, only 4 percent of the reads contained the adapter. In the FastQC results, the overrepresented sequences varied from file to file. And yes, I did the steps you mentioned.

Actually, you should try running FastQC both before and after trimming. That should tell you whether there are any overrepresented sequences at the start of your reads. If there is only a small amount of overrepresented sequence (or adapters, which FastQC can sometimes detect), then it is normal to trim only 4% of your reads.
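
As a quick cross-check alongside FastQC, one could also count directly how many raw reads contain the adapter. This is a rough sketch, not a replacement for a real trimmer: it assumes an uncompressed, unwrapped four-line FASTQ, it only finds full exact matches (so partial adapters at the read end are missed), and the `AGATCGGAAGAGC` prefix is an assumed example sequence. Ribo-seq kits may use a different 3' linker, so substitute whatever your protocol specifies. The function name `adapter_fraction` is invented here.

```python
import io

# Assumed example adapter prefix; replace with the sequence from your protocol.
ADAPTER_PREFIX = "AGATCGGAAGAGC"


def adapter_fraction(fastq_handle, adapter=ADAPTER_PREFIX):
    """Return the fraction of reads whose sequence contains the adapter."""
    total = 0
    with_adapter = 0
    for index, line in enumerate(fastq_handle):
        if index % 4 != 1:
            continue  # the sequence is the 2nd line of each 4-line record
        total += 1
        if adapter in line.strip().upper():
            with_adapter += 1
    return with_adapter / total if total else 0.0


def fastq_record(name, seq):
    """Build a hypothetical FASTQ record with a dummy quality string."""
    return f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n"


# Four made-up reads, one of which runs into the adapter.
EXAMPLE_FASTQ = (
    fastq_record("r1", "ACGTACGTACGTACGTACGTACGTACGT" + ADAPTER_PREFIX)
    + fastq_record("r2", "ACGT" * 10)
    + fastq_record("r3", "TTGCA" * 8)
    + fastq_record("r4", "GGCAT" * 8)
)

print(adapter_fraction(io.StringIO(EXAMPLE_FASTQ)))
```

With a real file, pass an open file handle instead of the `io.StringIO` object. If the fraction found this way is much higher than what the trimmer reported, the trimmer was probably given the wrong adapter sequence.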

Usually, with exome sequencing and RNA-seq, if the data are good (no read-through), the FASTQ files don't contain adapter sequences, so there is no need for trimming. However, as I have never worked with ribo-seq data, I am not sure whether that is the case for you.

Thank you, Sam.