BBMap output meaning
4.5 years ago
Tania ▴ 140
Hi All

Does anyone understand the following top lines of the BBMap ihist output?
I used some previous Biostars posts to run it, but I can't work out what #InsertSize and Count mean.


Does it mean the first insert size is 77? If so, can we have insert sizes of 1, as at the end of the output?

Here is my output:
#Mean   193.157
#Median 168
#Mode   145
#STDev  396.478
#PercentOfPairs 92.468
#InsertSize     Count
1       77
2       87
3       105
4       107
5       98
6       92
7       94
8       103
9       121
10      112
..... .... ... ..

29754   1
29919   1
30005   1
30062   1
30481   1
30732   1
31142   1

bbmap rnaseq • 1.8k views

Can you provide the full command line used?


I used the following command, which I found somewhere on Biostars:

 ./bbmap.sh -Xmx50g  in1=Sread1.fastq.gz in2=Sread2.fastq.gz ihist=histmap1m.txt reads=1000000 ref=human_genome.fa


I am not sure why you have insert sizes that small, and with quite a few counts as well. Has this data been trimmed? You probably should have set minlen=25 when you trimmed the data to avoid getting very short reads.


Sorry to bother you again, but which column holds the insert sizes? Is it the Count column? This is the raw data, before any processing.

#InsertSize     Count


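The first column should be the insert size, and the second column is the count of fragments with that insert size. So the row "1 77" means that 77 fragments had an insert size of 1, not that the first insert size is 77.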
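To make the two-column layout concrete, here is a minimal Python sketch that reads an ihist-style file and separates the "#" header statistics from the (insert size, count) rows. It assumes the file is whitespace-separated exactly as pasted in the question; `parse_ihist` and `summarize` are hypothetical helper names for illustration, not part of BBMap, and the sample only contains the first ten rows from the post.

```python
import io

# First ten rows of the histogram from the question (header values copied as posted).
SAMPLE = """#Mean\t193.157
#Median\t168
#Mode\t145
#STDev\t396.478
#PercentOfPairs\t92.468
#InsertSize\tCount
1\t77
2\t87
3\t105
4\t107
5\t98
6\t92
7\t94
8\t103
9\t121
10\t112
"""


def parse_ihist(handle):
    """Return (header_stats, rows) from an ihist-style text stream.

    header_stats maps names such as "Mean" to floats;
    rows is a list of (insert_size, count) integer tuples.
    """
    header, rows = {}, []
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("#"):
            # "#InsertSize Count" is the column header, not a statistic.
            if fields[0] != "#InsertSize" and len(fields) == 2:
                header[fields[0].lstrip("#")] = float(fields[1])
            continue
        # Data rows: column 1 = insert size, column 2 = number of pairs.
        if len(fields) == 2 and fields[0].isdigit() and fields[1].isdigit():
            rows.append((int(fields[0]), int(fields[1])))
    return header, rows


def summarize(rows):
    """Return (total pairs, most frequent insert size) for the given rows."""
    total = sum(count for _, count in rows)
    mode_size = max(rows, key=lambda row: row[1])[0]
    return total, mode_size


header, rows = parse_ihist(io.StringIO(SAMPLE))
size, count = rows[0]
print(f"{count} pairs have insert size {size}")
print("pairs in this excerpt, most frequent size:", summarize(rows))
```

Because the sample is only an excerpt, the summary it prints will not match the #Mode or #Mean lines, which BBMap computes over the whole histogram. To run it on a real file, pass `open("histmap1m.txt")` to `parse_ihist` instead of the `io.StringIO` wrapper.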

Have you scanned/trimmed this data? Can you try that?

I will tag @Brian Bushnell on this thread so he can offer his take. Please be aware that he has been busy of late and it may be a few days before he may look at this.


Edited: Thanks genomax, much appreciated! I will use the trimmed data and see the difference. But does this also mean we have insert sizes that are too big at the end of the output? Does that make sense?


Tagging: Brian Bushnell