plotting read length distribution of Single End data
4 months ago
Meghan.T ▴ 10

So I have a rather obvious question that's been bugging me for a few days. I am trying to plot the read length distribution of single-end sequencing data, essentially to get a better understanding of the fragment length distribution. With paired-end data I see there are tools like Picard CollectInsertSizeMetrics that plot the fragment length distribution. I am looking to produce a figure like the one below but don't know what tool to use.

[Image not shown: example length distribution plot]

read_length_distribution single_end_sequencing WGS
4 months ago
GenoMax 154k

You can't determine fragment length from single-end sequencing data, since you are missing information about the sequence at the 3'-end of that fragment. There is one exception: if your inserts are shorter than the read length, then even a single-end read will run into the adapter sequence at its 3'-end, so you should be able to determine the fragment/insert size.
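To illustrate the exception, here is a minimal Python sketch of the idea: if the adapter shows up inside a read, everything before it is the insert. The function name is hypothetical, and the default adapter (the common Illumina adapter prefix `AGATCGGAAGAGC`) is an assumption; substitute whatever adapter your library prep actually used.

```python
def insert_size_from_adapter(read, adapter="AGATCGGAAGAGC"):
    """Return the inferred insert length, or None if no adapter is found.

    Assumption: the adapter appears as an exact, full-length match.
    The default adapter is the common Illumina adapter prefix and may
    not match your library.
    """
    pos = read.find(adapter)
    # The bases before the adapter start are the insert itself.
    return pos if pos != -1 else None
```

This only catches exact matches. Real adapter trimmers such as BBDuk or cutadapt also tolerate mismatches and adapters truncated at the read end, so treat this as an illustration rather than a replacement for them.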

I am trying to plot read length distribution of a Single end sequencing data,

Read length distribution is something different. You can use a tool from BBTools to get the length/count list, which you can then plot.

$ reformat.sh -Xmx4g in=test.fastq lhist=readstats.txt

$ more readstats.txt 
#Length Count
156     1
175     1
203     1
221     1
233     1
242     1
246     1
266     1
267     1
273     1
274     1
278     1
279     2
282     1
286     1
292     1
297     1
298     3
299     6
300     11
301     62
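
One way to turn that length/count list into a figure is a short Python script. This is only a sketch: it assumes the file is whitespace-delimited with `#`-prefixed header lines as shown above, that matplotlib is installed, and the function names and output file name are made up for illustration.

```python
from pathlib import Path


def read_length_histogram(path):
    """Parse a two-column length/count table, skipping '#' header lines."""
    lengths, counts = [], []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        lengths.append(int(fields[0]))
        counts.append(int(fields[1]))
    return lengths, counts


def plot_histogram(lengths, counts, out_png="read_lengths.png"):
    """Draw one bar per read length and save the figure as a PNG."""
    import matplotlib
    matplotlib.use("Agg")  # render to file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(lengths, counts, width=1.0)
    ax.set_xlabel("Read length (bp)")
    ax.set_ylabel("Number of reads")
    fig.savefig(out_png, dpi=150)
    plt.close(fig)


# Example usage (file names are placeholders):
# lengths, counts = read_length_histogram("readstats.txt")
# plot_histogram(lengths, counts)
```

If most reads sit at a single length, as in the output above, a log-scaled y-axis (`ax.set_yscale("log")`) may make the shorter reads easier to see.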