                    8.3 years ago
        t2g4free
        
    
After QC, my reads have different lengths, and I want to get the read length distribution. Any suggestions?
Using gnuplot (read lengths are binned to the nearest 10 below):
$ curl -sL "https://raw.githubusercontent.com/MedicineAndTheMicrobiome/AnalysisTools/master/FASTQ/Split_Fastq/Example.fastq" | \
  paste - - - - | \
  awk -F '\t'  '{printf("%d\n",10*int(length($2)/10.0));}' |\
  sort |uniq -c | sort -n |\
  gnuplot -e "set terminal dumb 80 50 ; set title 'Fastq sequence len.'; set xlabel 'length'; set auto x;set style data histogram; plot '-'  using 1:xticlabels(2)  with lines notitle;"
                               Fastq sequence len.
  20 ++---+----+----+----+----+----+----+-----+----+----+----+----+----+---+*
     +    +    +    +    +    +    +    +     +    +    +    +    +    +    *
     |                                                                      *
     |                                                                      *
  18 ++                                                                    *+
     |                                                                     *|
     |                                                                     *|
     |                                                                     *|
     |                                                                     *|
  16 ++                                                                    *+
     |                                                                     *|
     |                                                                    * |
     |                                                                    * |
  14 ++                                                                   *++
     |                                                                    * |
     |                                                                    * |
     |                                                                    * |
  12 ++                                                                   *++
     |                                                                   *  |
     |                                                                   *  |
     |                                                                   *  |
     |                                                                   *  |
  10 ++                                                                  * ++
     |                                                                   *  |
     |                                                                  *   |
     |                                                                  *   |
   8 ++                                                                 *  ++
     |                                                                  *   |
     |                                                                  *   |
     |                                                                  *   |
   6 ++                                                                 *  ++
     |                                                                 *    |
     |                                                                 *    |
     |                                                                 *    |
   4 ++                                                                *   ++
     |                                                               **     |
     |                                                             **       |
     |                                            *****************         |
     |                                          **                          |
   2 ++                 ************************                           ++
     |                **                                                    |
     *****************                                                      |
     +    +    +    +    +    +    +    +     +    +    +    +    +    +    +
   0 ++---+----+----+----+----+----+----+-----+----+----+----+----+----+---++
    100  120  170  240  110  140  160  180   230  190  200  210  220  150  250
                                     length
                    
                
reformat.sh from the BBMap suite, used like this: reformat.sh in=your.fq lhist=filename_you_want.txt (lhist writes the read length histogram; ihist is the insert-size histogram for paired reads)
Use FastQC to check the read length distribution. If you want to retain only reads of a certain length after QC trimming, use cutadapt.
+1 for creativity. The BBMap solution may be easier for a giant file.
The X-axis goes "100, 120, 170, 240, 110, 140" etc. I suspect that's not intentional... but it might be related to the two odd jaggies in the graph.
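The out-of-order axis most likely comes from the final `sort -n` in the pipeline: it runs after `uniq -c`, so it sorts the lines by count rather than by length bin. A minimal sketch of one way to keep the bins in length order is to sort numerically before counting. The function name `fastq_length_hist` is made up for illustration, and the sketch assumes plain 4-line FASTQ records (no wrapped sequences):

```shell
# Hypothetical helper: reads FASTQ on stdin, prints "length_bin<TAB>count",
# ordered by length bin. Assumes each record is exactly 4 lines.
fastq_length_hist() {
  paste - - - - |                                           # one record per line, tab-separated
    awk -F '\t' '{ print 10 * int(length($2) / 10) }' |     # bin the sequence length by 10
    sort -n |                                               # order by length bin first...
    uniq -c |                                               # ...then count each bin
    awk '{ print $2 "\t" $1 }'                              # emit bin, then count
}
```

Used as `fastq_length_hist < your.fq`, the output columns are bin then count, so a gnuplot call along the lines of the one above would need `using 2:xticlabels(1)` instead of `using 1:xticlabels(2)`.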