Read length for de Bruijn graph
6.4 years ago
faraz.k89 • 0

Hi everyone, I am trying to build a de Bruijn graph from short reads. A small fraction of my reads (only 0.01%) are shorter than 10 bp. Will those reads, few as they are, cause problems for graph building?

The stats I am getting for graph building are:

bank                                    
            bank_uri                                 : SRR2847385_interleaved.fasta,SRR2847386_interleaved.fasta
            bank_size                                : 117637467610
            bank_total_nt                            : 89977612926
            sequences                               
                seq_number                               : 455322542
                seq_size_min                             : 1
                seq_size_max                             : 250
                seq_size_mean                            : 197.6
                seq_size_deviation                       : 55.4
            kmers                                   
                kmers_nb_valid                           : 80415655424
                kmers_nb_invalid                         : 3772037
        stats                                   
            histogram                               
                cutoff                                   : 23
                nb_ge_cutoff                             : 332423627
                first_peak                               : 91
            kmers                                   
                solidity_kind                            : sum
                thresholds                               : 3 3 
                kmers_nb_distinct                        : 931511370
                kmers_nb_solid                           : 452752530
                kmers_nb_weak                            : 478758840
                kmers_percent_weak                       : 51.4

As you can see, a large number of the k-mers are valid. Do you think the graph builder just ignores reads below a certain length?

Thanks in advance. Faraz.

Assembly genome sequence • 1.1k views

Whether a read contributes to the graph depends on the k-mer size specified for the graph: a read shorter than k contains no k-mers at all, so it adds nothing to the graph.
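To illustrate the dependence on k, here is a minimal Python sketch. It is not taken from the graph-building tool used above; the function names `kmer_count` and `extract_kmers`, the toy reads, and the choice of k = 5 are all made up for illustration. A read of length L yields L - k + 1 k-mers when L >= k, and none otherwise.

```python
def kmer_count(read_length, k):
    """Number of k-mers a read of this length contributes (0 if shorter than k)."""
    return max(0, read_length - k + 1)


def extract_kmers(read, k):
    """Return every k-mer in the read; the list is empty when len(read) < k."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


# Hypothetical toy reads: one long enough for k = 5, two that are too short.
reads = ["ACGTACGTAC", "ACG", "A"]
k = 5

for read in reads:
    kmers = extract_kmers(read, k)
    print(f"length={len(read):>3}  k-mers={kmer_count(len(read), k)}  {kmers}")
```

Under this counting, reads shorter than k simply drop out of k-mer enumeration rather than adding spurious nodes. How a particular assembler reports or handles them is something to confirm in its own documentation.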
