Question: Bfast Match Paired End Reads - Reports Half Total Number Of Reads
7.9 years ago by
Kenneth Daily wrote:

I'm using bfast 0.7.0a and testing on the paired-end data from the bfast user manual (Figure 5.4 in bfast-book.pdf). According to the manual, the two reads of a pair should appear consecutively in the FASTQ file (read1R1, read1R2, read2R1, read2R2, etc.), and both reads of a pair should use the same name. The example data in the manual therefore has 4 reads, i.e. two pairs. When I run bfast match:

bfast match -A 0 -t -n 16 -f hg19.fa -i 1 -r bfast_book.fastq
Checking input parameters supplied by the user ...
Validating fastaFileName hg19.fa.
Validating readsFileName bfast_book.fastq.
Validating tmpDir path ./.
**** Input arguments look good!
Printing Program Parameters:
programMode:                            [ExecuteProgram]
fastaFileName:                          hg19.fa
mainIndexes                             1
secondaryIndexes                        [Not Using]
readsFileName:                          bfast_book.fastq
offsets:                                [Using All]
loadAllIndexes:                         [Not Using]
compression:                            [Not Using]
space:                                  [NT Space]
startReadNum:                           1
endReadNum:                             2147483647
keySize:                                [Not Using]
maxKeyMatches:                          8
keyMissFraction:                        1.000000
maxNumMatches:                          384
whichStrand:                            [Both Strands]
numThreads:                             16
queueLength:                            250000
tmpDir:                                 ./
timing:                                 [Using]
Searching for main indexes...
Found 1 index (1 file).
Not using secondary indexes.
Reading in reference genome from hg19.fa.nt.brg.
In total read 85 contigs for a total of 3101810128 bases
Reading bfast_book.fastq into a temp file.
Will process 2 reads.
Searching index file 1/1 (index #1, bin #1)...

There are four reads in this file, so why does it say that it will process two reads? I've tested this on my own data as well. I just want to make sure that it is really aligning all four reads (as far as I can tell, bfast match is not paired-read aware; pairing is handled in the postprocess step).
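To check the input independently of bfast, here is a minimal Python sketch that counts the records in the FASTQ file and the groups of consecutive records sharing a name. It assumes plain four-line FASTQ records (no wrapped sequences) and the layout described above, where both reads of a pair carry the same name. The function name `count_reads_and_pairs` is hypothetical and not part of bfast.

```python
from itertools import groupby


def count_reads_and_pairs(lines):
    """Return (number of FASTQ records, number of consecutive same-name groups).

    Assumption: each record is exactly four lines (header, sequence, '+', qualities).
    """
    lines = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")

    names = []
    for i in range(0, len(lines), 4):
        header = lines[i]
        if not header.startswith("@") or len(header) < 2:
            raise ValueError(f"malformed FASTQ header: {header!r}")
        # Read name = header without '@', up to the first whitespace.
        names.append(header[1:].split()[0])

    # Consecutive records with the same name are treated as one fragment.
    groups = [len(list(g)) for _, g in groupby(names)]
    return len(names), len(groups)


if __name__ == "__main__":
    import sys

    # Usage: python count_fastq.py bfast_book.fastq
    with open(sys.argv[1]) as handle:
        n_records, n_groups = count_reads_and_pairs(handle)
    print(f"{n_records} records, {n_groups} name groups")
```

For a file laid out as in the manual, this should report four records in two name groups. Whether the "Will process 2 reads" message counts name groups rather than individual records is not something this sketch can confirm.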

Thank you!

written 7.9 years ago by Kenneth Daily

I would not get hung up on what the message says. It could simply be a slight miscommunication: they may have meant 2 pairs.

Instead look at the output SAM file and see how many reads and pairs have been aligned.
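A minimal Python sketch of that check, assuming the final output is an uncompressed, tab-delimited SAM file. It relies only on the FLAG bits defined in the SAM specification (0x1 paired, 0x4 unmapped, 0x40 first in pair, 0x80 second in pair) and skips secondary and supplementary records so that each read is counted once. The function name `summarize_sam` is hypothetical.

```python
def summarize_sam(lines):
    """Count primary SAM records by pairing and mapping status using FLAG bits."""
    counts = {"records": 0, "paired": 0, "read1": 0, "read2": 0, "mapped": 0}
    for line in lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        if flag & 0x900:
            # 0x100 = secondary, 0x800 = supplementary: not a distinct read.
            continue
        counts["records"] += 1
        if flag & 0x1:
            counts["paired"] += 1
            if flag & 0x40:
                counts["read1"] += 1
            if flag & 0x80:
                counts["read2"] += 1
        if not flag & 0x4:
            counts["mapped"] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize_sam.py output.sam
    with open(sys.argv[1]) as handle:
        for key, value in summarize_sam(handle).items():
            print(f"{key}: {value}")
```

If all four input reads made it through, `records` should be 4 with `read1` and `read2` at 2 each, regardless of how many of them actually mapped.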

written 7.9 years ago by Istvan Albert

You're right. I haven't taken it through the next steps (localalign and postprocess) yet; I will do so.


written 7.9 years ago by Kenneth Daily
