Question: Mapping FASTQ files that contain paired reads and reads whose mate is missing
5.0 years ago by
vigprasud60
United States
vigprasud60 wrote:

I have FASTQ files that contain paired reads as well as reads whose mate is absent.

E.g.:

R1.fq
@HWI:xxxxxxxxxxxxx/1
@HWI:xxxxxxxxxxxxy

R2.fq
@HWI:xxxxxxxxxxxxx/2
@HWI:xxxxxxxxxxxyx

When I try to align them using bwa, it throws an error saying that a member of the mate pair is absent.

How do I align these FASTQ files? Note that they were created from previously aligned BAM files using Picard tools.

bwa alignment mapped reads • 2.2k views
modified 5.0 years ago by geek_y10.0k • written 5.0 years ago by vigprasud60
5.0 years ago by
SES8.2k
Vancouver, BC
SES8.2k wrote:

You can use Pairfq to fix your paired-end reads, specifically the pairfq makepairs command (more info on the wiki). I have tried the other solution mentioned, and this one is more efficient and more general: it accepts multi-line FASTA/Q as input, the files can be compressed, and it is not restrictive about the read name. To be clear, I developed this tool, and I did so because I got tired of repeating awk commands, cleaning up intermediate files, fixing the file names after the commands, modifying the commands for different inputs, etc. There is also a script included that has no dependencies, so you should be able to use it anywhere you can use awk.

EDIT: To be clear on the last part, this is all you need:

curl -L git.io/pairfq_lite > pairfq_lite
chmod +x pairfq_lite
./pairfq_lite -h

The last command will just print the usage menu. All the documentation is available at the command line (with ./pairfq_lite -m) and there is more information on the wiki. I hope that makes it easier to use.
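For readers who want to see the idea behind re-pairing, here is a minimal Python sketch. It is not Pairfq's implementation; the function names (`read_fastq`, `base_name`, `make_pairs`) are made up for illustration, and it assumes plain 4-line FASTQ records, no blank lines, and mates that differ only by a trailing /1 or /2. It holds all of R2 in memory, so it only suits files that fit in RAM.

```python
import io
import re

def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual

def base_name(header):
    """Drop the leading '@', anything after whitespace, and a trailing /1 or /2."""
    name = re.split(r"\s", header[1:], maxsplit=1)[0]
    return re.sub(r"/[12]$", "", name)

def make_pairs(r1_handle, r2_handle):
    """Return (paired_r1, paired_r2, single_r1, single_r2) as lists of records.

    Paired records come out in R1 order, so the two paired lists are in sync.
    """
    r2_by_name = {base_name(rec[0]): rec for rec in read_fastq(r2_handle)}
    paired_r1, paired_r2, single_r1 = [], [], []
    for rec in read_fastq(r1_handle):
        mate = r2_by_name.pop(base_name(rec[0]), None)
        if mate is None:
            single_r1.append(rec)
        else:
            paired_r1.append(rec)
            paired_r2.append(mate)
    single_r2 = list(r2_by_name.values())  # R2 reads never claimed by an R1 read
    return paired_r1, paired_r2, single_r1, single_r2

# Toy input: A is paired, B exists only in R1, C exists only in R2.
r1 = io.StringIO("@A/1\nACGT\n+\nIIII\n@B/1\nGGCC\n+\nIIII\n")
r2 = io.StringIO("@C/2\nTTAA\n+\nIIII\n@A/2\nTGCA\n+\nIIII\n")
p1, p2, s1, s2 = make_pairs(r1, r2)
print([rec[0] for rec in p1], [rec[0] for rec in p2])  # ['@A/1'] ['@A/2']
print([rec[0] for rec in s1], [rec[0] for rec in s2])  # ['@B/1'] ['@C/2']
```

Writing the four lists to four files gives the same layout discussed in this thread: two synchronized paired files plus one singleton file per mate.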

modified 4.7 years ago • written 5.0 years ago by SES8.2k

Thanks. The tool created a merged R1_R2 paired file and another file with orphan reads. Have you written another script that splits the merged file into R1 and R2? Alignment needs two separate files [R1 and R2].

written 5.0 years ago by vigprasud60

The pairfq makepairs command creates separate files that are in order, and the pairfq joinpairs command will interleave the pairs. The command you want for splitting the pairs from an interleaved file is pairfq splitpairs (see the wiki page for that command for more info). For reference, you can type pairfq to list all the commands, and a description of the basic usage can be found on the wiki home page. Feel free to ask me questions, or post them online under the "issues" tab.
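To make the difference between interleaving and splitting concrete, here is a small Python sketch of both operations. It only illustrates the file layout and is not how Pairfq does it; the function names are hypothetical, and it assumes 4-line FASTQ records with every line newline-terminated.

```python
import io
from itertools import islice

def records(handle):
    """Yield each FASTQ record as a list of its 4 lines (assumes 4-line records)."""
    while True:
        rec = list(islice(handle, 4))
        if len(rec) < 4:
            return
        yield rec

def interleave(r1_handle, r2_handle, out_handle):
    """Write R1 and R2 records alternately; inputs must already be in the same order."""
    for rec1, rec2 in zip(records(r1_handle), records(r2_handle)):
        out_handle.writelines(rec1 + rec2)

def split_interleaved(in_handle, r1_out, r2_out):
    """Send records 1, 3, 5, ... to r1_out and records 2, 4, 6, ... to r2_out."""
    for i, rec in enumerate(records(in_handle)):
        (r1_out if i % 2 == 0 else r2_out).writelines(rec)

# Toy interleaved input with a single pair.
merged = io.StringIO("@A/1\nACGT\n+\nIIII\n@A/2\nTGCA\n+\nIIII\n")
r1_out, r2_out = io.StringIO(), io.StringIO()
split_interleaved(merged, r1_out, r2_out)
print(r1_out.getvalue().splitlines()[0])  # @A/1
print(r2_out.getvalue().splitlines()[0])  # @A/2
```

Splitting by position only works when the interleaved file contains complete pairs and no orphans, which is why orphans need to go to their own files first.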

written 5.0 years ago by SES8.2k

Thanks. Are the singletons combined, or are they written to separate files as well?

written 5.0 years ago by vigprasud60

If you are referring to the pairfq makepairs command, the singleton reads from each pair are written to separate files (explained here).

written 5.0 years ago by SES8.2k

I tried the script. I got the pairs separated, but I could not get them in order.

R1.fq has

ReadA/1
ReadB/1
ReadC/1

while R2.fq has

ReadC/2
ReadA/2
ReadB/2

Is there a way to make this program sort them?

written 5.0 years ago by vigprasud60
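One generic way to handle the ordering problem described above, independent of any particular tool, is to sort both files by read name and then confirm that they line up. The Python sketch below is only an illustration under stated assumptions: both files contain exactly the same set of reads, records are 4 lines each, names differ only by a trailing /1 or /2, and the files fit in memory. The function names are hypothetical.

```python
import re

def parse(text):
    """Split FASTQ text into 4-line records (assumes no wrapped sequences)."""
    lines = text.splitlines()
    return [lines[i:i + 4] for i in range(0, len(lines), 4)]

def key(record):
    """Read name without the leading '@' and without a trailing /1 or /2."""
    return re.sub(r"/[12]$", "", record[0][1:])

def sort_by_name(text):
    """Return the FASTQ text with its records sorted by read name."""
    return "".join(line + "\n" for rec in sorted(parse(text), key=key) for line in rec)

def same_order(r1_text, r2_text):
    """True when the two files list the same read names in the same order."""
    return [key(r) for r in parse(r1_text)] == [key(r) for r in parse(r2_text)]

# The ordering from the comment above: R2 starts with ReadC.
r1 = "@ReadA/1\nACGT\n+\nIIII\n@ReadB/1\nACGT\n+\nIIII\n@ReadC/1\nACGT\n+\nIIII\n"
r2 = "@ReadC/2\nTGCA\n+\nIIII\n@ReadA/2\nTGCA\n+\nIIII\n@ReadB/2\nTGCA\n+\nIIII\n"
print(same_order(r1, r2))                              # False
print(same_order(sort_by_name(r1), sort_by_name(r2)))  # True
```

Running a check like `same_order` before alignment is a cheap way to catch desynchronized files early.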
5.0 years ago by
Philadelphia
Ashutosh Pandey11k wrote:

See this post:

Combining The Paired Reads From Illumina Run

It will help you create a pair of ordered FASTQ files. You can then align the ordered files as paired-end reads, separately from the unpaired reads (also known as orphan reads), which need to be aligned as single-end reads.
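As a rough illustration of the "paired separately from orphans" workflow, the Python sketch below only builds the command lines; it does not run anything. It assumes `bwa mem` is the aligner being used (the thread does not say which bwa subcommand was run), and every file name is a placeholder.

```python
import shlex

def bwa_mem_commands(reference, r1_paired, r2_paired, orphan_files):
    """Build one paired-end bwa mem command and one single-end command per orphan file."""
    paired = ["bwa", "mem", reference, r1_paired, r2_paired]
    singles = [["bwa", "mem", reference, fq] for fq in orphan_files]
    return paired, singles

# All file names below are placeholders, not files from this thread.
paired_cmd, single_cmds = bwa_mem_commands(
    "ref.fa", "R1_paired.fq", "R2_paired.fq", ["R1_orphans.fq", "R2_orphans.fq"]
)
for cmd in [paired_cmd] + single_cmds:
    print(shlex.join(cmd))
```

`bwa mem` writes SAM to standard output, so each command would normally be redirected to its own output file, and the reference has to be indexed with `bwa index` beforehand. Keeping the R1 and R2 orphans in separate files preserves which mate each read came from.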

written 5.0 years ago by Ashutosh Pandey11k

Thanks. For single-end reads, doesn't it need to know whether each one is an R1 read or an R2 read?

written 5.0 years ago by vigprasud60
5.0 years ago by
geek_y10.0k
Barcelona
geek_y10.0k wrote:

You can also try this script.

https://www.dropbox.com/s/4apg7uykv35koto/cmpfastq_pe.pl?dl=0

It takes two FASTQ files from Illumina, compares them, and writes two files, one for R1 and one for R2, containing the common reads in order. It also writes the reads that are unique to R1 and to R2 into two separate files.

If your read names follow a different pattern, you need to edit the regex string to make it work for your files.
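To show what adapting a read-name regex can look like, here is a Python sketch. The pattern is hypothetical and is not the one used in cmpfastq_pe.pl; it assumes two common header layouts: the mate number as a trailing /1 or /2, or as the first field of a comment after whitespace (for example `2:N:0:ACGT`).

```python
import re

# Hypothetical pattern for illustration only; not taken from cmpfastq_pe.pl.
READ_NAME = re.compile(r"^@(?P<name>\S+?)(?:/(?P<mate>[12]))?(?:\s+(?P<comment>.*))?$")

def split_header(header):
    """Return (name, mate) where mate is '1', '2', or None if it cannot be determined."""
    m = READ_NAME.match(header.strip())
    if m is None:
        raise ValueError(f"unrecognised FASTQ header: {header!r}")
    mate = m.group("mate")
    comment = m.group("comment") or ""
    # Fall back to a comment that starts with '1:' or '2:' when there is no /1 or /2.
    if mate is None and re.match(r"[12]:", comment):
        mate = comment[0]
    return m.group("name"), mate

print(split_header("@HWI:xxxxxxxxxxxxx/1"))  # ('HWI:xxxxxxxxxxxxx', '1')
print(split_header("@HWI:xxxxxxxxxxxxy"))    # ('HWI:xxxxxxxxxxxxy', None)
print(split_header("@M:1:2 2:N:0:ACGT"))     # ('M:1:2', '2')
```

Whatever pattern is used, the important property is that both mates of a pair reduce to the same name, since that name is what the comparison is keyed on.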

modified 5.0 years ago • written 5.0 years ago by geek_y10.0k