Question: To extract unmapped pair end reads
0
gravatar for Bioinfonext
2.3 years ago by
Bioinfonext140
Korea
Bioinfonext140 wrote:

Actually I map raw reads to CDS of Draft genome where all genes CDS is not present.

I used below cammand lines for pair end read mapping to CDS and want to extract unmapped pair end reads in which BOTH READS of Pair not able to map to CDS.

But when I am running last cammnd it show some warning on screen: but it write R1 and R2 READS in two separete files. is it ok?

WARNING: Query PC168224:127:C98A1ANXX:1:2316:21235:23135 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping. WARNING: Query PC168224:127:C98A1ANXX:1:2316:21236:13229 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping. *WARNING: Query PC168224:127:C98A1ANXX:1:2316:21237:66490 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping.

Cammand Lines:

For BWA indexing

/home/yog/software/bwa-0.7.15/bwa index -p Radish_index Radish_cds.fasta

BWA Cammand

/home/yog/software/bwa-0.7.15/bwa mem Radish_index forward_216_5W_Ca1.fastq reverse_216_5W_Ca1.fastq >216_5W_Ca1.sam

Convert Sam to BAM

/home/yog/software/samtools-1.3.1/samtools view -bS 216_5W_Ca1.sam > 216_5W_Ca1.bam

Extract Unmapped Reads:

/home/yog/software/samtools-1.3.1/samtools view -b -f 4 216_5W_Ca1.bam > unmap_216_5W_Ca1.bam

To convert bam to Fastq File:

/home/yog/software/samtools-1.3.1/samtools sort  -n unmap_Ca1.bam -o unmap_216_5W_Ca1._resorted

/home/yog/software/bedtools2/bin/bamToFastq -i unmap_Ca1._resorted -fq Unmap_Ca1_read1.fastq -fq2 Unmap_Ca1_read2.fastq
rna-seq • 2.4k views
ADD COMMENTlink modified 2.3 years ago by Brian Bushnell16k • written 2.3 years ago by Bioinfonext140
2
gravatar for Vitis
2.3 years ago by
Vitis2.1k
New York
Vitis2.1k wrote:

I think the use of flag 4 is not correct. Flag 4 indicates that the read is not mapped, but the mate can be properly mapped. So if you're looking for pairs with neither mate is mapped, you need to use flag 12, indicating the read itself is not mapped and the mate is also not mapped.

ADD COMMENTlink written 2.3 years ago by Vitis2.1k
1
gravatar for Brian Bushnell
2.3 years ago by
Walnut Creek, USA
Brian Bushnell16k wrote:

This is a copy of this post, which has already been answered.

ADD COMMENTlink written 2.3 years ago by Brian Bushnell16k
0
gravatar for Charles Plessy
2.3 years ago by
Charles Plessy2.6k
Japan
Charles Plessy2.6k wrote:

In the BWA index, each reference sequence (chromosomes, ...) follow each other directly. Not sure for bwa mem, but with bwa align it sometimes happened to me that one mate aligned at the end of a chromosome, the other mate at the beginning of the next chromosome, and BWA would happily mark them as properly paired because their distance in the index, ignoring the transition between the two chromosomes, was small enough. Perhaps this happened to you as well ?

ADD COMMENTlink written 2.3 years ago by Charles Plessy2.6k

you did not understand what I wanted to say..........

I map raw reads to CDS transcripts not on genome sequences.

But some of the gene sequences is missing in these CDS transcripts.

So when I map raw reads to these CDS, some pair end reads do not able to map.

I want to extract to those pair end reads in which BOTH READS of Pair not able to map to CDS.

For that I used below cammand line:

Cammand Lines:

For BWA indexing /home/yog/software/bwa-0.7.15/bwa index -p Radish_index Radish_cds.fasta

BWA Cammand /home/yog/software/bwa-0.7.15/bwa mem Radish_index forward_216_5W_Ca1.fastq reverse_216_5W_Ca1.fastq >216_5W_Ca1.sam

Convert Sam to BAM /home/yog/software/samtools-1.3.1/samtools view -bS 216_5W_Ca1.sam > 216_5W_Ca1.bam

Extract Unmapped Reads: /home/yog/software/samtools-1.3.1/samtools view -b -f 4 216_5W_Ca1.bam > unmap_216_5W_Ca1.bam

To convert bam to Fastq File: /home/yog/software/samtools-1.3.1/samtools sort -n unmap_Ca1.bam -o unmap_216_5W_Ca1._resorted

/home/yog/software/bedtools2/bin/bamToFastq -i unmap_Ca1._resorted -fq Unmap_Ca1_read1.fastq -fq2 Unmap_Ca1_read2.fastq

But when last cammnd line write R1 and R2 reads it shows some warning on screen:

WARNING: Query PC168224:127:C98A1ANXX:1:2316:21235:23135 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping. WARNING: Query PC168224:127:C98A1ANXX:1:2316:21236:13229 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping. *WARNING: Query PC168224:127:C98A1ANXX:1:2316:21237:66490 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping.

So I am asking is there I am making any mistake?

ADD REPLYlink written 2.3 years ago by Bioinfonext140
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 920 users visited in the last hour