Question: How to keep only the common reads in paired-end data when read1.fq and read2.fq contain different numbers of reads

ak93sharma wrote (2.5 years ago):

Hello folks, I am mapping reads with bowtie2, but it fails with the error "fewer reads in the file specified with -2 than in file specified with -1":

bowtie2 -x indexFile  -1 read1.fq   -2 read2.fq  -S result.sam

The number of reads in read1 and read2 is no longer the same after I filtered read2. I want to keep only the reads that are common to both files, so that both contain the same number of reads. Any help?

Tags: awk, rna-seq, sed, bowtie2, perl

shenwei356 wrote (2.5 years ago):

How to extract paired reads from two paired-end read files:

First, extract the sequence IDs from both files and compute their intersection:

$ gzip -d -c read_1.fq.gz read_2.fq.gz | seqkit seq --name --only-id | sort | uniq -d > id.txt
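
To illustrate why "sort | uniq -d" yields the intersection, here is a minimal sketch on two toy ID lists. The file names (demo_ids_1.txt, demo_ids_2.txt, demo_common.txt) are made up for the demonstration. It assumes each file lists every ID only once and that both mates carry exactly the same ID (no /1 or /2 suffix as part of the ID); otherwise the result would not be a true intersection.

```shell
# Two tiny ID lists standing in for the IDs of read_1 and read_2
# (hypothetical file names, for illustration only).
printf 'r1\nr2\nr3\n' > demo_ids_1.txt
printf 'r1\nr3\nr4\n' > demo_ids_2.txt

# "uniq -d" prints only lines that occur more than once in sorted input.
# If every ID is unique within its own file, an ID seen twice in the
# combined list must be present in both files.
cat demo_ids_1.txt demo_ids_2.txt | sort | uniq -d > demo_common.txt

cat demo_common.txt
```

Here r2 and r4 each occur in only one list, so only r1 and r3 end up in demo_common.txt.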

Then retrieve the reads using id.txt:

$ gzip -d -c read_1.fq.gz | seqkit grep --pattern-file id.txt  | gzip -c > read_1.f.fq.gz
$ gzip -d -c read_2.fq.gz | seqkit grep --pattern-file id.txt  | gzip -c > read_2.f.fq.gz

Note that this example assumes the IDs appear in the same order in both read files. If they do not, you can sort both files after the previous steps. The shell sort command can sort large files by using the disk, so the temporary directory is set to the current directory with the option "-T .".

$ gzip -d -c read_1.f.fq.gz | seqkit fx2tab | sort -k1,1 -T . | seqkit tab2fx | gzip -c > read_1.f.sorted.fq.gz
$ gzip -d -c read_2.f.fq.gz | seqkit fx2tab | sort -k1,1 -T . | seqkit tab2fx | gzip -c > read_2.f.sorted.fq.gz
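
Before running bowtie2 again, it may be worth checking that the two files really are paired now. The following is only a sketch using awk and cmp on small uncompressed demo files. The file names and the helper fq_ids are hypothetical, and it assumes plain 4-line FASTQ records whose ID is the header text up to the first blank.

```shell
# Demo files (hypothetical names): two reads each, same IDs, same order.
printf '@r1 1\nACGT\n+\nIIII\n@r3 1\nGGCC\n+\nIIII\n' > demo_sorted_1.fq
printf '@r1 2\nTTTT\n+\nIIII\n@r3 2\nAAAA\n+\nIIII\n' > demo_sorted_2.fq

# Print the ID of every record: the first field of each header line
# (lines 1, 5, 9, ...) with the leading "@" removed.
fq_ids() { awk 'NR % 4 == 1 { sub(/^@/, "", $1); print $1 }' "$1"; }

fq_ids demo_sorted_1.fq > demo_sorted_ids_1.txt
fq_ids demo_sorted_2.fq > demo_sorted_ids_2.txt

# cmp -s is silent and succeeds only if both ID lists are identical,
# i.e. same IDs, same order, same number of reads.
if cmp -s demo_sorted_ids_1.txt demo_sorted_ids_2.txt; then
    echo "paired"
else
    echo "not paired"
fi
```

For gzipped files, the same check should work by decompressing to a temporary file or piping "gzip -d -c" into the awk command instead of passing a file name.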

Carlo Yague wrote (2.5 years ago):

The real question is: what did you do that created the orphan reads? Did you trim or quality-filter your R1 and R2 reads independently? Some trimming tools, such as Trimmomatic, have a paired-end mode that avoids this problem.

But if you really need to fix those files, you can try to remove the orphan reads.
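
One possible way to remove orphans with awk alone is sketched below. It is not a tested pipeline for real data: the file names and the helper keep_common are made up, it assumes uncompressed 4-line FASTQ records where both mates share the same first header field, and it holds all IDs of one file in memory, so it may not suit very large files. The original read order is preserved.

```shell
# Demo input (hypothetical names): r2 is missing from file 2, r4 from file 1.
printf '@r1 1\nACGT\n+\nIIII\n@r2 1\nACGT\n+\nIIII\n@r3 1\nGGCC\n+\nIIII\n' > demo_1.fq
printf '@r1 2\nTTTT\n+\nIIII\n@r3 2\nAAAA\n+\nIIII\n@r4 2\nCCCC\n+\nIIII\n' > demo_2.fq

# keep_common OTHER THIS: print the records of THIS whose ID also occurs in OTHER.
# While reading OTHER (NR == FNR), remember the first field of each header line.
# While reading THIS, decide at each header line whether to keep the record,
# then print all four lines of the record if it is kept.
# Caveat: the NR == FNR trick misbehaves if OTHER is an empty file.
keep_common() {
    awk 'NR == FNR { if (FNR % 4 == 1) seen[$1] = 1; next }
         FNR % 4 == 1 { keep = ($1 in seen) }
         keep' "$1" "$2"
}

keep_common demo_2.fq demo_1.fq > demo_1.paired.fq
keep_common demo_1.fq demo_2.fq > demo_2.paired.fq
```

In this demo both output files keep only r1 and r3, in their original order, so they contain the same number of reads.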
