Off topic: Error when splitting WES data into sub-files
2.4 years ago
Laven9 • 0

I have WES data and want to split out a subset containing only chr10, for both my reads (in FASTQ format) and the reference (ucsc.hg19.fasta). My code is as follows.

For my data, I first map the reads to ucsc.hg19.fasta with BWA:

  bwa mem ~/reference/ucsc.hg19.fasta ~/input/data.1.fastq.gz ~/input/data.2.fastq.gz | samtools view -S -b > ~/output/data.bam

Then I sort and index it:

  samtools sort -n ~/output/data.bam > ~/output/data.sorted.bam
  samtools index ~/output/data.sorted.bam

Then I extract the chr10 reads and convert them back to FASTQ, one file per mate:

  samtools view -h -b ~/output/data.sorted.bam chr10 > ~/output/chr10.bam
  samtools bam2fq ~/output/chr10.bam > ~/output/chr10.fastq
  cat ~/output/chr10.fastq | grep '^@.*/1$' -A 3 --no-group-separator > ~/output/chr10.1.fastq
  cat ~/output/chr10.fastq | grep '^@.*/2$' -A 3 --no-group-separator > ~/output/chr10.2.fastq

And for the reference:

  samtools faidx ~/reference/ucsc.hg19.fasta chr10 > ~/reference/ucsc.hg19.chr10.fasta

But when I run BWA again with these two sub-files (chr10.1.fastq, chr10.2.fastq) and the chr10 reference (ucsc.hg19.chr10.fasta), I get a much smaller BAM file (only 1 KB). That seems wrong, because my chr10.bam is more than 40,000 KB.
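
One thing that may be worth checking here (an assumption, nothing in this thread confirms it): in paired-end mode bwa mem expects the n-th record of the first file and the n-th record of the second file to be mates. Splitting a single FASTQ with grep does not guarantee that, especially when some mates mapped to other chromosomes. The sketch below is a small diagnostic, not part of the original commands; the function name `fastq_pairs_in_sync` is hypothetical, and it assumes plain four-line FASTQ records.

```shell
# Hypothetical helper: check that two FASTQ files list the same reads in the same order.
# Assumes uncompressed FASTQ with exactly four lines per record.
fastq_pairs_in_sync() {
    awk '
        function read_name(line) {
            sub(/^@/, "", line)          # drop the leading @
            sub(/[ \t].*$/, "", line)    # drop any comment after the read name
            sub(/\/[12]$/, "", line)     # drop a trailing /1 or /2
            return line
        }
        FNR % 4 != 1 { next }            # look at header lines only
        FNR == NR { first[++n1] = read_name($0); next }   # names from the first file
        {
            n2++
            # remember the first record whose name differs between the files
            if (!bad && first[n2] != read_name($0)) bad = n2
        }
        END {
            # different record counts also mean the files are not paired
            if (!bad && n1 != n2) bad = (n1 < n2 ? n1 : n2) + 1
            if (bad) { print "out of sync at record " bad; exit 1 }
            print "in sync: " n1 " pairs"
        }
    ' "$1" "$2"
}

# Usage with the files from this post:
#   fastq_pairs_in_sync ~/output/chr10.1.fastq ~/output/chr10.2.fastq
```

If this reports a mismatch, the two sub-files are not valid paired input, which could explain a nearly empty BAM from the second BWA run.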

What is wrong with my code? Please give me some advice! Thank you!
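
For comparison, here is a hedged sketch of one way the extraction step could be written. It rests on two points about samtools: `sort -n` orders by read name, while `samtools index` and region queries such as `chr10` need a coordinate-sorted BAM; and `samtools fastq` can write the two mates to separate files itself when its input has mates grouped together (here via `samtools collate`). The function name `extract_chr10_pairs` and the output prefix are hypothetical, and a reasonably recent samtools is assumed. Reads whose mate lies outside chr10 are discarded in this version.

```shell
# Sketch only: extract chr10 read pairs from a BAM into two synchronized FASTQ files.
# Assumes a recent samtools that provides the sort -o, collate and fastq options used here.
extract_chr10_pairs() {
    in_bam="$1"        # BAM produced by bwa mem, in any order
    out_prefix="$2"    # hypothetical output prefix, e.g. ~/output/chr10

    # Coordinate sort (no -n) so the BAM can be indexed and queried by region.
    samtools sort -o "${out_prefix}.coord.bam" "$in_bam" &&
    samtools index "${out_prefix}.coord.bam" &&
    # Keep only alignments on chr10.
    samtools view -b -o "${out_prefix}.bam" "${out_prefix}.coord.bam" chr10 &&
    # Group mates together, then write read 1 and read 2 to separate files.
    # Singletons (-s) and reads without pairing flags (-0) are thrown away.
    samtools collate -u -O "${out_prefix}.bam" |
        samtools fastq -1 "${out_prefix}.1.fastq" -2 "${out_prefix}.2.fastq" \
            -0 /dev/null -s /dev/null -n -
}

# Usage with the paths from this post:
#   extract_chr10_pairs ~/output/data.bam ~/output/chr10
```

The chr10-only reference also needs its own `bwa index` run before it is used with `bwa mem`.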

WES bwa fastq bam reference • 598 views