Question: Unmapped reads in BLAT output ( psl file )
4.9 years ago by
syrup16g_TO40
Japan
syrup16g_TO40 wrote:

Hello,

I have a question about unmapped reads in a BLAT run.

If a read is not mapped to the reference sequence by BLAT, what is written to the output PSL file?

How do I extract the unmapped reads?

Thank you very much.

blat alignment • 1.9k views
4.9 years ago by
Prakki Rama2.2k
Singapore
Prakki Rama2.2k wrote:

I think there is no straightforward option in BLAT to collect unmapped reads. The PSL file only contains alignments, so a read with no hit simply does not appear in it. Maybe you can try this.

1) Collect your mapped reads using cut command first.

cut -f 10 output.psl | sort -u >mapped_header.txt ## Column 10 of a PSL file is the query (read) name. cut splits on tabs by default, so no -d option is needed.

2) Now collect all your read headers.

LC_ALL=C fgrep ':N:' sample.fastq >all_header.txt ## I used the pattern ":N:" because it is present in all my headers. If your reads have a similar pattern you can use it, or use something else common to every header, like "@HISEQ" or "@HWI".

3) Now collect those headers which are unmapped using following command.

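awk 'NR==FNR{a[$1];next} {id=$1; sub(/^@/,"",id)} !(id in a)' mapped_header.txt all_header.txt >unmapped.txt ## BLAT reports only the first word of the read name, without the FASTQ "@", so the headers are compared on that part. The full header line is still written to unmapped.txt.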

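4) Now grep those reads from the original fastq file.
LC_ALL=C grep --no-group-separator -A 3 -F -f unmapped.txt sample.fastq >unmapped.fastq ## --no-group-separator (GNU grep) keeps grep from writing "--" lines between records that are not adjacent.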

*Since we are searching for fixed strings (-F) and LC_ALL=C makes grep work on plain bytes, the search should not take too much time.
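
The four steps above can also be done in one awk pass without intermediate files. This is only a sketch under some assumptions: the function name extract_unmapped is made up for illustration, the FASTQ file has exactly four lines per record, and the read names in PSL column 10 equal the FASTQ header's first word without the leading "@".

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BLAT): print FASTQ records whose read
# name never appears as a query name (column 10) in the PSL file.
# Assumes 4-line FASTQ records.
extract_unmapped() {
    local psl="$1" fastq="$2"
    awk -F'\t' '
        # PSL file: remember every query name that has at least one alignment.
        # PSL header lines, if present, only add keys that match no read.
        FILENAME == ARGV[1] { if (NF >= 10) mapped[$10] = 1; next }
        # FASTQ file: on each header line decide whether to keep the record.
        FNR % 4 == 1 {
            id = $0
            sub(/^@/, "", id)        # drop the FASTQ "@" prefix
            sub(/[ \t].*$/, "", id)  # keep only the first word of the header
            keep = !(id in mapped)
        }
        keep { print }
    ' "$psl" "$fastq"
}
```

It could then be called as: extract_unmapped output.psl sample.fastq >unmapped.fastq. Because it keys on the record position rather than on a header pattern, it does not depend on ":N:" or "@HWI" being present.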

 
