Trimming overrepresented sequence possible with Trimmomatic?
1
0
9 weeks ago
ella • 0

Dear Community, :)

I am using Trimmomatic for quality trimming and adapter removal on my RNA-seq reads before mapping them with GSNAP. Unfortunately, the transcript I'm overexpressing is so highly overrepresented that reads from it map to my reference, even though I added the overexpression plasmid sequence to my reference FASTA so that both are mapped simultaneously.

My idea was to remove the reads that map to my overexpression plasmid before mapping. Is it possible to do so with Trimmomatic? I thought it should be, as Trimmomatic cuts adapter sequences as well. How would I have to adjust the command?

My plan B would be to map to the plasmid sequence first and then use the unmapped reads for mapping to the actual reference. But I worry that this might cause reads from other transcripts to be mismapped to the plasmid sequence as well. Plan C would be to find an option to filter out the overexpression (OE) sequence during mapping in GSNAP itself; I know it has some quality-filtering options as well. How would you proceed?

Thanks a lot for any kind of suggestion or help. :) :)

Have a nice weekend,

Ella

rnaseq ngs overexpression trimmomatic gsnap • 280 views
0

Small addition: I found the post "removing overrepresented sequences from rna-seq". It states that adding overrepresented sequences to the Trimmomatic adapter FASTA file would only work for sequences at the 5' end of the reads. In my case the complete read would hit the overexpression plasmid FASTA; is that a problem? And is it a problem that the plasmid FASTA is much longer than the read length?

Thanks again :)

2
9 weeks ago
GenoMax 104k

It may be best to do this using bbduk.sh and sequence of your plasmid in filter mode. A guide for BBDuk is available. Keep in mind that you may lose additional reads if there is sequence similarity between your plasmid and the genome by chance.
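
To make the idea of k-mer filtering concrete, here is a minimal Python sketch of the principle: a read is discarded if it shares at least one k-mer with the plasmid (on either strand). This is only an illustration under stated assumptions, not how BBDuk is implemented; the function names, the toy sequences and the small k are all made up for the example. The last toy read also shows the caveat above: a genomic read that happens to share a single k-mer with the plasmid gets removed too.

```python
# Toy illustration of k-mer based read filtering.
# All names, sequences and the value of K are hypothetical; this is not BBDuk's code.

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.upper().translate(table)[::-1]

def kmers(seq, k):
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(reference, k):
    """Collect the k-mers of the reference and of its reverse complement."""
    return kmers(reference, k) | kmers(reverse_complement(reference), k)

def split_reads(reads, index, k):
    """Split (name, sequence) pairs into kept and removed read names.

    A read is removed as soon as one of its k-mers is found in the index.
    """
    kept, removed = [], []
    for name, seq in reads:
        if kmers(seq, k) & index:
            removed.append(name)
        else:
            kept.append(name)
    return kept, removed

K = 11  # deliberately small for this toy example
plasmid = "ATGGTGAGCAAGGGCGAGGAGCTGTTCACC"  # made-up stand-in for the plasmid

reads = [
    ("plasmid_read", plasmid[5:25]),                         # forward strand
    ("plasmid_read_rc", reverse_complement(plasmid[3:23])),  # reverse strand
    ("genome_read", "TTTTTTTTTTTTTTTTTTTT"),                 # unrelated read
    ("chance_hit", "CCCCCCCCC" + plasmid[10:21]),            # shares one 11-mer
]

kept, removed = split_reads(reads, build_index(plasmid, K), K)
print("kept:", kept)
print("removed:", removed)
```

With BBDuk itself, a filter-mode call would look roughly like `bbduk.sh in=reads.fq out=clean.fq outm=plasmid_hits.fq ref=plasmid.fa k=31`, where the file names are placeholders; check the BBDuk guide for the paired-end options and the settings that suit your data. A longer k makes chance matches to the genome less likely.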

0

Great, thank you very much for the tip! :) I will check that out for sure.

1

Perhaps also consider switching to a more modern mapper? HISAT2, STAR and BBMap are a few suggestions.

0

Hi lieven.sterck, thanks for the suggestion! I changed our pipeline to BBMap. It works great so far :)

0

Hi GenoMax, just wanted to thank you for the great help, bbduk.sh worked even better than I would have hoped :)