How much adapter content is it OK to keep in my samples after RNA-seq?
5 months ago
bioinfo ▴ 150

Hello,

The software I am using for demultiplexing lets me specify adapters in the sample sheet so that they can be trimmed. However, when I ran MultiQC on my FASTQ files, I noticed that some adapter sequence was left. I am including some examples below. In this example the adapter content is pretty low, but I have had other samples with 3-7% adapter content in the MultiQC report.

When I specify the adapter exactly as given in the Illumina manual, the adapter content can reach up to 0.74%. The software also allows automatic detection of the adapters. When I use that, the adapter sequence that is picked up is the Illumina adapter sequence plus 5 more bases. When I use that sequence for adapter trimming, MultiQC says: No samples found with any adapter contamination > 0.1%
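
As a rough illustration of what an "adapter content" percentage means, here is a minimal Python sketch that counts the fraction of reads containing the start of an adapter. It is not how FastQC or MultiQC compute their numbers (their report is per position along the read), and the adapter string, the 12-base probe length, and the toy reads are all assumptions for illustration; substitute the sequence from your own kit documentation.

```python
def adapter_fraction(reads, adapter, min_match=12):
    """Return the fraction of reads containing the first `min_match` bases of `adapter`.

    Simplified exact-match check; real QC tools are more sophisticated.
    """
    if not reads:
        return 0.0
    probe = adapter[:min_match]
    hits = sum(1 for read in reads if probe in read)
    return hits / len(reads)


# Assumed adapter sequence, for illustration only.
ADAPTER = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"

# Hypothetical toy reads; two of the four read through into the adapter.
reads = [
    "ACGTACGTACGTACGTACGTAGATCGGAAGAGCACACG",
    "TTGACCGATAGGCTAACCGGTTAACCGGTTAACCGGTA",
    "GGCATTAGCCGATAGATCGGAAGAGCACACGTCTGAAC",
    "CCGGAATTCCGGAATTCCGGAATTCCGGAATTCCGGAA",
]

fraction = adapter_fraction(reads, ADAPTER)
print(f"{fraction:.1%} of reads contain the adapter probe")
```

Running the same check before and after trimming, with each candidate adapter sequence, gives a quick sanity check that is independent of the demultiplexing software.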

I wanted to know what would be the more appropriate way to trim the adapters? Would the extra 5 bases cause an issue?

Thank you

RNA-seq illumina adapters • 975 views
5 months ago
GenoMax 142k

I wanted to know what would be the more appropriate way to trim the adapters? Would the extra 5 bases cause an issue?

This may depend on the end-use application for the data. If you are planning to align to a good reference, most aligners will handle any residual adapter bases, which will not align, by soft-clipping them.

For de novo work, all extraneous sequence needs to be removed, so you will need a separate scan/trim program such as bbduk.sh or fastp. A dedicated program is the best option for this.

Running external scanning/trimming is a "one time and done" operation. That will leave you with guaranteed clean data that can be used for any application.
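
To make concrete what a scan/trim step does, here is a minimal Python sketch of 3' adapter removal. It is only an illustration under stated assumptions: exact matching, a hypothetical short adapter, and a hypothetical `min_overlap` parameter. Dedicated tools such as bbduk.sh and fastp also handle mismatches, quality, and paired reads, so this is not a substitute for them.

```python
def trim_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter from `read` using exact matching.

    Handles two cases: the full adapter occurs inside the read, or the
    read ends with a partial prefix of the adapter.
    """
    # Case 1: full adapter present -> cut at its first occurrence.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: read ends with an adapter prefix; try the longest prefix first.
    max_len = min(len(adapter) - 1, len(read))
    for length in range(max_len, min_overlap - 1, -1):
        if read.endswith(adapter[:length]):
            return read[:-length]
    # No adapter detected: return the read unchanged.
    return read


# Hypothetical short adapter and reads, for illustration only.
adapter = "AGATCGGAAGAGC"
print(trim_adapter("ACGTACGTAGATCGGAAGAGCTTTT", adapter))  # full adapter inside the read
print(trim_adapter("ACGTACGTAGATCG", adapter))             # partial adapter at the 3' end
print(trim_adapter("ACGTACGTACGT", adapter))               # no adapter
```

The `min_overlap` threshold is a design trade-off: a lower value removes more partial adapters but also clips more genuine bases that happen to match by chance.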

Thank you. I am planning to align the data with kallisto. Does that make a difference?

No, it should not make a difference.

Thank you. I just want to confirm: by "not making a difference", do you mean that trimming the extra 5 bases should not make a difference?

You could find out by running a sample with and without trimming. Passing the data through a trimming program is not going to add a great computational burden. Trim the data once and be sure that there is no extraneous sequence.

I had compared trimming with the Illumina sequence, which did not seem to work, against trimming with the Illumina sequence plus the 5 bases, which seems to have trimmed everything based on FastQC. I made a scatterplot of the gene counts and the TPM values, and the Pearson correlation was 1. Do you think this is an appropriate way to compare them?
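
The comparison described above can be sketched in Python as follows. The TPM values are hypothetical, not from the thread. One caveat, offered as a general statistical property rather than anything stated in this thread: Pearson correlation on raw counts or TPM is dominated by the most highly expressed genes, so the sketch also computes it on log-transformed values (with an assumed pseudocount of 1) as a complementary check.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def log_tpm(values, pseudocount=1.0):
    """log2-transform TPM values; the pseudocount avoids log(0)."""
    return [math.log2(v + pseudocount) for v in values]


# Hypothetical per-gene TPM values from two trimming runs.
tpm_adapter_only = [0.0, 1.2, 15.0, 230.0, 5100.0]
tpm_adapter_plus5 = [0.1, 1.0, 14.5, 228.0, 5125.0]

r_raw = pearson(tpm_adapter_only, tpm_adapter_plus5)
r_log = pearson(log_tpm(tpm_adapter_only), log_tpm(tpm_adapter_plus5))
print(f"Pearson on raw TPM:   {r_raw:.4f}")
print(f"Pearson on log2(TPM+1): {r_log:.4f}")
```

If both values stay very close to 1, the two trimming settings are producing essentially the same quantification at both high and low expression levels.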

See: Quality filtering prior to pseudoalignment

This addresses trimming as well.
