Question: Illumina adapter identification
4.5 years ago by Jonathan Crowther (Leuven/Dublin):

Hi guys,

I have a question about adapter identification.

I have the raw fastq.tar.gz files from an RNAseq experiment.

I am trying to replicate a service provider's pre-processing as a learning exercise. I have run FastQC and the quality is good; everything generally looks correct. The over-represented sequences section does not list any sequences, so all looks well. Now I am trying to trim off the adapters, but I do not know which adapters were used, so I thought I would provide illumina_adapters.fa to cutadapt (version 1.4.1) using the following command line:

cutadapt -b file:illumina_adapters.fa -m 15 -O 10 -e 0.1 Sample_file.fastq -o trimmed_Sample_file.fastq

I am using the same parameters as the service provider; however, when I run it this way I trim approximately 3,000 reads, whereas the service provider trims only 500 reads.

Using the sequences

P5 - AATGATACGGCGACCACCGA

Reverse complement of P7 - TCGTATGCCGTCTTCTGCTTG

I am able to pull out what I think are adapter dimers. If this is the case, am I correct in thinking I should be able to find the adapter sequence?

grep 'P5 Sequence' Sample_file.fastq

GCTTCTGTAATTGAAAACCTAGAT-AATGATACGGCGACCACCGA-ACAAAGT
GCTTCTGTAATTGAAAACCTAGAT-AATGATACGGCGACCACCGA-ACACAGT
GCTTCTGTAATTGAAAACCTAGAT-AATGATACGGCGACCACCGA-ACACTGT
GCTTCTGTAATTGAAAACCTAGAT-AATGATACGGCGACCACCGA-ACACTGT
CTAAAGCTTCACACTTGATC-AATGATACGGCGACCACCGA-ACCCACTTTGC

grep 'P7 Sequence' Sample_file.fastq

CAAATGTATTTTAATAAGGTGATG-TCGTATGCCGTCTTCTGCTTG-AAAAAA
CTAAAGCTTCACACTTGATCAGGGATC-TCGTATGCCGTCTTCTGCTTG-AAA
CTAAAGCTTCACACTTGATCAGGTATC-TCGTATGCCGTCTTCTGCTTG-AAA
CTAAAGCTTCACACTTGATCAGGTATC-TCGTATGCCGTCTTCTGCTTG-AAA
CTAAAGCTTCACACTTGATCAGGTATC-TCGTATGCCGTCTTCTGCTTG-AAC
GTCGATGAGAGCCCAGAAATGTGAGAAAA-TCGTATGCCGTCTTCTGCTTG-A
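
For anyone who wants to reproduce the kind of listing shown above without grep, here is a minimal Python sketch. It assumes exact (mismatch-free) matching and that the dashes in the output were added by hand for readability, so the two example reads below are the first P5 hit and the second P7 hit with the dashes removed. The function names split_at_adapter and count_adapter_reads are made up for illustration and are not part of cutadapt or any other tool.

```python
# Sequences quoted in the question.
P5 = "AATGATACGGCGACCACCGA"
P7_RC = "TCGTATGCCGTCTTCTGCTTG"  # reverse complement of P7, as given in the post


def split_at_adapter(read, adapter):
    """Split a read at the first exact occurrence of `adapter`.

    Returns (upstream, adapter, downstream), or None if the adapter
    is not present. Exact matching only: no mismatches or indels.
    """
    start = read.find(adapter)
    if start == -1:
        return None
    end = start + len(adapter)
    return read[:start], read[start:end], read[end:]


def count_adapter_reads(sequences, adapter):
    """Count how many sequences contain an exact copy of `adapter`."""
    return sum(1 for seq in sequences if adapter in seq)


# Two reads taken from the grep output in the question, dashes removed.
reads = [
    "GCTTCTGTAATTGAAAACCTAGATAATGATACGGCGACCACCGAACAAAGT",
    "CTAAAGCTTCACACTTGATCAGGGATCTCGTATGCCGTCTTCTGCTTGAAA",
]

for read in reads:
    for name, adapter in (("P5", P5), ("P7_RC", P7_RC)):
        parts = split_at_adapter(read, adapter)
        if parts is not None:
            # Print in the same dash-separated layout used in the question.
            print(name, "-".join(parts))
```

Because this only finds exact matches, it will report fewer hits than cutadapt run with -e 0.1, which tolerates some mismatches.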

 

Which part would be the adapter in this case?

 

Thanks in advance

 

rna-seq illumina adapters • 4.8k views
4.5 years ago by geek_y (Barcelona/CRG/London/Imperial):

I am assuming that you used the Illumina platform.

In general, for RNA-seq there is no need to remove adapter sequences. For small RNA, because the sequence is only around 22 nucleotides long, the adapter gets sequenced along with the small RNA.

But in the case of RNA-seq, it is much less likely that the adapter gets sequenced, as the transcript fragments will usually be longer than 100 base pairs. The adapter only gets sequenced when the input DNA fragment is shorter than the read length.
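
To put rough numbers on that, here is a small sketch of the arithmetic. The read length of 50 is an assumption chosen for illustration (it is not stated in the thread), and adapter_bases_in_read is a hypothetical helper name.

```python
def adapter_bases_in_read(insert_length, read_length):
    """Number of bases in a read that lie beyond the insert.

    If the insert is at least as long as the read, the read never
    reaches the adapter and the result is 0.
    """
    return max(0, read_length - insert_length)


READ_LENGTH = 50  # assumed read length, for illustration only

# A ~22 nt small RNA: most of the read is adapter.
print(adapter_bases_in_read(22, READ_LENGTH))   # 28

# A 200 bp RNA-seq fragment: the read ends inside the insert.
print(adapter_bases_in_read(200, READ_LENGTH))  # 0
```

So adapter trimming matters for every read in a small-RNA library, but only for the minority of unusually short fragments in a standard RNA-seq library.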

Anyway, it's good to remove adapters to be sure. The service provider will have a list of the adapters used for multiplexing your libraries, so you need to contact them for that list.

Edit: FastQC will report a list of overrepresented/adapter sequences.

 


It's always good to find out from your service provider which kit was used to prepare the libraries.

(comment written 4.5 years ago by geek_y)