Question: Determine adapter sequence in RNA-seq samples
11 days ago by
iran
sabaghianamir7010 wrote:

Hello

I was trying to work out how to choose the correct adapter for trimming, using this PDF: https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-11.pdf . However, I don't know which adapter applies; I only know that the data were generated on an Illumina NextSeq 500. I know I'm missing some important point. Can you help me? Thanks.


So if the author tells me they have cut off all the adapters, I don't need to trim anything adapter-related? Not even the first 12 nucleotides in this picture? [image not preserved]

ADD REPLYlink written 10 days ago by sabaghianamir7010

That pattern, in the beginning, is caused by the library construction method which uses enzymatic fragmentation.

ADD REPLYlink written 10 days ago by JC8.8k

So should I cut it out or leave it be?

ADD REPLYlink written 8 days ago by sabaghianamir7010

The article genomax linked explains the issue, and suggests what should be done:

Mitigation

People often suggest fixing this issue by 5′ trimming of the reads to remove the biased portion – this however is not a fix. Since the biased composition is created by the selection of sequencing fragments and not by base call errors the only effect of trimming would be to change from having a library which starts over biased positions, to having a library which starts slightly downstream of biased positions.

Prevention

Ultimately the only fix for this issue will be in the introduction of new library preparation kits with a less bias prone priming step.

ADD REPLYlink written 8 days ago by h.mon27k

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to original question.

As for the pattern you see in the plot above it is normal for RNAseq data. You can read more about this observation in a blog post from FastQC authors here. You do not need to do anything to that part of the read. It should align without any issues.

As for the adapters, as long as you are just aligning the data, modern aligners should be able to take care of any residual adapter sequences by soft-clipping them. If you are going to do any de novo assembly work then you should use one of the methods detailed below to ensure that all extraneous sequence gets removed before assembly.

ADD REPLYlink modified 10 days ago • written 10 days ago by genomax73k
10 days ago by
h.mon27k
Brazil
h.mon27k wrote:

If your data is paired-end, several programs (such as fastp, PEAT or BBDuk) can trim by overlapping forward and reverse reads and, strictly speaking, they don't need to know the adapters. fastp can also auto-detect adapters for single-end sequencing, and it will output adapter statistics, including the inferred / detected adapter sequences.
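
To illustrate why overlap-based trimming does not need the adapter sequence, here is a minimal Python sketch. It is not how fastp, PEAT or BBDuk are actually implemented; the function names and the `min_overlap` / `max_mismatch_rate` parameters are made up for this example. The idea: when the insert is shorter than the read length, read 1 starts with the insert and read 2 starts with its reverse complement, so whatever follows the overlapping part must be adapter.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_insert_length(read1, read2, min_overlap=10, max_mismatch_rate=0.1):
    """Return the inferred insert length, or None if no short insert is found.

    Hypothetical helper: tries each candidate length and accepts the first
    one where read1's prefix matches the reverse complement of read2's prefix.
    """
    max_len = min(len(read1), len(read2))
    for length in range(min_overlap, max_len):
        prefix1 = read1[:length]
        prefix2 = reverse_complement(read2[:length])
        if hamming(prefix1, prefix2) <= max_mismatch_rate * length:
            return length
    return None


def trim_pair(read1, read2, **kwargs):
    """Return (trimmed1, trimmed2, adapter1_remnant, adapter2_remnant)."""
    length = find_insert_length(read1, read2, **kwargs)
    if length is None:
        # Insert is at least as long as the reads: nothing to trim.
        return read1, read2, "", ""
    return read1[:length], read2[:length], read1[length:], read2[length:]
```

Real tools also use base qualities and handle inserts longer than the reads, which this sketch ignores. The adapter remnants it returns are the kind of fragments a tool can pile up to report an inferred adapter sequence.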


Another vote for fastp. I've switched to using it lately and really love it.

ADD REPLYlink written 10 days ago by Dave Carlson250
10 days ago by
Mensur Dlakic1.7k
USA
Mensur Dlakic1.7k wrote:

Yet another solution is AdapterRemoval:

AdapterRemoval --identify-adapters --file1 reads_1.fastq --file2 reads_2.fastq --threads 8 --basename trimmed

It will print out lots of useful info:

Processed a total of 317,643,940 reads in 12:43.1s; 416,000 reads per second on average ...
   Found 103092850 overlapping pairs ...
   Of which 527537 contained adapter sequence(s) ...

Printing adapter sequences, including poly-A tails:
  --adapter1:  AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNATCTCGTATGCCGTCTTCTGCTTG
               ||||||||||||||||||||||||||||||||| ****** | |  | |       |
   Consensus:  AGATCGGAAGAGCACACGTCTGAACTCCAGTCAGCAGTTTTTTTTCTTTAAAAAAATAAAAAAAAAAAAAAAAAAAAAAAAAATAAATANTAAAAAATTTTTTTTTTTTTTTTTTTTTTTTTTTTTATTTTTTTTTTTA
     Quality:  ***)))(((''&&&%%%$$$$###"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

    Top 5 most common 9-bp 5'-kmers:
            1: AGATCGGAA = 71.70% (137696)
            2: AGATCGGCA =  0.27% (516)
            3: AGATAGGAA =  0.19% (365)
            4: CGATCGGAA =  0.18% (337)
            5: AGCTCGGAA =  0.16% (299)

  --adapter2:  AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT
               |||||||||||||||||||||||||||||||||| || |   |  |     | | |||
   Consensus:  AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTATATTTTTTTTTTTTTTTTTTTATTAAAAAAAAAAAAAAAAAAAAAAAAAAATAAAAAAAAAAAAAATTTTTTTTTTTTTTTTTTTTTTTATTATTATTTTTATTT
     Quality:  ,,,+++**))((('&&&%%%$$$##""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""


    Top 5 most common 9-bp 5'-kmers:
            1: AGATCGGAA = 80.81% (176579)
            2: AGCTCGGAA =  0.21% (459)
            3: CGATCGGAA =  0.21% (448)
            4: AGATCGGCA =  0.20% (445)
            5: AGATAGGAA =  0.16% (360)

    --adapter1 SEQUENCE
        Adapter sequence expected to be found in mate 1 reads [default:
        AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNATCTCGTATGCCGTCTTCTGCTTG].

    --adapter2 SEQUENCE
        Adapter sequence expected to be found in mate 2 reads [default:
        AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT].
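
The "Top 5 most common 9-bp 5'-kmers" tables above can be read as a simple frequency count over the first 9 bases of each putative adapter fragment. The Python sketch below shows that kind of tally. It is only an illustration under that assumption, not AdapterRemoval's actual code, and `top_prefix_kmers` is a hypothetical name.

```python
from collections import Counter


def top_prefix_kmers(fragments, k=9, n=5):
    """Return the n most common k-bp prefixes as (kmer, count, percent).

    Fragments shorter than k are skipped, so percentages are relative to
    the number of fragments that are long enough.
    """
    prefixes = [frag[:k] for frag in fragments if len(frag) >= k]
    total = len(prefixes)
    return [
        (kmer, count, 100.0 * count / total)
        for kmer, count in Counter(prefixes).most_common(n)
    ]


# Toy input: adapter remnants found after the overlapping part of read pairs.
fragments = ["AGATCGGAAGAGC"] * 3 + ["AGATCGGCAGAGC", "AGAT"]
for kmer, count, percent in top_prefix_kmers(fragments):
    print(f"{kmer} = {percent:.2f}% ({count})")
```

If one k-mer dominates, as AGATCGGAA does in the output above, that is a strong hint that the default adapter sequences match your library.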
11 days ago by
JC8.8k
Mexico
JC8.8k wrote:

a) ask the provider or whoever did the sequencing

b) use FastQC to check which adapter was used

10 days ago by
Timze W40
Timze W40 wrote:

The best solution is to ask your sequencing data provider.
Typically, QC software (such as FastQC) can report some common adapter types, while Trim Galore can automatically detect and trim these adapters.
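
As a rough picture of what this kind of auto-detection amounts to, the Python sketch below counts how many reads contain the start of a few common adapters and reports the most frequent one. The three sequences are the commonly cited Illumina universal, Nextera and small RNA adapter prefixes. The function names are hypothetical, and the real tools are more careful (for example, they only sample a subset of reads).

```python
from itertools import islice

# Commonly used adapter prefixes (assumed here for illustration).
KNOWN_ADAPTERS = {
    "Illumina universal": "AGATCGGAAGAGC",
    "Nextera transposase": "CTGTCTCTTATA",
    "Illumina small RNA 3'": "TGGAATTCTCGG",
}


def fastq_sequences(lines, limit=None):
    """Yield the sequence line (2nd of every 4) from FASTQ text lines."""
    seqs = (line.strip() for i, line in enumerate(lines) if i % 4 == 1)
    return islice(seqs, limit)


def guess_adapter(reads):
    """Return (best_adapter_name_or_None, counts_per_adapter)."""
    counts = {name: 0 for name in KNOWN_ADAPTERS}
    for read in reads:
        for name, seq in KNOWN_ADAPTERS.items():
            if seq in read:
                counts[name] += 1
    best = max(counts, key=counts.get)
    return (best if counts[best] > 0 else None), counts
```

With a plain-text FASTQ file this could be called as `guess_adapter(fastq_sequences(open("reads_1.fastq"), limit=1_000_000))`, where the file name is just a placeholder.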
