Question: Determine adapter sequence in RNA-seq samples
0
9 months ago by
iran
sabaghianamir7010 wrote:

Hello

I was trying to work out how to choose the correct adapter sequence for trimming from this PDF: https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-11.pdf . However, I don't know which one applies. I only know that the data were generated on an Illumina NextSeq 500. I know I'm missing some important point. Can you help me? Thanks.

rna-seq • 665 views
ADD COMMENTlink modified 9 months ago • written 9 months ago by sabaghianamir7010

So if the author tells me they have already removed all the adapters, I don't need to trim anything adapter-related? Not even the first 12 nucleotides in this picture? (image not shown)

ADD REPLYlink written 9 months ago by sabaghianamir7010
2

That pattern at the beginning of the reads is caused by the library construction method, which uses enzymatic fragmentation.

ADD REPLYlink written 9 months ago by JC10k

So should I cut it out or leave it be?

ADD REPLYlink written 9 months ago by sabaghianamir7010

The article genomax linked explains the issue, and suggests what should be done:

Mitigation

People often suggest fixing this issue by 5′ trimming of the reads to remove the biased portion – this however is not a fix. Since the biased composition is created by the selection of sequencing fragments and not by base call errors the only effect of trimming would be to change from having a library which starts over biased positions, to having a library which starts slightly downstream of biased positions.

Prevention

Ultimately the only fix for this issue will be the introduction of new library preparation kits with a less bias-prone priming step.

ADD REPLYlink written 9 months ago by h.mon30k
1

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to original question.

As for the pattern you see in the plot above it is normal for RNAseq data. You can read more about this observation in a blog post from FastQC authors here. You do not need to do anything to that part of the read. It should align without any issues.

As for the adapters, as long as you are just aligning the data, modern aligners should be able to take care of any residual adapter sequences by soft-clipping them. If you are going to do any de novo assembly work then you should use one of the methods detailed below to ensure that all extraneous sequence gets removed before assembly.
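To check how much sequence an aligner actually soft-clipped, you can look at the CIGAR strings in the alignment output. Here is a minimal Python sketch, assuming you already have the CIGAR strings as plain text. The function name is made up for illustration, and whether residual adapter gets soft-clipped at all depends on the aligner and the alignment mode you ran.

```python
import re

# One CIGAR operation: a length followed by an operation letter (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_ends(cigar):
    """Return (left, right) soft-clipped lengths for one CIGAR string.

    Hypothetical helper for illustration. Hard clips (H) can sit outside
    soft clips, so they are skipped before looking at the two ends.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    core = [(n, op) for n, op in ops if op != "H"]
    left = core[0][0] if core and core[0][1] == "S" else 0
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return left, right


if __name__ == "__main__":
    # A 75-bp read whose last 15 bases did not align (possibly adapter).
    print(soft_clipped_ends("60M15S"))
```

Many reads with a soft clip of similar length and sequence on the same end would suggest leftover adapter rather than genuine mismatches.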

ADD REPLYlink modified 9 months ago • written 9 months ago by genomax87k
2
9 months ago by
h.mon30k
Brazil
h.mon30k wrote:

If your data is paired-end, several programs (such as fastp, peat or bbduk) can trim by overlapping forward and reverse reads and, strictly speaking, they don't need to know the adapters. fastp can also auto-detect adapters for single-end sequencing, and it will output adapter statistics, including the inferred / detected adapter sequences.
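To illustrate why overlap-based trimming does not need the adapter sequence, here is a minimal Python sketch of the idea. It is not how fastp, peat or bbduk are actually implemented. When the insert is shorter than the read length, the start of read 1 is the reverse complement of the start of read 2, and whatever follows the overlap must be adapter. The function names, the example insert and the thresholds are assumptions made for illustration.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string (N stays N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def infer_adapter_by_overlap(read1, read2, min_overlap=10, max_mismatches=0):
    """Return (insert_length, adapter_seen_in_read1), or None if nothing fits.

    If the insert has length L shorter than the reads, read1[:L] equals the
    reverse complement of read2[:L]; the rest of read1 is adapter.
    """
    n = min(len(read1), len(read2))
    for length in range(n - 1, min_overlap - 1, -1):
        candidate = reverse_complement(read2[:length])
        if hamming(read1[:length], candidate) <= max_mismatches:
            return length, read1[length:]
    return None


if __name__ == "__main__":
    insert = "ACGTTGCAAGGCTTACCGATGGATCCAT"  # hypothetical 28-bp fragment
    # 40-bp reads: the insert followed by the start of each adapter.
    r1 = (insert + "AGATCGGAAGAGCACACGTC")[:40]
    r2 = (reverse_complement(insert) + "AGATCGGAAGAGCGTCGTGT")[:40]
    print(infer_adapter_by_overlap(r1, r2))  # (28, 'AGATCGGAAGAG')
```

Real tools also tolerate sequencing errors and use base qualities. The max_mismatches parameter here is only a crude stand-in for that.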

ADD COMMENTlink written 9 months ago by h.mon30k

Another vote for fastp. I've switched to using it lately and really love it.

ADD REPLYlink written 9 months ago by Dave Carlson320
2
9 months ago by
Mensur Dlakic6.0k
USA
Mensur Dlakic6.0k wrote:

Yet another solution is AdapterRemoval:

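AdapterRemoval --identify-adapters --file1 reads_1.fastq --file2 reads_2.fastq --threads 8

With --identify-adapters it does not trim anything; it infers the adapters from overlapping read pairs and prints out lots of useful info: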

Processed a total of 317,643,940 reads in 12:43.1s; 416,000 reads per second on average ...
   Found 103092850 overlapping pairs ...
   Of which 527537 contained adapter sequence(s) ...

Printing adapter sequences, including poly-A tails:
  --adapter1:  AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNATCTCGTATGCCGTCTTCTGCTTG
               ||||||||||||||||||||||||||||||||| ****** | |  | |       |
   Consensus:  AGATCGGAAGAGCACACGTCTGAACTCCAGTCAGCAGTTTTTTTTCTTTAAAAAAATAAAAAAAAAAAAAAAAAAAAAAAAAATAAATANTAAAAAATTTTTTTTTTTTTTTTTTTTTTTTTTTTTATTTTTTTTTTTA
     Quality:  ***)))(((''&&&%%%$$$$###"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

    Top 5 most common 9-bp 5'-kmers:
            1: AGATCGGAA = 71.70% (137696)
            2: AGATCGGCA =  0.27% (516)
            3: AGATAGGAA =  0.19% (365)
            4: CGATCGGAA =  0.18% (337)
            5: AGCTCGGAA =  0.16% (299)

  --adapter2:  AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT
               |||||||||||||||||||||||||||||||||| || |   |  |     | | |||
   Consensus:  AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTATATTTTTTTTTTTTTTTTTTTATTAAAAAAAAAAAAAAAAAAAAAAAAAAATAAAAAAAAAAAAAATTTTTTTTTTTTTTTTTTTTTTTATTATTATTTTTATTT
     Quality:  ,,,+++**))((('&&&%%%$$$##""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""


    Top 5 most common 9-bp 5'-kmers:
            1: AGATCGGAA = 80.81% (176579)
            2: AGCTCGGAA =  0.21% (459)
            3: CGATCGGAA =  0.21% (448)
            4: AGATCGGCA =  0.20% (445)
            5: AGATAGGAA =  0.16% (360)

    --adapter1 SEQUENCE
        Adapter sequence expected to be found in mate 1 reads [default:
        AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNATCTCGTATGCCGTCTTCTGCTTG].

    --adapter2 SEQUENCE
        Adapter sequence expected to be found in mate 2 reads [default:
        AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT].
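The "Top 5 most common 9-bp 5'-kmers" tables above are essentially counts of how each overhang (the sequence left over after the overlapping part of a pair) begins. Here is a minimal Python sketch of that tally, assuming the overhangs have already been extracted into a list. The function name and the toy input are made up for illustration, and this is not AdapterRemoval's own code.

```python
from collections import Counter


def top_5prime_kmers(overhangs, k=9, n=5):
    """Return [(kmer, count, percent)] for the n most common leading k-mers.

    Overhangs shorter than k are ignored, as they cannot supply a full k-mer.
    """
    counts = Counter(s[:k] for s in overhangs if len(s) >= k)
    total = sum(counts.values())
    return [(kmer, c, 100.0 * c / total) for kmer, c in counts.most_common(n)]


if __name__ == "__main__":
    # Toy input: three clean adapter starts, one with an error, one too short.
    toy = ["AGATCGGAAGAGC"] * 3 + ["AGATCGGCAGAGC", "AGAT"]
    for kmer, count, pct in top_5prime_kmers(toy):
        print(f"{kmer} = {pct:.2f}% ({count})")
```

In the output above, AGATCGGAA dominates for both mates, which matches the start of the default --adapter1 and --adapter2 sequences.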
ADD COMMENTlink written 9 months ago by Mensur Dlakic6.0k
1
9 months ago by
JC10k
Mexico
JC10k wrote:

a) ask the provider or whoever did the sequencing

b) use FastQC to check which adapter was used

ADD COMMENTlink modified 9 months ago • written 9 months ago by JC10k
1
9 months ago by
Makplus T70
Makplus T70 wrote:

The best solution is to ask your sequencing data provider.
Typically, QC software (such as FastQC) can report some common adapter types, while Trim Galore can automatically detect and cut these adapters.
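
The auto-detection idea can be sketched in a few lines of Python: count how many reads contain the start of each common adapter and pick the most frequent one. This is only an illustration, not the actual code of FastQC or Trim Galore. The three prefixes below are my assumption of the usual Illumina universal, small RNA and Nextera adapter starts, so check them against the Illumina adapter document before relying on them.

```python
# Assumed adapter prefixes (verify against the Illumina adapter PDF).
ADAPTER_PREFIXES = {
    "illumina_universal": "AGATCGGAAGAGC",
    "small_rna": "TGGAATTCTCGG",
    "nextera": "CTGTCTCTTATA",
}


def guess_adapter(reads):
    """Return (best_adapter_name_or_None, counts) for an iterable of reads.

    A read counts towards an adapter if it contains that adapter's prefix
    anywhere (exact match only, no sequencing errors allowed).
    """
    reads = list(reads)
    counts = {
        name: sum(1 for r in reads if prefix in r)
        for name, prefix in ADAPTER_PREFIXES.items()
    }
    best = max(counts, key=counts.get)
    return (best if counts[best] > 0 else None), counts


if __name__ == "__main__":
    toy_reads = [
        "ACGTACGTAGATCGGAAGAGCACACG",
        "TTTTAGATCGGAAGAGCGGG",
        "ACGTACGTACGT",
        "GGGCTGTCTCTTATACACATCT",
    ]
    print(guess_adapter(toy_reads))
```

On real data you would run this on a subsample of reads (for example the first million) rather than the whole file.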

ADD COMMENTlink written 9 months ago by Makplus T70