Question: How do I trim adapters from an RNA-seq dataset downloaded from GEO?
silas008 (100) wrote:
I need to trim the adapters from an RNA-seq dataset on NCBI GEO: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE28888 FastQC shows that there are some Illumina primers and adapters, but I'm not sure how accurate FastQC is for this. The authors do not provide the adapter sequences on GEO. How can I find out exactly which adapters to trim? Thank you.
rna-seq • 1.5k views
modified 4.2 years ago • written 4.2 years ago by silas008 (100)

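In the FastQC output you get overrepresented sequences, but FastQC derives them from only the first 200K sequences, which is very little compared to the library size.

However, you can build a consensus of the overrepresented sequences and then check how often it occurs in the first and last 100,000 reads (400,000 lines) with: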

head -n 400000 fastq | grep 'consensus_sequences' | wc -l

tail -n 400000 fastq | grep 'consensus_sequences' | wc -l

Keep editing the sequence until you get a count of 90000-99000, i.e. until 90-99% of the 100,000 reads sampled contain it.
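A slightly more careful version of the head/grep check above is sketched here. It is only an illustration: the function name count_adapter_hits is made up for this example, and it assumes an uncompressed FASTQ file with exactly four lines per record. Unlike a plain grep over all lines, it searches only the sequence line of each record, so quality strings and headers cannot produce matches.

```shell
# Hypothetical helper (not part of FastQC or BBTools).
# Usage: count_adapter_hits FASTQ CANDIDATE_SEQUENCE [NUMBER_OF_READS]
# Assumes uncompressed FASTQ with 4 lines per record (no wrapped sequences).
count_adapter_hits() {
  fastq=$1
  adapter=$2
  nreads=${3:-100000}   # default: first 100,000 reads = 400,000 lines

  head -n "$((nreads * 4))" "$fastq" |
    awk -v a="$adapter" '
      # Line 2 of every 4-line record is the read sequence.
      NR % 4 == 2 {
        total++
        # index() does a literal substring search (no regex surprises).
        if (index($0, a) > 0) hits++
      }
      END { printf "%d of %d reads\n", hits, total }'
}
```

To sample the end of the file instead, the same awk program can be fed from tail -n 400000, provided the file really has four lines per record so that the record boundaries line up.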

I am sure there is another way, based on alignment. I read a paper where the authors took a bulk of CLIP-seq data and trimmed the adapters using some calculations.

Here is the paper; see the methods section for what they did to remove the adapters.

modified 4.2 years ago • written 4.2 years ago by Manvendra Singh (2.1k)
h.mon (27k) wrote:

You are two degrees of separation from your answer. On the GEO page you provided:

Citation(s) Warf MB, Shepherd BA, Johnson WE, Bass BL. Effects of ADARs on small RNA processing pathways in C. elegans. Genome Res 2012 Aug;22(8):1488-98. PMID: 22673872

Clicking the PMID link takes you to the PubMed entry for the paper. From there you can follow the link to the full article, where you learn that they used the Illumina Small RNA Prep Kit v1.0 or v1.5, and Novoalign to trim the 3' adaptors.

You can use BBDuk with the options tpe and tbo; it should trim adapters even without knowing their sequence.

Edit: read this thread on using BBDuk for small RNA; it says BBDuk doesn't work well for them. However, Brian Bushnell, the author of BBTools, updates the tools very frequently, so it may well be fixed by now.

modified 4.2 years ago • written 4.2 years ago by h.mon (27k)

As a matter of fact... :)

My recommended methodology has changed slightly for situations where you do not know the adapter sequence.  It still requires paired reads, though.  First, you can determine the adapter sequences like this:

bbmerge.sh in1=read1.fq in2=read2.fq outa=adapters.fa reads=1m

(for small RNAs, add the flags mininsert=15 mininsert0=15)

Then you can run BBDuk:

bbduk.sh in1=read1.fq in2=read2.fq out1=trimmed1.fq out2=trimmed2.fq ref=adapters.fa ktrim=r k=23 mink=11 hdist=1 tbo tpe

This is more sensitive than just running BBDuk with the "tbo" flag and no adapter sequence.
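One simple way to check that a trimming run did something sensible is to compare the read-length distribution before and after. The sketch below is an illustration, not a BBTools feature: read_length_hist is a made-up helper name, and it assumes uncompressed FASTQ with four lines per record.

```shell
# Hypothetical helper (not part of BBTools).
# Prints "length<TAB>count" for every read length found in a FASTQ file,
# sorted by length. Assumes uncompressed FASTQ with 4 lines per record.
read_length_hist() {
  awk '
    # Line 2 of every 4-line record is the read sequence.
    NR % 4 == 2 { n[length($0)]++ }
    END { for (len in n) print len "\t" n[len] }
  ' "$1" | sort -n
}
```

Running read_length_hist on read1.fq and then on trimmed1.fq should show the trimmed file shifted toward shorter lengths if adapters were removed. For small RNA libraries one would expect most trimmed reads to be short, since the insert is shorter than the read.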

modified 4.2 years ago • written 4.2 years ago by Brian Bushnell (16k)
silas008 (100) wrote:

Thank you so very much. I combined your answers and found my adapter sequence!

It is: ATCTCGTATGCCGTCTTCTGCTTG

But I don't know whether I should trim using the whole adapter sequence or only the overrepresented sequence from FastQC, which is part of the adapter sequence.

written 4.2 years ago by silas008 (100)

As the adapter sequence for trimming reads, use the whole adapter sequence, not the short overrepresented kmers from FastQC.  Those kmers are too short and will either be ignored or will result in false positives - longer is better.
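A back-of-the-envelope calculation shows why short sequences give false positives. Under the simplifying assumption that reads are random with equal base frequencies (real reads are not, so treat this as an order-of-magnitude sketch), a specific sequence of length k matches at a given position with probability 1/4^k, so N reads of length L are expected to contain about N * (L - k + 1) / 4^k chance matches. The function name expected_chance_hits and the example numbers are made up for illustration.

```shell
# Hypothetical helper: expected number of chance matches of one k-mer
# in N random reads of length LEN (uniform, independent bases assumed).
# Usage: expected_chance_hits K LEN N
expected_chance_hits() {
  awk -v k="$1" -v len="$2" -v n="$3" 'BEGIN {
    # A k-mer longer than the read cannot match at all.
    if (k > len) { print 0; exit }
    # (len - k + 1) possible start positions, each matching with prob 4^-k.
    printf "%.3g\n", n * (len - k + 1) / (4 ^ k)
  }'
}
```

For example, with one million 36 nt reads (an assumed read length), expected_chance_hits 8 36 1000000 gives a few hundred chance matches for an 8-mer, while the full 24 nt adapter gives a value far below one.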

modified 4.2 years ago • written 4.2 years ago by Brian Bushnell (16k)
silas008 (100) wrote:

Ok.

Thank you again!

written 4.2 years ago by silas008 (100)

Biostars etiquette: You can (and should) use the "Reply" or "add comment" button when replying to an answer instead of adding a new answer. Adding a new answer instead of commenting on an existing one will not destroy the world, but commenting keeps things tidy and makes threads of conversation easy to follow :)

written 4.2 years ago by A. Domingues (2.1k)

Ohhhh... I'm sorry. I started using Biostar yesterday and I was not paying attention to that. Thank you ;)

written 4.2 years ago by silas008 (100)