Question: Trimmomatic SureSelect protocol
8 months ago by
Delaware, USA
Moneeb Bajwa0 wrote:

Hi,

My reads are annotated "Construction protocol: Agilent SureSelect Strand Specific RNA". Can I just use the TruSeq3 adapter files to trim them? What about Nextera? I am using Trimmomatic, and the data were sequenced on an Illumina HiSeq 3000. Sorry if the question doesn't make sense; I am new to this.

Thank you

sequencing rna-seq assembly • 558 views
ADD COMMENTlink modified 8 months ago • written 8 months ago by Moneeb Bajwa0

Hello bajwa.m

For Agilent SureSelect, read the Agilent manual.

You can use the following sequence as the adapter for trimming: "CTGTCTCTTGATCACA".

For an initial check (to see how many of your reads contain the adapter), use:

grep "CTGTCTCTTGATCACA" input.fastq | wc -l

You cannot use other adapters. If you find out from the library preparation which adapter was used, first check it with the grep command above, and only then use that adapter for trimming.
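The same check can be sketched in Python so that only sequence lines are searched and the total number of reads is reported alongside the hits. This is a minimal sketch, assuming plain 4-line FASTQ records; the function name and the demo records are made up for illustration and are not from the SRA run discussed here.

```python
def count_reads_with_adapter(fastq_lines, adapter):
    """Return (reads containing `adapter`, total reads).

    `fastq_lines` is any iterable of FASTQ lines, e.g. an open file handle.
    Assumes plain 4-line records (sequences are not wrapped).
    """
    hits = 0
    total = 0
    for index, line in enumerate(fastq_lines):
        if index % 4 == 1:  # the second line of each record is the sequence
            total += 1
            if adapter in line:
                hits += 1
    return hits, total


# Made-up records for illustration only.
DEMO_FASTQ = [
    "@r1", "ACGTCTGTCTCTTGATCACA", "+", "IIIIIIIIIIIIIIIIIIII",
    "@r2", "ACGTACGTACGTCTGTCTCT", "+", "IIIIIIIIIIIIIIIIIIII",
    "@r3", "TTTTGGGGCCCCAAAATTTT", "+", "IIIIIIIIIIIIIIIIIIII",
]

if __name__ == "__main__":
    # Compare the full adapter with a shorter prefix of it.
    for probe in ("CTGTCTCTTGATCACA", "CTGTCTCT"):
        hits, total = count_reads_with_adapter(DEMO_FASTQ, probe)
        print(f"{probe}: {hits} of {total} reads")
```

On a real file you would pass an open handle instead of the demo list, e.g. `with open("input.fastq") as handle: count_reads_with_adapter(handle, "CTGTCTCTTGATCACA")`. A shorter probe will usually match more reads, partly by chance, so hit counts for short prefixes should be read with care.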

ADD REPLYlink written 8 months ago by mks002150

I don't really understand. When I use that grep command on one of the files, I get 0 occurrences. These are the SRA sequences: https://www.ncbi.nlm.nih.gov/sra?linkname=bioproject_sra_all&from_uid=434667. I do get 34 occurrences in one of the files if I use just CTGTCTCTTGATC. I am not sure how this works, please help! Also, where did you get that particular adapter sequence from? I could not find it in the link you gave.

ADD REPLYlink modified 8 months ago • written 8 months ago by Moneeb Bajwa0

I think this is your first time working with NGS data.

You have to convert the SRA files to FASTQ format using fastq-dump. You can then perform trimming on the FASTQ files. Start reading more posts on Biostars to get yourself going.

ADD REPLYlink written 8 months ago by mks002150

No, I did run fastq-dump.

ADD REPLYlink written 8 months ago by Moneeb Bajwa0

Can you share the top 10 lines of the FASTQ file, e.g. with head "input.fastq"?

ADD REPLYlink written 8 months ago by mks002150

Yes, I just fixed my last comment; you can see it now.

ADD REPLYlink written 8 months ago by Moneeb Bajwa0
@DRR089573.1 J00158:10:H7CTLBBXX:1:1101:30594:1226 length=36
NTTGGGGGGAAGGTCTGGATCCAAGATGGTGATGAT
+DRR089573.1 J00158:10:H7CTLBBXX:1:1101:30594:1226 length=36
#<AAFJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJ
@DRR089573.2 J00158:10:H7CTLBBXX:1:1101:30695:1226 length=36
NCGTCATTGTCCCCTTGGCAGTGAGCAAAGGCCGTG
+DRR089573.2 J00158:10:H7CTLBBXX:1:1101:30695:1226 length=36
#<AAFJFFJJJJJJFFJFFJF<JAFJJJJFFFFJ<A
@DRR089573.3 J00158:10:H7CTLBBXX:1:1101:30756:1226 length=36
NCCGGATCCCATCTGAGAAGAAGTACACGCCAGGGG
ADD REPLYlink modified 8 months ago • written 8 months ago by Moneeb Bajwa0

Use the first few bases of the adapter, e.g. "CTGTCTCT", and go ahead with trimming.

Below is the text from Manual:

MiSeq platform sequencing run setup and adaptor trimming guidelines Use the Illumina Experiment Manager (IEM) software to generate a custom primer Sample Sheet. Set up the run to include adapter trimming using the IEM Sample Sheet Wizard. When prompted by the wizard, select the Use Adapter Trimming option, and specify CTGTCTCTTGATCACA as the adapter sequence. This enables the MiSeq Reporter software to identify the adaptor sequence and trim the adaptor from reads.

ADD REPLYlink written 8 months ago by mks002150

Is this something that is usually done? Is it possible it is a different adapter?

ADD REPLYlink written 8 months ago by Moneeb Bajwa0

The manual you gave is for SureSelect QXT: https://www.agilent.com/cs/library/usermanuals/public/G9682-90000.pdf; but my libraries are SureSelect Strand Specific. Does that matter? Mine are also single-end reads.

ADD REPLYlink modified 8 months ago • written 8 months ago by Moneeb Bajwa0

Do one thing: run FastQC on your FASTQ files and check the overrepresented sequences. If any adapter is present in your sample, you will see it after running FastQC. Good luck.
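To get a rough feel for what an overrepresented-sequence check looks at, the most frequent read sequences can be tallied directly. This is a simplified sketch and not FastQC's actual method; it assumes plain 4-line FASTQ records, and the function name and demo records are made up for illustration.

```python
from collections import Counter


def top_sequences(fastq_lines, n=5):
    """Return the n most common read sequences as (sequence, count, percent).

    `fastq_lines` is any iterable of FASTQ lines, e.g. an open file handle.
    """
    counts = Counter()
    for index, line in enumerate(fastq_lines):
        if index % 4 == 1:  # the second line of each record is the sequence
            counts[line.strip()] += 1
    total = sum(counts.values())
    # most_common() is empty when there are no reads, so no division by zero.
    return [(seq, c, 100.0 * c / total) for seq, c in counts.most_common(n)]


# Made-up records for illustration only.
DEMO_READS = [
    "@r1", "AAAACCCC", "+", "IIIIIIII",
    "@r2", "GGGGTTTT", "+", "IIIIIIII",
    "@r3", "AAAACCCC", "+", "IIIIIIII",
]

if __name__ == "__main__":
    for seq, count, percent in top_sequences(DEMO_READS, n=2):
        print(f"{seq}\t{count}\t{percent:.2f}%")
```

Unlike FastQC, this does not compare the sequences against a list of known adapters, so the top hits would still have to be looked up by hand.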

ADD REPLYlink written 8 months ago by mks002150

OK thanks! The result was the following for overrepresented sequences: GATCGGAAGAGCACACGTCTGAACTCCAGTCACGAA (0.10780696988082418%) - TruSeq Adapter, Index 7 (97% over 35bp), and AATGATACGGCGACCACCGAGATCGGAAGAGCACAC (0.13940876934680813%) - Illumina Single End PCR Primer 1 (95% over 24bp). Does that make sense if it was Agilent SureSelect Protocol? These are also short reads of only 36bp, does that matter?
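On the read-length point: with 36 bp reads, an adapter often shows up only as a partial sequence at the 3' end of a read, which a search for the full adapter string would miss. A minimal sketch of that idea follows, assuming exact matching only (real trimmers such as Trimmomatic tolerate mismatches). The function name, the `min_overlap` parameter, and the demo read are hypothetical; the adapter string is the first overrepresented sequence reported above.

```python
# First overrepresented sequence reported by FastQC in this thread.
ADAPTER = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACGAA"


def find_3prime_adapter(read, adapter, min_overlap=5):
    """Return the position where the adapter starts in `read`, or None.

    Matches either the full adapter inside the read or a prefix of the
    adapter (at least `min_overlap` bases) at the very end of the read.
    Exact matching only; sequencing errors are not tolerated.
    """
    for start in range(len(read) - min_overlap + 1):
        tail = read[start:start + len(adapter)]
        if adapter.startswith(tail):
            return start
    return None


# Made-up 36 bp read: 26 bases of insert followed by 10 adapter bases.
DEMO_READ = "ACGTTGCAACGTTGCAACGTTGCAAC" + ADAPTER[:10]

if __name__ == "__main__":
    position = find_3prime_adapter(DEMO_READ, ADAPTER)
    if position is not None:
        print("adapter starts at", position)
        print("trimmed read:", DEMO_READ[:position])
```

A small `min_overlap` finds more partial adapters but also removes more bases that only match by chance, so the value is a trade-off rather than a fixed rule.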

ADD REPLYlink written 8 months ago by Moneeb Bajwa0