Question: Trimming SOLiD adapters

Hello,

I am currently working with SOLiD RNA-seq NGS data from 2013. My PI gave me 48 .csfasta and .qual files (24 F3 files and 24 F5 files). I used Galaxy's solid2fastq tool (called Convert SOLiD output to fastq: https://usegalaxy.org/?tool_id=solid2fastq&version=1.0.0&__identifer=3y01dzhi3ya) to create 24 .fastqcssanger files (12 F3 and 12 F5). I ran FastQC on these files, and almost all of them showed traces of ABI Solid3 Adapter B among the overrepresented sequences. Naturally, I want to remove these adapters before moving on to alignment.

I was able to find the sequence of this adapter at https://github.com/csf-ngs/fastqc/blob/master/Contaminants/contaminant_list.txt, but FastQC on Galaxy also reported a sequence for it. Which sequence should I use in an adapter removal program?

My next question is which tool I should use to trim this SOLiD adapter. I have been trying cutadapt for quite a while now and keep getting the same error (SyntaxError: invalid syntax). I have been entering the following commands without any success:

  • 1) "/usr/bin/python" to start Python (the system loads Python properly)

  • And then one of the following (note that the adapter sequence I have been entering is the one from the FastQC contaminant list linked above, not the one from the FastQC overrepresented sequences; hence the question above):

  • 2a) "cutadapt -c -a CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT '/home/WMurphy/Documents/Chondrocyte NGS/I3_2013_02_01_F5-RNA.csfasta' '/home/WMurphy/Documents/Chondrocyte NGS/I3_2013_02_01_F5-RNA.QV.qual' > output.fastq"
  • 2b) "cutadapt -c -a CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT '/home/WMurphy/Documents/Chondrocyte NGS/I3_2013_02_01_F5-RNA.csfasta' '/home/WMurphy/Documents/Chondrocyte NGS/I3_2013_02_01_F5-RNA.QV.qual'"
  • 2c) "cutadapt -c -a CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT '/home/WMurphy/Desktop/I3_F5.fastqcssanger'"
  • 2d) "cutadapt -c -a CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT '/home/WMurphy/Desktop/I3_F5.fastqcssanger' > output.fastq"
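
One possible reading of the steps above, offered as an assumption rather than a diagnosis: if the cutadapt lines are typed at the prompt of the Python interpreter opened in step 1, Python tries to parse them as Python source and rejects them before cutadapt ever runs. The minimal sketch below reproduces that with Python's built-in compile(). The helper name parses_as_python and the short file names reads.csfasta and reads.qual are hypothetical and stand in for the real paths.

```python
def parses_as_python(line: str) -> bool:
    """Return True if `line` compiles as Python source, False on SyntaxError."""
    try:
        compile(line, "<stdin>", "exec")
        return True
    except SyntaxError:
        return False


# Hypothetical, shortened file names; the adapter is the one from the question.
shell_command = (
    "cutadapt -c -a CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT "
    "reads.csfasta reads.qual"
)

# A shell command line is not valid Python, so the interpreter rejects it.
print(parses_as_python(shell_command))      # False
# An ordinary Python statement compiles without complaint.
print(parses_as_python("print('hello')"))   # True
```

If this is what is happening, leaving the interpreter with exit() and entering the cutadapt line directly at the shell prompt would avoid the SyntaxError.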

Nothing has worked. I have been rereading http://cutadapt.readthedocs.io/en/stable/colorspace.html over and over, and I still don't know what I am doing wrong. Any and all help would be immensely appreciated!

Have a great day!

Sincerely,

bioinformaticsfilesdrive

cutadapt rna-seq galaxy solid • 1.5k views
written 22 months ago by bioinformaticsfilesdrive

Are you actually quoting your commands? Or are they quoted just for this forum?

written 22 months ago by WouterDeCoster

Thanks for the reply. They are just quoted on the forum.

written 22 months ago by bioinformaticsfilesdrive

And the single quotes around paths? You don't need those.

written 22 months ago by WouterDeCoster

Those single quotes come from dragging the files into the terminal instead of typing the paths out. I'll try it without the quotes. Which command should I enter: 2a, 2b, 2c, or 2d?

EDIT: I tried all of them, also with the single quotes removed, and still received the same syntax error for each one.

modified 22 months ago • written 22 months ago by bioinformaticsfilesdrive
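
A side note on the quoting question, as a sketch and not a statement about what caused the error: two of the paths in the question contain a space ("Chondrocyte NGS"), so a POSIX shell would split them into two arguments once the quotes are dropped. Python's shlex module follows POSIX-style splitting rules closely enough to illustrate this. ADAPTER is a placeholder for the adapter sequence, not a real value.

```python
import shlex

# Path copied from the question; note the space in "Chondrocyte NGS".
path = "/home/WMurphy/Documents/Chondrocyte NGS/I3_2013_02_01_F5-RNA.csfasta"

# Without quotes, the space splits the path into two separate arguments.
unquoted = shlex.split(f"cutadapt -c -a ADAPTER {path}")

# shlex.quote wraps the path so that it survives as a single argument.
quoted = shlex.split(f"cutadapt -c -a ADAPTER {shlex.quote(path)}")

print(len(unquoted))        # 6
print(len(quoted))          # 5
print(quoted[-1] == path)   # True
```

The Desktop paths in 2c and 2d contain no spaces, so for those the quotes would make no difference.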