7.6 years ago
ashkan
I have some FASTQ files from Ribo-seq data, but I don't know which adapters to use when processing the files. Do you know how I can get the adapter sequence from the FASTQ files?
Try running FastQC on your FASTQ files. It will report whether there is adapter sequence (and which type of adapter) in your files. Check the sample results from FastQC. Then trim your adapters using trimmomatic or cutadapt, or select a tool according to your data (check Table 1 from this article).
Really, another question without following up on all previous threads?
Clearly @ashkan does not believe in giving credit/acknowledging help where it is due.
And many people here are angels/heroes/fantastic and will help anyone, regardless of their terrible attitude.
you are counting and following all of his posts :joy:
I wouldn't call it a lot of fun, but I just copy my previous comment and add the latest thread to it. I should write a bot to do that.
If you know that the sequence was generated using a standard commercial (e.g. Illumina) kit, then you could use the full set of adapters available in the resources directory of the BBMap suite to scan your data.
If that is not the case and you have paired-end data, then you may be able to use bbmerge.sh from BBMap to identify adapters, like this:
bbmerge.sh in1=r1.fq in2=r2.fq outa=adapters.fa

Typically this is difficult to figure out, as not all sequences will have the adapter, and when it does appear in a sequence it will have different offsets (only the first base, only the first 2 bases, etc). I asked this question a while back, and the solution seems to be to give Trim Galore! or similar just a list of all possible adapter sequences and see what the result looks like afterwards. From this you can probably even figure out post hoc which adapter was removed. I think there's also something in the BBMap toolkit that can help, but my recollection is that it didn't work on my data for some reason (I can't remember why; perhaps it only worked when the reads overlapped, or something). But that was some time ago, and BBMap is perhaps one of the most frequently updated suites of tools in bioinformatics right now :)
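To make the "try a list of candidate adapters and see which one shows up" idea concrete, here is a minimal Python sketch. It is not what Trim Galore! runs internally; it only counts reads that contain a candidate adapter in full, or that end in a prefix of it (the "different offsets" case). The three candidate sequences are the commonly cited Illumina universal, Nextera, and small RNA adapter starts; the function names and the `min_overlap` cutoff are my own assumptions for illustration.

```python
from collections import Counter

# Commonly cited adapter starts (assumed candidates; extend as needed).
CANDIDATE_ADAPTERS = {
    "illumina_universal": "AGATCGGAAGAGC",
    "nextera": "CTGTCTCTTATA",
    "small_rna": "TGGAATTCTCGG",
}


def read_sequences(fastq_lines):
    """Yield the sequence line (2nd of every 4 lines) of each FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip()


def count_adapter_hits(sequences, adapters=CANDIDATE_ADAPTERS, min_overlap=5):
    """Count reads carrying each adapter, fully or as a partial 3' match.

    A partial match means the read ends with the first k bases of the
    adapter, for some k >= min_overlap (hypothetical cutoff).
    Exact matching only: sequencing errors are not tolerated here.
    """
    hits = Counter()
    total = 0
    for seq in sequences:
        total += 1
        for name, adapter in adapters.items():
            if adapter in seq:
                hits[name] += 1
                continue
            # Try the longest partial overlap first, down to min_overlap.
            longest = min(len(adapter) - 1, len(seq))
            for k in range(longest, min_overlap - 1, -1):
                if seq.endswith(adapter[:k]):
                    hits[name] += 1
                    break
    return hits, total
```

On a real file you would pass something like `read_sequences(open("reads.fq"))` (file name hypothetical) and look at which candidate accounts for a large fraction of `total`. A candidate that hits only a handful of reads is probably a chance match rather than your adapter.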
BBMerge will only tell you the adapter sequence for paired reads that are overlapping. But then, paired reads that are not overlapping won't have any adapter sequence :) Other than adapter-dimers.
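A toy Python sketch of why overlap matters, in case it helps. This is not BBMerge's actual algorithm (which tolerates mismatches and builds a consensus across many pairs); it just assumes error-free reads. If the insert is shorter than the read length, read 1 starts with the insert and read 2 starts with its reverse complement, so whatever follows the insert in each read must be adapter. The function names and `min_insert` cutoff are made up for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    complement = str.maketrans("ACGTN", "TGCAN")
    return seq.translate(complement)[::-1]


def adapter_from_pair(r1, r2, min_insert=10):
    """Return (r1_adapter, r2_adapter) if the insert is shorter than the reads.

    Looks for an insert length L such that the first L bases of r1 equal
    the reverse complement of the first L bases of r2. Anything after
    position L in either read is then adapter read-through.
    Returns None when no such L >= min_insert exists.
    Exact matching only: a single sequencing error hides the overlap.
    """
    longest = min(len(r1), len(r2)) - 1
    for insert_len in range(longest, min_insert - 1, -1):
        if r1[:insert_len] == reverse_complement(r2[:insert_len]):
            return r1[insert_len:], r2[insert_len:]
    return None
```

For a pair whose insert is longer than the read length, no prefix satisfies the test and the function returns `None`, which matches the point above: those reads never run into the adapter in the first place.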
Right, yup, I've gotten mixed up and confused adapters with other things that are sequenced at the beginning of the read. My mistake :)