Trimming fastq up until a sequence
4.2 years ago
SemiQuant ▴ 80

Hi

I have a problem that is somewhat difficult to search for on Google. I need to trim the reads in my fastq files up to a known sequence, keeping that sequence but removing everything before it.

This 
someRandomNoise_aKnownSequence_unknownSequence 
becomes
aKnownSequence_unknownSequence

All the tools I use, or have seen, would remove both the "someRandomNoise" and the "aKnownSequence".

I could try to find the location of the sequence in each read and then trim the reads in a loop, but this seems very inefficient.
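For reference, the per-read approach described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the FASTQ is standard 4-line format and gzip-compressed, the known sequence is matched exactly (no mismatches), and reads without the sequence are written out unchanged. The function names and file paths are hypothetical, chosen for illustration.

```python
import gzip
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (header, seq, plus, qual) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def trim_to_anchor(seq: str, qual: str, anchor: str) -> Tuple[str, str]:
    """Drop everything before the first exact match of `anchor`, keeping the anchor.

    Reads without the anchor are returned unchanged (an assumption; you may
    prefer to discard them instead).
    """
    pos = seq.find(anchor)
    if pos == -1:
        return seq, qual
    # Trim the quality string by the same amount so it stays in sync.
    return seq[pos:], qual[pos:]


def trim_fastq(in_path: str, out_path: str, anchor: str) -> None:
    """Stream a gzipped FASTQ and write the left-trimmed reads."""
    with gzip.open(in_path, "rt") as fin, gzip.open(out_path, "wt") as fout:
        for header, seq, plus, qual in read_fastq(fin):
            seq, qual = trim_to_anchor(seq, qual, anchor)
            fout.write(f"{header}\n{seq}\n{plus}\n{qual}\n")
```

Usage would look like trim_fastq("input.fq.gz", "output.fq.gz", "aKnownSequence"). Because the file is streamed one record at a time, memory use stays flat, although a dedicated trimmer will likely be faster on large files.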

trimming fastq sequencing • 1.8k views

To verify your trimming results, you might like to clone our visualisation tool Trimviz (see the example report here; currently in beta testing). I apologize for the shameless plug, but this is exactly the kind of non-standard trimming situation for which I envisaged it would be useful. Dependencies include a few common R and Python libraries, plus samtools (and ideally seqtk). In FQ mode, give it the pre-trimmed and post-trimmed fastq file names (python path/to/trimviz.py FQ -u <untrimmed.fq.gz> -t <trimmed.fq.gz> -o <outdir>), and use -k 50000 if you don't have seqtk installed or are in a hurry. I imagine you would see a big block of vertical stripes around the 5' trimming site in the sequence heat-maps, corresponding to the desired target sequence. If it is to the RIGHT of the 5' trimming site, then that sequence has been successfully retained in your reads and everything before it has been trimmed.
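As a quick numerical complement to a visual check, one could also count how many trimmed reads actually begin with the known sequence. The snippet below is a rough sketch, not part of Trimviz; the function names are hypothetical, and it assumes a gzipped 4-line FASTQ and exact matching.

```python
import gzip
from itertools import islice
from typing import Iterable, Iterator


def fastq_sequences(path: str) -> Iterator[str]:
    """Yield only the sequence line (2nd of every 4) from a gzipped FASTQ."""
    with gzip.open(path, "rt") as handle:
        for line in islice(handle, 1, None, 4):
            yield line.rstrip("\n")


def fraction_starting_with(sequences: Iterable[str], anchor: str) -> float:
    """Return the fraction of reads whose first bases are exactly `anchor`."""
    total = 0
    hits = 0
    for seq in sequences:
        total += 1
        if seq.startswith(anchor):
            hits += 1
    return hits / total if total else 0.0
```

For example, fraction_starting_with(fastq_sequences("trimmed.fq.gz"), "aKnownSequence") should be close to 1.0 if the known sequence was retained at the start of most reads; a value near 0 suggests it was trimmed away along with the noise.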

4.2 years ago
GenoMax 141k

You can use bbduk.sh from the BBMap suite with the following structure:

bbduk.sh in=input.fq.gz out=output.fq.gz literal=aKnownSequence ktrim=l

A detailed guide is available here.


I can't believe I missed that in the guide (it's the first paragraph!). Thanks.

4.2 years ago
SemiQuant ▴ 80

I've found that filtlong has a trim option that also does this.
