Raw Illumina Data
12.7 years ago
Cdiez ▴ 150

Hello,

I have just received the raw data for an Illumina genomic library (one lane), so I already have a 6 GB FASTQ file. I know I have to trim the adaptors and collapse the file, keeping only the "unique" sequences. But is there any package that can do this, i.e. process the "raw" data of a library, or is writing your own Perl scripts the only way to tackle the problem?

Thanks in advance!

illumina adaptor perl • 4.8k views
12.7 years ago
Swbarnes2 ★ 1.6k

Most genomic libraries don't have problems with adaptors. They only crop up when the sequences that one wants sequenced are very short. You probably have 60-100 bases of a 200+ base DNA insert, so you won't see adaptors.

Usually, duplicates are figured out after alignment, not before. Computationally, it's easier on a sorted .bam than on raw reads, if you go by coordinates.
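To illustrate why coordinate-sorted input makes this easy, here is a minimal Python sketch. It assumes each alignment has already been reduced to a `(chrom, pos, strand, read_name)` tuple and that the list is sorted by chromosome and position; the function name and the tuple layout are made up for the example and are not any tool's API. Because the input is sorted, only the strands seen at the current position need to be remembered.

```python
def mark_duplicates(sorted_alignments):
    """Yield (read_name, is_duplicate) for coordinate-sorted alignments.

    sorted_alignments: iterable of (chrom, pos, strand, read_name) tuples,
    sorted by (chrom, pos). This layout is an assumption for the sketch.
    """
    current_pos = None      # (chrom, pos) of the group being processed
    strands_seen = set()    # strands already observed at current_pos
    for chrom, pos, strand, name in sorted_alignments:
        if (chrom, pos) != current_pos:
            # New coordinate: forget everything about the previous one.
            current_pos = (chrom, pos)
            strands_seen = set()
        # A read is a duplicate if an earlier read shares its start and strand.
        yield name, strand in strands_seen
        strands_seen.add(strand)


alignments = [
    ("chr1", 100, "+", "r1"),
    ("chr1", 100, "+", "r2"),
    ("chr1", 100, "-", "r3"),
    ("chr1", 250, "+", "r4"),
]
flags = list(mark_duplicates(alignments))
```

Real duplicate-marking tools also consider mate positions and base qualities; this sketch only shows the coordinate-based idea.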


Thanks! Yes, I have this problem with the adaptors because I am sequencing small RNA (20-30 nt), so I have to trim them before starting the analysis.
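For anyone wondering what 3' adaptor trimming amounts to, here is a rough Python sketch using exact matching only. The adaptor string below is just a placeholder (substitute the one from your library prep kit), and `trim_adapter` and `min_overlap` are names invented for this example. Dedicated trimmers also tolerate mismatches and trim the quality string in parallel, which this does not.

```python
def trim_adapter(read, adapter, min_overlap=5):
    """Remove a 3' adapter (full, or a partial prefix at the read end).

    Exact matching only; min_overlap is the shortest partial adapter
    that will be trimmed from the end of the read.
    """
    # Case 1: the whole adapter occurs somewhere in the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: the read ends in the first k bases of the adapter.
    # Try the longest possible overlap first.
    for k in range(min(len(adapter), len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read


ADAPTER = "TGGAATTCTCGGGTGCCAAGG"   # placeholder, replace with your kit's adapter
insert = "TAGCTTATCAGACTGATGTTGA"   # a 22 nt small RNA used as test input

full = trim_adapter(insert + ADAPTER + "AAAA", ADAPTER)
partial = trim_adapter(insert + ADAPTER[:14], ADAPTER)
untouched = trim_adapter("ACGTACGTACGT", ADAPTER)
```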


There are a few adapter trimming solutions out there: http://seqanswers.com/forums/showthread.php?t=1159


Thanks! I appreciate your help! I'm also checking fastx toolkit and it seems quite useful.

12.7 years ago
Ido Tamir 5.2k

The FASTX-Toolkit provides some simple-to-use command-line utilities to do this.

12.7 years ago
Darked89 4.6k

Check Tagdust from: http://genome.gsc.riken.jp/osc/english/dataresource/

If you need to do the opposite (select FASTQ reads matching a certain pattern), there is fqgrep: https://github.com/indraniel/fqgrep
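If you just want to see the idea behind pattern-based read selection, a plain Python sketch could look like the following. It is not fqgrep's implementation or interface; `parse_fastq` and `select_reads` are hypothetical helper names, and the parser assumes strict 4-line FASTQ records.

```python
import re


def parse_fastq(lines):
    """Yield (header, sequence, quality) from strict 4-line FASTQ records."""
    it = iter(lines)
    # zip over the same iterator four times consumes one record per step;
    # an incomplete trailing record is silently dropped.
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def select_reads(lines, pattern):
    """Yield only the records whose sequence matches the regex pattern."""
    regex = re.compile(pattern)
    for header, seq, qual in parse_fastq(lines):
        if regex.search(seq):
            yield header, seq, qual


example = [
    "@read1\n", "ACGTTGGAATTCAC\n", "+\n", "IIIIIIIIIIIIII\n",
    "@read2\n", "ACGTACGTACGTAC\n", "+\n", "IIIIIIIIIIIIII\n",
]
hits = list(select_reads(example, r"TGGAATTC"))
```

For a real file, pass an open file handle instead of the list, e.g. `select_reads(open("reads.fastq"), r"TGGAATTC")`, where the file name is only an example.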

12.7 years ago
Bach ▴ 550

Filtering for unique sequences is a very bad habit. I never understood why people would even envisage doing that: you enrich for sequencing errors.
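A toy calculation makes the enrichment concrete. The numbers below are invented for illustration: 100 reads from one true sequence, two of which carry a single-base error. Before collapsing, 2% of the reads are erroneous; after keeping one copy of each unique sequence, two of the three remaining sequences (about 67%) are errors.

```python
from collections import Counter


def error_fraction(seqs, true_seq):
    """Fraction of sequences in seqs that differ from the true sequence."""
    seqs = list(seqs)
    return sum(1 for s in seqs if s != true_seq) / len(seqs)


true_seq = "TAGCTTATCAGACTGATGTTGA"
err1 = "TAGCTTATCAGACTGATGTTGT"   # last base miscalled
err2 = "TAGCTTATCAGACTGTTGTTGA"   # one internal base miscalled

# Invented data: 98 correct reads plus two reads with errors.
reads = [true_seq] * 98 + [err1, err2]

counts = Counter(reads)            # unique sequence -> number of reads
before = error_fraction(reads, true_seq)         # per read
after = error_fraction(counts.keys(), true_seq)  # per unique sequence

print(f"erroneous before collapsing: {before:.1%}")
print(f"erroneous after collapsing:  {after:.1%}")
```

If you do collapse, keeping the counts alongside each unique sequence, as `counts` does here, at least preserves the abundance information that a plain unique filter throws away.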

12.7 years ago
Ying W ★ 4.2k

You can use FastQC to figure out if you have adapter issues and also base bias.
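To give a feel for what "base bias" means, here is a small Python sketch that computes per-position base composition from a list of read sequences. It only illustrates the idea and is not how FastQC is implemented; `per_position_composition` is a name made up for this example. A strongly skewed composition toward the 3' end of the reads is one hint that adapter sequence is present.

```python
from collections import Counter


def per_position_composition(seqs):
    """Return one dict per read position mapping base -> fraction of reads."""
    columns = []  # columns[i] counts the bases observed at position i
    for seq in seqs:
        for i, base in enumerate(seq):
            if i == len(columns):
                columns.append(Counter())
            columns[i][base] += 1
    # Normalise by the number of reads that are long enough to cover
    # each position, so shorter reads do not distort the fractions.
    return [
        {base: n / sum(col.values()) for base, n in col.items()}
        for col in columns
    ]


composition = per_position_composition(["ACGT", "ACGA", "ACG"])
```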
