Question: Roche 454: How to know if reads are paired?
pastatonio78 wrote (5.3 years ago):

Hi,

Imagine a FASTQ file generated on a Roche 454 platform. You have no information whatsoever about the protocol that was used. The read headers give no specific information, just random alphanumeric characters. Each read starts with a 30 bp sequence and ends with a 15 bp sequence that look to me like adapters (?).

How can I be sure whether the reads are single-end or paired-end? Is there any way to tell just from the sequence information?

Thanks ;)

rtliu wrote (5.3 years ago):

For 454 FLX data:

grep 'GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC' 454Reads.fastq | wc -l

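You should see a large count for 454 'paired-end' data, or 0 for single-end data. For Titanium data, search for the Titanium linker (or its reverse complement) listed below instead.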

The built-in linker sequences are:

  1. -linker flx -- GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC, a palindrome, equal to its own reverse complement.
  2. -linker titanium -- TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG and the reverse-complement CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA.

For more, see http://wgs-assembler.sourceforge.net/wiki/index.php/SffToCA
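
The grep check above can be extended to cover both linkers and both orientations in one pass. The following is only a minimal sketch, assuming an uncompressed FASTQ file with exactly four lines per record and exact (error-free) linker matches; the function names are hypothetical, and the linker sequences are the two quoted above.

```python
# Linker sequences as quoted in the answer above.
FLX_LINKER = "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC"
TITANIUM_LINKER = "TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def fastq_sequences(lines):
    """Yield the sequence line of each record (assumes 4 lines per record)."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip().upper()


def count_linker_reads(lines):
    """Count reads that contain each linker in either orientation."""
    linkers = {
        "flx": {FLX_LINKER, reverse_complement(FLX_LINKER)},
        "titanium": {TITANIUM_LINKER, reverse_complement(TITANIUM_LINKER)},
    }
    counts = {"total": 0, "flx": 0, "titanium": 0}
    for seq in fastq_sequences(lines):
        counts["total"] += 1
        for name, variants in linkers.items():
            if any(variant in seq for variant in variants):
                counts[name] += 1
    return counts

# Example use (file name taken from the grep command above):
#   with open("454Reads.fastq") as handle:
#       print(count_linker_reads(handle))
```

A large fraction of reads containing a linker would point to 454 'paired-end' data; a count of 0 for both would point to single-end data. Exact matching will miss linkers that contain sequencing errors, so the counts should be read as a lower bound.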


Thanks. Sorry for the late answer.... 16 months ago LOL

(reply by pastatonio78, 3.9 years ago)
kmcarr00 wrote (5.3 years ago):

In 454 technology there is no such thing as paired reads, at least in the sense that we all understand paired-end sequencing. Given the design of their bead-based sequencing, it is impossible to generate reads from both ends of a template fragment. All 454 reads are single reads, possibly with a barcode at the start of the read.

Roche had a protocol which they called "paired end", but that was a misappropriation of the term. It was a protocol used for amplicon sequencing which mixed capture beads with the A and B oligos to randomize which end of an amplicon molecule would get sequenced. You still only got one read from each fragment.


What Roche/454 calls 'paired end' is sequencing both ends of longer fragments by circularisation, linker ligation, fragmentation and sequencing the fragments containing the linker. We would now call that a variant of mate pair sequencing.
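
Since such a read carries both fragment ends on either side of the linker, the two mates can be recovered by cutting the read at the linker. A minimal sketch, assuming the FLX linker quoted in rtliu's answer and an exact match; `split_at_linker` and `min_len` are hypothetical names, and the orientation of the mates relative to the original fragment is not handled here.

```python
# FLX linker as quoted in rtliu's answer; it equals its own reverse complement.
FLX_LINKER = "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC"


def split_at_linker(read, linker=FLX_LINKER, min_len=1):
    """Split a read at the first exact linker match.

    Returns (left, right), the sequence on each side of the linker, or
    None if the linker is absent or either side is shorter than min_len.
    """
    start = read.find(linker)
    if start == -1:
        return None
    left = read[:start]
    right = read[start + len(linker):]
    if len(left) < min_len or len(right) < min_len:
        return None
    return left, right
```

Reads for which this returns None either lack the linker or have it at the very edge, and would have to be treated as single reads.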

(reply by lexnederbragt, 5.3 years ago)

You are right Lex. It has been so long since I've dealt with 454 data I forgot about that format.

(reply by kmcarr00, 5.3 years ago)
Prakki Rama wrote (5.3 years ago):

I would check like this:

LC_ALL=C grep -F 'ID' 454Reads.fastq | cut -d " " -f 1 | sort | uniq -d

This prints every read name that occurs more than once. If many names are printed, the file probably contains paired reads.

*ID in the above command should be the common string you see in all the read headers.
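
The same duplicate-name check can be done without knowing a common header string by reading the header line of each record directly. This is a rough sketch, assuming exactly four lines per FASTQ record and that the read name is the header text up to the first whitespace; `duplicated_read_ids` is a hypothetical name.

```python
from collections import Counter


def duplicated_read_ids(lines):
    """Return {read_id: count} for read names that occur more than once."""
    ids = Counter()
    for i, line in enumerate(lines):
        if i % 4 != 0:
            continue  # only the first line of each 4-line record is a header
        fields = line[1:].split()  # drop the leading '@', split on whitespace
        if fields:
            ids[fields[0]] += 1
    return {read_id: n for read_id, n in ids.items() if n > 1}

# Example use:
#   with open("454Reads.fastq") as handle:
#       print(len(duplicated_read_ids(handle)))
```

Selecting headers by position rather than by pattern avoids mistaking a quality line that happens to start with '@' for a header.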
