How to handle Trinity post-filtering singletons from paired-end sequencing?
8.8 years ago
Ekarl2 ▴ 120

I have a paired-end dataset. After filtering the data with Trimmomatic, I am left with singletons: reads where one mate of a pair survived filtering but the other did not. I have singletons from both the left and the right reads, and I am wondering how to include them in a Trinity assembly.

There are a few places online where I can find suggestions, but they recommend different things, and I would like to understand this in more detail.


The Trinity FAQ says:

If you have additional singletons, add them to the .fq file that they correspond to based on the sequencing method used (if they're equivalent to the left.fq entries, add them there, etc).

This seems to suggest that the left singletons go into the left file and the right singletons go into the right file. Does Trinity handle this so that it will not treat two singletons that have nothing to do with each other as a pair?

The Trinity mailing list says:

For running Trinity, you don't need to separate any unpaired reads from the paired reads. If you want to, you can just cat them all together into a single file, and run trinity as:

Trinity.pl --single all_reads.fastq --run_as_paired <other_opts>

One SEQanswers post from 2012 suggests renaming all right singletons to /1 and adding all singletons to left:

If you have both paired and unpaired data, and the data are NOT strand-specific, you can combine the unpaired data with the left reads of the paired fragments. Be sure that the unpaired reads have a /1 as a suffix to the accession value similarly to the left fragment reads. The right fragment reads should all have /2 as the accession suffix. Then, run Trinity using the --left and --right parameters as if all the data were paired.
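For the renaming step in that last suggestion, here is a minimal sketch in Python. It assumes old-style Illumina headers ending in /1 or /2 and strict four-line FASTQ records (no wrapped sequences); the function name `rename_to_left` is made up for illustration and is not part of Trinity. Note that it drops any comment text after the first whitespace in the header.

```python
def rename_to_left(fastq_lines):
    """Yield FASTQ lines with every header's mate suffix forced to /1.

    Assumes 4-line records and headers like '@read7/2' (hypothetical input).
    """
    for i, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        if i % 4 == 0:  # first line of each 4-line record is the header
            if not line.startswith("@"):
                raise ValueError(f"expected FASTQ header at line {i + 1}: {line!r}")
            name = line.split()[0]  # keep only the read name, drop any comment
            if name.endswith(("/1", "/2")):
                name = name[:-2]  # strip the existing mate suffix
            line = name + "/1"
        yield line


# Example: a right-hand singleton relabelled as a left read.
right_singleton = ["@read7/2\n", "ACGT\n", "+\n", "IIII\n"]
renamed = list(rename_to_left(right_singleton))
```

The renamed records could then be appended to the left file after the paired left reads. Whether that is the right thing to do for a given dataset is exactly the question asked below.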


My main question is this. Should I:

  • add all of my left singletons to the left file and all of my right singletons to the right file?
  • add all of my singletons regardless of /1 or /2 suffix, to the left one?
  • add all of my singletons, renamed to /1 suffix, to the left one?

Does it matter? Or is this one of those questions where I need to try all of the options and compare the results? Or should I just ignore it and run Trinity on a single file containing all reads with --run_as_paired, to make it less of a hassle?

singletons paired-end Trinity • 4.6k views
8.8 years ago

The easiest way to preserve pairing is to generate a fake 1 bp read as the mate of each singleton. BBDuk can do this automatically: if you set the flag removeifeitherbad=f, pairs are discarded only if both reads are trimmed to below minlen, and because BBDuk always trims reads to a minimum of 1 bp, pairing is maintained. You can set outs in BBDuk to get singleton reads, but in many cases it is more convenient not to.

A sample command, for quality trimming to Q15 and discarding pairs only if both of them end up shorter than 30bp, would be:

bbduk.sh in1=r1.fq in2=r2.fq out1=trimmed1.fq out2=trimmed2.fq qtrim=r trimq=15 minlen=30 removeifeitherbad=f

This uses the phred algorithm for quality trimming, but BBDuk also supports window-based trimming like Trimmomatic.
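If the singletons already exist as separate files (for example from Trimmomatic), the same idea can be sketched by hand in Python. This is not how BBDuk does it internally: BBDuk keeps 1 bp of the real mate, whereas this sketch invents a placeholder mate with sequence N and quality "!" (Phred 0 in Phred+33 encoding), which is an assumption on my part. The function names are hypothetical, and /1 and /2 header suffixes with four-line records are assumed.

```python
def placeholder_mate(header, mate_suffix):
    """Return a 4-line FASTQ record for a 1 bp placeholder mate (assumed N / '!')."""
    name = header.split()[0]  # read name without any trailing comment
    if name.endswith(("/1", "/2")):
        name = name[:-2]  # strip the singleton's own mate suffix
    return [name + mate_suffix, "N", "+", "!"]


def pair_up_singletons(singleton_lines, singleton_is_left=True):
    """Return (left_lines, right_lines) where each singleton gets a placeholder mate."""
    lines = [line.rstrip("\n") for line in singleton_lines]
    if len(lines) % 4 != 0:
        raise ValueError("input is not a whole number of 4-line FASTQ records")
    left, right = [], []
    for i in range(0, len(lines), 4):
        record = lines[i:i + 4]
        if singleton_is_left:
            left.extend(record)
            right.extend(placeholder_mate(record[0], "/2"))
        else:
            right.extend(record)
            left.extend(placeholder_mate(record[0], "/1"))
    return left, right


# Example: one left singleton gains a 1 bp right mate, so the files stay in sync.
left, right = pair_up_singletons(["@r1/1", "ACGT", "+", "IIII"])
```

The two returned lists can be appended to the paired left and right files in the same order, so both files keep the same number of records.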

Please note, by the way, that I don't use Trinity so I don't know how it handles paired reads versus singletons; but this approach should work no matter how it handles them.

8.1 years ago
qingxiangg ▴ 40

I don't know whether your problem has been solved yet.

In short, I would include the singletons only when a large number of reads were lost to trimming (Trinity can handle them); otherwise, use only the paired reads that passed trimming.
