ABySS syntax for reads split across two lanes
8.4 years ago
kcamnairb ▴ 40

Hi, I have a paired-end library that was split across two sequencing lanes. The left and right reads are in separate files, so I have a total of 4 files for each library. How would I specify this library in ABySS, or do I need to concatenate the reads from the two lanes? For example, would this be correct:

abyss-pe k=64 name=ecoli lib='pe200 pe500' \
    pe200='pe200_lane1_1.fa pe200_lane1_2.fa pe200_lane2_1.fa pe200_lane2_2.fa' \
    pe500='pe500_lane1_1.fa pe500_lane1_2.fa pe500_lane2_1.fa pe500_lane2_2.fa'

I think the easiest thing to do is to concatenate your reads: put all your left reads in one file and all your right reads in another, for each library that you have.
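A minimal sketch of that concatenation, using the pe200 file names from the question. The toy FASTA records and the scratch directory are assumptions added only so the snippet runs on its own; with real data you would skip the `printf` lines. The one thing to watch is that both lanes are listed in the same order for the left and right files, so that the n-th record in each output still belongs to the same pair.

```shell
# Work in a scratch directory with tiny placeholder reads (illustrative only).
workdir=$(mktemp -d)
cd "$workdir"
printf '>a/1\nACGT\n' > pe200_lane1_1.fa
printf '>a/2\nTTGA\n' > pe200_lane1_2.fa
printf '>b/1\nGGCA\n' > pe200_lane2_1.fa
printf '>b/2\nCATG\n' > pe200_lane2_2.fa

# Concatenate lane 1 then lane 2, in the same order for both mates.
cat pe200_lane1_1.fa pe200_lane2_1.fa > pe200_1.fa
cat pe200_lane1_2.fa pe200_lane2_2.fa > pe200_2.fa
```

The library could then be given as `pe200='pe200_1.fa pe200_2.fa'`.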

However, I have to say I don't think I've ever heard of one library being split and sequenced across two lanes. I'd be careful with that data. If there are any disparities between the performance of the two lanes, your data might be a bit wacky. On the other hand, maybe it's not such a big deal. Just curious, is there any particular reason you sequenced the library in this fashion?

Thanks, I'll try concatenating the reads. I actually have two libraries, so I think pooling libraries across multiple lanes is supposed to reduce any lane bias.

When we have N samples to run across M lanes, it's normal to pool them across all lanes, so that any disparity in lane performance is spread evenly over all samples. I haven't seen a lane completely fail, but the sequencing people claim it used to happen, and this strategy makes sure no sample is completely lost to a failed lane.

8.4 years ago
benv ▴ 730

Hi @kcamnairb,

The command line you have specified is correct.

If multiple files are specified, ABySS will assume that the files are ordered in the following way: dataset1_read1.fq, dataset1_read2.fq, dataset2_read1.fq, dataset2_read2.fq, ...

If you only specify a single file, ABySS will assume it contains both reads 1 and 2 (interleaved).

Btw, ABySS understands gzipped files and in addition to FASTQ you can also use FASTA, SAM, and BAM formats.
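Since the pairing depends entirely on file order, a quick sanity check before launching the assembly may help. The sketch below is not part of ABySS: `check_pairs` is a hypothetical helper, and it assumes mate files are named with a `_1` / `_2` suffix right before the extension, as in the question.

```shell
# Hypothetical helper (not an ABySS command): verify that a file list
# alternates read1, read2, read1, read2, ...
# Assumes names end in _1.<ext> and _2.<ext>, e.g. lib_lane1_1.fa / lib_lane1_2.fa.
check_pairs() {
    if [ $(( $# % 2 )) -ne 0 ]; then
        echo "odd number of files" >&2
        return 1
    fi
    while [ $# -gt 0 ]; do
        r1=$1
        r2=$2
        shift 2
        # Derive the expected mate name by turning the trailing _1 into _2.
        expected=$(printf '%s\n' "$r1" | sed 's/_1\(\.[^_]*\)$/_2\1/')
        if [ "$expected" != "$r2" ] || [ "$r1" = "$r2" ]; then
            echo "unexpected pair: $r1 $r2" >&2
            return 1
        fi
    done
}
```

For example, `check_pairs pe200_lane1_1.fa pe200_lane1_2.fa pe200_lane2_1.fa pe200_lane2_2.fa` returns success, while a list given as lane1_1, lane2_1, lane1_2, lane2_2 would be rejected.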
