Bowtie error when indexing a reference file
9 weeks ago
Apex92 ▴ 260

Dear all,

I have a situation where I need to use all sequencing reads (a concatenated fasta file) as a reference. The concatenated fasta file has 50,829,402 reads. I tried to build the index with bowtie 1 by running bowtie-build concatenated.fasta ref, but I get the following error:

Error: Reference sequence has more than 2^32-1 characters!  Please divide the
reference into batches or chunks of about 3.6 billion characters or less each
and index each independently.


How can I solve this? Instead of running bowtie-build on a concatenated fasta file, can I use bowtie-build -f *.fasta ref?

Preferably I need to use bowtie 1.

Thanks.
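The error message above suggests dividing the reference into batches of about 3.6 billion characters or less and indexing each one independently. A minimal Python sketch of that splitting step might look like the following. The function names (read_fasta, chunk_records, write_chunks), the output file naming, and the default chunk size are illustrative assumptions, not part of bowtie.

```python
from typing import Iterable, Iterator, List, Tuple

# Limit quoted in the bowtie-build error message (2^32 - 1 characters).
BOWTIE1_LIMIT = 2**32 - 1
# Chunk size suggested by the same message ("about 3.6 billion characters").
DEFAULT_MAX_CHARS = 3_600_000_000

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, parts = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        else:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def chunk_records(records: Iterable[Record],
                  max_chars: int = DEFAULT_MAX_CHARS) -> Iterator[List[Record]]:
    """Group records so each chunk holds at most max_chars sequence characters.

    A single record longer than max_chars still ends up alone in its own
    (oversized) chunk; records are never split.
    """
    chunk, size = [], 0
    for header, seq in records:
        if chunk and size + len(seq) > max_chars:
            yield chunk
            chunk, size = [], 0
        chunk.append((header, seq))
        size += len(seq)
    if chunk:
        yield chunk


def write_chunks(fasta_path: str, prefix: str,
                 max_chars: int = DEFAULT_MAX_CHARS) -> List[str]:
    """Write chunks to prefix_0.fasta, prefix_1.fasta, ... (hypothetical naming)."""
    paths = []
    with open(fasta_path) as handle:
        for i, chunk in enumerate(chunk_records(read_fasta(handle), max_chars)):
            path = f"{prefix}_{i}.fasta"
            with open(path, "w") as out:
                for header, seq in chunk:
                    out.write(f">{header}\n{seq}\n")
            paths.append(path)
    return paths
```

Each resulting file could then be indexed on its own (for example bowtie-build chunk_0.fasta ref_0), with the query sequences aligned against every index and the results combined afterwards. Whether that workflow is appropriate for this assay is a separate question.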

bowtie RNA-seq • 499 views
May I ask why you are doing that? I cannot imagine any situation where concatenating reads (which are randomly fragmented DNA, after all) would make any sense.

These are sequenced PEA products that I need to check against the expected PEA sequences. I tried to map the reads, which are long (due to UMIs and primers), to the shorter expected PEA sequences allowing zero mismatches, but I got no alignment that makes sense. So I decided to do it the other way around and map the expected PEA sequences to the sequenced PEA products.

What is PEA? I think it would make sense to include a layout and a brief technical description of what you did and what the R1/R2 structure and the reference look like; then one can suggest an alignment strategy. So far this sounds quite non-standard.

It is not clear if it is possible to analyze the data using standard software.

Yes, use their software if possible, or contact customer support. This does not seem to be a standard assay.