How to divide the reference sequence into batches or chunks
8.0 years ago
ddzhangzz ▴ 90

Running bowtie2-build on a combined hg19 + mm10 FASTA fails with the error "Reference sequence has more than 2^32-1 characters! Please divide the reference into batches or chunks of about 3.6 billion characters or less each and index each independently." The full output:

    $ bowtie2-build -f hg19mm10.fa hg19mm10
    Settings:
      Output files: "hg19mm10.*.bt2"
      Line rate: 6 (line is 64 bytes)
      Lines per side: 1 (side is 64 bytes)
      Offset rate: 4 (one in 16)
      FTable chars: 10
      Strings: unpacked
      Max bucket size: default
      Max bucket size, sqrt multiplier: default
      Max bucket size, len divisor: 4
      Difference-cover sample period: 1024
      Endianness: little
      Actual local endianness: little
      Sanity checking: disabled
      Assertions: disabled
      Random seed: 0
      Sizeofs: void*:8, int:4, long:8, size_t:8
    Input files DNA, FASTA:
      hg19mm10.fa
    Reading reference sizes
    Error: Reference sequence has more than 2^32-1 characters!  Please divide the
    reference into batches or chunks of about 3.6 billion characters or less each
    and index each independently.

How do I divide the reference into batches or chunks as the message suggests? Has anyone done this before?

RNA-Seq • 2.6k views

Use a more recent version of bowtie2, which supports large indexes.
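
If upgrading is not an option, the split that the error message asks for can be planned with a short script. The following is a minimal sketch, not anything bowtie2 itself provides: the helper names `fasta_lengths` and `plan_batches` are made up for illustration, and the greedy grouping in file order is just one simple way to keep each batch under a chosen character budget.

```python
LIMIT = 2**32 - 1  # the limit quoted in the bowtie2-build error message


def fasta_lengths(lines):
    """Return {record name: sequence length} for an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def plan_batches(lengths, max_chars):
    """Greedily group records, in file order, into batches of at most max_chars."""
    batches, current, used = [], [], 0
    for name, n in lengths.items():
        if n > max_chars:
            raise ValueError(f"{name} alone exceeds {max_chars} characters")
        # Start a new batch when adding this record would exceed the budget.
        if current and used + n > max_chars:
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += n
    if current:
        batches.append(current)
    return batches


# Hypothetical usage on the file from the question:
# with open("hg19mm10.fa") as fh:
#     lengths = fasta_lengths(fh)
# print(sum(lengths.values()) > LIMIT)
# print(plan_batches(lengths, 3_600_000_000))
```

Each batch would then be written to its own FASTA file and indexed with a separate bowtie2-build run. Reads would have to be aligned against every index and the results reconciled afterwards, which is why a single large index is usually the simpler route.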


Thanks, very helpful!


What is the main goal behind combining hg19 and mm10?


To build a single index for human and mouse combined.


What about regions which are already rather similar/conserved? I don't know about your downstream application, but this sounds like a tricky approach.


This is a pretty standard approach for dealing with mixed samples (I assume that's what OP has).
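
One practical detail when concatenating the two genomes: the UCSC hg19 and mm10 FASTA files both use record names such as chr1, so the names need to be made unique before indexing. Here is a rough sketch of one way to do that, assuming species prefixes `hg19_` and `mm10_` (my own choice, not a convention that bowtie2 requires) and two hypothetical helpers, `tag_headers` and `species_of`.

```python
def tag_headers(lines, tag):
    """Yield FASTA lines with `tag` prepended to every record name."""
    for line in lines:
        if line.startswith(">"):
            yield ">" + tag + line[1:]
        else:
            yield line


def species_of(ref_name, tags=("hg19_", "mm10_")):
    """Map a tagged reference name back to its species, or None if untagged."""
    for tag in tags:
        if ref_name.startswith(tag):
            return tag.rstrip("_")
    return None


# Hypothetical usage: write a combined FASTA with unique names.
# The input file names below are placeholders.
# with open("hg19mm10.fa", "w") as out:
#     for path, tag in (("hg19.fa", "hg19_"), ("mm10.fa", "mm10_")):
#         with open(path) as fh:
#             out.writelines(tag_headers(fh, tag))
```

After alignment, the reference name of each read then tells you which species it was assigned to, for example `species_of("mm10_chrX")` gives `"mm10"`.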


Learned something new, though I assume there will be some ambiguity in highly conserved regions.
