I'm trying to create a reference index for all human chromosomes (22+X+Y+M) using bowtie2, but it's getting stuck after performing some steps.
You should add more details so that we can help you. What error is it throwing?
It's not giving any error, it just gets stuck. I left it running for 24 hours on a system with 8GB of RAM, but it stayed stuck.
There are some pre-built indexes available at ftp://ftp.ccb.jhu.edu/pub/data/bowtie_indexes/, but it seems they are for Bowtie, not Bowtie2, as they have the extension .ebwt.
So can I use these indexes for Bowtie2 as well?
No, bowtie1 indices won't work for bowtie2.
Just download the prebuilt ones from iGenomes. Note that they don't seem to have the most recent human reference (GRCh38). If you really do need that, just stop the current bowtie2-build process and start it again; this is presumably a one-off event.
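On the .ebwt question above, a quick way to tell the two index flavors apart is the file suffix. This is only a rough sketch: the helper name `index_flavor` is made up for illustration, and the suffix lists assume the usual conventions (.ebwt/.ebwtl for Bowtie 1, .bt2/.bt2l for Bowtie 2).

```python
from pathlib import Path

# Assumed suffix conventions: Bowtie 1 writes .ebwt (.ebwtl for large
# indexes), Bowtie 2 writes .bt2 (.bt2l for large indexes).
BOWTIE1_SUFFIXES = (".ebwt", ".ebwtl")
BOWTIE2_SUFFIXES = (".bt2", ".bt2l")


def index_flavor(filename: str) -> str:
    """Return 'bowtie1', 'bowtie2', or 'unknown' based on the file suffix.

    Hypothetical helper for illustration; it only looks at the name and
    never opens the file.
    """
    suffix = Path(filename).suffix.lower()
    if suffix in BOWTIE1_SUFFIXES:
        return "bowtie1"
    if suffix in BOWTIE2_SUFFIXES:
        return "bowtie2"
    return "unknown"
```

For example, `index_flavor("hg19.1.ebwt")` reports a Bowtie 1 index, which is why those FTP files can't be passed to bowtie2.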
Yes, I want a GRCh38 reference index. Is 8GB of RAM enough, or does it need more?
I've already tried it 3 times, but it isn't working.
I would expect that that's enough, but I don't have anything with that little RAM with which to test.
bowtie2-build -f GRch38.fasta ref_index
Please check the command line arguments above that I am using, and let me know whether they're okay or whether I'm missing something.
You don't need -f, but that shouldn't be causing the problem. The command looks fine.
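If it helps to see where the build stalls, here is a minimal sketch that runs the same command from Python and writes everything bowtie2-build prints to a log file. The function names and the log file name are hypothetical; only the `bowtie2-build [-f] <fasta> <index_base>` form from the post above is assumed.

```python
import subprocess


def build_command(fasta, index_base, explicit_fasta_flag=False):
    """Assemble the bowtie2-build argument list used in this thread.

    The -f flag is optional, so it is only added on request.
    """
    cmd = ["bowtie2-build"]
    if explicit_fasta_flag:
        cmd.append("-f")
    cmd += [fasta, index_base]
    return cmd


def run_build(fasta, index_base, log_path="bowtie2-build.log"):
    """Run the build, sending stdout and stderr to log_path.

    Hypothetical wrapper: returns the exit code of bowtie2-build.
    Requires bowtie2-build to be on the PATH.
    """
    with open(log_path, "w") as log:
        result = subprocess.run(
            build_command(fasta, index_base),
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return result.returncode
```

Calling `run_build("GRch38.fasta", "ref_index")` and then looking at the last lines of the log should at least show which step was running when it appeared to hang.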
Do you want the UCSC or Ensembl chromosome names? I can just make the indices and put them on google drive or something for you.
I want Ensembl chromosome names (GRCh38). Please do me that favor, if you can.
Thanks a lot, Devon Ryan!
Can do. Is the "primary assembly" OK or do you want "top level" (i.e., with all of the patches and alternate haplotypes)?
Basically, I'm going to use this index for mapping reads to find SNPs and indels, and I've downloaded dbSNP human build 141 (VCF file) based on the GRCh38 assembly.
Please give me whichever works better with that dbSNP version.
I've also changed the sequence identifiers in the FASTA files to match the names used in dbSNP,
so I can provide you with the FASTA sequences.
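For anyone who needs to do the same identifier renaming, a rough sketch in Python might look like the following. The function name and the example mapping are hypothetical; the actual names have to be taken from the VCF being used.

```python
def rename_fasta_headers(lines, name_map):
    """Yield FASTA lines with sequence IDs replaced according to name_map.

    Hypothetical helper: the ID is taken to be the first whitespace-
    delimited token after '>'. Any description after the ID is dropped,
    and IDs missing from name_map are kept unchanged.
    """
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            old_id = tokens[0] if tokens else ""
            yield ">" + name_map.get(old_id, old_id) + "\n"
        else:
            yield line


# Example mapping only; check the contig names in your own VCF.
example_map = {"chr1": "1", "chrM": "MT"}
example_in = [">chr1 some description\n", "ACGT\n", ">chrM\n", "GG\n"]
example_out = list(rename_fasta_headers(example_in, example_map))
```

Since the function works on any iterable of lines, it can be fed an open file handle and the result written out line by line without loading the whole genome into memory.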
Here are the indices for UCSC (since Ensembl isn't yet available):
I'll remove them in a day or two.
Thanks a lot!
Please let me know which chromosomes you have included in the index.
All of 22+X+Y+M?
Also, which version is it, hg38?
GRCh38, which for UCSC includes all of the chromosomes and unplaced contigs and I believe alternate alleles as well.
BTW, that requires ~6 gigs of RAM to index.
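To check for yourself which sequences a reference contains (chromosomes, unplaced contigs, alternates), one option is to list the names and lengths from the FASTA before indexing. This is a minimal sketch with a made-up function name, assuming a plain uncompressed FASTA.

```python
def fasta_sequence_lengths(lines):
    """Return a dict mapping each sequence ID to its total length.

    Hypothetical helper: the ID is the first whitespace-delimited token
    of each '>' header line; blank lines are ignored.
    """
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths
```

Used as `fasta_sequence_lengths(open("GRch38.fasta"))`, it gives a quick count of how many sequences there are and how large they are. For an index that is already built, `bowtie2-inspect --names ref_index` should print the sequence names it contains.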
I tried it with 8GB of RAM; I don't know what the problem was.
Thanks a lot again. I've downloaded the files.
As mentioned below, a GRCh38 reference hasn't been made by Ensembl yet, so I can only offer the UCSC version.
Actually, Ensembl hasn't released a version for GRCh38 yet (it'll be in the next release, number 76).
Hi, did you find the bug?