Testing data for read mapping tool
6.1 years ago
nada • 0

Hi everyone,

I have run the SparkBWA tool (https://github.com/citiususc/SparkBWA) for read mapping on Microsoft Azure (an HDInsight cluster), and it works OK, but running it costs me money because of the cloud services.

I need to make some modifications to the SparkBWA code and test them, but doing that on Azure and checking the results of each modification would be very expensive for me. Is there any suggestion or solution for a quick test, since a large test takes a long time? Can I use small read files (FASTQ) and a small reference (FASTA), even if they are not real data? I just want small datasets.

I have tried to run SparkBWA on my local machine, but it hung and a black screen appeared, maybe because of the large reference and the huge amount of work.

So can I just replace the two read files with small ones?

If the code is OK, I will try it with the real data later on.

Thanks for your help.

I am new to bioinformatics.

alignment genome next-gen
6.1 years ago
michael.ante ★ 3.8k

Hi nada,

For simple testing like this, you can use any small FASTA file (e.g. a viral or bacterial genome, or just one small chromosome of a model organism) and then use randomreads.sh from the BBMap tools to generate enough, but not too many, reads for your purpose.
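If BBMap isn't at hand, a rough stand-in in plain Python can produce a tiny paired-end test set. This is only a minimal sketch, not the BBMap or SparkBWA way of doing it: it samples error-free read pairs from a single-sequence FASTA, and the filenames (ref.fasta, read1.fastq, read2.fastq) and read/insert lengths are just example values.

```python
import random

def load_fasta(path):
    """Read the first (and assumed only) sequence from a FASTA file."""
    seq = []
    with open(path) as fh:
        for line in fh:
            if not line.startswith(">"):
                seq.append(line.strip())
    return "".join(seq).upper()

def revcomp(s):
    """Reverse-complement a DNA string."""
    return s.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]

def simulate_pairs(ref, n_pairs, read_len=100, insert=300,
                   out1="read1.fastq", out2="read2.fastq"):
    """Write n_pairs of error-free paired-end reads sampled from ref."""
    qual = "I" * read_len                      # constant high base quality
    with open(out1, "w") as r1, open(out2, "w") as r2:
        for i in range(n_pairs):
            start = random.randrange(0, len(ref) - insert)
            frag = ref[start:start + insert]
            fwd = frag[:read_len]              # read 1: 5' end of the fragment
            rev = revcomp(frag[-read_len:])    # read 2: 3' end, reverse strand
            r1.write(f"@sim_{i}/1\n{fwd}\n+\n{qual}\n")
            r2.write(f"@sim_{i}/2\n{rev}\n+\n{qual}\n")

if __name__ == "__main__":
    reference = load_fasta("ref.fasta")        # e.g. a Phi X or Mycoplasma genome
    simulate_pairs(reference, n_pairs=1000)
```

A thousand 100 bp pairs against a genome of a few kilobases or megabases is enough to exercise the whole pipeline in seconds, which is what you want when checking code changes.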

Cheers,

Michael


Can you give me a link to download a viral or bacterial dataset that I can use for paired-end reads (read1 and read2 FASTQ files) together with the reference FASTA file?

Thank you for your help


A very famous viral genome is that of Phi X, which may be too small. A small bacterial genome is, for instance, that of Mycoplasma; E. coli has a bigger genome.

In general, you can search NCBI's Nucleotide database or Ensembl Bacteria.
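To make this concrete, the Phi X 174 genome (accession NC_001422.1) can be pulled directly from NCBI's E-utilities efetch endpoint. A minimal sketch, with the output filename chosen arbitrarily:

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch: fetch one nucleotide record in FASTA format.
params = urlencode({
    "db": "nuccore",
    "id": "NC_001422.1",   # Phi X 174 genome accession
    "rettype": "fasta",
    "retmode": "text",
})
url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?" + params

with urlopen(url) as response, open("phix.fasta", "wb") as out:
    out.write(response.read())
```

The same call works for any other accession you find in the Nucleotide database, e.g. a Mycoplasma or E. coli chromosome, by swapping the id parameter.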

