I am new to gene alignment. I am trying to align a few short and long reads to a reference genome using bwa. I have been following this tutorial - https://icb.med.cornell.edu/wiki/index.php/Elementolab/BWA_tutorial
I did the following -
- Downloaded bwa-0.5.9.tar.bz2 and built it.
- Downloaded reference human genome as wg.fa
- Created the BWT index using - bwa index -p hg19bwaidx -a bwtsw wg.fa
- Tried to align long reads from 454seqs.txt - bwa bwasw hg19bwaidx 454seqs.txt > 454seqs.sam
The alignment output file (SAM format) contains only the header lines and no alignment records. Can someone tell me what I am missing? Also, what would you suggest as next steps for experimenting with this software to get a better understanding of alignment run times?
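On the run-time part of the question, one simple way to experiment is to wrap each command in a small timer. The following is only a minimal sketch: the helper name time_command is made up for illustration, and it assumes bwa is on the PATH when it is used with the commands from this post.

```python
import subprocess
import time


def time_command(cmd, stdout_path=None):
    """Run cmd (a list of arguments) and return (elapsed_seconds, return_code).

    If stdout_path is given, the command's standard output is written to that
    file, which mirrors a shell redirection such as `> 454seqs.sam`.
    Otherwise the output is discarded.
    """
    start = time.perf_counter()
    if stdout_path is None:
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL)
    else:
        with open(stdout_path, "wb") as out:
            result = subprocess.run(cmd, stdout=out)
    elapsed = time.perf_counter() - start
    return elapsed, result.returncode
```

Calling time_command(["bwa", "bwasw", "hg19bwaidx", "454seqs.txt"], stdout_path="454seqs.sam") would time the alignment step above. Repeating that with read files of different sizes should give a rough feel for how run time scales.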
Hi swbarnes2, thanks for your answer. Here are my comments on your observations.
The SAM file looks something like this -
@SQ SN:chr10 LN:135534747
@SQ SN:chr11 LN:135006516
@SQ SN:chr11_gl000202_random LN:40103
. . .
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@PG ID:bwa PN:bwa VN:0.7.17-r1188 CL:bwa samse hg19bwaidx 454seqs.txt.bwa 454seqs.txt
I am sorry, the version I am using is bwa-0.7.17. Can you shed some light on this version?
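To confirm that a SAM file really contains only header lines, a quick count of header lines versus alignment records can help. This is a rough sketch, not part of bwa: the function name summarize_sam is hypothetical, and it relies only on two SAM conventions (header lines start with @, and FLAG bit 0x4 marks an unmapped read).

```python
def summarize_sam(path):
    """Count header lines, mapped records and unmapped records in a SAM file."""
    counts = {"header": 0, "mapped": 0, "unmapped": 0}
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue  # ignore blank lines
            if line.startswith("@"):
                counts["header"] += 1  # @HD, @SQ, @PG, ... lines
                continue
            # Alignment records are tab-separated; FLAG is the second field.
            flag = int(line.split("\t")[1])
            if flag & 0x4:
                counts["unmapped"] += 1
            else:
                counts["mapped"] += 1
    return counts
```

If summarize_sam("454seqs.sam") reports zero mapped and zero unmapped records, the aligner never saw any reads, which usually points at the input reads file rather than at the index.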
Hi, I was able to resolve the issue.
My long reads file looked something like this before -
>
read1_2079_205 AGTCTCCCTCTTTGTCTCTCCCATTGACTCAGCTTTCTATGGCCTCAGATTCCCCATCCCCTTCCCAACGCCCCAGCACTGGAAGACACGTGCTGTCCCTGGTGCCGGCTCCTACGGCTAGGTCGTGGCTGGCCAGGAAATCCCCCTGGTTGGCTATGCCCCCGCTGCTCCCCCCCGCGATTACCTCCCCGGGCGCGTGCGCAAT
I had to replace the header marker shown above with a plain > placed directly in front of the read name (>read1_2079_205) for it to work.
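For anyone hitting the same problem, a small check of the reads file before running the aligner can catch malformed header lines. The sketch below is only an illustration: check_fasta is a made-up helper, and the specific problems it looks for (an HTML-escaped &gt; marker, a > with no read name after it, sequence data before any header) are assumptions about common causes, not a confirmed diagnosis of the file in this thread.

```python
def check_fasta(path):
    """Return a list of (line_number, message) for suspicious FASTA lines."""
    problems = []
    seen_header = False
    with open(path) as handle:
        for number, raw in enumerate(handle, start=1):
            line = raw.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith("&gt;"):
                # Assumed cause: the file was copied from a web page.
                problems.append((number, "header starts with '&gt;' instead of '>'"))
                seen_header = True
            elif line.startswith(">"):
                if len(line) == 1:
                    problems.append((number, "'>' has no read name after it"))
                seen_header = True
            elif not seen_header:
                problems.append((number, "sequence data before any '>' header"))
    return problems
```

An empty list from check_fasta("454seqs.txt") means none of these particular problems were found; it does not guarantee the file is otherwise valid.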