Hi, I'm doing a GWAS study in Arabidopsis thaliana. To preprocess the samples I'm using the commands bwa aln and bwa sampe, then samtools import reference.fa sample.sam sample.bam.
But as the script continues, it fails with an error that I don't know how to fix. This is the error:
[samopen] SAM header is present: 7 sequences.
[sam_read1] reference 'ID:bwa PN:bwa VN:0.7.10-r789 CL:/home/bwa-0.7.10/
bwa sampe ./thaliana.fa SRR519713_1.sai SRR519713_2.sai SRR519713_1.fastq SRR519713_2.fastq
' is recognized as '*'.
[main_samview] truncated file.
Does anyone know what I am doing wrong?
Another question: which reference genome do I have to pass in this command line:
java -Xmx4g -jar $HOME/opt/GATK2/GenomeAnalysisTK.jar -T RealignerTargetCreator -R $2 -I $3.bam -o $3
Thank you!
What command are you giving that produces that error message? It looks like the header is completely messed up. BTW, use samtools view instead of samtools import.
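For illustration, here is a rough sketch of the conversion with samtools view, preceded by a quick sanity check of the SAM file. The file names sample.sam and sample.bam are placeholders, and check_sam_header is a hypothetical helper, not part of samtools. It only flags lines that are neither header lines nor complete alignment records, which is one way a broken header like the one in the error might show up. It is not a full SAM validator.

```shell
#!/bin/sh
# Hypothetical helper (not part of samtools): print every line that is
# neither a header line (starting with '@') nor an alignment record with
# at least 11 tab-separated fields. Returns non-zero if any such line exists.
check_sam_header() {
    awk -F'\t' '!/^@/ && NF < 11 { printf "line %d: %s\n", NR, $0; bad = 1 }
                END { exit bad + 0 }' "$1"
}

# sample.sam and sample.bam are placeholder names.
# -b writes BAM; -S marks the input as SAM (needed by old 0.1.x releases,
# accepted and ignored by newer ones).
if command -v samtools >/dev/null 2>&1 && [ -f sample.sam ]; then
    check_sam_header sample.sam && samtools view -bS sample.sam > sample.bam
fi
```

If the check prints anything, the SAM file was probably written incorrectly, so it makes sense to look at how the bwa sampe step is being called before converting.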