Hello everybody,
I am working with RNA-seq data and I'm trying to create VCF files with GATK. Here is my command line:
SNPref=/home/theos974/projects/def-thchlava/Chromomes_GATK_Files/GenomeAnalysisTK.jar
Humanref=/home/theos974/projects/def-benlab11/reference/hg38ercc.fa
readarray -t input_bam_files_cohort2 < input_bam_files_cohort2.txt
readarray -t output_vcf_Files_chr1_Cohort2 < output_vcf_Files_chr1_Cohort2.txt
java -jar $SNPref -L /home/theos974/projects/def-thchlava/Chromomes_GATK_Files/chr1_KG.all.chr.bim.hg38.intervals -T HaplotypeCaller -R $Humanref -U ALLOW_N_CIGAR_READS -rf ReassignMappingQuality -DMQ 60 -I ${input_bam_files_cohort2[$SLURM_ARRAY_TASK_ID]} -stand_call_conf 20 -o ${output_vcf_Files_chr1_Cohort2[$SLURM_ARRAY_TASK_ID]}
This command seems to be correct because it worked with my first set of data. But with my new set I get this error message:
ERROR MESSAGE: SAM/BAM/CRAM file /home/theos974/projects/def-thchlava/Cohort2/bam/NAC_215.sorted.bam is malformed. Please see "software.broadinstitute.org/gatk/documentation/article?id=1317" for more information. Error details: SAM file doesn't have any read groups defined in the header. The GATK no longer supports SAM files without read groups
I hope this is not a big problem,
Thank you very much in advance for any answer you can provide.
This is a well-known problem: GATK requires BAM files to contain read groups (@RG lines in the header). See https://gatkforums.broadinstitute.org/gatk/discussion/6472/read-groups
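One common way to fix this is to write a new BAM with a read group added, for example with Picard's AddOrReplaceReadGroups, and then point HaplotypeCaller at the new file. The sketch below is only an illustration under assumptions: `PICARD_JAR` is a placeholder for wherever your picard.jar lives, the function names are made up for this example, and the read-group values (`lib1`, `ILLUMINA`, `unit1`, and the sample name taken from the file name) are stand-ins that you should replace with the real library, platform and run information for your samples.

```shell
#!/usr/bin/env bash
# Sketch: add a read group to a BAM that lacks one, so GATK will accept it.
# PICARD_JAR and all read-group values below are placeholders (assumptions).

# Derive a sample name from a BAM path, e.g. .../NAC_215.sorted.bam -> NAC_215
sample_from_bam() {
    local name
    name=$(basename "$1")
    name=${name%.bam}
    name=${name%.sorted}
    printf '%s\n' "$name"
}

# Succeeds if the BAM header already has at least one @RG line.
has_read_group() {
    samtools view -H "$1" | grep -q '^@RG'
}

# Write a copy of the BAM with a single read group, then index it.
add_read_group() {
    local in_bam=$1 out_bam=$2 sample
    sample=$(sample_from_bam "$in_bam")
    java -jar "$PICARD_JAR" AddOrReplaceReadGroups \
        I="$in_bam" O="$out_bam" \
        RGID="$sample" RGSM="$sample" \
        RGLB=lib1 RGPL=ILLUMINA RGPU=unit1
    samtools index "$out_bam"
}
```

With these functions loaded, something like `has_read_group NAC_215.sorted.bam || add_read_group NAC_215.sorted.bam NAC_215.rg.bam` would produce a BAM with an @RG line, which you would then list in input_bam_files_cohort2.txt instead of the original. If the first cohort worked, its BAMs probably already had read groups, so it may be worth checking how the two cohorts were aligned.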
Hello and welcome to Biostars, theo.stefan.1,
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time. Thank you!
I will do it next time, thank you!