I have been struggling with GATK 4 and acutely miss the command-line pipelines that were available in previous versions. I looked through the Tool Index and tried to work out the commands from the workflow at https://github.com/gatk-workflows/gatk4-rnaseq-germline-snps-indels/blob/master/gatk4-rna-best-practices.wdl.
I have RNA-seq data and have used a GRCh38.p13 genomic reference that is not in the GATK resource bucket. So far I have processed around 600 samples with this same reference and followed the workflow through the SplitNCigarReads step with no problems.
In the BaseRecalibrator step I get the following error:
a) GATK version
The Genome Analysis Toolkit (GATK) v22.214.171.124 HTSJDK Version: 2.21.2 Picard Version: 2.21.9
b) Exact GATK commands used
GATK BaseRecalibrator -I input.bam -R reference.fasta --known-sites resources_broad_hg38_v0_1000G_phase1.snps.high_confidence.hg38.vcf --known-sites resources_broad_hg38_v0_Homo_sapiens_assembly38.dbsnp138.vcf --known-sites resources_broad_hg38_v0_Homo_sapiens_assembly38.known_indels.vcf --known-sites resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf -O recal_data.table
c) The entire error log if applicable.
A USER ERROR has occurred: Input files reference and features have incompatible contigs: No overlapping contigs found. reference contigs = [NC_000001.11, NT_187361.1, NT_187362.1, NT_187363.1, NT_187364.1, NT_187365.1, NT_187366.1, NT_187367.1, NT_187368.1, NT_187369.1, NC_000002.12, NT_187370.1, NT_187371.1, NC_000003.12, NT_167215.1, NC_000004.12, NT_113793.3
I downloaded these known-sites files from the Resource Bundle. Could you please tell me why they aren't working? Re-mapping at this stage is not preferable, and I am sure I am using the same reference that I used for all the steps before this.
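For anyone who wants to reproduce the check behind the error, here is a minimal Python sketch that compares the contig names in the reference with those declared in a known-sites VCF. It assumes the reference has a samtools-style `.fai` index next to it and that the VCF lists its contigs in `##contig` header lines. The function names are hypothetical helpers, not part of GATK. The reference contigs in the error above are RefSeq accessions (NC_000001.11 and so on), while the resource bundle VCFs, as far as I can tell, use `chr1`-style names, so I would expect the overlap to come back empty.

```python
import gzip
import sys


def fasta_index_contigs(fai_path):
    """Return contig names from a .fai index (first tab-separated column)."""
    with open(fai_path) as handle:
        return [line.split("\t", 1)[0].strip() for line in handle if line.strip()]


def vcf_contigs(vcf_path):
    """Return contig IDs declared in the ##contig header lines of a VCF.

    Handles plain-text and gzip/bgzip-compressed files. Assumes simple
    header lines such as ##contig=<ID=chr1,length=248956422>.
    """
    opener = gzip.open if vcf_path.endswith(".gz") else open
    contigs = []
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # end of header, no need to read the records
            if line.startswith("##contig=<"):
                body = line.strip()[len("##contig=<"):-1]  # drop trailing ">"
                for field in body.split(","):
                    if field.startswith("ID="):
                        contigs.append(field[3:])
                        break
    return contigs


def shared_contigs(fai_path, vcf_path):
    """Return reference contig names that also appear in the VCF header."""
    in_vcf = set(vcf_contigs(vcf_path))
    return [name for name in fasta_index_contigs(fai_path) if name in in_vcf]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example call (paths are placeholders):
    #   python check_contigs.py reference.fasta.fai known_sites.vcf
    overlap = shared_contigs(sys.argv[1], sys.argv[2])
    print(f"{len(overlap)} shared contig name(s)")
    print("first reference contigs:", fasta_index_contigs(sys.argv[1])[:5])
    print("first VCF contigs:", vcf_contigs(sys.argv[2])[:5])
```

An empty overlap would only confirm that the two files name their sequences differently. It says nothing about whether the underlying sequences are the same assembly.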
Please help. Thank you very much.