Help with gatk BaseRecalibrator
1
0
4 months ago
Chris ▴ 260

Hi Biostars,

I am trying to do variant calling and got an error at this step. Do you have any suggestions? Thank you so much.

gatk BaseRecalibrator -I ${aligned_reads}/SRR062634_sorted_dedup_reads.bam -R ${ref} --known-sites ${known_sites} -O ${data}/recal_data.table

Invalid argument '/recal_data.table'
GATK variant-calling • 1.2k views
ADD COMMENT
2
4 months ago
Ram 43k

${data} has no value, which is why the command sees /recal_data.table instead of whatever directory you need it to be within.
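
A minimal sketch of why this happens, and one way to make the shell stop early instead. An unset variable expands to an empty string, so ${data}/recal_data.table becomes /recal_data.table. The ${data:?message} form is standard POSIX parameter expansion; the directory used at the end is only a placeholder, not a path from this thread.

```shell
# Start from a clean slate so the demo is reproducible.
unset data

# An unset variable silently expands to nothing:
out="${data}/recal_data.table"
echo "unset:  ${out}"          # prints "unset:  /recal_data.table"

# ${var:?msg} aborts the shell if var is unset or empty.
# It runs in a subshell here so the demo itself keeps going.
if ! ( : "${data:?data is not set}" ) 2>/dev/null; then
    echo "guard:  caught empty \$data before running anything"
fi

# Placeholder path (an assumption); substitute your own output directory.
data="/path/to/output"
out="${data:?data is not set}/recal_data.table"
echo "set:    ${out}"
```

Putting set -u near the top of a script has a similar effect for unset variables, although it does not catch variables that are set but empty.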

ADD COMMENT
0

My bad. After setting data, I got this error:

A USER ERROR has occurred: Invalid argument '/variant_calling/data/recal_data.table'
ADD REPLY
1

look at the path, it is an absolute path, you probably meant it to be a relative path

ADD REPLY
0

This is the path of data:

data="/variant_calling/data"

I am just trying to follow the code here: https://github.com/kpatel427/YouTubeTutorials/blob/main/variant_calling.sh

ADD REPLY
1

do you have data in the folder /variant_calling/data ?

seems odd that you would since that is a folder that opens from root, nobody should store data like that

... you could have that folder of course, but my guess is that it is a relative path gone wrong ...

make sure the paths exist and are valid and contain the files

ADD REPLY
0

No, data is a folder and it is empty at this point. It was created to store output, I think.

ADD REPLY
1

first of all make it a relative path

./variant_calling/data

so that you can be sure you can write to it,

long story short, and this is what we are trying to tell you: the file structure seems all messed up, so no wonder you have problems

make sure all the files and paths are valid, because invalid paths are why you are having problems,

once you make paths with a superuser account (which you had to do to make a folder under root), all kinds of weird problems may occur, the permissions can be all wrong etc.

so write to paths that you are sure are writable by the current user, and that means paths in your home directory
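
A quick way to check this before running a long job, as a rough sketch. The helper name can_write_to is made up for illustration; the -d and -w tests are standard shell. The demo uses a scratch directory from mktemp so nothing real is touched.

```shell
# Hypothetical helper: succeed only if the directory exists
# and the current user is allowed to write to it.
can_write_to() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Demo on a scratch directory.
scratch="$(mktemp -d)"
if can_write_to "$scratch"; then
    echo "writable: $scratch"
fi

# A directory that does not exist fails the check.
if ! can_write_to "$scratch/does_not_exist"; then
    echo "missing or not writable: $scratch/does_not_exist"
fi

rmdir "$scratch"
```

In a real script you would call it on your own output directory, for example can_write_to "$data" || exit 1, before the gatk step.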

ADD REPLY
0

Would you please have a look at this code? https://github.com/kpatel427/YouTubeTutorials/blob/main/variant_calling.sh This is code from a popular bioinformatics youtube channel. Maybe there is something wrong with the tutorial.

ADD REPLY
1

kpatel clearly declares a valid path for the data variable (https://github.com/kpatel427/YouTubeTutorials/blob/main/variant_calling.sh#L46). You're not following the tutorial right.

Istvan has tried explaining multiple times how your directory path for the table file is not set up properly but you insist that you have it right and it must be us or even the popular tutorial that is wrong. Are you sure you understand the commands you're trying to run?

ADD REPLY
0

This is the path in the tutorial

data="/Users/kr/Desktop/demo/VC/data"

and this is my path:

data="/labs/chris/variant_calling/data"

Still not sure what I did wrong.

ADD REPLY
2

at this point I would recommend that you do a Unix tutorial on paths, permissions

these are very straightforward concepts, but at the same time can also give you extraordinarily confusing errors when used incorrectly

many tools have a hard time with invalid inputs and will raise cryptic errors,

so it is always very important that you ensure that paths exist and that you can write to them, and that you understand what relative and absolute paths are and so on,

troubleshooting one path error at a time won't be an effective use of anyone's time

ADD REPLY
1

The path you show here is not the same as the one you showed above, so we're not sure what you're doing.

Run this in the command line and show us the output:

stat /labs/chris/variant_calling/data
ls -lh /labs/chris/variant_calling/data
ADD REPLY
0

This is what I have:

stat /labs/chris/variant_calling/data

Access: (2775/drwxrwsr-x)  Uid: (425492/  chris)   Gid: (   99/  nobody)
Access: 2023-12-08 11:10:27.000000000 -0800
Modify: 2023-12-03 14:58:45.000000000 -0800
Change: 2023-12-03 17:22:23.000000000 -0800
 Birth: -

ls -lh /labs/chris/variant_calling/data

total 0
ADD REPLY
0

And what's the error message when you run the gatk command?

ADD REPLY
0

I got this:

A USER ERROR has occurred: Invalid argument '/variant_calling/data/recal_data.table'
ADD REPLY
1

Add an echo before the gatk, run the command and show us the output.

ADD REPLY
0

This is what I got:

echo gatk BaseRecalibrator -I ${aligned_reads}/SRR062634_sorted_dedup_reads.bam -R ${ref} --known-sites ${known_sites} -O ${data}/recal_data.table

gatk BaseRecalibrator -I /SRR062634_sorted_dedup_reads.bam -R /labs/chris/variant_calling/Desktop/demo/supporting_files/hg38/hg38.fa --known-sites -O /labs/chris/variant_calling/data/recal_data.table
ADD REPLY
1

That makes it clear - your ${known_sites} is empty. Always echo the command and make sure it looks good before executing it.
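
In the echoed command, -I is followed by /SRR062634_sorted_dedup_reads.bam, so ${aligned_reads} looks empty as well. A rough sketch of checking every variable before the gatk call follows. The helper name require_value is made up for illustration, and the values simply mirror what the echoed command shows.

```shell
# Hypothetical helper: complain if a variable's value is empty.
# Usage: require_value NAME VALUE
require_value() {
    if [ -z "$2" ]; then
        echo "ERROR: \${$1} is empty" >&2
        return 1
    fi
}

# Values mirroring the echoed command in this thread: two are empty.
aligned_reads=""
ref="/labs/chris/variant_calling/Desktop/demo/supporting_files/hg38/hg38.fa"
known_sites=""
data="/labs/chris/variant_calling/data"

# Count how many of the required variables are empty.
missing=0
require_value aligned_reads "$aligned_reads" || missing=$((missing + 1))
require_value ref           "$ref"           || missing=$((missing + 1))
require_value known_sites   "$known_sites"   || missing=$((missing + 1))
require_value data          "$data"          || missing=$((missing + 1))

if [ "$missing" -gt 0 ]; then
    echo "Not running gatk: $missing variable(s) empty" >&2
else
    echo "All variables set; safe to run the gatk command"
fi
```

This does the same job as reading the echo output by eye, but it also works when the script runs unattended.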

ADD REPLY
0

Thank you so much for the follow up. Unfortunately, I got a new error:

A USER ERROR has occurred: Input Homo_sapiens_assembly38.dbsnp138.vcf must support random access to enable queries by interval. If it's a file, please index it using the bundled tool IndexFeatureFile

The machine couldn't find Homo_sapiens_assembly38.dbsnp138.vcf if I use: known_sites="labs/chris/variant_calling/Desktop/demo/supporting_files/hg38/Homo_sapiens_assembly38.dbsnp138.vcf" so I copied it to the current directory and ran:

gatk BaseRecalibrator -I ${aligned_reads}/SRR062634_sorted_dedup_reads.bam -R ${ref} --known-sites Homo_sapiens_assembly38.dbsnp138.vcf -O ${data}/recal_data.table
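
One thing worth checking in the known_sites value quoted in this post: it starts with labs/ rather than /labs/, so the shell treats it as a path relative to the current directory. That may be why the file was not found, though this is a guess from what is shown. A small sketch of the difference, using a made-up helper name is_absolute and shortened example paths:

```shell
# Hypothetical helper: succeed if the path is absolute (starts with "/").
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

for p in "labs/chris/variant_calling" "/labs/chris/variant_calling"; do
    if is_absolute "$p"; then
        echo "absolute: $p"
    else
        # A relative path is resolved against the current directory.
        echo "relative: $p -> resolves to $PWD/$p"
    fi
done
```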
ADD REPLY
1

You need to google individual error messages to understand what you're missing. This is becoming almost like chat support right now. You cannot use an unindexed VCF file: you need to index it, either with IndexFeatureFile as the error message says, or by compressing it with bgzip and indexing it with tabix.
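
As a sketch of what that looks like in practice: GATK expects an index file next to the VCF. For an uncompressed .vcf, gatk IndexFeatureFile writes a .idx file; for a bgzip-compressed .vcf.gz, tabix -p vcf writes a .tbi file. The helper name has_index below is made up, and the input flag of IndexFeatureFile has differed between GATK 4 releases as far as I know (-I in recent ones, -F in older ones), so check gatk IndexFeatureFile --help for your version. The script only prints the commands; it does not run them.

```shell
# Hypothetical helper: succeed if an index file sits next to the VCF.
has_index() {
    [ -e "$1.idx" ] || [ -e "$1.tbi" ]
}

vcf="Homo_sapiens_assembly38.dbsnp138.vcf"

if has_index "$vcf"; then
    echo "index found for $vcf"
else
    echo "no index for $vcf; one of these should create it:"
    # Option 1: index the plain VCF in place (writes $vcf.idx).
    echo "  gatk IndexFeatureFile -I $vcf"
    # Option 2: compress with bgzip, then index with tabix (writes $vcf.gz.tbi),
    # and pass the .vcf.gz file to --known-sites instead.
    echo "  bgzip $vcf && tabix -p vcf $vcf.gz"
fi
```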

ADD REPLY
