10 months ago
Chris
Hi Biostars,
I am trying to do variant calling and got an error at this step. Do you have any suggestions? Thank you so much.
gatk BaseRecalibrator -I ${aligned_reads}/SRR062634_sorted_dedup_reads.bam -R ${ref} --known-sites ${known_sites} -O ${data}/recal_data.table
Invalid argument '/recal_data.table
My bad. After adding data, I got this error:
Look at the path: it is an absolute path, and you probably meant it to be a relative path.
This is the path of data:
I am just trying to follow the code here: https://github.com/kpatel427/YouTubeTutorials/blob/main/variant_calling.sh
Do you have data in the folder /variant_calling/data? It seems odd that you would, since that folder sits directly under root, and nobody should store data like that.
... you could have that folder of course, but my guess is that it is a relative path gone wrong ...
make sure the paths exist and are valid and contain the files
No, data is a folder and it is empty at this point. It was created to store output, I think.
First of all, make it a relative path, ./variant_calling/data, so that you can be sure you can write to it.
What we are trying to tell you is that the file structure seems all messed up, so it is no wonder you have problems.
Make sure all the files and paths are valid; invalid paths are why you are having problems.
Once you make paths with a superuser account (which you had to do to make a folder under root), all kinds of weird problems may occur: the permissions can be all wrong, etc.
Long story short, write to paths that you are sure are writable by the current user, and that means paths in your home directory.
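The advice above can be checked before running any tool. Here is a minimal sketch, assuming a POSIX shell; the function name check_output_dir and the example location under the home directory are hypothetical, not taken from the tutorial:

```shell
# Hypothetical helper: succeed only if the directory exists (creating it
# if needed) and the current user can write to it.
check_output_dir() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "ERROR: output directory path is empty" >&2
        return 1
    fi
    # Creating a directory directly under / normally fails for regular users.
    mkdir -p "$dir" 2>/dev/null
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        return 0
    fi
    echo "ERROR: cannot write to '$dir'" >&2
    return 1
}

# Example call (assumed location under the home directory):
#   data="${HOME}/variant_calling/data"
#   check_output_dir "${data}" && echo "OK: ${data} is writable"
```

If the check fails, fix the directory first; running gatk against it would only produce a less readable error.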
Would you please have a look at this code? https://github.com/kpatel427/YouTubeTutorials/blob/main/variant_calling.sh This is code from a popular bioinformatics YouTube channel. Maybe there is something wrong with the tutorial.
kpatel clearly declares a valid path for the data variable (https://github.com/kpatel427/YouTubeTutorials/blob/main/variant_calling.sh#L46). You're not following the tutorial right.
Istvan has tried explaining multiple times how your directory path for the table file is not set up properly but you insist that you have it right and it must be us or even the popular tutorial that is wrong. Are you sure you understand the commands you're trying to run?
This is the path in the tutorial
and this is my path:
Still not sure what I did wrong.
at this point I would recommend that you do a Unix tutorial on paths, permissions
these are very straightforward concepts, but at the same time can also give you extraordinarily confusing errors when used incorrectly
many tools have a hard time with invalid inputs and will raise cryptic errors,
so it is always very important that you ensure that paths exist, you can write to them, understand what relative and absolute paths are and so on,
troubleshooting one path error at a time won't be an effective use of anyone's time
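To make the relative versus absolute distinction concrete, here is a small sketch, assuming a POSIX shell. The helper name path_kind is hypothetical; the first case mirrors how an empty data variable turns ${data}/recal_data.table into a path under root, as in the error at the top of the thread:

```shell
# Hypothetical helper: report whether a path is absolute (starts with /),
# relative, or empty.
path_kind() {
    case "$1" in
        "") echo "empty" ;;
        /*) echo "absolute" ;;
        *)  echo "relative" ;;
    esac
}

# An unset or empty variable silently produces a path directly under root.
data=""
path_kind "${data}/recal_data.table"    # prints: absolute

# With a relative value, the path resolves against the current directory.
data="./variant_calling/data"
path_kind "${data}/recal_data.table"    # prints: relative
```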
The path you show here is not the same as the one you showed above, so we're not sure what you're doing.
Run this in the command line and show us the output:
This is what I have:
And what's the error message when you run the gatk command?
I got this:
Add an echo before the gatk, run the command and show us the output.
This is what I got:
That makes it clear - your ${known_sites} is empty. Always echo the command and make sure it looks good before executing it.
Thank you so much for the follow up. Unfortunately, I got a new error:
The machine couldn't find Homo_sapiens_assembly38.dbsnp138.vcf if I use: known_sites="labs/chris/variant_calling/Desktop/demo/supporting_files/hg38/Homo_sapiens_assembly38.dbsnp138.vcf" so I copied it to the current directory and ran:
You need to google individual error messages to understand what you're missing. This is becoming almost like chat support right now. You cannot use a plain VCF file; you need to bgzip and tabix-index it.
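The compression and indexing step suggested above might look like the following sketch. It assumes bgzip and tabix (distributed with htslib) are on the PATH; the wrapper name compress_and_index is hypothetical:

```shell
# Hypothetical wrapper: bgzip-compress a VCF and build a tabix index for it.
compress_and_index() {
    vcf="$1"
    if [ ! -f "$vcf" ]; then
        echo "ERROR: VCF not found: $vcf" >&2
        return 1
    fi
    if ! command -v bgzip >/dev/null 2>&1 || ! command -v tabix >/dev/null 2>&1; then
        echo "ERROR: bgzip and tabix must be on PATH" >&2
        return 1
    fi
    # -c writes the compressed data to stdout, so the original file is kept.
    bgzip -c "$vcf" > "${vcf}.gz" || return 1
    # -p vcf selects the VCF preset; this writes ${vcf}.gz.tbi next to the file.
    tabix -p vcf "${vcf}.gz"
}

# Example call, run from the directory that holds the file:
#   compress_and_index Homo_sapiens_assembly38.dbsnp138.vcf
#   known_sites="Homo_sapiens_assembly38.dbsnp138.vcf.gz"
```

After this, known_sites would point at the .vcf.gz file, with the .tbi index sitting beside it.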