error relate to R
0
0
Entering edit mode
2.8 years ago
rheab1230 ▴ 140

I am running a R script to update vcf file with rsid of snp file.I am getting error.

Error in `$<-.data.frame`(`*tmp*`, variant_id, value = "snp__") :
  replacement has 1 row, data has 0
Calls: $<- -> $<-.data.frame
Execution halted

Can anyone please help me with this?

TWAS SNP R • 1.8k views
ADD COMMENT
0
Entering edit mode

Hi, this is not reproducible. Both code and data are missing. Please provide it with a minimum working example.

ADD REPLY
0
Entering edit mode
## this is the R script:
## called from the matching .sh script (which passes in the parameters)


# getting file name as passed from the bash script
passed_args<-commandArgs(trailingOnly = TRUE)

# limits the snp reference file to only the chrom of interest
chr_num<-passed_args[3]

# variable for the snp151 reference file - expecting at least a CHROM, POS, and rsid columns in the snp ref file
snp_file<-passed_args[4]

# imports the reference data file for the SNPs (old vcf "ID" column contains snp_22_30192384; new "ID" should be the rsid)
snp_ref<-read.delim(snp_file, header=TRUE, sep=" ")
snp_ref<-snp_ref[which(snp_ref$CHROM==chr_num),]
snp_ref$variant_id<-paste("snp_",snp_ref$CHROM, "_",  snp_ref$POS, sep="")
snp_ref$CHROM<-as.integer(snp_ref$CHROM)
paste("Chr ", chr_num, "SNPs found: ", nrow(snp_ref), sep="")

#get from args[] the .vcf file
vcf_file_name<- passed_args[1]

#find the first line of actual data
header_rows_skip<-as.integer(passed_args[2])-1

#imports the file into a table without the metadata lines
body_only_VCF<-read.delim(vcf_file_name, header = TRUE, skip=header_rows_skip, sep="\t")
colnames(body_only_VCF)[1]<-"CHROM"
paste("orig VCF file rows ", nrow(body_only_VCF))

#removes anything where the reference or alternate allele is not G/C/T/A (i.e. narrows down vcf working file to just SNPs)
body_only_VCF<-body_only_VCF[which(body_only_VCF$REF %in% c("A","T","C","G") & body_only_VCF$ALT %in% c("A","T","C","G")),]
paste("VCF filtered to only SNPS ", nrow(body_only_VCF))

#merges the VCF file with records that have a matching lookup in the snp_ref table
final_VCF<-merge(body_only_VCF,snp_ref, all = FALSE )
paste("Merge complete - matched VCF records by CHROM and POS", nrow(final_VCF))

#get final file column order
VCF_col_names<-names(body_only_VCF)

#trying to free up some memory
rm(body_only_VCF)

#udpated the ID column and removes the columns from snp_ref
final_VCF$ID<-final_VCF$rsid
final_VCF<-final_VCF[,VCF_col_names]

#rename first col to #CHROM to match vcf file formatting
colnames(final_VCF)[1]<-"#CHROM"

#write final file to the vcf_directory
final_file=paste(substr(vcf_file_name, 1, nchar(vcf_file_name)-4), "_updated2.vcf", sep="")
write.table(final_VCF, file = final_file, sep="\t", row.names = FALSE, quote=FALSE)
ADD REPLY
0
Entering edit mode

What the script does is it takes rsid from snp file and put that in ID coloumn in vcf file.

ADD REPLY
0
Entering edit mode

Thanks for showing the code; however, this doesn't help due to the fact that we cannot have certainty which line produces the error. You have neither provided any test input data; so, we cannot run the code in its entirety.

Please tell us which line produces the error. Also show the output of str() run on the objects that are used in the command that produces the error.

ADD REPLY
0
Entering edit mode

This is the line that is showing the error: snp_ref$variant_id<-paste("snp_",snp_ref$CHROM, "_", snp_ref$POS, sep="")

ADD REPLY
0
Entering edit mode

I am not able to attach the input data file. Can you please tell me how to attach the input file?

ADD REPLY
0
Entering edit mode

I see, what is the output of str(snp_ref) just before you run that line of code?

ADD REPLY
0
Entering edit mode
str(snp_ref)
spec_tbl_df [9 x 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ CHROM   : num [1:9] 1 1 1 1 1 1 1 1 1
 $ POS     : num [1:9] 10038 10042 10050 10054 10062 ...
 $ chromEnd: num [1:9] 10039 10043 10051 10055 10063 ...
 $ rsid    : chr [1:9] "rs978760828" "rs1008829651" "rs1052373574" "rs892501864" ...
 $ X5      : chr [1:9] NA NA NA NA ...
 - attr(*, "problems")= tibble [9 x 5] (S3: tbl_df/tbl/data.frame)
  ..$ row     : int [1:9] 1 2 3 4 5 6 7 8 9
  ..$ col     : chr [1:9] NA NA NA NA ...
  ..$ expected: chr [1:9] "5 columns" "5 columns" "5 columns" "5 columns" ...
  ..$ actual  : chr [1:9] "4 columns" "4 columns" "4 columns" "4 columns" ...
  ..$ file    : chr [1:9] "'C:/Users/Desktop/variant_mapping_small.txt'" "'C:/Users/Desktop/variant_mapping_small.txt'" "'C:/Users/Desktop/variant_mapping_small.txt'" "'C:/Users/Desktop/variant_mapping_small.txt'" ...
 - attr(*, "spec")=
  .. cols(
  ..   CHROM = col_double(),
  ..   POS = col_double(),
  ..   chromEnd = col_double(),
  ..   rsid = col_character(),
  ..   X5 = col_character()
  .. )
ADD REPLY
0
Entering edit mode

Oh, your object is a weird format.. spec_tbl_df ... please change it to a data frame via data.frame(snp_ref)

ADD REPLY
0
Entering edit mode

Okay, I will do that

ADD REPLY
0
Entering edit mode

I have one more question. When I am mapping vcf file and dbsnp file so that vcf file can contain rsid, my vcf file size is getting reduced.

ADD REPLY
0
Entering edit mode

That makes sense. Please view the VCF via bcftools view my.vcf | more, and then you'll observe why (not all of your variants will have an rs ID).

ADD REPLY
0
Entering edit mode

The dbSNP file is a file coming from the UCSC Genome Browser that would have a list of all the known SNPs with their rsid. The GEUVADIS .vcf files I used were older and did not have an appropriate rsid in the ID column of that file. So I made a script to update .vcf file to assign them an appropriate modern rsid. However, the record count had dropped by a lot from the original .vcf file. Basically SNPs are getting lost from the .vcf file when merged with the dbSNP file.

ADD REPLY
0
Entering edit mode

It could be variants in GEUVADIS that have no known dbSNP rs ID, I suppose

ADD REPLY

Login before adding your answer.

Traffic: 1514 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6