Question: why my ROH file result does not include some chromosome
1
gravatar for mary
8 months ago by
mary210
Bologna university
mary210 wrote:

Hi,

I am using plink to calculate runs of homozygosity. For 50k ovine data. After Quality control I use folow comand:

plink --file mydata  --homozyg-snp 40 --homozyg-density 100 --homozyg-kb 1000 --sheep --out mydata_roh

I got Plink.result.hom, it just have ROH for fist 11 chromosome. I check my ped and map file, it correct. I dont know what I can not have roh for chr 12 to 26.

Can anyone please tell me why it dose happen?

ADD COMMENTlink modified 8 months ago by Kevin Blighe48k • written 8 months ago by mary210
1
gravatar for Kevin Blighe
8 months ago by
Kevin Blighe48k
Kevin Blighe48k wrote:

The density of genotyping SNPs on chr12 - 26 may be too low, the result being that no data is being shown. Try to modify your input parameters. For example, --homozyg-snp 40 seems too high for such a low density microarray ('50k ovine').

Also try BCFtools ROH.

ADD COMMENTlink written 8 months ago by Kevin Blighe48k

Hi Kevin I have 15 sheep breeds, when I check number of SNP in most of them, they are same and I have ROH in all studied breed, unless this breed that I ask question. so I dont think it was main problem. I download my data from https://www.datadryad.org/. may be the data had problem?

ADD REPLYlink modified 8 months ago • written 8 months ago by mary210

Okay. You could, literally, look at the data for these chromosomes to see how it looks. You could also derive summary statistics for these chromosomes (using PLINK). Maybe that will enlighten you further. Also check the PLINK logs to see if any data was automatically removed.

ADD REPLYlink written 8 months ago by Kevin Blighe48k

Hi . I check statistics for chr12 for sample and I check log and data. no noticeable thing was found; see my log file below:

Options in effect: --file scott_corrected2 --chr 12 --noweb --out rest12 --missing

38884 (of 38884) markers to be included from [ scott_corrected2.map ]

ADD REPLYlink modified 8 months ago • written 8 months ago by mary210

Indeed, nothing appears to be filtered out. I am limited by what I can do without actually having the data here.

ADD REPLYlink written 8 months ago by Kevin Blighe48k

nothing to be filtered because I did quality control before that . the data is free. you can download from https://datadryad.org/resource/doi:10.5061/dryad.8f191. after downloading, I make ped and map file and use for my research. I worry about may be I make some mistake when contract ped and map file. but when I check every thing is ok

ADD REPLYlink modified 8 months ago • written 8 months ago by mary210

If you want to show all of the commands that you used, it may help.

ADD REPLYlink written 8 months ago by Kevin Blighe48k

yeah, I did as below

1- tail -n +2 mapv2.txt > scottisheep.map

2- tail -n +2 ped.txt > 6firstcol_scot

3- tail -n +2 genotypes.txt > genotype_pedfile

4- perl -ne '($id, $tmp) = split( / /, $_, 2 ); $tmp =~ s/ //g; print "$id "; print join(" ", split( //, $tmp ) );' scott752_1.ped > scott752_2.ped (separate genotype each other)

5- I forgot how I paste 6firstcol_scot to scott752_2.ped (I cant remember)

6-I remap map file for Oar_v4.0 by my_script as below

6a- python remap_position.py scottisheep.map sheep-50k-map-oar_v4.0 > remap_pos

6b-python remap_chr.py scottisheep.map sheep-50k-map-oar_v4.0 > remap_chr&pos

python script for remap position

import sys

import os

f=open(sys.argv[1], 'r')

l=f.readline()

f2=sys.argv[2]

while l!= "":

snp_line=l.split("\t")

snp1=snp_line[1]

grepped=os.popen('grep "'+snp1+'" '+f2, 'r').readline()

if grepped=="":

    print l.strip()

else:

    pos=grepped.split()[3]

    print snp_line[0]+'\t'+snp_line[1]+'\t'+snp_line[2].strip()+ '\t'+pos

l=f.readline()

python script for remap

import sys

import os

f=open(sys.argv[1], 'r')

l=f.readline()

f2=sys.argv[2]

while l!= "":

snp_line=l.split("\t")

snp1=snp_line[1]

grepped=os.popen('grep "'+snp1+'" '+f2, 'r').readline()

if grepped=="":

    print l.strip()

else:

    chr=grepped.split()[0]

    print chr+'\t'+ snp_line[1]+'\t'+snp_line[2]+'\t'+snp_line[3].strip()

l=f.readline()
ADD REPLYlink modified 8 months ago • written 8 months ago by mary210

I note that there are 38884 markers in your dataset; however, I downloaded the data and I see ~53000 markers. Also, can you literally look at your map file to investigate some of the entries from chr12?

I tried to work with the data but it is formatted in a bad way, and I do not have the time to work on it. If you wish, and you still believe there should be ROH calls for chr12-26, then you could try the plink2-users Google Group: https://groups.google.com/forum/#!forum/plink2-users

ADD REPLYlink written 8 months ago by Kevin Blighe48k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1974 users visited in the last hour