Question: Select random lines from a BED file while keeping the same chromosome distribution as the input file
Naresh D J (60), Turku/BTK, wrote 2.7 years ago:

Hi,

How can I randomly select lines from a BED file? More specifically, I want to create a smaller BED file of genomic regions (ChIP-seq peaks) from a larger one while maintaining the relative proportion of lines from each chromosome. For example, if my input file has 1000 lines, I want to select 100 lines at random while keeping the chromosome proportions roughly the same.

This question seems to have been asked here before (How To Randomly Sample A Subset Of Lines From A Bed File), but I did not find a suitable solution there.

Can you suggest some tools, or a solution using awk or a shell script?

Thank you, Naresh D J

Tags: chip-seq, bedtools

In which way does the previous post not answer your question?

Reply written 2.7 years ago by Jean-Karim Heriche (18k)

The answers given in the previous post select a fixed number of lines from each chromosome and do not maintain the relative proportions.

Reply written 2.7 years ago by Naresh D J (60)

As I read the first answer there, it does what I understand you want: if you ask for 100 random lines from your BED file, the solution provided preserves the proportion of each chromosome within those 100 lines.

Reply written 2.7 years ago by Jean-Karim Heriche (18k)
Pierre Lindenbaum (117k), France/Nantes/Institut du Thorax - INSERM UMR1087, wrote 2.7 years ago:

awk 'BEGIN{srand()} ($0 ~ /^#/ || rand()<0.1)' input.bed ?


How can I specify the desired number of lines?

Reply written 2.7 years ago by Naresh D J (60)
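
The awk filter above keeps each line independently with probability 0.1, so the number of lines returned varies from run to run. One way to get an exact count is to draw that many lines uniformly at random without replacement; chromosome proportions are then preserved on average, though not exactly. A minimal Python sketch follows. The function name `sample_fixed_count` is hypothetical, and skipping lines that start with `#`, `track` or `browser` is an assumption about how headers appear in the file.

```python
import random


def sample_fixed_count(lines, n, seed=None):
    """Return exactly n BED records drawn uniformly at random, in file order.

    Assumption: lines starting with '#', 'track' or 'browser' are headers
    and are excluded from sampling.
    """
    rng = random.Random(seed)
    records = [
        ln for ln in lines
        if ln.strip() and not ln.startswith(("#", "track", "browser"))
    ]
    if not 0 <= n <= len(records):
        raise ValueError(f"cannot draw {n} lines from {len(records)} records")
    # Sample positions rather than lines so the original order can be restored.
    picked = sorted(rng.sample(range(len(records)), n))
    return [records[i] for i in picked]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    with open("input.bed") as handle:
        subset = sample_fixed_count(handle.readlines(), 100)
    with open("subset.bed", "w") as out:
        out.writelines(subset)
```

Because every line is equally likely to be chosen, a chromosome holding 30% of the input is expected to hold about 30% of the output, but the realised share fluctuates between runs.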

In the end, I wrote a couple of lines of R, which seems to be the best solution.

Reply written 2.7 years ago by Naresh D J (60)
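
The R code was not posted in the thread. As an illustration only, and not a reconstruction of that solution, here is a Python sketch of one way to draw an exact number of lines while matching the per-chromosome proportions as closely as integer counts allow. It is stratified sampling with largest-remainder rounding. The function name `sample_by_chromosome` is hypothetical, and the header handling is the same assumption as before (lines starting with `#`, `track` or `browser` are skipped).

```python
import random
from collections import defaultdict


def sample_by_chromosome(lines, n, seed=None):
    """Draw n BED records with per-chromosome counts proportional to the input."""
    rng = random.Random(seed)
    by_chrom = defaultdict(list)
    for idx, line in enumerate(lines):
        # Assumption: these prefixes mark header lines, which are not sampled.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        by_chrom[line.split(None, 1)[0]].append(idx)

    total = sum(len(idxs) for idxs in by_chrom.values())
    if not 0 <= n <= total:
        raise ValueError(f"cannot draw {n} lines from {total} records")
    if n == 0:
        return []

    # Ideal share is n * count / total; take the floor first ...
    quotas = {c: n * len(idxs) // total for c, idxs in by_chrom.items()}
    # ... then hand the leftover slots to the largest fractional remainders.
    leftover = n - sum(quotas.values())
    by_remainder = sorted(
        by_chrom, key=lambda c: (n * len(by_chrom[c])) % total, reverse=True
    )
    for chrom in by_remainder[:leftover]:
        quotas[chrom] += 1

    picked = []
    for chrom, idxs in by_chrom.items():
        picked.extend(rng.sample(idxs, quotas[chrom]))
    return [lines[i] for i in sorted(picked)]
```

With the example from the question, an input of 1000 lines split 500/300/200 across three chromosomes and n = 100 gives quotas of 50, 30 and 20 lines. When the shares do not divide evenly, the counts differ from the ideal proportions by less than one line per chromosome.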