Question: Extracting sample ID and their genotypes
22 months ago by Siavash Salek Ardestani

How can I extract the individuals "A", "B", and "C" and their genotypes from this dataset? Is there a Linux command for this?

A 11112121112121121102111121110
B 11211112112110211111211121110
C 20222222202020220202222222220
D 11111112112110211111211121110
E 20222222202020220202222222220
F 11112121112121121102111121110
G 11211112112110211111211121110
H 11211112112110211111211121110

Thanks!

snp chip-seq next-gen • 451 views
modified 22 months ago by paolo002 • written 22 months ago by Siavash Salek Ardestani

Hello siyavash_damdar,

Please explain the data format a bit more. Also show an example of what the output should look like.

Thanks!

fin swimmer

written 22 months ago by finswimmer

Dear finswimmer, this is just a small example; my real data is much bigger than this dataset. The format is plain text (txt). The first column contains the individuals and the second column contains their genotypes. I want to extract the genotypes of "A", "B", and "C" into a text file like this:

A 11112121112121121102111121110
B 11211112112110211111211121110
C 20222222202020220202222222220
written 22 months ago by Siavash Salek Ardestani

You can use the grep command with the -f option, providing a file of the IDs you want to extract. For example:

grep -f ID_to_extract.txt complete_genotype_file.txt >selected_genotype.txt
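One caveat with plain grep -f: each ID is treated as a pattern that may match anywhere in a line, so an ID such as "A" would also pull in a sample named "AB". A minimal sketch of an exact match on the first column with awk is shown here, assuming whitespace-separated columns and one ID per line in the ID file. The small demo files and the extra "AB" sample are hypothetical and only serve to illustrate the difference.

```shell
# Hypothetical demo input, reusing the file names from the grep command above.
printf '%s\n' A B C > ID_to_extract.txt
printf '%s\n' \
  'A 11112121112121121102111121110' \
  'B 11211112112110211111211121110' \
  'C 20222222202020220202222222220' \
  'D 11111112112110211111211121110' \
  'AB 11211112112110211111211121110' > complete_genotype_file.txt

# While reading the first file (NR==FNR), store each wanted ID as an array key.
# While reading the second file, print rows whose first column is exactly one of those IDs.
awk 'NR==FNR { keep[$1]; next } $1 in keep' \
  ID_to_extract.txt complete_genotype_file.txt > selected_genotype.txt
```

Rows are written in the order they appear in the genotype file, and the "AB" row is skipped because the comparison is on the whole first field rather than on a substring.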
modified 22 months ago by finswimmer • written 22 months ago by toralmanvar

Dear siyavash_damdar,

You are just repeating what you already said in your first post. Unfortunately, this doesn't help me understand what you are trying to do. So please rephrase.

Do you just want to have the second column? Do you want one file per sample? ...

fin swimmer

written 22 months ago by finswimmer

Dear finswimmer, in this data the first column contains the samples (A, B, C, etc.) and the second column contains the genotypes (one number per SNP). So I need to know, for example, how I can extract samples A, B, and C together with their genotypes (all the data in rows A, B, and C) into a text file.

written 22 months ago by Siavash Salek Ardestani

paolo002 wrote:

Is your data a VCF file? If you want to extract individuals from a VCF file, you can use bcftools and provide a text file with the IDs of those individuals.

written 22 months ago by paolo002

No, my data format is not VCF.

written 22 months ago by Siavash Salek Ardestani