Changing To Reverse Complementary SNP Nucleotides
9.3 years ago
bibb77 ▴ 90

Hello everyone, I would like to change allele codes to their reverse complement. The file I have is composed of three columns: Allele 1, Allele 2, and Strand Orientation:

A    T    +
T    T    +
A    T    - 
C    T    -
G    G    -
G    C    +
A    G    -

I need to flip only the SNPs on the minus strand, so the result would look like this:

A    T    +
T    T    +
T    A    +
G    A    +
C    C    +
G    C    +
T    C    +

The file is tab-separated and has around 580k SNPs. I would really appreciate it if you could help me with some nice awk/perl, or any other code, to do that :)

9.3 years ago
JC 13k

Perl one-liner:

perl -pe 'tr/ACGT-/TGCA+/ if (/-/)' < SNP.list > SNP.for
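
For readers less familiar with `tr`, the same line-level logic can be sketched in Python. This is only an illustration, not part of the original answer: `flip_line` is a hypothetical name, and the sketch assumes the file contains only uppercase A, C, G, T and that `-` appears nowhere except the strand column.

```python
# Translation table mirroring tr/ACGT-/TGCA+/: each base maps to its
# complement and the strand sign "-" becomes "+".
FLIP = str.maketrans("ACGT-", "TGCA+")

def flip_line(line):
    """Complement alleles on minus-strand lines; leave other lines untouched."""
    # Like the Perl one-liner, the whole line is translated whenever it
    # contains "-", so "-" must only ever appear in the strand column.
    return line.translate(FLIP) if "-" in line else line

# Small demo on three of the example records from the question.
for record in ["A\tT\t+", "A\tT\t-", "C\tT\t-"]:
    print(flip_line(record))
```

Because the translation is applied to the whole line, any extra column containing A, C, G, T, or `-` would be altered as well; a field-by-field approach avoids that.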
9.3 years ago
Rob Syme ▴ 540

Maybe some awk:

BEGIN{OFS="\t"}
function complement (nuc) {
    switch (nuc) {
        case /[aA]/:
            return "T"
        case /[tT]/:
            return "A"
        case /[cC]/:
            return "G"
        case /[gG]/:
            return "C"
        default:
            return "N"
    }
}

$3 != "-" {print $1, $2, $3}
$3 == "-" {print complement($1), complement($2), "+"}

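Save this to a file, complement_alleles.awk, and run it with gawk (the switch statement is a gawk extension, so other awk implementations may reject it):

gawk -f complement_alleles.awk < input.txt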
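
If gawk is not available, the same field-by-field logic could be sketched in Python. This is a rough equivalent rather than a tested replacement: the names `COMPLEMENT`, `flip_record`, and `flip_lines` are made up for illustration, and the sketch assumes every line has exactly three whitespace-separated fields (a malformed line raises an error instead of being skipped).

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(nuc):
    """Complement one allele code; unknown codes become "N", as in the awk default branch."""
    return COMPLEMENT.get(nuc.upper(), "N")

def flip_record(allele1, allele2, strand):
    """Return the record on the plus strand, complementing alleles only when needed."""
    if strand != "-":
        return allele1, allele2, strand
    return complement(allele1), complement(allele2), "+"

def flip_lines(lines):
    """Yield tab-separated output records for an iterable of input lines."""
    for line in lines:
        # split() with no argument also discards trailing spaces and newlines.
        allele1, allele2, strand = line.split()
        yield "\t".join(flip_record(allele1, allele2, strand))

# Small demo on three of the example records from the question.
for out in flip_lines(["A\tT\t+", "A\tT\t- ", "G\tG\t-"]):
    print(out)
```

To process a real file, `flip_lines` can be given an open file handle, for example `flip_lines(open("input.txt"))`, since it reads one line at a time and should cope with 580k records without loading them all into memory.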