Deleted:GTF file structure
0
0
Entering edit mode
2.9 years ago
storm1907 ▴ 30

Hello, I try to intersect GTF files with tped files with purpose to switch chromosomal location to gene location in tped files. I created some topics before about steps needed: conversion from genomic to gene coordinates in a large file bedtools intersect combined with filtering after particular word converting spaces to tabs in gtf files conversion from genomic to gene coordinates in a large file tped files to bed with awk

So, I did steps suggested:

1) Got gtf from 

    https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/GCF_000001405.39_GRCh38.p13/GCF_000001405.39_GRCh38.p13_genomic.gtf.gz

2) extracted only genes annotation with 

    awk -F"\t" '$3=="gene"' GCF_000001405.39_GRCh38.p13_genomic.gtf.gz > aaa.gtf

3) converted nc chromosome symbols from NC_xxx to X with 

    sed -r 's/^NC_[0]*//;s/\.[0-9]\+//;/^N[WT]_|12920/d' aaa.gtf > bbb.gtf

4) converted to bed 

gtf2bed < bbb.gtf > ccc.bed
5) intersected with sample bed file (obtained from Plink tped file) Sample bed file:

chr1    183189  183189
chr1    609407  609407

bedtools intersect -wb -a ccc.bed-b sample.bed >  ddd.bed

5) removed unwanted symbols from final intersected file with

    cat test sed 's/"//g' | sed 's/;//g' ddd.bed > eee.bed  


6) printed out only the needed columns from intersected file

    cat fintest  | awk '{print $4,$33,$35,$34,$36,$37}' eee.bed  > fff.bed   

In the end I have file with different content of columns:

KLHL17 0 C C  
KLHL17 0 A A   
PLEKHN1 966179 0 966179 A A
PLEKHN1 966227 0 966227 G G
PERM1 "protein_coding" "C1orf170" gene_synonym 1 976215
PERM1 "protein_coding" "C1orf170" gene_synonym 1 976669
However the desired structure should be like this:

PLEKHN1 966179 0 966179 A A
PLEKHN1 966227 0 966227 G G
PLEKHN1 966272 0 966272 G G

Is it because of inconsistency in GTF files? It should be succesfully intersected with my sample file, so I can switch NC position to gene name and position in sample file and save tped file structure.

Thank you!

GTF plink • 367 views
ADD COMMENT
This thread is not open. No new answers may be added
Traffic: 2423 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6