5.8 years ago
star
I have a .gtf file as shown below. I would like to add "chr" to the first column of the file, but not to the first 5 rows (the header lines). How can I do this?
#!genome-build GRCh37.p13
#!genome-version GRCh37
#!genome-date 2009-02
#!genome-build-accession NCBI:GCA_000001405.14
#!genebuild-last-updated 2013-09
1 ensembl chromosome 1 300239041 . . . ID=1;Name=chromosome:AGPv1:1:1:300239041:1
1 ensembl exon 3 104 . + . Parent=GRMZM2G060082_T01;Name=GRMZM2G060082_E07
Desired output:
#!genome-build GRCh37.p13
#!genome-version GRCh37
#!genome-date 2009-02
#!genome-build-accession NCBI:GCA_000001405.14
#!genebuild-last-updated 2013-09
chr1 ensembl chromosome 1 300239041 . . . ID=1;Name=chromosome:AGPv1:1:1:300239041:1
chr1 ensembl exon 3 104 . + . Parent=GRMZM2G060082_T01;Name=GRMZM2G060082_E07
I used the following command, but it adds "chr" to the first 5 lines as well:
cat Homo_sapiens.GRCh37.gtf | sed 's/^/chr/' > chr.gtf
Check if a line starts with a number (or X/Y/MT) and only then add "chr" using sed. This could be done in R, but it is better suited to command-line tools such as sed or awk.
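A minimal sketch of that idea, assuming a sed that supports -E (both GNU and BSD sed do). The file sample.gtf and its contents are made up here to mirror the question; for the real data, replace it with Homo_sapiens.GRCh37.gtf. The pattern only matches lines whose first column is a number, X, Y, or MT, so the "#!" header lines pass through unchanged, and so do any other sequence names such as unplaced scaffolds.

```shell
# Hypothetical sample input mirroring the question
# (columns in a real GTF file are tab-separated; spaces are used here for brevity).
printf '%s\n' \
  '#!genome-build GRCh37.p13' \
  '#!genome-version GRCh37' \
  '1 ensembl exon 3 104 . + . Parent=GRMZM2G060082_T01;Name=GRMZM2G060082_E07' \
  'MT ensembl exon 5 90 . + . Parent=example' \
  > sample.gtf

# Prefix "chr" only when the first column is a number, X, Y, or MT.
# \1 is the chromosome name, \2 is the whitespace (tab or space) that follows it.
sed -E 's/^([0-9]+|X|Y|MT)([[:space:]])/chr\1\2/' sample.gtf > chr.gtf
```

Note that this turns MT into chrMT. If the downstream tool expects UCSC-style names, the mitochondrial chromosome is called chrM there and would need a separate substitution.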