salman_96, 4.5 years ago
Hi, I have an hg19 SNPs file with some extra rows that I do not need. It looks like this:
##INFO=<ID=COMMON,Number=1,Type=Integer,Description="RS is a common SNP.  A common SNP is one that has at least one 1000Genomes population with a minor allele of frequency >= 1% and for which 2 or more >
##INFO=<ID=TOPMED,Number=.,Type=String,Description="An ordered, comma delimited list of allele frequencies based on TOPMed, starting with the reference allele followed by alternate alleles as ordered in>
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
1       10019   rs775809821     TA      T       .       .       RS=775809821;RSPOS=10020;dbSNPBuildID=144;SSR=0;SAO=0;VP=0x050000020005000002000200;GENEINFO=DDX11L1:100287102;WGT=1;VC=DIV;R5;ASP
1       10039   rs978760828     A       C       .       .       RS=978760828;RSPOS=10039;dbSNPBuildID=150;SSR=0;SAO=0;VP=0x050000020005000002000100;GENEINFO=DDX11L1:100287102;WGT=1;VC=SNV;R5;ASP
1       10043   rs1008829651    T       A       .       .       RS=1008829651;RSPOS=10043;dbSNPBuildID=150;SSR=0;SAO=0;VP=0x050000020005000002000100;GENEINFO=DDX11L1:100287102;WGT=1;VC=SNV;R5;ASP
1       10051   rs1052373574    A       G       .       .       RS=1052373574;RSPOS=10051;dbSNPBuildID=150;SSR=0;SAO=0;VP=0x050000020005000002000100;GENEINFO=DDX11L1:100287102;WGT=1;VC=SNV;R5;ASP
1       10055   rs892501864     T       A       .       .       RS=892501864;RSPOS=10055;dbSNPBuildID=150;SSR=0;SAO=0;VP=0x050000020005000002000100;GENEINFO=DDX11L1:100287102;WGT=1;VC=SNV;R5;ASP
I only want to keep everything from this row onward, using either R or Linux:
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
What have you tried? The logic you need is to exclude all lines that begin with ##. grep should help you achieve this. Use Google to find out how to exclude lines that start with a pattern using grep.

I used sed to remove the first 55 rows.

That approach has many pitfalls:
grep would tell you what content you deleted, and given that the number of lines is not important as long as the nature of the content is known, you should focus on documenting that.
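
For anyone landing here later, the grep approach suggested above could look roughly like the sketch below. The file names (snps.vcf, snps.noheader.vcf, snps.removed_header.txt) are hypothetical, and the first few lines only build a tiny stand-in file so the example runs on its own. It relies on the ## meta lines all starting with exactly two # characters, which is what the excerpt in the question shows.

```shell
# Hypothetical file names; substitute your own.
input=snps.vcf
output=snps.noheader.vcf
removed=snps.removed_header.txt

# Tiny stand-in for the real dbSNP file (illustration only, not real header text).
printf '##fileformat=VCFv4.0\n' > "$input"
printf '##INFO=<ID=COMMON,Number=1,Type=Integer,Description="stand-in">\n' >> "$input"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$input"
printf '1\t10019\trs775809821\tTA\tT\t.\t.\tRS=775809821\n' >> "$input"
printf '1\t10039\trs978760828\tA\tC\t.\t.\tRS=978760828\n' >> "$input"

# Keep every line that does NOT start with "##".
# The "#CHROM" column header has a single "#", so it is kept.
grep -v '^##' "$input" > "$output"

# Save the lines that were dropped, so the removal is documented
# by content rather than by a hard-coded line count.
grep '^##' "$input" > "$removed"

# Report how many meta lines were removed.
grep -c '^##' "$input"
```

Unlike deleting a fixed number of rows with sed, this does not depend on the header being exactly 55 lines long, and the removed lines stay available for inspection.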