I have a tab-separated file that looks like this. I need to prepare it so I can import it into R. The file is an hg19 genome file.
1 344544 rs30540
2 284783 rs34560
14 384643 rs30567
19 584643 rs31110
Genome_phase,common=1,19,genomes=hg19
11 222643 rs30543
44 544643 rs32345
Genome_phase,common=1,23,genomes=hg19
I want to keep only the rows that start with a number and drop all the others, which begin with letters. It is a huge file of a few GB. Is there any way to do that in Linux?
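To make the intent concrete, here is a rough Python sketch of the filter I have in mind. It is only an illustration under my own assumptions: the function names and the input/output paths are hypothetical, "starts with a number" is taken to mean the first character of the line is an ASCII digit, and the file is read line by line so its size should not matter.

```python
import re
import sys

# Assumption: a row to keep begins with an ASCII digit, e.g. "1\t344544\trs30540".
STARTS_WITH_DIGIT = re.compile(r"^[0-9]")


def keep_numeric_rows(lines):
    """Yield only the lines whose first character is a digit."""
    for line in lines:
        if STARTS_WITH_DIGIT.match(line):
            yield line


def filter_file(src_path, dst_path):
    """Stream src_path into dst_path one line at a time, keeping numeric rows."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(keep_numeric_rows(src))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python filter_rows.py input.txt output.txt
    filter_file(sys.argv[1], sys.argv[2])
```

I suspect a one-liner such as `grep '^[0-9]' input.txt > output.txt` would do the same thing, but I am not sure whether that is the best option for a file this large.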
Any assistance will be appreciated. Regards