Remove all rows that begin with characters and keep only those that begin with numbers in Linux.
5 months ago
salman_96 ▴ 50

I have a tab-separated file that looks like the sample below, and I need to clean it up so I can import it into R. It is an hg19 genome file.

1      344544     rs30540

2      284783     rs34560

14     384643     rs30567

19     584643     rs31110

Genome_phase,common=1,19,genomes=hg19
11    222643     rs30543

44    544643     rs32345

Genome_phase,common=1,23,genomes=hg19

I want to keep only the rows that start with a number and drop all the rows that start with any other character. The file is huge (a few GB). Is there a way to do this in Linux?

Any assistance will be appreciated. Regards

hg19 linux • 332 views
5 months ago
$ awk -v FS="\t" -v OFS="\t" '($1 ~ /^[0-9]/)' in.txt > out.txt
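Note that the pattern above keeps any row whose first field merely starts with a digit. If the first column should be entirely numeric, a stricter variant is to anchor the match at both ends. This is only a sketch: the file names `sample.txt` and `numeric_only.txt` are placeholders, and the `1_random` row is invented for illustration and does not come from the question.

```shell
# Build a small tab-separated sample (hypothetical rows modelled on the question).
printf '1\t344544\trs30540\n' > sample.txt
printf 'Genome_phase,common=1,19,genomes=hg19\n' >> sample.txt
printf '11\t222643\trs30543\n' >> sample.txt
printf '1_random\t100\trs00000\n' >> sample.txt   # invented row: starts with a digit but is not a number

# Keep only rows whose first tab-separated field consists solely of digits.
awk -F'\t' '$1 ~ /^[0-9]+$/' sample.txt > numeric_only.txt
```

With this version the `Genome_phase` line and the `1_random` row are both dropped, whereas `/^[0-9]/` alone would keep `1_random`.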
5 months ago
$ sed -n '/^[0-9]/p' test.txt
$ grep '^[0-9]' test.txt
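Since the file is a few GB, it may be worth running grep in the C locale, which often speeds up simple byte-oriented patterns, and counting how many rows get dropped as a sanity check. This is a sketch rather than a benchmarked recommendation: the speed gain depends on the grep build and the system locale, and the file names `test_sample.txt` and `kept.txt` are placeholders.

```shell
# Small stand-in for the real file (rows taken from the question's sample).
printf '1\t344544\trs30540\n2\t284783\trs34560\n' > test_sample.txt
printf 'Genome_phase,common=1,19,genomes=hg19\n44\t544643\trs32345\n' >> test_sample.txt

# Keep rows that begin with an ASCII digit; LC_ALL=C avoids multibyte locale handling.
LC_ALL=C grep '^[0-9]' test_sample.txt > kept.txt

# Count the rows that were NOT kept (-v inverts the match, -c counts lines).
dropped=$(LC_ALL=C grep -vc '^[0-9]' test_sample.txt)
echo "kept $(wc -l < kept.txt) rows, dropped $dropped"
```

Empty lines also fail the `^[0-9]` test, so any blank rows in the input are removed along with the header-like rows.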