Question: Script for extracting atomic position of nucleotide base
vahapel160 wrote:

Hi everyone.
I have a tab-delimited file (sample below) containing information about the atomic positions of nucleotide bases. How can I extract the first 10 lines of every 20,000 lines from a file that has 10^7 lines? Is there a script for this purpose?

BaseAtomNumber    atomic distances    NumberofNeighbour    IndexofAtom
1                 1.94895             655                  153
1                 2.34545             566                  543
.
.
.
.

Many Thanks in advance for your help !

 

george.ry1.1k wrote:

Assuming your file has a single-line header that needs stripping first, as shown, then something like:

tail -n+2 <yourfile> \
| split -l 20000 - <yourprefix> \
&& find <yourprefix>* -exec bash -c 'head -n10 {}' \; \
> <youroutfile> \
&& rm <yourprefix>*

This strips the header, splits the file into chunks of 20,000 lines, writes the first 10 rows of each chunk to an output file, and then deletes the intermediate files (make sure nothing else matches <yourprefix>*, or it will be deleted too).
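
If you would rather avoid the intermediate files, the same selection can probably be done in a single pass with awk. This is only a sketch and not part of the tested pipeline above: the function name first_rows_per_block and its argument order are made up for illustration, and it assumes the same single header line.

```shell
# Hypothetical helper (name and argument order are illustrative):
#   $1 = input file with one header line
#   $2 = number of rows to keep from the start of each block
#   $3 = block size in lines
first_rows_per_block() {
    # NR > 1 skips the header. NR - 2 is the zero-based index of a data line,
    # so (NR - 2) % block is its position inside the current block.
    awk -v keep="$2" -v block="$3" 'NR > 1 && (NR - 2) % block < keep' "$1"
}
```

For the case in the question this would be called as: first_rows_per_block yourfile 10 20000 > youroutfile. Since nothing is written to disk besides the output, there is no <yourprefix>* cleanup to worry about.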

vahapel160 wrote:

We tried this and it works well, thanks for your help.
