Question: sorting a csv file with 3 columns
dimitrischat wrote:

Hello all,

I am trying to sort a CSV file with 3 columns. Only the first column matters for sorting. It contains the numbers 1, 2, 3, 4, etc., each repeated many times, but not every number appears the same number of times. I want to sort the file by its first column, from the number that appears the fewest times to the number that appears the most times.

For example:

1
1
1
2
2
2
2
2
3
4
4

->

3
4
4
1
1
1
2
2
2
2
2

I am using a Mac, but I can get access to a Linux machine.

next-gen • 146 views
modified 27 days ago by Jorge Amigo • written 29 days ago by dimitrischat

How is this related to bioinformatics?

written 29 days ago by Pierre Lindenbaum

We could give him the benefit of the doubt: the numbers may be autosomal chromosomes ;)

written 27 days ago by Jorge Amigo

Yes, they are. I just changed the chr"x" names to numbers for convenience.

written 27 days ago by dimitrischat

Jorge Amigo (Santiago de Compostela, Spain) wrote:

There are multiple ways you could do it. Here's my suggestion:

perl -ne 'BEGIN { open IN, "input.csv" or die $!;
while (<IN>) { /^([^,\s]+)/ and $d{$1}++ } close IN }
/^([^,\s]+)/ and print "$d{$1}\t$_"' input.csv \
| sort -k1,1n -k2,2V | cut -f2-

The idea behind this code is to read the file once to count the number of times each value in the first column appears, and then to read the file a second time to prepend a new first column holding those counts. You then just need to sort numerically by that new first column and remove it at the end.
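
For readers who prefer not to use Perl, the same count-then-sort rationale can be sketched in Python. This is only an illustration, not the code from the answer above: the function names `sort_rows_by_key_frequency` and `sort_csv_text` are made up for this example, and it assumes a comma-separated file whose first column holds the value to count. Ties between values with equal counts are broken by plain string comparison of the value, which differs from the version sort (`sort -V`) used in the pipeline above.

```python
import csv
import io
from collections import Counter


def sort_rows_by_key_frequency(rows, key_index=0):
    """Return rows ordered from the rarest to the most frequent key value."""
    rows = list(rows)
    # First pass: count how many times each key value appears.
    counts = Counter(row[key_index] for row in rows)
    # Second pass: sort by (count, key). Including the key keeps rows that
    # share a value together even when two values have the same count.
    # sorted() is stable, so rows with the same key keep their input order.
    return sorted(rows, key=lambda row: (counts[row[key_index]], row[key_index]))


def sort_csv_text(text):
    """Sort CSV text by how often each first-column value occurs."""
    reader = csv.reader(io.StringIO(text))
    rows = [row for row in reader if row]  # skip blank lines
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(sort_rows_by_key_frequency(rows))
    return out.getvalue()


# Small demonstration on hypothetical data (not the poster's file).
example = "1,a\n1,b\n2,c\n"
print(sort_csv_text(example), end="")
```

To run this on a real file, the text could be read with `open("input.csv").read()` and passed to `sort_csv_text`; unlike the pipeline, no temporary count column has to be added and then removed, because the counts are only used inside the sort key.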

modified 27 days ago • written 27 days ago by Jorge Amigo

Is there any way to do this without Perl? Thanks a lot for your help!

written 26 days ago by dimitrischat

What's your problem with using Perl? (Not being rude or anything.)

modified 26 days ago • written 26 days ago by Dunois

I see. Would you be more comfortable implementing the same rationale in Excel, perhaps?

How to sort a column by occurrence count in Excel?

modified 26 days ago • written 26 days ago by Jorge Amigo

I'll try it in Excel. But the command above doesn't work for my file: http://www.mediafire.com/file/v0azlrb3sx9r5x7/example.csv/file

written 26 days ago by dimitrischat

Do you have R available? If yes, I can offer you a small R script that'll do the sorting for you.

written 26 days ago by Dunois

Yes, I do. I use RStudio (if that helps).

written 26 days ago by dimitrischat

Dunois wrote:

Here, this script should help then. (Let me know if you can't access it.)

Looks like I've also reached my post limit for the next six hours, so if you need help with the script, just reply to this comment, and I'll edit my replies in here.

modified 26 days ago • written 26 days ago by Dunois

Yes, finally! It worked! Thank you!

modified 26 days ago • written 26 days ago by dimitrischat