Ubuntu csplit command
2.4 years ago

I have a file like the one below, containing 50,000,000 rows. I want to split it by chromosome, so that output file 1 contains chr1, output file 2 contains chr2, and so on.

    V1    V2 V3 V4 V5   V6
1 chr1 10469  +  3  3 TCGC
2 chr1 10470  - 25 30 GCGA
3 chr1 10471  +  1  5 GCGG
4 chr1 10472  - 13 39 CCGC
5 chr1 10484  +  0  6 CCGG


I am using Ubuntu and the csplit command, but I could not figure it out. Could you please tell me what the syntax would be?

Thanks Shrinka


It can be as simple as grep -w chr1 yourfile > chr1file, grep -w chr2 yourfile > chr2file, and so on (-w stops chr1 from also matching chr10 through chr19). Add the header at the top if you need it.
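As an alternative to running one grep per chromosome, the whole file can be split in a single pass with awk. This is a swapped-in technique, not csplit, and only a sketch under assumptions: the file name sample.CpG and its rows are made up for illustration, and the chromosome is assumed to be the second whitespace-separated field, as in the listing in the question. If the leading row numbers are only an artifact of how the table was printed, the chromosome would be the first field and $2 would become $1.

```shell
# Hypothetical small sample standing in for the real 50-million-row file.
workdir=$(mktemp -d)
cat > "$workdir/sample.CpG" <<'EOF'
V1 V2 V3 V4 V5 V6
1 chr1 10469 + 3 3 TCGC
2 chr1 10470 - 25 30 GCGA
3 chr10 20000 + 1 5 GCGG
4 chr2 30000 - 13 39 CCGC
EOF

# Single pass: skip the header line (NR > 1) and write each row to a file
# named after the value in field 2 (assumed to be the chromosome).
awk -v dir="$workdir" 'NR > 1 { print > (dir "/" $2 ".txt") }' "$workdir/sample.CpG"

# One output file per chromosome name seen in the input.
ls "$workdir"
```

awk reads the input line by line, so memory use does not grow with the file size. It does keep one output file open per distinct chromosome name, which may matter if the file contains a very large number of contig names.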


Thanks for your reply. I have tried that. It produces a 0 KB output file; maybe it is a memory-related issue. My machine does not have enough RAM for a file this big, so I thought the csplit command might be useful in a memory-constrained situation.

Thanks Shrinka


If your example file above is correct, then the above command should work. Did you copy the file over to Unix from a Windows machine by any chance?
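One way to check for Windows-related damage is to look for carriage returns and at the raw bytes of the first line. The sketch below builds a made-up file (windows_sample.txt, a hypothetical name) with CRLF line endings purely to show what the check reports; on the real data, point the same two commands at the actual file.

```shell
workdir=$(mktemp -d)
# Hypothetical two-line file written with Windows (CRLF) line endings.
printf 'V1 V2\r\n1 chr1\r\n' > "$workdir/windows_sample.txt"

# Hold a literal carriage return in a variable (portable across sh and bash).
cr=$(printf '\r')

# Count lines containing a carriage return; 0 would mean Unix line endings.
crlf_lines=$(grep -c "$cr" "$workdir/windows_sample.txt")
echo "lines with CR: $crlf_lines"

# Dump the raw bytes of the first line; \r before \n, or \0 bytes between
# letters (a sign of UTF-16), would show up here.
head -n 1 "$workdir/windows_sample.txt" | od -c
```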


I have Windows 10 on my laptop. I loaded Ubuntu using this and am working in that: https://crashcourse.housegordon.org/split-fasta-files.html

I used precisely this command

grep "chr1" B19818.CEMT_178.Bisulfite-Seq.hg38.B19818_2_lanes_dupsFlagged.q5.5mC.CpG

It is generating 0 KB files

If needed I can send one file to you

Regards

Shrinka


grep uses hardly any memory, since it reads the file line by line. Please post the exact command you are typing and which OS you are on.


You need to use output redirection:

grep -w chr1 B19818.CEMT_178.Bisulfite-Seq.hg38.B19818_2_lanes_dupsFlagged.q5.5mC.CpG > B19818.CEMT_178.Bisulfite-Seq.hg38.B19818_2_lanes_dupsFlagged.q5.5mC.CpG.chr1

The new file is written to B19818.CEMT_178.Bisulfite-Seq.hg38.B19818_2_lanes_dupsFlagged.q5.5mC.CpG.chr1
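To tell an empty result apart from a redirection problem, it can help to count the matching lines first and then compare that with the number of lines written. A minimal sketch, using a made-up three-line input (sample.CpG is a hypothetical stand-in for the real file):

```shell
workdir=$(mktemp -d)
infile="$workdir/sample.CpG"   # hypothetical stand-in for the real input
printf '%s\n' \
  'V1 V2 V3 V4 V5 V6' \
  '1 chr1 10469 + 3 3 TCGC' \
  '2 chr10 10470 - 25 30 GCGA' > "$infile"

# Count matching lines without writing anything; -w keeps chr10 out.
matches=$(grep -c -w chr1 "$infile")

# Same match, this time redirected into a new file.
grep -w chr1 "$infile" > "$infile.chr1"

# Count the lines that actually landed in the output file.
written=$(wc -l < "$infile.chr1")
echo "matched: $matches, written: $written"
```

If the count is already 0, the pattern is not matching the file contents at all, and the size of the output file is only a symptom.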


Nope, the same problem remains.


Put an example file up at pastebin.com (it does not need to be the complete file).
