Any python script to merge files
7.2 years ago
BehMah ▴ 50

Hi guys,

I have lists of genes and their RPMs for several samples (BED files), and I want to merge all genes and their RPMs from these BED files into one new BED file for differential expression analysis. Could anybody please give me a Python script to do this?

Many thanks

RNA-Seq • 2.6k views
7.2 years ago
BehMah ▴ 50

It's like the following: if the gene_id in A.txt and B.txt is identical, write value1 from both files to the output file.

A.txt: gene_id value1

B.txt: gene_id value1

output: gene_id value1(A.txt) value1(B.txt)
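
The rule described above could be sketched in Python roughly as follows. This is only a minimal sketch under assumptions that are not stated in the thread: both files are tab-delimited, have no header row, and hold gene_id in the first column and value1 in the second. The function names read_gene_values and merge_two are made up for illustration.

```python
import csv


def read_gene_values(path, value_col=1):
    """Read a tab-delimited file into {gene_id: value}.

    Assumes gene_id is in column 0 and there is no header row.
    """
    values = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) > value_col:  # skip blank or short lines
                values[row[0]] = row[value_col]
    return values


def merge_two(path_a, path_b, out_path):
    """Write gene_id, value1(A), value1(B) for genes found in both files."""
    a = read_gene_values(path_a)
    b = read_gene_values(path_b)
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        for gene_id, value_a in a.items():
            if gene_id in b:
                writer.writerow([gene_id, value_a, b[gene_id]])
```

Calling merge_two("A.txt", "B.txt", "merged.tsv") would then write one line per gene that appears in both files; genes found in only one file are dropped.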

7.2 years ago

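You can use the join command of csvtk, assuming a.txt and b.txt are tab-delimited files without a header row (with -H, the key column is given by its index rather than its name):

csvtk join -H -t -f 1 a.txt b.txt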
7.2 years ago
BehMah ▴ 50

Hi shenwei356,

Thanks for your reply and this useful tool. I was just wondering whether it can still be run on 40 input files? Also, the *.txt files have several tab-separated columns in addition to value1; can I select only value1?


Please reply to an answer by clicking ADD COMMENT below it.

csvtk join supports joining two or more files.

If your files have more columns, you can first extract the gene_id and value columns into separate files, and then join those.

Do the files have column names? If so, you can cut by column name (e.g., GENE, VALUE); if not, turn on the -H flag and use column indices (e.g., 1, 2).

for f in *.txt; do \
    csvtk cut -H -t -f "1,2" "$f" > "$f.gene2value.tsv"; \
done

Then join:

csvtk join -H -t -f 1 *.gene2value.tsv
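
Since the original question asked for a Python script, the same cut-then-join idea could be sketched with the standard library alone. This is a rough sketch, not a replacement for csvtk: it assumes tab-delimited files without a header row, gene_id in the first column and the wanted value in the second, and it keeps only genes present in every file. The name merge_many and its parameters are hypothetical.

```python
import csv


def merge_many(paths, out_path, key_col=0, value_col=1):
    """Join any number of tab-delimited files on key_col.

    Only value_col is taken from each file, and only keys present in
    every file are written. Returns the number of rows written.
    """
    tables = []
    for path in paths:
        with open(path, newline="") as handle:
            rows = csv.reader(handle, delimiter="\t")
            tables.append({
                row[key_col]: row[value_col]
                for row in rows
                if len(row) > max(key_col, value_col)  # skip short lines
            })
    if not tables:
        return 0
    # Keep the gene order of the first file.
    shared = [gene for gene in tables[0] if all(gene in t for t in tables[1:])]
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        for gene in shared:
            writer.writerow([gene] + [t[gene] for t in tables])
    return len(shared)


# Example call (assumed file names):
# import glob
# merge_many(sorted(glob.glob("*.txt")), "merged.tsv")
```

The output columns follow the order of the paths list, so sorting the file names first keeps the sample order predictable across runs. If value1 sits in a different column, value_col can be changed accordingly (it is zero-based here, unlike the csvtk field numbers).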

Thanks so much, shenwei356. When I run the first script, it retrieves only the first line of fields 1 and 2. It seems like it doesn't loop at all?
