Question: Merging specific columns from different txt files into a single file
5 months ago by
jivarajivaraj40 wrote:

Hi,

I counted reads in each BAM file with featureCounts, and now I have many count.txt files. How can I merge the 7th column of each txt file, plus the first column (gene ID) from one of the files, into a single txt file on macOS? This is what I tried, and the error I got:

$ join 7 counts24.txt counts25.txt counts26.txt counts27.txt counts28.txt counts29.txt counts30.txt counts31.txt counts32.txt counts33.txt counts34.txt counts35.txt counts36.txt counts37.txt counts38.txt counts43.txt counts44.txt counts45.txt counts46.txt counts47.txt counts48.txt counts49.txt counts50.txt counts51.txt counts52.txt counts53.txt counts54.txt counts55.txt counts56.txt counts57.txt > out.txt
usage: join [-a fileno | -v fileno ] [-e string] [-1 field] [-2 field]
            [-o list] [-t char] file1 file2
software • 277 views

A simple search returns multiple hits showing how to run featureCounts on multiple BAM files and generate a single expression matrix covering all samples, for example:

combining quantification (featureCounts) result files into a single dataset

https://support.bioconductor.org/p/64932/

Finally, read the featureCounts manual first. It supports multiple BAMs. There is no harm in reading a software manual; manuals are written so that you can use the tool effectively.

written 5 months ago by vchris_ngs4.5k

Why is this a "software error"? Please use sensible tags.

What have you tried to solve this issue? Users will help you sooner if you show some effort yourself.

written 5 months ago by WouterDeCoster32k

Sorry, I googled and tried the code above, which gave me the error. That is why I tagged it as a software error.

written 5 months ago by jivarajivaraj40

Can you provide the first few lines of any two files?

written 5 months ago by cpad01129.3k
5 months ago by
venu5.6k wrote:

FYI, featureCounts accepts many bam files at once and generates one count table for all BAMs.
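
As a rough sketch of what that looks like: a single featureCounts run over several BAMs writes one table whose first six columns are annotation (Geneid, Chr, Start, End, Strand, Length) and whose remaining columns hold one count column per BAM. The file names below (annotation.gtf, all_counts.txt, the sample BAMs) and the helper name counts_matrix are placeholders for illustration, not taken from the question.

```shell
# Hypothetical featureCounts call over several BAMs at once:
#   featureCounts -a annotation.gtf -o all_counts.txt sample24.bam sample25.bam
#
# counts_matrix FILE
# Prints the gene ID column plus every per-BAM count column (7 onward),
# dropping the "#" comment line at the top of the featureCounts output.
counts_matrix() {
    grep -v '^#' "$1" | cut -f1,7-
}

# Example call:
#   counts_matrix all_counts.txt > out.txt
```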

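Regarding your error, that is not the proper way to use join: as the usage message shows, it takes exactly two files (file1 file2), and the join field is given with -1 and -2 rather than as a bare 7. I would guess the order of genes is the same in all counts.txt files from featureCounts, so you can simply do paste counts*.txt after a little preprocessing (i.e. keep only the gene_id and counts columns in each file).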
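
A minimal sketch of that preprocessing, assuming every file lists the genes in the same order (nothing here checks that) and that each file starts with the "#" comment line and header row that featureCounts normally writes. The function name merge_counts is made up for illustration. It takes the gene ID column once, from the first file, so the result has one gene column followed by one count column per input file.

```shell
# merge_counts OUT FILE...
# Writes column 1 (gene ID) of the first FILE, then column 7 of every FILE, to OUT.
merge_counts() {
    out=$1; shift
    tmp=$(mktemp -d "${TMPDIR:-/tmp}/merge_counts.XXXXXX") || return 1
    # Gene IDs come from the first file; skip the "#" comment line.
    grep -v '^#' "$1" | cut -f1 > "$tmp/merged"
    for f in "$@"; do
        # Append this file's count column to the growing table.
        grep -v '^#' "$f" | cut -f7 > "$tmp/col"
        paste "$tmp/merged" "$tmp/col" > "$tmp/next"
        mv "$tmp/next" "$tmp/merged"
    done
    mv "$tmp/merged" "$out"
    rm -rf "$tmp"
}

# Example call (file names from the question):
#   merge_counts out.txt counts*.txt
```

The glob expands in lexical order, which matches numeric order here only because all the file numbers have two digits.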

5 months ago by
Alex Reynolds26k wrote:

One way is to paste together a bunch of process substitutions, each of which cuts the desired column from its file:

$ paste <(cut -f1 somegeneid.txt) <(cut -f7 counts24.txt) <(cut -f7 counts25.txt) ... <(cut -f7 counts57.txt) > out.txt

Fill in ... with the rest of the substitutions for counts26.txt through counts56.txt. A script could programmatically generate and run this command for you if your files have a consistent naming scheme.
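
One possible way to generate the command, as a sketch: the function below only builds the paste command as a string, one process substitution per file. The name build_paste_cmd is hypothetical, and the simple single-quoting assumes no file name contains a single quote.

```shell
# build_paste_cmd GENE_FILE COUNT_FILE...
# Prints a paste command that takes column 1 of GENE_FILE and
# column 7 of each COUNT_FILE via process substitution.
build_paste_cmd() {
    gene_file=$1; shift
    cmd="paste <(cut -f1 '$gene_file')"
    for f in "$@"; do
        cmd="$cmd <(cut -f7 '$f')"
    done
    printf '%s\n' "$cmd"
}

# Inspect the generated command first:
#   build_paste_cmd somegeneid.txt counts*.txt
# Then run it in a shell that supports process substitution (bash or zsh):
#   eval "$(build_paste_cmd somegeneid.txt counts*.txt)" > out.txt
```

Printing the command before running it makes it easy to confirm that the glob picked up the intended files in the intended order.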
