Question: fastq file data
Sam wrote:

Hello

How can I show the nucleotide distribution and the sequence length distribution of multiple FASTQ files (1.fq, 2.fq, ..., 6.fq) in just two separate graphs?

Thanks

rna-seq fastq • 745 views
modified 16 months ago by Istvan Albert • written 16 months ago by Sam

What have you tried or found so far?

written 16 months ago by Pierre Lindenbaum

I tried the FASTX-Toolkit, but it only handles one library at a time, and I want to combine the data from all libraries in one graph.

written 16 months ago by Sam

What about merging the FASTQ files before running fastx?

written 16 months ago by Pierre Lindenbaum

I don't want to merge all the reads. I just want to show the sequence lengths of multiple FASTQ files separately (for instance, distinguished by color) in one graph.

written 16 months ago by Sam

"I don't want to merge all the reads"

http://hannonlab.cshl.edu/fastx_toolkit/commandline.html

"Tools can read from STDIN"

I think you should consider this option.

written 16 months ago by Pierre Lindenbaum

So I should use the output of fastx_quality_stats as input for the FASTA/Q Nucleotide Distribution tool. Is this the right command?

fastx_nucleotide_distribution_graph.sh -i file1.TXT file2.TXT file3.TXT [-t TITLE]  [-o OUTPUT]
written 16 months ago by Sam

No harm in trying the command out :)

written 16 months ago by genomax

I think it should be something like

gunzip -c *.fq.gz | fastx_nucleotide_distribution_graph.sh -o OUTPUT
modified 16 months ago • written 16 months ago by Pierre Lindenbaum

This is probably about Illumina data, but it never hurts to mention this!

written 16 months ago by WouterDeCoster

The above commands do not work. Apart from the quick replies about "hurts" and "harm", could you help me find a way?

written 16 months ago by Sam

You didn't tell us where you got the data from. So, Illumina data?

"The above commands do not work"

You'll have to tell us a bit more about how those don't work.

written 16 months ago by WouterDeCoster

"The above commands do not work"

written 16 months ago by Pierre Lindenbaum

But I explained the whole story. fastx_nucleotide_distribution_graph.sh only takes the TXT output file of fastx_quality_stats as input. With one TXT input file it works well, but with two input files (-i file1.txt file2.txt) I get this error:

gnuplot> set term png size 1048,768
                  ^
         line 0: unknown or ambiguous terminal type; type just 'set terminal' for a list

WARNING: Plotting with an 'unknown' terminal.

These are Illumina FASTQ files. Could you suggest other scripts?

modified 16 months ago • written 16 months ago by Sam

Anything wrong with FastQC?

written 16 months ago by WouterDeCoster

No, everything is fine with FastQC. I can only post 5 posts per 6 hours; is there any way to raise that limit?

written 16 months ago by Sam

"I can only post 5 posts per 6 hours; is there any way to raise that limit?"

After you have been on Biostars for a while this restriction will be removed.

written 16 months ago by WouterDeCoster

fastx calls gnuplot; it seems that your gnuplot installation is not complete:

https://stackoverflow.com/questions/22816030

written 16 months ago by Pierre Lindenbaum

It works without any problem with -i file1.txt, and I get a nice plot as output. The problem is that fastx_nucleotide_distribution_graph.sh is not compatible with more than one input file, so I need to find alternative scripts to merge the nucleotide distribution graphs from distinct FASTQ files.

modified 16 months ago • written 16 months ago by Sam
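
One possible alternative for the nucleotide distribution, sketched here with plain awk rather than the FASTX-Toolkit: count bases per read position in every file and emit one long table that a plotting tool can split by file. The function name nucleotide_table is made up for this illustration, and the sketch assumes uncompressed FASTQ files with exactly four lines per record (no wrapped sequence lines).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of FASTX-Toolkit or FastQC).
# Prints a tab-separated table: file, read position, base, count.
# Assumption: every FASTQ record is exactly 4 lines long.
nucleotide_table() {
    printf 'file\tposition\tbase\tcount\n'
    awk '
        FNR % 4 == 2 {                     # line 2 of each record is the sequence
            seq = toupper($0)
            for (i = 1; i <= length(seq); i++)
                count[FILENAME "\t" i "\t" substr(seq, i, 1)]++
        }
        END {
            for (key in count)
                print key "\t" count[key]
        }
    ' "$@" | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n -k3,3
}
```

It could be called as `nucleotide_table 1.fq 2.fq 3.fq > nucleotides.tsv`; the resulting table can then be plotted in R, Excel, or gnuplot with one panel or color per file.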

If you need just one plot for the entire dataset, then cat'ing the FASTQ files together to generate one input (per read, R1/R2) may be the way to go.

written 16 months ago by genomax

I am thinking of writing awk code to get the sequence length frequency in each FASTQ file, and then merging the results in Excel to get a single sequence length graph with a different color for each library.

modified 16 months ago • written 16 months ago by Sam
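
The awk idea above might look roughly like the following. This is only a sketch: the function name length_table is hypothetical, and it assumes uncompressed FASTQ files with exactly four lines per record. It writes one row per file and read length, so the table can be pasted into Excel and plotted as one series (color) per library.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sequence length frequencies for several FASTQ files.
# Prints a tab-separated table: file, read length, number of reads.
# Assumption: every FASTQ record is exactly 4 lines long.
length_table() {
    printf 'file\tlength\tcount\n'
    awk '
        FNR % 4 == 2 { count[FILENAME "\t" length($0)]++ }   # sequence lines only
        END { for (key in count) print key "\t" count[key] }
    ' "$@" | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n
}
```

For example, `length_table 1.fq 2.fq 3.fq 4.fq 5.fq 6.fq > lengths.tsv` would give a single file covering all six libraries. Using FNR instead of NR matters here, because FNR restarts at 1 for each input file and so keeps the 4-line record arithmetic correct across files.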
Istvan Albert wrote:

FastQC produces both plots.

written 16 months ago by Istvan Albert