Question: multiple Fastq file alignment
18 months ago, me.sr1510 wrote:

Can we align two FASTQ files against one another without using a reference? What are the other options, apart from comparing the two VCF files generated with samtools? The data are from Illumina whole-genome sequencing.

Kindly advise.

alignment • 1.1k views
modified 17 months ago by Biostar • written 18 months ago by me.sr1510

What is your end goal?

written 18 months ago by Devon Ryan

To align the FASTQ files and get the uncommon genes between them.

written 18 months ago by me.sr1510

The goal is fine, but the way you want to go about it seems very odd. You could remove the reads that are duplicated between the two files and then look at what is left, though that approach is not without its own problems.
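To make the "remove shared reads and look at what is left" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not something posted in this thread: it assumes plain 4-line FASTQ records, treats two reads as duplicates only when their sequences match exactly, and holds every sequence in memory. The function names `read_sequences` and `unshared_reads` are made up for this example.

```python
import gzip
from itertools import islice


def read_sequences(path):
    """Return the set of read sequences in a 4-line-per-record FASTQ file."""
    # Assumption: gzip-compressed files are recognised by a ".gz" suffix.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # The second line of every 4-line record holds the sequence.
        return {line.strip() for line in islice(handle, 1, None, 4)}


def unshared_reads(path_a, path_b):
    """Return (only_in_a, only_in_b): sequences found in just one file."""
    seqs_a = read_sequences(path_a)
    seqs_b = read_sequences(path_b)
    return seqs_a - seqs_b, seqs_b - seqs_a
```

Because matching is exact, any sequencing error or difference in read start position makes a read look "unique", which is probably one of the problems alluded to above.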

modified 18 months ago • written 18 months ago by genomax

I would like to know whether there is any way to compare the two files with each other.

written 18 months ago by me.sr1510

Comparing raw data is not going to be of much help. Your best bet is to follow @Brian's suggestion below.

written 18 months ago by genomax

Define "uncommon genes". Are these genes that have differences in coverage, difference in sequence, exist in one but not the other according to de novo assembly, something else completely...?

written 18 months ago by Devon Ryan

Difference in sequence

written 18 months ago by me.sr1510

Not to mention that the average FASTQ file contains thousands, if not millions, of separate reads, so you'd have an impossibly large dataset to analyse even if you could align them all against one another.
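A quick back-of-the-envelope calculation shows the scale. The read counts below are hypothetical, chosen only for illustration; `all_vs_all_pairs` is a made-up helper name.

```python
def all_vs_all_pairs(reads_a, reads_b):
    """Number of read pairs if every read in A is compared with every read in B."""
    return reads_a * reads_b


# Hypothetical read counts: one million reads per file.
pairs = all_vs_all_pairs(1_000_000, 1_000_000)
print(f"{pairs:.1e} pairwise comparisons")
```

The number of comparisons grows with the product of the two read counts, so real whole-genome runs with many more reads per file are far worse than this toy case.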

written 17 months ago by jrj.healey

It seems like it makes more sense for your goal to assemble both datasets, call and annotate the genes, and then compare the annotated genes.
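The final comparison step could look roughly like the Python sketch below. It assumes that assembly and annotation have already been run with tools of your choice, and that each sample's annotated genes were exported to a FASTA file whose header starts with a gene name that is consistent between the two samples. That is a strong assumption, and the file layout and the names `parse_fasta` and `compare_genes` are hypothetical.

```python
def parse_fasta(path):
    """Return {record_id: sequence} for a FASTA file."""
    records, name, chunks = {}, None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    records[name] = "".join(chunks)
                # Use the first word of the header as the gene identifier.
                name, chunks = (line[1:].split() or [""])[0], []
            elif line and name is not None:
                chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def compare_genes(fasta_a, fasta_b):
    """Report genes unique to each sample and shared genes whose sequence differs."""
    genes_a, genes_b = parse_fasta(fasta_a), parse_fasta(fasta_b)
    shared = genes_a.keys() & genes_b.keys()
    return {
        "only_in_a": sorted(genes_a.keys() - genes_b.keys()),
        "only_in_b": sorted(genes_b.keys() - genes_a.keys()),
        "different_sequence": sorted(g for g in shared if genes_a[g] != genes_b[g]),
    }
```

Exact string comparison only flags that two gene sequences differ; to see where they differ you would still need to align each differing pair.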

Please update your question with more specifics about what kind of organisms and data you have, and your specific goal.

written 18 months ago by Brian Bushnell

Since you do not have a reference (or do not want to use one), this would be de novo assembly. You would not get genes, as there is no reference; you would get scaffolds and/or contigs instead, and you could then look for scaffolds and/or contigs that differ between the two samples. Aligning one raw file against another raw file (especially raw reads) does not make sense to me, at least to my limited knowledge.

written 18 months ago by cpad0112