5.2 years ago
sneha.preha7
Hello All,
I am facing this issue: I have treated my files with an awk command, but the reads are still not identical. How can I get it to run? Please help me.
9.7G file_1_awk.fastq
9.7G file_2_awk.fastq
8.3G FP_awk.fastq
62M RU_2_awk.fastq
253M FU_awk.fastq
8.3G RP_2_awk.fastq
[sneharl@compute-0-8 nep11]$ head FU_1_awk.fastq
@1_2/2
AAAGGGCATGGATTTGTATTTCAAGAGGCATNATGGGCAAGCTGTAACTTGTGAAGATTTNTTTGCTGCCATGAGAGATGCAAATGANGCA
+
ADHIIEHIIJIIJGIJAFIIGGIIIFJJJJI#07DHHIDGIHGIHGEHGEGEHEEEEEFE#,;ACACDDDDDDCCCDBDCCCCDDDD#+8?
@1_3/2
CTCTCCGCGTTGAAACCCTAAACCTACCCCCTACCTCAGGAATCGCCATGAAAGGAGGCAAATCGAAGGCTGAGCCGAAAAAGGCCGAC
+
HFGGIJIJJHIIJEGIJIB>DDGIC>FHIJJGDHIGIEEHHECCDEF?AC@CDDBB@@BBBDCA?8??<ACA@BCD<3>BB?>?ABB>B
@1_4/2
GTTGACTTCTCAAAGAGCAGTAAGTGTGCCCTTCAATGGGCGATCGATAATCTGGCCAACAAGGGAGATACCACACTCTTCATCATCCATG
[sneharl@compute-0-8 nep11]$ head RU_2_awk.fastq
@2_78/2
AAGCATCACGTCAAATGAACAGCCGTACAATACGCAGCGCACCTTATTCCAACGCCTTTTCTCGTCAACGATTTACGATTGCAAATTATCA
+
HHHJJJIJJJJJJJJJJIJJJJJJJIJIJIIJEHJJIJJIHIHHHHHHFFFFFDDDDDDDDDDDDDDDDDDDDDDEBDDD?CCDDDDDEDC
@2_682/2
AATACAAGAAAATTTCGTCTCATTCAAAAGTCCCTT
+
A?D<<<:AFF3<FEFI@+A8?4?E@ECFC3*1?**0
@2_735/2
ACTATGACAGATATCGATACCGATATTTTCATCCATCCACCGGACCCAAAATATACTACCAAAAAGGAAATGATTTCTCTTCACTTCGTTC
Error, pairs.K25.stats is empty. Be sure to check your fastq reads and ensure that the read names are identical except for the /1 or /2 designation. at /share/apps/trinityrnaseq-Trinity-v2.8.3/util/insilico_read_normalization.pl line 921.
Error, cmd: /share/apps/trinityrnaseq-Trinity-v2.8.3/util/insilico_read_normalization.pl --seqType fq --JM 96G --max_cov 200 --min_cov 1 --CPU 10 --output /state/partition1/sneha2/nep11/trinity_out_dir/insilico_read_normalization --max_pct_stdev 10000 --left /state/partition1/sneha2/nep11/SRR2551776_1.renamed.fastq --right /state/partition1/sneha2/nep11/SRR2551776_2.renamed.fastq --pairs_together --PARALLEL_STATS died with ret 512 at /share/apps/trinityrnaseq-Trinity-v2.8.3/Trinity line 2684.
main::process_cmd("/share/apps/trinityrnaseq-Trinity-v2.8.3/util/insilico_read_n"...) called at /share/apps/trinityrnaseq-Trinity-v2.8.3/Trinity line 3230
main::normalize("/state/partition1/sneha2/nep11/trinity_out_dir/insilico_read_"..., 200, ARRAY(0x7f334980a5f0), ARRAY(0x7f334980a5d8)) called at /share/apps/trinityrnaseq-Trinity-v2.8.3/Trinity line 3177
main::run_normalization(200, ARRAY(0x7f334980a5f0), ARRAY(0x7f334980a5d8)) called at /share/apps/trinityrnaseq-Trinity-v2.8.3/Trinity line 1314
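The error above asks for read names that are identical apart from the /1 or /2 suffix. A minimal sketch of how that could be checked before launching Trinity is shown here; the helper names (`read_names`, `base_name`, `first_mismatch`) are hypothetical and the code assumes plain, uncompressed 4-line FASTQ records.

```python
import itertools


def read_names(lines):
    """Yield the read name (without the leading '@') of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 0:  # header line of each record
            fields = line.split()
            yield fields[0][1:] if fields else ""


def base_name(name):
    """Drop a trailing /1 or /2 so that mates can be compared."""
    if name.endswith(("/1", "/2")):
        return name[:-2]
    return name


def first_mismatch(left_lines, right_lines):
    """Return (record_index, left_name, right_name) for the first pair whose
    names differ (or where one file ends early), or None if all pairs match."""
    pairs = itertools.zip_longest(read_names(left_lines), read_names(right_lines))
    for idx, (left, right) in enumerate(pairs):
        if left is None or right is None or base_name(left) != base_name(right):
            return idx, left, right
    return None


# Tiny in-memory example (made-up records, not the poster's data)
left = ["@1_2/1", "ACGT", "+", "IIII", "@1_3/1", "ACGT", "+", "IIII"]
right = ["@1_2/2", "ACGT", "+", "IIII", "@2_78/2", "ACGT", "+", "IIII"]
print(first_mismatch(left, right))
```

On real data the two arguments could be open file handles for the left and right FASTQ files, since the function only iterates over lines. A non-None result means the files are out of sync or named inconsistently at that record.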
Hi sneha.preha7, first of all, your question is lacking essential details. What is awk cmd? The exact command and its purpose are required in order to understand your question. Please also see Brief Reminder On How To Ask A Good Question. Second, your data are out of order, probably because of the awk cmd causing massive deletion of reads in the reverse file. Also, why is trimmomatic a tag, as you did not mention it in the post? Please edit your question and provide details to reproduce the problem.
Sorry for the inconvenience,
Question: "Is there any way to make the reverse reads identical? Should I merge both files with fastqjoiner and split them again, and will that affect the quality of the data?" What should I do to make the reverse reads identical? The size of my reverse read file has decreased drastically.
I am trying to assemble a few SRA FASTQ files. I ran an awk command in order to make both reads (Read 1 and Read 2) identical.
Here is the command I used:
Before the awk command the file size was 12 GB and afterwards it is 9.7 GB (I checked the number of reads with FastQC and it is the same; only the white space has been removed from the files).
After this I ran the Trimmomatic command for trimming, but in the output I got very few reverse reads. Here is the command:
Output:
After this the file sizes are:
After this I tried to run Trinity and it aborted with the following error:
Later I checked the head of the files and found this:
sneha.preha7 : Please use ADD REPLY/ADD COMMENT when responding to existing posts to keep threads logically organized.
How did you download the SRA files? Did you use the -F option (to restore original illumina command lines) then? It looks like either your original fastq files had odd headers or your awk manipulation may have done that.
No, I downloaded them using the wget command with the link provided by ENA.
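On the earlier question of how to make the forward and reverse files match again: one common approach is to keep only reads whose names occur in both files and write them out in the same order. The following is only a rough sketch of that idea, not the method used by Trinity or Trimmomatic. The names `records`, `pair_key` and `resync` are hypothetical, and the whole reverse file is held in memory, so for files of several GB a dedicated re-pairing tool is likely a better fit.

```python
def records(lines):
    """Group an iterable of FASTQ lines into 4-line records (lists of str)."""
    rec = []
    for line in lines:
        rec.append(line.rstrip("\n"))
        if len(rec) == 4:
            yield rec
            rec = []


def pair_key(rec):
    """Read name without the leading '@' and without a trailing /1 or /2."""
    fields = rec[0].split()
    name = fields[0][1:] if fields else ""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def resync(left_lines, right_lines):
    """Keep only reads present in both inputs, ordered as in the left input.

    Returns (left_records, right_records), two lists of equal length.
    """
    # Assumption: the right-hand file fits in memory.
    right_by_name = {pair_key(r): r for r in records(right_lines)}
    left_out, right_out = [], []
    for rec in records(left_lines):
        mate = right_by_name.get(pair_key(rec))
        if mate is not None:
            left_out.append(rec)
            right_out.append(mate)
    return left_out, right_out


# Tiny in-memory example (made-up records, not the poster's data)
left = ["@a/1", "ACGT", "+", "IIII",
        "@b/1", "CCCC", "+", "IIII",
        "@c/1", "GGGG", "+", "IIII"]
right = ["@c/2", "TTTT", "+", "IIII",
         "@a/2", "AAAA", "+", "IIII"]
left_paired, right_paired = resync(left, right)
print([r[0] for r in left_paired], [r[0] for r in right_paired])
```

This only helps if the names of true mates really do agree once the /1 or /2 suffix is removed. If an earlier renaming step gave the two files different name prefixes, that step would have to be fixed first.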