The data I am analyzing were generated on the SOLiD platform with paired-end sequencing. The problem is that the forward and reverse reads have different lengths. My procedure was to align the csfastq files with bowtie against a color-space genome reference index, but I aligned the forward and reverse files separately. Is this approach correct?
The article that describes this data states:
"The samples were sequenced using the 50625 paired-end protocol, generating 75 nt+35 nt (Paired-End)+5 nt (Barcode) sequences. Quality data were measured using software SETS parameters (SOLiD Experimental Tracking System). For both reads, forwards and reverse, the seed was the first 25 nucleotides with a maximum of 2 mismatches."
I have a few questions and would appreciate your help.
1- Do I have to align using the first 25 nucleotides as the seed, as in the article?
2- The 5 barcode nucleotides: when I ran FastQC, I did not see them in the analysis results. Do I still need to remove the first 5 nucleotides of each read?
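To make question 2 concrete, this is a minimal sketch of what removing the first 5 positions from every read of a base-space FASTQ file would look like. It assumes a plain four-line FASTQ layout, and the function name `trim_fastq_prefix` is only illustrative. Whether the barcode is actually present at the start of my reads is exactly what I am unsure about, so this only shows the operation, not whether it should be applied.

```python
def trim_fastq_prefix(lines, n=5):
    """Yield FASTQ lines with the first n positions removed from each read.

    Assumes a strict four-line FASTQ layout:
    header, sequence, '+' separator, quality string.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 in (1, 3):
            # Sequence and quality lines are shortened by the same amount
            # so that they stay the same length.
            yield line[n:]
        else:
            # Header and separator lines are passed through unchanged.
            yield line


if __name__ == "__main__":
    example = ["@read1", "ACGTACGTAC", "+", "IIIIIHHHHH"]
    for out_line in trim_fastq_prefix(example, n=5):
        print(out_line)
```

As far as I understand, Trimmomatic's `HEADCROP:5` step performs the same kind of cut, so a separate script would not be needed if the trimming turns out to be necessary.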
My workflow is as follows:
csfasta/.qual files -------> alignment with bowtie (color-space index, hg19; F and R files separately) -------> SAM to FASTQ with the 123fastq software -------> FastQC -------> Trimmomatic -------> alignment with HISAT2 (GRCh38; F and R files together) -------> htseq-count
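For the first alignment step, here is a rough sketch of how I understand the article's seed setting (first 25 nucleotides, at most 2 mismatches) would map onto bowtie options. It assumes bowtie 1 in a version that still supports color space, and the index and file names are placeholders, not my real paths. The helper only builds the argument list; it does not run bowtie.

```python
def bowtie_colorspace_cmd(index, csfasta, qual, out_sam, seed_len=25, seed_mms=2):
    """Build an argument list for a single-end color-space bowtie 1 run.

    All file and index names are supplied by the caller; the defaults
    mirror the seed described in the article (25 nt, up to 2 mismatches).
    """
    return [
        "bowtie",
        "-C",                 # reads and index are in color space
        "-f",                 # read file is FASTA-like (csfasta)
        "-Q", qual,           # separate quality file matching the csfasta
        "-l", str(seed_len),  # seed length
        "-n", str(seed_mms),  # maximum mismatches allowed in the seed
        "-S",                 # write SAM output
        index,
        csfasta,
        out_sam,
    ]


if __name__ == "__main__":
    # Placeholder names, for illustration only.
    cmd = bowtie_colorspace_cmd("hg19_c", "reads_F3.csfasta", "reads_F3_QV.qual", "F3.sam")
    print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run` once the real paths are filled in. For a paired run, bowtie also accepts `-1`/`-2` for the two read files and `--Q1`/`--Q2` for their quality files, which would replace the single read file and `-Q` used here.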
Is this workflow correct?
Thank you in advance for your help.