I want to align human sequence reads to the reference genome. To test an algorithm, I need reads of length 150 bp or more (e.g. 500 bp). Where can I find such reads, both single-end and paired-end? I have reads from the 1000 Genomes Project, but they are only around 100 bp long and I need longer ones.
Use the Sequence Read Archive (SRA) advanced search, setting the read length to 150 and the species to Homo sapiens. You can add more filters if needed. The query should look something like this:
(150[ReadLength]) AND "Homo sapiens"[orgn:__txid9606]
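If you want to repeat this search for several read lengths (say 150 and 500), the query string can be built programmatically. The sketch below is only an illustration: the helper names `sra_term` and `esearch_url` are made up for this example, and it assumes the same term can be sent to the NCBI E-utilities `esearch` endpoint with `db=sra`. It only builds the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities esearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def sra_term(read_length, organism="Homo sapiens", taxid=9606):
    """Build a search term in the same form as the query shown above."""
    return f'({read_length}[ReadLength]) AND "{organism}"[orgn:__txid{taxid}]'


def esearch_url(term, retmax=20):
    """Build (but do not fetch) an esearch URL for the SRA database.

    retmax limits how many record IDs the service would return.
    """
    params = {"db": "sra", "term": term, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    for length in (150, 500):
        term = sra_term(length)
        print(term)
        print(esearch_url(term))
```

The printed term can also be pasted directly into the SRA search box on the website.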
This is a very basic requirement, and many tools are available to simulate artificial reads from the genome in question. Better still, you can also define the number, length and quality of the reads. One such well-documented program is ArtificialFastqGenerator.
Here is the publication link
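To give a feel for what read simulation does, here is a minimal Python sketch. It is not how ArtificialFastqGenerator works; it is a toy model that assumes a uniform substitution error rate, a fixed fragment (insert) size and a constant quality string, and all function names and parameters are invented for this example. It draws fragments from a reference string and emits paired-end reads of any length you choose (for single-end reads, just use the first read of each pair).

```python
import random

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def add_errors(seq, error_rate, rng):
    """Substitute each base with a different one with probability error_rate."""
    out = []
    for base in seq:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate_pairs(reference, n_pairs, read_len=150, insert_size=500,
                   error_rate=0.01, seed=0):
    """Yield (read1, read2) FASTQ records sampled from reference.

    read1 is the start of the fragment on the forward strand; read2 is the
    reverse complement of the fragment's end, as in typical paired-end data.
    """
    if read_len > insert_size or insert_size > len(reference):
        raise ValueError("need read_len <= insert_size <= len(reference)")
    rng = random.Random(seed)
    qual = "I" * read_len  # constant placeholder quality (Phred 40)
    for i in range(n_pairs):
        start = rng.randint(0, len(reference) - insert_size)
        fragment = reference[start:start + insert_size]
        r1 = add_errors(fragment[:read_len], error_rate, rng)
        r2 = add_errors(reverse_complement(fragment[-read_len:]), error_rate, rng)
        yield (f"@sim_{i}/1\n{r1}\n+\n{qual}\n",
               f"@sim_{i}/2\n{r2}\n+\n{qual}\n")


if __name__ == "__main__":
    # Toy random "reference"; in practice you would load a real FASTA sequence.
    rng = random.Random(42)
    ref = "".join(rng.choice("ACGT") for _ in range(5000))
    for rec1, rec2 in simulate_pairs(ref, n_pairs=2, read_len=150):
        print(rec1, rec2, sep="")
```

For real benchmarking, a dedicated simulator is preferable because it models realistic quality profiles and error patterns, which this sketch does not.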