Tips for reading genotype files into Python
3.6 years ago
mdmurphy18 • 0

I am very new to Python. I am trying to load numerically coded genotype files into Python to run some simulations. My genotype file has 279 inbred lines and 49335 SNPs, but when I read it I only get 16000 SNPs, not all of them. Can anybody give me some tips on how to read the full genotype file into Python? Anything helps!

Here is what my genotype file looks like as a text file:

"ss96500460" "ss196420262" "ss196420260" "ss196501209" "ss196501212" "ss196419561" "ss196503512" "ss196503516" "ss196500466" "ss196522002" "ss196522004" "ss196417490" "ss196418517" "ss196521998" "ss196500468" "ss196500470" "ss196500472" "ss196500474" "ss196500476" "ss196500478" "ss196500480" "ss196500484" "ss196500482" "ss196500486" "ss196500488" "ss196500492" "ss196500494" "ss196500496" "ss196500498" "ss196500500" "ss196500504" "ss196500506" "ss196500508" "ss196500510" "ss196500512" "ss196500514" "ss196500516"
"3316" 2 2 2 2 2 2 2 2 2 0 0 2 2 2 2 2 2 0 2 2 2 2 2 2 0 2 2 2 0 1 2 2 0 2 2 2 0 2 2 2 2 0 2 2 2 2 2 0 2 2 1 2 2 0 2 2 1 2 1 2 2 2 2 2 0 2 1 0 2 2 1 2 0 0 2 1 1 2 2 2 2 2 2 2 0 0 2 2 0 2 0 2 2 0 0 0 0 0 2 0 2 2 0 0 2 2 1 0 0 2 2 1 0 0 1 0 0 1 2 2 1 2 0 2 2 0 0 0 0 2 2 0 2 2 2 1 1 2 1 1 0 0 2 2 0 0 0 0 1 0 2 2 0 2 2 1 2 2 2 2 0 0 0 0 0 0 2 0 2 1 0 0 0 0 0 2 2 2 0 0 2 0 2 2 0 1 2 2 2 2 2 0 2 0 0 2 2 2 2 0 0 2 0 0 2 2 1 0 2 0 2 1 2 0 2 0 2 0 2 2 0 2 2 0 2 1 2 2 2 0 2 2 0 2 2 2 0 2 0 2 1 0 0 1 0 0 0 2 2 2 2 2 2 0 2 1 1 2 0 1 1 2 2 0 2 0 0 0 0 2 0 0 2 1 0 0 2 0 0 0 2 2 2 2 2 0 2 2 2 2 2 0 2 2 0 2 0 2 2 2 0 0 0 0 2 2 2 1 1 0 0 0 0 0 0 0 2 2 0 0 0 0 2 0 0 2 2 0 2 2 0 2 0 2 2 2 2 0 0 2 2 2 0 2 0 2 2 2 2 2 1 0 2 0 2 2 2 2 2 1 0 2 2 0 0 2 2 0 2 2 2 0 2 2 0 0 0 0 2 2 2 2 2 0 0 0 0 2 0 2 2 2 0 0 2 2 0 2 2 2 1 2 2 0 0 2 2 1 2 2 2 2 0 2 0 2 2 1 1 1 1 2 2 0 0 2 0 2 0 2 0 2 1 2 2 0 0 0 1 0 0 0 1 0 0 1 2 2 2 2 2 0 1 2 0 2 0 2 0 2 0 2 2 2 0 1 1 2 2 1 2 2 0 0 2 2 0 0 2 2 2 1 2 2 2 2 0 1 2 2 0 2 0 0 0 1 0 2 0 0 2 1 0 0 2 0 2 2 2

Here is the function that I have been trying to use:

import numpy as np
genotype = np.loadtxt("Zm55K_geno_4_py.txt")

I ultimately want this as an array, preferably with SNPs as the columns and inbred lines as the rows (observations).


Please post the input details, the desired output, and the code you've used.


Wooops! That may actually be more helpful. Updated!


Please do not cross-post to multiple forums at the same time. This is considered bad practice.


What's the input format? TSV? Are you expecting an output with SNPs as the header and observations (?) as rows? If so, it may be better to load the file into a DataFrame with pandas' read_csv function.
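A minimal sketch of the read_csv suggestion, assuming the file is laid out like the sample in the question: a header row of quoted SNP names separated by whitespace, then one row per inbred line beginning with a quoted line ID. The helper name load_genotypes and the tiny in-memory sample are made up for illustration; the real file would be passed by its path instead.

```python
import io

import pandas as pd


def load_genotypes(source):
    """Read a whitespace-delimited genotype table into a DataFrame.

    Hypothetical helper. Assumes a header row of quoted SNP names and
    data rows that each start with a quoted inbred-line ID.
    """
    # sep=r"\s+" splits on any run of spaces or tabs, and the default
    # quotechar ('"') strips the double quotes around the names.
    # The header has one field fewer than the data rows, so pandas
    # uses the first column (the line IDs) as the row index.
    return pd.read_csv(source, sep=r"\s+")


# Made-up sample in the same layout: 3 SNPs, 2 inbred lines.
sample = io.StringIO(
    '"snpA" "snpB" "snpC"\n'
    '"3316" 2 0 1\n'
    '"3317" 0 2 2\n'
)

geno_df = load_genotypes(sample)

# Plain NumPy array: rows are inbred lines, columns are SNPs.
genotype = geno_df.to_numpy()

print(geno_df.shape)
```

For the real data this would be called as load_genotypes("Zm55K_geno_4_py.txt"), and geno_df.shape should then be compared with the expected (279, 49335). If the column count is still short, the file itself may have been truncated or line-wrapped when it was exported, which is worth checking before changing the reader.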

ADD REPLY
