FPKM values derived from mapping of fastq files will depend on (1) the genome (fasta), (2) the gene descriptions (gtf), and (3) the aligner. For your comparison, you've changed at least two of these things, so it is expected that some values will change. Different genome builds differ in sequence, and that affects the mapping process. Different sets of gene annotations also differ, and that affects how reads are counted for genes. People will argue about the best values for comparison, but either way, FPKM values are best compared within a highly constrained universe and are not absolute measurements of abundance.
If you were to plot all values from your comparisons against each other, you would find that most of them align along a diagonal and are virtually identical, but you would also see scatter and some outliers. You can look into the details of why some individual genes differ... but what is it that you actually want to achieve? You're better off using a consistent set of resources to answer a defined set of questions.
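One way to find the genes that fall off the diagonal is to compare the two result sets on a log scale and list the largest differences. The following is a minimal Python sketch, not a recommended pipeline. It assumes each result set has already been loaded into a dict of gene ID to FPKM; the function names, the pseudocount, and the threshold are arbitrary choices for illustration, and the values are made up. It strips the version suffix from GENCODE-style gene IDs (the part after the dot) so that IDs from different annotation releases can be matched.

```python
import math


def strip_version(gene_id):
    """Drop a version suffix, e.g. 'ENSG00000141510.11' -> 'ENSG00000141510'."""
    return gene_id.split(".", 1)[0]


def log2_differences(fpkm_a, fpkm_b, pseudocount=1.0):
    """Return {gene: log2((a + pc) / (b + pc))} for genes present in both runs.

    The pseudocount (an arbitrary choice here) avoids division by zero
    and damps differences between very low values.
    """
    a = {strip_version(g): v for g, v in fpkm_a.items()}
    b = {strip_version(g): v for g, v in fpkm_b.items()}
    shared = a.keys() & b.keys()
    return {g: math.log2((a[g] + pseudocount) / (b[g] + pseudocount)) for g in shared}


def outliers(diffs, min_abs_log2=1.0):
    """Genes whose log2 difference is at least min_abs_log2, largest first."""
    hits = [(g, d) for g, d in diffs.items() if abs(d) >= min_abs_log2]
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)


# Made-up gene IDs and values, purely for illustration.
run_grch37 = {"ENSG00000000001.3": 10.0, "ENSG00000000002.1": 3.0, "ENSG00000000003.2": 0.0}
run_grch38 = {"ENSG00000000001.7": 11.0, "ENSG00000000002.4": 15.0, "ENSG00000000004.1": 5.0}

diffs = log2_differences(run_grch37, run_grch38)
for gene, d in outliers(diffs):
    print(gene, round(d, 2))  # prints: ENSG00000000002 -2.0
```

Genes present in only one result set are silently dropped here; in a real comparison those are worth listing separately, since a gene ID that appears or disappears between annotation releases is itself a source of differences.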
So: (1) Yes, it is normal to get different results when you change the resources used to answer your questions.
(2) No, I would not recommend comparing FPKM values this way, as they were generated in different contexts.
(3) Different versions of genomes have different regions available for mapping, and depending on your alignment parameters this will affect how reads are assigned to genes. In addition, different genome versions require different sets of gene descriptions. A given gene ID may have a different set of descriptions (transcripts) between genome versions. This is a common problem.
(4) You need to check the GTF, as well as the genome (fasta), if you want to find out why a given gene changes values between genome versions.
(5) You don't say explicitly that the same aligner, and indeed the same version of the aligner, was used between comparisons. Aligners, and the parameters they are called with, can make a big difference in FPKM values for some genes.
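For point (4), checking the GTF for a given gene can be as simple as collecting the transcript IDs attached to that gene in each annotation and comparing the two sets. The sketch below is one way to do that in Python, under some assumptions: the input is standard 9-column, tab-separated GTF with `gene_id` and `transcript_id` attributes on `exon` lines, the function names are made up for illustration, and the toy records are invented rather than taken from any real GENCODE release. Version suffixes on IDs are stripped so that releases can be compared.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def transcripts_by_gene(gtf_lines):
    """Map each gene ID to the set of transcript IDs seen on its exon lines."""
    genes = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene = attrs["gene_id"].split(".", 1)[0]
        transcript = attrs["transcript_id"].split(".", 1)[0]
        genes[gene].add(transcript)
    return genes


def compare_gene(gene_id, old, new):
    """Report which transcripts of a gene are shared or unique to each annotation."""
    old_tx = old.get(gene_id, set())
    new_tx = new.get(gene_id, set())
    return {
        "shared": old_tx & new_tx,
        "only_old": old_tx - new_tx,
        "only_new": new_tx - old_tx,
    }


def exon_line(start, end, gene, transcript):
    """Build one invented GTF exon record for the toy example."""
    attrs = f'gene_id "{gene}"; transcript_id "{transcript}";'
    return "\t".join(["chr1", "HAVANA", "exon", str(start), str(end), ".", "+", ".", attrs])


# Invented records standing in for an older and a newer annotation.
old_gtf = [exon_line(100, 200, "ENSG00000000002.1", "ENST00000000010.1")]
new_gtf = [
    exon_line(100, 200, "ENSG00000000002.4", "ENST00000000010.3"),
    exon_line(150, 400, "ENSG00000000002.4", "ENST00000000011.1"),
]

old = transcripts_by_gene(old_gtf)
new = transcripts_by_gene(new_gtf)
print(compare_gene("ENSG00000000002", old, new))
```

With real files, an open file handle can be passed in place of the lists. A gene that gains transcripts, as in the toy example, has a different effective length and a different set of regions where reads count toward it, which is enough to shift its FPKM even if the reads map identically.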
If you really want to get to the bottom of the differences, you should reproduce each result set in your own hands, so you know all the parameters, and can examine the differences in an isolated fashion - just like any good experiment.
Thanks a lot seidel, your reply is very informative.
The original study (against GRCh37) used BWA, whereas TOPMed (against GRCh38) used the STAR aligner.
TOPMed used the GENCODE v30 GTF, whereas the original study used GENCODE v13 annotations.
The .fasta used for TOPMed is the GRCh38 reference from the Broad Institute, whereas the original study used GRCh37.
Once again, thanks a lot for taking the time to respond to my query.