Haplotype Structure

  • Yu Zhang
  • Tianhua Niu


This chapter consists of three parts. In the first part, we define important terms and concepts used in studies of population haplotype structure. In the second part, we introduce valuable publicly available genotype/haplotype databases, such as those generated by the International HapMap Project, and provide concise guides on how to download genotype data from the HapMap web site, how to use the Haploview program, and how to perform haplotype simulation. In the third part, we provide guides to several widely used haplotype inference methods, including Clark’s algorithm, PHASE, HAPLOTYPER, and CHB. Finally, we present two popular software packages, LDhat and HOTSPOTTER, for estimating recombination rates.


Markov Chain Monte Carlo; Recombination Rate; Genotype Data; Haplotype Structure; Most Recent Common Ancestor
These keywords were added by machine and not by the authors.





We thank Professor Jun Liu (Department of Statistics, Harvard University) for his encouragement and valuable discussions. This work was supported in part by National Institutes of Health grant R01 HG002518.



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  1. Department of Statistics, the Pennsylvania State University, University Park, USA
  2. Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, Charlottesville, USA
