Allergen Bioinformatics

  • Bernett T.K. Lee
  • Vladimir Brusic


Allergies are a growing health problem in both developed and developing countries, and they drive up healthcare expenditures. The problem is compounded by the increasing number of allergens found in genetically modified (GM) food and by allergens found in unexpected sources (hidden allergens). The importance of allergies has prompted the use of newer methods such as genomics, proteomics, and microarrays to understand the nature of allergies. These methods have generated large amounts of data that must be stored, retrieved, and analyzed using bioinformatics approaches. Several specialized public databases have been created in response to the growth in allergen data. These databases integrate information scattered across general-purpose databases into a coherent data set and provide bioinformatics tools suitable for further analysis. The resources they provide have paved the way for specialized bioinformatics tools that predict allergenicity. Such prediction tools are crucial given the new sources of allergens, namely hidden allergens and potential allergens in the form of recombinant proteins in GM food. Here we review the bioinformatics resources and tools available for the study of allergenicity.







Copyright information

© Springer Science+Business Media, LLC. 2008

Authors and Affiliations

  • Bernett T.K. Lee (1)
  • Vladimir Brusic (2, 3)

  1. Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, 8 Medical Drive, Singapore 117597
  2. Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613
  3. Cancer Vaccine Center, Dana-Farber Cancer Institute, Boston, USA
