
A Semantic Enrichment of Data Tables Applied to Food Risk Assessment

  • Hélène Gagliardi
  • Ollivier Haemmerlé
  • Nathalie Pernelle
  • Fatiha Saïs
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3735)

Abstract

Our work deals with the automatic construction of domain-specific data warehouses. Our application domain concerns microbiological risks in food products. The MIEL++ system [2], implemented during the Sym’Previus project, is a tool based on a database containing experimental and industrial results about the behavior of pathogenic germs in food products. This database is incomplete by nature, since the number of possible experiments is potentially infinite. Our work, developed within the e.dot project, presents a way of compensating for that incompleteness by complementing the database with data automatically extracted from the Web. We propose to query these data through a mediated architecture based on a domain ontology, so we need to make them compatible with that ontology. In the e.dot project [5], we focus exclusively on documents in HTML or PDF format that contain data tables. Data tables are a very common presentation scheme for describing synthetic data in scientific articles. These tables are semantically enriched, and we want this enrichment to be as automatic and flexible as possible. To that end, we have defined a Document Type Definition named SML (Semantic Markup Language), which can deal with additional or incomplete information in a semantic relation, as well as ambiguities or possible interpretation errors. In this paper, we present this semantic enrichment step.
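
The enrichment idea can be illustrated with a rough sketch. It is not the SML DTD or the authors' algorithm: the ontology terms, synonym lists, similarity threshold, and XML element names (`table`, `row`, `cell`, `candidate`, `value`) are all assumptions made for illustration. The sketch matches column headers against ontology terms by simple string similarity, keeps every plausible candidate instead of forcing a single interpretation, and preserves unrecognized columns as additional information.

```python
import difflib
import xml.etree.ElementTree as ET

# Hypothetical mini-ontology (not from the paper): term -> known synonyms.
ONTOLOGY = {
    "Food": ["food", "product", "food product"],
    "Microorganism": ["microorganism", "germ", "pathogen"],
    "Temperature": ["temperature", "temp"],
}


def candidate_terms(header, threshold=0.6):
    """Return (term, score) pairs whose best synonym similarity reaches the threshold."""
    scored = []
    for term, synonyms in ONTOLOGY.items():
        score = max(
            difflib.SequenceMatcher(None, header.lower(), synonym).ratio()
            for synonym in synonyms
        )
        if score >= threshold:
            scored.append((term, round(score, 2)))
    # Best match first; several candidates may survive (ambiguity is kept).
    return sorted(scored, key=lambda pair: -pair[1])


def enrich(headers, rows):
    """Build an XML tree in which each cell carries its candidate ontology terms."""
    table = ET.Element("table")
    for row in rows:
        row_el = ET.SubElement(table, "row")
        for header, value in zip(headers, row):
            # The original header is always kept, so no source information is lost.
            cell = ET.SubElement(row_el, "cell", original=header)
            candidates = candidate_terms(header)
            if not candidates:
                # Columns with no ontology match are flagged rather than dropped.
                cell.set("status", "unrecognized")
            for term, score in candidates:
                ET.SubElement(cell, "candidate", term=term, score=str(score))
            ET.SubElement(cell, "value").text = value
    return table


if __name__ == "__main__":
    tree = enrich(["Product", "Temp.", "Reference"], [["Salmon", "4", "Smith 1999"]])
    print(ET.tostring(tree, encoding="unicode"))
```

Keeping all candidates with their scores, rather than only the top one, lets a later querying step decide how much uncertainty to tolerate.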

References

  1. Arasu, A., Garcia-Molina, H.: Extracting structured data from web pages. In: Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pp. 337–348. ACM Press, New York (2003)
  2. Buche, P., Dibie-Barthélemy, J., Haemmerlé, O., Houhou, M.: Towards flexible querying of xml imprecise data in a dataware house opened on the web. In: Christiansen, H., Hacid, M.-S., Andreasen, T., Larsen, H.L. (eds.) FQAS 2004. LNCS (LNAI), vol. 3055, pp. 28–40. Springer, Heidelberg (2004)
  3. Cimiano, P., Handschuh, S., Staab, S.: Towards the self-annotating web. In: WWW 2004: Proceedings of the 13th international conference on World Wide Web, pp. 462–471. ACM Press, New York (2004)
  4. Doan, A., Lu, Y., Lee, Y., Han, J.: Profile-based object matching for information integration. Intelligent Systems, IEEE 18(5), 54–59 (2003)
  5. e.dot, Progress report of the e.dot project (2004), http://www-rocq.inria.fr/gemo/edot
  6. Rahm, E., Bernstein, P.A.: A survey of approaches to automatic schema matching. The VLDB Journal 10(4), 334–350 (2001)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Hélène Gagliardi (1)
  • Ollivier Haemmerlé (1)
  • Nathalie Pernelle (1)
  • Fatiha Saïs (1)
  1. LRI (UMR CNRS 8623 – Université Paris-Sud) / INRIA (Futurs), Orsay, France
