HETEROGENEOUS BIOLOGICAL DATABASES INTEGRATION

ABSTRACT

This research report concerns the integration of various data sources to solve biological problems. Our philosophy is that different types of data sources give us more information than a single one. By combining data sources in an intelligent way, we can obtain a more complete picture of the problem. Biological data sources are known for their heterogeneity, which covers their data formats, physical locations and query capabilities. These sources of data can be integrated so that researchers can easily access and query them and retrieve results. This article looks at the dissimilarities among the main sources of biological data and at some major approaches to integrating multiple biological databases. The mediator approach and ontology are also discussed in this work.

The integration of databases builds on many existing collections and databases of resources by placing them on an integrated data management platform. Data integration in molecular biology is important because many users cannot find all the relevant data in one data source. Another advantage of data integration is the ability to obtain information of a new quality, together with semantic relationships between the integrated data from various sources. In the current environment it is difficult to find and analyze data, because a user has to visit every source separately; integration approaches can instead provide a single location that gives effective access to data from many sources. Here the researcher illustrates the concept of ontology for heterogeneous biological database integration.

The main components of an ontology are concepts, relations, attributes, instances and axioms. To integrate the accumulated biological databases and extract meaningful information from them, we have to define these components; they can be interpreted as metadata (domain-specific, task-specific or generic). The metadata provides a way to reduce both the cost of integrating new data sources and the cost of accessing integrated data sources. As a result, integrated data are easier to understand and allow complex queries that span a number of site boundaries.
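The five components listed above can be pictured as a small data structure. The following is only a minimal sketch in Python, not a method taken from this report: the class names, the `declared_attributes_only` rule and the example values (including the accession `X00001`) are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Concept:
    name: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class Relation:
    name: str
    source: str  # name of the concept the relation starts from
    target: str  # name of the concept the relation points to


@dataclass
class Instance:
    concept: str             # name of the concept this instance belongs to
    values: Dict[str, str]   # attribute name -> value


@dataclass
class Ontology:
    concepts: Dict[str, Concept] = field(default_factory=dict)
    relations: List[Relation] = field(default_factory=list)
    instances: List[Instance] = field(default_factory=list)
    # An axiom is modelled here as a named rule every instance must satisfy.
    axioms: Dict[str, Callable[["Ontology", Instance], bool]] = field(default_factory=dict)

    def violations(self) -> List[str]:
        """Return the names of axioms broken by at least one instance."""
        return [name for name, rule in self.axioms.items()
                if not all(rule(self, inst) for inst in self.instances)]


def declared_attributes_only(onto: Ontology, inst: Instance) -> bool:
    """Hypothetical axiom: an instance may only use attributes its concept declares."""
    concept = onto.concepts.get(inst.concept)
    return concept is not None and set(inst.values) <= set(concept.attributes)


def build_example() -> Ontology:
    """Build a toy ontology; all names and values are invented for illustration."""
    onto = Ontology()
    onto.concepts["Gene"] = Concept("Gene", ["symbol"])
    onto.concepts["Protein"] = Concept("Protein", ["accession", "sequence"])
    onto.relations.append(Relation("encodes", "Gene", "Protein"))
    onto.instances.append(Instance("Protein", {"accession": "X00001", "sequence": "MKT"}))
    onto.axioms["declared_attributes_only"] = declared_attributes_only
    return onto
```

Read this way, the concepts, relations and attributes act as the metadata that a new data source is mapped onto, and the axioms give a simple check that mapped instances stay consistent with that metadata.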

This paper is structured as follows. The Introduction reviews earlier research on the biological database integration approaches used by data warehouse managers and database consultants. The Discussion covers the general debate about architecture and its various attributes, with some examples. The research then gives suggestions for different analyses and for future work. The Conclusion summarizes the work and what was learned.

INTRODUCTION

Biological databases such as Swiss-Prot, GenBank, EMBL, DDBJ or ENZYME (P. Lambrix and V. Jakoniene, 2000) store biological data in different formats, are located in different places and have different user interfaces. Integration is necessary to bring these databases together so that they look like a single database. Several approaches are used to integrate multiple databases, such as SRS (L. Wong, 2002; T. Hernandez and S. Kambhampati, 2004), TAMBIS (N. W. Paton, R. Stevens, P. Baker, C. A. Goble, S. Bechhofer and A. Brass, 1999), BACIIS (Z. B. Miled, N. Li, G. L. Kellett, B. Sipes and O. Bukhres, 2002), Kleisli (S. B. Davidson, J. Crabtree, B. Brunk, J. Schug, V. Tannen, C. Overton and C. Stoeckert, 2001) and DiscoveryLink (L. M. Haas, A. M. Schwarz, P. Kodali, E. Kotlar, J. E. Rice and W. C. Swope, 2001). Despite the efforts behind these approaches, there is still no single approach that fits all needs in bioinformatics (L. Wong, 2002).


Here the researcher describes the navigational approach, the data warehousing approach and the mediator approach. This document provides further details on biological data sources and on the integration approaches now widely used by database administrators and consultants.
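To make the mediator idea concrete, the sketch below shows one common way it is described: each source gets a wrapper that translates its own format into a shared record layout, and a mediator forwards one query to every wrapper. This is an illustrative assumption, not the design of any system cited above. The two toy sources, their field names (`AC`, `DE`, `acc_id`, `title`) and the accessions are invented and do not reproduce the real formats of any named database.

```python
from typing import Dict, Iterable, List

# Toy flat-file source: "KEY value" lines, records separated by "//" (made-up data).
FLAT = """AC X00001
DE hypothetical kinase
//
AC X00002
DE hypothetical transporter
//"""

# Toy tabular source that uses different column names (made-up data).
ROWS = [{"acc_id": "X00001", "title": "kinase mRNA (made-up row)"}]


class FlatFileWrapper:
    """Translates the toy flat-file format into the shared record layout."""

    def __init__(self, text: str, source: str):
        self.source = source
        self.records: List[Dict[str, str]] = []
        for block in text.split("//"):
            fields: Dict[str, str] = {}
            for line in block.strip().splitlines():
                key, _, value = line.strip().partition(" ")
                if key:
                    fields[key] = value.strip()
            if fields:
                self.records.append(fields)

    def find(self, accession: str) -> List[Dict[str, str]]:
        return [{"accession": r["AC"], "description": r.get("DE", ""), "source": self.source}
                for r in self.records if r.get("AC") == accession]


class TableWrapper:
    """Translates the toy tabular rows into the shared record layout."""

    def __init__(self, rows: Iterable[Dict[str, str]], source: str):
        self.source = source
        self.rows = list(rows)

    def find(self, accession: str) -> List[Dict[str, str]]:
        return [{"accession": row["acc_id"], "description": row["title"], "source": self.source}
                for row in self.rows if row["acc_id"] == accession]


class Mediator:
    """Single access point: sends one query to every wrapper and merges the answers."""

    def __init__(self, wrappers):
        self.wrappers = list(wrappers)

    def find(self, accession: str) -> List[Dict[str, str]]:
        results: List[Dict[str, str]] = []
        for wrapper in self.wrappers:
            results.extend(wrapper.find(accession))
        return results


def build_mediator() -> Mediator:
    return Mediator([FlatFileWrapper(FLAT, "source_a"), TableWrapper(ROWS, "source_b")])
```

Calling `build_mediator().find("X00001")` returns one record per source in the same layout, so the user issues a single query instead of visiting each source. In this sketch the data stay in the sources and are translated at query time, in contrast to a warehouse that copies them into one store.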

BIOLOGICAL DATA SOURCES

Table 1: Heterogeneity of biological data sources.

Table 1 shows the heterogeneity of the data sources available to biologists for their research. The differences among the data sources appear in their data formats and user interfaces. These data sources store information about nucleotide sequences, protein sequences, 3D structures of macromolecules and protein families. This information usually can ...
