Data Management, Ontology and Ecoinformatics
In a post about Data Basin, I mentioned the agency partnerships that my colleagues at State of the Salmon are coordinating along the North American West Coast.
Broadly stated, the goals of these partnerships are to deepen fisheries agencies' capacity for data management and to build a foundation for closer cross-border collaboration on data interoperability.
As part of my work on this project, I have been catching up on some background reading on ontologies and ecoinformatics. An ontology, in this context, is a formal representation of concepts and relationships in a field of study. Ecoinformatics is a new field that aims to facilitate ecological research and ecosystem management by developing ways to manage, access and integrate ecological data.
The paper "Advancing ecological research with ontologies," by a group of authors at the National Center for Ecological Analysis and Synthesis, offers an excellent overview:
The use of ontologies has proliferated in recent years in the molecular biology and biomedical communities, providing benefits to those disciplines by facilitating closer collaboration and better synthetic analyses owing to precise and unified descriptions of their fields’ data contents. ... Ecology stands to benefit in similar ways by developing ontologies to control and clarify terms, and thereby enhance data-sharing capabilities. ...
Ontologies hold great promise as a unifying mechanism for representing knowledge because they are interpretable by both humans and computer applications, and subsequently facilitate the use of automated reasoning for helping with arduous data management tasks that scientists deal with on a daily basis. ...
Obtaining benefits from ontologies, however, requires that ecological concepts be formally represented. For example, the ecological concept ‘community’ has multiple interpretations and by formally defining these different usages, scientists and computer applications can begin to resolve these differences. The process of ontology construction can be challenging however, and should involve collaboration between ecologists and computer scientists to build topical and logically consistent ontologies. ...
Despite the potential for ontologies to enhance ecological information management, there are few examples of such systems in use. There are several reasons why this is probably the case.
First, ecological data are still typically collected by small groups of individuals for use within their respective projects. Traditionally, data owners are the only intended users, and information about their data regarding its structure, content and appropriate usage is often not recorded. This situation is no longer tenable, because as ecological research becomes holistic and integrative, better approaches are needed for locating and interpreting all relevant data that can inform a topic. Second, current data practices in ecology are not particularly amenable to data sharing and re-use. The prevalent ‘spreadsheet’ model and even sophisticated database frameworks typically lack the requisite information to facilitate effective long-term preservation and interpretation of data. However, although software products are available and used by scientists to manage data using these approaches, software is still under development for similar ontology-based approaches. Thus, the adoption of ontologies is hindered both by the familiarity of current practices and the lack of tools to readily migrate to improved practices. Third, developing comprehensive and consistent ontologies is challenging, especially within ecology, which is a complex and multidisciplinary field with concepts spanning many spatiotemporal scales, and multiple levels of organization and processes.
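To make the 'community' example from the excerpt a little more concrete, here is a minimal Python sketch of my own; it is not taken from the paper. It assumes a toy ontology stored as subject-predicate-object triples. The `eco:` identifiers, the two file names, and the helper functions `ancestors` and `datasets_about` are all hypothetical, invented for illustration. Real projects would more likely use a standard ontology language and tooling, but the idea is similar: two usages of "community" get distinct formal definitions, and a simple automated inference over "is_a" links lets a search for the general concept find datasets tagged with either one.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# A toy ontology as (subject, predicate, object) triples.
# All identifiers here are hypothetical and exist only for this sketch.
TRIPLES: Set[Triple] = {
    ("eco:TaxonomicCommunity", "is_a", "eco:Community"),
    ("eco:TrophicCommunity", "is_a", "eco:Community"),
    ("eco:Community", "is_a", "eco:EcologicalEntity"),
    ("eco:TaxonomicCommunity", "defined_by", "co-occurring species of one taxonomic group"),
    ("eco:TrophicCommunity", "defined_by", "species linked by feeding interactions"),
}

# Hypothetical datasets, each annotated with exactly one concept.
DATASETS: Dict[str, str] = {
    "stream_fish_survey.csv": "eco:TaxonomicCommunity",
    "estuary_food_web.csv": "eco:TrophicCommunity",
}


def ancestors(concept: str, triples: Set[Triple]) -> Set[str]:
    """Return every concept reachable from `concept` by following is_a links."""
    found: Set[str] = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def datasets_about(concept: str, datasets: Dict[str, str],
                   triples: Set[Triple]) -> List[str]:
    """List datasets tagged with `concept` or with any of its sub-concepts."""
    return sorted(
        name
        for name, tag in datasets.items()
        if tag == concept or concept in ancestors(tag, triples)
    )


if __name__ == "__main__":
    # The general concept matches both datasets; a specific one matches only its own.
    print(datasets_about("eco:Community", DATASETS, TRIPLES))
    print(datasets_about("eco:TrophicCommunity", DATASETS, TRIPLES))
```

The point of the sketch is only that once the two meanings carry separate identifiers, a program can tell a fish survey from a food-web study even though both authors would have written "community" in a spreadsheet header.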