OmegaWiki

 Is OmegaWiki interesting to you?


 * If you are interested in collaboratively annotating complex ontologies: yes.
 * If you want to build the world's largest dictionary and thesaurus: yes.
 * If you want to manage data like Wikipedia's infoboxes in a collaborative, language-independent repository: yes.
 * If you're sick of English-only and semantically poor tagging systems: yes.
 * If you want to help build an open source alternative to Freebase: yes.
 * If you want rock-solid technology that is absolutely mature and will not break: not yet.
 * If you want to annotate flat text with additional semantics: not yet.
 * If you want a simple way to add forms to your wiki: no.
 * If you want an enterprise "webtop" tool to manage spreadsheets or corporate data: no.

OmegaWiki is a community-maintained ontology database with forms-based editing and a relational database backend. We call the underlying technology "Wikidata". While it hooks into MediaWiki, the Wikidata functions and tables are relatively well-separated. Wikidata supports organizing ontologies on multiple levels:


 * 1) Data-sets: A data-set is essentially an instance of OmegaWiki with its own permission settings and tables, but residing in the same database. Concepts in data-sets can be cross-linked as being identical.
 * 2) Collection: A collection is a concept container. Collections serve multiple purposes: to simply identify a concept as originating from a particular source, to group all the classes in a particular ontology, to store mappings with other data-sets, etc.
 * 3) Class: A class can define the permitted properties of a concept. Class membership is optional.
 * 4) DefinedMeaning: A concept with a definition and at least a single defining expression (word or short phrase). The purpose of the defining expression is to avoid semantic drift across multiple definitions in different languages.
 * 5) Expression: A single string representation, resolving to one or multiple DefinedMeanings.
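
The levels above can be sketched as plain data structures. This is only an illustrative model, not the actual Wikidata schema: every class name, field name, and the `resolve` helper are assumptions made for this example, and versioning and permissions are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Expression:
    """A single string representation in one language (hypothetical model)."""
    spelling: str
    language: str


@dataclass
class DefinedMeaning:
    """A concept with one defining expression and per-language definitions."""
    defining_expression: Expression
    definitions: Dict[str, str] = field(default_factory=dict)  # language -> text
    expressions: Set[Expression] = field(default_factory=set)  # synonyms, translations
    class_name: Optional[str] = None  # class membership is optional

    def __post_init__(self) -> None:
        # The defining expression is always one of the concept's expressions.
        self.expressions.add(self.defining_expression)


@dataclass
class Collection:
    """A concept container, e.g. all concepts taken from one source."""
    name: str
    members: List[DefinedMeaning] = field(default_factory=list)


@dataclass
class DataSet:
    """One instance with its own concepts and collections."""
    name: str
    meanings: List[DefinedMeaning] = field(default_factory=list)
    collections: Dict[str, Collection] = field(default_factory=dict)

    def resolve(self, spelling: str) -> List[DefinedMeaning]:
        """Return every DefinedMeaning that the given spelling resolves to."""
        return [m for m in self.meanings
                if any(e.spelling == spelling for e in m.expressions)]


# Illustrative data only: one spelling resolving to two concepts.
community = DataSet(name="community")
bank_money = DefinedMeaning(
    Expression("bank", "en"),
    definitions={"en": "An institution that handles money."},
)
bank_money.expressions.add(Expression("Bank", "de"))
bank_river = DefinedMeaning(
    Expression("bank", "en"),
    definitions={"en": "The land alongside a river."},
)
community.meanings += [bank_money, bank_river]
community.collections["demo"] = Collection("demo", [bank_money])
```

In this sketch `community.resolve("bank")` returns both concepts, while `community.resolve("Bank")` returns only the first, mirroring how one expression can resolve to one or multiple DefinedMeanings.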

The data is fully versioned and can be collaboratively edited. Great care is taken to ensure that multilingual support is present on all levels, as one of the primary goals is to describe all words in all languages.

The Wikidata technology has been used to import complex ontologies, including the freely licensed levels of UMLS and a large subset of SwissProt, two biomedical databases. The first database imported was GEMET, an environmental terminology database. OmegaWiki is, in principle, open to importing all useful technology, as long as the process can be properly resourced.

Ontologies can be imported into read-only data-sets. Copies can then be made into the community data-set for collaborative editing. This way, institutions can experiment with the wiki approach without committing to it. They can observe how the data gets changed by the community, choose to implement selected changes, or fully give up control.
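
The import-then-copy workflow can be pictured with a small sketch. It is not the real import tooling: `DataSet`, `import_read_only`, `copy_concept`, `changed_concepts`, and the record layout are all hypothetical names chosen for illustration.

```python
import copy


class ReadOnlyError(Exception):
    """Raised when something tries to modify a read-only data-set."""


class DataSet:
    """Hypothetical data-set: a named mapping of concept id -> record."""

    def __init__(self, name, read_only=False):
        self.name = name
        self.read_only = read_only
        self.concepts = {}

    def put(self, concept_id, record):
        if self.read_only:
            raise ReadOnlyError(f"data-set {self.name!r} is read-only")
        self.concepts[concept_id] = record


def import_read_only(name, records):
    """Load records into a new data-set, then freeze it."""
    data_set = DataSet(name)
    for concept_id, record in records.items():
        data_set.put(concept_id, copy.deepcopy(record))
    data_set.read_only = True
    return data_set


def copy_concept(source, target, concept_id):
    """Copy one concept into the target, remembering where it came from."""
    record = copy.deepcopy(source.concepts[concept_id])
    record["copied_from"] = (source.name, concept_id)
    target.put(concept_id, record)


def changed_concepts(source, community):
    """List concept ids whose community copy now differs from the source."""
    changed = []
    for concept_id, record in community.concepts.items():
        if record.get("copied_from") != (source.name, concept_id):
            continue  # not a copy of this source's concept
        content = {k: v for k, v in record.items() if k != "copied_from"}
        if content != source.concepts[concept_id]:
            changed.append(concept_id)
    return changed


# Illustrative records only, not real imported data.
source = import_read_only("imported-source", {
    "c1": {"definition": "original text"},
    "c2": {"definition": "another original text"},
})
community = DataSet("community")
copy_concept(source, community, "c1")
copy_concept(source, community, "c2")

# A community edit touches only the copy; the imported data stays intact.
community.put("c1", {**community.concepts["c1"],
                     "definition": "community-edited text"})
edited = changed_concepts(source, community)
```

Keeping the source frozen and recording `copied_from` on each copy is one way an institution could review community changes against its original data before deciding whether to adopt them.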

The three key partners behind the project are Open Progress, a Dutch non-profit; KnewCo, an American start-up company in the biomedical domain; and the University of Bamberg's language faculty. KnewCo also operates WikiProtein, a custom skin on the same database used by OmegaWiki, specialized for protein annotation.

Wikidata is fully open source and all data in the "community" data-set is available under the CC-BY license; anyone can set up their own instance. KnewCo also develops some proprietary technologies on top of Wikidata, including a concept recognition engine that feeds back into OmegaWiki/WikiProtein.

As of October 2007, the project is still very much in development. Key priorities for the months ahead include:
 * further improvements to the user interface, both aesthetically and functionally
 * code refactoring and documentation
 * better data copying and merging tools
 * web service API
 * support for inflections
 * cool showcase application, e.g. tagging of Wikimedia Commons images with DefinedMeanings
 * better and more visible integration of KnewCo's data-mining features
 * move to a larger data-center