Bio4j
Bio4j is a graph-based bioinformatics database that includes most of the data available in UniProt KB (Swiss-Prot + TrEMBL), Gene Ontology (GO), UniRef (50, 90, 100), RefSeq, NCBI Taxonomy, and the ExPASy Enzyme DB.
(Manuscript in preparation).
See this presentation on SlideShare for a general overview of the project.
Some numbers
The current version of Bio4j (0.7) includes:
- Relationships: 530,642,683
- Nodes: 76,071,411
- Relationship types: 139
- Node types: 38
technology
Bio4j uses Neo4j technology.
scalability
Because Bio4j is built on the Neo4j graph database, it is highly scalable. New data sources and features will be added over time and, more importantly, the Java API lets you easily incorporate your own data into Bio4j so you can get the most out of it.
performance
Thanks to its graph structure, Bio4j organizes data in a way that is semantically equivalent to what the data represents. As a result, queries that would be very hard, or even impossible, to perform with a standard relational DB take just a couple of seconds with Bio4j.
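To make the idea concrete, here is a minimal, library-free sketch in plain Python. It is not Bio4j's Java API and does not use Neo4j; the TinyGraph class, the node ids, and the relationship names ANNOTATED_WITH and IS_A are all hypothetical and chosen only for illustration. It shows the kind of query that is natural on a graph: finding every protein annotated with a term or with any term below it in a hierarchy, which in a relational schema would typically need recursive joins.

```python
from collections import defaultdict, deque


class TinyGraph:
    """Toy in-memory graph with typed, directed relationships (illustrative only)."""

    def __init__(self):
        # (target node, relationship type) -> set of source nodes
        self._incoming = defaultdict(set)

    def relate(self, source, rel_type, target):
        """Add a relationship source -[rel_type]-> target."""
        self._incoming[(target, rel_type)].add(source)

    def incoming(self, node, rel_type):
        """Return the nodes that point at `node` through `rel_type`."""
        return self._incoming.get((node, rel_type), set())


def proteins_under_term(graph, term):
    """Proteins annotated with `term` or with any term reachable below it via IS_A."""
    seen = {term}
    queue = deque([term])
    proteins = set()
    while queue:
        current = queue.popleft()
        # Collect proteins attached directly to the current term.
        proteins |= graph.incoming(current, "ANNOTATED_WITH")
        # Walk down the hierarchy: child terms point at their parent via IS_A.
        for child in graph.incoming(current, "IS_A"):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return proteins


# Hypothetical sample data, not taken from any real database.
g = TinyGraph()
g.relate("term:dna_binding", "IS_A", "term:binding")
g.relate("protein:A", "ANNOTATED_WITH", "term:dna_binding")
g.relate("protein:B", "ANNOTATED_WITH", "term:binding")
g.relate("protein:C", "ANNOTATED_WITH", "term:other")

print(sorted(proteins_under_term(g, "term:binding")))
```

The traversal follows relationships directly from node to node, so its cost depends on the part of the graph it visits rather than on the total size of the data set.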
licensing
Bio4j is an open-source platform released under the AGPLv3 license.