Data storage possibilities for LS

From GEANT2-JRA1 Wiki


Short comparison of MySQL/PostgreSQL, eXist and Berkeley DB XML

Compared systems:

  • MySQL/PostgreSQL (relational database)
  • eXist (native XML database)
  • Berkeley DB XML (native XML database)

Distribution:
  • MySQL/PostgreSQL: binary versions (Windows, Linux, Solaris, BSD and others)
  • eXist: Java sources and classes (OS independent; tested on Mandrake Linux and Windows 2000/XP)
  • Berkeley DB XML: C++ sources that need to be compiled (supports Windows, Linux, UNIX and other OSes)

API for languages:
  • MySQL/PostgreSQL: Java, C/C++, Perl, Python, PHP, ...
  • eXist: Java, others (via XML:DB, SOAP and so on)
  • Berkeley DB XML: C++, Java, TCL, Perl, Python or PHP

APIs:
  • MySQL/PostgreSQL: JDBC API (Java), ODBC, APIs for many other languages
  • eXist: XML:DB API Core Level 1; DOM (via XML:DB), giving direct access to the data; SAX (via XML:DB)
  • Berkeley DB XML: XML:DB, ...

Network protocols:
  • MySQL/PostgreSQL: dependent on vendor (e.g. MySQL protocol, Postgres protocol)
  • eXist: XML-RPC (used by XML:DB), HTTP/REST, SOAP (Axis), WebDAV (Xicon, partially)
  • Berkeley DB XML: none; Berkeley DB XML is not a client/server database management system, it is a C++ library linked into your application

Query language:
  • MySQL/PostgreSQL: SQL
  • eXist: XQuery 1.0 / XPath 2.0
  • Berkeley DB XML: XQuery 1.0 / XPath 2.0

Modification language:
  • MySQL/PostgreSQL: SQL
  • eXist: document-level and node-level updates (XUpdate and the eXist XQuery update extension)
  • Berkeley DB XML: own API, does not support XUpdate (?)

Related standards:
  • MySQL/PostgreSQL: SQL
  • eXist: XPath 2.0, XQuery 1.0, XSLT, XInclude (partially), XUpdate
  • Berkeley DB XML: XPath 2.0, XQuery 1.0

Deployment:
  • MySQL/PostgreSQL: standalone application
  • eXist: a web server (Jetty) and Cocoon are included in the distribution, but eXist can also run without them; requires Java >= 1.4
  • Berkeley DB XML: built on Berkeley DB

Data storage:
  • MySQL/PostgreSQL: tables in binary files
  • eXist: native XML data store based on B+-trees and paged files; document nodes are stored in a persistent DOM

Data records:
  • MySQL/PostgreSQL: must have the same structure (fixed columns)
  • eXist: record structure may vary (two XML nodes, i.e. records, may have different sub-nodes, i.e. parameters); a record is actually an XML document
  • Berkeley DB XML: the same as eXist

Input validation:
  • MySQL/PostgreSQL: yes
  • eXist: ?
  • Berkeley DB XML: DTD or XML Schema

Indexes:
  • MySQL/PostgreSQL: yes
  • eXist: yes

Authorization:
  • MySQL/PostgreSQL: privileges for users
  • eXist: Unix-like access permissions

Transaction support:
  • MySQL/PostgreSQL: yes (Postgres)
  • eXist: no (but planned)

Licence:
  • MySQL/PostgreSQL: GNU GPL or other
  • eXist: GNU LGPL

Documentation:
  • MySQL/PostgreSQL: very good
  • eXist: good

More details:
  • MySQL/PostgreSQL: MySQL vs PostgreSQL
  • eXist: Facts
  • Berkeley DB XML: Berkeley DB XML overview

WWW:
  • MySQL/PostgreSQL: www.mysql.com, www.postgresql.org/
  • eXist: exist.sourceforge.net
  • Berkeley DB XML: Homepage

Examples and testing:
  • MySQL/PostgreSQL: MySQL vs PostgreSQL code
  • eXist: How to use XML:DB API in Java, XML DB tests
  • Berkeley DB XML: -

Relational DB Management System (SQL)

Pros:

  • Fast
  • Reliable
  • Querying through the well-known SQL interface

Cons:

  • Need to maintain an external DB server
  • All records must have the same structure (fixed columns)
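
The "fixed columns" limitation can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL/PostgreSQL, so that it runs without an external server. The table name `service` and its columns are hypothetical and are not taken from any LS design.

```python
import sqlite3


def build_table():
    """Create an in-memory table with a fixed set of columns and two rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE service (name TEXT PRIMARY KEY, access_point TEXT, event_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO service VALUES (?, ?, ?)",
        [
            ("service-a", "http://example.org/a", "utilization"),
            ("service-b", "http://example.org/b", "delay"),
        ],
    )
    return conn


def find_by_event_type(conn, event_type):
    """Query through plain SQL; returns the matching service names."""
    rows = conn.execute(
        "SELECT name FROM service WHERE event_type = ? ORDER BY name",
        (event_type,),
    )
    return [name for (name,) in rows]


def try_insert_extra_field(conn):
    """Try to store a record with a field the schema lacks; return the error text."""
    try:
        conn.execute(
            "INSERT INTO service (name, access_point, event_type, contact) "
            "VALUES (?, ?, ?, ?)",
            ("service-c", "http://example.org/c", "loss", "admin@example.org"),
        )
    except sqlite3.OperationalError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    conn = build_table()
    print(find_by_event_type(conn, "delay"))
    print(try_insert_extra_field(conn))
```

The record with the extra `contact` field is rejected until the table definition itself is changed, which is the cost of the fixed-column model.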



There are also some Java relational databases:

  • HSQL DB - Pure Java SQL database. HSQLDB 1.8.0 is the database engine in OpenOffice.org 2.0 (as a recommendation)
  • One$DB - the open source version of Daffodil DB. It is a standards-based (J2EE-certified, JDBC 3.0 and SQL 99 compliant), platform-independent, small-footprint database that can be embedded into any application and requires zero or minimal administration. According to the vendor, Daffodil DB is the first Java database that has shown the capability to take on enterprise databases with its high performance in real-time environments.

Pros:

  • Pure Java
  • Querying through the well-known SQL interface

Cons:

  • All records must have the same structure (fixed columns)
  • May be slower than C/C++ databases (?)

Native XML Database

"As defined by the members of the XML:DB mailing list, a native XML database is one that:

  • Defines a (logical) model for an XML document -- as opposed to the data in that document -- and stores and retrieves documents according to that model. At a minimum, the model must include elements, attributes, PCDATA, and document order. Examples of such models are the XPath data model, the XML Infoset, and the models implied by the DOM and the events in SAX 1.0.
  • Has an XML document as its fundamental unit of (logical) storage, just as a relational database has a row in a table as its fundamental unit of (logical) storage.

  • Is not required to have any particular underlying physical storage model. For example, it can be built on a relational, hierarchical, or object-oriented database, or use a proprietary storage format such as indexed, compressed files."

More information: www.rpbourret.com/xml/ProdsNative.htm
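
To make the minimum document model from the definition more concrete (elements, attributes, PCDATA and document order), here is a small sketch using Python's standard xml.etree.ElementTree module. It is only an in-memory illustration of the model, not a native XML database, and the sample document and the function name `describe` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: two repeated elements, one attribute-bearing element.
DOC = """<record id="r1">
  <name>first</name>
  <name>second</name>
  <param unit="ms">42</param>
</record>"""


def describe(xml_text):
    """Walk the document in document order and list (tag, attributes, text)."""
    root = ET.fromstring(xml_text)
    items = []
    for elem in root.iter():  # iter() yields the root and descendants in document order
        text = (elem.text or "").strip()  # PCDATA directly inside the element
        items.append((elem.tag, dict(elem.attrib), text))
    return items


if __name__ == "__main__":
    for item in describe(DOC):
        print(item)
```

The two `name` elements come back in the order in which they appear in the document, which is the property a relational row set does not guarantee by itself.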


Some native XML databases:

  • eXist - eXist is an Open Source native XML database featuring efficient, index-based XQuery processing, automatic indexing, and extensions for full-text search.
  • Berkeley DB XML - Berkeley DB XML is the native XML database engine for your product. Berkeley DB XML provides XQuery access into a database of document containers. XML documents are stored and indexed in their native format using Berkeley DB as the transactional database engine. Berkeley DB XML is not a client/server database management system; it is a C++ library linked into your application. There is no client server network overhead. There is no need for a DBA. Berkeley DB XML is quite simply the fastest and most reliable native XML database engine available today. (info from homepage)

There are also some native XML databases built on top of MySQL/PostgreSQL (or another relational database). See the full list of native XML databases.


Pros:

  • Can store XML records
  • More flexible querying through XQuery
  • Record structure may vary (two XML nodes, i.e. records, may have different sub-nodes, i.e. parameters)

Cons:

  • Performance is unknown (how fast is it?)
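
A minimal sketch of records with varying structure, queried by path expressions. It uses the limited XPath subset of Python's standard xml.etree.ElementTree as a stand-in for XQuery, since neither eXist nor Berkeley DB XML is assumed to be installed. The element names and the helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

# Three records; only the second one carries an extra <contact> sub-node.
STORE = """<records>
  <record><name>a</name><type>delay</type></record>
  <record><name>b</name><type>delay</type><contact>admin@example.org</contact></record>
  <record><name>c</name><type>loss</type></record>
</records>"""


def names_where(root, child, value):
    """Names of records whose <child> element has exactly the given text."""
    matches = root.findall(f"./record[{child}='{value}']")
    return [rec.findtext("name") for rec in matches]


def names_having(root, child):
    """Names of records that contain a <child> element at all."""
    matches = root.findall(f"./record[{child}]")
    return [rec.findtext("name") for rec in matches]


if __name__ == "__main__":
    root = ET.fromstring(STORE)
    print(names_where(root, "type", "delay"))
    print(names_having(root, "contact"))
```

No schema change is needed to store the record with the extra sub-node, and the same query mechanism can select on it. A real XQuery engine offers far more than this XPath subset.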

Other possibilities

Object Database

  • db4o is the open source object database that enables Java and .NET developers to slash development time and costs and achieve unprecedented levels of performance. The unique design of db4o's native object database engine makes it the ideal choice to be embedded in mobile devices, in packaged software or in real-time control systems.

UDDI

Universal Description, Discovery and Integration.

Pros:

  • Part of Web Services technology
  • Ready-to-use application

Cons:

  • Probably hard to extend if it does not have all the features we need