Open Source Databases: Which One?


Now that MySQL is in danger and nobody knows what its future will look like, everyone is looking for a way out. On top of that, there has been a lot of movement in the database world: the NoSQL movement has emerged and a number of new databases have been created.

Working with Pentaho, I am starting to get requests for some less common databases; lately people want Hadoop instead of Oracle, and things like that. Below I list some free databases that may be of interest.

The descriptions are taken as they are, in English, since I don't have time to translate them. I have also added some presentations to complement them.

MongoDB

It is an open source, high-performance, scalable, schema-free & document-oriented (JSON-like data schemas) database.

There are ready-to-use drivers for most popular programming languages, such as PHP, Python, Perl, Ruby, JavaScript, C++, and more.
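To make "schema-free" and "document-oriented" a bit more concrete, here is a minimal pure-Python sketch. It does not use MongoDB or any of its drivers: plain dicts stand in for JSON-like documents, and `find` and the sample data are hypothetical names made up for the illustration.

```python
# Conceptual sketch only: plain dicts stand in for JSON-like documents.
# `find` is a hypothetical helper, not a MongoDB driver call.

collection = [
    {"_id": 1, "name": "Ana", "email": "ana@example.com"},
    {"_id": 2, "name": "Luis", "tags": ["admin", "dev"]},       # different fields, same collection
    {"_id": 3, "name": "Eva", "address": {"city": "Madrid"}},   # nested document
]

def find(docs, query):
    """Return the documents whose fields equal every key/value in `query`."""
    return [d for d in docs if all(d.get(k) == v for k, v in query.items())]

print(find(collection, {"name": "Luis"}))
```

The point is that no table definition is declared up front: each document carries its own fields.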

Hypertable

Hypertable is a high performance distributed data storage system designed to support applications requiring maximum performance, scalability, and reliability.

It is modeled after Google’s BigTable and mostly focuses on large-scale datasets.

Apache CouchDB

A document-oriented database that can be queried and indexed in a MapReduce fashion using JavaScript.

CouchDB offers a RESTful JSON API that can be accessed from any environment able to make HTTP requests.
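As a rough idea of what querying "in a MapReduce fashion" means, the sketch below runs a map step and a reduce step over a few JSON-like documents. CouchDB views are written in JavaScript and run inside the server; this is only a pure-Python imitation of the shape of the computation, and `map_doc`, `reduce_values`, and `run_view` are hypothetical names.

```python
from collections import defaultdict

docs = [
    {"_id": "a", "type": "post", "author": "ana"},
    {"_id": "b", "type": "post", "author": "luis"},
    {"_id": "c", "type": "post", "author": "ana"},
    {"_id": "d", "type": "comment", "author": "ana"},
]

def map_doc(doc):
    """Emit (key, value) pairs for one document, like a view's map step."""
    if doc["type"] == "post":
        yield doc["author"], 1

def reduce_values(values):
    """Combine all values emitted under one key."""
    return sum(values)

def run_view(documents):
    """Group the emitted pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_values(vals) for key, vals in sorted(grouped.items())}

posts_per_author = run_view(docs)
print(posts_per_author)  # {'ana': 2, 'luis': 1}
```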

Neo4j

It is an embedded, disk-based, and fully transactional Java persistence engine that stores data structured in graphs rather than tables.

Neo4j offers massive scalability. It can handle graphs of several billion nodes/relationships/properties on a single machine and can be scaled across multiple machines.
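To show what "data structured in graphs rather than tables" looks like, here is a small pure-Python sketch of nodes, typed relationships, and a traversal. Neo4j itself is a Java engine with its own API; nothing below is that API, and `reachable` and the sample data are hypothetical.

```python
from collections import deque

# node id -> properties
nodes = {
    1: {"name": "Ana"},
    2: {"name": "Luis"},
    3: {"name": "Eva"},
    4: {"name": "Juan"},
}
# (start node, relationship type, end node)
relationships = [
    (1, "KNOWS", 2),
    (2, "KNOWS", 3),
    (3, "KNOWS", 4),
    (1, "WORKS_WITH", 4),
]

def reachable(start, rel_type, max_depth):
    """Breadth-first traversal following `rel_type` up to `max_depth` hops."""
    seen, queue, found = {start}, deque([(start, 0)]), []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for src, rtype, dst in relationships:
            if src == node and rtype == rel_type and dst not in seen:
                seen.add(dst)
                found.append(nodes[dst]["name"])
                queue.append((dst, depth + 1))
    return found

print(reachable(1, "KNOWS", 2))  # ['Luis', 'Eva']
```

A question like "friends of friends" is a short traversal here, where a relational schema would need a self-join per hop.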

Riak

Riak is well suited to web applications, as it combines:

  • a decentralized key-value store
  • a flexible map/reduce engine
  • a friendly HTTP/JSON query interface.

Oracle Berkeley DB

It is an embeddable database engine that provides developers with fast, reliable, local persistence with zero administration.

Oracle Berkeley DB is a library that links directly into your application and lets you make simple function calls, rather than sending messages to a remote server, for better performance.

Apache Cassandra

Cassandra is a highly scalable second-generation distributed database that is used by giants like Facebook, Digg, Twitter, Cisco, and more.

It aims to provide a consistent, fault-tolerant & highly available environment for storing data.

Memcached

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.

It is intended for use in speeding up dynamic web applications by alleviating database load.
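The usual way to "alleviate database load" is to check the cache first and only query the database on a miss. The sketch below shows that pattern in pure Python, assuming a dict as a stand-in for a memcached server; `slow_query`, `get_user`, and the `calls` counter are hypothetical names, and no real memcached client is involved.

```python
import time

cache = {}            # stand-in for a memcached server: key -> (expires_at, value)
calls = {"db": 0}     # counts how often the "database" is hit

def slow_query(user_id):
    """Hypothetical expensive database call."""
    calls["db"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id, ttl=60):
    """Return the user from the cache, falling back to the database on a miss."""
    key = f"user:{user_id}"
    entry = cache.get(key)
    if entry and entry[0] > time.monotonic():
        return entry[1]                           # cache hit
    value = slow_query(user_id)                   # cache miss: go to the database
    cache[key] = (time.monotonic() + ttl, value)  # store with an expiry time
    return value

get_user(7)
get_user(7)
print(calls["db"])  # 1: the second call was served from the cache
```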

Firebird

Firebird is a relational database that can run on Linux, Windows & various UNIX platforms.

It offers high performance and powerful language support for stored procedures and triggers.

Redis

Redis is an advanced fast key-value database written in C which can be used like memcached, in front of a traditional database, or on its own.

It has support for many programming languages and is used by popular projects like GitHub or Engine Yard.

There is also a PHP client named Rediska for managing Redis databases.

HBase

HBase is a distributed, column-oriented store, also known as the Hadoop database.

The project aims to host very large tables like “billions of rows, millions of columns”.

It has a RESTful web service gateway that supports XML, Protobuf, and binary data encoding options.
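As a rough mental model of a column-oriented table with "millions of columns", think of each row as a sparse map from column names to values, with rows kept in key order so that ranges can be scanned. The sketch below is pure Python and only models that idea; `put`, `scan`, and the row keys are hypothetical and this is not the HBase client API.

```python
table = {}  # row key -> {"family:qualifier": value}

def put(row, column, value):
    """Store one cell; rows only hold the columns that were actually written."""
    table.setdefault(row, {})[column] = value

def scan(start_row, stop_row):
    """Yield rows with start_row <= key < stop_row in sorted key order."""
    for key in sorted(table):
        if start_row <= key < stop_row:
            yield key, table[key]

put("user#001", "info:name", "Ana")
put("user#001", "info:email", "ana@example.com")
put("user#002", "info:name", "Luis")       # no email cell stored at all
put("video#001", "meta:title", "Intro")

rows = list(scan("user#", "user$"))        # "$" sorts right after "#"
print([key for key, _ in rows])            # ['user#001', 'user#002']
```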

Keyspace

It is a consistently replicated, fault-tolerant key-value store that runs on Windows.

Keyspace offers high availability by masking server/network failures & appearing as a single, highly available service.

4store

4store is a database storage and query engine that holds RDF data.

It is written in ANSI C99, designed to run on UNIX-like systems & offers a high performance, scalable & stable platform.
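For anyone who has not worked with RDF: the data is a set of (subject, predicate, object) triples, and queries are patterns over them. Here is a minimal pure-Python sketch of that idea; it does not use 4store or SPARQL, and `match` and the `ex:` identifiers are hypothetical.

```python
triples = {
    ("ex:ana", "rdf:type", "foaf:Person"),
    ("ex:ana", "foaf:name", "Ana"),
    ("ex:ana", "foaf:knows", "ex:luis"),
    ("ex:luis", "rdf:type", "foaf:Person"),
    ("ex:luis", "foaf:name", "Luis"),
}

def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

people = [s for s, _, _ in match(p="rdf:type", o="foaf:Person")]
print(people)  # ['ex:ana', 'ex:luis']
```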

MariaDB

MariaDB is a backward compatible, drop-in replacement branch of the MySQL® Database Server.

It includes all major open source storage engines + the Maria storage engine.

Drizzle

It is a fork of MySQL that focuses on being a reliable database optimized for Cloud and Net applications.

HyperSQL

It is a SQL relational database engine written in Java.

HyperSQL offers a small and fast database engine that has in-memory and disk-based tables and supports embedded and server modes.

Also, it has tools such as a command line SQL tool & GUI query apps.
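To illustrate what an embedded, in-memory SQL engine feels like from the application side, here is a short sketch. Note the assumption: it uses SQLite through Python's standard `sqlite3` module as a stand-in, not HyperSQL (which would be reached from Java through JDBC), so it only shows the general idea of a database running inside the application process.

```python
import sqlite3

# ":memory:" keeps the whole database in RAM; a file path would make it disk-based.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db (name TEXT PRIMARY KEY, kind TEXT)")
conn.executemany(
    "INSERT INTO db VALUES (?, ?)",
    [("HyperSQL", "relational"), ("Redis", "key-value"), ("Neo4j", "graph")],
)

# No server process is involved: the query runs inside this program.
rows = conn.execute("SELECT name FROM db WHERE kind = ?", ("relational",)).fetchall()
print(rows)  # [('HyperSQL',)]
conn.close()
```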

MonetDB

MonetDB is a database system for high-performance applications in data mining, OLAP, GIS, XML Query, text & multimedia retrieval.

Persevere

It is an object storage engine and application server (running on Java/Rhino) that provides storage of dynamic JSON data for rapidly developing data-driven, JavaScript-based rich internet applications.

eXist-db

eXist-db is built using XML technology. It stores XML data according to the XML data model & features efficient, index-based XQuery processing.

Gladius DB

Gladius DB is a fast and efficient PHP flatfile database engine written in pure PHP; its SQL syntax is compatible with a subset of intermediate SQL92. You will not need any specific extension to have it work, and it is bundled with an adoDB lite driver.

Cloudstore

Web-scale applications require a scalable storage infrastructure to process vast amounts of data. CloudStore (formerly the Kosmos filesystem) is an open-source, high-performance distributed filesystem designed to meet such an infrastructure need:

  • CloudStore is implemented in C++ using standard system components such as STL, boost libraries, aio, log4cpp.
  • CloudStore is integrated with Hadoop and Hypertable. This enables applications built on those systems to seamlessly use CloudStore as the underlying data store.
  • CloudStore is deployed on Solaris and Linux platforms for storing web log data, crawler data, etc.

SmallSQL

  • SmallSQL is a 100% pure Java database (RDBMS) for Java desktop applications. It is not currently certified as pure Java, but you can run the pure Java test yourself.
  • We want to build SmallSQL with a very small footprint of 200 – 300 KB for the runtime jar file. We think that the database runtime should not be the largest part of a desktop application.
  • SmallSQL should be as compatible as possible with other popular databases such as MS Access, MS SQL Server, Oracle, or DB2. We want migration to our product to be very easy. Ideally, you only need to change the JDBC URL in your application.
  • SmallSQL has no network interface and you cannot share the database. This makes the SmallSQL runtime very small and fast. A Java desktop application needs no network interface.
  • SmallSQL requires JDK 1.4.x or higher. We want to implement the JDBC 3.0 API in a future version, which is why we need JDK 1.4. Currently we see no reason to support an older version of the JDK for a completely new product. If you can give us a good reason, we can reconsider.

LucidDB

LucidDB is the first and only open-source RDBMS purpose-built entirely for data warehousing and business intelligence. It is based on architectural cornerstones such as column-store, bitmap indexing, hash join/aggregation, and page-level multiversioning. Most database systems (both proprietary and open-source) start life with a focus on transaction processing capabilities, then get analytical capabilities bolted on as an afterthought (if at all). By contrast, every component of LucidDB was designed with the requirements of flexible, high-performance data integration and sophisticated query processing in mind. Moreover, comprehensiveness within the focused scope of its architecture means simplicity for the user: no DBA required.

HyperGraphDB

HyperGraphDB is a general purpose, extensible, portable, distributed, embeddable, open-source data storage mechanism. It is a graph database designed specifically for artificial intelligence and semantic web projects, but it can also be used as an embedded object-oriented database for projects of all sizes.

InfoGrid

InfoGrid is an Internet Graph Database with many additional software components that make it easy to develop RESTful web applications on a graph foundation. InfoGrid is open source.

Apache Derby

Apache Derby, an Apache DB subproject, is an open source relational database implemented entirely in Java and available under the Apache License, Version 2.0. Some key advantages include:

  • Derby has a small footprint — about 2.6 megabytes for the base engine and embedded JDBC driver.
  • Derby is based on the Java, JDBC, and SQL standards.
  • Derby provides an embedded JDBC driver that lets you embed Derby in any Java-based solution.
  • Derby also supports the more familiar client/server mode with the Derby Network Client JDBC driver and Derby Network Server.
  • Derby is easy to install, deploy, and use.

HamsterDB

hamsterdb Embedded Storage is a lightweight embedded “NoSQL” key-value store. It has been in development for more than five years and concentrates on ease of use, high performance, stability and scalability.

H2

  • Very fast, open source, JDBC API
  • Embedded and server modes; in-memory databases
  • Browser based Console application
  • Small footprint: around 1 MB jar file size

EyeDB

EyeDB is an Object Oriented Database Management System (OODBMS) based on the ODMG 3 specification, developed and supported by the French company SYSRA.

EyeDB provides an advanced object model (inheritance, collections, arrays, methods, triggers, constraints, reflexivity), an object definition language based on ODMG ODL, an object query and manipulation language based on ODMG OQL and programming interfaces for C++ and Java.

EyeDB is free software, distributed under the terms of the GNU Lesser General Public License.

db4objects

Get a head start for your products by leveraging db4o’s cutting edge technology to achieve unprecedented levels of performance and flexibility. Simply embed db4o’s open source object database engine into your application and store and retrieve even the most complex object structures with only one line of code.

Tokyo Cabinet

Tokyo Cabinet is a library of routines for managing a database. The database is a simple data file containing records, each of which is a pair of a key and a value. Every key and value is a variable-length sequence of bytes. Both binary data and character strings can be used as keys and values. There is no concept of data tables or data types. Records are organized in a hash table, B+ tree, or fixed-length array.
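The "data file of key/value byte pairs" model can be tried out without installing anything. The sketch below uses Python's standard `dbm.dumb` module as a stand-in, which is an assumption for illustration only: it is a different and much simpler library than Tokyo Cabinet, but it has the same shape (bytes in, bytes out, no tables or types). The file name `records` is made up.

```python
import dbm.dumb
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "records")

with dbm.dumb.open(path, "c") as db:      # "c": create the file if missing
    db[b"user:1"] = b"Ana"                # keys and values are plain bytes
    db[b"blob"] = bytes([0, 255, 16])     # binary data is fine too

with dbm.dumb.open(path, "r") as db:      # reopen: the data lives in the file
    name = db[b"user:1"]
    blob = db[b"blob"]

print(name, list(blob))  # b'Ana' [0, 255, 16]
```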

Project Voldemort

A distributed database.

  • Data is automatically replicated over multiple servers.
  • Data is automatically partitioned so each server contains only a subset of the total data
  • Server failure is handled transparently
  • Pluggable serialization is supported to allow rich keys and values including lists and tuples with named fields, as well as to integrate with common serialization frameworks like Protocol Buffers, Thrift, and Java Serialization
  • Data items are versioned to maximize data integrity in failure scenarios without compromising availability of the system
  • Each node is independent of other nodes with no central point of failure or coordination
  • Good single node performance: you can expect 10-20k operations per second depending on the machines, the network, the disk system, and the data replication factor
  • Support for pluggable data placement strategies to support things like distribution across data centers that are geographically far apart.

It is used at LinkedIn for certain high-scalability storage problems where simple functional partitioning is not sufficient. It is still a new system which has rough edges, bad error messages, and probably plenty of uncaught bugs. Let us know if you find one of these, so we can fix it.
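To picture how automatic partitioning, replication, and transparent failure handling fit together, here is a deliberately simplified pure-Python sketch. It is not Voldemort's API or its real placement algorithm: it assumes plain modulo placement over a fixed server list, and `owners`, `put`, `get`, the node names, and the `down` parameter are all hypothetical.

```python
import hashlib

SERVERS = ["node-a", "node-b", "node-c", "node-d"]   # hypothetical node names
REPLICAS = 2                                          # copies kept of each key

def owners(key):
    """Pick REPLICAS consecutive servers, starting at the key's hash slot."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:4], "big") % len(SERVERS)
    return [SERVERS[(start + i) % len(SERVERS)] for i in range(REPLICAS)]

stores = {name: {} for name in SERVERS}   # each server holds only its own subset

def put(key, value):
    """Write the value to every server that owns the key."""
    for name in owners(key):
        stores[name][key] = value

def get(key, down=()):
    """Read from the first owner that is not in the `down` set."""
    for name in owners(key):
        if name not in down:
            return stores[name][key]
    raise KeyError(key)

put("member:42", "Ana")
primary = owners("member:42")[0]
print(get("member:42", down={primary}))   # still readable from the replica
```

Each key ends up on two of the four servers, so losing one of its owners does not make it unreadable.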

NorthScale Membase

Best practice web application architecture today is “scale-out” in nature — simply add more commodity web servers behind a load balancer to support more users. Scaling out is also a key tenet of the emerging “cloud computing” model.

In contrast, the RDBMS, the default choice for holding the data behind web applications, is a “scale-up” technology. To handle more users, one must get a bigger server (adding CPUs, memory and/or IO capacity). Bigger servers mean more cost and complexity compared to the low-cost, commodity hardware typically deployed for web- and cloud-based architectures. And, ultimately, there is a limit to how big a server one can purchase, even given an unlimited willingness and ability to pay.

In a nutshell, RDBMS technology optimized for 1970s-era applications is incongruent with the operating realities of modern web-delivered software systems, creating a scale-out, scale-up mismatch.

Membase is an open-source (Apache 2.0 license) distributed, key-value database management system optimized for storing data behind interactive web applications. These applications must service many concurrent users; creating, storing, retrieving, aggregating, manipulating and presenting data in real-time. Supporting these requirements, membase processes data operations with quasi-deterministic low latency and high sustained throughput. It scales linearly from a single-server deployment to a cluster of thousands of machines. And because membase does not require creation of a schema before storing data, it is a flexible, cost-effective place to Store Lots of Stuff. The original membase source code was released as Open Source by NorthScale, Zynga and NHN to membase.org in June 2010.

4 comments

  1. Very cool post. Even so, it would be great to have another one covering fewer systems but analyzing which situations each database is useful for, because I know all of them by name (and a few more), but the areas of application of some of them are not clear to me 😉

    1. Hi JC, well, the post was about the number of open source database options beyond the typical ones.
      If you know of any others, just send them over and I'll update the post.
      I'll try, if I can find the time.
