Defining In-Memory Databases

I was recently working on a project where the client wanted a single zip file to run the application. Yes, a single zip file. It’s probably not what you are thinking. The main issue was the database, because an RDBMS normally runs separately from, and in isolation from, the application. I saw this as an opportunity for growth and explored some in-memory databases (IMDBs), because IMDBs can be embedded inside the application and require no separate database file or connection. It worked well, which made me recognize how useful IMDBs can be across a wide range of organizations. In this post I demonstrate how IMDBs work and how they are architected, and I hope it helps them become more widely used.

Defining IMDB

An in-memory database system is exactly what the name says: it keeps its data in main memory, so it requires no disk input/output whatsoever and avoids the cache management, file management, and similar overhead of a disk-based system. For this reason, an in-memory database is also faster than a conventional database that is either fully cached or stored on a RAM disk.

In-memory databases support a formal data definition language (DDL) and database schemas. They also support the relational model, transaction logging, database indexes, client/server architectures, security features, and more.

History of IMDBs

Database scientist Jim Gray, working for IBM, conceptualized this technology more than 30 years ago, when he built one of the first in-memory engines (IMS/VS FastPath) in 1978.

The first significant commercial IMDB offering was TimesTen (later acquired by Oracle). Since then, almost all top database vendors have come to offer an IMDB product, such as IBM’s solidDB or Sybase’s ASE.

Differences Between IMDBs and Traditional Disk DBs

In-Memory Databases | On-Disk Databases
All data is stored in main memory; no disk I/O is needed to query or update data. | All data is stored on disk; disk I/O is needed to move data into main memory when required.
Data is persistent or volatile, depending on the in-memory database product. | Data is always persisted to disk.
Specialized data structures and index structures assume data is always in main memory. | Traditional data structures such as B-Trees are designed to store tables and indices efficiently on disk.
Database size is limited by the amount of main memory. | Database size is virtually unlimited.
Optimized for specialized workloads, e.g. communications industry-specific HLR/HSS workloads. | Support a very broad set of workloads, e.g. OLTP, data warehousing, and mixed workloads.

Architectural Differences Between IMDB and Traditional Disk DB

[Figures: architecture of an in-memory database and of a traditional on-disk database]

Architectural Diagram of IMDB

[Figure: architectural diagram]

The diagram shows the handoffs involved when an application reads and updates a record held on disk:

  1. The application requests data from the database runtime through the database API.
  2. If the data is stored on the physical disk, the database runtime instructs the file system to retrieve it from the physical media.
  3. The file system caches the data and passes a copy to the database.
  4. The database keeps one copy in its cache and passes another to the application.
  5. The application modifies its copy and passes it back to the database through the database API.
  6. The modified data is copied to the database cache.
  7. The copy in the database cache is written to the file system, where it is updated in the file system cache.
  8. Finally, the data is written back to the physical media.

Digging Deeper into the Architecture

Placing an entire on-disk database on a RAM disk will speed up both database reads and writes. However, the database is still hard-wired for disk storage, and the processes that facilitate disk storage, such as caching and file I/O, continue to operate even though they are now redundant.

An IMDB, by contrast, is usually embedded within the application itself and runs with the application server that hosts it. Data in an on-disk database system must also be transferred to numerous locations as it is used. The diagram above shows the handoffs required for an application to read a piece of data from an on-disk database, modify it, and write that record back to the database. These steps cost time and CPU cycles and cannot be avoided in a traditional database. When the data resides in RAM, only minimal CPU cycles and no disk I/O are required.
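The handoffs described above can be made concrete with a toy model. The sketch below is not a real database: the class name HandoffSketch and its methods are hypothetical, and it simply counts how often a record is copied between layers on each path. It assumes the in-memory path needs one copy out to the application and one copy back, which is a simplification.

```java
/** Toy model (not a real database) that counts record copies between layers. */
final class HandoffSketch {
    private int copies = 0;

    /** Simulates handing a record from one layer to the next by copying it. */
    private byte[] handOff(byte[] record) {
        copies++;
        return record.clone();
    }

    int copies() {
        return copies;
    }

    /** On-disk path: read a record, modify it, and write it back. */
    byte[] onDiskReadModifyWrite(byte[] onMedia, byte newFirstByte) {
        byte[] fsCache = handOff(onMedia);   // physical media -> file system cache
        byte[] dbCache = handOff(fsCache);   // file system cache -> database cache
        byte[] appCopy = handOff(dbCache);   // database cache -> application
        appCopy[0] = newFirstByte;           // application modifies its copy
        dbCache = handOff(appCopy);          // application -> database cache
        fsCache = handOff(dbCache);          // database cache -> file system cache
        return handOff(fsCache);             // file system cache -> physical media
    }

    /** In-memory path (assumed): one copy to the application, one copy back. */
    byte[] inMemoryReadModifyWrite(byte[] inMemory, byte newFirstByte) {
        byte[] appCopy = handOff(inMemory);  // in-memory store -> application
        appCopy[0] = newFirstByte;           // application modifies its copy
        return handOff(appCopy);             // application -> in-memory store
    }

    public static void main(String[] args) {
        byte[] record = {1, 2, 3};

        HandoffSketch disk = new HandoffSketch();
        disk.onDiskReadModifyWrite(record, (byte) 9);

        HandoffSketch memory = new HandoffSketch();
        memory.inMemoryReadModifyWrite(record, (byte) 9);

        System.out.println("on-disk copies:   " + disk.copies());
        System.out.println("in-memory copies: " + memory.copies());
    }
}
```

Under these assumptions the on-disk path makes six copies and the in-memory path makes two. Real systems differ in the details, but the ratio illustrates where the extra time and CPU cycles go.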

Key Benefits

  • Performance: faster access to data
  • Flexibility: store all or part of your database in-memory and/or on-disk
  • Reliability: ensure ACID compliance in diskless mode
  • Ease of Use: adding one keyword to your data definition tells your database to reside in memory, making it trivial to move an on-disk database to in-memory
  • Multiple APIs and Drivers: d_API, Objective C API, SQL API, ODBC API, C++ API, ADO.NET Provider, JDBC Driver, ODBC Driver

 

Types of IMDB

  • VoltDB: Implements the H-Store design
  • MemSQL: Proprietary relational SQL database
  • SQLite: SQL database that supports in-memory databases through the :memory: connection string
  • HyperSQL Database (HSQLDB): Leading SQL relational database software written in Java
  • McObject eXtremeDB: Combines exceptional performance, reliability and developer efficiency in a proven real-time embedded database engine

 

Implementation/Usage of HSQLDB (IMDB) with Sample Java Program

Connection String
<property name="driverClassName" value="org.hsqldb.jdbcDriver" />
<property name="url" value="jdbc:hsqldb:file:/home/vikask/elmo/db/elmo;" />
<property name="username" value="sa" />
<property name="password" value="" />

This simple Java project demonstrates Hibernate, HSQLDB, and Maven using Java annotations. HSQLDB keeps the project simple: because it can run as an in-memory database, the only thing we need to include in the project is a JAR file.
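As a rough illustration of the "only a JAR file" point, here is a minimal sketch that talks to HSQLDB through plain JDBC, assuming the HSQLDB JAR is on the classpath. Note that the connection string shown above uses the file: form, which keeps the database files on disk; HSQLDB's memory-only form is jdbc:hsqldb:mem:<name>. The class name, the database name demo, and the employee table are hypothetical and chosen only for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Sketch: create and query a memory-only HSQLDB database over plain JDBC. */
final class InMemoryHsqldbSketch {

    // Memory-only URL; "demo" is a hypothetical database name.
    static final String MEM_URL = "jdbc:hsqldb:mem:demo";

    /** Creates a (hypothetical) employee table, inserts one row, returns the row count. */
    static int createAndCount(String url) throws SQLException {
        // "SA" with an empty password is the default HSQLDB account.
        try (Connection conn = DriverManager.getConnection(url, "SA", "");
             Statement st = conn.createStatement()) {
            st.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name VARCHAR(50))");
            st.executeUpdate("INSERT INTO employee VALUES (1, 'Ada')");
            try (ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM employee")) {
                rs.next();
                return rs.getInt(1);
            }
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println("rows: " + createAndCount(MEM_URL));
        } catch (SQLException e) {
            // Typically "No suitable driver" if the HSQLDB JAR is missing from the classpath.
            System.err.println("Could not use HSQLDB: " + e.getMessage());
        }
    }
}
```

Nothing is installed or started separately here: the database lives inside the application's own JVM, and in the mem: form its contents are gone once the process exits.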

Starting HSQLDB Server:

  • To connect to an embedded HSQLDB database, select the JDBC (HSQLDB embedded) connection type from the connection type list
  • Enter any login information if applicable, and then specify whether to use an existing embedded database or to have HSQLDB create a new embedded database
  • If the embedded database already exists, browse to the directory where the database files are located (such as database_name.log, database_name.script, and database_name.properties) and select the database_name.script file
  • If the database doesn’t exist, type in or browse to a new location for the HSQLDB database. HSQLDB will then create the necessary files, using the database name you typed as the file name prefix. For example, if you type /home/vikask/sample as the location of the database, HSQLDB will create files named sample.properties, sample.log, and so on.

Connection Driver(s) and Parameters:

  • Driver Class: org.hsqldb.jdbcDriver
  • Driver Location: Provide the location of the jar or zip file that contains the HSQLDB drivers. See HSQLDB’s site for more information on obtaining this file.
  • JDBC URL Format: jdbc:hsqldb:file:/<path to db>. The file form takes no port. A :<port> value appears only in server-mode URLs (jdbc:hsqldb:hsql://<host>:<port>/<alias>) and can be omitted when the database server uses the default port.
  • Example: jdbc:hsqldb:file:/home/vikask/elmo/db/elmo
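To make the URL forms easier to compare, the sketch below builds the three common HSQLDB JDBC URLs: file (embedded, on disk), mem (embedded, memory only), and hsql (a separate server). The helper class HsqldbUrls is hypothetical and not part of HSQLDB. It assumes 9001 is the server's default port and drops the port from the URL in that case.

```java
/** Hypothetical helper that builds the common HSQLDB JDBC URL forms. */
final class HsqldbUrls {

    // Assumed default port of an HSQLDB server using the hsql protocol.
    static final int DEFAULT_SERVER_PORT = 9001;

    /** Embedded database whose files live under the given path. */
    static String file(String path) {
        return "jdbc:hsqldb:file:" + path;
    }

    /** Embedded database kept only in memory. */
    static String mem(String name) {
        return "jdbc:hsqldb:mem:" + name;
    }

    /** Server-mode database; the port is omitted when it is the default. */
    static String server(String host, int port, String alias) {
        String hostPort = (port == DEFAULT_SERVER_PORT) ? host : host + ":" + port;
        return "jdbc:hsqldb:hsql://" + hostPort + "/" + alias;
    }

    public static void main(String[] args) {
        System.out.println(file("/home/vikask/elmo/db/elmo"));
        System.out.println(mem("elmo"));
        System.out.println(server("localhost", 9001, "elmo"));
        System.out.println(server("localhost", 9100, "elmo"));
    }
}
```

Only the mem form gives a purely in-memory database. The file form used in this post still embeds the engine in the application but persists data to disk.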

HSQL Database Manager:

  • Displays information in a tree format for databases, schemas, tables, views, system tables, functions, sequences, users, and indexes
  • Displays HSQLDB table information such as column name, column type, column length, column nullability, and primary and foreign key information
  • Easily view table contents or database object information via the View Contents

Market Trends for In-Memory Database and the Forecasts

In-memory databases are among the top ten technologies impacting the IT sector. As application infrastructure technologies mature and semiconductors become faster and more affordable, mainstream use of in-memory computing (IMC) becomes more likely.

The IMDB market has two major segments:

  1. In-memory data management system
  2. In-memory application platform

At present, in-memory data management systems hold the greater market share, while the in-memory application platform segment is projected to grow significantly. According to Research and Markets, the in-memory data management market is expected to reach $1 billion by 2016.

This forecast is based on a sharp rise in transactional and analytical requirements for high-volume big data. IMDB solutions are quickly finding homes in almost every data warehousing and analytics application. Today, database management and analysis technologies are ubiquitous. From private sector organizations to the public and government sector, they play a vital role in achieving operational efficiency. Organizations increasingly need technologies that can store more data in less space while analyzing it at higher speeds.

Vikas Kukreti

Technical Lead

Vikas Kukreti is working as a Technical Lead in Development Engineering at 3Pillar Global. Vikas has 9 years of database design, ETL, DWH, AWS, development and Data Science experience. He has database architecture experience in areas such as ETL, Data warehousing, Oracle Database, Hadoop, Hive, and Data Modeling. He has strong knowledge of Big Data, Data Science and Cloud Computing. Vikas is passionate about big data technology and Data Science, especially Hadoop and MongoDB. Vikas is a graduate of Uttar Pradesh Technical University (UPTU), India, and is a keen reader of classic novels and a movie freak.
