Enterprise IT Consultant Views on Technologies and Trends

Oct 4 2010   5:41AM GMT

Key-value Pair (KVP) – the backbone of NoSQL

Sasirekha R

Most NoSQL databases (BigTable, Hadoop, CouchDB, SimpleDB, memcached, Redis) use the key-value pair (KVP) concept. A key-value pair is a set of two linked data items:

1. a key (say account number or part number), which is a unique identifier for some item of data, and

2. the value, which is either the data that is identified or a pointer to the location of that data.

Key/value stores have been around for a long time: Unix's dbm and gdbm and Berkeley DB are all key/value stores. The key-value pair concept has also been used frequently in traditional RDBMS applications for lookup tables, hash tables and configuration files.

KVP has an important restriction: results can be accessed by key alone. This restriction yields large performance gains, because data can be partitioned over multiple nodes without impacting the ability to query each node in isolation.
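A minimal Python sketch of why key-only access makes partitioning straightforward: the key alone decides which node holds a pair, so a lookup touches exactly one node. The node count, the `node_for` helper and the use of in-process dicts as stand-ins for nodes are assumptions made for illustration, not something any particular product prescribes.

```python
import hashlib

NUM_NODES = 4  # hypothetical cluster size
nodes = [{} for _ in range(NUM_NODES)]  # each dict stands in for one node


def node_for(key: str) -> int:
    """Map a key to a node index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_NODES


def put(key: str, value: str) -> None:
    nodes[node_for(key)][key] = value


def get(key: str):
    # Only the owning node is consulted; the other nodes are never touched.
    return nodes[node_for(key)].get(key)


put("account:1001", "Alice")
put("account:1002", "Bob")
print(get("account:1001"))  # Alice
```

Simple modulo placement like this reshuffles most keys when the node count changes; it is used here only to keep the sketch short.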

Just as moving away from normalization meant compromising on consistency, restricting KVPs to key-based access compromises on the rapid retrieval capability provided by relational databases and makes reporting (especially ad hoc reporting) difficult.
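To illustrate the reporting difficulty, here is a small Python sketch with made-up account data, in which the values are opaque serialized strings as far as the store is concerned. A lookup by key is a single step, while an ad hoc question about the contents has to deserialize and inspect every value.

```python
import json

# The store sees only keys and opaque strings (hypothetical sample data).
store = {
    "acct:1": json.dumps({"owner": "Alice", "balance": 120}),
    "acct:2": json.dumps({"owner": "Bob", "balance": 40}),
    "acct:3": json.dumps({"owner": "Carol", "balance": 300}),
}

# Key-based access: one direct lookup.
alice = json.loads(store["acct:1"])

# Ad hoc report ("balances over 100"): every value must be scanned.
rich = sorted(
    key for key, raw in store.items() if json.loads(raw)["balance"] > 100
)
print(rich)  # ['acct:1', 'acct:3']
```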

The key/value stores are the simplest NoSQL databases where a key points to a value that is typically an arbitrary string. The operation of finding the value associated with a key is called a lookup (or indexing) and the relationship between a key and its value is called a mapping (or binding).
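In Python terms, a dictionary is the in-process analogue of such a store. The sketch below, using made-up part numbers, shows a binding being created and a lookup being performed.

```python
# A dict is an in-process key/value mapping; the part numbers are made up.
parts = {}

# Binding: associate a key with a value.
parts["PN-1001"] = "hex bolt, M8"
parts["PN-1002"] = "washer, 8 mm"

# Lookup: find the value bound to a key.
print(parts["PN-1001"])  # hex bolt, M8

# Keys are unique, so binding an existing key replaces its value.
parts["PN-1001"] = "hex bolt, M10"
print(len(parts))  # 2
```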

Most NoSQL key/value stores are a bit more than a simple key/value store and offer advanced features. Among them, the in-memory variants retain their data in memory for improved performance (useful as a distributed cache mechanism), and the on-disk versions save their data directly to disk (useful as data storage).
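The difference between the two variants can be sketched in Python, assuming a plain dict as the in-memory store and the standard-library `dbm` module (a descendant of the Unix dbm family) as the on-disk one. The file name and the session key are arbitrary choices for the example.

```python
import dbm
import os
import tempfile

# In-memory variant: fast, but the data disappears with the process.
cache = {"session:42": "alice"}

# On-disk variant: the pair is written to a database file.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "kv_demo")

with dbm.open(path, "c") as db:  # "c" creates the database if needed
    db["session:42"] = "alice"   # str keys and values are stored as bytes

# Reopen read-only, as a later process would: the pair is still there.
with dbm.open(path, "r") as db:
    persisted = db["session:42"]

print(persisted)  # b'alice'
```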

Memcached is a simple key-value store in which each item is made up of a key, an expiration time, optional flags, and raw data. The server does not understand data structures, so the data uploaded must be pre-serialized. Some commands (incr/decr) may operate on the underlying data, but the implementation is simplistic.
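The item model described above can be sketched as a toy in-process cache. This is an illustration only: `ToyCache` and `Item` are hypothetical names, and the code is neither memcached's implementation nor the API of any memcached client.

```python
import pickle
import time
from dataclasses import dataclass


@dataclass
class Item:
    data: bytes        # raw bytes; the store never interprets them
    flags: int         # opaque client-defined flags
    expires_at: float  # absolute expiry time; 0 means "never"


class ToyCache:
    """Illustrative stand-in for a memcached-like store (not its real API)."""

    def __init__(self):
        self._items = {}

    def set(self, key, data: bytes, ttl: float = 0, flags: int = 0):
        expires_at = time.time() + ttl if ttl else 0
        self._items[key] = Item(data, flags, expires_at)

    def get(self, key):
        item = self._items.get(key)
        if item is None:
            return None
        if item.expires_at and time.time() >= item.expires_at:
            del self._items[key]  # lazily drop expired items
            return None
        return item.data

    def incr(self, key, delta: int = 1):
        # Simplistic: treat the stored bytes as an ASCII decimal number.
        item = self._items[key]
        item.data = str(int(item.data.decode("ascii")) + delta).encode("ascii")
        return int(item.data.decode("ascii"))


cache = ToyCache()
# The client serializes; the store just keeps the resulting bytes.
cache.set("user:7", pickle.dumps({"name": "alice"}), ttl=60)
cache.set("hits", b"10")
print(pickle.loads(cache.get("user:7")))  # {'name': 'alice'}
print(cache.incr("hits", 5))              # 15
```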

Redis is an advanced key-value store in which the keys must be simple strings and the values can be of the following types:

  • Strings – The most basic type; the single elements in the other data types are strings. Redis strings are binary safe, in the sense that they can contain any kind of data (a JPEG image, a serialized Ruby object, etc.).
  • Lists – Lists of Redis strings, sorted by insertion order.
  • Sets – Unordered collections of non-repeating strings. Adding the same element multiple times results in the set holding a single copy of it, so no existence check is needed before adding. Sets support a wide range of operations, including union, intersection and difference.
  • Sorted sets – Collections of non-repeating elements ordered using an associated score (every member has a score that is used to place the member in the right order).
  • Hashes – Unordered maps between string fields and string values. Hashes provide a simple way for a key to hold an object composed of different fields. For instance, a web application's users can be represented by a hash containing fields such as username, encrypted password, last login, etc.
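As a rough guide to how these types behave, the sketch below maps each one onto a Python built-in. These are analogies with made-up sample data; the code does not talk to a Redis server and does not use Redis commands or any client library.

```python
# Rough Python analogues of the Redis value types (sample data is made up).
store = {}

store["logo"] = b"\xff\xd8\xff"                   # String: any bytes
store["recent"] = ["page1", "page2", "page3"]     # List: insertion order
store["tags"] = {"nosql", "kvp"}                  # Set: unique members
store["scores"] = {"alice": 30, "bob": 10}        # Sorted set: member -> score
store["user:1"] = {"username": "alice", "last_login": "2010-10-04"}  # Hash

# Adding an existing member leaves a single copy; no existence check needed.
store["tags"].add("nosql")
print(len(store["tags"]))  # 2

# Union, intersection and difference.
other = {"kvp", "redis"}
print(sorted(store["tags"] | other))  # ['kvp', 'nosql', 'redis']
print(sorted(store["tags"] & other))  # ['kvp']
print(sorted(store["tags"] - other))  # ['nosql']

# Sorted set: members come back ordered by score.
ranking = sorted(store["scores"], key=store["scores"].get)
print(ranking)  # ['bob', 'alice']
```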

The document stores, like CouchDB and MongoDB, also use KVP: the value stored is a document of any length, and documents can be retrieved based on their content.
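A minimal Python sketch of the idea, assuming documents are plain dicts and using a hypothetical `find` helper that scans them linearly. Real document stores provide their own query mechanisms, such as views in CouchDB and queries backed by indexes in MongoDB; the sketch only shows the shape of key access versus content-based retrieval.

```python
# Keys map to whole documents; documents need not share the same fields.
docs = {
    "doc1": {"type": "post", "author": "alice", "tags": ["nosql", "kvp"]},
    "doc2": {"type": "post", "author": "bob", "tags": ["sql"]},
    "doc3": {"type": "comment", "author": "alice", "text": "Nice post"},
}


def find(**criteria):
    """Return keys of documents whose fields equal all given criteria."""
    return sorted(
        key
        for key, doc in docs.items()
        if all(doc.get(field) == value for field, value in criteria.items())
    )


print(docs["doc2"]["author"])             # bob (access by key)
print(find(author="alice"))               # ['doc1', 'doc3']
print(find(author="alice", type="post"))  # ['doc1']
```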

1  Comment on this Post

  • erichansen1836
    I like using the public domain SDBM database of key/value pairs, hash table storage device (tied to a program hash table), to index the fixed-length records of a flat file database.  I can store multi-millions of records in my flat file, with instantaneous lookup to any record. The Key is one or more fields and/or partial fields contained within the records of the flat file, and the value is the byte offset location to each record to set the file pointer for random access.  Edits to records are made "in place" overwriting the record in its entirety or only overwriting the contents of one field.  I call this Joint Database Technology.  I have code examples at Perlmonks.org.