Posted by: Greg Luck
JCache, being specified in JSR107, provides a common way for Java programs to interact with caches. Terracotta has played the leading role in JSR107 development, acting as specification lead, and final approval is expected later this year. BigMemory, Terracotta’s flagship product for managing big, fast data, will be fully compliant with the specification early next year.
Open source caching projects and commercial caching vendors have been around for over a decade, and the distributed kind, often called a Distributed Cache, has entered wide adoption. Each vendor offers a very similar map-like API for basic storage and retrieval, yet each API is proprietary.
With the introduction of the JSR107 specification, developers can program to a standard API instead of being tied to a single vendor, eliminating a major inhibitor to the mass adoption of in-memory technology. Other areas of Java have solved this problem through standards; successful examples include JDBC, JPA and JMS.
In fact, the analyst firm Gartner reported last year that the lack of a standard in this area was the single biggest inhibitor to mass adoption.
A cache is a place where you put a copy of data that is intended to be used multiple times. Caching implementations, being in-memory, are much faster than the original source of the content, so you get a performance benefit from using a cache while also offloading the resource the original data came from.
Caching is an important technique used within the Big Fast Data family. Because caches are key value stores held in memory, cache operations are lightning fast. They are also very simple and, consequently, have far fewer features than a database.
To be effective, data needs to be used multiple times. There is no value in caching data that is written once and never read, or read only once. The efficiency of a cache can be measured by its hit ratio, defined as cache hits divided by cache requests.
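As a minimal illustration of that formula (the class and counters here are hypothetical bookkeeping, not part of the specification), the hit ratio could be tracked like this:

```java
// Illustrative hit-ratio bookkeeping; this class is hypothetical and
// not part of JSR107.
final class CacheStats {
    private long hits;
    private long requests;

    // Record one cache request, noting whether it was a hit.
    void record(boolean hit) {
        requests++;
        if (hit) {
            hits++;
        }
    }

    // Hit ratio = cache hits / cache requests.
    double hitRatio() {
        return requests == 0 ? 0.0 : (double) hits / requests;
    }
}
```

After, say, 8 hits out of 10 requests the ratio is 0.8; a ratio near zero indicates data that is rarely re-read and therefore not worth caching.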
To provide the maximum offload, caches need to be distributed so that work done by one application server benefits the others, eliminating duplicate requests for the same data to the underlying resource.
Finally, the affordability of servers with memory capacities of 1TB and higher, combined with vendor innovation to utilize that memory for cache storage, is resulting in a new trend where the cache has increased operational significance. Instead of just caching part of a dataset, the entire dataset is placed in cache and is used as an authoritative source of information – the cache in essence becomes the operational store for the application. In this use case, the cache is often referred to as a Data Grid.
Each of these areas has requirements that the standard must deal with.
From a design point of view, the basic concepts are: (1) a CacheManager holds and controls a collection of Caches, and (2) those Caches in turn hold entries with keys and values. The API can be thought of as map-like with the following additional features:
- Atomic operations, similar to java.util.ConcurrentMap
- Read-through caching
- Write-through caching
- Cache event listeners
- Caching annotations
- Full generics API for compile-time safety
- Storage by reference (applicable to on heap caches only) and storage by value
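As a rough sketch of this two-level design, a CacheManager holding named Caches whose entries are key/value pairs, here is a plain-Java analogue. The class and method names are hypothetical stand-ins mirroring the spec's concepts, not the actual javax.cache API; the atomic operations simply delegate to java.util.ConcurrentMap, which the spec's operations resemble:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the spec's Cache concept; NOT the real javax.cache type.
final class SimpleCache<K, V> {
    private final ConcurrentMap<K, V> entries = new ConcurrentHashMap<>();

    V get(K key)             { return entries.get(key); }
    void put(K key, V value) { entries.put(key, value); }

    // Atomic operations, similar to java.util.ConcurrentMap.
    V putIfAbsent(K key, V value)          { return entries.putIfAbsent(key, value); }
    boolean replace(K key, V oldV, V newV) { return entries.replace(key, oldV, newV); }
}

// Hypothetical stand-in for the spec's CacheManager concept: it holds and
// controls a collection of named caches.
final class SimpleCacheManager {
    private final ConcurrentMap<String, SimpleCache<?, ?>> caches =
            new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    <K, V> SimpleCache<K, V> getOrCreateCache(String name) {
        return (SimpleCache<K, V>) caches.computeIfAbsent(
                name, n -> new SimpleCache<K, V>());
    }
}
```

Note that this sketch stores by reference; a store-by-value implementation would additionally copy keys and values on the way in and out.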
In addition, there are optional features that implementations are not required to provide, such as:
- storeByReference (storeByValue is the default)
Whether they are present can be determined with a call to the capabilities API: CachingProvider.isSupported.
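A capability check of this kind might look roughly as follows. The enum and classes here are plain-Java stand-ins mirroring the spec's CachingProvider.isSupported concept, not the actual javax.cache types:

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-ins; the real spec exposes this via CachingProvider.isSupported.
enum OptionalFeature { STORE_BY_REFERENCE }

final class SimpleCachingProvider {
    private final Set<OptionalFeature> supported;

    SimpleCachingProvider(Set<OptionalFeature> supported) {
        this.supported = supported;
    }

    // Report whether this provider implements an optional feature.
    boolean isSupported(OptionalFeature feature) {
        return supported.contains(feature);
    }
}
```

A store-by-value-only provider would simply return false for STORE_BY_REFERENCE, and application code can branch accordingly before configuring a cache.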
Where Can I Use It?
JCache will work with Java SE 6 and higher, and will run in Java EE 6 and higher, as well as in Spring and Guice enterprise environments.
For further information visit JSR107’s project home page.