Posted by: Karthik Lalithraj
One of the great benefits of persistence frameworks like Hibernate is that they allow architects and developers to manage data in ultra-fast machine memory, or RAM. By default, a first-level cache, at the Hibernate session level, is always enabled. A second-level cache, at the session-factory tier, is optional, but can result in huge performance gains at scale. Additionally, Hibernate allows for query-level caching.
Terracotta BigMemory (Ehcache) is the default query- and second-level cache for Hibernate, and it can keep terabytes of data in memory with as few as two changes to a configuration file. Understanding how BigMemory works with Hibernate makes designing your enterprise applications much easier, so in this post I’ll share tips and best practices for using Terracotta BigMemory as a query cache and a second-level Hibernate cache.
1. How do I get started?
Documentation on using BigMemory with Hibernate is here: http://ehcache.org/documentation/user-guide/hibernate
Enabling the second-level cache and the query cache takes one property each in your hibernate.cfg.xml file:
<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="hibernate.cache.use_query_cache">true</property>
I typically use the Ehcache Singleton factory:
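As a sketch of what that setting can look like: the class name below assumes Hibernate 3.3+ with the Ehcache 2.x integration classes. With Hibernate 4, the equivalent class is shipped as org.hibernate.cache.ehcache.SingletonEhCacheRegionFactory, so check which one matches your versions.

```xml
<!-- hibernate.cfg.xml fragment: use the singleton Ehcache region factory -->
<!-- Assumption: Ehcache 2.x integration; adjust the package for Hibernate 4 -->
<property name="hibernate.cache.region.factory_class">net.sf.ehcache.hibernate.SingletonEhCacheRegionFactory</property>
```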
2. How do I know my query is hitting the second-level cache?
The simplest and safest way is to set “show_sql” to “true” in your Hibernate property file. When you query the database, if the SQL query prints to the console, it is probably not using your second-level cache. In addition, you can use the Terracotta Monitoring Console (provided as part of BigMemory Enterprise kit) or any Hibernate profiler (http://www.hibernatingrhinos.com/products/hprof) and look at the hits and misses against your cache.
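A minimal sketch of the relevant properties follows. Turning on hibernate.generate_statistics is an addition on my part: it makes Hibernate collect second-level cache hit and miss counts that you can read from the SessionFactory statistics, which is a handy cross-check on the console output.

```xml
<!-- hibernate.cfg.xml fragment -->
<!-- Print every SQL statement Hibernate sends to the database -->
<property name="hibernate.show_sql">true</property>
<!-- Optional: collect cache hit/miss statistics (adds some overhead) -->
<property name="hibernate.generate_statistics">true</property>
```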
3. How do I specify custom cache regions?
By default, Hibernate always points to the default ehcache.xml and the default cache region. This implies that Hibernate manages the cache regions for you.
Let’s take an example. Say you have two Hibernate objects, Account and Customer. By default, the settings of the default cache will be applied to these objects. Hibernate will create a cache named after the fully qualified class name (e.g., com.company.domain.Account will be the name of the cache).
For more control, you can also specify custom cache regions. You can do this in two different ways:
- Specify this as a cache region in your Hibernate cfg or using Hibernate annotations
e.g. @Cache(region = "Account")
- Specify the cache region in your Hibernate domain configuration
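The two options above can be sketched in configuration as follows. The class name reuses the com.company.domain.Account example; the table name, the read-write strategy, and the sizing values are assumptions chosen purely for illustration.

```xml
<!-- Option 1: hibernate.cfg.xml, inside <session-factory> -->
<class-cache class="com.company.domain.Account" usage="read-write" region="Account"/>

<!-- Option 2: the domain mapping file (Account.hbm.xml) -->
<class name="com.company.domain.Account" table="ACCOUNT">
    <cache usage="read-write" region="Account"/>
    <id name="id"/>
</class>

<!-- ehcache.xml: a cache whose name matches the region, with its own settings -->
<cache name="Account" maxEntriesLocalHeap="10000" eternal="false" timeToLiveSeconds="300"/>
```

Use one of the two options, not both. If no cache named "Account" exists in ehcache.xml, the region falls back to the default cache settings.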
Note that if you use the query cache, Hibernate creates two caches internally for its purpose: StandardQueryCache and UpdateTimestampsCache.
The StandardQueryCache has the query that is executed as part of the key itself. The UpdateTimestampsCache is used to track the timestamps for updates to particular tables.
Note that if you want to cluster your query cache, you will need to add both of these as separate cache regions in your ehcache.xml and cluster them using the <terracotta/> tag.
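A sketch of those two regions in ehcache.xml is shown below. The fully qualified names are the Hibernate 3.x ones; later Hibernate versions moved these classes to other packages, so treat the names and the sizing values as assumptions to verify against your version.

```xml
<!-- ehcache.xml fragment: query cache regions, each clustered with <terracotta/> -->
<cache name="org.hibernate.cache.StandardQueryCache"
       maxEntriesLocalHeap="500" eternal="false" timeToLiveSeconds="120">
    <terracotta/>
</cache>

<!-- Timestamps should outlive cached query results, so keep this region eternal -->
<cache name="org.hibernate.cache.UpdateTimestampsCache"
       maxEntriesLocalHeap="5000" eternal="true">
    <terracotta/>
</cache>
```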
4. My application is not using Hibernate, so why do I get the error java.lang.NoClassDefFoundError: org/hibernate/cache/CacheKey ?
You probably have different applications loaded by the same classloader, one of which uses Hibernate. Separate your CacheManager configuration (ehcache.xml) as follows:
a) All Hibernate-related objects that require Ehcache as a second-level cache should be defined in ehcache.xml.
b) Plain old Ehcache objects will be defined in ehcache-nonHibernate.xml. Use new CacheManager("ehcache-nonHibernate.xml"), which takes the path to the configuration file, to get a reference to this CacheManager.
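One way to lay out that split is sketched below. The CacheManager names and the referenceData cache are hypothetical; the net.sf.ehcache.configurationResourceName property is how the Ehcache integration can be told which file to load for Hibernate.

```xml
<!-- hibernate.cfg.xml: point Hibernate's Ehcache integration at its own file -->
<property name="net.sf.ehcache.configurationResourceName">/ehcache.xml</property>

<!-- ehcache.xml: regions used only as Hibernate second-level caches -->
<ehcache name="hibernateCacheManager">
    <defaultCache maxEntriesLocalHeap="10000" eternal="false" timeToLiveSeconds="300"/>
</ehcache>

<!-- ehcache-nonHibernate.xml: plain Ehcache caches, no Hibernate keys involved -->
<ehcache name="plainCacheManager">
    <cache name="referenceData" maxEntriesLocalHeap="1000" eternal="true"/>
</ehcache>
```

Giving each CacheManager a distinct name matters with recent Ehcache 2.x releases, which refuse to create two CacheManagers with the same name in one JVM.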
5. How do I evict specific cache regions when I execute my hql?
Hibernate allows you to specify a synchronize tag within the class. This lets you specify the table(s) you are updating, and it will only clear the cache for the specified table(s). If you do not specify any table(s), it will clear the cache for all tables.
Here is an example from the Hibernate forums showing how this is accomplished:
select item.name, max(bid.amount), count(*)
from item
join bid on bid.item_id = item.id
group by item.name
This query accesses both item and bid, but since it is native SQL, Hibernate has no idea which tables/entities are being touched. Synchronize informs Hibernate so it can deal with possible flush initiation, caching, etc., depending on how summary is being used.
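In mapping terms, a query like the one above can sit in a subselect-mapped class with one synchronize tag per table. This sketch is modeled on the Summary example in the Hibernate reference documentation; the choice of name as the id property is an assumption for illustration.

```xml
<!-- hbm.xml fragment: a read-only view over item and bid -->
<class name="Summary">
    <subselect>
        select item.name, max(bid.amount), count(*)
        from item
        join bid on bid.item_id = item.id
        group by item.name
    </subselect>
    <!-- Tell Hibernate which tables this native SQL touches -->
    <synchronize table="item"/>
    <synchronize table="bid"/>
    <id name="name"/>
</class>
```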
6. How do I use Hibernate criteria, and what is the Native SQL “gotcha”?
To take advantage of the second-level cache, you will need to use Hibernate criteria. To take advantage of the query cache, you can use HQL.
If you run a stored procedure or issue an executeUpdate or execute Native SQL, there are two side effects of which you should be aware:
1. The second-level cache will not be used
2. The second-level cache will be completely purged in certain circumstances
Whenever a Query.executeUpdate() is run, for example, Hibernate invalidates affected cache regions (those corresponding to affected database tables) to ensure that no stale data is cached. This should also happen whenever stored procedures are executed.
Furthermore, if you run Native SQL through Hibernate, your entire second-level cache will be purged.
7. What about object lifecycles and read/write modes?
Hibernate likes to control the entire lifecycle of the object, from inception to destruction.
In read/write mode, Hibernate considers itself the owner of data, and tries to provide high-consistency guarantees. Without getting too technical here, let’s just say that Hibernate takes charge of all lock management and transaction management in this mode. As a result, if you need to enable rejoin/non-stop, you can only do so under the nonstrict read/write mode (nonstrict-read-write).
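A rough sketch of that combination follows. The element and attribute names are from the Ehcache 2.x schema as I understand it; the server address, timeout, and sizing values are placeholders, so verify everything against the documentation for your version.

```xml
<!-- Hibernate mapping (hbm.xml): choose the nonstrict read/write strategy -->
<class name="com.company.domain.Account">
    <cache usage="nonstrict-read-write"/>
    <id name="id"/>
</class>

<!-- ehcache.xml: rejoin on the Terracotta connection, nonstop on the region -->
<ehcache>
    <terracottaConfig url="localhost:9510" rejoin="true"/>
    <cache name="com.company.domain.Account"
           maxEntriesLocalHeap="10000" eternal="false" timeToLiveSeconds="300">
        <terracotta>
            <!-- Give up on a blocked cache operation after 30 seconds -->
            <nonstop timeoutMillis="30000">
                <timeoutBehavior type="noop"/>
            </nonstop>
        </terracotta>
    </cache>
</ehcache>
```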
8. How do I cluster and enable BigMemory when using Hibernate?
This is simple! It takes just two lines of config to cluster using Terracotta:
- Specify the Terracotta config URL (that is, where the Terracotta Server is deployed)
- Specify which cache regions need to be clustered (use the <terracotta/> tag)
You can continue to use ARC (Automatic Resource Control) with Hibernate: http://ehcache.org/documentation/2.5/arc/index
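The two clustering steps listed above can be sketched in ehcache.xml like this. The server address and sizing values are assumptions; the region name reuses the Account example from earlier.

```xml
<ehcache>
    <!-- Step 1: where the Terracotta Server is deployed (host:port is a placeholder) -->
    <terracottaConfig url="localhost:9510"/>

    <!-- Step 2: mark each region that should be clustered -->
    <cache name="com.company.domain.Account"
           maxEntriesLocalHeap="10000" eternal="false" timeToLiveSeconds="300">
        <terracotta/>
    </cache>
</ehcache>
```

Regions without the <terracotta/> tag stay local to each JVM.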
9. How do I cluster cache regions across multiple Hibernate session factories?
You don’t need to do anything. Just cluster the cache region as above and you will be set. Having separate Hibernate session factories should not matter. What matters is that the region resolves to the same combination of CacheManager name and cache region name in each factory.
10. Can I use write-behind with Hibernate?
While you cannot configure a cacheWriter to work with Hibernate (due to the transaction semantics identified above), you can achieve the same result using Ehcache putWithWriter and Ehcache write-behind. Use Hibernate as part of your CacheWriter implementation to define your persistence strategy. More documentation is here: http://ehcache.org/documentation/apis/write-through-caching
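As a sketch, a write-behind cache is declared in ehcache.xml along these lines. The cache name and factory class are hypothetical: the factory would return your own CacheWriter implementation, which in turn uses Hibernate to persist entries. The tuning values are illustrative only.

```xml
<!-- ehcache.xml fragment: a plain Ehcache cache with a write-behind writer -->
<cache name="accountWriteBehind" maxEntriesLocalHeap="10000" eternal="false">
    <!-- Queue writes and flush them asynchronously in batches -->
    <cacheWriter writeMode="write-behind" maxWriteDelay="5"
                 writeBatching="true" writeBatchSize="20">
        <!-- Hypothetical factory that builds a Hibernate-backed CacheWriter -->
        <cacheWriterFactory class="com.company.cache.HibernateCacheWriterFactory"/>
    </cacheWriter>
</cache>
```

Application code then calls putWithWriter on this cache instead of put, so that each update is handed to the writer.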
I hope these tips contribute to making your experience with Hibernate and BigMemory easier and more fruitful. If you have additional questions, please post them to the comments.