Drilling deeper into VSAM RLS with a focus on performance
As discussed earlier, VSAM RLS, which enables simultaneous access to VSAM files across multiple CICS regions and, to a limited extent, batch jobs, is quite relevant today. To get good performance out of VSAM RLS, it is necessary to understand better how it works and also the parameters that can be used for tuning.
VSAM RLS performance and tuning involves focusing on the performance path in RLS – referred to as the VSAM RLS I/O path. The VSAM RLS I/O path has four main components:
1. VSAM record management (VRM) – VRM provides the VSAM macro interfaces, such as GET, PUT, POINT and ERASE, through which the application communicates with RLS. When we code one of these macro interfaces, the parameters are passed to VRM in an RPL control block.
2. Storage Management Locking Services (SMLS) – SMLS interfaces with VRM above and with XCF, the MVS component which provides locking services in the coupling facility. The lock structure in the coupling facility is called IGWLOCK000. XCF is called to obtain, release or alter locks.
3. Sysplex Cache Manager (SCM) – SCM calls XCF caching services to obtain directory elements, or to read from or write to the cache structure in the coupling facility. SCM also interfaces with the next component, BMF.
4. Buffer Manager Facility (BMF) – BMF sits between VRM and SCM, locating buffers and moving them into the local buffer pool if necessary. When BMF reads buffers into the buffer pool, those buffers are kept there even after the data sets are closed. This is a powerful function, as it enables the reuse of previously read buffers when the data sets are re-opened, and the impact is significant when millions of records are involved. BMF manages the buffer space using a Least Recently Used (LRU) manager.
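The buffer-retention behavior of an LRU manager can be pictured with a minimal sketch. This is a toy model in Python, purely illustrative – the class, method names and eviction policy details here are invented for the example and do not correspond to the real SMSVSAM internals:

```python
from collections import OrderedDict

class LruBufferPool:
    """Toy model of an LRU-managed buffer pool (illustrative only)."""
    def __init__(self, capacity):
        self.capacity = capacity          # maximum number of buffers kept
        self.buffers = OrderedDict()      # CI id -> CI contents

    def get(self, ci_id):
        """Return a buffered CI and mark it most recently used, else None."""
        if ci_id not in self.buffers:
            return None                   # buffer miss: caller goes to cache/DASD
        self.buffers.move_to_end(ci_id)   # refresh recency on a hit
        return self.buffers[ci_id]

    def put(self, ci_id, data):
        """Add a CI, evicting the least recently used one if the pool is full."""
        self.buffers[ci_id] = data
        self.buffers.move_to_end(ci_id)
        if len(self.buffers) > self.capacity:
            self.buffers.popitem(last=False)   # evict the LRU buffer

pool = LruBufferPool(capacity=2)
pool.put("CI-1", b"recs-1")
pool.put("CI-2", b"recs-2")
pool.get("CI-1")              # CI-1 becomes most recently used
pool.put("CI-3", b"recs-3")   # pool full: CI-2, the least recently used, is evicted
```

The point of the sketch is simply that frequently referenced CIs stay resident while stale ones age out – which is why a high re-reference rate makes BMF's buffer retention across close/re-open so effective.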
The four subcomponents that make up the I/O path typically behave as follows:
- When an application issues a VSAM request (GET or PUT), VRM gets control.
- VRM then calls SMLS to perform the record-locking request.
- SMLS in turn calls XCF to obtain a lock on the particular record.
- BMF is then called to find buffers for the request.
- If the record is not found in the local buffer pool, SCM is invoked to check whether it is in the cache.
- If the record is not in the cache either, it is retrieved from DASD (the actual I/O happens).
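The lookup order above can be sketched as a simple decision flow. This is conceptual Python only; the function name, the dictionary stand-ins for the buffer pool, CF cache and DASD, and the omission of the locking call are all simplifications for illustration, not real SMSVSAM interfaces:

```python
def rls_get(record_key, buffer_pool, cf_cache, dasd):
    """Illustrative sketch of the RLS read path: buffer pool -> CF cache -> DASD.
    All four arguments are hypothetical stand-ins (plain dicts)."""
    # In the real path, SMLS is called first (via XCF) to lock the record;
    # that step is omitted here.
    if record_key in buffer_pool:             # BMF: local buffer hit
        return buffer_pool[record_key], "buffer-pool"
    if record_key in cf_cache:                # SCM: coupling facility cache hit
        buffer_pool[record_key] = cf_cache[record_key]
        return cf_cache[record_key], "cf-cache"
    data = dasd[record_key]                   # miss everywhere: real I/O to DASD
    buffer_pool[record_key] = data            # keep it buffered for next time
    cf_cache[record_key] = data
    return data, "dasd"

dasd = {"K1": "rec1", "K2": "rec2"}
bp, cache = {}, {}
rls_get("K1", bp, cache, dasd)    # first read: comes from DASD
rls_get("K1", bp, cache, dasd)    # second read: local buffer hit
```

The sketch makes the tuning goal concrete: every request you can satisfy at the first branch avoids both the CF access and the DASD I/O.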
Obviously, the shortest path leads to the best performance. Skipping the last step (which involves retrieval from DASD) results in significant performance improvements: elapsed time improves by around one hundred times when you get a buffer hit. IBM test data shows the following results:
- GET request in which all CIs were found in the local buffer pool: .0001xx – .0002xx seconds
- GET request in which at least one CI is read from DASD: .01xxxx – .02xxxx seconds
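Those figures make it easy to estimate how the buffer-hit ratio drives average GET time. A back-of-the-envelope calculation in Python – the two latency constants are mid-points I have taken from the IBM ranges above, so treat the absolute numbers as illustrative:

```python
# Approximate per-GET elapsed times, taken as mid-points of the IBM figures above
T_HIT = 0.00015   # seconds: all CIs found in the local buffer pool
T_MISS = 0.015    # seconds: at least one CI read from DASD

def avg_get_time(hit_ratio):
    """Weighted average elapsed time per GET for a given buffer-hit ratio."""
    return hit_ratio * T_HIT + (1 - hit_ratio) * T_MISS

avg_get_time(0.90)   # roughly 1.6 ms per GET
avg_get_time(0.99)   # roughly 0.3 ms per GET
```

Because a miss is about 100 times slower than a hit, even a modest rise in hit ratio (90% to 99% here) cuts the average GET time by roughly a factor of five – which is why buffer pool sizing dominates RLS tuning.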
Details of the whole path are clearly elaborated in: http://publib.boulder.ibm.com/infocenter/ieduasst/stgv1r0/topic/com.ibm.iea.zos/zos/1.0/DFSMS/zOSV1R0_DFSMS_RLSPerformance_Tuning_Overview.pdf
Using the SMF type 64 and type 42 records as well as RMF reports, you can collect data on the current performance of VSAM RLS. SMF 42 records have five relevant subtypes, 15 to 19, each keeping different RLS statistics:
- Subtype 15 keeps statistics by storage class.
- Subtype 16 keeps statistics for each individual data set.
- Subtype 17 keeps locking statistics, which relate to the SMLS component.
- Subtype 18 keeps caching statistics, which relate to the SCM component.
- Subtype 19 keeps BMF statistics, including LRU activity.
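For quick reference when postprocessing SMF data, the subtype-to-area mapping above can be kept as a small lookup table. This Python snippet is only a documentation aid built from the list above, not an SMF record parser:

```python
# SMF type 42 subtypes relevant to VSAM RLS (from the list above)
SMF42_RLS_SUBTYPES = {
    15: "RLS statistics by storage class",
    16: "RLS statistics by individual data set",
    17: "RLS locking statistics (SMLS)",
    18: "RLS caching statistics (SCM)",
    19: "BMF statistics, including LRU activity",
}

def describe_subtype(subtype):
    """Return a short description of an RLS-related SMF 42 subtype."""
    return SMF42_RLS_SUBTYPES.get(subtype, "not an RLS-related subtype")
```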
Tuning the RLS parameters described below based on the application's needs can result in significant performance improvements.
At the system level, RLS keeps all its parameters in the IGDSMSxx member of SYS1.PARMLIB.
- RLS_MAX_POOL_SIZE specifies the maximum size of the SMSVSAM local buffer pool; the default is 100 MB. It is a sysplex-wide parameter. SMSVSAM attempts not to exceed the buffer pool size specified, although more storage might be used temporarily.
- RlsAboveTheBarMaxPoolSize specifies the total size of the buffer management facility (BMF) pool above the 2-gigabyte bar. This parameter can be specified for each system, enabling different systems in a sysplex to have different values.
- RlsFixedPoolSize specifies the amount of real storage (both above and below the 2-gigabyte bar) to be dedicated to VSAM RLS buffering. It page-fixes buffers – which can be used for critical data – so that they are not paged out. Using fixed buffers provides a significant performance improvement.
- RLS_MaxCFFeatureLevel specifies the method VSAM RLS caching uses to determine the size of the data placed in the CF cache structure. A value of Z indicates that RLS caches only CIs smaller than 4096 bytes; this option saves space in the CF cache structures and is useful if the data is read-only and remains valid in the local buffer pool. A value of A indicates that RLS caches CIs up to 32768 bytes; this option requires more space in the RLS CF cache structures and is useful when shared data is updated across the sysplex.
- DEADLOCK_DETECTION specifies the interval for detecting deadlocks between systems.
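Put together, entries for these system-level settings in an IGDSMSxx member might look something like the sketch below. The values are purely illustrative, not recommendations, and the exact syntax (separators, value ranges, per-system lists) should be checked against the DFSMS documentation for your z/OS level:

```
RLS_MAX_POOL_SIZE(850)
RLSABOVETHEBARMAXPOOLSIZE(SYSA,2000;SYSB,4000)
RLSFIXEDPOOLSIZE(1000)
RLS_MAXCFFEATURELEVEL(A)
DEADLOCK_DETECTION(15,4)
```

Here the pool sizes are in megabytes, the RLSABOVETHEBARMAXPOOLSIZE line gives two systems different 64-bit pool sizes, and the two DEADLOCK_DETECTION values are the local detection interval and the global detection cycle.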
At the data set level, the following two parameters, which can be specified in the data class, are relevant:
1. RLSAboveTheBar specifies whether the SMSVSAM address space can take advantage of 64-bit addressable virtual storage during VSAM RLS buffering. A value of YES is recommended for high-volume applications for best performance.
2. RLSCFCACHE specifies how much of the data is cached and can have the following values. ALL (the default) specifies that RLS caches both the data and index CIs. NONE indicates that only the index CIs are cached; the index CI, which is always searched, is more important than the data CI and hence has a higher priority. If you need to conserve space in the coupling facility, this option of caching the index alone can be used. UPDATESONLY indicates that only WRITE requests are placed in the cache structure.
While it is obvious that the higher the buffer size, the better the performance, consideration should be given to the amount of real storage available. If the available real storage is limited and the buffers specified are very large, frequent paging may occur.
Understanding how the LRU manager works is useful for setting these buffer sizes at an optimal level. Refer to http://publib.boulder.ibm.com/infocenter/ieduasst/stgv1r0/topic/com.ibm.iea.zos/zos/1.0/DFSMS/zOSV1R0_DFSMS_RLSPerformance_Tuning.pdf.
As summarized by IBM:
- The VSAM “I/O” read path can see 100 times improvement when valid buffers are located in the local buffer pool.
- VSAM RLS 64-bit buffering allows for larger local buffer pools and increased buffer hits.
- Whether it is a 31-bit or 64-bit pool, the RLS cache structures must be increased to accommodate the larger local buffer pool sizes.
- Adequate real storage must be available to accommodate the larger local buffer pool sizes.
In addition to this, applications can do their bit for performance by coding their requests appropriately.
At the RPL level, you can use OPTCD to specify whether the request is asynchronous (running in SRB mode) or synchronous (TCB mode). Asynchronous requests are better from a performance point of view, as SRBs cannot be interrupted as readily and hence run quicker. Note that for asynchronous requests, the CPU time of the request is recorded as SRB time and charged to the client address space, not to SMSVSAM (an important factor for cost distribution).
Similarly, performance is affected by whether the read is direct or sequential. For direct requests, RLS must search the index each time to find the data CI, which makes them more costly than a sequential read or a horizontal read through the index.
When issuing a GET or PUT, or while opening the data set, you can use the RLSREAD parameter to indicate the read integrity required. If read integrity is not required, specify NRI (No Read Integrity), which skips the step of calling SMLS for locking and hence provides improved performance.
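Putting the RPL-level options together, an assembler sketch of such a request might look as follows. The labels (MYRPL, MYACB, RECAREA), the record length, and the continuation-column layout are placeholders, and the operand mix is illustrative only; for a synchronous request with read integrity, SYN and a consistent-read option would be coded instead of ASY and NRI:

```
*        Illustrative RPL for an RLS read: keyed, direct access,
*        asynchronous, no read integrity.
MYRPL    RPL   ACB=MYACB,                                              X
               AREA=RECAREA,AREALEN=200,                               X
               OPTCD=(KEY,DIR,ASY,NRI)
```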