Improve performance for SQL query reading a file with millions of records (AS/400 – SQLRPGLE program)

65 pts.
Tags:
AS/400
AS400 - SQL Queries
SQLRPGLE
I am fetching only 14 records at a time from a customer file (a physical file) that has millions of records. The file has 46 fields. The fetch is done in the following manner, but it takes a long time:

SELECT * FROM  CUSTOMER A WHERE A.CLINM1 LIKE 'B%' AND (CLINM1 <> '' OR  CLIPR1 <> '') ORDER BY A.CLINM1 ASC, A.CLIPR1 ASC

This is just one example, used to shortlist clients whose names start with "B". There can be different selection criteria based on different fields from the customer file.

How can I improve the performance?

One more thing I noticed: for any selection criteria, the first fetch takes longer than subsequent fetches.



Software/Hardware used:
AS/400 - SQLRPGLE program
ASKED: December 23, 2011  5:23 AM
UPDATED: March 17, 2012  6:00 AM

Answer Wiki


I have a 210 million+ record file that gets approximately 1 million records added per day.
Have indexes built over common access paths.

Fairly sure that for fetch 2 through however many, the access path is cached from the first query.
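For example, a radix index over one of the common access paths might look like this (the index name is hypothetical):

    CREATE INDEX CUSTNAMIX ON CUSTOMER (CLINM1, CLIPR1)

With that in place, the optimizer can use the index both for selection on CLINM1 and to satisfy the ORDER BY without a separate sort.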

Discuss This Question: 26  Replies

 
  • TomLiotta
    There can be different selection criteria... Please show the various selection criteria. There's no way to know what might be good methods without knowing the range of possibilities. Is this expected to be a dynamic WHERE clause? Tom
    125,585 pointsBadges:
    report
  • AnjaliR
Hello Tom, the WHERE clause is dynamic. I mean that based on the search criteria (client code, phone number, email ID, etc.), the conditions in the WHERE clause keep changing.
    65 pointsBadges:
    report
  • BigKat
As it is dynamic selection criteria, query performance might be improved by:
• Making sure you have indexes that cover all your selection criteria possibilities (at least the reasonable combinations).
• Taking advantage of Encoded Vector Indexes (EVIs) for fields that have a fixed number of values, like state, gender, yes/no attributes, or code types.
• Avoiding LIKE where possible; LIKE criteria are very costly. If you are always doing a 'first character is x' type of search with LIKE ('x%'), consider adding a field populated with the first character (either by your data maintenance programs or by a trigger) and create an Encoded Vector Index over it, as in the sketch below.
Note: improving query performance can cost you in data maintenance performance (more indexes to update, triggers to run, etc.).
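A minimal sketch of the first-character idea (the CLINMF column and the index name are hypothetical additions, not existing objects):

    -- new column holding the first character of CLINM1,
    -- kept current by the data maintenance programs or a trigger
    ALTER TABLE CUSTOMER ADD COLUMN CLINMF CHAR(1)

    CREATE ENCODED VECTOR INDEX CUSTEVI1 ON CUSTOMER (CLINMF)

    -- the LIKE 'B%' test then becomes an equality test the EVI can satisfy
    SELECT * FROM CUSTOMER WHERE CLINMF = 'B'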
    7,585 pointsBadges:
    report
  • BigKat
Also, if you always want to fetch up to a fixed number of rows for a process, you might get some performance benefit by doing a multi-row fetch and getting all 14 (or as many as meet your criteria, if fewer exist) in one shot, as sketched below.
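A minimal SQLRPGLE sketch of a 14-row multi-row fetch (the cursor name, data structure, and field sizes are hypothetical):

    dcl-ds custRow qualified dim(14);              // host array for up to 14 rows
      clinm1 char(30);                             // assumed field sizes
      clipr1 char(30);
    end-ds;
    dcl-s rowsFetched int(10);

    exec sql DECLARE C1 CURSOR FOR
             SELECT CLINM1, CLIPR1 FROM CUSTOMER
             WHERE CLINM1 LIKE 'B%'
             ORDER BY CLINM1, CLIPR1;
    exec sql OPEN C1;
    exec sql FETCH NEXT FROM C1 FOR 14 ROWS INTO :custRow;   // one call, up to 14 rows
    exec sql GET DIAGNOSTICS :rowsFetched = ROW_COUNT;       // rows actually returned
    exec sql CLOSE C1;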
    7,585 pointsBadges:
    report
  • deepu9321
Hi Anjali, you can try creating indexes as BigKat suggested, but you might need to think about the added complexity of maintaining the program. Also, you mentioned that the first fetch statement takes much longer; I suspect the time you observed includes both the fetch and building the cursor. And I don't think you need all 46 fields in your program. Select only the fields you actually require; this will save time on your fetch statement. Pradeep.
    3,605 pointsBadges:
    report
  • philpl1jb
    How long does the first fetch take? How long for a subsequent fetch? Phil
    48,575 pointsBadges:
    report
  • TomLiotta
WHERE clause is dynamic. Each potential WHERE clause needs different "optimization". BigKat's suggestions are about as good as you're going to get: set up indexes that will be useful. Pradeep's suggestion of replacing SELECT * with SELECT col1, col2, ... is also a good idea. Only select columns that you actually use; that's good practice even if you really do want all 46 columns. Be aware that some WHERE clauses won't make total sense. E.g.:
WHERE A.CLINM1 LIKE 'B%' AND (CLINM1 <> '' OR CLIPR1 <> '')
That has CLINM1 <> '' in a parenthesized OR condition. If CLIPR1 <> '' is TRUE, then it doesn't matter for the parenthesized condition whether CLINM1 <> '' is TRUE or not. And it somewhat conflicts with CLINM1 LIKE 'B%': testing for CLINM1 LIKE 'B%' and for CLINM1 <> '' at the same time is effectively redundant, especially when CLIPR1 <> '' is TRUE. Without knowing how the WHERE clause gets constructed, I don't know of any way to ensure good performance. If users get to specify the WHERE clause, there might not be much hope. Tom
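For instance, since CLINM1 LIKE 'B%' already guarantees CLINM1 <> '', the sample WHERE clause reduces to the LIKE test alone, and the column list can be trimmed to what the program actually uses (the two-column list here is only illustrative):

    SELECT CLINM1, CLIPR1
      FROM CUSTOMER
      WHERE CLINM1 LIKE 'B%'
      ORDER BY CLINM1, CLIPR1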
    125,585 pointsBadges:
    report
  • graybeard52
    You may want to try adding this clause to your SQL:
    OPTIMIZE FOR 14 ROWS 
This will cause the SQL optimizer to consider what's the fastest way to return the first few records, rather than the fastest way to return all the records the query might produce.
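Applied to the original statement, that would look like this:

    SELECT * FROM CUSTOMER A
      WHERE A.CLINM1 LIKE 'B%' AND (CLINM1 <> '' OR CLIPR1 <> '')
      ORDER BY A.CLINM1 ASC, A.CLIPR1 ASC
      OPTIMIZE FOR 14 ROWS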
    3,115 pointsBadges:
    report
  • AnjaliR
The current process is like this. There are two programs, A and B. Program A is a subfile program that displays 10 client records on screen; it also has a screen for the customer short list. When short-list fields are entered in program A, it calls program B to do the selection of client records.
Program B does the following:
- Uses two customer files, a client header file and a client detail file, for record selection.
- Prepares and opens the cursor, then fetches only 14 records from the client header file; the WHERE clause can reference both client files for selection. The fetched records are written to a work file.
- The fetched records are sent to program A through a DTAQ.
Program A then reads the client header data from the DTAQ, gets the other information from the client detail file, and loads the subfile.
Hello BigKat/Tom, the following fields are used for record selection: customer number, customer name, surname, zip code, client creation date, modification date, creation store, modification store, client type (around 10 client types exist), nationality, country, phone number, email, age, and passport. Of these, client phone and email come from the client detail file; the rest are from the client header file. I have around 11 index files already present on the client header file, but all of their key fields are currently used in other RPG programs. So I should create 2 more indexes: one index to cover the client header fields and another to cover the client detail fields. Please advise.
Hi Pradeep, as you said, the first fetch for a given selection takes time because it has to prepare and open the cursor, but the next fetch (when we page down on the subfile) does not take more time, because at that point only the FETCH is executed. I tried selecting only a few fields from the client header file but did not find much difference in response time. I have around 11 index files already present on the client header file, but all the indexes are used by other programs, as mentioned above.
    65 pointsBadges:
    report
  • philpl1jb
I think there are tools to help you find the indexes used and/or created while the program runs. Phil
    48,575 pointsBadges:
    report
  • deepu9321
Hi Anjali, how is it going? Have you created the two additional indexes, or found any similar existing indexes? Did you find any improvement in performance by using them? Pradeep.
    3,605 pointsBadges:
    report
  • AnjaliR
Hello all, as I mentioned, some indexes already exist in my application, so I created one index to cover the fields not covered by the existing indexes, but the program still uses one of the existing indexes. The newly created index is not used, for reason code 4/5 (keys don't match, or it is too costly). If we search only with CLINM1 LIKE 'A%', index CLIENT1 is used, which has CLINM1 and CLIPR1 as key fields, but the search takes almost 1 minute to display 14 records. The SQL query remains the same as I posted before. I mean, if I enter the client name as "A", then 1 minute is taken just to display 14 records.
    65 pointsBadges:
    report
  • philpl1jb
I doubt you're going to do better than 1 minute on CLINM1 LIKE 'A%'. You might get a little better performance with WHERE CLINM1 BETWEEN 'A' AND 'B'. The system is going to look for an index where the first, second, and third fields are useful to its logic, then first and second, then first field only. It cannot use an index where the third field is useful but the first and second aren't, nor can it combine two indexes where only the first fields are useful. Phil
    48,575 pointsBadges:
    report
  • TomLiotta
If your LIKE condition always has the form 'A%', where the wildcard is at the right-most end, then use BETWEEN instead. Replace the wildcard with a low value in the first expression and a high value in the second expression. The result becomes BETWEEN 'Al' AND 'Ah', with the lower-case 'l' standing for *LOVAL and the lower-case 'h' standing for *HIVAL. You can scan for the position of the '%' character. It might not help much, but it should be better than using LIKE when it's not really necessary. Tom
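A small RPGLE sketch of building those bounds from a user-entered prefix (all names are hypothetical, and the prefix is assumed to be shorter than the bound fields):

    dcl-s prefix  varchar(30) inz('A');            // whatever precedes the '%'
    dcl-s loBound char(30);
    dcl-s hiBound char(30);

    loBound = prefix;
    %subst(loBound : %len(prefix) + 1) = *LOVAL;   // pad the rest with low values
    hiBound = prefix;
    %subst(hiBound : %len(prefix) + 1) = *HIVAL;   // pad the rest with high values

    exec sql DECLARE C2 CURSOR FOR
             SELECT CLINM1, CLIPR1 FROM CUSTOMER
             WHERE CLINM1 BETWEEN :loBound AND :hiBound
             ORDER BY CLINM1, CLIPR1;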
    125,585 pointsBadges:
    report
  • BigKat
Also, when I said to build indexes to cover all your selection possibilities, I didn't mean ONE monster index with 14 keys, but more like 14 indexes with one key each, and maybe a handful more with some common two- or three-key combinations.
    7,585 pointsBadges:
    report
  • AnjaliR
What I noticed is that if I remove the ORDER BY clause, the query runs very fast, but I need to display the records on the screen in order.
    65 pointsBadges:
    report
  • philpl1jb
Try this format; it should pick the records first in the subselect, and the ORDER BY will only apply to the results.
SELECT * FROM
  (SELECT * FROM CUSTOMER A
     WHERE A.CLINM1 LIKE 'B%' AND (CLINM1 <> '' OR CLIPR1 <> '')) B
ORDER BY B.CLINM1 ASC, B.CLIPR1 ASC
    48,575 pointsBadges:
    report
  • philpl1jb
But the temporary indexes may remain available for a while and throw off subsequent tests.
    48,575 pointsBadges:
    report
  • AnjaliR
Hello Philip, what is the "b" specified at the end of the WHERE clause?
WHERE A.CLINM1 LIKE 'B%' AND (CLINM1 <> '' OR CLIPR1 <> '')) b
    65 pointsBadges:
    report
  • philpl1jb
You can read the primary select as:
SELECT * FROM b ORDER BY B.CLINM1 ASC, B.CLIPR1 ASC
This will only have to sort the few records provided by the subquery. b is the result of the subquery:
(SELECT * FROM CUSTOMER A WHERE A.CLINM1 LIKE 'B%' AND (CLINM1 <> '' OR CLIPR1 <> '')) AS b
As we've discussed, the subquery performance might be improved by using BETWEEN rather than LIKE. It's hard to imagine that the SQL processor would be programmed to determine that LIKE values beginning with a character can be found with the index, so you might get better performance with A.CLINM1 BETWEEN 'B' AND 'BZZZZ'. Phil
    48,575 pointsBadges:
    report
  • TomLiotta
"It's hard to imagine that the SQL processor would be programmed to determine that LIKE values beginning with a character can be found with the index." That was a possibility that occurred to me, since LIKE can have so many variations (practically unlimited). But isolating the sub-select from the ORDER BY clause could have the biggest impact if the ORDER BY is already known to be a big performance drag. Tom
    125,585 pointsBadges:
    report
  • philpl1jb
    Tom Did you just agree with me? Robin: You can't get away from Batman that easy! Batman: Easily. Robin: Easily. Batman: Good grammar is essential, Robin. Phil
    48,575 pointsBadges:
    report
  • TomLiotta
"...more like 14 indexes with one key each, and maybe a handful more with some common two or three key combinations." And if any two- or three-key combinations are created, either don't create additional single-key indexes that match their first keys, or do create the combinations first. An existing access path can be shared if it fully includes a new index; otherwise, separate access paths will be created and maintained with no real value. A sketch follows. (Note that EVIs don't participate in shared access paths, but that's not their purpose anyway.) "Did you just agree with me?" @Phil: Yes. I usually do, though there's often no point in adding a comment about it. But it becomes more important to say so, IMO, as a thread gets complex. Tom
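A sketch of the creation order Tom describes (index names are hypothetical):

    -- create the two-key combination first...
    CREATE INDEX CUSTIX1 ON CUSTOMER (CLINM1, CLIPR1)

    -- ...so a later single-key index over the same leading key can share
    -- the existing access path instead of building and maintaining its own
    CREATE INDEX CUSTIX2 ON CUSTOMER (CLINM1)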
    125,585 pointsBadges:
    report
  • philpl1jb
I've designed the "unlimited" search screen in a number of environments for a number of valid situations. By the time I had built the query, I had at least an idea of how long the various queries would take. If it was a 1-minute query, I'd flash a message: "This will take about 1 minute, please wait." If I suspected it would take longer, I might display a question: "Do you want to continue? The search might take 2 hours (Y/N)." For the really long ones, you could run the query in batch and send a message to the user when it's ready, with a key value to recover the data. Phil
    48,575 pointsBadges:
    report
  • AnjaliR
Thank you all for the tips/suggestions. Response time for my search is now improved with the following changes:
- I have modified the SQL queries, using sub-queries instead of one single query when two files are involved.
- Reviewed the SQL statements to remove unwanted conditions.
- Use of the OPTIMIZE FOR n ROWS clause.
- Use of native RPGLE READ when selection is done on either name or surname.
- Use of the query format suggested by Phil for the ORDER BY condition.
Regards, Anjali
    65 pointsBadges:
    report
  • philpl1jb
Thanks for the feedback; it will be valuable to others finding this thread.
    48,575 pointsBadges:
    report
