Database Issues – Trying to prove that something has to be done

470 pts.
Tags:
AS/400
Database
Database issues
IBM iSeries
Indexes
iSeries Navigator
Logical Files
PF
SQL
SQL tables
Hello everyone, I'm currently trying to prove a concept: why we should re-create our old existing PFs as SQL-defined tables (portability, performance, etc.). After taking a look at some files, I noticed an even larger problem that could ruin my chance of proving the concept. Unless some major changes happen, we won't see any benefit from the change.

Here's my situation: 88 million records with over 150 logical files over one physical file. Am I crazy, or is that just ridiculous? Apparently this isn't the only file; I've found at least 5 or 6 more with the same problem. Some of our smaller files average 20 - 30 logical files over their data. Management doesn't seem to think this is a major problem, and they don't want us to spend a lot of time on it.

So, what I'm asking for is opinions and ideas on how to prove that spending the time (and it will be a LOT of time) to analyze our database and get rid of all these problems will be worth it. Here's my idea for an approach to each file:
  • Analyze all of the logical files and remove any not currently used (one way to check usage is sketched below)
  • Check all programs using the logical files and remove if no longer used
  • Identify programs that should be re-written to eliminate logical files
  • Re-write said programs
  • Create history files to reduce the number of records
  • Convert PFs and LFs to tables and indexes
Just taking a quick look at the indexes (LFs) using iSeries Navigator, I found a lot of information. One thing I noticed in the "Last Build" column was that many were last built in 2002. I'm guessing that even if I just rebuild those logical files we would see a performance boost. But what counts as rebuilding? Re-compiling? I apologize for the lengthy post. I have a lot more questions, but I just wanted to get a feel for some ideas. Thank you, Dave
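To start on the first list item, a minimal sketch, assuming a release recent enough to ship the QSYS2.OBJECT_STATISTICS service; MYLIB is a placeholder library name:

    -- Sketch only: list the LFs in a library with last-used information.
    -- A null LAST_USED_TIMESTAMP generally means never used since the
    -- usage counts were last reset.
    SELECT OBJNAME, LAST_USED_TIMESTAMP, DAYS_USED_COUNT
      FROM TABLE(QSYS2.OBJECT_STATISTICS('MYLIB', '*FILE')) AS t
     WHERE OBJATTRIBUTE = 'LF'
     ORDER BY LAST_USED_TIMESTAMP;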

    Answer Wiki


    Questions:
    The 88 million records:
    – are they current?
    – needed for periodic processing?
    – does a purge process exist?

    150 logical files — are they all over the PF with 88 million records?
    – when considering the effect on updates/writes to the system,
    discount any with maintenance *DLY or *REBLD

    – are updates/writes to this file unacceptably slow?
    – remember some of the logicals may be accessed by Queries

    I think you should tackle ways to reduce the number of logicals before you consider more radical changes.

    Phil
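    A hedged companion to Phil's point about queries: on recent releases the QSYS2.SYSINDEXSTAT catalog view reports query usage per index or keyed LF, so you can see which access paths the optimizer actually touches. Treat this as a sketch - MYLIB/BIGPF are placeholders, and the view and its column names are an assumption to verify against your release's documentation.

        -- Sketch only: which access paths over the big file do queries use?
        SELECT INDEX_NAME, LAST_QUERY_USE, QUERY_USE_COUNT
          FROM QSYS2.SYSINDEXSTAT
         WHERE TABLE_SCHEMA = 'MYLIB'
           AND TABLE_NAME = 'BIGPF'
         ORDER BY QUERY_USE_COUNT DESC;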

    //////////////////////////////////////////////////

    2005 may seem like ancient history, but on one system that I worked on I had to be able to provide customer statements back to "the dawn of time" (the word "time" should be reverberating). That didn't mean they had to stay in the original files, but a whole bunch of issues come out of having them in different files. So check out the retention rules.

    Logicals are often overused and, in your case, when you do go to purge those old records, you can expect really poor performance because of all the index updates.

    Phil
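    As a footnote to Phil's warning about purge performance, a minimal sketch of the archive-then-delete pattern, with entirely hypothetical file and column names; in production you would run the DELETE in smaller date-range batches so the 150 access paths aren't all churned in one enormous unit of work:

        -- Sketch only: move pre-2006 rows to a history file, then purge.
        INSERT INTO HISTLIB.POLHIST
          SELECT * FROM MYLIB.POLMAST
           WHERE TRANDATE < '2006-01-01';

        DELETE FROM MYLIB.POLMAST
         WHERE TRANDATE < '2006-01-01';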

    Discuss This Question: 15 Replies

     
    • WilsonAlano
      Hi Dave, IMO what you planned to do is the path to follow, and in my experience I've seen a LOT of LFs created unnecessarily just because programmers don't understand LF uses and concepts. For example, I saw an LF with a key of 3 fields - Branch, Account and Currency - and another LF with a 2-field key, Branch and Account. The second is totally redundant, because its key is a left-most subset of the first and can share that access path, and this kind of error is frequent (see the sketch just below this reply). So you have a lot of work to do, but I'm sure it will be worth the effort. Regards, Wilson
      2,600 points
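      To put Wilson's example in SQL terms, a minimal sketch with hypothetical names: the optimizer can use the three-key index for any left-most prefix of its key, so the second index adds nothing and still costs maintenance on every insert, update, and delete.

          -- Sketch only; all names are hypothetical.
          CREATE INDEX MYLIB.POLMAST_IX1
              ON MYLIB.POLMAST (BRANCH, ACCOUNT, CURRENCY);

          -- Redundant: its key list is a left-most prefix of POLMAST_IX1,
          -- which can already satisfy (BRANCH, ACCOUNT) access.
          CREATE INDEX MYLIB.POLMAST_IX2
              ON MYLIB.POLMAST (BRANCH, ACCOUNT);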
    • Gilly400
      Hi Dave, 150 logical files over one PF is absolute madness IMHO. Sounds to me like people have built temporary logicals for specific tasks and then never removed them.

      Do you have any reporting tools that may be building logicals (something like Business Objects or ShowCase Vista)? This might be worth checking. Do you use any program generators/3GLs/CASE tools (Synon/LANSA/ASSET/etc.)? These could also be building unnecessary logicals.

      Many of the systems I've worked with (in all different sorts of business - banking/finance, manufacturing, retail, etc.) don't have 150 logical files on the whole system, let alone over one PF. Your approach looks sound to me. I've never gone down the road of converting to tables and indexes and I'm not sure what the benefits of that are, but reducing the number of logicals and the amount of data held in PFs is bound to give an improvement in performance (as well as a reduction in used disk space). Regards, Martin Gilbert.
      23,730 points
    • Dcantwell
      @Phil: Most of the records are really old; a history file system was just never implemented. That's part of my idea to fix it. Around 67 million of the 88 million are from 2005 on back, so purging them would be roughly a 75% reduction. I agree with you completely about the logical files. I've begun analyzing them and there are a lot of repeated keys. The biggest problem is thinking that goes like this: "Let's create a new logical so all we have to do is SETLL/READE the data we want," instead of compensating for it in the code. A ton of them have select/omits included in their definition. So if a new report program was needed, the programmer made a new logical to make his/her life easier, never mind the consequences. There's also no one in charge of the databases (no DBA).

      @Martin: No reporting tools, no program generators. This is an insurance system I'm working with. We're constantly audited and constantly changing to bring business in the door. That's why it's going to be hard for me to prove that the time and effort spent on this will be worth it while continuing to support the system. I believe a staged approach to this whole concept would work best for the business.

      Overall, we've been told to avoid using SQLRPGLE for programs because the first time you call the program it's slow. (In reality it isn't, compared to RPGLE programs that define the logical files each time.) I took a program that used to run 5 hours, re-wrote it in SQLRPGLE, and it runs in 20 - 30 minutes; a sketch of replacing one of those purpose-built logicals with plain SQL follows this reply. But from some research, using SQL-defined tables and indexes would fix the SQLRPGLE problem, due to the fact that no real "indexes" are defined. If you look at iSeries Navigator, all the logical files right now are considered "views" and not "indexes". <<<< Please tell me if I'm wrong.
      470 points
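      The sketch promised above: the purpose-built select/omit logical Dave describes can usually become a plain embedded SELECT. Everything here is hypothetical - the names, the omit condition, and the host variable.

          -- Sketch only: the omit logic and the SETLL/READE positioning
          -- both move into the query itself.
          SELECT BRANCH, ACCOUNT, PREMIUM
            FROM MYLIB.POLMAST
           WHERE STATUS <> 'C'        -- was the LF's omit rule
             AND BRANCH = :BRANCH     -- was SETLL/READE on the key
           ORDER BY BRANCH, ACCOUNT;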
    • WilsonAlano
      Hi Dave, LFs are real indexes. The difference between an SQL index and an LF is basically the page size (block of keys) used in each case. SQL indexes have a bigger page size than LFs, which is better for sequential-by-key processing; for random access the difference is almost zero, and in some cases it's worse. You must be aware that SQL indexes don't allow select/omit, and if the file will still hold several million records after the purge, maybe an LF is your better solution. There's a rule of thumb that if a select/omit LF selects 30% or less of the records in the file, it's a good idea to keep or create one. You can take a look at this article: http://systeminetwork.com/article/performance-comparison-dds-defined-files-and-sql-defined-files - I think it will help you. Regards, Wilson
      2,600 points
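      For the eventual conversion step, a minimal sketch of Wilson's distinction, with hypothetical names: a keyed LF maps directly to CREATE INDEX, while select/omit logic has no SQL counterpart until the sparse (WHERE-clause) indexes of later releases - an assumption to verify against your OS level.

          -- Sketch only: the SQL counterpart of a keyed logical file.
          CREATE INDEX MYLIB.POLMAST_IX3
              ON MYLIB.POLMAST (BRANCH, ACCOUNT, TRANDATE);

          -- Select/omit equivalent; assumes a release with sparse
          -- index support, otherwise keep the select/omit LF.
          CREATE INDEX MYLIB.POLMAST_IX4
              ON MYLIB.POLMAST (BRANCH, ACCOUNT)
              WHERE STATUS <> 'C';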
    • WilsonAlano
      The link doesn't work! Here it is again: http://systeminetwork.com/article/performance-comparison-dds-defined-files-and-sql-defined-files
      2,600 points
    • Dcantwell
      Thank you very much for the article, Wilson. I think it's an excerpt from a document by Dan Cruikshank, which was the basis of my whole idea. But as I look more into it, I don't know how much it will benefit us immediately. The boss did like the idea of making our entire database "portable" to another system using the SQL-generated tables. One thing about the select/omit stuff after reading that article: Dan did mention using DYNSLT when changing the logical files. Because the selection is then done at read time instead of being built into the access path, the path can be shared, which is supposed to help.
      470 points
    • JennyMack
      Hey Dcantwell, Welcome to IT Knowledge Exchange. It looks like your question has created quite a dialogue here! Are you finding the information useful? Jenny Community Manager
      4,280 points
    • Dcantwell
      Hello Jenny, yes - everyone here is very knowledgeable and helpful. Actually, it's funny I should be bringing this up: System i News just published another article on this very topic by Dan Cruikshank. Thank you, Dave
      470 points
    • Yorkshireman
      I'd suggest you need to look *very* closely at the business case for doing this. Whilst it all makes sense from a technical point of view, given unlimited cost-free resource, you are proposing to create an entirely new system which will require complete and thorough testing to a (probably) much higher standard than the original code, and will therefore throw out all kinds of 'bugs' which were always there but now need fixing. This will taint your project. Unless you have that unlimited resource, I'd follow the paths outlined.

      First - get some good metrics together: how many files, records, how much disk space used. If you can gather some performance data by running some 'typical' work under the performance monitor, then do that. Examine the code - how many programs, lines, file definitions (F-specs), and what type of file access. Write a utility to extract the key data from the LFs and determine how many are duplicated or could be consolidated, and what the criteria for the select/omits are (a catalog-based sketch follows this reply). Where are they used? Someone said that you may find data extractions were being done by creating an LF.

      Second - look at the archiving story: what can go, and how you preserve a means of gaining access. How much disk real estate is regained, and how much are backup times and DR startup times improved? You only mention one file - there must be more which are referentially linked. You'll need a function of some sort - SQL/RPG - that will extract the old stuff into, say, a history library and remove it from live after a verification check.

      When you have numbers, you can estimate effort and detail benefits. 88 million isn't a big number for iSeries, and if its performance now is perceived to be 'OK' then why spend the money? - unless you can show cost reductions in ownership of the box and performance improvements which will reduce/remove/delay the need for upgrades. Just my thoughts..
      5,580 points
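      The catalog-based sketch promised above, for Yorkshireman's duplicate-key utility. It assumes keyed LFs surface in QSYS2.SYSKEYS on your release, and MYLIB is a placeholder; identical key strings across access paths are the consolidation candidates.

          -- Sketch only: every key column, in order, for each access path
          -- in the library; compare the resulting key strings by eye or
          -- with a follow-up grouping query.
          SELECT INDEX_NAME, ORDINAL_POSITION, COLUMN_NAME, ORDERING
            FROM QSYS2.SYSKEYS
           WHERE INDEX_SCHEMA = 'MYLIB'
           ORDER BY INDEX_NAME, ORDINAL_POSITION;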
    • graybeard52
      Another tip: if you haven't already done so, make sure the QAQQINI file has IGNORE_DERIVED_INDEXES set to *YES (a sketch of the change follows this reply). This can improve SQL performance by 400% or more. In your case, any SQL query over this PF will never use the newer SQL engine (SQE), only the older (and slower) CQE engine, unless this value has been set.
      3,115 points
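      The QAQQINI change graybeard52 describes, as a sketch; MYLIB is a placeholder, and the usual pattern is to CRTDUPOBJ the shipped QSYS/QAQQINI into your own library, update it, then point jobs at it with CHGQRYA QRYOPTLIB(MYLIB).

          -- Sketch only: run after copying QSYS/QAQQINI to MYLIB.
          UPDATE MYLIB.QAQQINI
             SET QQVAL = '*YES'
           WHERE QQPARM = 'IGNORE_DERIVED_INDEXES';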
    • Iseriesvet
      Your approach is very sound in initially identifying unused logicals. However, before doing that, make sure you report on the current I/O load - and again after cleaning things up. I would guess that will show a significant reduction in both I/O and CPU. If you were to restore your DB for some reason and you are not saving the access path data, the rebuilds of these LFs would probably take days on even the fastest system - so DR can usually be a good justification. FYI - removing the member from the file (RMVM) stops the maintenance but allows you to recreate the LF simply by adding the member back in (ADDLFM). Also, you can define the access path as not maintained except when the file is used (MAINT(*REBLD)), which eliminates the overhead throughout the rest of the day.
      20 points
    • TomLiotta
      Note that 150 LFs might be both reasonable and preferred. Multiple elements can influence whether or not to create a new LF.

      First thing to review might be the hierarchy of shared access paths. If 140 of those share the access paths of the 10 others, then there might not be anything gained by removing them. Removal might even make things worse.

      If two (or more) LFs share an access path, then look at the differences in the field lists. Does one LF present fields that aren't present in the other? Find out why the limited LF exists. Perhaps the purpose is to provide a restricted view into the PF because some fields shouldn't be accessed by some users. Are there derived fields that are only present in one LF? Do some programs benefit from the reduced data movement?

      Take care in replacing DDS LFs with SQL INDEXes. SQL run-time performance can improve significantly due to much larger page sizes, but access path journaling can dramatically increase in size with those larger page sizes. Since many sites don't bother with this, it seldom comes up. Just don't be surprised if it does.

      Keep in mind that a DDS LF can replace both a SQL VIEW and an INDEX in a single object (until the most recent releases). It's possible to delete a LF and then need to create a VIEW and an INDEX to replace it (a sketch follows this reply). If an embedded SELECT statement selects column 1, column 5 and column 10 from a table with 15 columns, how different is that from a LF that presents those three columns -- except that in the program's case the "view" is hidden from normal external observation and gets created when the program is called? If 150 programs SELECT slightly different column lists, it might not be far off from 150 views.

      Also, a LF can share an access path of an existing SQL INDEX. But a new SQL INDEX won't share an existing LF index even if they are defined with compatible keys. The physical page differences generally force new INDEXes to create new access paths. (Exception -- CREATE INDEX, then create the LF to share the access path, then DROP INDEX; from there, subsequent SQL INDEXes can share that access path.) If sharing is planned, then try to create INDEXes first.

      The point of any of this is only that absolute statements can sometimes indicate an incorrect direction. Guidelines tell us what to look for, but they shouldn't be assumed to be always correct. The corrections to a given mess might not be appropriate. Take a little time to investigate. Still... 150 LFs over a PF? I can't imagine that there isn't a lot of improvement possible! Tom
      125,585 points
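      The sketch promised in Tom's reply: one DDS LF that both subsets columns and keys the data can take two SQL objects to replace. Names are hypothetical.

          -- Sketch only: a VIEW for the column subset...
          CREATE VIEW MYLIB.POLBAL_V AS
              SELECT BRANCH, ACCOUNT, BALANCE
                FROM MYLIB.POLMAST;

          -- ...plus an INDEX for the access path the LF provided.
          CREATE INDEX MYLIB.POLBAL_IX
              ON MYLIB.POLMAST (BRANCH, ACCOUNT);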
    • Dcantwell
      Hello everyone, it's great to see people are still discussing this topic! It's been well over a year, and I thought I'd give a status update. It looks like we have no plans to fix this anytime soon. We're a small shop (7 programmers) and, with the constantly changing industry, we need to roll out products that bring the company money. So, unfortunately, we haven't implemented any of these changes.

      It's also unfortunate for me, because I'm currently in the process of re-writing an SQLRPGLE program over this file that needs to join a bunch of other large files. I have the SQL statement to do it; it just takes forever to test. I might have to break it down into smaller chunks and process each one at a time (one testing trick is sketched below).

      Anyway, thank you everyone for all the helpful knowledge and information. To Tom: Yes, a lot of them are exactly the same logical file. A lot of people just didn't look for one they could've used, and a lot are not even used at all. We just need to find the time to clean this stuff up. Thanks, Dave
      470 points
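      The testing trick mentioned above, as a hedged sketch with hypothetical tables: cap the result set while developing so each trial run of the big join finishes quickly, then drop the clause for the real run.

          -- Sketch only: FETCH FIRST keeps test iterations short.
          SELECT p.BRANCH, p.ACCOUNT, c.CLAIMNO
            FROM MYLIB.POLMAST p
            JOIN MYLIB.CLMMAST c
              ON c.BRANCH = p.BRANCH
             AND c.ACCOUNT = p.ACCOUNT
           FETCH FIRST 1000 ROWS ONLY;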
    • TomLiotta
      "Yes a lot of them are exactly the same logical file." Most likely, then, they aren't really causing problems for DB2. The DBMS won't be updating them, since they are little more than 'symbolic links' to the actual access paths. I can't be absolutely certain, but I'd bet that the query optimizer never actually looks at them; true consideration should only be given to the real underlying access path.

      They cause problems for you, though. They make maintenance difficult. They have individual ownership and authority connections. They clutter up libraries. They... etc. They're just messy and unnecessary.

      "It's great to see people are still discussing this topic!" I go through old items to see if anything can be added. For me personally, the ideal case is when I add something and find that someone else can correct what I've written. I can't know that I'm wrong if I don't run across what is right. What was correct a couple of years ago might be wrong today. I've probably learned as much from corrections by Bvining, or clarifying comments from him, as I have from the Information Centers in the past six months. A lot can happen in a year. It might even be that a project was put on hold and new info suddenly makes it feasible. Tom
      125,585 points
    • MelanieYarbrough
      Dcantwell, Thanks for updating us on the situation! Let us know if anything changes and how it turns out. - Melanie
      6,345 points
