Optimizing/Indexing to find last transaction

Tags:
Application development
AS/400
Database
DB2
DB2 Universal Database
Oracle
SQL
I'm hoping someone can help me out with this. I'm trying to write a query that returns the last transaction for each item in our warehouse. To do this I'm using two files. The first is our location detail, which lists all the locations in our warehouse; it only has about 60,000 records, so I don't think indexing is a big impact there. However, I'm joining it to a 12,000,000-record transaction history file, and I'm having a horrible time getting the right logical built. Basically, I'm doing the following:

select min(loc) FirstLocation, item, version, lasttran
from locations
left join (select item2, version2, max(trandate) lasttran
           from transactions
           group by item2, version2)
  on item=item2 and version=version2
group by item, version, lasttran

I have a logical built over the transactions that keys on item, version, and trandate, and then applies several select/omit rules to keep only the transactions that indicate movement. Everything I try turns into a 10-minute run to retrieve the information I'm looking for. I'm thinking my best bet might be to make the subselect a logical, but I can't find any DDS documentation on how to build that type of select. Any ideas?

thanks, Kevin
ASKED: April 3, 2007  12:03 PM
UPDATED: January 8, 2010  6:04 AM
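For readers following along, the shape of the query in the question is the classic "latest row per group" join: aggregate the big transaction file down to one MAX(trandate) per item/version, then left-join the locations to it. Here is a minimal runnable sketch of that pattern; it uses SQLite in Python as a stand-in for DB2, and the tables, columns, and sample values are invented for illustration only.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE locations (loc TEXT, item TEXT, version TEXT);
CREATE TABLE transactions (item2 TEXT, version2 TEXT, trandate TEXT);
INSERT INTO locations VALUES ('A01','W1','1'), ('B07','W1','1'), ('C02','W2','1');
INSERT INTO transactions VALUES ('W1','1','2007-01-15'), ('W1','1','2007-03-02');
""")

# Collapse 12M transaction rows to one MAX(trandate) per item/version,
# then left-join so items with no movement still appear (lasttran = NULL).
rows = con.execute("""
    SELECT MIN(l.loc) AS FirstLocation, l.item, l.version, t.lasttran
    FROM locations l
    LEFT JOIN (SELECT item2, version2, MAX(trandate) AS lasttran
               FROM transactions
               GROUP BY item2, version2) t
      ON l.item = t.item2 AND l.version = t.version2
    GROUP BY l.item, l.version, t.lasttran
    ORDER BY l.item
""").fetchall()
print(rows)  # item W2 has no transactions, so its lasttran is None
```

Note that the derived table needs its own GROUP BY, and the outer MIN(loc) is what collapses multiple bin locations per item down to the lowest-numbered one.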

Answer Wiki


Most importantly, do you have an index on transactions that consists of item, version, and trandate? You need the transaction date included for the index to satisfy the MAX(). I think that's the problem.

What may also be beneficial is to create a work file containing the last transaction of each item in the warehouse, then create an index on that work file. Much smaller, yes? Then:

select min(loc) FirstLocation, item, version, lasttrans
from locations
left join (select item2, version2, lasttrans
           from workfile) on item=item2 and version=version2

I don't think you will need the grouping, and I don't understand why the min(loc).

Put the solution here when you have it.
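The work-file idea above can be sketched end to end: materialize the per-item summary once, index it, and join the locations against the small table instead of the 12M-row history. Again this uses SQLite in Python as a stand-in for DB2, with invented table and column names.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE locations (loc TEXT, item TEXT, version TEXT);
CREATE TABLE transactions (item2 TEXT, version2 TEXT, trandate TEXT);
INSERT INTO locations VALUES ('A01','W1','1'), ('C02','W2','1');
INSERT INTO transactions VALUES ('W1','1','2007-01-15'), ('W1','1','2007-03-02');

-- Build the small work file once (one row per item/version), then index it.
CREATE TABLE workfile AS
  SELECT item2, version2, MAX(trandate) AS lasttrans
  FROM transactions
  GROUP BY item2, version2;
CREATE INDEX workfile_ix ON workfile (item2, version2);
""")

# The report query now joins against the tiny indexed work file.
rows = con.execute("""
    SELECT l.loc, l.item, l.version, w.lasttrans
    FROM locations l
    LEFT JOIN workfile w ON l.item = w.item2 AND l.version = w.version2
    ORDER BY l.item
""").fetchall()
print(rows)
```

The trade-off is that the work file is a snapshot: it must be rebuilt (or incrementally maintained) whenever new transactions arrive.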

Discuss This Question: 5  Replies

 
  • MODMOD
    I can't index LastTran; lasttran comes from the subselect, which selects item, version, max(trandate). I do have a logical that indexes item, version, trandate, and I get acceptable results if I run the subquery by itself (because STRSQL retrieves blocks of rows instead of the whole result set, I believe).

    The min(loc) was an artifact of me rewriting an anonymized, simplified version of what I'm pulling back. An item can be in more than one location, and the report calls for the total on hand instead of individual lines for each location. However, to make the report easier to work with, they wanted to know the lowest-numbered location the item was in. I threw out summarizing the total on hand in the initial result set and managed to rewrite the query to something like this:

    SELECT loc, item, version, Rec_Date, OH, max(TranDate) lasttran
    from locations
    left join transactions on Item=Item2 and Version=Version2
    GROUP BY loc, item, version, rec_date, OH

    This runs very fast in STRSQL, where it blocks the results, but if I apply a sort or try to retrieve the full set of records, it slows to a crawl. I know that with SQL Server I could create an index on item, version, trandate and then build a view that returns only item, version, max(trandate), or a function to return the last transaction date; either of those should be relatively effective for the 60,000 records I'd be pulling back.

    We have a very active warehouse with a huge catalog of items. Occasionally items come into the warehouse but never make it out, and those items end up taking up space in prime locations. We're trying to isolate the items that aren't active and relocate them to more appropriate locations.

    I also thought about trying to limit the transactions to just items that are currently in the warehouse. Unfortunately, it trims surprisingly little off of what I'm comparing against (the logical I have reduces the record count to around 6 million, which is probably around 200,000 items), and the temp file would probably actually slow things down due to lost indexes.
  • SheldonLinker
    This is one of those rare circumstances where redundant data may be your best bet. Consider writing records to a Last Transaction file, and reading from that later:

    1. Do the main transaction, without commit.
    2. Update the appropriate last-transaction record.
    3. If the number of records updated is zero, insert the appropriate last-transaction record.
    4. Commit.

    --- Sheldon Linker (sol@linker.com), Linker Systems, Inc. (www.linkersystems.com), 800-315-1174 (+1-949-552-1904)
  • BigKat
    Assuming recdate and OH are in the location file and are single values, and that the locations file is unique by loc, item, version, try:

    with maxtrans as
      (SELECT item2, version2, max(TranDate) lasttran
       from transactions
       group by item2, version2)
    SELECT loc, item, version, Rec_Date, OH, lasttran
    from locations
    left outer join maxtrans on Item=Item2 and Version=Version2
  • MODMOD
    Unfortunately, the subquery actually takes longer in this situation. The transaction file I'm reading is actually an indexed view that gives limited results; doing a subquery seems to cause the system to compile more of a list than it needs, whereas the left join saves time (for a specific 300 records, the left join takes 3 minutes and the subquery takes 5 minutes).
  • TomLiotta
    The transaction file that I’m reading is actually an indexed view that gives limited results... Is that a hint that we haven't actually been told what the tables actually are? Are you saying that the original SELECT in the question is actually being processed over views rather than the real tables? So, we need to work out the problems of too-slow performance without knowing what's really being queried...? I don't see much hope. Tom
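Sheldon Linker's update-then-insert pattern for a redundantly maintained last-transaction file can be sketched as follows. This is a Python/SQLite illustration with invented table and column names, not the AS/400 implementation; the key idea is the same: try the UPDATE first, and only INSERT when zero rows were touched, with both writes under one commit.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE transactions (item TEXT, version TEXT, trandate TEXT);
CREATE TABLE last_tran (item TEXT, version TEXT, trandate TEXT,
                        PRIMARY KEY (item, version));
""")

def record_transaction(item, version, trandate):
    """Write the main transaction, then maintain the redundant
    last_tran row: UPDATE first; if no row was touched, INSERT."""
    with con:  # one commit covers both writes
        con.execute("INSERT INTO transactions VALUES (?,?,?)",
                    (item, version, trandate))
        cur = con.execute(
            "UPDATE last_tran SET trandate=? WHERE item=? AND version=?",
            (trandate, item, version))
        if cur.rowcount == 0:
            con.execute("INSERT INTO last_tran VALUES (?,?,?)",
                        (item, version, trandate))

record_transaction('W1', '1', '2007-01-15')
record_transaction('W1', '1', '2007-03-02')
# last_tran now holds exactly one row per item/version, already current,
# so the report query never has to scan the 12M-row history at all.
```

The cost is moved from report time to transaction time: every movement pays one extra indexed UPDATE, and the report becomes a trivial join against a file with one row per item.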
