Find missing numbers with SQL

2555 pts.
Tags:
AS/400 SQL
SQL
SQL Server
V6R1
In a large logfile I have records containing the field INVNO (invoice-number). The lowest and highest values are easy to find, BUT it looks like some numbers in between are not there. Anyone got a trick with SQL, which can tell which numbers are missing within the numberrange (step 1) ?

Software/Hardware used:
i5 OS, Power System i, V6R1

Answer Wiki

Thanks. We'll let you know when a new response is added.

I’m not sure if this syntax will work on your platform:

<pre>SELECT invno – 1
FROM invoices i
WHERE NOT EXISTS (
SELECT 1
FROM invoices
WHERE invno = i.invno – 1);</pre>

Notes:
-The result will include a fictitious invoice number smaller than the lowest one.
-This query would perform better if an index exists on the invno field.
-This query will not detect contiguous blocks of missing invoice numbers.

See discussion below.

Discuss This Question: 20  Replies

 
There was an error processing your information. Please try again later.
Thanks. We'll let you know when a new response is added.
Send me notifications when members answer or reply to this question.

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
  • TomLiotta
    Note that the suggested solution catches single missing INVNO values. Catching contiguous blocks of missing numbers requires a little more, but indications of single missing numbers might be sufficient. Tom
    125,585 pointsBadges:
    report
  • TomLiotta
    For listing all missing numbers at V5R4 and above, perhaps something like this:
    with allinv (allinvno) as (
                               select 1 from sysibm/sysdummy1 a
                         union all
                               select a.allinvno + 1 from allinv a, sysibm/sysdummy1 b
                                   where
                                       a.allinvno < (select max(invno) from invoices)
                              )
        select allinvno from allinv c
            where
                not exists (select 1 from invoices where invno = c.allinvno )
    If INVNO isn't numeric, it could require some conversion. Tom
    125,585 pointsBadges:
    report
  • DanTheDane
    I did some testing on Corlos's solution on a subset of the very big file I have to deal with: 1. I extracted all records from the range of invno's 997 to 1856 into a workfile. 2. I ran the SQL over this testfile with two invno's found, 996 and 1012. The 996 was what was expected re Carlos's first note. the 1012 invno was missing in the file, and a look into the file verified that this number was the only missing. 3. I then deleted 3 records from the tst file (1051 to 1053) to test Tom's note. The result recieved were the numbers 996, 1012 and 1053. This proves tht Tom were right that I will only get an indication on missing numbers, since only the high-value of missing consequtive numbers is found (1051 and 1052 were ot found). Since my task is to find concrete numbers (to be used for finding info in old invoice paper-copies), I have not (unfortunately) reached my goal. Just out of curiosity it would be nice to see a fully working sql-solution, but for now i'll write a small RPG program with the old-fashioned IPaccess method to do the job. Thanks for all comments on this Q. DanF
    2,555 pointsBadges:
    report
  • carlosdl
    Good observation Tom. Dan, please let us know if that is sufficient. Tom, how would you detect contiguous blocks ? Come to mind some probably very inefficient options using the above query as a subquery. Something like this:
    SELECT MAX(in.invno)+1 missing_from, m.invno missing_to
    FROM invoices in,
    	(SELECT invno - 1 
    	FROM invoices i 
    	WHERE NOT EXISTS 
    		( SELECT 1 	
    		FROM invoices 	
    		WHERE invno = i.invno - 1)) m
    WHERE m.invno > in.invno
    GROUP BY m.invno
    68,035 pointsBadges:
    report
  • carlosdl
    Sorry, I saw the last two comments from Tom and Dan after posting mine. Would like to know what you ended up doing, Dan.
    68,035 pointsBadges:
    report
  • DanTheDane
    Tom, thanks for your suggestion, which I have tried to understand and use with no success. As of now I believe there is a error, since I canot place the word "invoices" (last line of yur code) into any context of ths matter. Can you pls clarify (or correct) to let me undestand. DanF
    2,555 pointsBadges:
    report
  • TomLiotta
    I canot place the word “invoices” (last line of yur code) into any context of ths matter. Sorry, Dan. Not knowing the name of your table, I simply used "invoices" from Carlosdl's example. You would use whatever table the INVNO column comes out of. I can't fully test without knowing your table(s). It should be pretty close though. Tom
    125,585 pointsBadges:
    report
  • DanTheDane
    I tested Carlos's 10:25 posting: 1.
    SELECT invno - 1
    	FROM invoices i
    	WHERE NOT EXISTS
    		( SELECT 1
    		FROM invoices
    		WHERE invno = i.invno - 1)
    
    ....this part works. When part of the total SQL from Carlos, an error occurs: "[SQL0205] Column INVNO not in table M in *N." Still stuck, and getting more curious.. :-) DanF
    2,555 pointsBadges:
    report
  • carlosdl
    Sorry Dan, I think an alias is missing. Try this:
    SELECT MAX(in.invno)+1 missing_from, m.invno missing_to
    FROM invoices in,
    	(SELECT invno - 1 invno
    	FROM invoices i
    	WHERE NOT EXISTS
    		( SELECT 1
    		FROM invoices
    		WHERE invno = i.invno - 1)) m
    WHERE m.invno > in.invno
    GROUP BY m.invno
    However this will return ranges (i.e. if invoice 4,5,6 are missing, this should return from:4 to:6), not individual invoice numbers, as I hadn't read that your task was to find concrete numbers when I posted it.
    68,035 pointsBadges:
    report
  • philpl1jb
    I would have taken a different approach. 1. Make a missing invoice number file and populate it with every number in range. I don't know a single SQL command to do this. Requires a loop, as in stored procedure, RPG, COBOL, etc. If I make a loop in RPG why wouldn't I just write the records in native RPG code? 2. Remove the inovice number from the file created in 1. (File MISSINV, field mINVno) where they exist in file Invoices field InvNo SQL - DELETE FROM MISSINV where mINVno in (Select InvNo from Invoices) What remains in MISSINV are the missing invoice numbers Phil
    49,435 pointsBadges:
    report
  • DanTheDane
    Carlos, Alas your Dec 11-12:27 solution gave these numbers as result: 996,1053 and 1012.; - 1051 and 1052 were not found. My actual task has now been solved by a small RPG-pgm as follows:
    oices  ipe  e           k disk
    Fqsysprt   o    f   90        printer
    D counter         s                   like(einvno)
     /free
        if counter = *zero;
          counter = einvno;    // initiate counter start.
        else;
          counter += 1;
          if einvno > counter;
            dou counter = einvno;
              except print;
              counter += 1;
            enddo;
          endif;
        endif; 
     /end-free   
    Oqsysprt   e            print       1
    O                       counter       X     10
    
    My search for an SQL solution showed up to be time-consuming, BUT I am still very curious, and I hope to find the solution some day. Thanks to all who assisted in this case; - anyway I learned a few things about sql. PS: if any new sql-suggestion to this task are added to the discussion, I will test them out. Dan F
    2,555 pointsBadges:
    report
  • TomLiotta
    I don’t know a single SQL command to do this. That's exactly what my suggestion does. It's a single SQL statement that lists all missing INVNO numbers from a table named INVOICES. The only requirement is V5R4 or later. The CTE is recursive. It builds a temporary list of numbers from 1 up to (one less than) the highest INVNO in INVOICES. Using that list, it checks to see which ones do not exist in INVOICES. If the lowest number is higher than 1, it can be replaced by (SELECT min(INVNO) FROM INVOICES). Tom
    125,585 pointsBadges:
    report
  • philpl1jb
    Thanks Tom, I'll give it a try..I've been in recursion withdrawl .. er denial. Phil
    49,435 pointsBadges:
    report
  • DanTheDane
    Tom, I reduced my testing environmnt to test your solution, - IT WORKED AS WANTED . To begin with I had some misunderstanding around the namings in your SQL. Now that I have worked quite a while with this 'simple' task, I also found out that my own RPG solution lacks one essential thing: it does not handle duplicate invno's. My file is indexed on invno and a sequence number, which is not used at all in the application, and I found one duplicate invno, which makes my RPG-solution end program when reached. Your SQL solution handles this correctly according to the original request. I will NOT put a request to include a listing of dups in same sql ;-) Thanks to all involved in the dialog around this Q. DanF
    2,555 pointsBadges:
    report
  • TomLiotta
    I’ve been in recursion withdrawl .. Now that V5R4 is at the bottom of the supported releases, I figure it's clearly time to start stressing it. I don't have good info on efficiency, but I don't know how efficient building a manual table of numbers is either. It's possible that elements such as object creation/deletion, ownership, authorities and opens/closes might be handled automatically and maybe even bypassed. Maybe. Tom
    125,585 pointsBadges:
    report
  • TomLiotta
    I will NOT put a request to include a listing of dups in same sql That's probably a good choice. Finding numbers that don't exist concurrently with ones that do exist and are duplicated would unnecessarily complicate things. Duplicates are too easy to find with simple statements using GROUP BY and COUNT(). Tacking it on here couldn't make things better (IMO). Tom
    125,585 pointsBadges:
    report
  • Kccrosser
    A simpler and probably more efficient solution would be to use a function that returns a table of the missing records. The function would create an internal temporary table and use a cursor using Carlos' query. That cursor would retrieve the "last" record of each block, which would be added to the internal table. Then, using a simple internal loop, work backwards until you find an existing record, adding each missing id to the internal table. At the end, just do a select on the internal table (with order by) to return the list of missing records. An even simpler approach is a cursor for SELECT INVNO FROM INVOICES ORDER BY INVNO. Then, walk through the cursor - if the next value of INVNO is not the previous value + 1, then loop through from (previous+1) to (newvalue-1) as missing numbers. Unfortunately, this requires retrieving ALL the invoice numbers, but it may not be any more resource intensive than the other approaches.
    3,830 pointsBadges:
    report
  • TomLiotta
    The latest comment got me thinking. It's not that big of a deal to add a duplicate check:
    with allinv (allinvno) as (
                               select 1 from sysibm/sysdummy1 a
                         union all
                               select a.allinvno + 1 from allinv a, sysibm/sysdummy1 b
                                   where
                                       a.allinvno < (select max(invno) from invoices)
                              )
        select 'Missing  ', allinvno from allinv c
            where
                not exists (select 1 from invoices where invno = c.allinvno )
        union
        select 'Duplicate', i.invno from invoices i group by invno having count(*)>1
    Works fine on my local test INVOICES table. Tom
    125,585 pointsBadges:
    report
  • DanTheDane
    This is absolutely an SQL that I will record in my tools database. Toms latest solution does the job of telling of potential 'un-order' in a numbered file. Thanks again to everybody. DanF
    2,555 pointsBadges:
    report
  • TomLiotta
    V5R4 has been around for a while. IBM's working on getting the release after V7R1 ready. It's things like this that let us feel that upgrades (and learning the features of upgrades) are worth the effort. I hope it comes in handy in the future. Tom
    125,585 pointsBadges:
    report

Forgot Password

No problem! Submit your e-mail address below. We'll send you an e-mail containing your password.

Your password has been sent to:

To follow this tag...

There was an error processing your information. Please try again later.

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy

Thanks! We'll email you when relevant content is added and updated.

Following