2,540 pts.
 SQL EXCEPTION JOIN
I have three files (orderinfo, deliveryinfo and itemlines). The three files entries shall exist as 1 to 1 to many.

The files contains orders from 10+ years. Thus data has been migrated several times by various people.

I have discovered that the one-to-one-to-many rule has been broken for many orders. I wrote a pgm to check the file's ordernumbers against the other files to find the ordernumbers with missing records. It turned (as expected) out to be a very time-consuming task.

I challenged myself to write an sql to do the job, assuming that it will perform better, but... Alas, I must admit that I have not yet been able to write the sql. Can anybody help ?



Software/Hardware used:
IBM Power i, V6R1
ASKED: May 2, 2010  1:01 AM GMT
UPDATED: May 4, 2010  6:21:17 PM GMT
3,830 pts.

Answer Wiki:
A key bit of information is missing (no pun intended) - how are the tables related? Does each table have a single column "ordernumber"?

Assuming that all the files are linked by an "ordernumber" in each table, then there are the following types of data relationship problems which could exist:
1. OrderInfo with no DeliveryInfo associated
select orderid from orderinfo oi
where not exists (select 1 from deliveryinfo di where di.ordernumber = oi.ordernumber);

2. OrderInfo with no ItemLines associated
select orderid from orderinfo oi
where not exists (select 1 from itemline il where il.ordernumber = oi.ordernumber);

3. DeliveryInfo with no ItemLines associated
select deliveryid from deliveryinfo di
where not exists (select 1 from itemlines il where il.ordernumber = di.ordernumber);

4. DeliveryInfo with an OrderNumber which doesn't match any OrderInfo record
select deliveryid from deliveryinfo di
where not exists (select 1 from orderinfo oi where oi.ordernumber = di.ordernumber);

5. ItemLines with an OrderNumber which doesn't match any OrderInfo record
select itemid from itemlines il
where not exists (select 1 from orderinfo oi where oi.ordernumber = il.ordernumber)

There can also be itemlines records with ordernumbers that don't match deliveryinfo records, but that set would be a subset of the records already found above.

Doing one giant query that finds all the broken links would be a challenge, and probably cause lots of table scans.

If you have some nice unique indexes on the tables, like:
create unique index oin on OrderInfo(ordernumber, orderid);
create unique index din on DeliveryInfo(ordernumber, deliveryid);
create unique index iln on ItemLines(ordernumber, itemid);
then the queries will be relatively efficient. (This assumes you have some column, or small set of columns that is a primary key that can be used for the "orderid", "deliveryid", and "itemid" columns, respectively.)
Last Wiki Answer Submitted:  May 4, 2010  6:21 PM (GMT)  by  Kccrosser   3,830 pts.
To see other answers submitted to the Answer Wiki View Answer History.
Discuss This Question:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _




 

How about something like this ?

SELECT *
FROM orderinfo oi
WHERE NOT EXISTS
	(SELECT 1
	FROM deliveryinfo
	WHERE order_id = oi.order_id)
OR NOT EXISTS
	(SELECT 1
	FROM itemlines
	WHERE order_id = oi.order_id)

This kind of query works fast on indexed columns, but I don’t know if it will be usable on your platform.

 60,255 pts.