I am a newbie to database programming and have been given this project where data older than 7 days need to be removed from multiple tables that the application uses. I am thinking about creating a stored procedure to do that and schedule a job that will run this stored procedure everyday. Not sure if thats the best way. On a daily basis, one table has about 300-500 records and there are about 8-10 tables that I need to remove this data from. Please advice!! Following is my stored procedure. Any help would be useful. Thanks!!
CREATE PROCEDURE DeleteTablerecords
As
Declare @OldRecordsExpireDate datetime
SET @OldRecordsExpireDate = DATEADD(DAY, -7, GETDATE())
SET NOCOUNT ON
--delete data from Address table
DELETE from Address
Where Address_Type IN ('H','P','C','Q') AND (DateCreated < @OldRecordsExpireDate)
--START Go to Error Routine
If @@Error <> 0
BEGIN
GOTO Handle_Error
END
--END Go to Error Routine
Delete from CoApplicant Where DateCreated < @OldRecordsExpireDate
--START Go to Error Routine
If @@Error <> 0
BEGIN
GOTO Handle_Error
END
--END Go to Error Routine
Delete from Employment Where DateCreated < @OldRecordsExpireDate
--START Go to Error Routine
If @@Error <> 0
BEGIN
GOTO Handle_Error
END
--END Go to Error Routine
Delete from General Where DateCreated < @OldRecordsExpireDate
--START Go to Error Routine
If @@Error <> 0
BEGIN
GOTO Handle_Error
END
--END Go to Error Routine
Delete from Session Where DateUpdated < @OldRecordsExpireDate
--START Go to Error Routine
If @@Error <> 0
BEGIN
GOTO Handle_Error
END
--END Go to Error Routine
Delete from t_Employment Where Date_Record_Created < @OldRecordsExpireDate
--START Go to Error Routine
If @@Error <> 0
BEGIN
GOTO Handle_Error
END
--END Go to Error Routine
Delete from Funding Where Date_Record_Created < @OldRecordsExpireDate
--START Go to Error Routine
If @@Error <> 0
BEGIN
GOTO Handle_Error
END
--END Go to Error Routine
SET NOCOUNT OFF
-- Error Handling Routine --
Handle_Error:
print @@Error
ROLLBACK
GO
I work for a bank and the Feds are expecting us to delete them. As per them, noone should have access to anyone's financial info and hence these changes need to be made. I would hate to delete it too but have to make them.
I agree. Do not delete until you have been able to view the results of a "marking"; after marking the records, you (or someone or some sort of validation routine) can review the results and, if satified that the results are correct, delete the records.
I agree with the privacy issues, but you also have to use sound database management and programming when doing this or anything that is going to remove data from your database.
On the SQL: I have used only IBM Informix, so I cannot help you there.
First of all I would recommend long-term to look at normalizing your data so it only exists in one place.
Secondly, even if you do indeed delete this data, how many times has it been backed up before you delete it? Is this why you must delete it in more than one place?
Lastly, you can delete the records from the tables from a stored procedure run as a job in sql server that's scheduled but I would recommend something other than deletion. Instead of deleting the record, delete the information it contains and retain the empty record and it's primary key (inserting zeros or spaces where datatypes require it). This allows you in the future to know that the information for that record was erased in a positive way that is easily reported. This technique also makes it easier to make certain previous backup's contain empty data fields by "updating" them to the new condition.
It sounds like you have gotten a pretty firm directive from management stating that you need to delete this data so:
If you are going to delete data keep in mind data integrity. Your script seems to simply toss away anything older than 7 days from the current time at which the script is run. Also your script operates on multiple tables. I can only speculate but I'd imagine that some of this data is related to itself or other data in other tables by either enforced or implied foreign key relationships. If this is the case the way your script really needs to work is deleting data from the "fact table," that's older than 7 days then deleting all data from other tables that reference the now deleted records. If you do it using the method in your script you can leave dangling references and possibly break running systems.
Also as an alternative I'd suggest moving the data into a separate set of "expired data" tables for an interim period then maybe after a month delete it for real. To make your boss happy restrict the permissions on the "expired data" tables so that nobody can read them. This approach should be fairly straight-forward with MSSQL.
Good luck.
Discuss This Question: 4  Replies