Safe Data Purge: How We Remove Old Records in Large Oracle Tables
At our company, we periodically purge old data to comply with data retention regulations. Deleting millions of rows from production databases requires careful planning. Here's how we do it.
Step 1: Catalog and Plan
- Catalog all tables and identify retention periods.
- Review retention schedule with legal team.
- Coordinate with other teams to identify downstream dependencies.
Step 2: Backup Tables
Always back up the affected tables before deleting from them. This ensures quick recovery in case of unexpected issues.
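As a minimal sketch, a CREATE TABLE ... AS SELECT snapshot of the rows about to be purged gives a fast restore path (the table and column names here mirror the playbook in Step 5 and are illustrative; a Data Pump export works just as well for whole tables):
-- Snapshot only the rows the purge will remove (names are illustrative).
CREATE TABLE child_table_1_bkp AS
SELECT *
FROM   child_table_1
WHERE  created_on < ADD_MONTHS(TRUNC(SYSDATE), -144);
-- If anything goes wrong, the rows can be restored from the snapshot:
-- INSERT INTO child_table_1 SELECT * FROM child_table_1_bkp;
-- COMMIT;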
Step 3: Understand Table Structure
Identify parent and child tables, and delete from child tables first to avoid foreign key violations. The data dictionary query below is one way to find the children of a given parent.
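A sketch of that lookup against the data dictionary; PARENT_TABLE and YOUR_SCHEMA are placeholders, not names from our system:
-- Child tables whose foreign keys reference the given parent table.
SELECT c.owner,
       c.table_name      AS child_table,
       c.constraint_name AS fk_name
FROM   all_constraints c
JOIN   all_constraints p
       ON  p.owner           = c.r_owner
       AND p.constraint_name = c.r_constraint_name
WHERE  c.constraint_type = 'R'            -- 'R' = referential (foreign key)
AND    p.table_name      = 'PARENT_TABLE'
AND    p.owner           = 'YOUR_SCHEMA';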
Step 4: Batch Deletes vs Direct Deletes
We tested batch deletes in our performance test (PT) environment and they worked fine. In production, however:
- Tables have 7M+ rows, with 1M+ rows to delete
- A 20K-row batch caused a 1.5-hour wait and a foreign key (FK) violation
- The statement was rolled back automatically
Now we perform incremental deletes instead:
- Delete 10K rows at a time and commit after each batch
- Repeat until all old data is purged
- Slower but more controllable: an FK violation surfaces early, and only the current small batch rolls back
Step 5: Sample SQL Playbook
/* ============================================================================
Purpose : Purge old records safely
Strategy : Child tables first, batch deletes, commit per batch, gather stats
============================================================================ */
-- Pre-check counts (cutoff = 144 months, i.e. 12 years, before today)
SELECT 'child_table_1' AS table_name, COUNT(*) AS rows_to_purge
FROM child_table_1
WHERE created_on < ADD_MONTHS(TRUNC(SYSDATE), -144)
UNION ALL
SELECT 'child_table_2', COUNT(*)
FROM child_table_2
WHERE created_on < ADD_MONTHS(TRUNC(SYSDATE), -144);
-- Direct delete small child tables
DELETE FROM child_table_1
WHERE created_on < ADD_MONTHS(TRUNC(SYSDATE), -144);
COMMIT;
-- Batch delete large tables
DECLARE
  -- 10K rows per commit, per the incremental approach described in Step 4
  v_batch_size  CONSTANT PLS_INTEGER := 10000;
  v_rows        PLS_INTEGER;
  -- Retention cutoff: 144 months (12 years) before today
  v_cutoff_date DATE := ADD_MONTHS(TRUNC(SYSDATE), -144);
BEGIN
  LOOP
    DELETE FROM large_table
    WHERE  created_on < v_cutoff_date
    AND    ROWNUM <= v_batch_size;

    v_rows := SQL%ROWCOUNT;
    COMMIT;

    -- The last batch is smaller than the batch size (possibly zero rows): stop.
    EXIT WHEN v_rows < v_batch_size;
  END LOOP;
EXCEPTION
  WHEN OTHERS THEN
    ROLLBACK;
    RAISE;
END;
/
-- Post-check and gather stats
SELECT COUNT(*) FROM large_table WHERE created_on < ADD_MONTHS(TRUNC(SYSDATE), -144);
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'YOUR_SCHEMA',
    tabname          => 'LARGE_TABLE',
    cascade          => TRUE,                         -- also refresh index statistics
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
  );
END;
/
Key Takeaways
- Use incremental batch deletes for large tables
- Commit frequently to avoid massive rollbacks
- Delete child tables first to respect FK constraints
- Check counts before and after deletion
- Gather statistics after purge to maintain performance
- Consider programmatic purging: as transaction volume grows rapidly, manual batch deletes may become slow again in 6–12 months. Automating the purge (for example with a scheduled job, sketched below) helps maintain table performance continuously.
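A minimal sketch of such automation, assuming the batch-delete logic from Step 5 has been wrapped in a stored procedure (purge_old_records is a hypothetical name; adjust the schedule to your maintenance window):
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'PURGE_OLD_RECORDS_JOB',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN purge_old_records; END;',   -- hypothetical wrapper around the Step 5 loop
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY; BYHOUR=2',             -- run nightly during low-traffic hours
    enabled         => TRUE,
    comments        => 'Nightly purge of records past retention'
  );
END;
/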