Safe Data Purge: How We Remove Old Records in Large Oracle Tables

At our company, we periodically purge old data to comply with data retention regulations. Deleting millions of rows from production databases requires careful planning. Here's how we do it.

Step 1: Catalog and Plan

  • Catalog all tables and identify retention periods (a dictionary-query sketch follows this list).
  • Review retention schedule with legal team.
  • Coordinate with other teams to identify downstream dependencies.
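
A minimal sketch of that first cataloging pass, using the data dictionary. Note that NUM_ROWS comes from optimizer statistics, so treat the counts as approximate:

-- List candidate tables with approximate sizes to start the retention catalog
SELECT table_name,
       num_rows,        -- approximate, from the last statistics gathering
       last_analyzed
FROM   user_tables
ORDER  BY num_rows DESC NULLS LAST;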

Step 2: Backup Tables

Always back up affected tables before deleting. A backup copy ensures quick recovery if something unexpected happens.
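
A minimal backup sketch, assuming a simple CTAS copy of the rows about to be purged is enough for your recovery needs (the backup table name and cutoff are illustrative; a Data Pump export of the whole table is another option):

-- Snapshot the rows that are about to be deleted (illustrative names)
CREATE TABLE child_table_1_purge_bkp AS
SELECT *
FROM   child_table_1
WHERE  created_on < ADD_MONTHS(TRUNC(SYSDATE), -144);

-- Sanity-check the copy before any DELETE runs
SELECT COUNT(*) FROM child_table_1_purge_bkp;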

Step 3: Understand Table Structure

Identify parent and child tables. Delete from child tables first to avoid foreign key (FK) violations.
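
To find the child tables, a sketch against USER_CONSTRAINTS can help (LARGE_TABLE is an illustrative parent table name; use DBA_CONSTRAINTS with an owner filter if the tables span schemas):

-- List foreign keys that reference the parent table's primary/unique key
SELECT c.table_name      AS child_table,
       c.constraint_name AS fk_name,
       p.table_name      AS parent_table
FROM   user_constraints c
JOIN   user_constraints p
       ON p.constraint_name = c.r_constraint_name
WHERE  c.constraint_type = 'R'          -- 'R' = referential (foreign key)
  AND  p.table_name = 'LARGE_TABLE';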

Step 4: Batch Deletes vs Direct Deletes

We tested batch deletes in our PT environment and they worked fine. However, in production:

  • Tables hold 7M+ rows, with 1M+ rows to delete
  • A single 20K-row batch caused a 1.5-hour wait and then an FK violation
  • The statement rolled back automatically

Now we perform incremental deletes:

  • Delete 10K rows at a time and commit
  • Repeat until all old data is purged
  • Slower, but more controllable and easier to pinpoint FK violations

Step 5: Sample SQL Playbook


/* ============================================================================
   Purpose  : Purge old records safely
   Strategy : Child tables first, batch deletes, commit per batch, gather stats
   ============================================================================ */

-- Pre-check counts (cutoff = 144 months back, i.e. a 12-year retention window)
SELECT 'child_table_1', COUNT(*) 
FROM child_table_1
WHERE created_on < ADD_MONTHS(TRUNC(SYSDATE), -144)
UNION ALL
SELECT 'child_table_2', COUNT(*) 
FROM child_table_2
WHERE created_on < ADD_MONTHS(TRUNC(SYSDATE), -144);

-- Direct delete small child tables
DELETE FROM child_table_1
WHERE created_on < ADD_MONTHS(TRUNC(SYSDATE), -144);
COMMIT;

-- Batch delete large tables
DECLARE
  -- 10K per batch, matching the incremental-delete approach described above
  v_batch_size CONSTANT PLS_INTEGER := 10000;
  v_rows PLS_INTEGER;
  v_cutoff_date DATE := ADD_MONTHS(TRUNC(SYSDATE), -144);
BEGIN
  LOOP
    DELETE FROM large_table
    WHERE created_on < v_cutoff_date
      AND ROWNUM <= v_batch_size;

    v_rows := SQL%ROWCOUNT;
    COMMIT;

    EXIT WHEN v_rows < v_batch_size;
  END LOOP;
EXCEPTION
  WHEN OTHERS THEN
    ROLLBACK;
    RAISE;
END;
/

-- Post-check and gather stats
SELECT COUNT(*) FROM large_table WHERE created_on < ADD_MONTHS(TRUNC(SYSDATE), -144);
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'YOUR_SCHEMA',
    tabname => 'large_table',
    cascade => TRUE,
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
  );
END;
/
    

Key Takeaways

  • Use incremental batch deletes for large tables
  • Commit frequently to avoid massive rollbacks
  • Delete child tables first to respect FK constraints
  • Check counts before and after deletion
  • Gather statistics after purge to maintain performance
  • Consider programmatic purging: as transaction volume grows, manual batch deletes may become slow again within 6–12 months, so automating the purge helps keep table performance steady (a scheduling sketch follows this list).
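
One way to automate it, as a sketch only: wrap the Step 5 loop in a stored procedure (here a hypothetical purge_pkg.purge_large_table) and schedule it with DBMS_SCHEDULER. The job name, schedule, and procedure are all assumptions, not part of our playbook:

BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'PURGE_LARGE_TABLE_JOB',
    job_type        => 'PLSQL_BLOCK',
    -- purge_pkg.purge_large_table is a hypothetical wrapper around the Step 5 loop
    job_action      => 'BEGIN purge_pkg.purge_large_table; END;',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY;BYHOUR=2;BYMINUTE=0',
    enabled         => TRUE,
    comments        => 'Nightly purge of rows past the retention window'
  );
END;
/

Running the job in a low-traffic window keeps each nightly batch small and avoids the long waits we saw with large one-off deletes.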

