1 Check that the site links are still viable and that network links exist between the sites: can you ping the DCs across the network? Check the replication schedules and the DC error logs. Does replication work within the site? Was a bridgehead server defined that has since failed? Try a manual replication. Repair as necessary.
2 Find and remove inactive users, i.e. anyone who has not logged in for 90(?) days. Then discover who should be in which groups and correct the membership group by group (a long process).
3 If the failed DC is a FSMO role holder, seize those roles onto another DC. Delete the failed DC using AD Users and Computers. Build a new DC (with a NEW name) and allow replication to bring it up to date.
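
The inactive-user sweep in answer 2 can be sketched as a simple date filter. This is only an illustration under stated assumptions: the `(name, last_logon)` record shape and the `find_inactive` helper are hypothetical, the data is assumed to come from some export of the directory, and the 90-day threshold is the tentative figure from answer 2, not a recommendation.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical record shape: (account name, last logon time, or None if the
# account has never logged on). How the records are exported from the
# directory is outside this sketch.


def find_inactive(users, now, max_idle_days=90):
    """Return the names of accounts with no logon inside the idle window.

    Accounts that have never logged on (last_logon is None) are reported too,
    so that they can be reviewed rather than silently skipped.
    """
    cutoff = now - timedelta(days=max_idle_days)
    return [
        name
        for name, last_logon in users
        if last_logon is None or last_logon < cutoff
    ]


if __name__ == "__main__":
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    sample = [
        ("alice", now - timedelta(days=10)),   # recently active
        ("bob", now - timedelta(days=120)),    # idle longer than 90 days
        ("carol", None),                       # never logged on
    ]
    print(find_inactive(sample, now))
```

When feeding real data into something like this, bear in mind that the `lastLogon` attribute is not replicated between DCs and `lastLogonTimestamp` can lag by up to roughly 14 days, so some margin on the threshold is sensible. Reviewing or disabling the reported accounts before deleting them is the safer order.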