My advice is to log everything you can, and to run NTP on every device so that all the clocks are synchronised. We set everything to GMT and never switch to summer time (daylight saving), so it is always clear when an event happened.
Use syslog for most devices, and archive the logs from any device that generates them. Also keep the start/stop (accounting) records from any RADIUS server, and make sure RADIUS is used to authenticate access to the network and to the devices themselves.
To analyse this data I usually just work through it manually. I have not really found any good tools other than Cisco MARS, and that is a bit too expensive for the network I manage (and for the managers I work for!).
Most times when I have had to look at an incident, it covers a fairly restricted timeframe and is focussed on certain devices, so a manual trawl through the data is not as horrible as it could be. Having everything timestamped and synchronised makes it a lot easier.
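If the manual trawl gets tedious, a few lines of script can narrow the archive down to the window and devices of interest before you start reading. This is only a rough sketch, not a tool I am recommending: it assumes each archived line looks like "ISO-timestamp hostname message", which is my own made-up layout, so adjust the parsing to whatever your syslog server actually writes. The host names, sample messages and function names are all hypothetical.

```python
from datetime import datetime, timezone

def parse_line(line):
    """Split an assumed 'ISO-timestamp host message' line into its parts."""
    stamp, host, message = line.split(" ", 2)
    when = datetime.fromisoformat(stamp)
    if when.tzinfo is None:
        # Assumption: devices log in GMT, so a bare timestamp is treated as UTC.
        when = when.replace(tzinfo=timezone.utc)
    return when, host, message

def filter_logs(lines, start, end, hosts):
    """Yield (time, host, message) for the given hosts within [start, end]."""
    for line in lines:
        line = line.rstrip("\n")
        try:
            when, host, message = parse_line(line)
        except ValueError:
            continue  # skip lines that do not match the assumed layout
        if host in hosts and start <= when <= end:
            yield when, host, message

# Hypothetical sample data standing in for an archived log file.
SAMPLE = [
    "2024-03-01T10:14:59+00:00 fw1 Deny tcp src outside",
    "2024-03-01T10:15:02+00:00 sw3 Interface Gi0/1 down",
    "2024-03-01T10:15:07+00:00 fw1 Built inbound connection",
    "2024-03-01T10:21:30+00:00 fw1 Teardown connection",
    "not a log line",
]

start = datetime(2024, 3, 1, 10, 15, 0, tzinfo=timezone.utc)
end = datetime(2024, 3, 1, 10, 20, 0, tzinfo=timezone.utc)

for when, host, message in filter_logs(SAMPLE, start, end, {"fw1"}):
    print(when.isoformat(), host, message)
```

In practice you would pass an open file instead of SAMPLE. The comparison only works cleanly because every timestamp is in the same zone, which is the whole point of keeping the devices on GMT.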
Just my 2p (2c) worth 🙂