LGWR redo write times (log file parallel write)
General ***rule of thumb*** I’ve heard and buy into is that decent log file syncs times should be 3ms or under. Well “log file sync” depends on “log file parallel write” times, and “log file parallel write” times depend on the size of the write, thus I looked at some different log write sizes on a Solaris system with redo files on a SAN with plenty of write back cache and 4Gb FC and there are the distributions I saw on this single system
In the following charts
X axis is # of blocks written per write
MB per Second on the bottom two graphs
milliseconds per write on top two graphs
This data was from the 10046 trace files from the lgwr process.Block size is 512bytes.
Stange that the throughput goes to 90MB/s at 4-5 blocks and then drops off precipitously to 10-20MB/s above, when I’d expect I/O to speed up with larger write sizes.
One nagging question for me is why aren’t the log writes faster over 4G FB with write back cache. The fastest log write above 10 blocks (5K) is just under 25MB/s. I’d expect over 100MB/s. Definitely can dd write with oflag=sync at over 100MB/s.
Nice old page from the formidable Steve Adams http://www.ixora.com.au/notes/log_block_size.htm with query to get block size:
select max(lebsz) from sys.x$kccle;
UPDATE: the above data was from a customer system. I just ran a similar test on my lab machine and got results of more along the lines I was expecting. On the my lab machine I have the redo on file system mounted over 1GbE NFS which maxes out around 115MB/s. On a good day I get 108ish MB/s and on a bad day around 50MB/s – it’s a public network (ie company wide). Today was a bad day, but the results are pretty good:
In house data (using NFS storage)
Query to get average log write write times
select dbid, btime, round((time_ms_end-time_ms_beg)/nullif(count_end-count_beg,0),1) avg_ms from ( select s.dbid, to_char(s.BEGIN_INTERVAL_TIME,'DD-MON-YY HH24:MI') btime, total_waits count_end, time_waited_micro/1000 time_ms_end, Lag (e.time_waited_micro/1000) OVER( PARTITION BY e.event_name ORDER BY s.snap_id) time_ms_beg, Lag (e.total_waits) OVER( PARTITION BY e.event_name ORDER BY s.snap_id) count_beg from DBA_HIST_SYSTEM_EVENT e, DBA_HIST_SNAPSHOT s where s.snap_id=e.snap_id and e.event_name in ( -- 'log file sync', 'log file parallel write' ) and s.dbid=e.dbid ) order by btime /
select dbid, to_char( begin_time ,'YYYY/MM/DD HH24'), round(sum(decode(metric_name,'Redo Generated Per Sec',average,0)) / nullif(sum(decode(metric_name,'Redo Writes Per Sec',average,0)),0)/1024,0) avg_log_write_kb from dba_hist_sysmetric_summary where metric_name in ( 'Redo Writes Per Sec', 'Redo Generated Per Sec' ) /* and DBID=[dbid if share repository] */ group by dbid, to_char( begin_time ,'YYYY/MM/DD HH24') /