During the years I heard lots of stories about sizing the Oracle Database Buffer Cache. Oracle introduced in Oracle 8i (8.1.7) the view V$DB _CACHE_ADVICE. According the documentation this view shows IO’s that would be performed at various cache sizes from 10% to 200% of the current size. For example (GV$DB_CACHE_ADVISE):
The thing what is crossing my mind all the time is the question, will a system really react the way the advisor is predicting? This question is a very obvious question of course and in this post I like to place a nice little discussion in place.
If we look at the most common production situation for databases than we see a unix database server attached to a SAN. The memory of the database server is for a big part used for the SGA, and for this post we concentrate only on the IO part, so the database buffer cache. Further the memory of the database server will be used for unix IO buffers (default unix behavior, free memory (outside the process memory) is used for this purpose) and inside the storage a storage cache will be present. If we make for this a simple picture, we will get the following technical infrastructure:
The parts which are green, mean that these parts will have high chance to contain the desired data requested from a client and will be fast (memory). The parts which are red mean that requesting data from here will go much slower. In the drawing the access to the physical disks inside the storage are slow. A very fast disk will still have a seek time of 6 ms, so all IO measured which goes faster as this, is in principle IO cache performance.
Looking at the picture makes it clear that the total size of the database server memory is limited, it is divided in a part SGA (database buffer cache from IO perspective) and unix IO buffers. Now the question, what is normally reported by Oracle as physical reads, in principle these are reads which are not coming from the Oracle database buffer cache, but from outside. In a lot of cases we will not do physical IO’s at all, or just a very small percentage (the reason we find read times of less than 6 ms). A lot of the IO will come from the unix IO cache or from the storage IO cache. This will mean that when we increase the Oracle database buffer cache, we in principle only exchange a part of the unix IO cache for more Oracle database buffer cache. Going back to the V$DB_CACHE_ADVISE view will show that less physical IO will be done, what should lead to a better performance, but is this the case ??
At this moment we have to start looking at what is stored in the Oracle database buffer cache and what not. When we do index scans and access tables like that (db file sequential reads), we will store these retrieved blocks in the Oracle database buffer cache. When we have to do full table scans (db file scattered reads), we will only reserve a small part of the Oracle database buffer cache as an IO read buffer, but will for sure not store the whole table in the cache.
As we now also look at the unix IO cache, we will find out that the unix IO cache is not as “smart” as the Oracle database buffer cache and will store any IO done. So it is not important if the data is part of a index scan search or via a full table scan. For the cache in the storage the same thing can be said. The only thing is that when the cache is used at the storage, on certain moments the speed of the connection between the storage and the database server can start to be a bottleneck, when the storage cache is big and the hit ratio of this cache is very high. So as a rule of thumb can be said that cache should be as close to the place where the data is requested as possible. With other words it is smarter to have a bigger RAM at your database server, than to invest in a bigger storage cache.
As we now go back to the question if an increase of the database buffer cache will have a positive effect, than we can start the following mind experiment (actually we should try this in real practice).
The application is using a lot of sequential reads (index scans). What will happen if we increase the Oracle database buffer cache?
This will imply that the amount of real physical reads will decrease. Why? The amount of cached database blocks (needed for the application) will increase if we look at the total cache. A full table scan (still a few) will not flush a database buffer cache, it will only effect the effectiveness of the unix IO cache and the storage cache. So in most cases (index scans), the requested data will be faster retrieved.
The application is using a lot of scattered reads (full table scans). What will happen if we increase the Oracle database buffer cache?
This will imply that the amount of real physical reads will probably be increased. Why? The amount of cached database blocks (needed for the application) will decrease if we look at the total cache. The database buffer cache is bigger, but will store less blocks used by the application (full table scans). Although the above sounds very weird, in practice this situation could occur. In the end it will mean that the requested data will be read more from physical storage and so in effect will be retrieved slower. The increased Oracle database buffer cache means that the less smart unix IO cache will be decreased. This unix IO cache is on the other hand much better for full table scans.
Although situation B will lead to discussion (it is a mind experiment), I think that we always should be careful by drawing conclusions how the system will react with increasing the Oracle database buffer cache, as this cache is not small (>2Gb). Sometimes the effect is less or different than expected due to effects we didn’t think of. Also the effect of an increased Oracle database buffer cache and the amount of extra CPU overhead is something what can gives unexpected effects as there is a cpu bottleneck, instead of an IO bottleneck. In all cases it is highly important to pinpoint what the real bottleneck in a performance issue is (GAPP, Method-R). So by just following some advisor and by stop thinking can bring you in dangerous situations, with very unpleasant side effects.
In practice I have seen also pretty weird effects from a RMAN backup. This problem can come up when your Oracle database buffer cache is not that big (<3Gb) and the database server itself has much more memory (>6Gb) In this kind of situations you can encounter very negative impacts of your RMAN backup, due to unix IO buffer and storage IO buffer flushes. See the picture below.
In this picture the read from RMAN, which is performing a full database backup, is flushing almost all the most used database blocks requested by the end user. This will imply that if a database block can’t be found in the Oracle database buffer cache, this will lead in most cases to a real physical (slow) read. So the backup will make the cache hit ratios from the unix IO cache and the storage IO cache very bad. In this case it becomes obvious that it would have been better to have a bigger Oracle database buffer cache than a larger unix IO cache.
The effect described with the backup will be less, if the RAM from the database server would have been bigger and more physical memory would have been assigned to the SGA from the database. See the picture below.
In this picture the RMAN backup has not much influence, because almost all the IO caching is done in the Oracle database buffer cache, because of its adequate size (green arrow for the end user).
I hope that with this post it is clearer why just increasing the Oracle database buffer cache will not always lead to a better performance, or even worse… Important is to understand that the Oracle database buffer cache is a more intelligent IO buffer than the unix IO buffer, but that the effect is depending on the usage. Further that IO caches can be flushed by several reasons like full table scans, backups, etc. The things and outcomes I discussed in this post are depending on hardware, application and bottlenecks in the system, but should be a start point of a wider view than only some advisory.