After a good night's sleep I attended the first session of the morning, a presentation by George Djerdj Srdanov called “How to Get the Most Out of Your I/O Subsystem?”. George dived deep into the I/O subsystem and covered the following important subjects:
Different RAID configurations (Read and Write penalties)
Block alignment and recommendation
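To make the two subjects above a bit more concrete, here is a minimal sketch of how a RAID write penalty reduces the usable IOPS, plus a simple block alignment check. The penalty values are the commonly quoted rule-of-thumb numbers and are my assumption, not figures taken from George's presentation; the function names and the disk numbers are hypothetical as well.

```python
# Rule-of-thumb write penalties: back-end I/Os needed per front-end write.
# Assumption: commonly quoted values, not taken from the presentation.
WRITE_PENALTY = {"RAID0": 1, "RAID1": 2, "RAID10": 2, "RAID5": 4, "RAID6": 6}


def effective_iops(raw_iops, read_fraction, raid_level):
    """Estimate front-end IOPS for a given read/write mix."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    write_fraction = 1.0 - read_fraction
    penalty = WRITE_PENALTY[raid_level]
    # Every read costs one back-end I/O, every write costs `penalty` I/Os.
    return raw_iops / (read_fraction + write_fraction * penalty)


def is_aligned(offset_bytes, block_bytes):
    """True when an offset starts exactly on a block boundary."""
    return offset_bytes % block_bytes == 0


if __name__ == "__main__":
    raw = 8 * 150  # hypothetical array: 8 disks at 150 IOPS each
    for level in ("RAID10", "RAID5", "RAID6"):
        print(level, round(effective_iops(raw, 0.7, level)))
    # A start offset of 63 sectors of 512 bytes is not aligned on 4 KiB.
    print(is_aligned(63 * 512, 4096))
```

The sketch only shows why a write-heavy workload, such as online redo logging, suffers more on a RAID level with a high write penalty; real arrays with caches will behave differently.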
George Djerdj Srdanov presenting at Hotsos 2013
The presentation was good and prompted me to check some things at the customer I am currently working for. In this customer's case, the block alignment of the online redo log files might be worth investigating. Continue reading →
Due to a cold, or “the bug” as some at the symposium called it, I had a very bad night's sleep. In the morning I was not able to follow the sessions, so I ended up having a good breakfast and released the new white paper and my latest presentation to Hotsos for distribution.
At 13:00 I attended the presentation by Dr. N.J.G. Gunther titled “Superlinear Scalability: The Perpetual Motion of Parallel Performance”. Because of his subjects I like to be present at his presentations, although on the other track Gwen Shapira gave her first presentation, titled “Visualizing Database Performance Using R”, which I would also have loved to attend.
Neil discussed an important topic: increasing the number of servers can give better throughput than linear scaling would predict. He has baptized this phenomenon “Superlinear Scalability”. For the past couple of years he struggled to fit his USL (Universal Scalability Law) to this phenomenon. At first he just ignored it, but after seeing it more often he had to admit that it really exists and that his USL should be able to cope with it. After a long process he came to the conclusion that the USL still applies if he relaxes the restriction on the alpha (contention) parameter and allows it to take negative values. It basically means that with an increasing number of servers you temporarily get a kind of hybrid effect: the throughput increases by a larger factor than linear scalability would predict from the number of added threads. At a certain point you still have to face the music, and throughput degradation starts to appear due to coherency (the beta parameter in the USL formula).
Based on evidence gathered from different data sets, he concluded that his USL is still valid, also in situations where the “Superlinear Scalability” phenomenon occurs. As usual, Neil showed in a very sound scientific way that his claims were accurate and, as he always says, “Models come from God and data comes from the devil!”. If you would like to read more, you can check out his blog at: http://perfdynamics.blogspot.nl/2012/11/hotsos-2013-superlinear-scalability.html
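The effect of a negative alpha can be illustrated with a small sketch of the USL in its usual form, C(N) = N / (1 + alpha (N - 1) + beta N (N - 1)). The function names and the parameter values below are hypothetical and only chosen to make the effect visible; they are not numbers from Neil's data sets.

```python
import math


def usl_capacity(n, alpha, beta):
    """Relative capacity C(N) according to the Universal Scalability Law."""
    denominator = 1.0 + alpha * (n - 1) + beta * n * (n - 1)
    if denominator <= 0.0:
        raise ValueError("parameters give a non-positive denominator")
    return n / denominator


def peak_load(alpha, beta):
    """Load N* at which the USL curve reaches its maximum (needs beta > 0)."""
    return math.sqrt((1.0 - alpha) / beta)


if __name__ == "__main__":
    alpha, beta = -0.05, 0.002  # hypothetical: negative contention
    for n in (1, 2, 4, 8, 16, 32, 64):
        c = usl_capacity(n, alpha, beta)
        label = "superlinear" if c > n else "linear or below"
        print(f"N={n:3d}  C(N)={c:7.2f}  {label}")
    print("peak near N =", round(peak_load(alpha, beta), 1))
```

With these made-up parameters the capacity stays above the linear bound for small N and then falls back below it once the coherency term takes over, which is the "face the music" moment described above.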
Dr. N.J.G. Gunther at Hotsos 2013
After Neil's presentation it was my turn to give my own presentation, which I mentioned earlier, titled “‘Method GAPP’ Used to Mine OEM 12c Repository and AWR Data”. Continue reading →
It is finally the first of March 2013… I will travel to Dallas for (one of) the best Oracle performance symposia in the world, Hotsos 2013. The flight goes from Amsterdam to Philadelphia and from Philadelphia to Dallas. Against all odds I will not travel alone: an old Amis colleague and friend, Marco Gralike, and a colleague of his will be on the same flight (even on the way back). After departing from Schiphol at 13:00 in the afternoon and making our stop in Philadelphia, we arrive at the Omni Mandalay Hotel in Las Colinas (Irving / Dallas) at 22:30 local time… This is the real start of an awesome time at this great symposium…
Last week I got the great opportunity to present on Method-GAPP again, at the UKOUG 2011 (see presentation of the UKOUG2011). This time the presentation focused partly on multiple linear regression and partly on AWR data. The multiple linear regression yields a linear equation for calculating the end-user response time, which makes it possible to get a complete breakdown of all components involved in the end-user response time, as shown in the graph below. The graph shows the test and modelling from the white paper:
Breakdown of all the involved components for the end-user response time
In the breakdown, UTILR80 is the utilization of the I/O and UTILRAU is the utilization of the CPU. The breakdown shows that REST is basically time which is always there, but which might be split into more components if the model is enhanced, so that more of the variance found in the end-user response time (R) is explained. Continue reading →
In a lot of cases you would like to know which SQL statements, wait events, metrics, etc. in AWR are important for your specific end-user process response time. It is quite possible that the SQL statements, wait events, metrics, etc. showing up at the top of “Top Activity” in your OEM Grid Control and in your AWR reports are actually not the most important ones for your end-user process response time.
Once you know what share of the time of your end-user process is taken by the database server (Method-GAPP primary components), you can actually use all the AWR (and ASH) information as secondary components as input for Method-GAPP (see the white paper). Basically we can simply use the “Data Mining – Explain” step in the method and create a factorial analysis as shown below.
After a long time of not being able to finish my whitepaper, I have finally finished it. Time constraints made it hard to get my whole method on paper. I really wanted to have it finished before presenting the new improvements to the method at the HOTSOS Symposium 2011. In a couple of hours, at 13:00 Dallas time, I will give my talk based on the whitepaper, and I really hope to get a packed room.
Of course I hope the audience will see its potential and that I will be able to get the message across in the presentation as well as possible. I am just nervous about the demo I will try to give… As some people may recall from HOTSOS 2009, I had a big issue with my laptop and in the end started 10 minutes late without a demo. So I really hope everything will go smoothly this time.
The presentation will also become available on the blog, but for now you can download the official Method-GAPP whitepaper in the download section. As a last note I would like to thank Cary Millsap and Dr. Neil Gunther for their inspiration and support.
Ever since I started working on my Method-GAPP (see method-GAPP overview presentation), I have been challenged with the task of modelling a real system instead of a lab system with a programmed load profile. The big issue with a real system is that the load profile changes all the time; the only thing we can recognize are periods of time in which the workload profile does not change too much. For example, an OLTP system will do comparable things during production hours, from 9:30 till 11:30 in the morning and from 14:00 till 16:00 in the afternoon, but will do something totally different from 01:00 till 06:00 at night. This example may match some OLTP systems, but could be totally different for your OLTP production system. Continue reading →
Over the last couple of months I have worked very hard on Method GAPP and have finally made a very big improvement to it. In the past GAPP was only able to pinpoint where in the architecture the biggest variance in response time was caused. The improvement now also makes it possible to find, within a certain margin of error, the service time per measured component in the architecture. The point is that the component causing the biggest variance in end-user response time is not always the component responsible for the largest share of service time in the total response time.
The second version of GAPP now has an extra step inside the method, called “data modeling”. The data is first modeled with normalized response times for different numbers of servers, using the Erlang C formula. Next, data mining is applied with a generalized linear model and ridge regression, to deal with near collinearities in the data. With this extra step in place, predicting the service time and wait time per measured component became possible. When I first verified it against real system data I was really happy to find out that it works very well. More information will follow soon in blogs and, hopefully by the end of this year, in a white paper.
I am very happy that Hotsos gives me the opportunity to present it next year, in March 2011. I would also like to take this opportunity to thank everybody who inspired me and made this possible, especially Cary Millsap and Dr. Neil Gunther.
For a long time I have been using queuing formulas to calculate CPU queues and I/O queues. One of the big problems I was facing was that the formulas I like to use on big data sets in my GAPP analysis were only available in Perl. For a long time I used approximation functions to avoid the Erlang-C formula and some others programmed in Perl. Last weekend I finally had the time to start programming the formulas in PL/SQL, just to have them easily accessible in my database. After I finished programming the package, I realized that it could also be very handy for other people, so I decided to write this blog post. The package has the following important functions: Erlang-C, Erlang-B, Response Time in multi-server environments (ErlangR in the package), Queue length in multi-server environments (ErlangQ in the package) and Response Time in multi-queue environments like I/O (paratqr in the package).
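To give an idea of what such functions compute, here is a minimal sketch of the standard Erlang-B and Erlang-C formulas with the response time and waiting-queue length for an M/M/m queue. It is written in Python purely for illustration and is not the PL/SQL package itself; the function names and signatures are hypothetical, and the multi-queue response time (paratqr) is not covered.

```python
def erlang_b(servers, offered_load):
    """Blocking probability for `servers` servers and a load in erlangs."""
    b = 1.0
    # Numerically stable recurrence: B(k) = a*B(k-1) / (k + a*B(k-1)).
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    return b


def erlang_c(servers, offered_load):
    """Probability that an arriving request has to wait (M/M/m)."""
    rho = offered_load / servers
    if rho >= 1.0:
        raise ValueError("utilization must be below 100%")
    b = erlang_b(servers, offered_load)
    return b / (1.0 - rho * (1.0 - b))


def response_time(servers, service_time, arrival_rate):
    """Mean response time: service time plus mean waiting time."""
    load = arrival_rate * service_time
    rho = load / servers
    wait = erlang_c(servers, load) * service_time / (servers * (1.0 - rho))
    return service_time + wait


def queue_length(servers, service_time, arrival_rate):
    """Mean number of waiting requests, via Little's law."""
    wait = response_time(servers, service_time, arrival_rate) - service_time
    return arrival_rate * wait


if __name__ == "__main__":
    # Hypothetical example: 4 CPUs, 10 ms service time, 300 requests/s.
    print(erlang_c(4, 3.0), response_time(4, 0.010, 300.0))
```

For a single server the sketch reduces to the familiar M/M/1 result R = S / (1 - rho), which is a quick sanity check on any implementation.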
The formulas are described in the book “Analyzing Computer System Performance with Perl::PDQ” by Dr. Neil J. Gunther (2005). The exact locations in the book are documented in the source of the package. Continue reading →
I recently discovered that under certain circumstances, which for now look to be dependent on environment settings, an ORA-25330 is encountered when running the dbms_predictive_analytics.predict procedure. I found out about this problem while preparing a demo, when the error came up in TOAD (I am not a TOAD fan, but it is handy for searching through the very large number of factor columns in a GAPP analysis). When the same command below was executed directly on the server via sqlplus, the error was not encountered.
To investigate, I tried to create an errorstack dump for the error, but without luck. So I started a SQL trace and found out that on the server the ORA-25330 is actually translated to ORA-40206. After finding that, I created the following small test in TOAD: Continue reading →