Monday, August 24, 2009

Scalability and the Essbase Java API

During my Dodeca webcast a couple of weeks ago, someone asked about the typical number of servers necessary for a Dodeca deployment. Dodeca is quite resource friendly on the server due to its architecture, and thus the answer is 'Fewer than you would expect'.

One reason is that we use the Essbase Java API on the server. The Java API was designed from the ground up to be a highly scalable and highly dependable API layer to be consumed by both Hyperion applications and third-party applications. We did extensive scalability testing of our own on our Essbase services. The most extreme test ran 25 concurrent threads constantly on an old, underpowered server we had sitting around. The test was intended to simulate approximately 500 users, assuming a usage pattern in which each user is querying the database 5% of the time (500 x 5% = 25 concurrent requests) and analyzing the results in the application the rest of the time. We left this test running for 5 months in a single instance of Tomcat with the following results:

Number of requests serviced - 204,043,599
Hours of processor time - 2237:07:28
RAM used - 66.2 MB

More than 200 million transactions in a single Tomcat instance. I remember 15 years ago when the Excel add-in wouldn't do more than 50 or 60 retrieves before it would sometimes crash. Now that we have faster servers, I am thinking perhaps we should go for a billion transactions in a single instance!
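To put those numbers in perspective, here is a small back-of-the-envelope sketch of the arithmetic behind the test. The class and method names are made up for illustration, and the 150-day wall-clock figure is an assumption standing in for "5 months"; only the user count, the 5% query ratio, the request total, and the processor time come from the test itself.

```java
import java.time.Duration;

public class LoadTestMath {

    // Threads needed to stand in for `users` people who each spend
    // `queryFraction` of their time actually waiting on a query.
    static long concurrentThreads(int users, double queryFraction) {
        return Math.round(users * queryFraction);
    }

    // Average throughput over the whole run.
    static double requestsPerSecond(long requests, Duration wallClock) {
        return (double) requests / wallClock.getSeconds();
    }

    // Average processor time consumed per serviced request, in milliseconds.
    static double cpuMillisPerRequest(Duration cpuTime, long requests) {
        return (double) cpuTime.toMillis() / requests;
    }

    public static void main(String[] args) {
        long requests = 204_043_599L;                        // from the results above
        Duration cpu = Duration.ofHours(2237)                // 2237:07:28 of processor time
                .plusMinutes(7)
                .plusSeconds(28);
        Duration wall = Duration.ofDays(150);                // ASSUMPTION: "5 months" taken as 150 days

        System.out.println("threads        = " + concurrentThreads(500, 0.05));
        System.out.printf("requests/sec   ~ %.1f%n", requestsPerSecond(requests, wall));
        System.out.printf("CPU ms/request ~ %.1f%n", cpuMillisPerRequest(cpu, requests));
    }
}
```

Under the 150-day assumption this works out to roughly 16 requests per second sustained, at a little under 40 ms of processor time per request.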

All of this ties in very nicely with the functionality I am working on currently for Dodeca. In a near future version, we expect to ship functionality that will allow administrators to capture the queries, calcs, etc. that users are calling when they use the application and use that input to run their own stress testing on their server using their own data and usage patterns. Let me know if you think this functionality would be useful in your environment.
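For readers who want a feel for what replaying captured activity under load might look like, here is a minimal sketch using only the standard java.util.concurrent classes. It is not the Dodeca implementation: the ReplayStressTest class, the replay method, and the idea of representing each captured query or calc as a Runnable are all assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class ReplayStressTest {

    /**
     * Replays the captured operations on `threads` worker threads.
     * Each worker runs the whole list `rounds` times.
     * Returns the total number of operations that completed.
     */
    static long replay(List<Runnable> captured, int threads, int rounds)
            throws InterruptedException {
        AtomicLong completed = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int r = 0; r < rounds; r++) {
                    for (Runnable op : captured) {
                        op.run();                    // hypothetical: one captured query or calc
                        completed.incrementAndGet(); // counted only if the operation returned normally
                    }
                }
            });
        }

        pool.shutdown();                             // accept no new work, let the workers finish
        if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
            pool.shutdownNow();
            throw new IllegalStateException("replay did not finish in time");
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-ins for captured operations; a real run would call the server here.
        List<Runnable> captured = List.of(() -> { }, () -> { });
        System.out.println(replay(captured, 25, 100) + " operations replayed");
    }
}
```

A fixed pool of 25 threads mirrors the thread count used in the test described earlier; in a real run the thread count would be derived from the expected number of users and their query ratio.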

1 comment:

David A. Fraser said...

Tim,
I'm working with the Essbase JAPI right now and I'm noticing that much of it is based on arrays. This could be a problem for scalability, no? For example, I want to stream the millions of members from a cube into a database. I don't want to pull all of those members into memory on one machine and then dump them into the database on the other. I want to fetch, say, 10,000 at a time from Essbase. The API doesn't appear to allow me to do this. IEssMdAxis has getAllTupleMembers(..), which returns an IEssMdMember array instead of an iterator. Because it's an array, the API must dump all the data from the cube onto my machine before I can start working with it. Ideas?