A quick and dirty substitution variable updater

There are a lot of different ways to update your substitution variables.  You can tweak them with EAS by hand, or use one of several different methods to automate it.  Here is one method that I have been using that seems to hit a relative sweet spot in terms of flexibility, reuse-ability, and effectiveness.

First of all, why substitution variables?  They come in handy because you can leave your Calc and Report scripts alone, and just change the substitution variable to the current day/week/month/year and fire off the job.  You can also use them in load rules.  You would do this if you only want to load in data for a particular year or period, or records that are newer than a certain date, or something similar.

The majority of my substitution variables seem to revolve around different time periods.  Sometimes the level of granularity is just one period or quarter (and the year of the current period, if in a separate Years dimension), and sometimes it’s deeper (daily, hourly, and so on).

Sure, we could change the variables ourselves, manually, but where’s the fun in that?  People that know me know that I have a tendency to automate anything I can, although I still try to have respect for what we have come to know as “keeping an appropriate level of human intervention” in the system.  That being said, I find that automating updates to timing variables is almost always a win.

Many organizations have a fiscal calendar that is quite different than a typical (“Gregorian”) calendar with the months January through December.  Not only can the fiscal calendar be quite different, it can have some weird quirks too.  For example, periods may have only four weeks one year but have five weeks in other years, and on top of that, there is some arcane logic used to calculate which is which (well, it’s not really arcane, it just seems that way).  The point is, though, that we don’t necessarily have the functionality on-hand that converts a calendar date into a fiscal calendar date.

One approach to this problem would be to simply create a data file (or table in a relational database, or even an Excel sheet) that maps a specific calendar date to its equivalent fiscal date counterparts.  This is kind of the “brute-force” approach, but it works, and it’s simple.  You just have to make sure that someone remembers to update the file from year to year.

For example, for the purposes of the date “December 22, 2008” in a cube with separate years, time, and weekday dimensions, I need to know three things: the fiscal year (probably 2008), the fiscal period (we’ll say Period 12 for the sake of simplicity, and the day of the week: day “2”).  Of course, this can be very different across different companies and organizations.  Monday might be the first day of the week or something.  If days are included in the Time dimension, we don’t really need a separate variable here.  So, the concepts are the same but the implementation will look different (as with everything in Essbase, right?).

I want something a bit “cleaner,” though.  And by cleaner, I mean that I want something algorithmic to convert one date to another, not just a look-up table.  Check with the Java folks in your company, if you’re lucky then they may already have a fiscal calendar class that does this for you.  Or it might be Visual Basic, or C++, or something else.  But, if someone else did the hard work already, then by all means, don’t reinvent the wheel.

Here is where the approaches to updating variables start to differ.  You could do the whole thing in Java, updating variables with the Java API.  You could have a fancy XML configuration file that is interpreted and tells the system what variables to create, where to put them, and so on.  In keeping with the KISS philosophy, though, I’m going to leave the business logic separate from the variable update mechanism.  Meaning this: in this case I will use just enough program code to generate the variables, then output them to a space-delimited file.  I will then have a separate process that reads the file and updates the Essbase server.  One of the other common approaches here would be to simply output MaxL or ESSCMD script itself, then run the file.  This works great too, however, I like having “vanilla” files that I can load in to other programs if needed (or, say, use in a SQL Server DTS/SSIS job).

At the end of the day, I’ve generated a text file with conents like this:

App1 Db1 CurrentYear 2008
App1 Db1 CurrentPeriod P10
App1 Db1 CurrentWeek Week4
App2 Db1 CurrentFoo Q1

Pretty simple, right?  Note that this simplified approach is only good for setting variables with a specific App/database.  It needs to be modified a little to set global substitution variables (but I’m sure you are enterprising enough to figure this out — check the tech ref for the appropriate MaxL command).

At this point we could setup a MaxL script that takes variables on the command line and uses them in its commands to update the corresponding substitution variable, but there is also another way to do this: We can stuff the MaxL statement into our invocation of the MaxL shell itself.  In a Windows batch file, this whole process looks like this:

SET SERVER=essbaseserver
SET USER=essbaseuser
SET PW=essbasepw

REM generates subvar.conf file
REM this is your call to the Java/VB/C/whatever program that
REM updates the variable file
subvarprogram.exe

REM this isn't strictly needed but it makes me feel better
sleep 2

REM This is batch code to read subvar.conf's 4 fields and pipe
REM them into a MaxL session
REM NOTE: this is ONE line of code but may show as multiple in
REM your browser!

FOR /f "eol=; tokens=1,2,3,4 delims=, " %%i in (subvar.conf) do echo
alter database %%j.%%k set variable %%i %%l; | essmsh -s %SERVER% -l
%USER% %PW% -i 

REM You would use the below statement for the first time you need
REM to initialize the variables, but you will use the above statement
REM for updates to them (you can also just create the variables in
REM EAS)

REM FOR /f "eol=; tokens=1,2,3,4 delims=, " %%i in (subvar.conf) do
echo alter database %%j.%%k add variable %%i; | essmsh -s %SERVER% -l
%USER% %PW% -i

Always remember — there’s more than one way to do it. And always be mindful of keeping things simple — but not too simple.  Happy holidays, ya’ll.

Getting everything done with just Essbase and Dodeca

Many people I talk to are amazed that a company can get by with just Essbase and Dodeca.  No Planning deployment, no HFM, none of that other stuff.  The core Essbase analytic engine brings a lot to the table, which is this: multi-dimensional analysis and all of the associated functionality for building, automating, and managing cubes (EAS, EDS, EIS, and such, in Essbase 7.x terminology).

So, out of the box with a typical Essbase setup, you can unlock the benefits of multi-dimensional analysis just by virtue of using using the Excel add-in.  We all know the Excel add-in has some issues (it’s not exactly aging gracefully), but it’s simple, fast, and many users absolutely love it.  The financial types tend to “get it” right away and go on their merry way.  They also seem pretty sad when they are “forced” to go to Business Objects for some data that isn’t in Essbase.

Providing access to cubes via the add-in gives your users a lot already.  Historically, it seems that many companies have grown organically around the Excel add-in.  They evolve to using VBA automation for retrieval and reporting jobs, creating formal Excel solutions, and of course facilitating various processes with a good old “Lock and Send,” with some calcs thrown in for good measure.  This is all well and I good, but for larger endeavors this approach can suffer from maintenance, security, and scalability issues.

Without clearly defined standards, the VB implementations seem to go south and turn into a mess of spaghetti code that breaks at the most inopportune times.  There are different versions of everything floating around.  The deployment method ranges from emailing updated files, to copying them off the network drive, to even throwing them on a USB stick and moving them around when all else fails.  It’s not always bad, but surely there is a way to improve upon this paradigm?

So, let’s step back for a moment.  Our big win in the first place was these useful cubes that slice and dice data in a way that other systems in the enterprise simply can’t compare to.  If we want to expand upon this existing success, what do we put on the wish list?  What are my goals, and what do I want to provide for my users?

I want to give my users something that’s easy to use, has a short learning curve, leverages existing Excel/ad-hoc knowledge, and is cohesive. As for myself , I’m a bit more concerned about the back-end.  I want something that is easy to deploy, easy to update, adapts to changing business needs, and is scalable.  Since I know and love cubes so much, I also want to leverage my existing knowledge of Essbase.

Here’s where Dodeca comes in.  Dodeca complements Essbase functionality (particularly my existing Essbase functionality) very well, by providing everything on top of the cubes that I want to give to my users.  My deployment and update issues are essentially solved since Dodeca uses Microsoft ClickOnce technology — each time the user runs Dodeca on their workstation, a process checks against the current version that has been published to a central server, and updates files if it needs to.  This functionality is dependent on the client workstation having the .NET Framework installed — this is not some obscure third-party library either, so if your client workstations don’t already have it, it is not incredibly difficult to get setup.  Essentially in this scenario, we get the benefits of a client-side application (responsive interface, complex widgets), coupled with the distribution benefits that the net provides.  Also, since all content is stored and retrieved from a central server, we just have to update content in one place and our clients will pull it down the next time they use it.

Once a user launches Dodeca, they are presented with a list of Views that they have access to.  Views can be different reports, web pages, and other things.  Dodeca is highly configurable, so my following comments will reflect a typical usage scenario, but not necessarily the absolute way that things have to be setup.  I also find it useful to set a user’s default view to a web page that provides status updates and other informational messages.

A common Essbase activity with Excel is to refresh a particular report (or numerous reports, or mountains of reports) at the end of a period.  This means updating the time period in some fashion, and refreshing everything.  This is either done by hand or with a little home brew VBA action.  This functionality lends itself quite well to being implemented in Dodeca.  We can provide the same report in Dodeca by modifying the current Excel sheet a little bit, importing it, and setting a few other things (who has access to it, how the report gets updated, what database it comes out of, etc).  By doing this, we can now provide the same report, with user-selectable dimensional members (such as Time period, but just as easily Location, Department, Scenario, and so on) to anybody in the organization.  If we need to modify the report, we can do it in one place (the server) and seconds later, if a user pulls up the report, they now see the latest version.

So far we’ve just scratched the surface, but as you can see, we’ve addressed our needs and wants fairly well.  Deployment and upgrades are simple (and I am largely abstracted from a sometimes lethargic IT department), and content updates ridiculously simple.  My users have a cohesive (one window with multiple tabs) environment to do their work in, with all of the functionality of Excel.  As an administrator I have gotten to keep, leverage, and extend all of my existing Essbase infrastructure and functionality.  And best of all, these tools are not mutually exclusive in the least bit: my users can keep using the Excel add-in the love so much.  In fact, while in Dodeca, any sheet they pull up can be instantly sent to Excel just by clicking a button.

It works for me, it gets things done, and it’s a winning combination.  In the coming weeks and months I’ll be expanding more on this.  Happy Holidays!

MaxL tricks and strategies on upgrading a legacy automation system from ESSCMD

The Old

In many companies, there is a lot of code laying around that is, for lack of better word, “old.” In the case of Essbase-related functionality, this may mean that there are some automation systems with several ESSCMD scripts laying around.  You could rewrite them in MaxL, but where’s the payoff?  There is nothing inherently bad with old code, in fact, you can often argue a strong case to keep it: it tends to reflect many years of tweaks and refinements, is well understood, and generally “just works” — and even when it doesn’t you have a pretty good idea where it tends to break.

Rewrite it?

That being said, there are some compelling reasons to do an upgrade.  The MaxL interpreter brings a lot to the table that I find incredibly useful.  The existing ESSCMD automation system in question (essentially a collection of batch files, all the ESSCMD scripts with the .aut extension, and some text files) is all hard-coded to particular paths.  Due to using absolute paths with UNC names, and for some other historical reasons, there only exists a production copy of the code (there was perhaps a test version at some point, but due to all of the hard-coded things, the deployment method consisted of doing a massive search and replace operation in a text editor).  Because the system is very mature, stable, and well-understood, it has essentially been “grandfathered” in as a production system (it’s kind of like a “black box” that just works).

The Existing System

The current system performs several different functions across its discreet job files.  There are jobs to update outlines, process new period data, perform a historical rebuild of all cubes (this is currently a six hour job and in the future I will show you how to get it down to a small fraction of its original time), and some glue jobs that scurry data between some different cubes and systems.  The databases in this system are setup such that there are  about a dozen very similar cubes.  They are modeled on a series of financial pages, but due to differences in the way some of the pages work, it was decided years ago that the best way to model cubes on the pages was to split them up in to different sets of cubes, rather than one giant cube.  This design decision had paid off in many ways.  One, it keeps the cubes cleaner and more intuitive; interdimensional irrelevance is also kept to a minimum.  Strategic dense/sparse settings and other outline tricks like dynamic calcs in the Time dimension rollups also keep things pretty tight.

Additionally, since the databases are used during the closing period, not just after (for reporting purposes), new processes can go through pretty quickly and update the cubes to essentially keep them real-time with how the accounting allocations are being worked out.  Keeping the cubes small allows for a lot less down-time (although realistically speaking, even in the middle of a calc, read-access is still pretty reliable).

So, first things first.  Since there currently are no test copies of these “legacy” cubes, we need to get these setup on the test server.  This presents a somewhat ironic development step: using EAS to copy the apps from the production server, to the development server.  These cubes are not spun up from EIS metaoutlines, and there is very little compelling business reason to convert them to EIS just for the sake of converting them, so this seems to be the most sensible approach.

Although the outlines are in sync right now between production and development because I just copied them, the purpose of one of the main ESSCMD jobs is to update the outlines on a period basis, so this seems like a good place to start.  The purpose of the outline update process is basically to sync the Measures dimension to the latest version of the company’s internal cross-reference.  The other dimensions are essentially static, and only need to be updated rarely (e.g., to add a new year member).  The cross-reference is like a master list of which accounts are on which pages and how they aggregate.

On a side note, the cross-reference is part of a larger internal accounting system.  What it lacks in flexibility, it probably more than makes up for with reliability and a solid ROI.  One of the most recognized benefits that Essbase brings to the table in this context is a completely new and useful way of analyzing existing data (not to mention write-back functionality for budgeting and forecasting) that didn’t exist.  Although Business Objects exists within the company too, it is not recognized as being nearly as useful to the internal financial customers as Essbase is.  I think part of this stems from the fact that BO seems to be pitched more to the IT crowd within the organization, and as such, serves mostly as a tool to let them distribute data in some fashion, and call it a day.  Essbase really shines, particularly because it is aligned with the Finance team, and it is customized (by finance team members) to function as a finance tool, versus just shuttling gobs of data from the mainframe to the user.

The cross-reference is parsed out in an Access database in order to massage the data into various text files that will serve as the basis of dimension build load rules for all the cubes.  I know, I know, I’m not a huge Access fan either, but again, the system has been around forever, It Just Works, and I see no compelling reason to upgrade this process, to say, SQL Server.  Because of how many cubes there are, different aliases, different rollups, and all sorts of fun stuff, there are dozens of text files that are used to sync up the outlines.  This has resulted in some pretty gnarly looking ESSCMD scripts.  They also use the BEGININCBUILD and ENDINCBUILD ESSCMD statements, which basically means that the cmd2mxl.exe converter is useless to us.  But no worries — we want to make some more improvements besides just doing a straight code conversion.

In a nutshell, the current automation script logs in (with nice hard-coded server path, user name, and password, outputs to a fixed location, logs in to each database in sequence, and has a bunch of INCBUILDDIM statements.  ESSCMD, she’s an old girl, faithful, useful, but just not elegant.  You need a cheatsheet to figure out what the invocation parameters all mean.  I’ll spare you the agony of seeing what the old code looks like.

Goals

Here are my goals for the conversion:

  • Convert to MaxL. As I mentioned, MaxL brings a lot of nice things to the table that ESSCMD doesn’t provide, which will enable some of the other goals here.
  • Get databases up and running completely in test — remember: the code isn’t bad because it’s old or “legacy code,” it’s just “bad” because we can’t test it.
  • Be able to use same scripts in test as in production.  The ability to update the code in one place, test it, then reliably deploy it via a file-copy operation (as opposed to hand-editing the “production” version) is very useful (also made easier because of MaxL).
  • Strategically use variables to simplify the code and make it directory-agnostic.  This will allow us to easily adapt the code to new systems in the future, for example, if we want to consolidate to a different server in the future, even one on a different operating system).
  • And as a tertiary goal: Start using a version control system to manage the automation system.  This topic warrants an article all on itself, which I fully intend to write in the future.  In the meantime, if you don’t currently use some type of VCS, all you need to know about the implications of this are that we will have a central repository of the automation code, which can be checked-in and checked-out.  In the future we’ll be able to look at the revision history of the code.  We can also use the repository to deploy code to the production server.  This  means that I will be “checking-out” the code to my workstation to do development, and I’m also going to be running the code from my workstation with a local copy of the MaxL interpreter.  This development methodology is made possible in part because in this case, my workstation is Windows, and so are the Essbase servers.

For mostly historical reasons the old code has been operated and developed on the analytic server itself, and there are various aspects about the way the code has been developed that mean you can’t run it from a remote server.  As such, there are various semantic client/server inconsistencies in the current code (e.g. in some cases we are referring to a load rule by it’s App/DB context, and in some cases we are using an absolute file path).  Ensuring that the automation works from a remote workstation will mean that these inconsistencies are cleaned up, and if we choose to move the automation to a separate server in the future, it will be much easier.

First Steps

So, with all that out of the way, let’s dig in to this conversion!  For the time being we’ll just assume that the versioning system back-end is taken care of, and we’ll be putting all of our automation files in one folder.  The top of our new MaxL file (RefreshOutlines.msh) looks like this:

msh "conf.msh";
msh "$SERVERSETTINGS";

What is going on here?  We’re using some of MaxL features right away.  Since there will be overlap in many of these automation jobs, we’re going to put a bunch of common variables in one file.  These can be things like folder paths, app/database names, and other things.  One of those variables is the $SERVERSETTINGS variable.  This will allow us to configure a variable within conf.msh that points to where the server-specific MaxL configuration file.  This is one method that allows us to centralize certain passwords and folder paths (like where to put error files, where to put spool files, where to put dataload error files, and so on).  Configuring things this way gives us a lot of flexibility, and further, we only really need to change conf.msh in order to move things around — everything else builds on top of the core settings.

Next we’ll set a process-specific configuration variable which is a folder path.  This allows us to define the source folder for all of the input files for the dimension build datafiles.

SET SRCPATH = "../../Transfer/Data";

Next, we’ll log in:

login $ESSUSER identified by $ESSPW on $ESSSERVER;

These variables are found in the $SERVERSETTINGS file.  Again, this file has the admin user and password in it.  If we needed more granularity (i.e., instead of running all automation as the god-user and instead having just a special ID for the databases in question), we could put that in our conf.msh file.  As it is, there aren’t any issues on this server with using a master ID for the automation.

spool stdout on to "$LOGPATH/spool.stdout.RefreshOutlines.txt";
spool stderr on to "$LOGPATH/spool.stderr.RefreshOutlines.txt";

Now we use the spooling feature of MaxL to divert standard output and error output to two different places.  This is useful to split out because if the error output file has a size greater than zero, it’s a good indicator that we should take a look and see if something isn’t going as we intended.  Notice how we are using a configurable LOGPATH directory.  This is the “global” logpath, but if we wanted it somewhere else we could have just configured it that way in the “local” configuration file.

Now we are ready for the actual “work” in this file.  With dimension builds, this is one of the areas where ESSCMD and MaxL do things a bit differently.  Rather than block everything out with begin/end build sections, we can jam all the dimension builds into one statement.  This particular example has been modified from the original in order to hide the real names and to simplify it a little, but the concept is the same.  The nice thing about just converting the automation system (and not trying to fix other things that aren’t broken — like moving to an RDBMS and EIS) is that we get to keep all the same source files and the same build rules.

import database Foo.Bar dimensions

    from server text data_file "$SRCPATH/tblAcctDeptsNon00.txt"
    using server rules_file 'DeptBld' suppress verification,

    from server text data_file "$SRCPATH/tblDept00Accts.txt"
    using server rules_file 'DeptBld'

    preserve all data
    on error write to "$ERRORPATH/JWJ.dim.Foo.Bar.txt";

In the actual implementation, the import database blocks go on for about another dozen databases.  Finally, we finish up the MaxL file with some pretty boilerplate stuff:

spool off;
logout;
exit;

Note that we are referring to the source text data file in the server context.  Although you are supposed to be able to use App/database naming for this, it seems that on 7.1.x, even if you start the filename with a file separator, it still just looks in the folder of the current database.  I have all of the data files in one place, so I was able to work around this by just changing the SRCPATH variable to go up two folders from the current database, then back down into the Transfer\Data folder.  The Transfer\Data folder is under the Essbase app\ folder.  It’s sort of a nexus folder where external processes can dump files because they have read/write access to the folder, but it’s also the name of a dummy Essbase application (Transfer) and database (Data) so we can refer to it and load data from it, from an Essbase-naming perspective.  It’s a pretty common Essbase trick.  We are also referring to the rules files from a server context.  The output files are to a local location.  This all means that we can run the automation from some remote computer (for testing/development purposes), and we can run it on the server itself.  It’s nice to sort of “program ahead” for options we may want to explore in the future.

For the sake of completeness, when we go to turn this into a job on the server, we’ll just use a simple batch file that will look like this:

cd /d %~dp0
essmsh RefreshOutlines.msh

The particular job scheduling software on this server does not run the job in the current folder of the job, therefore we use cd /d %~dp0 as a Windows batch trick to change folders (and drives if necessary) to the folder of the current file (that’s what %~dp0 expands out to).  Then we run the job (the folder containing essmsh is in our PATH so we can run this like any other command).

All Done

This was one of the trickier files to convert (although I have just shown a small section of the overall script).  Converting the other jobs is a little more straightforward (since this is the only one with dimension build stuff in it), but we’ll employ many of the same concepts with regard to the batch file and the general setup of the MaxL file.

How did we do with our goals?  Well, we converted the file to MaxL, so that was a good start.  We copied the databases over to the test server, which was pretty simple in this case.  Can we use the same scripts in test/dev and production?  Yes.  Since the server specific configuration files will allow us to handle any folder/username/password issues that are different between the servers, but the automation doesn’t care (it just loads the settings from whatever file we tell it), I’d say we addressed this just fine.  We used MaxL variables to clean things up and simplify — this was a pretty nice cleanup over the original ESSCMD scripts.  And lastly, although I didn’t really delve into it here, this was all developed on a workstation (my laptop) and checked in to a Subversion repository, further, the automation all runs just fine from a remote client.  If we ever need to move some folders around, change servers, or make some other sort of change, we can probably adapt and test pretty quickly.

All in all, I’d say it was a pretty good effort today.  Happy holidays ya’ll.

A @JExport odyssey: installing, updating, and troubleshooting the JExport custom defined function

Beginning

I have seen various references to JExport on the Network54 forum over the years but I have never had a use for it — until now. I would like to use a cube to sort of ‘manage’ data, but the SQL backend for the data needs to be updated at the same time. I could just use a report script out of the cube and load that to the RDBMS, but I want to be able to dynamically update based on changes to the cube, rather than have to sweep the cube at certain intervals. This also plays well into the fact that Dodeca will be used as a front-end to update the data, so that when the new data is sent in (essentially as a Lock & Send operation), everything gets updated on the fly.

There doesn’t really seem to be a lot of JExport documentation out there, and I couldn’t find anything at all where people are using JExport with a SQL backend.  After stumbling through things with my feeble Java skills, lots of trial and error, and numerous cups of coffee, I was able to get things all setup and working.

First of all, I had to get the necessary files. The JExport documentation labels it as “shareware” (and it is not an officially supported piece of functionality from Oracle/Hyperion), so for the time being I will simply host a zip of the files here. If it turns out someday that I have to take this link down, then so be it, but for now, here you are.

Extracting the zip revels a PDF whitepaper and another zip file. Inside the other zip you will find the following files:

  • ExportCDF.jar — this is the jar file (a jar file is a zip file that contains a bunch of Java classes and some other things).  You can open this up in 7-zip to view the contents.
  • ExportCDF_Readme.htm — some information on installing and configuration stuff
  • exportRDB.mdb — this is only necessary for demo purposes, you don’t really need it
  • rdb.properties — you’ll want this for configuring the JDBC connection.
  • RegisterExportCDF.msh — this is convenient to install the functions to Essbase for you (or you can do it through EAS).
  • src folder — this has the source for all of the methods, you’ll need this in order to compile a new .jar yourself (like if you need to make changes to the implementation).
  • Other files — there are some sample calc scripts, and a gif file (yay!)

So now what?

Following the instructions that are included is fairly straightforward.  I can vouch for the fact that these work on Essbase 7.1.6 but I do not know how things may have changed under the hood for newer versions, so if it works at all, then you may need to modify them accordingly.

As the directions indicate, you need to put the ExportCDF.jar file in to $ARBORPATH/java/udf.  Also put the rdb.properties in the same folder.  Here’s a tip if you want to use JDBC for the exports: ALSO put the JDBC jar file in this folder.  For example, the SQL Server .jar I am using is sqljdbc.jar, so I put this file in the same folder as well.

Still following the included directions, you need make some changes to your udf.policy file in the $ARBORPATH/java directory.  Add the following lines:

grant codeBase "file:${essbase.java.home}/../java/udf/ExportCDF.jar" {
    permission java.security.AllPermission;
};

Here’s another part where the included directions may fail you, if you are planning on using a SQL backend: ALSO add a line for your JDBC jar file.  After getting some Access Denied errors during my troubleshooting process, it finally occurred to me that I needed to add a line for sqljdbc.jar too.  Here’s what it looks like:

grant codeBase "file:${essbase.java.home}/../java/udf/sqljdbc.jar" {
    permission java.security.AllPermission;
};

Again, change the names accordingly if you need to use different .jar files. If you have to use the DB2 trifecta of .jar files, then I imagine adding the three entries for these would work too.  If you can’t access the udf.policy file because it’s in use, you may need to stop your Essbase service, edit the file, and restart the service.  Now we should be all set to register the functions.  The included MaxL file was pretty handy.

Be sude to change admin and password to an appropriate user on the analytic server (unless of course, you use ‘admin’ and admin’s password is ‘password’…) and run it.  On a Windows server this would mean opening up the commandline, cd’ing to the directory with the MaxL script, and running ‘essmsh RegisterExportCDF.msh’.  If you can’t run this from the Essbase server itself, then change localhost to the name of the server and run it from some other machine with the MaxL interpreter on it.  If you go in to EAS and check out the “Functions” node under your analytic server, you should now see the three functions added there.  Due to the way they’ve been added, they are in the global scope — that is, any application can access them.  From here, the example calc script files included with JExport should help you find your way.  I would suggest trying to get the file export method working first, as that has the least potential to not work.  Here is the code (I have cleaned it up slightly for formatting issues):

/*
 *    Export to a text file
 *   arg 1:  specify "file" to export to a text file
 *   arg 2:  file name.  This file name must be used to close the file after the calculation completes
 *   arg 3:  delimiter.  Accepts "tab" for tab delimited
 *   arg 4:  leave blank when exporting to text files
 *   arg 5:  an array of member names
 *   arg 6:  an array of data
 */

/* Turn intelligent calc off */
SET UPDATECALC OFF

/*
 * Fix on Actual so that only one scenario is evaluated, otherwise a
 * record for each scenario will be written and duplicated in the export
 */
FIX ("Actual")	

  Sales (
    IF ("variance" < 0)
      @JExportTo("file","c:/flat.txt",",","",
         @LIST(
           @NAME(@CURRMBR(Market)),
           @NAME(@CURRMBR(Product)),
           @NAME(@CURRMBR(measures)),
           @NAME(@CURRMBR(year))),
         @LIST(actual,budget,Variance)
    );
    ENDIF;
  }

ENDFIX

/* Close the file */

RUNJAVA com.hyperion.essbase.cdf.export.CloseTarget "file" "c:/flat.txt" ;

Note that you have to anchor the JExport function to a member — so in this case, if you try to take out Sales and the parentheses that surround the IF/JExportTo/ENDIF, EAS will bark at you for invalid syntax.  The main thing going on here is that we are calling the function with the given set of parameters (hey, there actually is a use for @NAME!  I kid, I kid…).  For blog formatting issues, I broke the call to JExportTo up into multiple lines, but this is still syntactically correct.  In the case where we are exporting to a flat file, we also call a method to close the file (this part isn’t needed for JDBC).

Feeling brave?  Try and run it.  If everything works as planned, you are well on your way to CDF happiness.  In my case, I was aiming high and trying to get the JDBC stuff to work, but when I realized that was going nowhere, I decided to simplify things and go with the flat file approach.  Among other things, it showed me that I had to fix up my member combinations a bit before it would fit into a SQL table.

But I want to use JDBC!

This is where you get to benefit from my pain and experience.  As you saw above, you need to put the .jar file for your RDMBS in the folder I mentioned, AND you have to edit the udf.policy file for the jar(s) you add.  Now you need to shore up your rdb.properties file.  You can comment out everything in the file except the section you need, so in my case, I put a # in front of the DB2 entries and the Oracle entries, leaving just the TargetSQL entries.

# SQLServer entries:
TargetSQL.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
TargetSQL.url=jdbc:sqlserver://foosql.bar.com:1433;databaseName=EssUsers
TargetSQL.user=databaseuser
TargetSQL.password=databasepw

Notice that all of these entries start with “TargetSQL.” This can actually be anything you want it to be, but whatever it is, that’s how you will refer to it from your calc scripts.  This means if you just have one rdb.properties file but you want to do some JExport magic with multiple SQL backends, then you just put in another section like foo.driver, foo.url, and so on.  Note that the syntax I am using is explicitly calling out port 1433.  This is what my little SQL Server 2000 box is using — you may need to adjust yours.  Originally I did not specify my JDBC URL correctly, but make sure for SQL Server you put the databaseName parameter on the end.  You could perhaps get away with not specifying the database name here and instead prefixing it to your table name in the JExport command, but this works so let’s run with it.

So what does the code look like in the calc script now?  It’s the same, except we don’t need the RUNJAVA line at the end, and we change the JExportTo line to something like this:

@JExportTo("JDBC","TargetSQL","","TEST_TABLE",
  @List(@NAME(@CURRMBR(Product)),@NAME(@CURRMBR(Market))),
  @LIST(Actual));

The function is the same, but the parameters have changed a bit.  We now tell it through the first parameter that this is a JDBC connection, and we tell it “TargetSQL” in the second parameter.  This should look familiar because that’s essentially the prefix it’s looking for in the rdb.properties file.

The third parameter is blank (this was the delimiter field for text file exports).  We then tell it a table name in the RDBMS to put the data in to, then we give it a list of parameters and values.  In this case, the Java method quite literally creates a SQL INSERT statement that will look something like the following:

INSERT INTO TEST_TABLE VALUES ('<Product>', '<Market>', 2)

Of course, <Product> and <Market> will actually be the current member from those dimensions, and the 2 would be whatever the actual data cell is (based on the FIX statement you saw earlier, of course).  Assuming you did everything correctly (and the specification for TEST_TABLE in the given database is consistent with the data you are trying to insert), everything should be all hugs and puppy dogs now.

But it’s not all hugs and puppy dogs

If, like me, you got here, but things didn’t work, you are now wondering “how in the name of all that is holy do I figure out what is wrong?”  Let’s recap.  We did all these things:

  1. Copied the .jar file to the server in the proper folder
  2. Copied the rdb.properties file to the same folder
  3. Copied any necessary .jar files for our RDBMS to the same folder
  4. Edited udf.policy to add all of these .jar files
  5. Ran the .msh file or used EAS to add the functions to the analytic server
  6. Stopped and started the Essbase service as needed (and one more time, just for good measure)
  7. Added the example code to a calc script in one of our apps
  8. Verified the syntax and it verified for us (no “function not defined” type of errors)
  9. Ran it (duh)
  10. Verified that everything works, at least for the .txt file output

How do we troubleshoot?  If you’re like me, and don’t know Java inside and out, and know even less about how the custom defined functions are setup within Essbase, then you really have no idea how to go about this.  I tried looking at the server logs and the app logs and the Essbase console itself, but I just couldn’t find where, if anywhere, the output from the ExportCDF methods would go (you know, so I could see an actual error message about what might be wrong).

So I did what any normal Essbase developer would do and I dug in and brute forced it.  There are some System.out.println() commands at various places in the Export functions, so I know if I can get the output from these then I can see what the deal is.  At this point, I also knew that I was able to successfully write files on the Essbase server (with the Export function to a flat file method), so, lacking any other clear method, how about I output the error messages to a file instead?  This is actually pretty straightforward, but the tricky part (for me) was recreating the ExportCDF.jar file from the .java files I had.

First of all, before any code changes, let’s see if we can turn the .java files into a .jar that still works on the analytic server.  The source files are located in our original .zip file under the src folder.  Under src/ is the com/ folder and another series of folders that represent the Java package name.  Let’s start things off by putting a folder on the server somewhere so we have a place to work in.  I used the server for a couple of reasons: one, I’ll be using the JDK from the server to roll the .jar file, so keeping the versions consistent will help reduce a possible area that might not work, and two, when I need to revisit this, I have all the files I need location in a place that is easy to get to, and regularly backed up.  If I were a real Java master I would probably do it all on test and target the architecture to make sure they’re compatible (although just for kicks, I compiled with 1.4 and 1.6 and the output appears to be the same).

If you have some fancypants Java toolchain that can do all this for you, then you should probably do that.  If you just pretend to know Java, you can get by with the following directions.  After putting all of the files in a folder to work on, I ended up with a jexport/ folder containing the com/ folder hierarchy. Next, create a folder such as “test” under the jexport/ folder.  This is where we will put the compiled Java files so we don’t muddy up our com/ folder hierarchy.  Then we need a command to compile the .java files to .class files, and another command to roll the .class files up into a single jar.  You can use the following commands, and optionally stick them in a batch file for convenience:

javac -d test -classpath %ARBORPATH%\essbase.jar com\hyperion\essbase\cdf\export\*.java
jar cf ExportCDF.jar -C test com

Note that for the classpath we need to refer to the essbase.jar file that is in our Essbase folder somewhere.  Hopefully all compiles correctly and you end up with a shiny new ExportCDF.jar file.  Stop the Essbase service, copy the new file to the $ARBORPATH/java/udf folder, start things up, and test that it works.

It was at this point in my own tribulations that I was quite pleased with myself for having used Java source code to create a module that Essbase can use, but I still hadn’t solved the mystery of the not-working JDBC export.  And since I still don’t know how to get the output from Essbase as it executes the function, the next best thing seems to be just writing it to a file.  With a little Java trickery, I can actually just map the System.out.println commands to a different stream — namely, a file on the Essbase server.

You can add the following code to the ExportTo.java file in order to do so (replacing absolute file paths as necessary):

/* Yes, there are much better ways to do this
   No, I don't know what those are */

try {
    FileOutputStream out = new FileOutputStream("D:/test.txt");
    PrintStream ps = new PrintStream(out);
    System.setOut(ps);
} catch (FileNotFoundException e) {}

It was after doing this that I discovered that in my particular case, my JDBC URL was malformed, and I got my access denied error (which was fixed by putting the sqljdbc.jar references in the udf.policy file).  After all that, I fired up the JDBC test once more, and was thrilled to discover that the data that had been working fine getting exported to a flatfile, was now indeed being inserted to a SQL database!  And thus concluded my two days of working this thing backwards in order to get JExport working with a SQL backend.

And now there is some JExport documentation out on the web —  Happy cubing!

Speaking at ODTUG!

The highly regarded Essbase figure Tim Tow of Applied OLAP called me the other day to inform me that my submissions for ODTUG were accepted! Details are still being worked out, but the focus of the first talk will be on data load optimization (a matter very near and dear to my heart) as well as a sponsored presentation on Dodeca, which is one of Applied OLAP’s flagship products (and also increasingly near and dear to me).

These will be my first presentations in front of the Hyperion/Essbase community so I am hopeful when the time comes I will be able to cram the hour into one of the most useful and comprehensive sessions ever. For those of you out there in the Essbase community that I haven’t met or spoken to, I’ll be seeing you in Monterey next year!

Automate that old cube archive process!

Your server may have dozens or even hundreds of cubes on it.  A common strategy with a large and slowly changing Measures dimension (or some other dimension like Product) is to spin off a copy of the cube after a certain time period, typically the fiscal year end.  There are a number of different reasons that you might do this.  First, the cube may simply focus on Current Year and Prior Year, or a fixed number of years and scenarios such that the cube becomes too unwieldy when you start adding more.  Second, if you need to be able to go back and pull a report so that it looks exactly how it did in a certain fiscal year, then you may need to spin off the cube.  Depending on how many cubes you end up spinning off for each fiscal year, it may be necessary to go and clean them up at some point, but you might still want to keep them around, just in case.  You can do this by hand by stopping the app, zipping up the app folder and all its contents, and deleting the app from within EAS.

Here is an example of a batch file you could use on Windows.  This relies on the free 7-Zip package being installed somewhere.  The nice thing about this approach is that while it uses MaxL, it doesn’t actually have any MaxL files — it just injects the MaxL command via the command-line.  Edit the variables for your setup, and you’re on your way.  It’s not pretty but it’s nice if you have to go cleanup a bunch of apps!  Happy cubing — Jason. [download zip of the following batch file]

@echo off

SET USER=adminuser
SET PW=adminuserpw
SET SERVER=essbaseserver
SET APPPATH=D:\Essbase\App
SET ZIP="7zp\App\7-Zip\7z.exe"

@echo.
@echo -------------------------------------------
@echo This is the cube archiver utility...
@echo.
@echo Looking for App %1 ...

IF NOT EXIST %APPPATH%\%1 GOTO NoApp

@echo.
@echo I found it at %APPPATH%\%1 ...
@echo.
@echo Attempting to stop the app...

REM essmsh -l %USER% -p %PW% -s %SERVER% StopCube.msh %1

echo alter system unload application %1; | essmsh -s %SERVER% -l %USER% %PW% -i

@echo Archiving the app ...

%ZIP% a -tzip EssApp_%1.zip %APPPATH%\%1

echo.

choice /M "Okay to delete app %1"

IF ERRORLEVEL 2 GOTO Done

echo alter application %1 enable startup; | essmsh -s %SERVER% -l %USER% %PW% -i
echo drop application %1 cascade force; | essmsh -s %SERVER% -l %USER% %PW% -i

GOTO Done

:NoApp

@echo I could not find that app at %APPPATH%\%1 !!!

:Done

Optimizing Essbase Automation Jobs (for fun and profit!) – Part 1

I freely submit that I am a complete geek.  But it’s okay, because I have come to accept and embrace that inner geek.  Anyway… Many organizations run jobs to perform a task as certain intervals.  Your job as a good Essbase/Hyperion administrator is to be lazy — that is, make the computers do the boring stuff, while you get to do the fun stuff, like thinking.  What we end up doing is setting up batch and script files that do all of the things in a particular sequence that we might do by hand.  So instead of logging in, copying a data file, clearing a cube, running a calc, loading some data, running another calc, and all that other stuff, we do this automatically.

After a while, you get a feel for how long a certain job takes.  The job that loads last week’s sales might take about an hour to run.  The job to completely restate the database for the entire fiscal year might take six hours to run.  This probably also becomes painfully obvious when it turns out that the job didn’t run correctly, and now you need to run it again (and keep those users happy).  I am all about jobs running as fast as they can.  This is useful on a few different levels.  First and probably foremost, it finishes faster, which is nice, especially for those times when the sooner you have data the better.  It’s also nice when you need to re-run things.  It can also open up some new possibilities to use the tool in ways you didn’t think or know you could — such as during the business week, getting live updates of data.

Typically there are a few big time-killers in an automated process.  These are the data loads, calculations, and reports that are run.  There are also some others, such as calling out to a SQL server to do some stuff, restructuring outlines, pulling tons of data off the WAN, and all that.  You can always go through your database logs to see how long a particular calculation took, but this is a bit tedious.  Besides, if you’re like me, then you want to look under EVERY single nook and cranny for places to save a little time.  To borrow an a saying from the race car world, instead of finding one place to shave off 100 pounds, we can find 100 places to shave off one pound.  In practice, of course, some of these places you can’t cut anything from, but some of them you can cut down quite a bit.

The first, and probably most important step to take is to simply understand where you are spending your processing time.  This will give you a better window into where you spend time, rather than just a few things in the normal Essbase logs.  There are numerous ways you can do this.  But, if you know me at all by now, you know that I like to do things on the cheap, and make them dead-simple.  That being said, the following posts will be based around Windows batch file automation, but the concepts are easily portable to any other platform.

First of all, take a look at the below graph:

Total Seconds to Run Essbase Automation Job

Total Seconds to Run Essbase Automation Job

What you’re looking at is a graph that has been created in Excel, based on data in a text file that was generated as a result of a profiling process.  The steps are as granular as we want them to be.  In other words, if we simply want to clump together a bunch of jobs that all run really fast, then we can do that.  Similarly, if we want to break down a single job that seems to take a relatively long amount of time, we can do that too.  Each segment in the starcked bar chart is based on the number of seconds that the particular step took.  We can see that in the original (pre-optimization) scenario, the orangey-brown step takes up a pretty considerable amount of the overall processing time.  After some tweaking, we were able to dramatically slash the amount of time that it took to run the whole job.  And, in the third bar, after yet more tweaking, we were able to slash the overall processing time quite a bit again.

The important thing here is that we are now starting to build an understanding of exactly where we are spending our time — and we can therefore prioritize where we want to attempt to reduce that time.  My next post will detail a sample mechanism for getting this kind of raw data into a file, and the next few posts after those will start to dig in to where and how time was shaved off due to better outline, system, and other settings and changes.  Happy profiling!

A MaxL quickie to unload those databases that are eating your precious memory

Due to various business requirements, some organizations end up archiving many of their cubes each year.  For example, if you have a huge Measures dimension that is constantly changing (even in subtle ways), but you need to be able to go back at some point in the future and see what some numbers looked like at some particular point in time, you might find yourself spinning off a copy of the cube that sits on the server.  Most of the time it just sits there, dormant, not being used.  But every now and then someone comes along and spins it up so they can refresh some obscure report.  It’s got a decent sized outline to it, so the overhead just on having this cube running is probably in excess of 50 megs of memory, just to sit there!  If you aren’t rebooting your servers that frequently, that app and database are just going to sit there until someone comes along and stops it.  This might not be a problem for your current situation, but for this particular server, there are over two-hundred apps available at any given time — and RAM is a finite resource.

Now, we could just setup a job to unload everything — and indeed, that’s part of the solution.  But I like to keep the core apps hot so they don’t have to spin up when people (or myself) login.  So what to do?  Create a whitelist of core apps and databases and write a little MaxL to unload everything, then start just the things I want.

For the purposes of brevity, I will just assume that connect.msh has a valid login statement in it.  Then the code to unload the apps (unloadall.msh) is pretty straightforward (spool to some output file as needed…):

msh “connect.msh”;
alter system unload application all;
logout;
exit;

Then we have a script file that starts up a specific app/db passed on the command line (startappdb.msh):

msh “connect.msh”;
alter application $1 load database $2;
logout;
exit;

So, then we have a simple text file with the list of each app/database combination to fire up (whitelist.txt):

App1 Db1
App2 Db2

Then, in a simple Windows batch file we could do the following (which I imagine you could port pretty easily to whatever platform/scripting combination you have):

essmsh unloadall.msh
FOR /f “eol=; tokens=1,2 delims= ” %%i in (whitelist.txt) do essmsh startappdb.msh %%i %%j

That’s it!  Dead simple, but effective.  The FOR in batch is basically broken out as follows: eol is the end-of-line or comment character (which we aren’t using in the data file in this instance), the first and second fields are broken down into %i and %j, and the delimiter between them is a space.  Then we call the script that will start it up (named startappdb.msh, passing along the App and Db).  There are many ways you could do this (passing commands on the command line itself) but this method to me is clean and simple.

Jason’s Top 10 Essbase Data Load Optimization Tips

I received an email today from someone looking to speed up their data loads, specifically their ASO data loads that seem to be taking too long.  This is, of course, an important topic for many Essbase cube wranglers.  I would be willing to bet that many people spend more time optimizing calcs and may even neglect profiling their performance on data loads.  You might be surprised just how long your automation spends doing a data load.  That being said, there are several different scenarios you may find yourself in and different places to optimize.  Of course there are more items that can go on this list, but here are the big ones off the top of my head.  In other words, if I was tasked with improving my data load speed, here’s what I would look at:

  1. Do you need to load the data in the first place? I know this seems a bit rudimentary, but if you are loading data that you don’t even need (and can possibly help it), then don’t waste your time on it.
  2. Use the fastest connection. Architecturally you may simply have to load data one way or another (from a text file, from a SQL server, off the SAN, etc), but, hands down the fastest data loads I’ve seen (short of trying to load from a RAMdisk, although I’m dying to try it) are from different physical hard drives attached to the Essbase server.  With a good RAID setup the performance is still quite good if you are loading text files from the Essbase server.  If you are loading your records from an RDBMS across the WAN, you might be killing your performance due to network bottlebecks.  Another option is to put the RDBMS on the same box as Essbase.  I know many people do this, but personally I am not a huge fan of this option.  My Essbase servers tend to have plenty enough to do without having SQL software on the box to worry about.  Additionally, we license SQL by the CPU, and since my Essbase servers are all quad-proc and the SQL servers are getting by just fine with dual-proc, the cost to license it for two more CPUs is quite significant — particularly when all of the other optimization methods are essentially ‘free’.
  3. If loading from a SQL RDBMS: your bottleneck may very well be the network speed here.  If you can, make sure your Essbase server is on the same LAN as the SQL server, with the fastest possible connections (Gigabit or better).  If you are loading from a SQL table and using the WHERE clause in your load rule, or you are loading from a SQL view, make sure you have good indexes setup in SQL.  This can make a HUGE difference if you are loading just the records from Period 08 and you have an index, versus making the SQL server scan the entire table.  If performance is extremely critical see if you can pre-stage text files on the Essbase server (like tip #2).  If loading to BSO, order the rows to match your dense/sparse settings and the order of the outline (basically, you are trying to give Essbase a hand and load up a whole dense datablock in as few passes as possible).
  4. Do as little work as possible in the load rule.  The cost of doing text replacements, column swaps, accept/reject rules, and all that stuff can really add up.  If you can do this elsewhere then do it there (e.g., in a SQL view or having your ETL software prep it for you in the format you need).
  5. Tweak your settings. The Essbase.cfg has some black magic stuff in it.  Try the DLTHREADSWRITE parameter (check the DBAG for details) to see if you can throw some threads at the problem.  Watch your performance on the server — slamming all the CPUs may cause performance for other users to decrease.
  6. (BSO) Sort the data.  Try to give Essbase a a hand.  If your records are sorted such that Essbase is looking at the fewest blocks at a time, and reading the items in the same order as the outline, you’ll reduce the punishment to Essbase and help it load records faster.
  7. Outline optimization.  This one, of course, applies to just about everything you do in Essbase.  Smaller datablocks are your friends — in the sense that if you are committed to your dense/sparse settings, see if you can lighten up the dense blocks with some strategic dynamic calcs and labels.  For instance, my Time dimension is usually Time –> Quarters –> Periods and nine times out of ten, when Time is dense, the Quarters and Time members are dynamic calc instead of stored.  Of course, there may be numerous other reasons that the dense/sparse settings are what they are (calc performance, retrieve performance, etc), so don’t go making changes without understanding them.
  8. Load and Swap. You may find it useful to load up a cube separate from the production cube, then implement MaxL to drop it on top of the production cube (all on the same server of course).  This way you do all of the hard work in one and when it’s ready you can just pop it in place.  I think this works better in theory than in practice, at least for me.  I initially tried this with some very large ASO cubes, and although the performance wasn’t terrible, at least with version 7.1.x of Essbase, the swap process (the MaxL was a “create or replace” command) was not very graceful — it would shake the server to its knees during the swap process.  I eventually dropped this method in favor of using all the other points to optimize and make the down time even smaller.
  9. Use your own load rules instead of EIS. Out of convenience, and particularly on small cubes, I will load data with EIS.  However, you are going to see better performance when you use your own load rules and optimize them.  Besides, you probably already did it this way since you are concerned about performance, and this is probably in an automation script anyway.
  10. Reduce the size of the data members.  Any time your bottleneck is the speed of the transmission of information, try to cut it down (particuarly for SQL loads across the LAN).  For example, don’t have the field be “Period 01” if you can use “P01” instead.  Use “08” instead of “Yr2008”.  Try to balance this with how much work your load rule is doing (Tip #4).

As always, experiment!  Try different combinations to see what works and what doesn’t.  Remember that for squeezing out that extra bit of performace, you are trying to help Essbase do its job better.  Always remember that dimensions can be and probably are (on cubes you didn’t build) setup how they are setup for a reason — i.e., you might have worse load time but the benefit is faster calc or retrieve time.  If you have your own tips please let me know!

I submitted two abstracts for ODTUG!

Last year in New Orleans, the Hyperion track at ODTUG was an unqualified success.  It was such a success, in fact, that hundreds of abstracts were submitted for next years conference in Monterey.  I submitted two of those, so wish me luck.  I’d like to talk about an end-to-end Essbase automation system profiling/performance/optimization effort which would cover all the Essbase optimizationi techniques collectively (Outlines/Dense/Sparse, Calculations, MaxL, and some other tricks I have up my sleeve).  I also submitted an abstract to talk about how using Dodeca has improved our reporting systems in many cases, and allowed us to ditch some spaghetti code VB apps in the process.  So I am anxiously waiting to see if I get to talk next year….