Introduction
This is the fifth of a six-part review of the Pentaho BI suite.
In each part of the review, we will take a look at the components
that make up the BI suite, according to how they would be used in the
real world.
In this fifth part, we'll be discussing how to create useful and
meaningful dashboards using the tools available to us in the Pentaho
BI Suite. As a complete data warehouse building tool, Pentaho
offers the most important feature for delivering enterprise-class
dashboards, namely the Access Control List (ACL). A dashboard-creation
tool without this ability to limit dashboard access to a particular
group or role within the company is missing a crucial feature, and
it is not something that we can recommend to our clients.
On the Enterprise Edition (EE) version 5.0, dashboard creation has
a user-friendly UI that is as simple as drag-and-drop. It looks like
this:
Figure 1. The EE version of the Dashboard Designer
(CDE in the CE version)
Here the user is guided to choose a type of grid layout that is
already prepared by Pentaho. Of course, the options to customize the
look and change individual components are available under the hood,
but it is clear that this UI is aimed at end users looking for
quick results. More experienced dashboard designers would feel
severely restricted by it.
In the rest of this review, we will go over dashboard creation
using the Community Edition (CE) version 4.5. Here we are going to
see a more flexible UI, which unfortunately also demands familiarity
with JavaScript and chart library customization to create anything
more than basic dashboards.
BI Server Revisited
In the Pentaho BI Suite, dashboards are set up in these two places:
- Using special ETLs, we prepare the data to be displayed on the
dashboards according to the frequency of update that is required by
the user. For example, for daily sales figures, the ETL would be
scheduled to run every night. Why do we do this? Because the
benefits are two-fold: it increases the performance of the dashboards,
because they work with pre-calculated data, and it allows us to
apply dashboard-level business rules.
- The BI Server is where we design, edit, and assign access
permissions to dashboards. Deep URLs can be obtained for a
particular dashboard to be displayed on a separate website, but some
care has to be taken to go through the Pentaho user authorization;
depending on the web server setup, it could be as simple as passing
authorization tokens, or as complex as registering and configuring a
custom module.
Next, we will discuss each of these steps in creating a dashboard.
As usual, the screenshots below are sanitized and no real data is
represented. Data from a fictitious microbrewery is used to
illustrate and relate the concepts.
Ready, Set, Dash!
The first step is to initiate the creation of a dashboard. This is
accomplished by selecting File > New > CDE Dashboard. A little
background note: CDE (which stands for Community Dashboard Editor) is
part of the Community Tools (or Ctools) created by the team who
maintains and improves Pentaho CE.
After initiating the creation of a new dashboard, this is what we
will see:
Figure 2. The Layout screen where we perform the
layout step
The first thing to do is to save the newly created (empty)
dashboard somewhere within the Pentaho solution folder (just
like what we did when we saved an Analytic or Ad-Hoc Report). To
save the dashboard we are working on, use the familiar New | Save |
Save As | Reload | Settings menu. We will not go into details on
each of these self-explanatory menu items.
Now look at the top-right section. There are three buttons that
toggle the screen mode; the screenshot above shows the Layout
mode.
In this mode, we take care of the layout of the dashboard. On the
left panel, we see the Layout Structure. It is basically a grid
made out of Row entries, each of which contains one or more Columns,
which in turn may contain another set of Rows. The big difference
between a Row and a Column is that the Column actually contains the
Components such as charts, tables, and many other types. We give a
name to a Column to tie it to its content. Because of this, the
names of the Columns must be unique within a dashboard.
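To make the Row/Column nesting and the uniqueness rule concrete, here is a minimal sketch in plain JavaScript. It is not how CDE stores a layout internally; the object shape and all of the names (headerRow, salesChartPanel, and so on) are invented for illustration.

```javascript
// Hypothetical in-memory model of a grid layout: rows contain columns,
// and a column may contain nested rows. All names are made up.
const layout = [
  {
    type: "row", name: "headerRow",
    children: [
      { type: "column", name: "titlePanel", children: [] },
      { type: "column", name: "dateRangePanel", children: [] },
    ],
  },
  {
    type: "row", name: "salesRow",
    children: [
      { type: "column", name: "salesTablePanel", children: [] },
      {
        type: "column", name: "salesChartPanel",
        children: [
          {
            type: "row", name: "nestedRow",
            children: [{ type: "column", name: "legendPanel", children: [] }],
          },
        ],
      },
    ],
  },
];

// Walk the tree depth first and collect every column name.
function collectColumnNames(nodes, names = []) {
  for (const node of nodes) {
    if (node.type === "column") names.push(node.name);
    collectColumnNames(node.children, names);
  }
  return names;
}

// Return the column names that appear more than once.
function findDuplicateColumnNames(nodes) {
  const seen = new Set();
  const duplicates = new Set();
  for (const name of collectColumnNames(nodes)) {
    if (seen.has(name)) duplicates.add(name);
    seen.add(name);
  }
  return [...duplicates];
}

console.log(findDuplicateColumnNames(layout)); // no duplicates in this layout
```

A check like this is only a thinking aid: it shows why a duplicated Column name would leave a Component with an ambiguous place to render.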
The panel to the right is a list of properties whose values we can
set, mostly HTML and CSS attributes that tell the browser
how to render the layout. It is recommended to create a company-wide
CSS to show the company logo, colors, and other visual markings on
the dashboard.
So basically all we are doing in this Layout mode is determining
where certain contents should appear within the dashboard, and we do
that by naming each of the places where we want those contents to be
displayed.
NOTE: Even though the contents are placed within a Column, it is a
good practice to name the Rows clearly to indicate the sections of
the dashboard, so we can go back later and be able to locate the
dashboard elements quickly.
Lining-Up Components
After defining the layout of the dashboard using the Layout
mode, we move on to the next step by clicking on the Components
button on the top horizontal menu as shown in the screenshot below:
Figure 3. The Components mode where we define the
dashboard components
Usage experience: Although more complex, the CDE is well
implemented and quite robust. While using it to build dashboards
for our clients, we have never seen it produce inconsistent results.
In this Components mode, there are three sections (going from left
to right). The left-most panel contains the selection of components
(data presentation units). These range from a simple table to
complex charting options (based on the Protovis data visualization
library), so we can choose how to present the data on the dashboard.
The next section to the right contains the components
already chosen for the dashboard we are building. As we select each
of these components, its properties are displayed in the section next
to it. The Properties section is where we fill in information
such as:
- Where the data is coming from
- Where the Component will be displayed in the dashboard. This
is done by referring to the previously defined Column from the
Layout screen
- Customization such as table column width, the colors of a pie
chart, custom scripting that should be run before or after the
component is drawn
This clean separation between the Layout and the Components makes
it easy for us to create dashboards that are easy to maintain and
that accommodate different versions of the components.
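As a rough illustration of the three kinds of properties listed above, the sketch below describes two components as plain JavaScript objects. The property names (dataSource, targetColumn, options, postExecution) are assumptions chosen for readability, not the exact names used in the CDE property panel, and the component and Column names are invented.

```javascript
// Illustrative component definitions. Property names are assumptions,
// not the actual CDE schema.
const dashboardComponents = [
  {
    name: "salesTable",
    type: "table",
    dataSource: "dailySalesQuery",   // where the data is coming from
    targetColumn: "salesTablePanel", // Column name defined in the Layout mode
    options: { columnWidths: ["40%", "20%", "20%", "20%"] },
  },
  {
    name: "salesByBeerPie",
    type: "pieChart",
    dataSource: "salesByBeerQuery",
    targetColumn: "salesChartPanel",
    options: { colors: ["#8b4513", "#daa520", "#2f4f4f"] },
    // Custom script hook that would run after the component is drawn.
    postExecution: function () {
      return this.name + " drawn";
    },
  },
];

// Group component names by the layout Column they render into.
function componentsByColumn(list) {
  const placement = {};
  for (const component of list) {
    if (!placement[component.targetColumn]) {
      placement[component.targetColumn] = [];
    }
    placement[component.targetColumn].push(component.name);
  }
  return placement;
}

console.log(componentsByColumn(dashboardComponents));
```

Because a component only refers to a Column by name, the layout can be rearranged without touching the component definitions, which is the separation described above.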
Where The Data Is Sourced
The last mode is the Data Source mode where we define where the
dashboard Components will get their data:
Figure 4. The
Data Sources mode where we define where the data is coming from
As seen in the left-most panel, the list of data source types is
quite comprehensive. We typically use either SQL or MDX queries to
fetch the data set in a format that is suitable for the Components
we defined earlier.
For instance, a data set to be presented in a five-column table will
look different from one that will be presented in a Pie Chart.
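The difference in shape can be sketched as follows. In practice the query itself would return the right shape; this plain JavaScript example, with invented microbrewery rows and a hypothetical helper named toPieSeries, only shows how a five-column table result differs from the category/value pairs a pie chart typically expects.

```javascript
// Invented sample rows, shaped for a five-column table.
const tableRows = [
  { beer: "Pale Ale", region: "North", kegs: 12, bottles: 340, revenue: 4100 },
  { beer: "Pale Ale", region: "South", kegs: 8, bottles: 210, revenue: 2700 },
  { beer: "Stout", region: "North", kegs: 5, bottles: 150, revenue: 2200 },
];

// Collapse the rows into [category, value] pairs, one slice per category.
function toPieSeries(rows, categoryKey, valueKey) {
  const totals = new Map();
  for (const row of rows) {
    const current = totals.get(row[categoryKey]) || 0;
    totals.set(row[categoryKey], current + row[valueKey]);
  }
  return [...totals.entries()];
}

const pieSeries = toPieSeries(tableRows, "beer", "revenue");
console.log(pieSeries); // two slices: Pale Ale with 6800, Stout with 2200
```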
This screen follows the others in terms of sections: we have (from
left to right) the Data Source type list, the currently defined data
sources, and the Properties section on the right.
Usage experience:
There may be some confusion for those who are not familiar with the
way Pentaho defines a data source. There are two “data source”
concepts represented here. One is the Data Source defined in this
step for the dashboard, and the other is the “data source” or “data
model” that the Data Source connects to and runs the query against.
After we define the Data Sources
and name them, we go back to the Components mode and specify these
names as the value of the Data source
property of the defined components.
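Since the link between a component and its data is just a name, a mistyped name is an easy mistake to make. The sketch below is not a CDE feature; it is a hypothetical check, with invented names, that lists components whose Data source value does not match any defined Data Source.

```javascript
// Hypothetical names; in CDE these would be typed into the property panels.
const definedDataSources = [
  { name: "dailySalesQuery", type: "sql" },
  { name: "salesByBeerQuery", type: "mdx" },
];

const componentBindings = [
  { name: "salesTable", dataSource: "dailySalesQuery" },
  { name: "salesByBeerPie", dataSource: "salesByBeerQuery" },
  { name: "stockGauge", dataSource: "stockLevelQuery" }, // not defined above
];

// Return the names of components that point at an unknown Data Source.
function findUnresolvedComponents(bindings, dataSources) {
  const known = new Set(dataSources.map((ds) => ds.name));
  return bindings
    .filter((binding) => !known.has(binding.dataSource))
    .map((binding) => binding.name);
}

console.log(findUnresolvedComponents(componentBindings, definedDataSources));
// only stockGauge is reported
```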
Voila! A Dashboard
Once we have finished defining the Data Sources, Components, and
Layout, we end up with a dashboard.
Ours looks like this:
Figure 5. The
resulting dashboard
The Title of the dashboard and the date range are contained within
one Row. So are the first table and the pie chart. This demonstrates
the flexibility of the grid system used in the Layout mode.
The company colors and fonts used in this dashboard are controlled
via the custom CSS specified as a Resource in the Layout mode.
All that is left to do at this point is to give the dashboard some
role-based permissions so access to it will be limited to those who
are in the specified role.
TIP: Never assign permissions at the individual user level. Why?
Think about what has to happen when the person changes position and is
replaced by someone else.
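The reasoning behind this tip can be sketched in a few lines of plain JavaScript. This is not the Pentaho ACL API; the role names, the user objects, and the canView helper are all hypothetical.

```javascript
// Hypothetical role-based rule: the dashboard lists roles, never user names.
const dashboardAcl = {
  dashboard: "Daily Sales",
  allowedRoles: ["SalesManager", "Executive"],
};

// A user may view the dashboard if any of their roles is allowed.
function canView(user, acl) {
  return user.roles.some((role) => acl.allowedRoles.includes(role));
}

const departingManager = { name: "alice", roles: ["SalesManager"] };
const brewer = { name: "bob", roles: ["Brewer"] };

// When Alice moves on, her replacement simply receives the same role.
// The dashboard's permissions are never edited.
const replacement = { name: "carol", roles: ["SalesManager"] };

console.log(canView(departingManager, dashboardAcl)); // true
console.log(canView(brewer, dashboardAcl));           // false
console.log(canView(replacement, dashboardAcl));      // true
```

With user-level permissions, every dashboard Alice could see would have to be found and updated by hand; with roles, the change happens in one place.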
Extreme Customization
Anything from table column width to the rotation angle of the
x-axis labels can be customized via the properties. Furthermore, for
those who are well-versed in JavaScript, there are tons of
things that we can do to make the dashboard more than just a static
display.
These customizations can be useful for more than just making
things sparkle and easier to read. For example, by using some
scripting, we can apply dashboard-level business rules to the
dashboard.
Usage experience: Let's say we want some of the displayed numbers to
turn red when they fall below a certain threshold. We do this using
the post-execution property of the component, and the script looks
like this:
Figure 6. A sample post-execution script
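For readers without access to the screenshot, here is a minimal sketch of the same idea. It is not the script shown in the figure: the threshold value, the CSS class name, and the function names are assumptions, and the DOM wiring is left out because it depends on the markup the component generates.

```javascript
// Hypothetical business rule: values below this number are flagged.
const SALES_THRESHOLD = 1000;

// Decide which CSS class a displayed value should get.
function cssClassFor(value, threshold) {
  return value < threshold ? "below-threshold" : "";
}

// Apply the rule to a list of values, as a post-execution script might do
// for each rendered table cell.
function flagValues(values, threshold) {
  return values.map((value) => ({
    value: value,
    cssClass: cssClassFor(value, threshold),
  }));
}

console.log(flagValues([1500, 980, 1000], SALES_THRESHOLD));
// only 980 receives the "below-threshold" class

// In a real post-execution function, the class would be added to the
// matching cells of the component's target element, and the company CSS
// would define a rule such as: .below-threshold { color: red; }
```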
Summary
The CDE is a good tool for building dashboards. Coupled with the
ACL feature built into the Pentaho BI Server, it serves as a good
platform for planning and delivering your dashboard solutions. Are
there other tools out there that can do the same thing with the same
degree of flexibility? Sure. But when the only cost is the time spent
on learning (which can be shortened significantly by hiring a competent
BI consultant), it is quite hard to beat a free license.
To reach its full potential, CDE requires a lot of familiarity
with programming concepts such as formatting masks, JavaScript
scripting, and pre- and post-execution events. Most of the time, the
answer to how-to questions can only be found in scattered
conversations between Pentaho CE developers. So please be duly warned.
But if we can get past those hurdles, it can produce some of
the most useful and clear dashboards. Notice we didn't mention
“pretty” (as in “gimmicky”) because that is not what makes a
dashboard really useful for CEOs and business owners in day-to-day
decision-making.
Next in the final part (part-six), we will wrap up the review with
a peek into the Weka Data Mining facility in Pentaho, and some
closing thoughts.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Have you looked into using Talend? It's got a great user interface, very similar to Kettle, and their paid-for version has version control that works very well, and you get the ability to run "joblets", which are basically reusable pieces of code. Even in the free version there is version control, although it's pretty clumsy; there are no joblets in the free version, and it is difficult to get working with GitHub.