BI Trends 2011 – a synopsis

Boris Evelson wrote a white paper for Forrester on BI trends titled Trends 2011 And Beyond: Business Intelligence.  The document is intended for “Business Process Professionals”, but I think it is valuable for anyone who thinks about BI and how to improve it.

This post is a synopsis (or, Joel’s Notes) intended to provide the highlights of the paper.

Executive Summary: In short, Evelson maintains that successful BI will be challenging as we move forward:  “While BI maturity in enterprises continues to grow, and BI tools have become more function-rich and robust, the promise of efficient and effective BI solutions remains challenging at best and elusive at worst. Why? Two main reasons: First, BI is all about best practices and lessons learned, which only come with years of experience; second, earlier-generation BI approaches cannot easily keep up with ever-changing business and regulatory requirements…(we) should adopt Agile BI processes, technologies, and architectures to improve (our) chances of delivering successful BI initiatives”.

Why do BI initiatives sometimes fail?

Lack of agility and flexibility can lead to unsuccessful BI initiatives

Implementing BI requires using best practices and building upon lessons learned

  1. End-to-end BI implementations require closely coordinated efforts to put together multiple components like data sourcing, integration, modeling, metrics, queries, reports, dashboards, portals, and alerts.
  2. It’s difficult to define future BI requirements as business and regulatory requirements change.
  3. The business may not have strong requirements to begin with.

BI technologies and processes have not kept pace with business realities

Standard BI solutions are neither flexible nor agile enough to react and adapt to information requirements that seem to change with ever-increasing speed.

Centralization has not led to agile, streamlined BI implementations

Although centralization can reduce costs and duplication of effort, and provide a “single version of the truth”, it is often too bureaucratic and cumbersome (anything but agile).

The white paper includes a figure showing the path of BI from the ’90s to the ’10s.


Agility is the key to efficient and effective business process

Untamed business processes: “Business processes that form in the seams and shadows of the enterprise, require a balance of human and system support, and cross department, technology, information, and packaged application silos to meet end-to-end business outcomes”

Because business processes can change periodically, traditional SDLC doesn’t work well for automating untamed processes.

Forrester defines Agile Business Intelligence as: “An approach that combines processes, methodologies, organizational structure, tools, and technologies that enable strategic, tactical, and operational decision-makers to be more flexible and more responsive to the fast pace of changes to business and regulatory requirements”.

And, I love this quote: “No technology or processes can address BI challenges if a company’s organizational structure and enterprise culture are not already on firm, agile ground” – very reminiscent of my earlier post “It’s not me, it’s my clubs”.

Best practices to prepare your organization for agile BI:

  • Insist on business ownership and governance of BI
  • Emphasize organizational and cultural change management
  • Decouple data preparation from data usage processes in end-to-end BI cycles
  • Approach and treat front- and back-office BI requirements and users differently
  • Establish hub-and-spoke organizational models
  • Use a combination of top-down and bottom-up approaches to BI design and applications
  • Use Agile development methodologies
  • Enable BI self-service for business users

Embrace change

Evelson continues “Once the organization is aligned for agility, the next step is to consider and implement agile BI processes”.

Once you’ve developed your swing, it’s time for a new set of clubs (Next generation technologies).

Next Generation Technologies are the future of agile BI – there are four major subcategories:

  1. Automated – eliminates manual work to free up resources for true value-added tasks.
  2. Unified – brings different BI use cases together in a unified platform.
  3. Pervasive – makes BI applications available whenever strategic, tactical, and operational decision-makers need to analyze information.
  4. Limitless – BI must operate on information without any borders or restrictions.

Evelson goes on to list many vendors that represent each of the four subcategories above.  (Here are some of the vendors, without the additional feature information that the white paper provides.)
Automated – Composite Software Discovery, IBM InfoSphere Discovery, Appian, Lombardi, Savvion, Alteryx, Endeca, Cognos AAF, JackBe, Kalido, etc.
Unified – Attivio, Endeca, TIBCO Spotfire, Information Builders, Oracle, Cognos, Alteryx, Quiterian, Kalido, Business Objects, SAS, etc.
Pervasive – Most BI vendors, Appian, Fujitsu, Global360, HandySoft, Lombardi, Metastorm, Savvion, Business Objects, TIBCO Spotfire, Oracle, Information Builders, QlikTech, etc.
Limitless – Most BI vendors, PowerPivot, QlikTech, Quiterian, Attivio, Endeca, Saffron, TIBCO Spotfire, SISense, etc.

The real meat of the white paper is to say: Agile and Next-Gen BI may be the future of successful BI implementations.  Embrace it, keep an open mind about new technologies, don’t be afraid to let the business have it, but prepare your organization first.  There is no substitute for good process, best practices, strategic vision, and proper organizational structure.  Get these things right and you will be prepared to provide the kind of Business Intelligence that the business needs.

You can get the white paper here (you have to register).

Just give me everything…

I recently posted the following question on a group discussion in LinkedIn:

“Have you ever been gathering requirements for a DW/BI solution and the business says “Just give me everything in an ad hoc universe” or “we want everything that we had before”?  I thought it would be interesting to see different perspectives and techniques for dealing with this.  What are your thoughts?”

There were some good responses and I’d like to just pull out a few here as well as provide my own comments.

First of all, I’d like to point out that the consensus seems to be “start small and use an iterative approach”.  I like the way Barry Meltzer put it: “How do you eat an elephant? … One bite at a time”.  His approach is outlined as follows:

  • Describe the ‘bites’ (i.e., subject areas and desired outputs)
  • Determine which ‘bites’ must go together to be useful
  • Determine any dependencies between the ‘bites’
  • Discuss which ‘bites’ are ‘most broken’ and ‘most important’

“With that information in hand, you can work out a priority list (what do we attack first?) with your sponsors and then go into analysis, prototyping, etc”

Another person’s approach “is to have a patient hearing of business problems, challenges and try scoping the “Everything”. It’s an iterative process but works well, as business is getting a buy into it and when they see the articulation of their problems, “Everything” in front of them along with business case of “impact of not doing it”.”

Kent Graziano and Aran Nathanson both prefer to start with a prototype and iterate from there.  Aran mentioned a Ford quote that I like: “If I had asked my customers what they wanted they would have said a faster horse”.  I agree with Aran that the business users don’t always know (or, know how to articulate) what they want or need.  He states that the prototype “will also give people something concrete they can understand and play with rather than a conceptual framework which might look great on a powerpoint slide but has no real meaning for most users”.

Kent’s prototyping approach is to start with “some basic, obvious data (based on knowing a bit about their business), then iterate from there.  Of course it will help if you have an easy to use reporting or visualization tool, access to the right data (EDW, Data Vault, Data Mart, or even a virtualization layer on the OLTP data), and some engaged user champions”.

I do believe that the prototype approach with iterations is an effective way of getting to real business value, but be wary of unmanageable scope creep.  Be sure to set specific goals and deliverables for each iteration or risk falling into the project abyss.

One approach that I have found useful is to get a small group of stakeholders in a room to facilitate a discussion. I try to frame the discussion in a manner that lets the business know that in order to provide the best solution, I need to ensure that I understand the business need, the data supporting the business need, and how the data will be used (and viewed). I usually start by having the business list about 15 to 20 business questions that they are expecting to have answered with the BI solution. Once we have the questions, we can group them into categories and usually eliminate much redundancy. This process usually narrows the focus from “everything” to a much more manageable subset. Whiteboarding some mockups helps to ensure that we are all on the same page. With the subset and mockups, the business usually responds favorably to an iterative approach where we start with a solution that provides answers to the key questions that they have identified rather than the more expensive “everything” approach.

Obviously, there are many variables and factors in play, such as resources, knowledge of the data, knowledge of the business, time, historical successes and failures, tools, etc.  And, as such, there is never just one solution.  But, maybe this discussion will play a part in the approach you choose when the business asks for “everything”.

If you are a member of “TDWI’s Business Intelligence and Data Warehousing Discussion…” on LinkedIn, you can view the full thread of the discussion at: http://goo.gl/I37Cm

Thanks to the following people for allowing me to quote them:

Barry Meltzer
Kent Graziano
Aran Nathanson (www.nprinting.com)

More on Validating the Data Model

Data Model Validation

“The validation activities are all related to the physical structure of the data. As such, validation of this model is the responsibility of IT personnel, and the primary roles belong to data analysts, database administrators, security administrators and programmers.”

Click on the link below for a quick read on the roles and responsibilities around data model validation.

More on Validating the Data Model

Advanced Display Options and an End to Custom Shapes (via Power Designing)

A nice solution for visually displaying metadata about a model, utilizing new features in PowerDesigner (from Richard Kier’s Power Designing blog).

Advanced Display Options and an End to Custom Shapes Okay, so it’s been a LONG time since I’ve posted, but two small kids and a job (okay, mostly the kids) will do that.  Anyway … We’re preparing to upgrade to PowerDesigner 15.3 (sadly from 15.0) and one of the new features I’m very excited to get my hands on is the new advanced display controls available.  Previously, you were fairly limited to the information you could display on a diagram.  Each symbol gave you the ability to select among the … Read More

via Power Designing

It’s not me, it’s my clubs!

My golf game isn’t exactly on par these days but I’m pretty sure all I need is a new set of clubs.  I mean, I’m playing with some really old Wilson – Sam Snead blades and a wimpy driver.  If I just had a set of new Taylor Made Burner 2.0 irons and maybe the R11 TP driver, I think I could be ready to tour.  Well, maybe not…  Actually, I think if I took some lessons, hit the driving range on a regular basis, read up on the latest tips and techniques, maybe joined a league, and got some new clubs, I could probably be good enough to post my actual handicap here.

Can you make a similar correlation to Data Warehousing and Business Intelligence?  Yes, I believe so.  You can buy the latest and greatest tools, beef up your infrastructure, invest in the hottest emerging technologies, and you may have data in the hands of the business in record time.  But, did you hit the green or are you in the rough?  (Is the data accurate, does it meet the business needs, and can you maintain it sufficiently and in a timely manner?)

Sure, the latest technologies will help you get there, but it’s your swing that really matters.  In data warehousing and business intelligence, the swing comprises: a strategic vision, partnership with the business, a well-defined architecture, design and development best practices and processes, and the right resources.  These are the building blocks that form the foundation from which success is created.

So, what does it take to build the right foundation?

  • Truly understand the needs of the business and develop a strategy to meet them
  • Research tools and trends, as well as companies that have implemented them
  • Select tools that align with the strategy
  • Create a well-documented architecture that adheres to a solid methodology
  • Utilize resources that truly understand data management and business intelligence (it’s not good enough to be an expert on the tool)

Oh, there are plenty of other things to consider such as a metadata strategy, data governance, master data management, etc.  But those are topics for another day…

There are some really cool tools and promising trends right now, such as Tableau, Spotfire, data virtualization, in-memory databases, cloud computing, column-oriented databases, predictive analytics, advanced data visualization, etc.  But, in order to implement these, a strong foundation is required.

Will new clubs help?  Yep, I think so.  But, I also need to work on my swing.

And, Honey if you are reading this, I have a great idea for a Father’s Day gift…

Updating very large tables in Oracle

How do you update very large tables in Oracle (tens or hundreds of millions of records)?  You don’t!

Ok, this isn’t anything new and it’s not rocket science, but maybe it’s a technique that you haven’t used before.  If you’ve tried to update millions or hundreds of millions of records in a table, you know that this can be an extremely time-consuming process that ties up resources (and may end up crashing anyway).  Here’s a simple solution that avoids using an update statement (and all the redo associated with it):

Let’s say you have a table with 50 million records and you need to update one of the columns by performing a calculation or applying a function.

At a high level…

  • Create a new table by selecting everything from the old table and performing the calculation in the select clause

CREATE TABLE TABLE_NEW NOLOGGING AS SELECT <perform calculation here> FROM TABLE_OLD;

  • Apply all constraints, indexes, grants, etc. to the new table
  • Drop the old table
  • Rename TABLE_NEW to TABLE_OLD

Note that the nologging option minimizes redo generation during the create-table-as-select and can result in a significant performance improvement.
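
Putting the steps together, a rough sketch might look like the following.  The column names (ORDER_ID, CUSTOMER_ID, AMOUNT), the sample calculation, the constraint/index names, and REPORTING_ROLE are all hypothetical placeholders, not taken from any real schema.

-- Hypothetical example: recalculating the AMOUNT column on a 50-million-row table
CREATE TABLE TABLE_NEW NOLOGGING AS
SELECT ORDER_ID,
       CUSTOMER_ID,
       ROUND(AMOUNT * 1.05, 2) AS AMOUNT  -- the calculation or function goes here
FROM   TABLE_OLD;

-- Re-apply constraints, indexes, and grants (names must not collide with the old table's)
ALTER TABLE TABLE_NEW ADD CONSTRAINT TABLE_NEW_PK PRIMARY KEY (ORDER_ID);
CREATE INDEX TABLE_NEW_CUST_IX ON TABLE_NEW (CUSTOMER_ID);
GRANT SELECT ON TABLE_NEW TO REPORTING_ROLE;

-- Swap the tables
DROP TABLE TABLE_OLD;
ALTER TABLE TABLE_NEW RENAME TO TABLE_OLD;

If you want an extra safety net, rename TABLE_OLD to something like TABLE_OLD_BAK instead of dropping it immediately, and drop the backup only after the new table checks out.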

That’s it!

Of course there are other options like partition swapping or writing a stored procedure (never do procedurally what can be done in a single SQL statement).  But, I like this because it is safe and clean.
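
On the partition-swapping option: the same CTAS idea can be applied one partition at a time using Oracle’s partition exchange.  This is only a sketch, assuming TABLE_OLD is partitioned, P1 is one of its partitions, and the staging table STAGE_P1 ends up with an identical column layout.

-- Build a staging table with the recalculated rows for one partition
-- (the select list must include every column of TABLE_OLD)
CREATE TABLE STAGE_P1 NOLOGGING AS
SELECT <perform calculation here>
FROM   TABLE_OLD PARTITION (P1);

-- Swap the staging table in for the partition (a dictionary operation, not a data copy)
-- Global indexes on TABLE_OLD may need to be rebuilt afterwards
ALTER TABLE TABLE_OLD
  EXCHANGE PARTITION P1 WITH TABLE STAGE_P1
  WITHOUT VALIDATION;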

Writing effective definitions for the data model

It is the data modeler’s responsibility to document robust definitions of all objects in the data model.  Of particular importance are table and column definitions because they are the most visible objects to consumers of the data (Designers, Developers, Report Users, etc.).

Why is it important to write good table and column definitions?

  • Ensures consistent usage of the object – Providing a clear definition makes it more likely that the next designer or developer will continue to populate the object with the same intent that you had.
  • Allows for more accurate data integration – Data integration involves pulling data from multiple systems into one.  When doing so, it is critical that the data you are integrating from different systems means exactly the same thing.  A combination of effective metadata and data profiling will ensure that your integration is accurate (a quick profiling sketch follows this list).
  • Provides a clear understanding of the contents for downstream audiences – Whether your database is transactional, an operational data store, or a data mart, it will be consumed by users and systems other than the one for which you designed it.  So, it is imperative that you provide enough information that everyone is using the data consistently.
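
As an aside, “data profiling” here can be as simple as comparing how the same logical column is actually populated in each source before you integrate them.  A minimal sketch, using hypothetical SOURCE_A/SOURCE_B schemas and an ORDERS.STATUS column:

-- Compare row counts, distinct values, and nulls for STATUS across two sources
SELECT 'SOURCE_A' AS source_system,
       COUNT(*) AS row_count,
       COUNT(DISTINCT STATUS) AS distinct_statuses,
       SUM(CASE WHEN STATUS IS NULL THEN 1 ELSE 0 END) AS null_statuses
FROM   SOURCE_A.ORDERS
UNION ALL
SELECT 'SOURCE_B',
       COUNT(*),
       COUNT(DISTINCT STATUS),
       SUM(CASE WHEN STATUS IS NULL THEN 1 ELSE 0 END)
FROM   SOURCE_B.ORDERS;

Mismatched counts or unexpected values are exactly the kind of thing a good definition (and the profiling step) should surface before the integration is built.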

What is an effective definition?
An effective definition is a description of an object that is clear and concise, and that includes as much information as possible (in as few words as possible) to eliminate any ambiguity about the object.

Parts of a definition:

  1. The term to be defined
  2. The class of object or concept to which the term belongs
  3. The differentiating characteristics that distinguish it from all others of its class
  4. Examples (not required, but they sometimes help to further define the entity or attribute)

For Example…
Region (term) is a logical and geographical representation of the sales area (class).  Region is the highest level of the sales area hierarchy: Region, District, Territory (differentiating characteristics).  It may include values such as “Latin America”, “United States – West”, “Canada”, etc. (examples).
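
Once a definition like this is written, it is worth storing it where consumers will actually see it – in the modeling tool and, where possible, in the database itself.  A minimal sketch, assuming an Oracle-style database and a hypothetical SALES_AREA table:

-- Hypothetical object names; the definition text is the Region example above
COMMENT ON TABLE SALES_AREA IS
  'Logical and geographical representation of the sales area hierarchy: Region, District, Territory.';

COMMENT ON COLUMN SALES_AREA.REGION IS
  'Highest level of the sales area hierarchy (Region, District, Territory). Example values: Latin America, United States - West, Canada.';

Many modeling tools can generate these statements directly from the definitions captured in the model, which keeps the metadata with the data.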

How to get started…
If possible, have a business subject matter expert provide definitions.  Refine the definition to meet the standards of an effective definition and have the business SME review it for accuracy (and approve it if that process is set up).  If you don’t have a business SME or data steward, use whatever resources are at your disposal (application metadata, Wikipedia, Google, industry standards documentation, etc.).

Tips for writing an effective definition

  • Avoid beginning the definition with “Contains…”, “A table that…”, “A column that…”, “Information about…”
  • Remember your audience. You are not writing for those who already understand what the term means.
  • Use expressions that help to distinguish the term from closely related terms.
  • Avoid close synonyms as much as possible.
  • Use business resources, data stewards, other application metadata when possible.
  • Write good.  Write well!  Use good grammar.
  • Be accurate!

* “Parts of a definition” originated from the Purdue Online Writing Lab