
Product Management by Reality – Agile BI

Posted by Anahita | Posted in Agile, Business Intelligence | Posted on 17-01-2012

Agile focuses on delivering value. Business Intelligence is also about providing the means to deliver value. But how is value defined? Value may differ between organisational units within an organisation, between roles within the same organisational unit, or even over time. So delivering value via business intelligence is not a static process. It involves change, and this is where Agile BI comes into the picture.

In my opinion, Business Intelligence should first of all not be considered a project. Let me explain: Business Intelligence is about making sense of organisational data through a well-trusted delivery model that best suits the domain and type of user, with the supporting underlying infrastructure. Now it is simple: data changes, people change, processes change, businesses merge or separate, systems merge or separate, teams combine, groups divide, and processes are streamlined to accommodate all these changes. New data enters the cycle, and some data ceases to be important or to have any value as a result.

This is why Business Intelligence and Management Information is no longer a programme; it is a product group, and in fact an evolving product group, with subgroups for handling the many layers that Management Information covers. It is a portfolio of products that requires project or programme management for continuous improvement!

Agile can handle this complexity because it concentrates on delivering higher value in repeated short time frames and keeps reviewing the product items, adding what is recognised as value for the whole organisation. In short, Agile BI is Management of MI Requirements by Reality!

Big EDW!

Posted by Anahita | Posted in Agile, Business Intelligence, Data Warehouse, Technology | Posted on 09-01-2012

Big Data is changing the way we need to look at Enterprise Data Warehousing. Previously I posted about big data in Big Data – Volume, Variety and Velocity!. I also posted about the supporting projects from Apache Hadoop, such as HBase and Hive, in Big Data, Hadoop and Business Intelligence. Today I want to introduce a new concept, or, better said, an original idea: Big EDW! Yes, Business Intelligence and Data Warehousing will also have to turn into Big BI and Big EDW!

So what makes the fabric of Big EDW and Big BI Analytics? The answer is the ability to analyse and make sense of Big Data, which covers not only the 20% of data that is structured and kept in organisations' relational and dimensional databases, but also the vast remaining 80% that is unstructured: digital documents such as Microsoft Word, Excel, PowerPoint, Visio and Project files, web data such as social media, wikis and web sites, and other formats such as pictures, videos and log files. I have previously posted about the meaning of unstructured data in On Unstructured Data.

Traditionally, an Enterprise Data Warehouse is a centralised Business Intelligence system containing the ETL programs required to extract data from various sources, transform it and load it into a well-designed dimensional model. Front-end BI access tools such as reporting, analytics and dashboards are then used on their own or integrated with the organisation's intranet, to give the right users timely access to relevant information for analysis and decision-making.

Big Data does not quite fit into this model for three main reasons: the volume, variety and velocity of its change and growth. Big EDW will need to break some of the traditional data warehousing concepts, but once that is done, it will create value that is many times greater.

Big EDW should be quick and agile in dealing with Big Data. It has to give quick access to the many new data sources that are becoming available in high volume. Enhanced design patterns or new use cases have to emerge to make this possible. These patterns and use cases should make use of more intelligent and faster methods of providing the relevant data when required. This could be achieved by many methods, such as dimensional modelling and advanced mathematical/statistical techniques such as bootstrap and jackknife sampling, to approximate the mean, median, variance, percentiles and standard deviation of big data more accurately. Apache Hadoop plays an essential role with projects such as MapReduce, HDFS, HiveQL (the Hive query language) and HBase. New central monitoring tools should be developed and embedded within the Big EDW to handle big data metadata such as social media sources, text analysis, sensor analysis, search ranking, etc. Parallel machine learning and data mining, explored recently via projects such as Apache Mahout and Hadoop-ML, combined with Complex Event Processing (CEP) and with faster SDLC and project methodologies such as agile Scrum for handling the Big EDW life cycle, are also becoming standard in the realm of Big EDW.
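To give a feel for the bootstrap and jackknife sampling mentioned above, here is a minimal Python sketch. It is only an illustration under my own assumptions: the function names, the simulated population and the sample size are all hypothetical, and a real Big EDW would run this kind of resampling in a distributed fashion rather than on a single machine.

```python
import random
import statistics


def bootstrap_estimate(sample, statistic, n_resamples=1000, seed=0):
    """Return (estimate, standard error) of `statistic` via bootstrap resampling."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        # Resample n observations from the sample, with replacement.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    # The spread of the resampled statistics approximates the standard error.
    return statistics.mean(estimates), statistics.stdev(estimates)


def jackknife_estimate(sample, statistic):
    """Return (estimate, standard error) of `statistic` via leave-one-out jackknife."""
    n = len(sample)
    # Recompute the statistic n times, each time leaving one observation out.
    leave_one_out = [statistic(sample[:i] + sample[i + 1:]) for i in range(n)]
    centre = statistics.mean(leave_one_out)
    variance = (n - 1) / n * sum((value - centre) ** 2 for value in leave_one_out)
    return centre, variance ** 0.5


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical "big" data set; only a small sample of it is analysed.
    population = [rng.gauss(100, 15) for _ in range(100_000)]
    sample = rng.sample(population, 200)
    print("bootstrap median:", bootstrap_estimate(sample, statistics.median))
    print("jackknife mean:  ", jackknife_estimate(sample, statistics.mean))
```

The point of the sketch is that both methods work on a modest sample and still report how uncertain the estimate is, which is what makes them attractive when scanning all of the data is too slow.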

Note that the phrase “Big EDW” is not used anywhere else; it is the name that I thought could fit the growth of the EDW into a system that can also accommodate and manage Big Data!

Agile Analytics by Ken Collier

Posted by Anahita | Posted in Agile, Books, Business Intelligence, Project Management | Posted on 30-12-2011

I am currently reading a book by Ken Collier called “Agile Analytics: A Value-Driven Approach to Business Intelligence and Data Warehousing”.

This book is written specifically for Agile BI and Data Warehouse projects and includes a BI project scenario for a fictitious company called FlixBuster.

The book has two parts:

Part I is about Agile Management Methods and concentrates on  management of Agile BI projects and teams. This part covers topics such as User Stories for BI Systems and Self-Organising Teams Boost Performance.

Part II is about Agile Technical Methods for delivery of BI systems and how the team can drive business value by producing working BI/DW results often. This part includes topics such as Design, Test Driven Data Warehouse, Version Control and Project Automation.

An excellent and unique book for BI/DW Project Managers, Scrum Masters and technical BI/DW team members such as ETL Professionals, DBAs and Source Data Specialists. It is also great for companies that want to run their own internal BI/DW Agile projects.

I end this brief introduction with a couple of quotes:

“A sweeping presentation of the fundamentals that will empower teams to deliver high-quality, high-value, working business intelligence systems far more quickly and cost effectively than traditional software development methods.” — Ralph Hughes, author of Agile Data Warehousing

“This book captures the fundamental strategies for successful business intelligence/analytics projects for the coming decade. Ken Collier has raised the bar for analytics practitioners—are you up to the challenge?” — Scott Ambler, Chief Methodologist for Agile and Lean, IBM Rational Founder, Agile Data Method

Big Data, Hadoop and Business Intelligence

Posted by Anahita | Posted in Business Intelligence | Posted on 17-12-2011

I consider Hadoop one of the technologies that create a link between Big Data Analytics and Business Intelligence. In my previous posts I explained what Big Data means and what Unstructured Data is. In this post I would like to introduce Hadoop, which makes it possible to gain business value from Big Data.

Apache Hadoop is an open source project providing software for reliable and scalable distributed computing. A simple programming model provides the ability to process large data sets in a distributed way. This is achieved by using a cluster of machines for both processing and storage, which makes it possible for Hadoop to scale out easily as required. Hadoop consists of three subprojects: Hadoop Common, Hadoop Distributed File System (HDFS) and Hadoop MapReduce. The Hadoop ecosystem also includes related technologies that can be used on their own or together to achieve the desired outcomes. Some of these related projects are Hive, HBase, ZooKeeper, etc. For more details on each of the above projects, please visit http://hadoop.apache.org/

Core Hadoop is HDFS and MapReduce.

HDFS is the Hadoop Distributed File System. It splits files into data blocks and distributes them across the nodes in a cluster, so that computation can run in parallel on the nodes that hold the data, which makes processing extremely fast.
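As a toy illustration of the idea, the following Python sketch splits some data into fixed-size blocks and assigns each block to several nodes. Everything here is an assumption for illustration: the function names are hypothetical, the block size is tiny, and the round-robin placement is a simplification. Real HDFS uses much larger blocks, a default replication factor of 3 and a rack-aware placement policy.

```python
def split_into_blocks(data: bytes, block_size: int):
    """Cut the data into consecutive blocks of at most `block_size` bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks: int, nodes, replication: int = 3):
    """Assign every block to `replication` distinct nodes (simple round-robin)."""
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    placement = {}
    for block_id in range(num_blocks):
        # Start at a different node for each block so the load is spread out.
        placement[block_id] = [
            nodes[(block_id + offset) % len(nodes)] for offset in range(replication)
        ]
    return placement


if __name__ == "__main__":
    blocks = split_into_blocks(b"some large file content", block_size=8)
    nodes = ["node1", "node2", "node3", "node4"]  # hypothetical cluster
    for block_id, replicas in place_blocks(len(blocks), nodes).items():
        print(block_id, blocks[block_id], replicas)
```

Because every block lives on more than one node, losing a node does not lose data, and several nodes can work on different blocks of the same file at the same time.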

MapReduce is a programming model that makes it possible to perform parallel computing across the nodes in a cluster.
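The classic way to picture MapReduce is a word count. The Python sketch below runs the map, shuffle and reduce steps in a single process; it is not the Hadoop API, and the function names are my own hypothetical choices. On a real cluster the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases one after another over a list of documents."""
    pairs = [pair for document in documents for pair in map_phase(document)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Big Data and Big EDW", "Big BI"]))
```

Each map call only needs its own document and each reduce call only needs the values for its own key, which is why the work can be spread across many nodes.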

For Business Intelligence, one of the Hadoop projects, called Hive, is a data warehouse system for data held in Hadoop-compatible file systems such as HDFS, or in other data stores such as Apache HBase. It allows querying, analysing and summarising big data using a specific SQL-like query language called HiveQL.

Data is growing faster than ever, and at the moment it doubles every year! This will soon become astronomical and get out of hand, as around 80% of this data is Unstructured Data. Projects like Apache Hadoop make it possible to analyse Big Data, and related projects such as Hive provide the equivalent of data warehousing for further storage and analysis of the relevant data.

Big Data – Volume, Variety and Velocity!

Posted by Anahita | Posted in Business Intelligence | Posted on 05-12-2011

For many years, companies were concerned with the data they didn’t have. The main worry was how to collect the required data. Many applications and systems were developed to give organisations the means of gathering that data.

With the collection of data growing in many systems and applications, companies then faced other challenges, such as how to make sense of the data they had and turn it into actionable insights. Data Services and Integration, along with Business Analytics and Performance Management tools, then came to the rescue.

In the last two years, companies have started to realise the volume, variety and velocity of data growth. In fact, 90% of the data in the world today has been created in the last two years!

Volume: These days an enterprise will have petabytes of information (one petabyte is about 1 million gigabytes), and it is growing. This is Big Data!

Variety: Big Data is not just the data stored in relational databases. It includes unstructured data in all documents, audio, video and live web data such as wikis, blogs, tweets, Facebook, etc.

Velocity: The speed at which Big Data is produced makes it absolutely necessary to analyse it for insight as close as possible to when it happens, i.e. near live. There is no time to wait for later analysis.

Business Analytics and Intelligence are growing into the Big Data space as we speak.