
Agile Analytics by Ken Collier

Posted by Anahita | Posted in Agile, Books, Business Intelligence, Project Management | Posted on 30-12-2011



I am currently reading a book by Ken Collier called “Agile Analytics: A Value-Driven Approach to Business Intelligence and Data Warehousing”.

This book is written specifically for Agile BI and Data Warehouse projects and includes a BI project scenario for a fictitious company called FlixBuster.

The book has two parts:

Part I is about Agile Management Methods and concentrates on the management of Agile BI projects and teams. This part covers topics such as User Stories for BI Systems and Self-Organising Teams Boost Performance.

Part II is about Agile Technical Methods for the delivery of BI systems and how the team can drive business value by producing working BI/DW results often. It includes topics such as Design, Test-Driven Data Warehouse, Version Control and Project Automation.

An excellent and unique book for BI/DW Project Managers, Scrum Masters and technical BI/DW team members such as ETL Professionals, DBAs and Source Data Specialists. It is also great for companies that want to run their own internal Agile BI/DW projects.

I end this brief introduction with a couple of quotes:

“A sweeping presentation of the fundamentals that will empower teams to deliver high-quality, high-value, working business intelligence systems far more quickly and cost effectively than traditional software development methods.” — Ralph Hughes, author of Agile Data Warehousing

“This book captures the fundamental strategies for successful business intelligence/analytics projects for the coming decade. Ken Collier has raised the bar for analytics practitioners—are you up to the challenge?” — Scott Ambler, Chief Methodologist for Agile and Lean, IBM Rational Founder, Agile Data Method

Agile Team Velocity

Posted by Anahita | Posted in Agile | Posted on 29-12-2011



In this article, I am going to explain Agile Team Velocity. I will use some Scrum-related terminology, so if you are not familiar with the definition of any of these keywords, please see my post Quick Scrum Keywords.

When an Agile team starts an iteration, the goal is to deliver value through completion of the committed user stories. Each user story has a very important attribute: the story points!

Story points are the estimated measure of the complexity of each story. Agile teams do not have to estimate the work in days or hours, but in the size of the user story. I will write about this in more detail in a future post.

The velocity of the team is simply the total number of story points of the user stories accepted in an iteration. I have to stress the word “accepted”: if a story has not delivered value for the user, it is not marked as accepted, and so its points do not count towards the velocity, even though the team may have spent the majority of the iteration working on it.

Let’s make this clearer with a simple example:

An Agile team with two-week iterations selected and committed to three user stories, with estimated story points of 3, 5 and 13. At the sprint demo, the team demonstrated the completed user stories to the relevant stakeholders and obtained acceptance on only two of them, with story points of 3 and 13. So the velocity of the team is 16, irrespective of the fact that the team may have spent three days on the user story that was not accepted.

The velocity of the team may change as the project progresses. This is based on two very important and clever features of Agile projects: learning and feedback. As the team starts to understand its own capabilities and the stakeholders’ expectations, it gets better at estimating the work it can commit to. This will improve the number of accepted user stories and so the accuracy of the team’s velocity!

During the next sprint, the team picks four user stories with 3, 5, 5 and 5 story points. All these are accepted at the sprint demo, and so the team velocity is increased to 18. Simple!
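The arithmetic in the two sprints above can be sketched in a few lines of Python. This is only an illustration: the `UserStory` class, the `velocity` function and the story names are hypothetical, not part of any Scrum tool.

```python
from dataclasses import dataclass


@dataclass
class UserStory:
    name: str       # hypothetical label, for illustration only
    points: int     # estimated story points
    accepted: bool  # True if accepted at the sprint demo


def velocity(stories):
    """Sum the story points of accepted stories only."""
    return sum(story.points for story in stories if story.accepted)


# First sprint from the example: the 5-point story was not accepted.
sprint_1 = [
    UserStory("Story A", 3, True),
    UserStory("Story B", 5, False),
    UserStory("Story C", 13, True),
]

# Second sprint: all four stories were accepted.
sprint_2 = [
    UserStory("Story D", 3, True),
    UserStory("Story E", 5, True),
    UserStory("Story F", 5, True),
    UserStory("Story G", 5, True),
]

print(velocity(sprint_1))  # 16
print(velocity(sprint_2))  # 18
```

Note that the unaccepted 5-point story contributes nothing, no matter how much time was spent on it.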

Quick Scrum Keywords

Posted by Anahita | Posted in Agile | Posted on 28-12-2011



Scrum is an agile framework for the definition, execution and delivery of the project outcome. I have put together this list of “Quick Scrum Keywords” to help new Scrum teams learn and use the correct Scrum terminology.



Product Owner: Part of the Scrum Team, responsible for defining the product requirements and prioritizing them.

Scrum Master: The scrum facilitator and process owner.

Team Member: Technical delivery team such as architects, developers, testers, and DBAs.

Stakeholder: Anyone who can benefit from the outcome of the project, such as end users, business domain experts, etc.

Product Vision: The Business Case for the product aligned to the Business Strategy.

Product Backlog: An ordered list of requirements for the product.

Release Backlog: A subset of Product Backlog selected for a specific release.

Sprint Backlog: A subset of Release Backlog selected for the delivery in a sprint.

User Story: An Independent, Negotiable, Valuable, Estimable, Small, Testable  product requirement.

User Story Estimation: The process of collective estimation of each User Story in the Release Backlog by the team.

Sprint Demo: The demonstration of completed user stories to the stakeholders, by the team.

Sprint Retrospective: Session by the team to review the sprint and lessons learnt.

Daily Standup: A daily team meeting of at most 15 minutes, to discuss the daily tasks and any impediments.

Release Burndown: A chart showing the remaining story points at the end of each sprint within a release.

Sprint Burndown: A chart showing the remaining work at the end of each day.

Story Board: A board divided into vertical columns, showing the progress of each story in a sprint, from Unassigned, to In Progress and Done.

Capacity: A measure of the team’s performance in story points.

Velocity: A measure of the team’s speed, i.e. the total story points of the user stories accepted in a sprint.

Story Points: A measure of complexity for each story.
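As a small illustration of the Sprint Burndown keyword above, here is a minimal sketch of the numbers behind such a chart. It assumes the remaining work is tracked in story points and that the team records the points completed each day; the function name and the sample figures are hypothetical.

```python
def sprint_burndown(committed_points, completed_per_day):
    """Return the remaining story points at the end of each day."""
    remaining = committed_points
    chart = []
    for done in completed_per_day:
        remaining -= done
        chart.append(remaining)
    return chart


# A hypothetical ten-day sprint with 21 committed story points.
burndown = sprint_burndown(21, [0, 3, 0, 5, 0, 0, 8, 0, 5, 0])
print(burndown)  # [21, 18, 18, 13, 13, 13, 5, 5, 0, 0]
```

Plotting these values against the day number gives the burndown chart; a Release Burndown works the same way with one value per sprint instead of one per day.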


How Agile is your Agile Project?

Posted by Anahita | Posted in Agile | Posted on 26-12-2011



Since the creation of the Agile Manifesto, many Agile methodologies have been introduced and used by organisations around the world for a spectrum of software development projects. These projects may follow frameworks and methodologies such as Scrum or XP; however, they should all adhere to the Agile Principles.


Let’s take a look at these principles. Reflect on each principle and check whether it holds true for your project.

Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.

This is a very concise sentence. First of all, it clearly states the highest priority, which is satisfying the customer. Second, it states that creating valuable software early and often is the way to achieve this.

Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.

As stated, change is expected and welcome. If a change adds value, it is accepted, and an agile project delivery makes it possible.

Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.

Working Software delivered in short periods of time is what an agile project is about. There is no long waiting time. Two-week iterations are preferable to longer ones. Working Software means a whole cycle, from requirements to deployment, completed and working to the customer’s satisfaction.

Business people and developers must work together daily throughout the project.

This principle clearly states the necessity of daily interaction and collaboration between the technical team and the business. This collaboration makes it possible to deliver fast, often and avoid wasted time. If this is not happening in your project, you may not be following the correct Agile implementation.

Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.

In an Agile project, a self-organising, motivated team is one of the important principles. To get the job done, the right environment and support should be provided to keep the team on track and motivated.

The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.

Yet again, there is great emphasis on regular face-to-face communication between the team members, which gives each team member the opportunity to talk about their successes and any impediments on a daily basis.

Working software is the primary measure of progress.

If no Working Software is produced, early and often as per the first principle, no progress has been made.

Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.

Pace is important in an Agile process. This means the team and the business need to work together constantly towards creating value through the delivery of working software.

Continuous attention to technical excellence and good design enhances agility.

And yes, agility does not mean chaos. It is about technical excellence and good design, without which unnecessary waste and rework are inevitable.

Simplicity–the art of maximizing the amount of work not done–is essential.

Anything that adds complication, any unnecessary work, goes against agility.

The best architectures, requirements, and designs emerge from self-organizing teams.

Self-organising teams will work together to get working software delivered, simple, early and often. This is the spirit of Agile.

At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.

And yet no Agile project can be without lessons learnt. Efficiency and velocity increase as the self-organising team works together to deliver working software and gets together on a regular basis to reflect on these lessons.




Why Agile for Business Intelligence?

Posted by Anahita | Posted in Agile, Business Intelligence | Posted on 24-12-2011



I have identified three categories related to the nature of Business Intelligence projects that make them highly suitable for an Agile approach. These categories are Skills, Change, and Data.


Skills

* BI projects are cross-organisational and require both business and IT skills.
* The skills required for a successful technical delivery of BI projects vary from Data Architect and DBA to ETL and Output Developer.
* There is a high chance that the initial requirements are vaguely defined, because subject domain experts are unfamiliar with BI capabilities.


Change

* Changes to the business processes affect the BI requirements and, given the typical length of these projects, change is inevitable.
* Upgrades or changes of systems, infrastructure and underlying technology affect the BI implementation and delivery.
* A change of people during the project affects the requirements, due to the variety of management and operational styles and levels of skill.
* Market conditions, regulations, legislation and any other factor that affects the business and its strategy in any shape or form also affect the BI requirements or their priorities.


Data

* BI projects rely on accessing and understanding information about the data, i.e. metadata.
* Master Data definition plays a crucial role in BI projects.
* Data Quality affects the speed of the implementation and delivery of BI projects.




Big Data, Hadoop and Business Intelligence

Posted by Anahita | Posted in Business Intelligence | Posted on 17-12-2011



I consider Hadoop to be one of the technologies that create a link between Big Data Analytics and Business Intelligence. In my previous posts I explained what Big Data means and what Unstructured Data is. In this post I would like to introduce Hadoop, which makes it possible to gain business value from Big Data.

Apache Hadoop is an open source project providing software for reliable and scalable distributed computing. A simple programming model provides the ability to process large data sets in a distributed way. This is achieved by using a cluster of machines for distributed processing and storage, which makes it possible for Hadoop to scale easily as required. Hadoop consists of three subprojects: Hadoop Common, Hadoop Distributed File System (HDFS) and Hadoop MapReduce. The Hadoop ecosystem also includes derived technologies that can be used on their own or together to achieve the desired outcomes. Some of these related projects are Hive, HBase, ZooKeeper, etc. For more details on each of the above projects, please visit http://hadoop.apache.org/

Core Hadoop is HDFS and MapReduce.

HDFS is the Hadoop Distributed File System. It splits files into blocks and distributes them across the nodes in a cluster, so that computation can run in parallel on the nodes that hold the data, which results in very fast processing of large data sets.

MapReduce is a programming model that makes it possible to perform parallel computing across the nodes in a cluster.
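To give a feel for the MapReduce idea, here is a minimal single-machine sketch in plain Python. It does not use the Hadoop API at all; the function names and the sample input are hypothetical, and on a real cluster the map and reduce steps would run in parallel on different nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group the emitted values by key, between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document stands in for a block of data held on one node.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


splits = ["big data is big", "data is growing"]
counts = word_count(splits)
print(counts)  # {'big': 2, 'data': 2, 'is': 2, 'growing': 1}
```

Because each call to `map_phase` only looks at its own split, and each call to `reduce_phase` only looks at one key, both steps can be spread across many machines without changing the result.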

For Business Intelligence, one of the Hadoop-related projects, called Hive, provides a data warehouse system on top of Hadoop. It works with data held in Hadoop-compatible storage (such as HDFS or Apache HBase) and allows querying, analysing and summarising big data using a SQL-like query language called HiveQL.

Data is growing faster than ever and at the moment it doubles every year! This will soon become astronomical and get out of hand, as around 80% of this data is Unstructured Data. Projects like Apache Hadoop make it possible to analyse Big Data, and related projects such as Hive provide the equivalent of data warehousing for further storage and analysis of the relevant data.



On Unstructured Data

Posted by Anahita | Posted in Business Intelligence | Posted on 14-12-2011



The name “Unstructured Data” does not really describe the type of data it refers to.

Generally, when organisations use systems and applications, there is a database in the back end, mainly in “Relational” format. In a “Relational” database, data is designed to be saved in tables that relate to each other in a way that follows certain rules, called normal forms. This database design model allows the users of the corresponding systems, such as ERP systems, to insert, amend and delete data in the quickest way. It is all about the performance of applications and their screens.

But normalised data in relational databases is not well suited to querying. To work around this, relational database designers use indexes and other methods for querying and displaying the data, but too many indexes reduce the performance of the system, so this is not an effective approach when reports are required on historical data.

To solve this problem, data often is remodelled as dimensional and saved into another database, usually a data warehouse or a data mart.
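The difference between the two shapes of data can be sketched with Python’s built-in `sqlite3` module. This is a much simplified stand-in for dimensional modelling, not a real data warehouse design: the table names, columns and figures are all hypothetical, and the reporting table is just a flattened copy of the normalised tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalised tables, as an operational system might hold them.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);

    INSERT INTO customers VALUES (1, 'North'), (2, 'South');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 70.0);

    -- Flattened reporting table: the join is paid once, at load time.
    CREATE TABLE sales_report AS
        SELECT o.id AS order_id, c.region AS region, o.amount AS amount
        FROM orders o JOIN customers c ON c.id = o.customer_id;
""")

# Reports now read from one table, with no joins against the live system.
totals = dict(conn.execute(
    "SELECT region, SUM(amount) FROM sales_report GROUP BY region"
))
print(totals)
conn.close()
```

The operational tables stay small and quick to update, while reporting queries run against the separate, query-friendly copy.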

That said, the data saved in systems and relational databases is only a fraction of the information held in an organisation. Any data that is not saved in a relational or dimensional database is referred to as “Unstructured Data”, despite the fact that this data may well have structure of its own!

Two examples of unstructured data that still have related structure are documents in a file system and the body of emails. As there is certainly structure to file systems, as well as data related to the information in the body of emails, these cannot be considered data with no structure, yet they are still categorised as unstructured data. Other examples of unstructured data are Microsoft Office files, such as Word documents, Excel spreadsheets and Visio diagrams, as well as pictures, scans, videos, webcasts, web data including social networks such as Facebook and Twitter, wikis, web blogs, and any text or picture data saved in any format, such as logs.

Statistics show that less than 20% of data in organisations is relational, so the remaining data is kept outside relational databases and considered unstructured data. Unstructured data also grows faster, and its variety and volume are far higher than those of relational data.

Up to now, it was practically impossible to run any sort of analysis on unstructured data due to its volume and variety. This is now becoming less of a problem, with new advanced methodologies in distributed computing.

In my next post, I will explain Apache Hadoop and how it can come to the rescue and create amazing ways to analyse “Big Data”.

Big Data – Volume, Variety and Velocity!

Posted by Anahita | Posted in Business Intelligence | Posted on 05-12-2011



For many years, companies were concerned with the data they didn’t have. The main worry was how to collect the required data. Many applications and systems were developed to give organisations the means of gathering that data.

With the collection of data growing in many systems and applications, companies then faced other challenges, such as how to make sense of the data they had and turn it into actionable insights. Data Services and Integration, along with Business Analytics and Performance Management tools, then came to the rescue.

In the last two years, companies have started to realise the volume, variety and velocity of data growth. In fact, 90% of the data in the world today has been created in the last two years!

Volume: These days an enterprise can have petabytes of information (1 petabyte is 1 million gigabytes), and it is growing. This is Big Data!

Variety: Big Data is not just the data stored in relational databases. It includes unstructured data in all documents, audio, video, and live web data such as wikis, blogs, tweets, Facebook, etc.

Velocity: The speed at which Big Data is produced makes it absolutely necessary to analyse it for insight as close as possible to when it happens, i.e. near live. There is no time to wait for later analysis.

Business Analytics and Intelligence is growing into the Big Data space as we speak.