Featured Post

Agile Analytics by Ken Collier

I am currently reading a book by Ken Collier, called “Agile Analytics: A Value-Driven Approach to Business Intelligence and Data Warehousing”. This book is specifically written for Agile BI and Data Warehouse projects and includes a BI project scenario for a fictitious company called FlixBuster. The...

Apache Sqoop

Posted by Anahita | Posted in Business Intelligence | Posted on 24-11-2013



Apache Sqoop transfers bulk data between Apache Hadoop and relational datastores. It is used to import data into HDFS or related datastores such as Hive and HBase, and to export data in bulk from HDFS or those related datastores back into relational databases such as HSQLDB.

Moving data in bulk this way makes it more efficient to bring relational data into Hadoop for analysis.
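
As a rough illustration of an import, the sketch below only builds the argument list for a `sqoop import` run and prints it; it does not execute anything. The JDBC URL, table name and HDFS directory are made-up values, and the helper `build_sqoop_import` is my own name, not part of Sqoop.

```python
import shlex


def build_sqoop_import(jdbc_url, table, target_dir, num_mappers=4):
    """Return the argument list for a `sqoop import` run (nothing is executed)."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC URL of the source database
        "--table", table,                   # relational table to copy
        "--target-dir", target_dir,         # HDFS directory that receives the files
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


# Hypothetical values, for illustration only.
command = build_sqoop_import(
    "jdbc:hsqldb:hsql://dbhost/sales", "orders", "/data/orders"
)
print(shlex.join(command))
```

On a machine with Sqoop installed, the printed line could be pasted into a shell, after adding whatever credentials the database needs.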


What is Machine Learning?

Posted by Anahita | Posted in Business Intelligence, Technology | Posted on 03-02-2013


Computers and statisticians can both use data, but they go about it in completely different ways. Statistics is about using data to enable humans to identify patterns and gain insight. Statistical and mathematical models and methods can also be turned into tools and methodologies for computers, which the machine then uses to perform the required tasks.

When we teach computers to give us insight into data, we teach them to extract information through algorithms that identify patterns in a large volume of noise. These algorithms, also known as pattern recognition algorithms, are also used to automate tasks, and they let us train machines to put data into context using a training set of data.

Two main types of problem are solved through machine learning: classification, which assigns data to categories, and regression, which predicts a numeric value.
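
To make the two problem types concrete, here is a minimal sketch in plain Python. It uses a one-nearest-neighbour rule for classification and a least-squares straight line for regression; both are simple textbook methods chosen for illustration, and the training data is made up.

```python
def nearest_neighbour_label(training_set, point):
    """Classification: return the label of the closest training example."""
    closest = min(training_set, key=lambda example: abs(example[0] - point))
    return closest[1]


def fit_line(xs, ys):
    """Regression: least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Made-up training set: basket value -> customer segment.
labelled = [(5.0, "low"), (8.0, "low"), (40.0, "high"), (55.0, "high")]
segment = nearest_neighbour_label(labelled, 47.0)

# Made-up training set: visits per month -> monthly spend.
slope, intercept = fit_line([1, 2, 3, 4], [10, 20, 30, 40])
predicted_spend = slope * 5 + intercept

print(segment, predicted_spend)
```

The classifier answers with a category taken from the training set, while the regression answers with a number that may never have appeared in the training set.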

In future posts I will introduce you to some of the methods used in machine learning and their real-life applications in Big Data.

Big Data Applications in Online Retail

Posted by Anahita | Posted in Business Analytics, Business Intelligence | Posted on 27-01-2013



This is the first of a series of posts where I simply list some applications of big data analytics in various industries and related business opportunities.

In retail, especially online retail, business growth and profitability are directly connected to the customer.

Marketing campaigns are only successful if they can achieve what they intended: get customer attention, sell products and keep the business relationship active.

The data that a customer produces when visiting a retail website is kept in unstructured log files. Every single move, every single click, all basket items added and removed, all saved items, all page visits and all product views are recorded. During a marketing campaign, content such as a video, picture or promotion is intended to change the normal patterns of behaviour, prompting the website visitor to respond to the campaign. Measuring this change of behaviour is not just about the success or failure of the campaign, but also about how individual customers responded to it. This gives insight into the effectiveness of the campaign, which can be used to inform future marketing initiatives.
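
As a small sketch of measuring that change per customer, the code below counts "basket_add" events before and after a campaign start date. The log format (timestamp, customer id, event name), the event names and the campaign date are all assumptions made for the example; real web logs are far messier.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log lines: ISO timestamp, customer id, event name.
LOG_LINES = [
    "2013-01-10T09:00:00 c1 page_view",
    "2013-01-10T09:01:00 c1 basket_add",
    "2013-01-10T09:05:00 c2 page_view",
    "2013-01-20T10:00:00 c1 page_view",
    "2013-01-20T10:02:00 c1 basket_add",
    "2013-01-20T10:03:00 c1 basket_add",
    "2013-01-20T11:00:00 c2 page_view",
]
CAMPAIGN_START = datetime(2013, 1, 15)  # assumed campaign launch date


def basket_adds_by_period(lines, campaign_start):
    """Count basket_add events per customer before and after the campaign."""
    before, after = Counter(), Counter()
    for line in lines:
        stamp, customer, event = line.split()
        if event != "basket_add":
            continue
        when = datetime.fromisoformat(stamp)
        (before if when < campaign_start else after)[customer] += 1
    return before, after


before, after = basket_adds_by_period(LOG_LINES, CAMPAIGN_START)
customers = {line.split()[1] for line in LOG_LINES}
# Positive numbers mean the customer added more items after the campaign began.
change = {c: after[c] - before[c] for c in sorted(customers)}
print(change)
```

The result is per customer rather than one overall figure, which is the point made above: the interesting question is how each individual responded.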

Another application of big data analytics is to adjust and align marketing activities with sales goals by targeting the right customers and channels at the right time to convey the right message.

Big Data Analytics provides a new way to look at data that’s huge in volume, not saved in a structured format and subject to unpredictable and constant change!

Big Data Infographic

Posted by Anahita | Posted in Business Analytics, Business Intelligence | Posted on 26-01-2013



Taming Big Data | A Big Data Infographic
Via: Wikibon Big Data

Big Data Analytics Presentation

Posted by Anahita | Posted in Business Analysis, Business Intelligence | Posted on 09-12-2012



A weekend effort to put together a simple summary of Big Data, concentrating on its application in customer-driven industries such as retail.

Click on the link to download this presentation.

Big Data PowerPoint Presentation


Hadoop Explained!

Posted by Anahita | Posted in Business Analytics, Technology | Posted on 09-12-2012


Big Data Explained!

Posted by Anahita | Posted in Business Intelligence, Technology | Posted on 09-12-2012



Big Data

Posted by Anahita | Posted in Business Intelligence | Posted on 18-11-2012



Big Data

Information Management in Big Data Era

Posted by Anahita | Posted in Data Governance, Information Management | Posted on 01-06-2012


Traditionally, IT maintained all information systems and provided the business with the answers it required from data and applications, which usually resided in various databases for CRM, ERP and finance systems.

This approach can no longer meet the requirements of organisations in what I would like to call the Big Data Era! The speed at which information is created and the ever-increasing volume of data leave no time for lengthy record management or for what I call the Business-IT-Business cycle. In other words, IT cannot be a bottleneck for information in the Big Data Era.

So what is the way forward, and how can businesses meet their information requirements for decision making, which demand instant access to all relevant data assets?

The answer is to put in place an infrastructure of business-oriented information professionals who have both good IT and business management skills, supported by automated underlying technology. These professionals could be grouped by area of information requirement and put in charge of focused responsibilities such as data governance, data stewardship, visual discovery and social/mobile information.

As part of this, a Master Data Management (MDM) framework has to be implemented or, if already in place, evolved, in order to establish processes for creating, updating, managing and monitoring the organisation's most critical asset: information! Implementing an MDM structure is not about technology, but about how an organisation as a whole takes responsibility for its information. The MDM programme has to include the social network side of the business, such as customer and supplier semantics in relation to products and services.

There must be no IT-Business-IT cycle, but a fully integrated information infrastructure that puts fast-changing business and information requirements into the hands of multi-skilled professionals, with the support of automation provided by the underlying technology.

The business will never have to wait for information from IT again!




In-Memory Technology and Big Data

Posted by Anahita | Posted in Business Intelligence, Data Warehouse | Posted on 14-05-2012


In my previous posts I wrote about Big Data and related keywords and technologies such as unstructured data, Hadoop HDFS, MapReduce, etc. In this post I look at what “in-memory technology” brings to the analysis of big data.

Business Intelligence is all about getting the right information to the right people at the right time, so they can make timely decisions that help the business achieve its goals, such as higher service efficiency, better customer experience and higher-quality products.

Dealing with Big Data creates many challenges, but above all it is the velocity challenge. The velocity challenge arises when there is a time lag between the moment the data is created and the moment the business can look at it and analyse it in order to correct behaviour or make related decisions.

In many cases the business cannot afford to wait for data to be consolidated in a data mart or data warehouse, or for aggregates to become available after the OLAP cubes are processed. Sometimes the information is required in “real time”, and this is where in-memory technology becomes important.

So what is in-memory technology? In short, it is when the data is stored in memory instead of on the hard disk. The 4GB maximum memory limit of 32-bit systems was removed with the introduction of 64-bit operating systems, and since the price of RAM is relatively low, huge amounts of data (terabytes, or thousands of gigabytes) can be stored in memory and made available for processing in real time. Having the data in memory means faster access to the very data that is required in real time.
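
The 4GB figure follows from simple arithmetic on address sizes, sketched below. These are theoretical address-space sizes only; the memory a real machine can use is lower and depends on the processor, the operating system and the hardware installed.

```python
GIB = 2 ** 30  # bytes in one gibibyte
EIB = 2 ** 60  # bytes in one exbibyte

addressable_32_bit = 2 ** 32  # distinct byte addresses with 32-bit pointers
addressable_64_bit = 2 ** 64  # distinct byte addresses with 64-bit pointers

print(addressable_32_bit // GIB, "GiB")  # 4 GiB
print(addressable_64_bit // EIB, "EiB")  # 16 EiB
```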

In summary, I have explained what in-memory technology means and why it is now an available option for business intelligence. The real benefit of in-memory technology is the real-time availability of data for Operational BI. It is used when a huge number of transactions need to be monitored and analysed in real time. This is very appealing to financial services for monitoring financial transactions, to call centre staff for real-time fraud detection while talking to customers, and to service companies that need to act quickly as the demands on their service capacity change.

In-memory technology is available from various vendors in products such as Microsoft SQL Server 2012 xVelocity and SAP HANA in SAP Business Objects BI 4.0. These solutions vary in nature and come with different capabilities and features, but they all make use of advances in hardware and software, such as in-memory technology and massively parallel processing, to reduce the gap between the data and the processor, remove bottlenecks and increase operational productivity. Implementing the technology through these vendors promises substantially faster query results, faster decision making with real-time data and, finally, a change in the way organisations get access to data and make use of the massive amount of information available!