I spent a very stimulating day at the Computing Magazine Big Data Summit, and my thanks go to them for laying on a well-attended and very well-organised event. You can find the report on the Summit here: http://www.computing.co.uk/ctg/news/2188239/gallery-computings-summit
The issue of Big Data sits at the crossroads of Business Strategy and Technology, so I thought it would be worthwhile summarising some of the points made by the speakers and the various panels through the day.
So what is Big Data and why is it important?
Today it is estimated that businesses access between 1% and 5% of their data. The remainder lies effectively hidden and inaccessible (unsearchable, unstructured and siloed) in discrete legacy systems throughout organisations. This is not to suggest that this data is historic data – much of it is being generated now, but businesses do not have the tools to turn it to good use. It is this huge amount of data, which is increasing exponentially, that is called “Big Data”.
If you can adopt the right tools then the management and analysis of Big Data can provide a significant competitive advantage.
Let's look at some examples of this:
- Common Repositories: consolidate information from a variety of sources for better access and maintenance
- Digital Content Delivery: repurpose existing information and distribute across devices and channels
- Information Intelligence: exploit heterogeneous information, leveraging content analytics to discover trends and patterns
- Social Applications: share information to improve processes and support better decision making
- Metadata Catalogue: maintain repository of metadata to facilitate information sharing and discoverability
The Six Vs of Big Data
The sources of the data go beyond business transaction data such as would be captured within an accounting, ERP or database system. A significant part of it comes from user-generated content: emails, Word documents, videos, PDFs and more. In addition, there is now the whole body of web-generated, social-media-related content, much of it external but nonetheless relevant to the business. This breadth of different sources of data is known as application sprawl.
Big Data is not static. Its structure can vary and change rapidly, making dependence on tabular data storage formats nearly impossible. Instead, Big Data needs to be addressed in a format sufficiently dynamic to cope with current and future changes.
Data volumes are increasing by many multiples, as much as 75x by 2020. Business transactional data is growing along a linear scale and may double in the next few years. Human/machine-driven content is expected to grow 10x in the same period. The web-based data landscape, driven in significant part by video traffic, is expected to increase 100x.
As another illustration of volume:
- A total of 20 Petabytes of HDD Capacity were manufactured in 1995
- By 2011 there were 39,000 Petabytes of unstructured data on the web.
- This is forecast to reach 226,000 Petabytes by 2015.
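To put the last two figures in perspective, here is a minimal sketch of the arithmetic, taking the 2011 and 2015 numbers quoted above at face value. The function names are my own, and the annual rate assumes smooth compound growth between the two dates, which is a simplification rather than anything the speakers claimed.

```python
# Figures quoted in the list above (petabytes of unstructured data on the web).
PB_2011 = 39_000
PB_2015_FORECAST = 226_000


def growth_multiple(start, end):
    """How many times larger the end figure is than the start figure."""
    return end / start


def implied_annual_growth(start, end, years):
    """Compound annual growth rate implied by two data points.

    Assumes steady compounding between the two dates (an illustrative
    simplification, not a claim made in the presentations).
    """
    return (end / start) ** (1 / years) - 1


multiple = growth_multiple(PB_2011, PB_2015_FORECAST)
rate = implied_annual_growth(PB_2011, PB_2015_FORECAST, years=2015 - 2011)

print(f"Growth multiple 2011 to 2015: {multiple:.1f}x")
print(f"Implied annual growth: {rate:.0%}")
```

On these figures the forecast is a little under a sixfold increase in four years, which works out at well over 50% growth a year.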
The speed at which data can now be generated is so great that this alone can overwhelm the ability of businesses to manage it.
The data has value but only if you are able to derive insights from it. Understanding Big Data is not about analytics but interpretation – using the data to answer questions such as why? and how? rather than how much?
Virtual Reality Management
Big Data offers the prospect of, if not real-time information, then near real-time information for business, which in turn offers the prospect of “Virtual Reality” business management. Instead of depending on information which is months or at best weeks out of date, businesses can be monitored on a virtual basis as sales are made. Directors can then make faster, better informed and often prediction-based management decisions, particularly if external market, environmental or other third-party data is incorporated into the analysis.
Why does it have value?
Harnessing Big Data offers businesses the opportunity to create additional value, for example:
- the ability to price individually based on predictive behaviour
- drive new product and pricing mechanisms
- construct new operating models
- devise completely new businesses
But allied to this comes additional responsibility: privacy, security, confidentiality, conflicts of interest and commercial ethics all become important issues for businesses to address.
How do you manage Big Data?
Big Data represents a financial and operational challenge. This can only be addressed at scale if you can remove the human element from the process. Businesses need to remove the requirement for administrative intervention and for manual or repeated backup and recovery, and to eliminate human error at the time of capture.
In order to make Big Data manageable, it needs to be organised at the time of capture, which often involves tagging it with metadata. When data is stored with its metadata, future search and retrieval become far easier than searching untagged data. Pre-sorting the data may also achieve significant compression without losing any information, and at scale this can have major cost advantages.
The metadata can help to overcome the problems of storing different types of information, as retrieval becomes dependent on the metadata rather than on accessing the source material.
This is referred to as “Smart Storage” rather than “Dumb Storage”.
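As a rough illustration of the idea, here is a minimal sketch of tagging at the time of capture. It is a toy in-memory example of my own, not a product or method shown at the Summit: the class name `SmartStore`, its methods and the tag names (`doc_type`, `customer`) are all hypothetical. The point it shows is that a search only ever touches the metadata index, never the stored content itself.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class StoredItem:
    item_id: int
    content: bytes   # the source material, whatever its type
    metadata: dict   # tags attached at the time of capture


class SmartStore:
    """Toy in-memory store (hypothetical): tags items on capture and indexes the tags."""

    def __init__(self):
        self._items = {}
        self._index = defaultdict(set)  # (tag name, tag value) -> item ids

    def capture(self, content, **tags):
        """Store content together with its metadata and update the index."""
        item_id = len(self._items) + 1
        metadata = {"captured_at": datetime.now(timezone.utc).isoformat(), **tags}
        self._items[item_id] = StoredItem(item_id, content, metadata)
        for name, value in tags.items():
            self._index[(name, value)].add(item_id)
        return item_id

    def find(self, **tags):
        """Return ids of items matching every tag, using only the metadata index."""
        if not tags:
            return sorted(self._items)
        matches = [self._index.get((name, value), set()) for name, value in tags.items()]
        return sorted(set.intersection(*matches))


store = SmartStore()
invoice_id = store.capture(b"invoice bytes", doc_type="invoice", customer="acme")
email_id = store.capture(b"email bytes", doc_type="email", customer="acme")

print(store.find(customer="acme"))                      # both items
print(store.find(doc_type="invoice", customer="acme"))  # the invoice only
```

Because the content is stored as opaque bytes, the same approach works whether the item is an email, a PDF or a video, which is the sense in which the metadata sidesteps the problem of mixed information types.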
What are the costs of Big Data?
Storing data is far from free. The financial costs have been calculated to involve far more than just storage:
- Acquisition = 5% of costs
- Storage = 11%
- Retrieval = 6%
- Distribution = 15%
- Delivery = 12%
- Processing = 34%
- Other non-data Costs = 17%
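To see what this breakdown means in money terms, the sketch below applies the percentages above to a budget. The budget figure of 1,000,000 is purely hypothetical and chosen only to make the arithmetic easy to follow; the function name is my own.

```python
# Percentage shares as quoted in the list above.
COST_SHARES = {
    "Acquisition": 5,
    "Storage": 11,
    "Retrieval": 6,
    "Distribution": 15,
    "Delivery": 12,
    "Processing": 34,
    "Other non-data costs": 17,
}

# The quoted shares should account for the whole budget.
assert sum(COST_SHARES.values()) == 100


def allocate(total_budget):
    """Split a total budget across the categories using the quoted shares."""
    return {name: total_budget * share / 100 for name, share in COST_SHARES.items()}


HYPOTHETICAL_BUDGET = 1_000_000  # illustrative figure only, not from the Summit
allocation = allocate(HYPOTHETICAL_BUDGET)

for name, amount in sorted(allocation.items(), key=lambda pair: pair[1], reverse=True):
    print(f"{name:<22}{amount:>12,.0f}")

non_storage_share = 100 - COST_SHARES["Storage"]
print(f"Share of cost that is not storage: {non_storage_share}%")
```

Sorting by amount makes the point of the list plain: storage itself is a small slice, and processing is by far the largest.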
The Big Data Adoption Curve
It is possible to characterise businesses in terms of their position on the adoption curve in four outline stages:
- Data Wasters
- Data Collectors
- Aspiring Data Managers
- Strategic Data Managers
In practice today, the bulk of businesses are to be found in the middle two groups; very few can yet genuinely claim to be Strategic Data Managers.
So what Big Data do I need to keep?
Faced with this fire hydrant of information coming at business, careful decisions need to be made to capture only the relevant and strategically important data, although this is far easier said than done. The process can be summarised as:
- Information Capture
- Value Assessment
- Data Distillation
- Build Business Model
- Change Business Process
- And continually repeat the cycle
This suggests the requirement for a specific Business Data Scientist who can combine the complex skills of understanding both Big Data Management and Business Process.
It also suggests challenges for infrastructure development. The current server-based storage solutions are already giving way to faster, more complex storage involving parallel rather than serial processing, solid state drives and cloud-based/hosted storage solutions.
This short summary of several hours of presentations, from much better informed people than myself, can be distilled into the following steps:
- First, evaluate what business data/information is available within your existing business
- Ask yourself what value this can deliver
- Make Big Data a business-driven initiative, based on your business KPIs first and foremost
- Then consider the technology solutions suitable to address your Big Data initiative
- Consider using both internal and external expertise, deployment options and business models
- Review information governance, security, privacy, quality, standardisation and life cycle management
- Remember it's about People, Processes and Products, i.e. your existing business!
To learn more about Big Data from the Computing Big Data Summit go to http://www.computing.co.uk/ctg/news/2188239/gallery-computings-summit
My thanks to all the presenters for improving my understanding of the subject. I hope this brief distillation of some of the major issues will help you address the Big Data challenges in your business.