Big data and predictive analytics
Today ‘big data’ has become a buzzword. Everyone is
talking about it. Big data are characterized by three attributes:
high volume, high velocity and high variety. A few researchers have also recently proposed
"high volatility" as a fourth attribute. We can define
big data as collections of data that are so large, complex and fast-growing
that they cannot be stored and processed using traditional methods. Such data
sets often run to petabytes or more, which makes even storing them difficult. Examples
of big data include web data, telecom data, sensor data from jet engines, and
DNA and RNA data.
The challenges of processing big data led to the development
of new technologies such as Hadoop and MapReduce. People claim that big data
can be processed in reasonable time using these technologies. I have yet to
use them myself, however, so I cannot judge how efficient they are.
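To make the MapReduce idea a little more concrete, here is a minimal sketch of its programming model in plain Python, counting words across a few documents. It runs on a single machine and does not use Hadoop at all; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are my own illustrative choices, not part of any real framework.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "data is everywhere"]
    print(word_count(docs))
```

In a real cluster the map and reduce steps would run in parallel on many machines, each working on its own slice of the data. That parallelism is where the claimed speed comes from, and it is not something this toy version demonstrates.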
When we deal with data, we come across two kinds
of professionals. The first kind is the database architect or infrastructure
developer, who designs and models the database that stores the data. They also
write programs to extract the raw data and convert it into a usable format.
To acquire expertise in this field, a person can enroll in courses such as
database engineering and design or computer networking. To learn a
specific database such as Oracle, Db2 or Redshift, a person can opt for one of
the certificate programs offered by many international organizations.
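The "extract the raw data and convert it into a usable format" step mentioned above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the column names, the sample rows in RAW and the helper names are all made up, and a real pipeline would read from files or a database rather than a string.

```python
import csv
import io

# Hypothetical raw export: stray spaces and a missing value, as raw data often has.
RAW = """customer_id,signup_date,monthly_spend
101,2015-03-01, 42.50
102,2015-03-04,
103,2015-03-09,17.00
"""


def extract_rows(raw_text):
    """Parse the raw CSV text into a list of dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(row):
    """Convert one raw row into typed values; an empty spend becomes None."""
    spend = row["monthly_spend"].strip()
    return {
        "customer_id": int(row["customer_id"]),
        "signup_date": row["signup_date"],
        "monthly_spend": float(spend) if spend else None,
    }


def load_usable(raw_text):
    """Extract and transform every row of the raw text."""
    return [transform(row) for row in extract_rows(raw_text)]


if __name__ == "__main__":
    for record in load_usable(RAW):
        print(record)
```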
The second kind of professional is the data scientist or
analyst. They use data to derive useful information for research, business or
governance. Earlier, business leaders made decisions based on their convictions,
but now they also try to verify their hypotheses using data.
Professionals who are experts in processing data and studying it to extract
information are called data scientists. They also develop new data mining
methods and learning algorithms. Data scientists use a range of
statistical algorithms to uncover the information underlying the data. Many of
these algorithms are also called machine learning algorithms.
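As one small example of such a statistical algorithm, here is a sketch of simple linear regression fitted by ordinary least squares, written in plain Python. It is a toy: the function names fit_line and predict and the numbers in the usage example are invented for illustration, and in practice a data scientist would normally reach for an established library instead.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use a fitted (slope, intercept) pair to predict y for a new x."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Invented data: months as a customer vs. number of purchases.
    months = [1, 2, 3, 4]
    purchases = [3, 5, 7, 9]
    model = fit_line(months, purchases)
    print(model, predict(model, 5))
```

The model "learns" the slope and intercept from past data and then predicts a value for a case it has not seen. That is the basic pattern behind predictive analytics.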
Data-driven decisions are made mostly in the banking,
telecom, retail, pharmaceutical and internet industries. In business, data
are used to derive information for marketing, customer retention and growth,
credit history assessment and clinical trials. Data analysis helps business
leaders make decisions that deliver their business goals at minimum cost.