What is Big Data Analysis?
Big data is generated by social media websites, devices, sensors, audio and video sources, networks,
log files, and the web. It is generated in real time and on a very large scale. Big data analytics
is the process of examining these large and varied data sets to uncover
hidden patterns, unknown correlations, and other useful information.
How to Analyze Big Data with Hadoop Technologies:
Nowadays, with rapid innovation, the frequent evolution of technologies, and a growing internet
population, systems are generating huge amounts of data, to the tune of terabytes and even petabytes
of information. This data is generated in huge volumes and at high velocity, in all kinds of multi-structured formats
such as weblogs, sensor data, images, and videos.
Hadoop provides the ability to store large-scale data on HDFS, and there are many solutions available in the
market for analyzing this huge data, such as Hive, Pig, and MapReduce. Each of these
data analysis technologies has advanced to make analyzing big data practical.
Let us look at the Hadoop data analysis technologies that can be used to analyze the huge volume of stock data being generated.
MapReduce:
1. Powerful model for parallelism
2. Based on a rigid procedural structure
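
To make the map and reduce steps concrete, here is a minimal single-process sketch in Python. It is not Hadoop code: the sample records, the function names (map_phase, shuffle, reduce_phase), and the choice of computing an average closing price per stock symbol are all assumptions made for illustration. A real Hadoop job would run the same three steps in parallel across many machines.

```python
from collections import defaultdict

# Hypothetical sample records: (stock symbol, closing price).
records = [
    ("AAA", 10.0),
    ("BBB", 20.0),
    ("AAA", 14.0),
    ("BBB", 22.0),
    ("AAA", 12.0),
]

def map_phase(record):
    """Emit one (key, value) pair for a single input record."""
    symbol, price = record
    yield symbol, price

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Collapse all values for one key into a single result (the mean)."""
    return key, sum(values) / len(values)

mapped = [pair for record in records for pair in map_phase(record)]
grouped = shuffle(mapped)
average_close = dict(reduce_phase(key, values) for key, values in grouped.items())
print(average_close)
```

The rigid structure mentioned above shows up even in this toy version: every computation has to be expressed as a map step that emits key/value pairs followed by a reduce step over each key's values.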
Pig:
1. Procedural data flow language
2. Used by programmers and researchers
Hive:
1. Declarative SQL-like language
2. Used by analysts for generating reports
Which Data Analysis Technology Should Be Used?
Based on the sample dataset, the data has the following properties:
1. It needs to be joined to calculate stock covariance
2. The data has a structured format
3. In a real environment, the data size would be very large
4. It can be organized into a schema
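
As a rough illustration of why a join is needed for stock covariance, the sketch below pairs up the values of two stocks by trading date and then computes the population covariance. The row layout (symbol, date, daily return), the sample values, and the function name covariance are assumptions made for this example and are not taken from the actual dataset.

```python
# Hypothetical rows: (stock symbol, trading date, daily return).
rows = [
    ("AAA", "2024-01-01", 1.0),
    ("AAA", "2024-01-02", 2.0),
    ("AAA", "2024-01-03", 3.0),
    ("BBB", "2024-01-01", 2.0),
    ("BBB", "2024-01-02", 4.0),
    ("BBB", "2024-01-03", 6.0),
]

def covariance(rows, symbol_a, symbol_b):
    """Population covariance of two symbols over the dates they share."""
    a = {date: value for symbol, date, value in rows if symbol == symbol_a}
    b = {date: value for symbol, date, value in rows if symbol == symbol_b}
    # The "join": keep only dates present for both symbols.
    common_dates = sorted(a.keys() & b.keys())
    n = len(common_dates)
    if n == 0:
        raise ValueError("the two symbols share no trading dates")
    mean_a = sum(a[d] for d in common_dates) / n
    mean_b = sum(b[d] for d in common_dates) / n
    return sum((a[d] - mean_a) * (b[d] - mean_b) for d in common_dates) / n

print(covariance(rows, "AAA", "BBB"))
```

Because the two series only make sense when matched date by date, the data has to be joined on the date before any covariance can be computed.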
Based on the above analysis of the features of these technologies, we can conclude the following.
If we use MapReduce, then complex business logic needs to be written to handle the joins.
A lot of development effort needs to go into deciding how the map and reduce steps will take
place, we would not be able to map the data into a schema format, and all of that effort needs to be handled by hand.
If we use Pig, then we would not be able to partition the data, which could otherwise be used
for sample processing on a subset of the data for a particular stock symbol.
Hive not only provides a familiar programming model for people who know SQL, it also eliminates a lot of the typical and sometimes tricky
coding that we would have to do in MapReduce programming.
If we apply Hive to analyze the stock data, it will also reduce the development
time, and joins between stock data can be managed using HiveQL. So, based on the above discussion, Hive seems
the perfect choice for this case study.
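
The appeal of a declarative, SQL-like approach can be sketched with Python's built-in sqlite3 module standing in for Hive. This is only an analogy: SQLite is not Hive, the query is plain SQL rather than HiveQL, and the table name stock_returns, its columns, and the sample values are all assumptions made for illustration. The point is that the join and the aggregation are stated in one query instead of being hand-coded.

```python
import sqlite3

# In-memory database standing in for a Hive table (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock_returns (symbol TEXT, trade_date TEXT, ret REAL)"
)
conn.executemany(
    "INSERT INTO stock_returns VALUES (?, ?, ?)",
    [
        ("AAA", "2024-01-01", 1.0),
        ("AAA", "2024-01-02", 2.0),
        ("AAA", "2024-01-03", 3.0),
        ("BBB", "2024-01-01", 2.0),
        ("BBB", "2024-01-02", 4.0),
        ("BBB", "2024-01-03", 6.0),
    ],
)

# Self-join on the trading date, then compute the population covariance
# per pair of symbols as E[XY] - E[X] * E[Y].
query = """
    SELECT a.symbol, b.symbol,
           AVG(a.ret * b.ret) - AVG(a.ret) * AVG(b.ret) AS covariance
    FROM stock_returns a
    JOIN stock_returns b ON a.trade_date = b.trade_date
    WHERE a.symbol < b.symbol
    GROUP BY a.symbol, b.symbol
"""
result = conn.execute(query).fetchall()
for symbol_a, symbol_b, cov in result:
    print(symbol_a, symbol_b, cov)
conn.close()
```

The join condition and the grouping are each a single clause here, which is the kind of development effort the declarative model saves compared with writing the equivalent map and reduce logic by hand.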
Advantages of Big Data Analysis:
Big data analysis allows researchers, market analysts, and business users to develop deep
insights from the available data, resulting in many business advantages:
1) Whenever users browse shopping sites, travel portals, hotels, or flights, or add a particular item to their cart,
ad targeting companies can analyze this wide variety of data and activity and can provide better
recommendations to the user about discounts, offers, and deals based on the user's product history and browsing
history.
2) In the telecommunications space, customers may move from one service to another. By analyzing
huge volumes of call data records, the different issues faced by the customers can be identified. Based on analyzing these
issues, a telecom company can determine whether it needs to place a new tower in a particular urban area, so that
customer churn can be minimized.
