A Survey on Big Data Analytics

Dr. B. C. Melinamath, HOD & Professor, Department of Computer Science & Engineering, SVERI’s COE, Pandharpur, Contact No: 9449811522, Email: bcmelinamath@coe.sveri.ac.in

1. Introduction

Big data analytics is the process of analyzing high-volume data with analytical techniques such as machine learning and other AI methods in order to uncover information, for example hidden patterns, unknown correlations, market trends and customer preferences, that can help organizations make informed business decisions. Big data analytics can be classified as analytics 1.0, analytics 2.0 and analytics 3.0: analytics 1.0 is descriptive, analytics 2.0 is predictive, and analytics 3.0 is prescriptive. Newer technologies such as in-memory analytics and in-database analytics are hallmarks of analytics 3.0. On a broad scale, data analytics technologies and techniques provide a means to analyze data sets and draw conclusions about them, which helps organizations make informed business decisions. Business intelligence (BI) queries answer basic questions about business operations and performance [2].

The importance of big data analytics

With specialized analytics systems and software, together with high-powered computing systems, big data analytics offers several business benefits, including:

  1. New revenue opportunities
  2. More effective marketing
  3. Better customer service
  4. Improved operational efficiency

Big data analytics applications enable different stakeholders to run different analytics programs, as shown in the figure below.

2. Technologies and Tools of Big Data Analytics

Traditional data warehouses may not be able to handle the processing demands posed by big data sets that need to be updated frequently or even continually, such as data on the online activities of website visitors or the performance of mobile applications [1].

As a result, many organizations that collect, process and analyze big data turn to NoSQL databases and to Hadoop and its companion data analytics tools, including:

  1. YARN: a cluster management technology and one of the key features of second-generation Hadoop.
  2. MapReduce: a software framework that allows developers to write programs that process huge amounts of unstructured data in parallel across a distributed cluster of processors or stand-alone computers.
  3. HBase: a column-oriented key/value data store built to run on top of the Hadoop Distributed File System (HDFS).
  4. Hive: an open source data warehouse system for querying and analyzing large data sets stored in Hadoop files.
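
The MapReduce programming model mentioned above can be illustrated with a minimal single-machine sketch in Python. It does not use Hadoop itself; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`) and the sample documents are assumptions made for illustration, and on a real cluster the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(mapped_pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce step: sum the grouped counts for each word."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(documents):
    # Here the map calls run one after another; a cluster would
    # distribute them across nodes, one input split per task.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))

# Hypothetical input documents.
documents = ["big data analytics", "big data tools", "data lake"]
counts = word_count(documents)
print(counts)
```

Keeping the three phases as separate functions mirrors how the framework separates user-written map and reduce logic from the grouping step it performs itself.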

3. How Big Data Analytics Works

In some cases, Hadoop clusters and NoSQL systems are used mainly as landing pads and staging areas for data before it is loaded into a data warehouse for analysis, frequently in a summarized form that is better suited to relational structures.

More often, however, big data analytics users adopt the idea of a Hadoop data lake that serves as the main repository for incoming streams of raw data. In such architectures, data can be analyzed directly in a Hadoop cluster. Data stored in HDFS must be organized, configured and partitioned correctly to obtain good performance from both extract, transform and load (ETL) integration jobs and analytical queries [3].
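
Why partitioning matters for query performance can be shown with a small in-memory sketch. It assumes hypothetical click events with an `event_date` field and groups them under path-like keys of the form `event_date=YYYY-MM-DD`, a naming style commonly used for partitioned tables; nothing here reads from or writes to HDFS.

```python
from collections import defaultdict

# Hypothetical raw click events; the field names are illustrative only.
events = [
    {"event_date": "2024-01-01", "user": "a", "page": "/home"},
    {"event_date": "2024-01-01", "user": "b", "page": "/cart"},
    {"event_date": "2024-01-02", "user": "a", "page": "/home"},
    {"event_date": "2024-01-03", "user": "c", "page": "/help"},
]

def partition_by_date(records):
    """Group records under a path-like key such as 'event_date=2024-01-01'."""
    partitions = defaultdict(list)
    for record in records:
        partitions[f"event_date={record['event_date']}"].append(record)
    return dict(partitions)

def query_partition(partitions, event_date):
    """Read only the partition for one date instead of scanning every record."""
    return partitions.get(f"event_date={event_date}", [])

partitions = partition_by_date(events)
jan_first = query_partition(partitions, "2024-01-01")
print(sorted(partitions))
print(len(jan_first))
```

A query restricted to one date touches a single partition, which is the effect a well-chosen partitioning scheme aims for on a much larger scale.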

Once the data is ready, it can be analyzed with the software commonly used for advanced analytics processes. That includes tools for:

  1. Data mining, which sifts through data sets in search of patterns and relationships;
  2. Predictive analytics, which builds models to forecast customer behavior and other future developments;
  3. Machine learning, which uses algorithms to analyze large data sets; and
  4. Deep learning, a more advanced offshoot of machine learning.
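
As a rough illustration of the predictive analytics item in the list above, the sketch below fits a straight line to historical values with ordinary least squares and uses it to forecast the next period. The data (`months`, `units_sold`) is invented for the example, and a linear trend is only one of many possible models.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Apply the fitted line to a new input value."""
    return slope * x + intercept

# Hypothetical historical data: month index and units sold.
months = [1, 2, 3, 4, 5]
units_sold = [100, 120, 140, 160, 180]

slope, intercept = fit_line(months, units_sold)
forecast = predict(slope, intercept, 6)
print(slope, intercept, forecast)
```

The same fit-then-predict pattern applies to more elaborate models; only the fitting step changes.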

4. What Are the Key Technologies?

No single technology encompasses big data analytics. There are certainly advanced analytics techniques that can be applied to big data, but in practice several types of technology work together to help you get the most value from your information [4]. Here are the biggest players:

  1. Machine learning: A specific subset of AI that trains a machine how to learn, machine learning makes it possible to quickly and automatically produce models that can analyze bigger, more complex data and deliver faster, more accurate results, even on a very large scale. By building precise models, an organization has a better chance of identifying profitable opportunities or avoiding unknown risks.
  2. Data mining: Data mining technology helps you examine large amounts of data to discover patterns, and this information can be used for further analysis to help answer complex business questions. With data mining software, you can sift through the chaotic and repetitive noise in data, pinpoint what is relevant, use that information to assess likely outcomes, and then accelerate the pace of making informed decisions.
  3. Hadoop: This open source software framework can store large amounts of data and run applications on clusters of commodity hardware. It has become a key technology for doing business because of the steady increase in data volumes and varieties, and its distributed computing model processes big data fast.
  4. Predictive analytics: Predictive analytics technology uses data, statistical algorithms and machine learning techniques to identify the likelihood of future outcomes based on historical data. It is all about providing the best assessment of what will happen in the future, so that organizations can feel more confident that they are making the best possible business decision.
  5. Text mining: With text mining technology, you can analyze text data from the web, comment fields, books and other text-based sources to uncover insights you had not noticed before. Text mining uses machine learning or natural language processing technology to comb through documents (emails, blogs, Twitter feeds, surveys, competitive intelligence and more) to help you analyze large amounts of information and discover new topics and term relationships.
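
The idea of discovering topics and term relationships through text mining can be sketched with simple term counts and co-occurrence counts. This is a minimal illustration under stated assumptions: the comments, the stop-word list and the function names are hypothetical, and real text mining systems use far richer language processing.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; real systems use much larger ones.
STOP_WORDS = {"the", "is", "a", "and", "was", "but", "to", "of"}

def tokenize(text):
    """Lowercase the text, split it into words and drop stop words."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOP_WORDS]

def term_frequencies(docs):
    """Count how often each term appears across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts

def cooccurrences(docs):
    """Count how many documents contain each pair of distinct terms."""
    pairs = Counter()
    for doc in docs:
        unique_terms = sorted(set(tokenize(doc)))
        pairs.update(combinations(unique_terms, 2))
    return pairs

# Hypothetical customer comments.
comments = [
    "The delivery was late and the support was slow",
    "Late delivery but great support",
    "Great product and great price",
]

freqs = term_frequencies(comments)
pairs = cooccurrences(comments)
print(freqs.most_common(3))
print(pairs[("delivery", "late")])
```

Pairs that appear together in many documents, such as "delivery" and "late" here, hint at a term relationship worth investigating further.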

Conclusion:

This paper has outlined the concepts of big data, followed by its applications and the challenges it faces. Big data is an evolving field in which a significant part of the research is yet to be done. The speed and variety of data growth are increasing because of the spread of sensors and mobile phones with internet connectivity.

At present, big data is largely managed and processed with the Hadoop software framework. To fully realize the potential of big data in the future, extensive research needs to be carried out and innovative technologies need to be developed.

References

[1] S.J.Samuel, K.RVP, K.Sashidhar, C.R.Bharathi, “A survey on big data and its research challenges”, ARPN Journal of Engineering and Applied Sciences, Vol.10, No.8, Pp.3343-3347, 2015.

[2] S.Kuchipudi, T.S.Reddy, “Applications of Big data in Various Fields”, International Journal of Computer Science and Information Technologies (IJCSIT), Vol.6, No.5, Pp.4629-4632, 2015.

[3] S.Mukherjee, R.Shaw, “Big Data–Concepts, Applications, Challenges and Future Scope”, International Journal of Advanced Research in Computer and Communication Engineering, Vol.5, No.2, 2016.

[4] A.Misra, A.Sharma, P.Gulia, A.Bana, “Big Data: Challenges and Opportunities”, International Journal of Innovative Technology and Exploring Engineering (IJITEE), Vol.4, No.2, Pp.41-42, 2014.
