ISSN: 2456–5474 RNI No.  UPBIL/2016/68367 VOL.- IX , ISSUE- I February  - 2024
Innovation The Research Concept

Knowing Big Data: Architecture and Real-World Applications

Paper Id :  18559   Submission Date :  11/02/2024   Acceptance Date :  22/02/2024   Publication Date :  25/02/2024
This is an open-access research paper/article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
DOI:10.5281/zenodo.10754821
For verification of this paper, please visit http://www.socialresearchfoundation.com/innovation.php#8
Monika Juneja
Assistant Professor
Department Of Computer Science
Guru Nanak College
Sri Muktsar Sahib,Punjab, India
Abstract

The digital age has fundamentally changed how industries and fields operate worldwide. The vast amount of information generated by diverse sources and activities has given rise to big data. Traditional techniques and platforms handle such data inefficiently, with slow response times, limited scalability, and compromised performance and accuracy. Significant effort has therefore gone into developing distributions and technologies that can cope with the complexity of big data. This paper examines the impact of big data across diverse sectors, including healthcare, finance and education. It shows how big data transforms these sectors by fostering innovation and improving decision-making with insights that were previously unavailable. Big data technologies collect and analyse vast amounts of information efficiently. The paper also explores the impact of big data technology, focusing on Hadoop and MapReduce as innovations in data processing. Despite the significant strides made, big data still faces distinct challenges that demand attention and resolution.

Keywords: Big Data, Decision Making, Data Analytics, Application Areas, Digital Transformation.
Introduction

Digitization has brought about profound changes in how we live and work. It has transformed traditional industries, paving the way for online platforms and services. It has not only streamlined processes but also accelerated innovation, creating a dynamic and interconnected global landscape. Big data represents a paradigm shift in the way we handle and process vast amounts of information. The term refers to the immense volume, variety, and velocity of data that modern technologies generate. Big data encompasses vast and varied sets of structured, unstructured, and semi-structured data that grow continuously and exponentially. Organizations leverage big data to enhance operational efficiency, deliver superior customer service, craft customized marketing initiatives and take strategic actions that ultimately raise revenue and profitability. Big data analytics enables organizations to extract valuable knowledge, identify patterns and make informed decisions in real time. Every digital operation and every interaction on social media generates data. Platforms, sensors, and portable devices transmit this data. Big data thus arises from multiple sources that contribute data at a rapid pace, in significant volume and with great diversity. To derive meaningful value from big data, there is a requirement for optimal processing power, advanced analytics capabilities, and proficient skills [(Mereena Thomas) (2015)]. The continual advancement in computing and electronic technology has led to the substantial generation of raw data, which is projected to reach 44 trillion gigabytes. At present, individuals and systems flood the internet with a volume of data that doubles in size every two years [(Anurag Agrahari et al.) (2017)].
Before the advent of the Big Data revolution, companies faced challenges in storing extensive archives for prolonged durations and struggled to manage vast datasets efficiently. The management of Big Data demands substantial resources, novel methods, and robust technologies. It involves tasks such as cleaning, processing, analysing, securing, and facilitating access to vast and dynamic datasets [(Ahmed Oussous et al.) (2018)]. Recent progress in information technology (IT) has made data generation easier. The swift expansion of cloud computing and the Internet of Things (IoT) contributes further to the substantial increase in data volume [(Min Chen et al.) (2014)]. Big data enables more comprehensive answers by providing access to a greater amount of information. More complete answers give greater confidence in the data, leading to an entirely different approach to problem-solving. Big data is a potent catalyst propelling the expansive IT industries of the 21st century. It has become a cornerstone in industries ranging from healthcare and finance to marketing and beyond, reshaping the way we understand and use information in the digital age.

Aim of study

The objective of this paper is to study the architecture of big data and its real-world applications.

Review of Literature

This paper is grounded in a purposeful selection of journal articles within the domain of big data. In the preliminary stage, the analysis was limited to examining the abstracts of these papers to validate their pertinence to the field. More than 123 articles were examined systematically, covering keywords such as applications of big data, architecture of big data and utilization of big data. These articles were sourced through the Google Scholar search engine. Of these, 80 papers were examined comprehensively, selected by the relevance of their topics. Textual searches were employed to analyse the results. The exploration of related work focuses primarily on the application domains in which big data is used. The use of big data in organizations has become integral to decision-making and operational efficiency. By utilizing vast volumes of data, organizations gain valuable insights into customer behaviour, market trends, and internal operations. The information presented in the paper is characterized by a high degree of accuracy and reliability. It holds the potential to benefit governmental and non-governmental organizations alike. The research of selected authors is summarized in Table 1.

Table 1: Related studies

S. No | Author & Date | Studies
1 | Mereena Thomas, 2015 | To derive meaningful value from big data there is a requirement for optimal processing power, advanced analytics capabilities, and proficient skills.
2 | Anurag Agrahari et al., 2017 | The continual advancement in computing and electronic technology has led to the substantial generation of raw data, which is projected to reach 44 trillion gigabytes.
3 | Ahmed Oussous et al., 2018 | Big Data involves tasks such as cleaning, processing, analysing, securing, and facilitating access to vast and dynamic datasets.
4 | Min Chen et al., 2014 | The swift expansion of cloud computing and the Internet of Things (IoT) contributes further to the substantial increase in data volume.
5 | Conboy et al., 2020 | Volume indicates the dataset's size resulting from the combination of numerous variables and an even greater number of observations for each variable.
6 | Amir Gandomi et al., 2014 | The widespread adoption of digital devices like smartphones and sensors has resulted in an unparalleled surge in data generation, prompting an increasing demand for real-time analytics and planning based on evidence.
7 | Nikhil Madaan et al., 2020 | Data can originate both internally and externally to the enterprise. Variety encompasses diverse formats and types of information, as well as various methods and applications for analysing the data.
8 | Sivarajah et al., 2020 | The vast amount of data is neither uniform nor adheres to a specific template or format. It exists in various forms and originates from diverse sources.
9 | Saneh Lata Yadav, 2017 | Hadoop can analyse massive datasets across a cluster of servers and run applications on systems with thousands of computing units handling many terabytes of information.
10 | Rahul Beakta, 2015 | The Hadoop Distributed File System offers fast access to application data and is well suited to applications with extensive datasets. It can store data on a vast number of servers and follows a master/slave architecture.
11 | Jianqing Fan, 2014 | In HDFS, a user can write data only once but read it many times. Many clients can change metadata structures like file names and directories at the same time. It is crucial to always synchronize and reliably store this metadata.
12 | Chaowei Yang et al., 2017 | Many Big Data applications using MapReduce need quick response times, and enhancing the performance of MapReduce tasks is a key focus for both academia and industry.
13 | Muhammad Iqbal et al., 2018 | SMEs can gain advantages from abundant data by establishing partnerships and applying big data technologies in areas like supply chain management and business operations.
14 | Pervaiz Akhtar et al., 2019 | Companies that incorporate big data analytics into their operations tend to be more productive and financially successful than others. One study reports that retailers can boost their return on investment by 15-20% through the effective use of big data applications.
15 | Ilias O. Pappas et al., 2018 | Big data and business analytics ecosystems can contribute to creating sustainable societies through digital transformation. Big data analytics helps in gaining deeper insights into new innovations and value creation.
16 | Ralph Schroeder, 2016 | There are three types of big data business models: those who use data, those who supply data, and those who facilitate data. These three groups depend on each other for a successful data-focused economy, and all three need to grow together.
17 | Andrea Sestino et al., 2020 | The combination of IoT and Big Data is transforming the way management and marketing strategies work through digitalization. It not only alters human interactions and daily routines but also revolutionizes the way companies manage their methods and processes.
18 | Gabriele Santoro et al., 2018 | Big data helps by quickly adjusting prices and managing costs with the right business strategy at the right time.
19 | Maria Mohammad Yousef, 2021 | There is a need to transform raw data into meaningful insights using the analytical tools of big data.
20 | Harshit Kumar et al., 2017 | It facilitates individual health management through patient-centric services, enhances treatment methodologies and speeds up the detection of healthcare fraud.
21 | Hsinchun Chen, 2012 | The primary sources of extensive health data are genomics, which includes genotyping and sequencing data, and payer–provider sources, which encompass electronic insurance records, health data and patient responses.
22 | Wullianallur Raghupathi et al., 2014 | Two-thirds of the savings would come from reducing national healthcare expenditure by about 8%. Big data has the potential to decrease both waste and inefficiency.
23 | Nishita Mehta et al., 2018 | Big Data analytics, with its ability to predict and recognize patterns, allows a move from medicine based on experience to medicine based on evidence.
24 | Maria Ijaz Baig et al., 2020 | The arrival of big data allows teachers to monitor students' academic performance and their approach to learning. Immediate and positive feedback inspires and satisfies students, with a positive influence on their performance.
25 | Christian Fischer et al., 2020 | Learning behaviours that were difficult to document in traditional classrooms can now be partially captured through the use of Learning Management Systems.
26 | Hui Luan et al., 2020 | Collaboration between academia and industry is crucial for a balanced integration of human and machine learning approaches.
27 | Joel R. Reidenberg et al., 2018 | Protecting students' privacy is crucial in education because improper use or sharing of their information can harm their learning and social growth.
28 | Miftachul Huda et al., 2016 | Big Data operates in real time, allowing data to be explored to understand student behaviour, and it can provide tailored services to individual students.
29 | Amelia H. Arsenault, 2017 | Big data plays a crucial role in shaping global media networks in two main ways.
30 | Markus Lohnert, 2022 | Organizations armed with big data platforms can forecast the success of content beforehand rather than relying on intuition alone.
31 | G. G. Hallur et al., 2021 | Researchers are suggesting different ways to analyse and enhance the performance of algorithms. A model for big data should be crafted to optimize the effectiveness of different streaming services.
32 | Tawny Schlieski et al., 2012 | Adaptive algorithms turn big data into stories and shape a more interesting future in the field of entertainment.
33 | Lorenzo Ardito et al., 2019 | The concept of a smart destination emerges from the integration of tourism destinations with various stakeholders' communities.
34 | Mariani, M et al., 2022 | Web data derived from different destination websites can be used to anticipate hotel demand in a tourist destination or to identify suitable direct flights to a destination.
35 | Heqing Zhang et al., 2021 | Achieving the advantages of personalized tourism requires precise categorization, accurate analysis of tourist needs, and a consistent commitment to precision in design, ensuring tailored plans that meet customer requirements.
36 | E. Rahmadian et al., 2022 | The use of big data in sustainable tourism is a subject of investigation in both academic and non-academic contexts. Different methods are used to identify, rank and predict behaviours as well as to analyse tourist numbers.
37 | Dimitrios Belias et al., 2021 | Big Data tools can offer immediate insights into the online behaviour of tourists concerning a destination.
38 | Hengyun Li et al., 2020 | Big data provides frequently updated, real-time insights into tourists' preferences. It addresses the limitations of traditional data in accurately forecasting tourism demand, especially during unique events with changing data patterns.
39 | Hui Lv, Si Shi & Dogan Gursoy, 2021 | Big data used in tourism research includes both structured and unstructured data. The data in professional databases come pre-structured, eliminating the need for tasks like data cleaning.
40 | Zaher Ali Al-Sai et al., 2017 | The concept of big data presents fresh opportunities for creating value, making discoveries, predicting trends and enhancing business intelligence to support decision-making in e-government.
41 | Irina Pencheva, 2018 | Big Data's advantages in setting priorities and formulating policies include improved accuracy, efficiency and speed.
42 | Cu Kim Long, 2021 | Within Industry 4.0, governments employ cutting-edge technologies like blockchain, artificial intelligence (AI), Internet of Things (IoT), cloud computing and Big Data Analytics (BDA) to enhance intelligent governance.
43 | Jung Wan LEE, 2020 | Government leaders aspire to transform organizations into data-driven entities, with chief information officers ensuring accurate correlation and monitoring interdependencies. The goal is to ensure timely access to the right information for the right individuals.
44 | Akemi Takeoka Chatfield et al., 2015 | Big data represents a strategic initiative for numerous government organizations responding to shifts in the external landscape, encompassing economic, technical, political and socio-cultural elements.
45 | Shefali Virkar et al., 2018 | Open government aligns closely with collaborative governance, as the availability of open data enhances opportunities for the advancement of knowledge, decision formulation and cross-disciplinary collaboration.
46 | Kaile Zhou et al., 2016 | Enhancements are necessary in network bandwidth, data storage, processing capabilities and data interoperability within the IT infrastructure. These improvements aim to better facilitate big data-driven smart energy management.

Analysis

1. The Three Dimensions and Categories of Big Data: Volume, Velocity, and Variety- The concept of big data revolves around three key dimensions: volume, velocity and variety. Each plays a pivotal role in shaping the data landscape. Understanding these dimensions is essential because together they define the challenges and opportunities presented by big data and underpin its transformative potential across industries.

Figure 1: Dimensions of Big Data

Source: https://static.javatpoint.com/hadooppages/images/big-data-characteristics.png

i. Volume- Volume refers to the immense scale of the data generated, processed and stored, which often involves datasets so large that they surpass the capacity of traditional data management systems. Several factors drive this growth, such as transaction records accumulated over time and large quantities of unstructured data from social media. Large amounts of data are also collected from sensors and from machine-to-machine communication [(Mereena Thomas) (2015)]. Volume indicates the dataset's size resulting from the combination of numerous variables and an even greater number of observations for each variable [(Conboy et al.) (2020)].

ii. Velocity- Velocity pertains to the speed at which data is created, collected and processed. The widespread adoption of digital devices like smartphones and sensors has resulted in an unparalleled surge in data generation, prompting an increasing demand for real-time analytics and planning based on evidence. Even traditional retailers produce high-frequency data, with some handling over one million transactions every hour [(Amir Gandomi et al.) (2014)].

iii. Variety- Variety refers to the range of data types an organization encounters, sourced from diverse origins and carrying varying degrees of value. Data can originate both internally and externally to the enterprise. Variety covers diverse formats and types of information, as well as various methods and applications for analysing the data [(Nikhil Madaan et al.) (2020)]. The reviewed articles highlight that this vast amount of data is neither uniform nor adheres to a specific template or format. It exists in various forms and originates from diverse sources [(Sivarajah et al.) (2020)].

Categories of Big Data

Within the domain of Big Data, different data types are used to classify the various forms of data generated daily. Analytics essentially identifies three primary types of data.

iv. Structured Data- Structured data is information that is organized, easily accessible, and stored, accessed and processed in a predetermined format. Working with structured data in Big Data is straightforward because its fields are well defined by specific parameters. Banking data, for example, is often organized in tabular form with rows and columns [(Anurag Agrahari et al.) (2017)].

v. Unstructured Data- Unstructured data has an unknown or undefined format or structure. Apart from its substantial volume, unstructured data presents numerous processing challenges when valuable insights are to be extracted from it. The rapid expansion of digital applications and services has led to a swift increase in unstructured information. Some projections indicate that 80-90% of organizational data lacks a defined structure, and this volume continues to grow significantly each year.

vi. Semi-Structured Data- Semi-structured data is neither purely structured nor completely unstructured. It differs from the conventional tabular data model of relational databases in that it lacks a fixed schema. It refers to data that does not reside in a structured database but possesses certain organizational characteristics that make it more accessible for analysis.
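
The distinction between a fixed schema and semi-structured records can be illustrated with a minimal Python sketch. The records, field names and the helper `to_rows` below are hypothetical and chosen only for illustration; the sketch assumes JSON as one common carrier of semi-structured data. Each record is well formed, yet the set of fields varies from record to record, so the columns have to be discovered from the data before the records can be laid out as a table.

```python
import json

# Hypothetical records: each is valid JSON, but the fields differ per record,
# so there is no fixed schema as there would be in a relational table.
raw_records = [
    '{"id": 1, "name": "Asha", "email": "asha@example.com"}',
    '{"id": 2, "name": "Ravi", "phones": ["555-0100", "555-0101"]}',
    '{"id": 3, "name": "Meera", "address": {"city": "Muktsar"}}',
]


def to_rows(records):
    """Parse semi-structured JSON records and lay them out as table rows."""
    parsed = [json.loads(record) for record in records]
    # The column set is discovered from the data rather than declared up front.
    columns = sorted({key for rec in parsed for key in rec})
    # Fields missing from a record become None in the corresponding row.
    rows = [[rec.get(col) for col in columns] for rec in parsed]
    return columns, rows


columns, rows = to_rows(raw_records)
print(columns)
for row in rows:
    print(row)
```

Once projected onto a common set of columns, the records can be handled like structured data, at the cost of empty cells wherever a record lacks a field.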

2. Big Data Management

The management of big data involves the organization, governance and administration of extensive amounts of structured and unstructured data. The primary goal is to ensure a high level of data quality and accessibility, fulfilling the needs of business intelligence and big data analytics applications. Effective big data management helps a company find important information in large amounts of messy data from different sources such as social media and sensors. Organizations managing big data must focus on where and how the acquired data is stored. Traditional methods rely on a process that cleans, transforms and organizes the data before analysis. In contrast, big data environments require Magnetic, Agile, Deep (MAD) analysis skills. A magnetic environment attracts all data sources regardless of quality. The storage must be agile, allowing easy and quick adaptation to evolving data. It must also be deep, supporting complex statistical methods and allowing analysts to study large datasets effectively [(Nikhil Madaan et al.) (2020)].

Figure 2: Big Data Architecture

Source: https://media.geeksforgeeks.org/wp-content/uploads/20200621105657/mapreduce-workflow.png

Analytical techniques are supported by various software products and technologies that aid in big data analytics. Some of the most commonly used ones are discussed here.

i. HADOOP - Hadoop is a widely used Java-based programming framework that processes large amounts of data in a distributed computing environment. Hadoop can analyse massive datasets across a cluster of servers and run applications on systems with thousands of computing units handling many terabytes of information [(Saneh Lata Yadav) (2017)]. The Hadoop Distributed File System (HDFS) offers fast access to application data and is well suited to applications with extensive datasets. It can store data on a vast number of servers and follows a master/slave architecture. Files are divided into blocks of fixed size [(Rahul Beakta) (2015)]. In HDFS, a user can write data only once but read it many times. Many clients can change metadata structures such as file names and directories at the same time, so it is crucial to keep this metadata synchronized and reliably stored. The NameNode is a single machine that manages all metadata. HDFS has a built-in replication feature which ensures that, if any individual machine fails, data can be recovered without losing any information [(Jianqing Fan) (2014)].
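
The block and replication ideas described above can be sketched as a toy model in plain Python. This is not the HDFS API: the function names, the node names, the round-robin placement rule and the tiny block size are all assumptions made for illustration (real deployments typically use blocks of many megabytes, and real placement policies are more involved). The sketch only shows why storing each fixed-size block on several machines lets the file survive the loss of one of them.

```python
from typing import Dict, List, Set


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Divide a file's contents into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: List[str], replication: int) -> Dict[int, List[str]]:
    """Assign each block to `replication` distinct nodes (simple round-robin, an assumption)."""
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def readable_after_failure(placement: Dict[int, List[str]], failed: Set[str]) -> bool:
    """The file is recoverable if every block still has at least one live replica."""
    return all(
        any(node not in failed for node in replicas)
        for replicas in placement.values()
    )


data = b"x" * 10                          # a 10-byte "file"
blocks = split_into_blocks(data, 4)       # illustrative block size of 4 bytes
nodes = ["node-a", "node-b", "node-c", "node-d"]   # hypothetical machines
placement = place_replicas(len(blocks), nodes, replication=3)

print([len(block) for block in blocks])
print(placement)
print(readable_after_failure(placement, {"node-b"}))
print(readable_after_failure(placement, {"node-a", "node-b", "node-c"}))
```

In this model the mapping from blocks to nodes plays the role of the metadata that the text attributes to the NameNode, while the nodes themselves only hold block contents.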

ii. MapReduce- MapReduce is the component of Hadoop that processes the large volumes of data stored in Hadoop, and it is central to how Hadoop works and to its efficiency in managing and processing data. In 2004, Google introduced this programming model to simplify the creation of applications that process vast amounts of data in parallel on large clusters of computers while remaining reliable even if some of the hardware fails. The system operates on massive datasets by breaking the problem and the data into smaller parts and running them concurrently. The Map function is the initial step, usually employed for filtering, transforming or parsing the data. The results produced by the Map function then serve as the input for the Reduce function, which is typically employed to consolidate the data generated by the Map function [(Rahul Beakta) (2015)]. The well-designed structure of MapReduce has led to its adoption in various computing setups, such as multi-core clusters, cloud environments and many more. Cloud providers often use MapReduce to offer data analytics services. Many Big Data applications using MapReduce need quick response times, and enhancing the performance of MapReduce tasks is a key focus for both academia and industry [(Chaowei Yang et al.) (2017)].
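
The Map and Reduce steps described above can be illustrated with a word-count sketch in plain Python. It runs in a single process and does not use Hadoop; the function names and input sentences are hypothetical. The intermediate grouping step (often called shuffle) is not spelled out in the text and is included here as an assumption about how Map output reaches Reduce.

```python
from collections import defaultdict


def map_phase(document):
    """Map: parse one document and emit a (word, 1) pair for every word."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group the emitted values by key so each key reaches a single reducer."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: consolidate all values emitted for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different machine; here they run in sequence.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data needs big tools", "data tools scale"])
print(counts)
```

Because each call to `map_phase` depends only on its own document and each call to `reduce_phase` only on its own key, both phases can be spread over many machines, which is the property the model relies on for parallelism.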

3. Applications Of Big Data- Big data has achieved notable milestones across various domains such as-

i. Marketing & Business- Big Data emerged from the need to make sense of the vast amounts of information generated when people interact with systems and with each other. Analytics enables businesses to identify their most valuable customers and to create new experiences, services, and products. Big Data has been essential for numerous top companies in outperforming their rivals. Across industries, both new and established competitors rely on data-driven strategies to compete, seize opportunities, and innovate. Companies worldwide, whether large or small, are looking for ways to use data. Small and medium-sized enterprises (SMEs) can now take advantage of big data to make fast and precise decisions that enhance their business operations. SMEs can gain advantages from abundant data by establishing partnerships and applying big data technologies in areas like supply chain management and business operations [(Muhammad Iqbal et al.) (2018)].

Figure 3: Big Data in Business Applications

Source: https://www.iteratorshq.com/wp-content/uploads/2020/08/360_customer_view.jpg

Companies that incorporate big data analytics into their operations tend to be more productive and financially successful than others. One study reports that retailers can boost their return on investment by 15-20% through the effective use of big data applications [(Pervaiz Akhtar et al.) (2019)]. Building on ongoing research in digital technologies and the capabilities of information systems, a Digital Transformation and Sustainability (DTS) model was developed. This model illustrates how big data and business analytics ecosystems can contribute to creating sustainable societies through digital transformation. Big data analytics helps in gaining deeper insights into new innovations and value creation [(Ilias O. Pappas et al.) (2018)].

Businesses are gradually shifting towards using more data in their operations. There is scope for more companies to discover the advantages of using data, especially in new and creative ways. There are three types of big data business models: those who use data, those who supply data, and those who facilitate data. These three groups depend on each other for a successful data-focused economy, and all three need to grow together [(Ralph Schroeder) (2016)].

The combination of IoT and Big Data is transforming the way management and marketing strategies work through digitalization. This marks a new era in business competitiveness. It not only alters human interactions and daily routines but also revolutionizes the way companies manage their methods and processes [(Andrea Sestino et al.) (2020)]. Big data affects how data is used, how information is gathered, which skills are required, and how data is shared. It is useful for retail businesses that face growing competition and new ways of doing business, both of which demand quick and efficient data strategies. Big data helps by quickly adjusting prices and managing costs with the right business strategy at the right time [(Gabriele Santoro et al.) (2018)].

Big data offers businesses several advantages: (i) it enhances the ability to make decisions, (ii) it improves interaction with customers, and (iii) it widens the possibilities for promoting social good. It also brings challenges: (i) increased operating costs, (ii) concerns regarding personal privacy, (iii) problems with data quality, and (iv) requirements for talent and staffing. Overall, big data helps businesses make smarter choices and work more efficiently. It is a key tool for growth and for staying competitive in a fast-changing business world.

ii. Healthcare- Healthcare data covers medical conditions, quality of life, and health-related outcomes. It comes from various sources such as wearable devices, patient records, and medical imaging. This data helps assess the quality of care, guide clinical decisions, and identify risk factors. It benefits patients, healthcare professionals, facilities, and systems. The influence of big data in the healthcare sector is substantial, and the market has expanded accordingly. Healthcare professionals use big data for purposes such as gaining insights in medical research and offering personalized medicine to patients. The healthcare industry, one of the largest and fastest-growing worldwide, handles data at high speed, but different electronic health record systems collect data in a variety of formats. Electronic health records (EHRs) offer valuable data for studying disease processes and improving individualized medical care. There is therefore a need to transform raw data into meaningful insights using the analytical tools of big data [(Maria Mohammad Yousef) (2021)].

Figure 4: Big Data in Healthcare

Source: https://www.netscribes.com/wp-content/uploads/2022/11/Big-Data-sources-in-Healthcare.png

The inherent advantages of big data analytics in healthcare include the timely identification of diseases and ailments in their initial phases for efficient control and treatment. It also facilitates individual health management through patient-centric services, enhances treatment methodologies and speeds up the detection of healthcare fraud [(Harshit Kumar et al.) (2017)]. The health community faces an overwhelming surge of health and healthcare-related challenges. The primary sources of extensive health data are genomics, which includes genotyping and sequencing data, and payer–provider sources, which encompass electronic insurance records, health data and patient responses. Extracting insight from large-scale health data presents notable research and practical hurdles [(Hsinchun Chen) (2012)].

McKinsey suggests that big data analytics could save over $300 billion annually in U.S. healthcare. Two-thirds of these savings would come from reducing national healthcare expenditure by about 8%. The underlying view is that big data has the potential to decrease both waste and inefficiency [(Wullianallur Raghupathi et al.) (2014)]. Big Data analytics, with its ability to predict and recognize patterns, allows a move from medicine based on experience to medicine based on evidence [(Nishita Mehta et al.) (2018)]. Big data has thus improved healthcare by enabling more individualized care and smarter treatment plans for patients.

iii. Education - The education system collects a large amount of data about students, and handling this information well is important. Big Data in education helps change the way things are done, fill gaps, and improve learning for everyone; in effect, information is used to improve the whole education system. Big data offers academic institutions the chance to bring together essential systems, applications and platforms, which enables them to improve effectiveness and cut expenses. The arrival of big data allows teachers to monitor students' academic performance and their approach to learning. Immediate and positive feedback inspires and satisfies students, with a positive influence on their performance [(Maria Ijaz Baig et al.) (2020)].

Figure 5: Big Data in Education

Source: https://media.springernature.com/lw685/springer-static/image/chp%3A10.1007%2F978-981-16-9447-9_54/MediaObjects/517563_1_En_54_Fig8_HTML.png

The rise of big data in education can be linked to at least two significant trends in the digital age. Firstly, the recording and storing of institutional data has progressively shifted from conventional environments to digital platforms, which helps generate abundant standardized student information. Secondly, learning behaviours that were difficult to document in traditional classrooms can now be partially captured through the use of Learning Management Systems (LMS) [(Christian Fischer et al.) (2020)].

Vocational and hands-on education offers many chances for successful collaboration between academia and industry. As work dynamics change and technology becomes more prevalent, there is a rising need for substantial changes in vocational education that affect both teachers and students. Collaboration between academia and industry is crucial for a balanced integration of human and machine learning approaches [(Hui Luan et al.) (2020)].

Protecting students' privacy is crucial in education because if their information is used or shared improperly, it can harm their learning and social growth. Worry about covert monitoring of every online action can make students stressed about their performance, which serves neither education nor the goals of using big data in education. These challenges can be overcome only if big data tools in education take moral and ethical considerations into account [(Joel R. Reidenberg et al.) (2018)].

Big Data can enhance the learning process by granting access to dependable data sources. It aids in fostering student involvement, participation and widespread knowledge dissemination to both students and the broader community. Big Data operates in real time, allowing data to be explored to understand student behavior, and it has the capacity to provide tailored services to individual students [(Miftachul Huda et al.) (2016)]. There are many other benefits of using big data in education, including planning for the future and creating new opportunities for learning.

iv. Media and Entertainment- Media and entertainment are a big part of our lives. People love trying out new shows and movies, and a wide range of options is now available to them. Content is easy to watch on different devices, which makes it convenient for everyone. Big data plays a crucial role in shaping global media networks in two main ways. Firstly, turning media into data and using big data services helps create digital networks in which competition and collaboration coexist through working together and exchanging goods and services. Secondly, big data is becoming a worldwide format, much as TV formats spread globally [(Amelia H. Arsenault) (2017)].

Figure 6: Big Data in Media & Entertainment

Source: https://i1.wp.com/techvidvan.com/tutorials/wp-content/uploads/sites/2/2021/05/Big-Data-in-Media-Entertainment.jpg?fit=802%2C420&ssl=1

Big data helps the media industry in a variety of ways. It reveals customers' interests and offers insight into their browsing history and social media activities. It can also capture the time spent, reactions and responses to changes in the applications that users engage with. By utilizing big data, businesses within the media and entertainment industry can formulate or adjust strategies to attract customers and retain their loyalty. Organizations armed with big data platforms can forecast the success of content beforehand rather than relying on intuition alone. Big data applications enhance ad targeting in a progressively refined consumer landscape [(Markus Lohnert) (2022)].

Technologies and related innovations in big data are transforming every industry. Social media serves as a crucial communication channel in the contemporary world. Individuals and the general public convey their emotions, including joy, anger, affection and dislike, through social media. Social media works like a continuous stream of data that companies can analyse to understand what people think about their products. Businesses are using sentiment analysis to grasp viewers' opinions about movies as people share their choices on social media. Researchers are suggesting different ways to analyse and enhance the performance of algorithms. A model for big data should be crafted to optimize the effectiveness of different streaming services [(G. G. Hallur et al.) (2021)].
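As a rough illustration of the sentiment analysis described above, the following Python sketch labels short posts using a tiny word list. The word lists, sample posts and function names are hypothetical and are not taken from the cited work; production systems typically rely on far larger lexicons or trained models.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon, for illustration only.
POSITIVE = {"love", "great", "amazing", "enjoyed", "brilliant"}
NEGATIVE = {"boring", "awful", "hate", "disappointing", "bad"}


def score_post(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(1 for w in words if w in POSITIVE)
    negatives = sum(1 for w in words if w in NEGATIVE)
    return positives - negatives


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(posts):
    """Count how many posts fall under each sentiment label."""
    return Counter(label(score_post(p)) for p in posts)


# Made-up posts standing in for a social media stream.
posts = [
    "Great cast and an amazing soundtrack in the new series!",
    "The sequel was boring and the ending was awful.",
    "Watched it last night with friends.",
]
summary = summarize(posts)
print(dict(summary))
```

A word-count approach like this ignores negation and sarcasm, so it should be read only as a sketch of the idea of turning a stream of posts into an aggregate opinion measure.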

In the world of petabytes, there are chances to consider entirely new roles and connections with data. To understand and use these connections practically, adaptive algorithms of big data turn them into stories and shape a more interesting future in the field of entertainment [(Tawny Schlieski et al.) (2012)]. Big data has simplified the customization of services for businesses, allowing more precise targeting in marketing efforts, optimization of content, prediction of future trends and innovative ways to interact with their audience.

v. Travel and Tourism- Big data within the travel industry encompasses the extensive volume of information gathered from diverse sources such as reservation platforms, social media channels and GPS monitoring. Its significance lies in its role in comprehending customer preferences, forecasting trends, streamlining operations and tailoring services. Data-driven techniques enhance customer satisfaction and optimize operational effectiveness within the travel industry. Characterized by cutting-edge services, big data offers a high level of innovative, open, integrated and collaborative processes aimed at improving the well-being of both locals and visitors.

The concept of a smart destination emerges from the integration of tourism destinations with various stakeholders' communities. This integration occurs through dynamic platforms, knowledge-intensive communication flows and advanced decision support systems [(Lorenzo Ardito et al.) (2019)]. Tourism companies, destination administrators and consumers collectively produce and utilize extensive data, employing data analytics to enhance decision-making across various levels. The web data derived from different destination websites can be employed to anticipate hotel demand in a tourist destination or identify suitable flights for their direct destinations [(Mariani, M et al.) (2022)].

Figure 7: Big Data in Tourism

Source: https://media.licdn.com/dms/image/D4D12AQE8mMmLiz66rg/article-cover_imageshrink_600_2000/0/1702560624497?e=2147483647&v=beta&t=mv0pGyyqOPUOvz6QLEcJBPP5y_Px9wxz2b7XM7gJCBA

Achieving the advantages of personalized tourism requires precise categorization, accurate analysis of tourist needs and a consistent commitment to precision in design, so that tailored plans meet customer requirements [(Heqing Zhang et al.) (2021)]. The utilization of big data in sustainable tourism is a subject of investigation in both academic and non-academic contexts. Different methods are used to identify, rank and predict behaviors as well as analyze tourist numbers [(E. Rahmadian et al.) (2022)].

Big Data tools can offer immediate insights into the online behavior of tourists concerning a destination. This implies that these tools can furnish valuable information to assist decision-makers at a destination in gaining a clearer understanding of the expectations and requirements of potential tourists [(Dimitrios Belias et al.) (2021)]. Big data from the internet presents a valuable chance to enhance the accuracy of forecasting the demand for tourism and provide timely insights. These data enable the measurement and monitoring of tourist behaviors and satisfaction promptly, overcoming delays associated with conventional forecasting methods. Big data provides real time insights into tourists' preferences and frequent updates. It addresses the limitations of traditional data in accurately forecasting tourism demand especially during unique events with changing data patterns [(Hengyun Li et al.) (2020)].
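To make the idea of forecasting demand from internet data more concrete, the sketch below fits a straight line relating a search-interest index to tourist arrivals and uses it to project arrivals for a new index value. This is only a minimal illustration using ordinary least squares on one predictor; it is not the multisource method of the cited study. The figures are synthetic, constructed to lie exactly on a line, and the names `search_index`, `arrivals` and `fit_line` are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Project the dependent variable for a new predictor value."""
    return slope * x + intercept


# Synthetic monthly data (hypothetical): online search-interest index
# for a destination and tourist arrivals in thousands.
search_index = [40, 55, 60, 72, 80, 95]
arrivals = [110, 140, 150, 174, 190, 220]

slope, intercept = fit_line(search_index, arrivals)
forecast = predict(100, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast={forecast:.1f}")
```

Because search data becomes available almost immediately, a model of this kind can be refreshed far more often than one based on official statistics alone, which is the timeliness advantage the paragraph above describes.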

The authors [(Hui Lv, Si Shi & Dogan Gursoy) (2021)] indicate that different types of big data, both structured and unstructured, have been used in tourism research. Within research focused on hospitality and tourism, the data in professional databases come pre-structured, eliminating the need for tasks like data cleaning. This streamlines the process for scholars, allowing them to extract and analyse the data directly. The recent rapid progress in Internet technology has led to the creation of extensive unstructured big data sets. Consumers have extensively shared their travel experiences on platforms such as Facebook, Twitter and TripAdvisor, which has created diverse unstructured data in the tourism domain. These data include online reviews and geolocated photos, which hold significant value for investigating hospitality and tourism at the individual level. Big data empowers the travel industry to make informed decisions, improving demand anticipation, pricing strategies, targeted marketing and customer experiences.

vi. Government- Advanced big data management methods for analytics enable governments to grasp citizen needs, counter fraud, mitigate system errors and enhance operations. This leads to cost reduction and improved services across government entities. Big data empowers government entities to provide services with increased efficiency and security, enabling quick and accurate responses to the needs of customers and citizens. The concept of big data presents fresh opportunities for creating value, making discoveries, predicting trends and enhancing business intelligence to support decision-making in e-government. Big data facilitates the establishment of a smart government that ensures the efficient and reliable delivery of services to citizens [(Zaher Ali Al-Sai et al.) (2017)].

Figure 8: Big Data in Government

Source: https://journals.sagepub.com/cms/10.1177/0952076718780537/asset/images/large/10.1177_0952076718780537-fig1.jpeg

The author [(Irina Pencheva) (2018)] describes policy formulation as turning identified issues and proposals into government programs. Big Data's advantages in setting priorities and formulating policies include improved accuracy, efficiency and speed. It aids public managers in aggregating and analyzing citizens' policy preferences, enhancing understanding of effective incentives and circumstances. Within Industry 4.0, governments employ cutting-edge technologies like blockchain, artificial intelligence (AI), Internet of Things (IoT), cloud computing and Big Data Analytics (BDA) to enhance intelligent governance. The aim is to enhance transparency, ensure accountability and improve overall efficiency and effectiveness [(Cu Kim Long) (2021)].

Government leaders aspire to transform organizations into data-driven entities, with chief information officers ensuring accurate correlation and monitoring interdependencies. The goal is to ensure timely access to the right information for the right individuals. A strategic approach is crucial in placing data where it can be accessed most successfully when needed [(Jung Wan LEE) (2020)]. Big data represents a strategic initiative for numerous government organizations responding to shifts in the external landscape, encompassing economic, technical, political and socio-cultural elements. Evaluative feedback and understanding are crucial for the significant issues that affect the direction of strategic change [(Akemi Takeoka Chatfield et al.) (2015)].

Big data has given rise to the concept of open governance, which stems from the recognition that information is a public asset. It has the potential to drive a shift towards electronic governance. Open government aligns closely with collaborative governance, as the availability of open data enhances opportunities for advancement of knowledge, decision formulation and cross-disciplinary collaboration [(Shefali Virkar et al.) (2018)]. Big data transforms government operations by improving decision-making, transparency and efficiency. Its strategic use enables governments to proactively address challenges and deliver enhanced services, shaping a more responsive and effective public sector.

4. Challenges and Pitfalls of Big Data- Big data presents significant opportunities in many fields. However, traditional models struggle with the large volume of data. To tackle this issue, it is essential to explore challenges posed by big data and create computing models that facilitate effective data analysis.

i. Handling extensive volumes of data- Many companies are expanding their daily data collection. This data can be structured or unstructured, and this heterogeneity poses a challenge for data analysis. Many business executives express concerns about insufficient storage capacity. The global shift towards Cloud technology is causing a rapid surge in data generation [(Rahul Beakta) (2015)]. Cloud storage solutions can adapt dynamically to increased storage requirements, while big data software is designed to efficiently store and rapidly retrieve vast amounts of data.

ii. IT framework- The rapid expansion of big data and the demand for swift collection, processing, and utilization of energy data pose significant challenges to conventional IT infrastructure. Enhancements are necessary in network bandwidth, data storage, processing capabilities and data interoperability within the IT infrastructure. This improvement aims to better facilitate big data-driven smart energy management [(Kaile Zhou et al.) (2016)].

iii. Security and Privacy- Organizations are greatly concerned about security as non-encrypted information is susceptible to theft or damage from cyber-criminals. Consequently, professionals in data security must find a balance between providing access to data and upholding rigorous security protocols. A combination of industry self-governance, technical measures and reinforced legislation should work together to ensure security and privacy of sensitive data.

iv. Ensuring quality of data- The quality of the underlying data is essential to the success of analytics procedures that rely on vast datasets and to producing reliable insights. Incomplete data can lead to unexpected results. With the proliferation of huge volumes of data, it becomes challenging to derive accurate insights. Dedicated data quality software can be employed to validate and cleanse data prior to processing.
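The validate-and-cleanse step mentioned above can be sketched in a few lines of Python. The example below is a simplified illustration, not a description of any particular data quality product: the field names (`id`, `age`, `date`), the age range and the sample records are all hypothetical. It rejects records with missing fields, implausible values or duplicate keys and keeps the rest for processing.

```python
REQUIRED_FIELDS = ("id", "age", "date")  # hypothetical schema


def validate_record(record):
    """Return a list of quality problems found in one record."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # Treat None and blank strings as missing values.
        if value is None or (isinstance(value, str) and not value.strip()):
            problems.append(f"missing {field}")
    age = record.get("age")
    # Assumed plausibility range for the illustration.
    if isinstance(age, (int, float)) and not 0 <= age <= 120:
        problems.append("age out of range")
    return problems


def cleanse(records):
    """Split records into clean ones and rejected ones with reasons."""
    clean, rejected, seen = [], [], set()
    for rec in records:
        problems = validate_record(rec)
        key = (rec.get("id"), rec.get("date"))
        if not problems and key in seen:
            problems.append("duplicate record")
        if problems:
            rejected.append((rec, problems))
        else:
            seen.add(key)
            clean.append(rec)
    return clean, rejected


# Made-up sample records.
records = [
    {"id": "A1", "age": 34, "date": "2024-01-05"},
    {"id": "A2", "age": None, "date": "2024-01-06"},
    {"id": "A1", "age": 34, "date": "2024-01-05"},
    {"id": "A3", "age": 430, "date": "2024-01-07"},
    {"id": "A4", "age": 51, "date": " "},
]
clean, rejected = cleanse(records)
for rec, problems in rejected:
    print(rec["id"], problems)
```

Keeping the rejected records together with the reasons for rejection, rather than silently dropping them, makes it possible to audit how much data was lost and why.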

v. Lack of Skilled Professionals- A common challenge for many companies in dealing with big data is that their existing staff lacks experience in this domain and acquiring the necessary skill set is not a quick process. Involving untrained personnel can lead to workflow disruptions and processing errors.
Conclusion

The impact of big data in various application areas is transformative and far-reaching. Big data has revolutionized the way information is processed, analysed and utilized. The importance of big data resonates across diverse fields, leaving an enduring impact on many areas including healthcare, education, business, tourism, government and entertainment. The ability to derive meaningful insights from massive datasets has not only improved decision-making processes but also opened avenues for innovation and efficiency. As we continue to advance in the era of big data, the potential for positive impact across diverse fields remains immense, promising a future where data-driven solutions drive progress and shape our understanding of the world. While the applications are vast and transformative, challenges such as data security, privacy and integration still persist. Nevertheless, the enduring importance of big data in these varied sectors underscores its role as a catalyst for innovation, efficiency and progress, shaping a dynamic future across the spectrum of human endeavours.
References

1. Thomas, Mereena., 2015, “A Review paper on BIG Data”, International Research Journal of Engineering and Technology (IRJET), Volume: 02, Issue: 09, pp. 1030-1034.

2. Agrahari, Anurag., Rao, Dharmaji., 2017, “A Review paper on Big Data: Technologies, Tools and Trends”, International Research Journal of Engineering and Technology (IRJET), pp. 640-649.

3. Oussous, Ahmed., et al., 2017, “Big Data technologies: A survey”, Journal of King Saud University – Computer and Information Sciences, pp. 431-438.

4. Chen, Min., Mao, Shiwen., Liu, Yunhao., 2014, “Big Data: A Survey”, Mobile Netw Appl, pp. 171-209, DOI 10.1007/s11036-013-0489-0.

5. Conboy, Kieran., Mikalef, Patrick., Dennehy, Denis., Krogstie, John., 2020, “Using Business Analytics to Enhance Dynamic Capabilities in Operations Research: A Case Analysis and Research Agenda”, European Journal of Operational Research 281(3), pp. 656-672.

6. Gandomi, Amir., Haider, Murtaza., 2015, “Beyond the hype: Big data concepts, methods, and analytics”, International Journal of Information Management 35, pp. 137-144.

7. Madaan, Nikhil., Kumar, Umang., Jha, Kr, Suman., 2020, “Big Data Analytics: A Literature Review Paper”, International Journal of Engineering Research & Technology (IJERT), Volume 8, Issue 10, pp. 11-19.

8. Sivarajah U, Irani Z, Gupta S et al., 2020, “Role of big data and social media analytics for business to business sustainability: A participatory web context”, Industrial Marketing Management. 86, pp. 163-179.

9. Yadav, Lata, Saneh., Sohal, Asha., 2017, “Review Paper on Big Data Analytics in Cloud Computing”, International Journal of Computer Trends and Technology (IJCTT) – Volume 49 Number 3, http://www.ijcttjournal.org/, pp. 156-160.

10. Beakta, Rahul., 2015, “Big Data And Hadoop: A Review Paper”, International journal of computer science & information, Volume 2, Spl. Issue 2, pp. 13-15.

11. Fan, Jianqing., Han, Fang., Liu, Han., 2014, “Challenges of Big Data analysis”, National Science Review, pp. 293–314.

12. Yang, Chaowei., Huang, Qunying., Li, Zhenlong., Liu, Kai., Hu, Fei., 2017, “Big Data and cloud computing: innovation opportunities and challenges”, International Journal of Digital Earth, 10:1, pp. 13-53.

13. Iqbal, Muhammad., Dr. Manzoor, Amir., et al., 2018, “A Study of Big Data for Business Growth in SMEs: Opportunities & Challenges”, International Conference on Computing, Mathematics and Engineering Technologies, pp.1-7.

14. Akhtar, Pervaiz., Frynas, Jedrzej, George., Mellahi, Kamel., Ullah, Subhan., 2019, “Big data-savvy teams’ skills, big data-driven actions and business performance”, British Journal of Management, pp. 252–271.

15. Pappas, O, Ilias., Mikalef, Patrick., et al., 2018, “Big data and business analytics ecosystems: paving the way towards digital transformation and sustainable societies”, Information Systems and e-Business Management, pp. 479–491.

16. Schroeder, Ralph., 2016, “Big data business models: Challenges and opportunities”, Cogent Social Sciences, 2:1, 1166924, DOI: 10.1080/23311886.2016.1166924, pp. 1-15.

17. Sestino, Andrea., Prete, M, I., Piper, L., Guido, G., 2020, “Internet of Things and Big Data as enablers for business digitalization strategies”, Technovation, pp. 1-9.

18. Santoro, Gabriele., Fiano, Fabio., Bertoldi, Bernardo., Ciampi, Francesco., 2018, “Big data for business management in the retail industry”, AperTO - Archivio Istituzionale Open Access dell'Università di Torino, pp. 1-13.

19. Yousef, Mohammad, Maria., 2021, “BIG DATA ANALYTICS IN HEALTH CARE: A REVIEW PAPER”, International Journal of Computer Science & Information Technology (IJCSIT) Vol 13, No 2, pp. 17-28.

20. Kumar, Harshit., Singh, Nishant., 2017, “Review paper on Big Data in healthcare informatics”, International Research Journal of Engineering and Technology (IRJET), Volume: 04 Issue: 02, pp. 197-201.

21. Chen, Hsinchun., Chiang, H, L, Roger., Storey, C, Veda., 2012, “BUSINESS INTELLIGENCE AND ANALYTICS: FROM BIG DATA TO BIG IMPACT”, MIS Quarterly Vol. 36 No. 4, pp. 1165-1188.

22. Raghupathi, Wullianallur., Raghupathi, Viju., 2014, “Big data analytics in healthcare: promise and potential”, Health Information Science and Systems, pp. 1-10.

23. Mehta, Nishita., Pandit, Anil., 2018, “Concurrence of big data analytics and healthcare: A systematic review”, International Journal of Medical Informatics, pp. 57-65.

24. Baig, Ijaz, Maria., Shuib, Liyana., Yadegaridehkordi, Elaheh., 2020, “Big data in education: a state of the art, limitations, and future research directions”, International Journal of Educational Technology in Higher Education, pp. 1-23.

25. Fischer, Christian., et al., 2020, “Mining Big Data in Education: Affordances and Challenges”, Review of Research in Education, 44, pp. 130-160.

26. Luan, Hui., et al., 2020, “Challenges and Future Directions of Big Data and Artificial Intelligence in Education”, Frontiers in Psychology, volume 11, pp. 1-11.

27. Reidenberg, R, Joel., Schaub, Florian., 2018, “Achieving big data privacy in education”, Theory and Research in Education, pp. 1-17.

28. HUDA, Miftachul., et al., 2016, “Innovative Teaching In Higher Education: The Big Data Approach”, The Turkish Online Journal of Educational Technology, pp. 1210-1216.

29. Arsenault, H, Amelia., 2017, “The datafication of media: Big data and the media industries”, International Journal of Media & Cultural Politics Volume 13 Numbers 1 & 2, pp. 7-24.

30. Lohnert, Markus., 2022, “THE IMPACT OF DIGITAL TRANSFORMATION ON BUSINESS MODELS A Literature Review with a Focus on the Media and Entertainment Industry”, JOHANNES KEPLER UNIVERSITY LINZ, pp. 1-95.

31. Hallur, G, G., Prabhu, S., Aslekar, A., 2021, “Entertainment in Era of AI, Big Data & IoT”, Digital Entertainment, https://doi.org/10.1007/978-981-15-9724-4_5, pp. 87-109.

32. Schlieski, Tawny., Johnson, David, Brian., 2012, “Entertainment in the Age of Big Data”, Proceedings of the IEEE, Vol. 100, pp. 1404-1408.

33. Ardito, Lorenzo., Cerchione, Roberto., Vecchio, Del, Pasquale., Raguseo, Elisabetta., 2019, “Big data in smart tourism: challenges, issues and opportunities”, Current Issues in Tourism, 22:15, DOI: 10.1080/13683500.2019.1612860, pp. 1805–1809.

34. Mariani, M, M., Baggio, R., 2022, “Big data and analytics in hospitality and tourism: a systematic literature review”, International Journal of Contemporary Hospitality Management, 34 (1), pp. 231-278.

35. Zhang, Heqing., Guo, Tingting., Su, Xiaobo., 2021, “Application of Big Data Technology in the Impact of Tourism E-Commerce on Tourism Planning”, Hindawi Complexity, https://doi.org/10.1155/2021/9925260, pp. 1-10.

36. Rahmadian, E., Feitosa, D., Zwitter, A., 2022, “A systematic literature review on the use of big data for sustainable tourism”, Current Issues in Tourism, 25:11, pp. 1711-1730, DOI: 10.1080/13683500.2021.1974358.

37. Belias, Dimitrios., et al., 2021, “The Use of Big Data in Tourism: Current Trends and Directions for Future Research”, Academic Journal of Interdisciplinary Studies, Vol 10 No 5, pp. 357-364.

38. Li, H., Hu, M., Li, G., 2020, “Forecasting tourism demand with multisource big data”, Annals of Tourism Research, 83, pp. 1-23.

39. Hui Lv., Si Shi., Dogan Gursoy., 2021, “A look back and a leap forward: a review and synthesis of big data and artificial intelligence literature in hospitality and tourism”, Journal of Hospitality Marketing & Management, DOI: 10.1080/19368623.2021.1937434, pp. 1-31.

40. Al-Sai, Z, A., Abualigah, L, M., 2017, “Big Data and E-government: A review”, 8th International Conference on Information Technology (ICIT), pp. 580-587.

41. Pencheva, Irina., et al., 2018, “Big Data and AI – A transformational shift for government: So, what next for research?”, Public Policy and Administration, Vol. 35(1), pp. 24–44.

42. Long, C, K., et al., 2021, “A big data framework for E-Government in Industry 4.0”, Open Computer Science, https://doi.org/10.1515/comp-2020-0191, pp. 461-479.

43. LEE, Wan, Jung., 2020, “Big Data Strategies for Government, Society and Policy-Making”, Journal of Asian Finance, Economics and Business, Vol 7, No 7, pp. 475 – 487.

44. Chatfield, A, T., et al., 2015, “Capability Challenges in Transforming Government through Open and Big Data: Tales of Two Cities”, Thirty Sixth International Conference on Information Systems, Fort Worth, pp. 1-21.

45. Virkar, Shefali., Pereira, G, V., 2018, “Exploring Open Data State-of-the-Art: A Review of the Social, Economic and Political Impacts”, 17th International Conference on Electronic Government (EGOV), Krems, Austria, pp. 196-207, 10.1007/978-3-319-98690-6_17.

46. Zhou, K., et al., 2016, “Big data driven smart energy management: From big data to big insights”, Renewable and Sustainable Energy Reviews 56, pp. 215-225.