Study on Big Data Analytics Framework in Smart City Context

Global urbanization confronts governments with a distinct problem: the very rapid growth of population density in cities. To face this challenge, governments have launched smart city projects that target sustainable economic growth and a better quality of life. Information and Communication Technology (I.C.T.) governance is the key to realizing a smart city. However, each of these I.C.T. tools produces large amounts of data, known as Big Data. Processing data with a Big Data approach is becoming a trend in information systems, both to provide better public services and to inform the policy-making process. To obtain important information from big data, a Big Data Analytics process, also known as the Big Data Value Chain, is needed. Extracting knowledge from the related literature makes it possible to identify the characteristics of big data analytics frameworks for smart cities. This paper reviews several big data analytics frameworks applied to smart cities and identifies the advantages and disadvantages of each, so that the findings can guide future research.


Introduction
Urbanization and increasingly uncontrolled global population growth are causing problems in urban areas. In response, governments have launched smart city projects that target sustainable economic growth with a better quality of life. In Indonesia, the Smart City concept was initiated by an expert from ITB, Suhono S. Supangkat. In this view, a smart city is one whose government provides solutions to its citizens as quickly and accurately as possible. The smart city concept initiated by European countries consists of the following supporting components: smart economy, smart people, smart governance, smart government, smart mobility, smart environment, and smart living [1]. A smart city is not only the result of making each of these domains "smart", but also of integrating the domains into a unified system. This integration requires the sharing of information between domains [2]-[3].
The use of digital technology in many areas, together with interactions between humans, between machines, and between humans and machines, produces a very large volume of data called Big Data [4,5]. The term refers to data that is complex and growing rapidly. Obtaining the right information from such data therefore requires a precise and accurate analysis process, called Big Data Analytics (BDA) or the Big Data Value Chain (B.D.V.) [6]-[7].
Reviewing the related literature on BDA shows how to identify the characteristics of BDA and B.D.V. frameworks for smart city implementation [7]. This paper reviews the big data analytics frameworks applied to smart cities. The purpose of this review is to find the advantages and disadvantages of each of these frameworks.
* Corresponding author: dinda26kc@gmail.com

Big Data Analytics Concept
Big Data is a natural product of digital artefacts and their applications, such as cell phones, sensors and social media, which raise human-to-human, human-to-machine and machine-to-machine interactions to a more advanced level. The issues related to Big Data are volume, velocity, variety, and veracity [8].
Big Data Analytics (BDA) is the entire set of processes and tools needed to find the right information to support decision-making. The process consists of data extraction, transformation into knowledge, and analysis, and it involves BDA-specific tools, techniques and methods. Because this process differs from traditional information extraction, it creates opportunities for researchers and technology providers to develop reliable models, platforms, frameworks and algorithms for BDA [6].
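The three stages named above (extraction, transformation, analysis) can be illustrated with a minimal sketch. It is not taken from [6]; the record layout, the field names and the choice of a per-district mean as the "analysis" are assumptions made purely for illustration.

```python
from statistics import mean

# Hypothetical raw sensor records; the field names are illustrative only.
RAW_RECORDS = [
    {"sensor": "s1", "district": "north", "co2_ppm": "412"},
    {"sensor": "s2", "district": "north", "co2_ppm": "430"},
    {"sensor": "s3", "district": "south", "co2_ppm": "n/a"},  # unusable reading
    {"sensor": "s4", "district": "south", "co2_ppm": "455"},
]

def extract(records):
    """Extraction: keep only records whose reading can be parsed as a number."""
    return [r for r in records if r["co2_ppm"].replace(".", "", 1).isdigit()]

def transform(records):
    """Transformation: convert readings to floats and group them by district."""
    grouped = {}
    for r in records:
        grouped.setdefault(r["district"], []).append(float(r["co2_ppm"]))
    return grouped

def analyze(grouped):
    """Analysis: summarize each district by its mean reading."""
    return {district: mean(values) for district, values in grouped.items()}

summary = analyze(transform(extract(RAW_RECORDS)))
print(summary)
```

In a real BDA platform each stage would run on distributed infrastructure; the sketch only shows how the stages hand data to one another.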
The scalability of a Big Data management platform follows one of two approaches: vertical or horizontal. Vertical scaling accommodates increased data volume by increasing the capabilities of a single computer (memory, CPU, etc.). Horizontal scaling divides the workload evenly, in parallel, across several computers at once.
Each approach has advantages and disadvantages. Vertical scaling is limited by the maximum to which a single computer can be upgraded, while horizontal scaling requires complicated handling when many computers with different operating systems are used. High-Performance Computing (HPC) and Apache Hadoop are examples of vertical and horizontal scaling platforms, respectively [8]. Cost, the limits on computer upgrades and the dynamics of multidomain datasets make horizontal scaling the more widely used reference when designing big data analytics platforms such as Apache Hadoop.
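The idea behind horizontal scaling, splitting a workload into chunks that are processed independently and then merged, can be sketched as follows. This is an assumption-laden illustration rather than Hadoop code: threads on one machine stand in for separate computers, and the event list and function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partition(items, n_workers):
    """Split the workload into n_workers roughly equal chunks."""
    return [items[i::n_workers] for i in range(n_workers)]

def count_events(chunk):
    """Work done independently by one worker (the 'map' step)."""
    return Counter(chunk)

def horizontal_count(items, n_workers=3):
    """Count events by spreading chunks over workers and merging the results."""
    chunks = partition(items, n_workers)
    # Threads stand in for separate machines in this sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_events, chunks))
    total = Counter()
    for partial in partial_counts:  # the 'reduce' step
        total.update(partial)
    return total

# Hypothetical stream of event types from different smart city domains.
events = ["traffic", "energy", "traffic", "crime", "energy", "traffic"]
totals = horizontal_count(events)
print(totals)
```

Adding capacity here means raising `n_workers`, whereas vertical scaling would mean making a single worker faster.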

Smart City Concept
A smart city is a system that uses technology and sophisticated data processing to make city governance more efficient, keep citizens satisfied, and offer a promising business environment that is more secure and convenient [9]. Three factors influence the formation of a smart city: technological aspects (hardware and software infrastructure), human aspects (creativity, diversity and education) and institutional aspects (government and regulations) [2].
The perspectives of I.C.T., industry and academia lead to a layered approach to modelling a smart city. Yin Chuan Tai et al. propose a four-layer model: data acquisition and transmission, data virtualization, data services, and applications. BDA has been applied in many smart city domains, such as planning [10], traffic control and transportation [11], crime analysis [12], the energy sector [13] and the environment [14]. To design software that meets the objectives of a smart city, it is necessary to identify the functional and non-functional requirements related to data sources and their management. For example, the diversity of data sources (I.O.T. data, social media data, and medical data and records) requires integration, system scalability, and attention to security and privacy [15].
Related research [16] identifies non-functional requirements such as sustainability, availability, privacy, social aspects and flexibility/extensibility (scalability), and functional requirements such as object interoperability, real-time monitoring, historical data, mobility, service composition and integrated urban management. Similar research [17] mentions eight functional requirements: data management, application run-time, Wireless Sensor Network (W.S.N.) management, data processing, external data access, service management, software engineering tools and definition of the smart city model. These are coupled with eight non-functional requirements: interoperability, scalability, security, privacy, context awareness, adaptation, extensibility and configurability.
We can conclude that there are seven functional requirements: interoperability, real-time analysis, historical analysis of data, mobility, iterative processing, data integration, and aggregation models. There are also nine non-functional requirements: scalability, security, privacy, context awareness, adaptation, extensibility, sustainability, availability, and configurability.

Results and Discussion of BDA Framework
Based on a review and evaluation of several related articles, three popular architectural frameworks for BDA can be applied in the smart city context: BASIS [18], SWIFT [19] and RADICAL [20].

BASIS
BASIS is a three-layer big data architecture for smart cities. The layers are: conceptual, technology and infrastructure. The conceptual layer packages the internal and external functionality of BASIS, which includes data capturing, integration, storage, and APIs for web streams and data analysis. The technology layer is realized with the Hadoop platform. The infrastructure layer manages the hardware interconnection with the outside world and serves as the interface between the technology layer and external systems. The design principles of BASIS are: (1) open data services [21], which facilitate the realization of new services developed by the government, organizations or communities; (2) storage and processing of data that may be distributed or centralized; (3) a focus on security and privacy; (4) partitioning of data to manage the data cycle; (5) a focus on services and the freedom of users; and (6) the use of open-source technology, which makes the architecture more open. BASIS has been validated in a case study that extracted flight delay profiles for several American cities. The profiles were extracted using the K-Means algorithm, which was run on 13 computer clusters using a dataset containing 3.5 million flight records in America.
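To make the K-Means step concrete, the following minimal sketch clusters a handful of flight records into delay profiles. It is not the BASIS implementation, which ran on Hadoop: the data points are invented, the two features (departure and arrival delay in minutes) are an assumption, and the centroids are initialized deterministically from the first k points to keep the example reproducible.

```python
from math import dist

def kmeans(points, k, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = [points[i] for i in range(k)]  # deterministic start: first k points
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignment

# Hypothetical (departure delay, arrival delay) pairs in minutes.
flights = [(5, 3), (7, 6), (4, 5), (60, 55), (65, 70), (58, 62)]
centroids, labels = kmeans(flights, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Each centroid can then be read as one delay profile, here a "mostly on time" group and a "heavily delayed" group.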

SWIFT
SWIFT, which stands for Smart Wireless Sensor Network (W.S.N.)-based Infrastructural Framework for smart Transactions, is a three-layer architectural framework that supports the integration of various devices. The most basic layer is the Smart Wireless Sensor Network (S-WSN). The S-WSN layer functions as SWIFT's detector, gathering information from hundreds of distributed wireless sensors. Sensors are grouped into clusters, each managed by a smart cluster head (SCH), and the SCHs are distributed across various locations. An SCH collects and combines data from nearby sensors. In addition, each sensor sends information about its identity and battery status. An SCH can also sound an alarm or activate an emergency signal. Data from the SCHs is sent to Smart Fusion Nodes (SFN), which are in the SWIPE layer, the core of SWIFT. An SFN classifies and merges data to interpret the sensor readings. Meanwhile, SDCE provides cloud-based services to all domains in the smart city.
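The flow from sensors through an SCH to an SFN can be sketched as below. This is only an illustration of the roles described above, not code from [19]: the class and function names, the alarm threshold, the low-battery cutoff and the two-way "alert"/"normal" classification are all assumptions.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Reading:
    sensor_id: str      # identity reported by the sensor
    value: float        # measured quantity; the unit is illustrative
    battery_pct: int    # battery status reported by the sensor

class SmartClusterHead:
    """Collects readings from nearby sensors and combines them into one summary."""

    def __init__(self, name, alarm_threshold, low_battery_pct=20):
        self.name = name
        self.alarm_threshold = alarm_threshold
        self.low_battery_pct = low_battery_pct

    def collect(self, readings):
        return {
            "cluster": self.name,
            "mean_value": mean(r.value for r in readings),
            "low_battery": [r.sensor_id for r in readings
                            if r.battery_pct < self.low_battery_pct],
            # The SCH raises an alarm if any reading exceeds the threshold.
            "alarm": any(r.value > self.alarm_threshold for r in readings),
        }

def fuse(summaries):
    """Smart Fusion Node step: merge cluster summaries and classify each cluster."""
    return {s["cluster"]: "alert" if s["alarm"] else "normal" for s in summaries}

sch_a = SmartClusterHead("sch-a", alarm_threshold=50.0)
sch_b = SmartClusterHead("sch-b", alarm_threshold=50.0)
summaries = [
    sch_a.collect([Reading("s1", 21.0, 90), Reading("s2", 23.0, 15)]),
    sch_b.collect([Reading("s3", 70.0, 80), Reading("s4", 30.0, 60)]),
]
status = fuse(summaries)
print(status)
```

Aggregating at the cluster head means that only one summary per cluster, rather than every raw reading, travels up to the fusion layer.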

RADICAL
Rapid Deployment on Intelligent Cities And Living (RADICAL) is a Service Oriented Architecture (S.O.A.)-based platform that collects IoT (Internet of Things) and Social Network (S.N.) data and analyzes it to deliver value-added services for smart cities [22]. IoT data is stored in the form of observations and measurements, for example measurements of CO2 levels. Meanwhile, S.N. data is accessed in real time through the network API.
On top of the main platform, RADICAL provides a set of application management tools that help users make the most of the platform.
IoT data is collected in a RADICAL repository (MySQL Database) using an API (Application Programming Interface).
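A repository of IoT observations behind a small API might look like the sketch below. It does not reproduce the RADICAL schema or API: the table layout and function names are hypothetical, and an in-memory SQLite database is used as a stand-in for MySQL so that the example is self-contained.

```python
import sqlite3

def open_repository():
    """Create an in-memory observation store (SQLite stands in for MySQL here)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observation ("
        "sensor_id TEXT, property TEXT, value REAL, observed_at TEXT)"
    )
    return conn

def store_observation(conn, sensor_id, prop, value, observed_at):
    """Hypothetical API call: record one measurement as an observation row."""
    conn.execute(
        "INSERT INTO observation VALUES (?, ?, ?, ?)",
        (sensor_id, prop, value, observed_at),
    )
    conn.commit()

def average(conn, prop):
    """Return the mean stored value for one observed property, or None if absent."""
    row = conn.execute(
        "SELECT AVG(value) FROM observation WHERE property = ?", (prop,)
    ).fetchone()
    return row[0]

repo = open_repository()
store_observation(repo, "node-1", "co2_ppm", 400.0, "2020-01-01T08:00:00")
store_observation(repo, "node-2", "co2_ppm", 420.0, "2020-01-01T08:00:00")
store_observation(repo, "node-1", "noise_db", 55.0, "2020-01-01T08:00:00")
print(average(repo, "co2_ppm"))
```

Storing each measurement as a generic observation row lets new sensor types be added without changing the table layout.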