IoTBDS 2019 Abstracts


Area 1 - Big Data Research

Full Papers
Paper Nr: 5
Title:

Automatic View Selection for Distributed Dimensional Data

Authors:

Leandro Ordonez-Ante, Gregory Van Seghbroeck, Tim Wauters, Bruno Volckaert and Filip De Turck

Abstract: Small-to-medium businesses are increasingly relying on big data platforms to run their analytical workloads in a cost-effective manner, instead of using conventional and costly data warehouse systems. However, the distributed nature of big data technologies makes it time-consuming to process typical analytical queries, especially those involving aggregate and join operations, preventing business users from performing efficient data exploration. To address this, a workload-driven approach for automatic view selection was devised, aimed at speeding up analytical queries issued against distributed dimensional data. This paper presents a detailed description of the proposed approach, along with an extensive evaluation to test its feasibility. Experimental results show that the conceived mechanism is able to automatically derive a limited but comprehensive set of views able to reduce query processing time by up to 89%–98%.

Paper Nr: 8
Title:

Speeding Up Classifier Chains in Multi-label Classification

Authors:

Jose M. Moyano, Eva L. Gibaja, Sebastián Ventura and Alberto Cano

Abstract: Multi-label classification has attracted increasing attention from the scientific community in recent years, given its ability to solve problems where each of the examples simultaneously belongs to multiple labels. Among all the techniques developed to solve multi-label classification problems, Classifier Chains has been demonstrated to be one of the best performing. However, one of its main drawbacks is its inherently sequential definition. Although many research works have aimed to reduce the runtime of multi-label classification algorithms, to the best of our knowledge, there are no proposals to specifically reduce the runtime of Classifier Chains. Therefore, in this paper we propose a method called Parallel Classifier Chains, which enables the parallelization of Classifier Chains. In this way, Parallel Classifier Chains builds k binary classifiers in parallel, where each of them includes as extra input features the predictions of those labels that have been previously built. We performed an experimental evaluation over 20 datasets using 5 metrics to analyze both the runtime and the predictive performance of our proposal. The results of the experiments affirmed that our proposal was able to significantly reduce the runtime of Classifier Chains while the predictive performance was not statistically significantly harmed.

Paper Nr: 16
Title:

An Efficient Heuristic Method for Repairing Event Logs Independent of Process Models

Authors:

Li Kong, Chuanyi Li, Jidong Ge, Zhongjin Li, Feifei Zhang and Bin Luo

Abstract: Due to the large volume of data and complex execution, event logs of business processes inevitably contain various errors. In the field of process mining, if we derive process models from the event data without repairing it, it is very likely that the resulting process is extremely different from what we expect. Current methods of repairing logs generally compare the log with an existing reference model to seek an optimal alignment, which requires a reliable reference model. Therefore, this paper presents an approach which refers only to the log itself to repair mistaken traces. We identify loop structures and frequent event sequences (sound conditions) between certain events. For each trace, the basic trace and loop events are separated in advance. The basic trace is split into several parts, which are repaired one by one according to the sound conditions. Then loop events are added back and checked according to the corresponding loop structures we discover. The repaired log should be as clean as possible and as similar to the original log as possible so that correctness and integrity of the original log are guaranteed. Experimental results based on different logs show that our approach is effective and efficient.

Paper Nr: 20
Title:

Associative Classification in Big Data through a G3P Approach

Authors:

J. M. Luna, F. Padillo and S. Ventura

Abstract: The associative classification field includes really interesting approaches for building reliable classifiers, and most of these approaches generally work in four different phases (data discretization, pattern mining, rule mining, and classifier building). This number of phases is a handicap when big datasets are analysed. The aim of this work is to propose a novel evolutionary algorithm for efficiently building associative classifiers in Big Data. The proposed model works in only two phases (a grammar-guided genetic programming framework is performed in each phase): 1) mining reliable association rules; 2) building an accurate classifier by ranking and combining the previously mined rules. The proposal has been implemented on Apache Spark to take advantage of distributed computing. The experimental analysis was performed on 40 well-known datasets, considering 13 algorithms taken from the literature. A series of non-parametric tests has also been carried out to determine statistical differences. Results are quite promising in terms of reliability and efficiency on high-dimensional data.

Paper Nr: 38
Title:

Industrial Big Data: From Data to Information to Actions

Authors:

Andreas Kirmse, Felix Kuschicke and Max Hoffmann

Abstract: Technologies related to the Big Data term are increasingly focusing on the industrial sector. The underlying concepts are suited to introduce disruptive changes in the various ways information is generated, integrated and used for optimization in modern production plants. Nevertheless, the adoption of these web-inspired technologies in an industrial environment is connected to multiple challenges, as the manufacturing industry has to cope with specific requirements and prerequisites that differ from common Big Data applications. Existing architectural approaches appear to be either partially incomplete or only address individual aspects of the challenges arising from industrial big data. The goal of this paper is to thoroughly review existing approaches for industrial big data in manufacturing and to derive a consolidated architecture that is able to deal with all major problems of the industrial big data integration and deployment chain. Appropriate technologies to realize the presented approach are pointed out accordingly.

Short Papers
Paper Nr: 3
Title:

Towards a Software Architecture for Near Real-time Applications of IoT

Authors:

Dominik Grzelak, Carl Mai and Uwe Aßmann

Abstract: The number of Internet of Things (IoT) devices is increasing, and these devices will become an ever more important source of information made available through their sensors. As a result, devices form denser networks producing a huge variety and volume of data. If intercommunication and interaction between many decentralized resources are not considered a primary objective by vendors, networking distributed IoT devices will be complicated due to their heterogeneity. Thus, mastering the challenge of collecting and processing data with low latency is a difficult task. In this paper, we present a practical and easy-to-employ reference software architecture for fog computing application scenarios, enabling communication between a multitude of devices which require an efficient and robust real-time system. As proof, we conduct a practical demonstration: a three-dimensional mouse, called the Cube-It, is constructed to control a six-joint robot (i.e., the UR10). The findings of this work are expected to aid researchers studying the integration of heterogeneous IoT devices within fog computing environments comprising many sensors and actuators.

Paper Nr: 27
Title:

SOSE4BD: Service-Oriented Software Engineering Framework for Big Data Applications

Authors:

Muthu Ramachandran

Abstract: Service computing has emerged to address the notion of delivering software as a service, and Service-Oriented Architecture (SOA) has emerged as a design method supporting well-defined design principles of loose coupling, interface design, autonomic computing, seamless integration, and the publish/subscribe paradigm. Big data applications integrated with IoT, Fog, and Cloud Computing are growing exponentially, as are the businesses involved and the speed and storage of the data. Therefore, it is time to consider a systematic engineering approach to developing and deploying big data services, as data-driven applications and devices are increasing rapidly. This paper proposes a software engineering framework and an SOA-based reference architecture for the development of big data applications. The paper concludes with a simulation of a complex big data Facebook application with real-time streaming, using part of the requirements engineering aspect of the SOSE4BD framework with BPMN as a tool for requirement modelling and simulation to study the characteristics before big data service design, development, and deployment. The simulation results demonstrated the efficiency and effectiveness of developing big data applications using the reference architecture framework for big data.

Paper Nr: 28
Title:

Reasoning Methods in Fuzzy Rule-based Classification Systems for Big Data Problems

Authors:

Antonio González, Raúl Pérez and Rocio Romero-Zaliz

Abstract: The analysis with a very high number of examples is a subject of growing interest that needs new algorithms and procedures. In this case, we study how the massive use of data affects the reasoning processes for classification problems that make use of fuzzy rule-based systems. First, we describe the standard reasoning model and the operations associated with its use, and once it is verified that these calculations may be inefficient in some cases we propose a new model to perform such calculations. Basically, the proposal eliminates the need to review all the rules in every inference process, generating the rule that best adapts to the particular example, which does not have to be part of the set of rules, and from it explore only the rules that have some effect on the example. We make an experimental study that shows the interest of the proposal presented.

Paper Nr: 35
Title:

Learning from Others’ Mistakes: An Analysis of Cyber-security Incidents

Authors:

Giovanni Abbiati, Silvio Ranise, Antonio Schizzerotto and Alberto Siena

Abstract: Cyber-security incidents can have dramatic economic, social and institutional impact. The task of providing an adequate cyber-security posture to companies and organisations is far from trivial and needs the collection of information about threats from a wide range of sources. One such source is history, in the form of datasets containing information about past cyber-security incidents including date, size, type of attacks, and industry sector. Unfortunately, there are few publicly available datasets of this kind that are of good quality. The paper reports our initial efforts in building a large dataset of cyber-security incidents that contains around 14,000 entries by merging a collection of four publicly available datasets of different size and provenance. We also perform an analysis of the combined dataset, discuss our findings, and discuss the limitations of the proposed approach.

Paper Nr: 43
Title:

A First Approach on Big Data Missing Values Imputation

Authors:

Besay Montesdeoca, Julián Luengo, Jesús Maillo, Diego García-Gil, Salvador García and Francisco Herrera

Abstract: Although most techniques and algorithms assume that the data is accurate, measurements in our analog world are far from perfect. Since our capabilities for storing and processing data are growing every day, these imperfections will accumulate, leading to poorer decisions and hindering any knowledge extraction process carried out over the raw data. One of the most disturbing imperfections is the presence of missing values. Many inductive algorithms assume that the data is complete; thus, if they face missing data, they will not work properly or the quality of the knowledge extracted will be poorer. At this point there is no sophisticated missing values treatment implemented in any major Big Data framework. In this contribution, we present two novel imputation methods based on clustering that achieve better results than simply removing the faulty examples or filling in the missing values with the mean, and that can be easily ported to Spark’s MLlib.
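For orientation, the mean-filling baseline that the abstract compares against can be sketched as follows. This is a minimal single-machine illustration, not the authors' clustering-based methods and not Spark code; the function name `impute_column_means` and the use of `None` to mark a missing value are assumptions made for this example.

```python
def impute_column_means(rows):
    """Replace each missing value (None) with the mean of its column.

    Hypothetical helper for illustration: `rows` is a list of equal-length
    lists of numbers, with None marking a missing entry.
    """
    if not rows:
        return []
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [r[c] for r in rows if r[c] is not None]
        # A column with no observed values has no mean; it stays None.
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[c] if v is None else v for c, v in enumerate(r)]
            for r in rows]
```

The other baseline mentioned, removing the faulty examples, would simply drop every row that contains a `None`.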

Paper Nr: 44
Title:

Big Data Preprocessing as the Bridge between Big Data and Smart Data: BigDaPSpark and BigDaPFlink Libraries

Authors:

Diego García-Gil, Alejandro Alcalde-Barros, Julián Luengo, Salvador García and Francisco Herrera

Abstract: With the advent of Big Data, terabytes of data are generated and stored every second. This raw data is far from perfect: it contains many imperfections (noise, missing values, etc.) and is not suitable for analysis, as it will lead to wrong conclusions. Data preprocessing is the set of techniques devoted to polishing, cleaning, fixing, and improving that raw data. With this preprocessed data, we would be able to find more patterns in it and to better explain the underlying distribution of the data. This is what is called Smart Data: raw data that has been preprocessed and is ready to be analyzed, data that contains valuable information that will lead to knowledge. In this work, we present two Big Data libraries for achieving Smart Data from Big Data, BigDaPSpark and BigDaPFlink. They are built on top of two Big Data frameworks, Apache Spark and Apache Flink. Both libraries contain a series of algorithms for Big Data preprocessing, ranging from noise cleaning to discretization and data reduction, among many others. Additionally, we illustrate the usage of the libraries with two use cases.

Paper Nr: 51
Title:

Challenging Big Data Engineering: Positioning of Current and Future Development

Authors:

Matthias Volk, Daniel Staegemann, Matthias Pohl and Klaus Turowski

Abstract: This contribution examines the terms big data and big data engineering, considering their specific characteristics and challenges. From these, it deduces the need for new ways to support the creation of corresponding systems to help big data reach its full potential. Subsequently, the state of the art is analysed and subdomains in the engineering of big data solutions are presented. Finally, a possible concept for filling the identified gap is proposed and future perspectives are highlighted.

Paper Nr: 56
Title:

Research Directions on Big IoT Data Processing using Distributed Ledger Technology: A Position Paper

Authors:

Benjamin Agbo, Yongrui Qin and Richard Hill

Abstract: The significant growth and adoption of Internet of Things (IoT) solutions has led to a tremendous increase in the generation of data. The need for high-speed data processing has become very important to keep up with the ever-increasing volume and velocity of IoT data, due to the large scale and distributed nature of IoT infrastructure and networks. Present cloud-based technologies are struggling to meet these needs for real-time data processing in the midst of enormous amounts of data. The success of Bitcoin has inspired more research into the application of distributed ledger technologies in various domains. The decentralized nature of these platforms has enabled security and privacy of data in previous research, and their architecture has potential for enabling large-scale decentralized data processing. In this paper, we identify some open areas of research in the use of distributed ledger technology and propose a framework for storing, analyzing and ensuring the security of large volumes of IoT data.

Area 2 - Emerging Services and Analytics

Short Papers
Paper Nr: 47
Title:

Towards an Automated Optimization-as-a-Service Concept

Authors:

Sascha Bosse, Abdulrahman Nahhas, Matthias Pohl and Klaus Turowski

Abstract: Many organizations try to apply analytics in order to improve their business processes. More and more cloud services are offered to support these efforts. However, the support of prescriptive analytics is weak. While concepts for such an optimization-as-a-service exist, these require much expert knowledge in solution methods. In this paper, a workflow for optimization-as-a-service is proposed that utilizes an optimization knowledge base in which machine learning techniques are applied to automatically select and parametrize suitable solution algorithms. This would allow consumers to use the service without expert knowledge while reducing operational costs for providers.

Paper Nr: 53
Title:

Analysis of Filipino Mood Swings within a Day using Tweets

Authors:

Rodalyn A. Balajadia, Vincent L. Maglambayan and Maria R. Pulido

Abstract: We infer moods and how they vary with time using a dataset from the social media application Twitter. We used Python text mining techniques to gather all tweets originating from the Philippines within a span of 24 hours. From the dataset of around 130,000 tweets, we gathered the highest-frequency words and filtered out neutral words to come up with words that imply mood levels. We then plotted the density of keyword usage with respect to time, distinguishing between positive and negative moods. Our initial results of positive mood and negative mood trends are consistent with published studies regarding microblogging mood scales. The emergence of Big Data and the Internet of Things has greatly amplified our ability not only to express ourselves but to understand each other.
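The keyword-counting step described in the abstract could look roughly like the sketch below. The abstract does not give the authors' code, tokenizer, or keyword lists, so the function name `mood_counts_by_hour`, the `(hour, text)` input shape, and whitespace tokenization are all assumptions for illustration.

```python
from collections import Counter


def mood_counts_by_hour(tweets, positive_words, negative_words):
    """Count positive and negative mood keywords per hour of the day.

    Hypothetical helper: `tweets` is an iterable of (hour, text) pairs and
    the two keyword sets are supplied by the caller.
    """
    counts = {}
    for hour, text in tweets:
        tokens = text.lower().split()  # naive whitespace tokenization
        bucket = counts.setdefault(hour, Counter())
        bucket["positive"] += sum(t in positive_words for t in tokens)
        bucket["negative"] += sum(t in negative_words for t in tokens)
    return counts
```

Plotting the two counts against the hour gives a rough version of the keyword-density curves the abstract refers to.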

Paper Nr: 54
Title:

Rainfall Distribution Trend Analysis of the Philippine National Capital Region (2013-2016)

Authors:

Miguel M. Bobadilla, Ryan A. Eugenio and Maria R. Pulido

Abstract: The Philippine archipelago is a tropical country that experiences only two major seasons annually: wet (June-November) and dry (December-May). Due to these conditions, the country is bound to experience significant amounts of rainfall, followed by drought. Hence, studying long-term rainfall trends is highly beneficial for the country’s livelihood and safety. In this work, we studied the rainfall distribution in the National Capital Region covering the period of 2013 to 2016, and analysed the data using the Mann-Kendall Test and the Bootstrap procedure. Using a monthly scale, we found a negative trend, signifying a decrease in rainfall amount over the four years of data. Interestingly, we found a positive trend using a yearly scale, showing an increase of rainfall overall. Therefore, it is quite risky to generalize a region's rainfall condition just by looking at it annually; one must also consider its seasonal and monthly phenomena for a more detailed analysis. We also note that the area being studied was considerably large and the rainfall data varied with the location of the weather station where it was obtained. This work demonstrates the potential of using Big Data and the Internet of Things to measure and predict weather trends using various sensors and processors.
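As a reminder of what the Mann-Kendall test computes, here is a minimal sketch of its S statistic and the normal-approximation Z score. It assumes a series with no tied values (the variance formula omits the tie correction) and does not cover the Bootstrap step; the function name `mann_kendall` is chosen for this example and is not taken from the paper.

```python
import math


def mann_kendall(series):
    """Return (S, Z) for the Mann-Kendall trend test, assuming no ties."""
    n = len(series)
    s = 0
    # S sums the signs of all pairwise differences x_j - x_i with i < j.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis of no trend (no ties).
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


# Made-up values: a strictly increasing series of five points gives S = 10.
s, z = mann_kendall([1.0, 2.0, 3.0, 4.0, 5.0])
```

A positive S indicates an increasing trend and a negative S a decreasing one. Applying the function separately to monthly and yearly aggregates is one way to see how the two scales can disagree, as the abstract reports.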

Paper Nr: 60
Title:

Bias in Filipino Newspapers? Newspaper Sentiment Analysis of the 2017 Battle of Marawi

Authors:

Dexter R. Valdeavilla and Maria R. Pulido

Abstract: Newspapers provide factual reports on current events. However, news media has been shown to be ideologically biased, often negatively shaping the readers' point of view. News on controversial issues makes the bias of the newspaper or its writers more visible. This study aims to measure the objectivity of newspapers by classifying news articles from three newspaper agencies covering the 2017 Battle of Marawi in Southern Philippines. We used Aylien Sentiment Analysis Tool to detect the bias or polarity in each news article (whether positive, negative or neutral). Negative articles on Marawi dominated the three broadsheets (45.1% to 59.9%) while the neutral articles were the least frequent (16.1% to 21.2%). These results indicate that newspapers apply unequal space on the different sides of an issue, which may lead to unbalanced reporting. We also note that despite the varying number of total articles, the three papers applied the same proportion of positive, negative and neutral articles, which may imply collusion. The emergence of Big Data greatly increases the speed of gathering news articles on any given issue, while the Internet of Things enables readers and journalists to measure the objectivity of the news.

Area 3 - Big Data for Multi-discipline Services

Full Papers
Paper Nr: 13
Title:

A Risk Factors Screening Method in the Context-aware System of Hypertension

Authors:

Duoyi Xie, Guixia Kang and Longfeng Chen

Abstract: Hypertension has become a health problem that seriously endangers human life and is the leading cause of cardiovascular disease. Many patients do not know exactly whether their blood pressure is well controlled or not, which makes their conditions worse. A context-aware intelligent system can help patients to analyse their control situation of blood pressure (BP) and provide feedback. It is especially important to determine whether the risk factors input into the context-aware system of hypertension are appropriate. The choice of risk factors will affect the classification performance and accuracy of the system. The risk factors screening method for hypertension proposed in this paper combines the random forest algorithm and stability selection (RFSS). It can remove redundant context information and retain the key factors of the BP control situation. Experimental results showed that the prediction accuracy exceeded 77% and the dimension of the risk factors was reduced by 59%. The results indicated that RFSS is an effective method for the screening of risk factors and the prediction of hypertension.

Paper Nr: 70
Title:

Recommendations from Cold Starts in Big Data

Authors:

David Ralph, Yunjia Li, Gary Wills and Nicolas G. Green

Abstract: In this paper, we introduce Transitive Semantic Relationships (TSR), a new technique for ranking recommendations from cold-starts in datasets with very sparse, partial labelling, by making use of semantic embeddings of auxiliary information, in this case, textual item descriptions. We also introduce a new dataset on the Isle of Wight Supply Chain (IWSC), which we use to demonstrate the new technique. We achieve a cold start hit rate @10 of 77% on a collection of 630 items with only 376 supply-chain supplier labels, and 67% with only 142 supply-chain consumer labels, demonstrating a high level of performance even with extremely few labels in challenging cold-start scenarios. The TSR technique is generalisable to any dataset where items with similar description text share similar relationships and has applications in speculatively expanding the number of relationships in partially labelled datasets and highlighting potential items of interest for human review. The technique is also appropriate for use as a recommendation algorithm, either standalone or supporting traditional recommender systems in difficult cold-start situations.
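The "hit rate @10" figure quoted above is commonly computed as the fraction of queries for which at least one correct item appears among the top 10 ranked recommendations. The sketch below follows that common definition; the abstract does not spell out the exact evaluation protocol, so treat both the definition and the function name `hit_rate_at_k` as assumptions.

```python
def hit_rate_at_k(ranked_lists, relevant_sets, k=10):
    """Fraction of queries with at least one relevant item in the top k.

    Hypothetical helper: `ranked_lists[i]` is the ranked recommendation list
    for query i and `relevant_sets[i]` is the set of correct items for it.
    """
    if not ranked_lists:
        return 0.0
    hits = 0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        if any(item in relevant for item in ranked[:k]):
            hits += 1
    return hits / len(ranked_lists)
```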

Short Papers
Paper Nr: 6
Title:

Big Data and International Accreditations in Higher Education: A Dutch - Russian Case Study

Authors:

Florentin Popescu, Roman Iskandaryan and Tijmen Weber

Abstract: This comparative study seeks to document how, with the help of Big Data, different aspects of International Accreditations are perceived by the faculty and higher management of both Plekhanov Russian University of Economics and HAN University of Applied Sciences (Arnhem Business School), the Netherlands. The paper aims to advise and help the chosen universities with their international accreditation processes by demonstrating the importance of Big Data. The importance of Big Data and International Accreditations to both universities is accounted for in this paper. In comparison to current research on Big Data in higher education, this work focuses on the goal of preparing the universities for future academic international accreditations. It is a comparative study meant to learn from best practices rather than to generalize and extrapolate results. On a conceptual level, this paper contributes to knowledge by attempting to develop a strategic plan for the international accreditation process by determining the best practices using Big Data, while creating a process for internationalization to increase the universities’ global competitiveness.

Paper Nr: 34
Title:

Investigation of Sound-Gustatory Synesthesia in a Coffeehouse Setting

Authors:

Nicole V. Santos and Maria R. Pulido

Abstract: Synesthesia is a perceptual phenomenon involving the stimulation of multiple senses. In this work, we determine the presence of sound-gustatory synesthesia by looking at the possible effects of background music on the perceived taste of a coffee-sugar mixture. We asked participants (N = 83) to listen to music while identifying the tastes they perceived when drinking a coffee-sugar sample. Our results showed that sweetness was perceived more while listening to the “Slow” music, which is consistent with previous work. The perception of sourness also increased with the tempo of the music, consistent with work associating sourness with pitch. Interestingly, participants also perceived saltiness and sourness even though the sample did not contain ingredients with those tastes, which provides further evidence of sound influencing taste perception. This study has shown the presence of sound-gustatory synesthesia in a typical coffeehouse setting, introducing potential applications in psychophysics, food science, and other complex systems research. Our algorithm has also shown how quantitative tools can be used in a qualitative field such as psychological perception. We expect multisensory, interconnected technology in the Internet of Things to spread the experience of synesthesia within a population, with Big Data enabling researchers to detect and measure synesthesia much more accurately.

Paper Nr: 65
Title:

An Integrated Data Platform for Agricultural Data Analyses based on Agricultural ISOBUS and ISOXML

Authors:

Franz Kraatz, Heiko Tapken, Frank Nordemann, Thorben Iggena, Maik Fruhner and Ralf Tönjes

Abstract: Over the last few years, many different agricultural online management portals have come to market. The focus of these portals is on documentation, accounting and task planning. Data analyses and process planning are often not considered. For this reason, the existing data in the data platforms of present portals is often badly integrated and consequently not designed for data analyses. This paper introduces a new architecture concept for an integrated agricultural data platform. With this new data platform, agricultural data analyses for precision farming become possible. Furthermore, the integration of agricultural devices and external sources into one platform changes task planning for one machine into process planning for cooperating machines. Several challenges for the integration of agricultural data and data types for agricultural data analyses are discussed.

Paper Nr: 33
Title:

Quality Management for Big 3D Data Analytics: A Case Study of Protein Data Bank

Authors:

Hind Bangui, Mouzhi Ge and Barbora Buhnova

Abstract: 3D data have been widely used to represent complex data objects in different domains such as virtual reality, 3D printing or biological data analytics. Due to their complexity, 3D data are usually characterized as big 3D data. One typical kind of big 3D data is protein data, which can be used to visualize protein structure in 3D. However, 3D data also bring various data quality problems, which may cause delays, inaccurate analysis results, and even fatal errors in critical decision making. Therefore, this paper proposes a novel big 3D data process model with specific consideration of 3D data quality. In order to validate this model, we conduct a case study on cleaning and analyzing protein data. Our case study includes a comprehensive taxonomy of data quality problems for 3D protein data and demonstrates the utility of our proposed model. Furthermore, this work can guide researchers and domain experts such as biologists in managing the quality of their 3D protein data.

Paper Nr: 52
Title:

Investigating the Presence of the Symptoms of Depression among University-age Filipinos

Authors:

Dexter R. Valdeavilla, Nicole V. Santos, Agana S. Domingo and Maria R. Pulido

Abstract: Depression is a mental illness that negatively affects how a person feels, thinks and acts. In this work, we used an online survey to ask 501 Filipino university-age students about symptoms commonly associated with depression: sadness or isolation, headaches or migraine, anxiety over everyday activities, moodiness or irritability or agitation, chronic fatigue, and low self-esteem or motivation. We learned that all respondents experience at least one symptom weekly. Most respondents (52.7%) experience all six symptoms weekly, with 1 to 3 days a week being the most common frequency. An overwhelming majority attributed such symptoms to academics (92.6%), followed by family (69.5%) and friends (49.5%). Lastly, most (41.9% - 59.7%) believe they have around 1-3 friends with the same symptoms that they experience. The researchers are calling for an increased awareness of mental health issues and good practices, especially within homes and schools, to address the prevalence of depression in university-age Filipinos. The prevalence of Big Data and the Internet of Things within this particular demographic greatly enhances the ability of mental health professionals and researchers to detect and hopefully address the symptoms of depression.

Area 4 - Internet of Things (IoT) Applications

Full Papers
Paper Nr: 64
Title:

OPeRAte: An IoT Approach towards Collaborative, Manufacturer-independent Farming 4.0

Authors:

Maik Fruhner, Thorben Iggena, Franz Kraatz, Frank Nordemann, Heiko Tapken and Ralf Tönjes

Abstract: Without modern agricultural technology, it would not be possible to feed the world’s steadily growing population. In order to be able to handle this task in the future, new improvements must be constantly developed to improve agriculture and increase yields. These include methods known as ’Precision Farming’, ’Smart Farming’ or ’Farming 4.0’. These terms describe working in the field with high-tech machines that are supported by intelligent systems that communicate with each other. But at the present time, such intercommunicating ecosystems are manufacturer-bound and hardly interoperable. This paper presents a new approach to connecting various agricultural machines from different manufacturers into a common network of IoT devices. In this project, a framework for the orchestration of agricultural processes is being developed that is capable of planning, controlling, monitoring and documenting joint collaborative tasks between many independent machines while breaking the commitment to a single manufacturer. The first application example of the system is the development of a tank trailer for liquid slurry spreading with various sensors and controls. By using IoT-specific technologies, the tank can already be configured by a process management system, so that exact nutrient quantities are applied part-field-specifically and a legally compliant documentation is generated.

Area 5 - Internet of Things (IoT) Fundamentals

Full Papers
Paper Nr: 11
Title:

Ensembled Outlier Detection using Multi-Variable Correlation in WSN through Unsupervised Learning Techniques

Authors:

Marc Roig, Marisa Catalan and Bernat Gastón

Abstract: Outlier detection in Wireless Sensor Networks is a crucial aspect of IoT, since cheap sensors tend to be seriously exposed to errors and inaccuracies. Hence, there is a need for a solution that improves the quality of the data without increasing the cost of the sensors. In Big Data paradigms, it is difficult to exploit the temporal correlation of sensors since Big Data architectures and technologies do not process data in order. In this paper, a complete study of multi-variable based outlier detection is carried out. Firstly, three known unsupervised algorithms (Elliptic Envelope, Isolation Forest and Local Outlier Factor) are analysed and tested in a big data architecture. Secondly, an ensemble outlier detector (EOD) is created from the outputs of these algorithms and compared, in a lab environment, with the previous results for different contamination parameters of the training set. The analysis of the results shows that, for correlated variables, the multi-variable EOD has a very good detection rate with a very low false alarm rate. Finally, the EOD is used in a real-world scenario in the city of Barcelona and the results are analysed using spectral-decomposition techniques, which indicate that the EOD performs well in a real case.
Download
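Editorial sketch for Paper Nr: 11: the abstract says the EOD is built from the outputs of three unsupervised detectors but does not say how those outputs are combined. The Python sketch below assumes a simple majority vote, which is only one plausible combination rule and is not taken from the paper. The function name `ensemble_vote` and the parameter `min_votes` are hypothetical; the +1 (inlier) / -1 (outlier) label convention follows what scikit-learn's outlier detectors return from `predict`.

```python
def ensemble_vote(detector_labels, min_votes=2):
    """Combine per-detector labels into one label per sample.

    detector_labels: list of equal-length label lists, one per detector,
    using +1 for inliers and -1 for outliers.
    min_votes: hypothetical threshold; a sample is flagged as an outlier
    when at least this many detectors flag it (2 of 3 = majority vote).
    """
    if not detector_labels:
        raise ValueError("at least one detector output is required")
    n_samples = len(detector_labels[0])
    if any(len(labels) != n_samples for labels in detector_labels):
        raise ValueError("all detectors must label the same samples")

    combined = []
    for sample_labels in zip(*detector_labels):
        outlier_votes = sum(1 for label in sample_labels if label == -1)
        combined.append(-1 if outlier_votes >= min_votes else 1)
    return combined


# Illustrative labels for four samples from three detectors
# (made-up values, not results from the paper).
elliptic_envelope = [1, -1, 1, -1]
isolation_forest = [1, -1, -1, 1]
local_outlier_factor = [1, -1, 1, 1]
eod_labels = ensemble_vote(
    [elliptic_envelope, isolation_forest, local_outlier_factor]
)
```

Raising `min_votes` to 3 would flag a sample only when all detectors agree, trading detection rate for a lower false alarm rate.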

Paper Nr: 15
Title:

A Systematic Mapping Study of Deployment and Orchestration Approaches for IoT

Authors:

Phu H. Nguyen, Nicolas Ferry, Gencer Erdogan, Hui Song, Stéphane Lavirotte, Jean-Yves Tigli and Arnor Solberg

Abstract: Internet of Things (IoT) systems are typically distributed and perform coordinated behavior across IoT, edge and cloud infrastructures. Because of the dynamic and heterogeneous nature of these infrastructures, the IoT is challenging state-of-the-art approaches for the deployment and orchestration of software systems. We need a clear picture of the research landscape of the existing deployment and orchestration approaches for IoT (DEPO4IOT). Such a picture can show us how advanced the current state of the art is and which gaps remain to be addressed. We conducted a systematic mapping study (SMS) to chart the research landscape in this area. The results of our SMS show the overall status of the key artifacts of DEPO4IOT. Among the results, we found a sharp increase in the number of primary DEPO4IOT publications in two recent years. We also found that most approaches do not really support deployment or orchestration on low-level IoT devices. Meanwhile, the existing DEPO4IOT approaches largely fail to address trustworthiness aspects and advanced support. Finally, we point out the current open issues in this research area and suggest potential research directions to tackle these issues.
Download

Paper Nr: 25
Title:

Formal Analysis of Energy Consumption in IoT Systems

Authors:

Oualid Demigha and Chamseddine Khalfi

Abstract: In this paper, we apply a model-checking approach to formally analyze the energy consumption of the radio interface in order to guarantee network lifetime in the context of the Internet of Things. We propose a joint MAC-physical model of the IEEE 802.15.4 standard to capture and represent the key operations that consume the energy resources of the nodes at the radio interface component. We argue that combining the radio interface ON/OFF state switching mechanism with the CSMA/CA medium access method leads to better modeling of the energy consumption and helps in understanding the interaction between the MAC and physical layers. Our model provides an accurate representation of simulation models at the first two layers of a node’s protocol stack compliant with the IEEE 802.15.4 standard.
Download

Paper Nr: 32
Title:

MQTT-RD: A MQTT based Resource Discovery for Machine to Machine Communication

Authors:

Eliseu Pereira, Rui Pinto, João Reis and Gil Gonçalves

Abstract: The Internet of Things (IoT) is one of the key enablers for digital businesses and economic growth. By interconnecting objects and people through diverse heterogeneous networks, using Machine to Machine (M2M) communication, IoT enables the continuous monitoring of devices and their surrounding environment, and has huge potential in terms of new business opportunities. One of the biggest challenges in M2M communication today is how devices can look up other devices and their services in local networks and on the Internet. This paper proposes a distributed resource discovery architecture (MQTT-RD) based on the MQTT protocol. The proposed architecture enables decentralized discovery and management of devices in multiple networks by introducing plug and play capabilities to devices, contributing to a mechanism for zero-configuration networking in IoT environments. This architecture was tested in an experimental environment, composed of multiple devices, in order to test resource discovery capabilities using an MQTT based protocol. The evaluated metrics were the overall message drop in the network, the delays in the delivery of messages and the processing time of each message.
Download

Paper Nr: 66
Title:

Stream Generation: Markov Chains vs GANs

Authors:

Ricardo Jesus, Mário Antunes, Pétia Georgieva, Diogo Gomes and Rui L. Aguiar

Abstract: The increasing number of small, cheap devices full of sensing capabilities leads to an untapped source of information that can be explored to improve and optimize several systems. Yet, hand in hand with this growth goes the increasing difficulty of managing and organizing all this new information. In fact, it becomes increasingly difficult to properly evaluate IoT and M2M context-aware platforms. Currently, these platforms use advanced machine learning algorithms to improve and optimize several processes. Having the ability to test them for a long time in a controlled environment is extremely important. In this paper, we discuss two distinct methods to generate a data stream from a small real-world dataset. The first model relies on first order Markov chains, while the second is based on GANs. Our preliminary evaluation shows that both achieve sufficient resolution for most real-world scenarios.
Download
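Editorial sketch for Paper Nr: 66: a first order Markov chain generates each new value from the current value alone, using transition frequencies counted in the source data. The Python sketch below illustrates that general idea under stated assumptions and is not the authors' implementation: the discretization into fixed-width bins (`bin_width`), the restart rule for states without successors, and all function names are hypothetical choices, and the sample readings are made up.

```python
import random
from collections import defaultdict


def fit_first_order_chain(samples, bin_width):
    """Count transitions between consecutive discretized samples."""
    states = [round(x / bin_width) for x in samples]
    transitions = defaultdict(lambda: defaultdict(int))
    for current, following in zip(states, states[1:]):
        transitions[current][following] += 1
    return transitions


def generate_stream(transitions, start_state, length, bin_width, seed=None):
    """Random-walk over the fitted chain and emit `length` values."""
    rng = random.Random(seed)
    state = start_state
    stream = []
    for _ in range(length):
        stream.append(state * bin_width)
        successors = transitions.get(state)
        if not successors:
            # Assumption: a state with no observed successor restarts
            # the walk from a randomly chosen observed state.
            state = rng.choice(list(transitions))
            continue
        candidates = list(successors)
        weights = [successors[s] for s in candidates]
        state = rng.choices(candidates, weights=weights, k=1)[0]
    return stream


# Made-up temperature readings standing in for a small real-world dataset.
readings = [20.0, 20.5, 21.0, 20.5, 20.0, 20.5, 21.0, 21.5, 21.0, 20.5]
chain = fit_first_order_chain(readings, bin_width=0.5)
stream = generate_stream(chain, start_state=40, length=50, bin_width=0.5, seed=1)
```

The generated stream can be arbitrarily long, but it only ever visits values and transitions that occurred in the source data, which is the main limitation of this kind of model compared with a learned generator.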

Short Papers
Paper Nr: 7
Title:

Flexible IoT Edge Computing System to Solve the Tradeoff of Optimal Route Search

Authors:

Tadashi Ogino

Abstract: In recent times, large-scale cloud computing based Internet of Things (IoT) systems have been facing problems such as increased network load, delayed responses, and invasion of privacy. To solve these problems, edge computing techniques have been employed in many IoT systems. However, if the cloud function is excessively migrated to the edge, the collected data cannot be shared between IoT systems, thus reducing the system's usefulness. We propose a multi-agent based flexible IoT edge computing architecture to balance global optimization by a cloud and local optimization by edges and to optimize the role of both the cloud and the edge servers in a dynamic manner. In this paper, as an application example, we introduce a route search system based on the proposed edge computing system architecture to demonstrate the effectiveness of the proposed method.
Download

Paper Nr: 10
Title:

Managing Application-level QoS for IoT Stream Queries in Hazardous Outdoor Environments

Authors:

Holger Ziekow, Annika Hinze and Judy Bowen

Abstract: While most IoT projects focus on well-controlled environments, this paper focuses on IoT applications in the wild, i.e., rugged outdoor environments. Hazard warnings in outdoor monitoring solutions require reliable pattern detection mechanisms, while data may be streamed from a variety of sensors with intermittent communication. This paper introduces the Morepork system for managing application-level Quality of Service in stream queries for rugged IoT environments. It conceptually treats errors as first-class citizens and quantifies their impact at the application level. We present a proof-of-concept implementation, which uses real-world data from New Zealand forestry workers.
Download

Paper Nr: 22
Title:

TriCePS: Self-optimizing Communication for Cyber-Physical Systems

Authors:

Jia L. Du, Stefan Linecker, Peter Dorfinger and Reinhard Mayr

Abstract: The progress in energy-efficient, cost-effective and highly capable sensor-actuator electronics and data transmission technologies has been triggering a new phase in the digital transformation, with potentially billions of cyber-physical systems (CPS) connected in the Internet of Things in the future. To fully harvest the potential of this development, a strategy for efficient, robust, interoperable and future-proof communication between a myriad of different CPS in a global network is essential. Such a strategy will have to cope with the desire for communication between systems that may not be known up front, using communication networks that are not known up front, under unknown conditions. Within the TriCePS project, a framework and missing building blocks for adaptive communication for cyber-physical systems are designed and developed. The three main pillars will be Application Adaptation, Protocol Negotiation and Protocol Parameter Optimization. In this paper, the general concept and architecture as well as first results are presented.
Download

Paper Nr: 55
Title:

On the Complexity of Cloud and IoT Integration: Architectures, Challenges and Solution Approaches

Authors:

Damian Kutzias, Jürgen Falkner and Holger Kett

Abstract: Cloud Computing and the Internet of Things (IoT) are shifting from trend technologies to well established and broadly accepted means to foster business development and service quality. To utilise their full potential in the context of complex systems, applications based on these technologies often have to be properly integrated, resulting in major challenges. In this paper, we provide means for better understanding where integration challenges can occur when establishing Cloud and IoT systems. We briefly present several existing Cloud and IoT architectures and a survey on existing integration challenges. Based on these results, we derived an overall integration architecture as a supporting tool for indicating the different integration challenges, which is presented in a short and a full version due to the overall complexity. Finally, some general approaches for integration are discussed.
Download

Paper Nr: 68
Title:

Performance Evaluation of "Dynamic Double Trickle Timer Algorithm" in RPL for Internet of Things (IoT)

Authors:

Muneer B. Yassein, Ismail Hmeidi, Haneen Shehadeh, Waed B. Yaseen, Esra’a Masadeh, Wail Mardini, Yaser Khamayseh and Qanita B. Baker

Abstract: The Internet of Things (IoT) is a modern technology that is used to support a variety of domains and applications in life. It is based on connecting various devices which can communicate with each other without the need for human intervention. Low Power and Lossy Networks (LLN), which already use IoT techniques, suffer from limited energy and resources. Special protocols have been designed for LLN, such as RPL, which uses the Trickle Timer algorithm and acts as a router and organizer for the transmission of messages in the network. However, the Trickle algorithm suffers from performance deficiencies such as prolonged delays and high power consumption. Therefore, there have been efforts to improve the Trickle Timer algorithm and address these shortcomings. This work is an attempt to enhance the Trickle Timer algorithm to overcome delay and energy consumption problems, using a dynamic doubling technique. The researchers used the Cooja 2.7 simulator to evaluate the performance of the proposed algorithm using several metrics: packet delivery ratio, convergence time and power consumption. The simulation was run under different scenarios, and the proposed algorithm showed better performance and lower energy consumption.
Download
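Editorial sketch for Paper Nr: 68: for context, the Python sketch below models the baseline Trickle timer as specified in RFC 6206, in which the interval doubles after each expiry up to a maximum and resets to the minimum when an inconsistency is heard. It does not implement the paper's "dynamic double" variant, whose details are not given in the abstract. The class and method names are hypothetical, and starting the first interval at `i_min` is one choice the RFC permits.

```python
import random


class TrickleTimer:
    """Baseline Trickle timer state (RFC 6206), without real-time scheduling."""

    def __init__(self, i_min, max_doublings, k, seed=None):
        self.i_min = i_min                             # minimum interval size
        self.i_max = i_min * (2 ** max_doublings)      # maximum interval size
        self.k = k                                     # redundancy constant
        self._rng = random.Random(seed)
        self.interval = i_min
        self._start_interval()

    def _start_interval(self):
        # Reset the consistency counter and pick the transmission point t
        # in the second half of the interval.
        self.counter = 0
        self.fire_at = self._rng.uniform(self.interval / 2, self.interval)

    def hear_consistent(self):
        self.counter += 1

    def hear_inconsistent(self):
        # Reset only if the interval has grown beyond the minimum.
        if self.interval > self.i_min:
            self.interval = self.i_min
            self._start_interval()

    def should_transmit(self):
        # At time t, transmit only if fewer than k consistent
        # transmissions were heard during this interval.
        return self.counter < self.k

    def interval_expired(self):
        # Double the interval, capped at the maximum, and start over.
        self.interval = min(self.interval * 2, self.i_max)
        self._start_interval()


timer = TrickleTimer(i_min=8, max_doublings=3, k=1, seed=0)
for _ in range(5):
    timer.interval_expired()
```

With these illustrative parameters the interval grows 8, 16, 32, 64 and then stays at 64, so a stable network sends few control messages, while an inconsistency brings the interval back to 8 for fast convergence.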

Paper Nr: 46
Title:

Life Cycle-Oriented Evaluation of Cyber-Physical Systems

Authors:

K. Höse and U. Götze

Abstract: Cyber-physical systems (CPS) as a technical enabler of “Industrie 4.0” (I4.0) have been discussed in many published papers. The application of I4.0-technologies allows for an intelligent interconnection between product development, logistics, customers and production. As a result, it is expected that the implementation of I4.0-technologies contributes to the protection of the economic wealth of companies and society. This trend enables innovative processes and products right up to new business models. Nevertheless, companies often hesitate to invest in I4.0-solutions. Uncertainty about the benefit of using I4.0 is one reason why an economic consideration of I4.0-solutions is necessary. Therefore, a structured analysis and evaluation of I4.0-solutions in the form of CPS is the topic of this paper. Firstly, the evaluation requirements are described. One main requirement is the life cycle-oriented analysis of CPS, because not only the implementation costs and expenditures are important, but also the prospective costs and benefits of the application of CPS. Afterwards, a decision theory-based procedure model is suggested to handle the complexity of a life cycle-oriented evaluation. Within the description of the steps of the procedure model, characteristics and challenges regarding the evaluation of CPS are discussed. Additionally, instruments and methods that support the evaluation of CPS are presented.
Download

Paper Nr: 73
Title:

A Real Data Analysis in an Internet of Things Environment

Authors:

João V. Poletti, Lucas E. Martins, Samuel Almeida, Maristela Holanda and Rafael D. Sousa Júnior

Abstract: The Internet of Things (IoT) emerged as a consequence of the advanced development of increasingly interconnected intelligent devices. These devices integrate within our environment to achieve specific goals that can relate to the areas of object tracking, health care, security, transport, and recreation. However, the number and variety of devices connected to the Internet is a problem that needs attention. The purpose of this paper is to present an analysis based on real data retrieved from devices inside an IoT universe. The paper proposes a strategy for data extraction as well as a method for handling the information by filtering it and applying an analysis in order to identify different types of measuring devices and techniques to validate the measurements retrieved from the objects. Two data mining techniques, linear regression and clustering, were used, and a third technique was developed. The results give different alternatives for the distribution of data in hypothetical devices that were inferred.
Download

Area 6 - IoT Technologies

Short Papers
Paper Nr: 21
Title:

In-Vehicle IoT Platform Enabling the Virtual Sensor Concept: A Pothole Detection Use-case for Cooperative Safety

Authors:

Ilaria Bosi, Enrico Ferrera, Daniele Brevi and Claudio Pastrone

Abstract: Nowadays the number of on-board sensors increases continuously due to their benefits in many different areas, such as driving efficiency, maintenance, autonomous driving, etc. Usually the vehicle itself and its users are the ones that benefit directly. By leveraging Internet-of-Things (IoT) technologies, it is possible to abstract data and functionalities provided by on-board sensors and actuators, exposing relevant services outside the vehicle to external cloud-based applications and other vehicles. With these technologies the vehicle is thus transformed into an IoT object which can be part of external IoT platforms. This work focuses on the design and implementation of an in-vehicle IoT platform which exposes internal functionalities as IoT services, also enabling the concept of “Virtual Sensor”, which leverages sensor fusion techniques to provide enhanced services combining raw data coming from on-board devices. This IoT platform solution is validated through a use case in which a virtual real-time pothole detection sensor is implemented to evaluate the road surface conditions. In this use case, multi-source sensing information - coming from 6LoWPAN sensors as well as smartphones and Inertial Measurement Units - is fused, enabling IoT applications such as cooperative safety and early road maintenance.
Download

Paper Nr: 49
Title:

Integrity Issues for IoT: From Experiment to Classification Introducing Integrity Probes

Authors:

Pascal Urien

Abstract: This paper presents a tentative classification of IoT devices. The goal is to provide a qualitative estimation of the risks induced by the device hardware and software resources involved in firmware update operations. We present technical features available in existing devices and comment on the associated threats. From this analysis we extract five basic security attributes: one time programmable memory, firmware downloader, secure firmware downloader, tamper resistant hardware, and diversified keys. From these parameters we deduce and comment on six security classes. We describe an innovative integrity probe, working with commercial programmers, whose goal is to verify the integrity of a bootloader.
Download

Paper Nr: 57
Title:

Enabling Distributed Intelligence in the Internet of Things using the IOTA Tangle Architecture

Authors:

Tariq Alsboui, Yongrui Qin and Richard Hill

Abstract: It is estimated that there will be approximately 26 to 30 billion Internet of Things (IoT) devices connected to the Internet by 2020. This presents research challenges in areas such as data processing, infrastructure scalability, and privacy. Several studies have demonstrated the benefits of using distributed intelligence to overcome these challenges. This article reviews existing state-of-the-art distributed intelligence approaches in IoT and focuses on the motivations and challenges for distributed intelligence in IoT. We propose a potential solution based on IOTA (Tangle), a platform that enables highly scalable transaction-based data exchange amongst large quantities of smart things in a peer-to-peer manner, together with mobile agents to support distributed intelligence. Challenges and future research directions are also discussed.
Download

Paper Nr: 29
Title:

IoT based Driver Information System for Monitoring the Load Securing

Authors:

Jurij Kuzmic, Günter Rudolph, Walter Roth and Michael Rübsam

Abstract: This paper presents an electronic cargo strap system for monitoring load securing in trucks and car trailers. Various measuring techniques and sensors for measuring the force on lashing belts are investigated. In addition, a data access layer (back end) and a presentation layer (front end) have been developed for the system in order to be able to monitor the load while driving. Moreover, radio data transmission, encryption of the transmitted data and the power supply of the systems have been realized. Furthermore, some prototypes have been created in order to test the developed systems. A series of practical tests has been performed to test the electronic cargo strap systems under real-world conditions.
Download

Area 7 - Security, Privacy and Trust

Full Papers
Paper Nr: 14
Title:

PSSST! The Privacy System for Smart Service Platforms: An Enabler for Confidable Smart Environments

Authors:

Christoph Stach, Frank Steimle, Clémentine Gritti and Bernhard Mitschang

Abstract: The Internet of Things and its applications are becoming increasingly popular. Smart Service Platforms like Alexa are in particularly high demand. Such a platform retrieves data from sensors, processes them in a back-end, and controls actuators in accordance with the results. In this way, all aspects of our everyday life can be managed. In this paper, we reveal the downsides of this technology by identifying its privacy threats based on a real-world application. Our studies show that current privacy systems do not tackle these issues adequately. Therefore, we introduce PSSST!, a user-friendly and comprehensive privacy system for Smart Service Platforms that limits the amount of disclosed private information while maximizing the quality of service at the same time.
Download

Paper Nr: 36
Title:

In-depth Comparative Evaluation of Supervised Machine Learning Approaches for Detection of Cybersecurity Threats

Authors:

Laurens D’hooge, Tim Wauters, Bruno Volckaert and Filip De Turck

Abstract: This paper describes the process and results of analyzing CICIDS2017, a modern, labeled data set for testing intrusion detection systems. The data set is divided into several days, each pertaining to different attack classes (DoS, DDoS, infiltration, botnet, etc.). A pipeline has been created that includes nine supervised learning algorithms. The goal was binary classification of benign versus attack traffic. Cross-validated parameter optimization, using a voting mechanism that includes five classification metrics, was employed to select optimal parameters. These results were interpreted to discover whether certain parameter choices were dominant for most (or all) of the attack classes. Ultimately, every algorithm was retested with optimal parameters to obtain the final classification scores. During the review of these results, execution time, both on consumer- and corporate-grade equipment, was taken into account as an additional requirement. The work detailed in this paper establishes a novel supervised machine learning performance baseline for CICIDS2017. Graphics of the results as well as the raw tables are publicly available at https://gitlab.ilabt.imec.be/lpdhooge/cicids2017-ml-graphics.
Download

Paper Nr: 45
Title:

In Reviews We Trust: But Should We? Experiences with Physician Review Websites

Authors:

Joschka Kersting, Frederik S. Bäumer and Michaela Geierhos

Abstract: The ability to openly evaluate products, locations and services is an achievement of Web 2.0. It has never been easier to inform oneself about the quality of products or services and possible alternatives. Forming one’s own opinion based on the impressions of other people can lead to better experiences. However, this presupposes trust in one’s fellows as well as in the quality of the review platforms. In previous work on physician reviews and the corresponding websites, it was observed that some reviewers behave improperly and that there are noteworthy differences in the technical implementation of the portals and in the efforts of site operators to maintain high quality reviews. These experiences raise new questions regarding what trust means on review platforms, how trust arises and how easily it can be destroyed.
Download

Paper Nr: 63
Title:

Semi Fragile Watermarking Technique using IWT and a Two Level Tamper Detection Scheme

Authors:

Nandhini Sivasubramanian and Gunaseelan Konganathan

Abstract: A semi fragile watermarking technique using a two level thresholding scheme for tamper detection is proposed. The proposed embedding technique uses a two level IWT (integer wavelet transform) to embed the authentication watermark. The authentication watermark generated from the approximate coefficients is stored in the detail coefficients using least significant bit substitution to form the watermarked image. The proposed tamper detection technique for identifying attacks in the watermarked image is a two level thresholding scheme using normalized Hamming similarity (NHS) and a tamper detection map. The performance of the proposed technique was evaluated for a variety of content preserving manipulations and malicious attacks. Compared to existing techniques, the proposed technique performs better in terms of an increased PSNR (Peak Signal to Noise Ratio) of the watermarked image and in localizing malicious attacks. This performance is due to the combined results from both the NHS and the tamper detection map, which help in localizing the malicious attacks and identifying the incidental manipulations. Also, the authentication watermark, which is a copy of the original image, helps in identifying the tampered regions in the attacked watermarked image.
Download
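Editorial sketch for Paper Nr: 63: normalized Hamming similarity is commonly defined as one minus the fraction of differing bits between the extracted and the reference watermark. The Python sketch below computes it and adds a block-wise mismatch map as a stand-in for a tamper detection map. The block size, both threshold values and the three-way decision are hypothetical illustrations, since the abstract does not give the paper's thresholds or decision rule.

```python
def normalized_hamming_similarity(extracted_bits, reference_bits):
    """Return 1 - (number of differing bits / total bits)."""
    if not reference_bits or len(extracted_bits) != len(reference_bits):
        raise ValueError("bit sequences must be non-empty and equal in length")
    mismatches = sum(a != b for a, b in zip(extracted_bits, reference_bits))
    return 1.0 - mismatches / len(reference_bits)


def tamper_map(extracted_bits, reference_bits, block_size):
    """Flag each block of `block_size` bits that contains a mismatch."""
    flags = []
    for start in range(0, len(reference_bits), block_size):
        extracted_block = extracted_bits[start:start + block_size]
        reference_block = reference_bits[start:start + block_size]
        flags.append(extracted_block != reference_block)
    return flags


def classify(nhs, authentic_threshold=0.95, incidental_threshold=0.75):
    """Hypothetical two-threshold decision on the NHS value."""
    if nhs >= authentic_threshold:
        return "authentic"
    if nhs >= incidental_threshold:
        return "incidental manipulation"
    return "malicious attack"


# Made-up 8-bit watermarks: the last block of four bits was altered.
reference = [1, 0, 1, 1, 0, 0, 1, 0]
extracted = [1, 0, 1, 1, 1, 1, 1, 0]
nhs = normalized_hamming_similarity(extracted, reference)
flags = tamper_map(extracted, reference, block_size=4)
```

The NHS value gives a global score for the whole image, while the map points to where the mismatches are concentrated, which is what makes localization possible.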

Short Papers
Paper Nr: 30
Title:

Linux Patch Management: With Security Assessment Features

Authors:

Soranut Midtrapanon and Gary Wills

Abstract: The lack of patch management has been identified as the main reason for many ransomware attacks. The cost of patch management is still an obstacle for many small and medium-size businesses. There are many open source, free of charge patch management systems, but these require many pre-configuration steps, making them complicated to use. Hence, this paper presents a patch management system that is cost-effective but also efficient in terms of set-up time. We have written the system in Python with Puppet and MCollective to aid the configuration steps. An additional feature of this system is the ability to assess the security of the system being patched, using CVE scanning.
Download

Paper Nr: 31
Title:

A Survey on RFID Security and Privacy in Smart Medical: Threats and Protections

Authors:

Xinghua Shi, Jinxuan Cao, Tianliang Lu and Victor Chang

Abstract: In recent years, with the rapid development of the Internet of Things, smart medical has been gradually integrated into people's lives. Among Internet of Things technologies, RFID is the most prominent application in the medical and health industry. However, these emerging technologies bring many security and privacy problems when they are integrated into people's lives. After examining the possible security and privacy threats brought by RFID in smart medical, this paper surveys the security requirements most suitable for this industry, and then compares current RFID security and privacy protection technologies to analyze whether they are suitable for the smart medical industry. Next, the survey sets out the most far-reaching RFID standards and summarizes their advantages and shortcomings in smart medical. Finally, the survey puts forward constructive suggestions on security and privacy protections for the hospitals and patients involved.
Download

Paper Nr: 41
Title:

A Security Framework to Protect Data in Cloud Storage

Authors:

Farashazillah Yahya, Victor Chang, Robert J. Walters and Gary B. Wills

Abstract: With the success and widespread adoption of Cloud Computing, Cloud storage has become the storage option of choice for many computer users wishing to keep their data online. This paper presents a framework to explore and evaluate security threats to data held in Cloud Storage. The Cloud Storage Security Framework (CSSF) has been developed both from consideration of established good practice as described in existing literature and from the opinions of cloud storage managers and experts, gathered using a questionnaire and separate interviews. The purpose of the framework is to help researchers and managers of Cloud storage understand the nine identified factors of security in Cloud storage and how to ensure security measures are successful. CSSF can also integrate with another framework to produce a greater impact and strengthen its research contributions.
Download

Paper Nr: 59
Title:

Knowledge Design in the Internet of Things: Blockchain and Connected Refrigerator

Authors:

Samuel Szoniecky and Amri Toumia

Abstract: The Internet of Things is part of our daily life, but many users do not understand their relationships and interactions with these objects. We assume that dynamic and interactive representations of the power of action of users and objects are a means to better understand what these devices are capable of. To do this, we propose a secure and privacy-conscious design of knowledge in the connected object environment. We analyze the example of a connected refrigerator to understand how to use the Blockchain to develop Digital Social Innovations.
Download

Paper Nr: 61
Title:

Search, Find and Resolve: Towards a Taxonomy for Searchable Encryption Schemes

Authors:

Ines Kramer, Silvia Schmidt, Mathias Tausig and Manuel Koschuch

Abstract: Searchable Encryption (SE) schemes are a promising solution to the problem of outsourcing one’s data to a cloud provider in a secure way, while still retaining the ability to search for and easily retrieve specific documents. A multitude of different schemes have been proposed and designed, yet in general they still lack usability/applicability for a specific use case or proper security analysis in order to be widely implemented and used. To address this issue we started a project to determine which SE schemes fit certain use cases - mainly focusing on usability. We examined nearly 400 papers on SE schemes from the last 13 years and extracted categorization domains for SE schemes. Furthermore we took a time-based look at these domains and tried to identify future trends in SE technologies. In this position paper we introduce our methodology and give a short overview of our current work-in-progress.
Download