The new industrial revolution, now called Industry 4.0, pursues the adoption of Cyber-Physical Systems (CPS) to create real-time, precise and reliable monitoring systems that feed analytics solutions supporting the automation, control and improvement of business processes. Compared with current solutions, the pervasiveness of the Internet of Things, together with the ability to manage and process large amounts of data in real time, makes Industry 4.0 a paradigm that can bring many advantages to the Factories of the Future.

With the development of IIoT connectivity and cloud computing infrastructures, a huge amount of data becomes available to boost new business models supported by data analytics. IIoT connectivity, cloud computing and data analytics are the three main pillars of IT-based innovation and transformation in manufacturing.

But there is also a real world, represented by the automation pyramid, whose levels are sensors and embedded systems, cyber-physical components, machines, production lines and factories. This real world is mirrored in the data realm as a virtual world (Figure 1).

Figure 1. Real and virtual world in manufacturing

Manufacturing firms not only seek innovations in manufacturing techniques but are also beginning to focus on value-added services and new business models, blurring the boundary between the manufacturing and service industries.

In this context, data is meant to be the main transformation asset. The digitization of industry is a necessary condition to foster the digital economy and data-driven services. The challenge is complex, but a good solution can be outlined around three ideas: treating data as an asset for transformation, adopting IT resources to manage this asset, and bringing in talented data scientists to transform the organization.


A real-world example application

To address these challenges, IK4-IDEKO has defined a data management framework (Figure 2) that will be empowered by the DITAS platform.

Figure 2. IK4-IDEKO data management architecture

The data management architecture supports powerful data engineering platforms in order to fulfill current and forthcoming customer needs with a proper quality of service and commissioning cost. The digitization of industry is not just a matter of technical feasibility but also of return on investment (ROI) and productivity. Productivity in data-intensive manufacturing applications has two aspects. On the one hand, the impact on the shop floor has to be minimal, avoiding machine downtime when setting up the data gathering layer. On the other hand, the Business Intelligence services have to be implemented and published efficiently.

The first aspect depends on the capability to connect to different types of machines, read data and send it to the cloud for persistence. IK4-IDEKO’s data gathering devices can connect to diverse numerical controls such as Heidenhain, Fanuc or Siemens, and also to data gathering sensors or platforms. Once a device is connected to a machine, a web management tool can be used to configure it remotely and define the signals or variables that will be read from the machine, with no impact on production.
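
As an illustration of what such a remote signal definition might look like, the following is a minimal sketch in Python. It is not IK4-IDEKO's actual configuration format: the class names, field names, signal names and sampling periods are all hypothetical, and only the controller families come from the text above.

```python
from dataclasses import dataclass, field
from typing import List

# Controller families mentioned in the text; the set itself is illustrative.
SUPPORTED_CONTROLLERS = {"heidenhain", "fanuc", "siemens"}


@dataclass(frozen=True)
class Signal:
    """One variable to be read from the machine (hypothetical schema)."""
    name: str        # e.g. "spindle_speed"
    unit: str        # e.g. "rpm"
    period_ms: int   # sampling period in milliseconds


@dataclass
class DeviceConfig:
    """Configuration pushed to a data gathering device (hypothetical schema)."""
    device_id: str
    controller: str
    signals: List[Signal] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the config is usable."""
        errors = []
        if self.controller.lower() not in SUPPORTED_CONTROLLERS:
            errors.append(f"unsupported controller: {self.controller}")
        names = [s.name for s in self.signals]
        if len(names) != len(set(names)):
            errors.append("duplicate signal names")
        for s in self.signals:
            if s.period_ms <= 0:
                errors.append(f"{s.name}: period_ms must be positive")
        return errors


# Example: three signals defined for a device attached to a Fanuc control.
config = DeviceConfig(
    device_id="gateway-01",
    controller="Fanuc",
    signals=[
        Signal("spindle_speed", "rpm", 100),
        Signal("spindle_power", "kW", 100),
        Signal("temperature", "degC", 1000),
    ],
)
problems = config.validate()
```

Validating the definition before it reaches the device is one way to keep a configuration mistake from disturbing the data gathering layer once it is running.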

The second aspect, productivity in the development of data-intensive applications, depends on several key factors:

  • Data engineering processes to move data from the field level to the cloud level
  • Flexible data persistence model
  • Domain oriented analytical data space
  • Visual Analytics platform

Data architecture

Managing, transforming and treating the data is the most important stage in a data-driven approach, because it is what makes sense of a myriad of variables such as temperature, speed, override, power, revolutions and vibration.

The data persistence layer is divided into two parts. First, a data lake model and/or a NoSQL database is used for general storage, where data from the machines is stored and tagged with metadata to provide data lifecycle and management capabilities. The second part of the persistence layer is an analytical database based on a data mart model, supported by a Qlik Business Intelligence Server, with different data models defined per machine. Extract, Transform and Load (ETL) data engineering processes update and aggregate the analytical database from the raw data storage.
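
The flow from raw storage to the analytical database can be sketched as a small ETL pipeline. The example below is only an illustration using the Python standard library: the record layout, the hourly aggregation and the in-memory dictionary standing in for the data mart are assumptions, not the actual IK4-IDEKO or Qlik implementation, and the sample values are made up.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Illustrative raw records as they might sit in the data lake,
# tagged with machine and signal metadata (hypothetical layout).
raw_records = [
    {"machine": "M1", "signal": "temperature", "ts": "2018-05-02T10:05:00", "value": 41.0},
    {"machine": "M1", "signal": "temperature", "ts": "2018-05-02T10:35:00", "value": 43.0},
    {"machine": "M1", "signal": "temperature", "ts": "2018-05-02T11:10:00", "value": 47.0},
    {"machine": "M2", "signal": "power", "ts": "2018-05-02T10:20:00", "value": 5.5},
    {"machine": "M2", "signal": "power", "ts": "2018-05-02T10:40:00", "value": None},
]


def extract(records):
    """Extract: read raw records, skipping those without a value."""
    for record in records:
        if record.get("value") is not None:
            yield record


def transform(records):
    """Transform: aggregate values per machine, signal and hour."""
    buckets = defaultdict(list)
    for record in records:
        hour = datetime.fromisoformat(record["ts"]).replace(
            minute=0, second=0, microsecond=0
        )
        buckets[(record["machine"], record["signal"], hour)].append(record["value"])
    return [
        {
            "machine": machine,
            "signal": signal,
            "hour": hour.isoformat(),
            "n": len(values),
            "mean": mean(values),
            "max": max(values),
        }
        for (machine, signal, hour), values in sorted(buckets.items())
    ]


def load(rows, mart):
    """Load: append aggregated rows to a per-machine table in the mart."""
    for row in rows:
        mart.setdefault(row["machine"], []).append(row)
    return mart


# A plain dict keyed by machine stands in for the analytical database.
data_mart = load(transform(extract(raw_records)), {})
```

Keeping one table per machine mirrors the idea of data models defined per machine, while the raw records remain untouched in general storage for later reprocessing.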

Visualization layers

The visualization layer is also divided into two parts. The first one is focused on machine monitoring. The information is shown in real time, and the monitoring can cover production (state, alarms), process (current machining process) or condition (health and symptoms). For this purpose, there is a web interface generator with a toolbox of visualization widgets, so that end users can design their own ad-hoc windows. A process monitoring interface is displayed in Figure 3.

Figure 3. Process monitoring

In this scenario, the data lake provides the real-time streaming data presented in the web application.

The second part of the visualization layer is focused on visual analytics and insights, in order to enhance exploratory data analysis (EDA) and communication. For this purpose, Qlik Business Intelligence tools are integrated in the platform (Figure 4).

Figure 4. EDA using third party tools


The digitization of industry, and the ability to benefit from digital opportunities, depends to a great extent on setting up strong data engineering and data science foundations. Moving and processing large amounts of data carries significant costs, and the ROI can only be guaranteed by lean development processes and infrastructures based on app and data factories. DITAS will have an important role to play here, helping to avoid drowning in the data lake.