Intelligent Hardware-Software Processing of High-Frequency Scanning Data

Abstract: The constant emission of polluting gases creates an urgent need for timely detection of harmful gas mixtures in the atmosphere. This article proposes a method and algorithm for determining the spectral composition of a gas with a gas analyzer that uses an artificial neural network (ANN). A small closed gas-dynamic system was designed and used as an experimental bench for collecting and quantifying gas concentrations in order to test the proposed method. The device is based on AS7265x and BMP180 sensors connected in parallel to a 3.3 V compatible Arduino Uno board via QWIIC. Experimental tests were conducted with air from the laboratory room, carbon dioxide (CO2), and a mixture of pure oxygen (O2) with nitrogen (N2) in a 9:1 ratio. Three ANNs, each with one input, one hidden, and one output layer, were built, with 5, 10, and 20 hidden neurons, respectively. The dataset was divided into three parts: 70% for training, 15% for validation, and 15% for testing. The mean squared error (MSE) and regression were analyzed during training. Training, validation, and testing errors were analyzed to find the optimal iteration, and the MSE was plotted against the training iteration. The ANN with five hidden neurons showed the best training results, and 16 iterations are enough to train, validate, and test it. To test the resulting neural network, program code was written in MATLAB. The proposed gas analyzer scheme is operable and detects gases with high accuracy, with a stated error of 3%. The results of the study can be used in the development of an industrial gas analyzer for the detection of harmful gas mixtures.


I. INTRODUCTION
Air pollution is one of the world's pressing problems, especially in urban areas of developing countries [1]-[3]. As technology advances, the number of industrial enterprises grows, and their safety must meet high standards. Timely detection and localization of combustible gases and vapors in the air of industrial premises and sites, at concentrations far below the explosive limit, is an important task for compliance with safety rules and fire safety standards [4]-[6].
Gaseous pollutants, which diffuse easily and are difficult to detect and to treat, have become some of the industrial wastes most harmful to human health [7], [8]. According to the World Health Organization (WHO), as of the beginning of August 2023, 91% of the world's population lived in places where air pollution exceeds WHO guideline limits, and 7 million people die every year as a result of exposure to fine particles in polluted air [9]. Fig. 1 shows data from the WHO Ambient Air Quality Database for 2012-2022 [10]. The atmosphere contains many trace gases, such as ozone (O3), methane (CH4), carbon monoxide (CO), nitrogen dioxide (NO2), hydrogen sulfide (H2S), and sulfur dioxide (SO2), which are present at certain concentrations and maintain a dynamic equilibrium. Continuous emissions of polluting gases from industry, power generation, and car exhaust gradually disturb this balance of gas concentrations in the atmosphere. As a result, increasingly serious problems, such as the greenhouse effect and various lung diseases, are emerging [11], [12]. For example, Shwetha, Sharath, Guruprasad, and Rudraswamy [13] state that carbon dioxide (CO2) has a harmful effect on the ecosystem, causing acid rain, increasing global temperatures, and ultimately affecting human health. Carbon dioxide has therefore traditionally been considered one of the most serious atmospheric pollutants. Yin et al. [14] explain that hydrogen sulfide disrupts the biological process of cell oxidation and prevents cellular respiration, which eventually leads to cell suffocation and hypoxia.
Artificial intelligence [15] and, in particular, artificial neural networks [16], [17] are currently developing rapidly. In medicine, ANNs are used [18], for example to develop new drugs [19] and to treat patients. In finance, ANNs are used to predict the prices of stocks [20], bonds, and other financial instruments. In manufacturing, ANNs are used for process optimization, product quality control [21], and logistics management [22]. ANNs can be retrained to improve their performance when new data becomes available.
The use of gas analyzers makes it possible to determine the concentration or type of the analyzed substance in a timely manner by measuring its physical or physicochemical properties. Using ANNs in gas analyzers will allow a wider variety of gases to be detected. A number of studies on the development of gas analyzers with ANNs already exist. These studies show that ANNs can be used to develop gas analyzers that have higher measurement accuracy than existing ones, are easier to operate, and do not require periodic calibration. Table I summarizes examples of research on gas analyzers of different configurations.
The purpose of this study was to develop a method and algorithm for determining the spectral composition of gas with a gas analyzer using an artificial neural network.

II. RELATED WORK
An artificial neural network (ANN) is a computational model, embodied in software and hardware, that is inspired by the way biological neural networks work. A neural network is a massively parallel processor consisting of elementary information-processing units that learns from experience and applies what it has learned to other tasks [28].
The concept of ANNs arose from the study of the processes that occur in the brain and the attempt to model these processes. McCulloch and Pitts [29] were the first scientists to attempt to describe these processes. They demonstrated that the brain consists of a large network of neural units, which act as simple and reliable logical elements. Abiodun et al. wrote in [30] that ANNs are a model of information management that is similar to the function of the biological nervous system of the human brain.
Artificial neural networks have recently become a popular and useful model for classification, clustering, pattern recognition, and prediction in many disciplines. ANNs are a type of machine learning model and have become relatively competitive with conventional regression and statistical models in terms of utility [31].
An artificial neural network is a system of connected and interacting simple processors, called artificial neurons. These processors are typically much simpler than the processors used in personal computers. Each processor in the network deals only with the signals it receives and the signals it sends to other processors. However, when connected into a large enough network with controlled interaction, these individually simple processors can perform rather complicated tasks [32].
For example, Lu et al. developed an artificial neural network (ANN) and a convolutional neural network (CNN) to predict the amorphous forming capacity of various amorphous alloys [42]. Bedford and Hanson investigated the performance of a recurrent neural network for image processing to detect delivery errors during portal dosimetry in volume-modulated arc therapy as early as possible in the treatment process [43]. Sinzinger, Kerkvoorde, Pahr, and Moreno applied spherical CNNs to estimate the apparent stiffness tensor of trabecular bone [44]. Raissi, Perdikaris, and Karniadakis applied physics-informed neural networks to solve forward and inverse problems involving nonlinear partial differential equations [45].
Each type of these gas analyzers has a number of advantages and disadvantages and is used in certain cases.For the present study, gas analyzers with infrared sensors are of the greatest interest.
The principle of operation of optical infrared sensors is based on the absorption of energy from a light beam by the molecules of the gas being detected in the ultraviolet, visible, or infrared region of the spectrum [55]. Existing gas analyzers mainly operate in the infrared region of the spectrum. Infrared sensors do not alter the sample and do not require oxygen to operate. The output signal of infrared sensors is relatively independent of the sample flow rate. They have a long service life and are not susceptible to corrosion, contamination, or mechanical damage. This type of sensor can use self-diagnostics to verify its sensitivity to the component being detected. Other advantages of this method include: a) high stability; b) no ambiguity of readings at concentrations exceeding the lower concentration limit of flame propagation; c) resistance to poisoning; d) less frequent maintenance due to self-diagnostics. Automatic calibration, the ability to monitor the infrared source for proper operation, and the ability to compensate for optical contamination can extend maintenance-free operation. However, special attention should be paid to the timely cleaning of the protective filters in the gas channel, as self-diagnostic tools usually do not detect their contamination. A schematic of the gas analyzer used in this study is shown in Fig. 2.

Fig. 2. Schematic diagram of the gas analyzer
The gas analyzer consists of a gas chamber (1), a set of visible, infrared, and ultraviolet radiation sources (2) controlled by the control unit (4), a sensor matrix (3), a preprocessing unit (5), a recognition unit (6), and a display unit (7).
In operation, the gas mixture under analysis enters the gas chamber (1) through a parabolic diffuser (8). Pulsed radiation is generated by the corresponding radiation source (a matrix of LEDs) (2), controlled by the control unit (4), and enters the chamber (1). The radiation passes through the measuring chamber (1), where part of the radiation energy is absorbed by the gas components, causing the formation of acoustic waves. These waves are detected by the sensor matrix (3), which consists of a set of spectral sensors and temperature and pressure sensors. The electrical signals from the sensors (3) are fed to the input of the preprocessing unit (5), which extracts several hundred parameters of the gas mixture. The signal carrying the calculated parameters of the gas mixture is fed from the preprocessing unit (5) to the input of the recognition unit (6), which consists of a trained neural network and a database of gas mixtures. The trained neural network, interacting with the database of gas mixtures, sets one of its outputs to the state "1", and this state is shown by the display unit (7).
If a beam of radiation interrupted at a certain frequency is directed into a vessel containing a gas that can absorb infrared radiation, a pressure pulsation occurs in the gas, which is subjectively perceived as sound. The pressure pulsation occurs because the gas molecules, absorbing photons of the incident radiation, pass into an excited state; the excitation energy of their vibrational-rotational degrees of freedom is then transferred, through inelastic collisions between molecules, into the energy of translational motion of the molecules, i.e., into heat, which corresponds to an increase in pressure. The parabolic emitter (8) in this design makes the rays pass through the chamber (1) multiple times, thereby increasing the gas pressure in the chamber.

B. Development of a Prototype Gas Analyzer for Data Collection
A large number of different sensors are available for making measurements [56], [57]. Sensors are widely used in various fields, such as scientific research, testing, quality control, automated control systems, and others [58]-[60]. The AS7265x and BMP180 sensors were used in the development of a prototype gas analyzer designed to determine the concentration of multicomponent gas mixtures in air under laboratory and industrial conditions.
The SparkFun Triad spectroscopy sensor is a powerful optical inspection sensor, also known as a spectrophotometer [61]. Three AS7265x spectral sensors are combined with UV and IR LEDs to illuminate and test various surfaces by light spectroscopy. The triad consists of three sensors, the AS72651, AS72652, and AS72653, which together detect light from 410 nm (UV) to 940 nm (IR) (Fig. 3). Each sensor has six independent optical filters, for a total of 18 output channels. The use of spectral sensors combined with ultraviolet and infrared LEDs greatly simplifies the design of the gas analyzer and increases the measurement accuracy [62]. The BMP180 sensor can be used to measure absolute atmospheric pressure in the range of 300 to 1100 hPa (+9000 to -500 meters above sea level) [63]. It can be used in home weather stations, in flying vehicles, as an altimeter, and in other applications. The GY-68 module, which is based on the BMP180 chip, combines an atmospheric pressure sensor and a thermometer. Fig. 4 shows a multispectral sensor system with artificial intelligence (AI) support. The AS7265x and BMP180 sensors are connected in parallel to a 3.3 V compatible Arduino Uno [64] board using a QWIIC cable. The Arduino board serves as a microcontroller that receives the digitized sensor signals and transmits them to a laptop computer via the USB port. The received spectral data are stored for preprocessing [65]. The preprocessed sample data are transferred to a non-linear neural network for further analysis and are used to train, test, and validate the neural network [66].
Fig. 5 shows a block diagram of the gas analyzer operation. To test the device, an experimental bench was built. It is a small, closed gas-dynamic system for collecting and quantitatively determining the gas concentration using sensors. The system consists of the following main components: an Arduino microcontroller board, AS7265x and BMP180 sensors, a container for circulating the analyzed gas mixture, a container for generating gas, a power supply unit, and a circulation pump (Fig. 7).

C. Data Collection and Preprocessing
The spectral data collected from the Arduino module is transferred to a PC via the USB port. The data is then converted, using the Microsoft Excel spreadsheet editor, into a form that machine learning algorithms can accept. Outliers, duplicate data, and redundant data beyond the standard size were also removed [67].
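The cleaning step described above can be illustrated with a minimal Python sketch. The source does not state how outliers were identified, so the z-score rule, the `z_limit` parameter, and the function name `preprocess` are assumptions made purely for illustration; the authors performed this step in a spreadsheet editor, not in code.

```python
from statistics import mean, stdev


def preprocess(rows, z_limit=3.0):
    """Drop duplicate rows, then rows with any channel far from its mean.

    The z-score criterion is an assumed outlier rule, not the paper's.
    """
    # Remove exact duplicates while preserving the original order.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    if len(unique) < 3:
        return unique  # too few samples to estimate a spread

    # Per-channel mean and sample standard deviation.
    columns = list(zip(*unique))
    means = [mean(col) for col in columns]
    stds = [stdev(col) for col in columns]

    # Keep a row only if every channel lies within z_limit deviations.
    kept = []
    for row in unique:
        inside = all(
            s == 0 or abs(v - m) <= z_limit * s
            for v, m, s in zip(row, means, stds)
        )
        if inside:
            kept.append(row)
    return kept
```

With small samples a z-score threshold of 3 can never be exceeded, so a robust rule (for example, one based on the median) may be preferable in practice.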

D. Neural Network Model
The preprocessed spectral data is fed as input to the neural network. Fig. 6 shows the network architecture, which has an input layer of 20 neurons, followed by hidden layers and an output layer with several neurons for multiclass classification. The input layer has 20 inputs because our circuit provides 20 channels with different wavelengths. In this network layout, i1, i2, ..., i20 represent the input neurons, wij the weights connecting the input layer to the hidden layer, wjm the weights between interconnected hidden layers, and wlk the weights connecting the hidden layer to the output layer.
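As an illustration of how one 20-element input sample might be assembled, the following Python sketch concatenates the 18 spectral channels of the AS7265x with the pressure and temperature readings of the BMP180. This composition is an assumption: the source states only that there are 20 inputs and that pressure and temperature data were also used. The function name `build_input_vector` is hypothetical.

```python
def build_input_vector(spectral_channels, pressure_hpa, temperature_c):
    """Return one 20-element input sample for the network.

    Assumed layout: 18 AS7265x channel readings followed by the BMP180
    pressure and temperature. The paper does not spell out this layout.
    """
    if len(spectral_channels) != 18:
        raise ValueError("expected 18 spectral channel readings")
    vector = [float(v) for v in spectral_channels]
    vector.append(float(pressure_hpa))
    vector.append(float(temperature_c))
    return vector
```

In practice the channels would likely be scaled to comparable ranges before training, since raw pressure values in hPa are much larger than typical normalized spectral readings.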
The neural network is trained using the error backpropagation method [68], [69]. In the training process, the weights are adjusted so that a set of inputs leads to the required set of outputs. It is initially assumed that each input set corresponds to a paired target set, which specifies the required output. Together, they form a training pair [70], [71].
Initially, the weights and biases are assigned randomly and processed together with the input data using the forward propagation method. The output error is calculated by comparing the actual value with the predicted value. The output error is then minimized using a gradient descent algorithm. The cross-entropy is used as a loss function [72].
To constrain the search space during training, the task is posed as minimizing the NN target error function, which is found using the least squares method [73], as shown in (1):

$$E = \frac{1}{2} \sum_{j=1}^{p} \left(y_j - d_j\right)^2 \qquad (1)$$

where $y_j$ is the value of the j-th output of the neural network, $d_j$ is the target value of the j-th output, and $p$ is the number of neurons in the output layer.
The Levenberg-Marquardt algorithm [74] solves such least-squares problems by combining the Gauss-Newton method with gradient descent. The block diagram of the error backpropagation algorithm is shown in Fig. 8.
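The training procedure described in this subsection can be sketched in plain Python for a network with one sigmoid hidden layer and a linear output layer, minimizing half the sum of squared output errors. This is a minimal illustration only: it uses ordinary gradient descent rather than the Levenberg-Marquardt algorithm, and the function names, learning rate, and weight initialization range are assumptions that do not come from the source, where training was carried out in MATLAB.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_in, n_hidden, n_out, seed=0):
    """Random weights; the last entry of each row is the bias."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
             for _ in range(n_out)]
    return w_hidden, w_out


def forward(net, x):
    """Forward propagation: sigmoid hidden layer, linear output layer."""
    w_hidden, w_out = net
    h = [sigmoid(sum(w * v for w, v in zip(row[:-1], x)) + row[-1])
         for row in w_hidden]
    y = [sum(w * v for w, v in zip(row[:-1], h)) + row[-1]
         for row in w_out]
    return h, y


def train_step(net, x, target, lr=0.1):
    """One backpropagation update on a single training pair.

    Returns half the sum of squared output errors before the update.
    """
    w_hidden, w_out = net
    h, y = forward(net, x)
    # Error term of each linear output neuron.
    delta_out = [yj - dj for yj, dj in zip(y, target)]
    # Error term of each hidden neuron: back-propagated output error
    # multiplied by the sigmoid derivative h * (1 - h).
    delta_hidden = [
        h[i] * (1.0 - h[i])
        * sum(delta_out[j] * w_out[j][i] for j in range(len(w_out)))
        for i in range(len(w_hidden))
    ]
    # Gradient-descent updates of the output and hidden weights.
    for j, row in enumerate(w_out):
        for i in range(len(h)):
            row[i] -= lr * delta_out[j] * h[i]
        row[-1] -= lr * delta_out[j]
    for i, row in enumerate(w_hidden):
        for k in range(len(x)):
            row[k] -= lr * delta_hidden[i] * x[k]
        row[-1] -= lr * delta_hidden[i]
    return 0.5 * sum(d * d for d in delta_out)
```

Calling `train_step` repeatedly over all training pairs corresponds to one epoch per pass; the returned error can be accumulated to plot the error against the epoch number.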

IV. RESULTS
For experimental tests of the laboratory bench (Fig. 7), a sample of air from the laboratory room, carbon dioxide (CO2), and a mixture consisting of pure oxygen (O2) with nitrogen (N2) in a 9:1 ratio were fed in turn into the container for circulation of the analyzed gas mixture. The experiments were performed in a well-ventilated room of 20 m².
Using the NNStart package, a neural network (NN) with one input, one hidden, and one output layer was designed. The neurons in the hidden layer have a sigmoid activation function [77], and those in the output layer have a linear function [78]. Two types of analysis were performed to determine the number of neurons in the hidden layer. The first revealed how the training time depends on the number of neurons (Fig. 9). As the graph shows, the training time of the neural network increases sharply when the number of neurons exceeds 20. The second analysis showed how the number of neurons affects the magnitude of the gradient of the error function (Fig. 10). As the graph shows, the gradient of the training error increases sharply in the range of 20 to 50 neurons, which indicates overfitting. In such cases, it is usually recommended to reduce the number of hidden elements and/or layers, because the network is too powerful for the task. The most favorable situation is when the training error decreases; on the graph, this is observed in the range of 4 to 15 neurons. After analyzing both graphs, it was decided to build three neural networks with 5, 10, and 20 neurons and compare the results.

A) A Neural Network Model with Five Neurons in a Hidden Layer
The spectral data samples were organized such that 70% of the data set was used for training, 15% for validation, and the remaining 15% for testing. The training data was used to train the network, whose weights were adjusted according to the error. The validation sample was used to evaluate the performance of the network and to stop training when the network was no longer improving. The test samples were used to measure the accuracy of the network on unseen data. The results of training are shown in Table II. Table III shows the number of observations, mean squared error (MSE), and R-squared during training. Training, testing, and validation errors were analyzed to find the optimal number of epochs. The gradient, cross-entropy, and validation loss during neural network training are shown in Fig. 11. The best validation accuracy is achieved at epoch 16, where the validation error is minimized and reaches zero. This may be a consequence of the loss function used to train the model. The cross-entropy loss decreases as the number of iterations increases, which means that the model learns well from the data: training minimizes the distance between the predicted value and the actual sample value. From the entire error analysis, we can conclude that the model requires only 16 epochs to train and set the optimal weight values. For a perfect match, the data must lie along a line at a 45-degree angle, where the network output is equal to the target values. As Fig. 14 shows, the fit for all datasets has an R value of 1, which corresponds to a perfect match.
For the network with twenty hidden neurons, Fig. 19 shows the gradient, cross-entropy, and failure rate during neural network training. Fig. 20 shows a graph of the mean squared error (MSE) as a function of training epochs. From the error analysis, it can be concluded that the model requires 50 epochs to train and set the optimal weight values, which is significantly more than for the two previous neural network variants. As Fig. 21 shows, the model is well optimized because the error values are small.
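The role of the validation sample in stopping training can be illustrated with a short Python sketch. It assumes a simple patience rule: training stops once the validation error has failed to improve for a fixed number of consecutive epochs, and the epoch with the lowest validation error is reported. The `patience` parameter and the function name are assumptions; the source does not state the exact stopping criterion used.

```python
def early_stopping_epoch(val_losses, patience=6):
    """Return (best_epoch, best_loss) under a patience-based stop rule.

    val_losses holds the validation error recorded after each epoch.
    """
    best_epoch = 0
    best_loss = float("inf")
    fails = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New minimum: remember it and reset the failure counter.
            best_loss, best_epoch, fails = loss, epoch, 0
        else:
            fails += 1
            if fails >= patience:
                break  # validation error stopped improving
    return best_epoch, best_loss
```

Under such a rule, the epoch counts reported for each network would correspond to the epoch with the lowest validation error rather than to the last epoch that was run.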

D) Testing of the Neural Network Model
To test the neural network, we wrote a program in MATLAB. In turn, we fed a sample of laboratory air, carbon dioxide (CO2), and a mixture of pure oxygen (O2) and nitrogen (N2) in a 9:1 ratio into the circulation container for the analyzed gas mixture of the assembled laboratory bench. The sensor data from the assembled laboratory bench was fed sequentially into the program input. After additional preparation, the data was transferred to the input of each of the neural networks. The program output was a visualization of the gas mixture. Fig. 23 shows the output window of the program. The program results show that there are gases unknown to the database.
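One way a recognition step could report gases that are unknown to the database is sketched below in Python. The rule assumed here is that the network output with the highest value selects the gas mixture, and that the result is reported as unknown when no output reaches a threshold. The threshold value, the label strings, and the function name `recognize` are illustrative assumptions and are not taken from the authors' MATLAB program.

```python
def recognize(outputs, labels, threshold=0.5):
    """Map network outputs to a gas label, or "unknown" if none is confident.

    outputs: one value per known gas mixture, ideally close to 0 or 1.
    labels: names of the gas mixtures stored in the database.
    """
    if len(outputs) != len(labels):
        raise ValueError("outputs and labels must have the same length")
    # Index of the strongest output.
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    if outputs[best] < threshold:
        return "unknown"
    return labels[best]
```

For example, with the three mixtures used in the experiments as labels, an output vector in which no element reaches the threshold would be reported as an unknown gas.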

V. CONCLUSIONS
The aim of this study was to develop a method and algorithm for determining the spectral composition of gas. Advanced spectroscopic methods and sensors were used, including sensors that provide data in the UV and IR range. Gas pressure and temperature data were also used to increase the accuracy of the analysis and the measurement range of the input data. Neural networks were used to analyze the spectral composition of gas mixtures with high accuracy, even in the presence of unknown gases.
An experimental bench was built to test this concept. It is a small, closed gas-dynamic system for collecting and quantifying gas concentrations using sensors. The system consists of the following main components: an Arduino microcontroller board, AS7265x and BMP180 sensors, a container for circulating the analyzed gas mixture, a container for generating gas, a power supply unit, and a circulation pump.
A sample of laboratory air, carbon dioxide (CO2), and a mixture consisting of pure oxygen (O2) and nitrogen (N2) in a 9:1 ratio were fed into the circulation container for the analyzed gas mixture in turn.
Three neural networks were constructed using the NNStart package of MATLAB. Each neural network has one input layer, one hidden layer, and one output layer. The networks differed in the number of hidden neurons.
Training and construction of each of the obtained neural networks showed good results. However, the graph of the network error during training is the most successful and demonstrative for the neural network with five hidden neurons. The MSE values for training, validation, and testing are also the best for the neural network with five hidden neurons.
The results of this study show that the proposed gas analyzer scheme is functional and that the software created provides an effective analysis of the composition of gas mixtures. Experimental studies have shown a high accuracy of gas detection for this type of device, with a stated error of 3%. A neural network with five hidden neurons and 16 iterations is sufficient for this purpose. The proposed gas analyzer scheme is based on the model described in the patent "Gas analyzer" No. 5141 dated 10.07.2020 [79], which was further developed in the patent "Intelligent gas analyzer" No. 8288 dated 21.03.2023 [80], using neural networks for decision analysis.
In the future, it is planned to implement an autonomous gas analyzer based on a Raspberry Pi or Arduino Mega board. This will make the device more compact and portable, and the use of a trained database will allow the proposed method to be implemented effectively on FPGAs, making the device more convenient and autonomous for use in the field. It is also planned to test this concept using convolutional neural networks, which are expected to simplify calculations while maintaining accuracy. This would increase the speed of data processing, enable real-time data analysis, and yield more accurate results. These improvements would make the device more reliable and durable and allow it to be used in industrial conditions.

Fig. 8. Block diagram of the error backpropagation algorithm
The Neural Network Start (NNStart) package of MATLAB [75], [76] was used to implement ANN training. The NNStart package supports curve fitting, pattern recognition, object clustering, and time series approximation.
At the initial stage of neural network design, the sizes of the training, validation, and test samples are determined. The training sample is used to train the ANN. The validation sample is used to evaluate the generalization properties of the network and to stop training when generalization stops improving. The test sample has no effect on training, but serves to check the quality of training on data that was not used in training the network. The larger the sample, the more accurate the results the neural network will produce.
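The division into training, validation, and test samples can be sketched in Python as follows, using the 70%, 15%, and 15% proportions reported in this paper. The random shuffling, the fixed seed, and the function name `split_dataset` are assumptions made for illustration; the source performs the division inside the MATLAB tool.

```python
import random


def split_dataset(samples, train=0.70, val=0.15, seed=0):
    """Shuffle the samples and split them into train/validation/test parts.

    The test part receives whatever remains after the first two parts.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * train)
    n_val = round(n * val)
    train_part = shuffled[:n_train]
    val_part = shuffled[n_train:n_train + n_val]
    test_part = shuffled[n_train + n_val:]
    return train_part, val_part, test_part
```

Fixing the seed keeps the split reproducible, which makes it possible to compare networks with different numbers of hidden neurons on the same data.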

Fig. 9. Dependence of NN training time on the number of neurons in the hidden layer

Fig. 10. Dependence of the gradient of the neural network error function on the number of neurons in the hidden layer

Fig. 11. Results of the training process

Fig. 12 shows a graph of the relationship between the mean squared error (MSE) and the training epoch. The variations of the error for three data sets are shown: training, validation, and test. We can see that the error decreases significantly by the end of the training process.

Fig. 12. Variation of network error during the training process
The error histogram provides an additional check on network performance. The blue bars represent training data, the green bars represent validation data, and the red bars represent testing data. The histogram can give an idea of outliers, that is, data points whose fit is significantly worse than that of most data. As Fig. 13 shows, the model is well optimized, since the errors are small.

Fig. 14. Linear regression of neural network output and targets

B) A Neural Network Model with Ten Neurons in a Hidden Layer
Sample division parameters: training 70%, validation 15%, testing 15%. The number of hidden neurons is 10. The results of training are shown in Table IV. Table V shows the number of observations, mean squared error (MSE), and R-squared during training. Fig. 15 shows the gradient, cross-entropy, and failure rate during neural network training.

Fig. 16 shows a plot of the mean squared error (MSE) as a function of training epochs. From the error analysis, we can conclude that the model requires 19 epochs to train and set the optimal weight values. As Fig. 17 shows, the model is well optimized because the error values are small.

Fig. 16. Variation of network error during the training process
Fig. 17. Error histogram

Fig. 18 shows a linear regression of the neural network's output and targets for the training, validation, and testing datasets.
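The two quality measures used throughout the results, the mean squared error and the correlation coefficient R between network outputs and targets, can be computed as in the following Python sketch. The function names are hypothetical and the code is only an illustration of the definitions, not the MATLAB routines used by the authors.

```python
import math


def mse(outputs, targets):
    """Mean squared error between network outputs and target values."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)


def pearson_r(outputs, targets):
    """Correlation coefficient R between outputs and targets.

    R equals 1 when all points lie on a straight line with positive
    slope; a 45-degree line, where outputs equal targets, is one such case.
    """
    n = len(targets)
    mean_o = sum(outputs) / n
    mean_t = sum(targets) / n
    cov = sum((o - mean_o) * (t - mean_t) for o, t in zip(outputs, targets))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in outputs))
    spread_t = math.sqrt(sum((t - mean_t) ** 2 for t in targets))
    return cov / (spread_o * spread_t)
```

Note that R alone does not guarantee that outputs equal targets, since any positively sloped straight line gives R = 1; this is why the MSE is reported alongside it.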

Fig. 18. Linear regression of neural network output and targets

C) A Neural Network Model with Twenty Neurons in a Hidden Layer
Sample division parameters: training 70%, validation 15%, testing 15%. The number of hidden neurons is 20. The results of training are shown in Table VI. Table VII shows the number of observations, mean squared error (MSE), and R-squared during training.

Fig. 19. Results of the training process

Fig. 22. Linear regression of neural network output and targets

TABLE I. EXAMPLES OF RESEARCH ON THE DEVELOPMENT OF GAS ANALYZERS

TABLE II. NEURAL NETWORK TRAINING RESULTS

TABLE IV. NEURAL NETWORK TRAINING RESULTS

TABLE VI. NEURAL NETWORK TRAINING RESULTS