Research Article

Do Properly Validated Networks Ensure Minimum Physical Consistency with Reality?


1. Abstract

The flexibility and potential of Single-layer Feedforward Neural Networks (SFNN), the typical shallow neural networks, in providing good models for both regression and classification problems have already been widely demonstrated through applications in different areas of knowledge. Additionally, the incorporation of complex phenomenological models into network training (Physics-Informed Neural Networks, PINN) has efficiently combined the identification of a black-box model with its consistency with the physics of the analyzed process. Between the extremes of incorporating or not incorporating a description of the phenomenon into the neural model, however, there lies a vast number of applications (perhaps the majority) in which a classical mathematical model derived from conservation laws is not available. Even in these cases, in which a mathematical representation of the physics of the phenomenon is unavailable, or even infeasible, some properties of the final black-box model must be consistent with basic features and behaviors observed in the real problem.

 

2. Consistency of Neural Models with Respect to Elementary Physical Features. Is this Always Guaranteed?

Physics-Informed Neural Networks (PINN) and Physics-Guided Neural Networks (PGNN) incorporate equations derived from conservation laws (partial/ordinary differential and algebraic equations) into the training of shallow and deep neural networks1. PINN have been applied in different areas, in addition to providing a strategy for solving partial differential equations2-8. All these applications rely on the availability of a phenomenological model capable of describing, at some level of complexity, the physics of the phenomenon from which the training data were extracted. On the other hand, there are many situations in which no theoretical model is capable of mathematically describing the problem analyzed, and the knowledge acquired is basically restricted to the available experimental and/or operational data.
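The core idea of such physics-informed training can be sketched with a composite loss that penalizes both the data misfit and the residual of a known conservation law. The sketch below is purely illustrative and is not taken from any of the cited works: the "physics" is a simple first-order decay ODE, dy/dt = -k y, and the residual is approximated by finite differences rather than automatic differentiation.

```python
import numpy as np

def physics_informed_loss(t, y_pred, y_data, k=1.0, lam=0.5):
    """Data-fit MSE plus the mean squared residual of dy/dt + k*y = 0.

    lam weights the physics term against the data term (illustrative choice).
    """
    data_loss = np.mean((y_pred - y_data) ** 2)
    dydt = np.gradient(y_pred, t)        # finite-difference derivative of the prediction
    residual = dydt + k * y_pred         # zero wherever the ODE is satisfied
    physics_loss = np.mean(residual ** 2)
    return data_loss + lam * physics_loss

t = np.linspace(0.0, 2.0, 50)
y_exact = np.exp(-t)                     # exact solution of dy/dt = -y
# A prediction that matches both the data and the physics yields a near-zero loss:
print(physics_informed_loss(t, y_exact, y_exact))
```

During PINN training this composite loss is minimized with respect to the network weights, so predictions that fit the data while violating the governing equation are penalized.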

 

Table 1 presents 3 case studies comprising real data sets used as benchmarks in several works involving steady-state regression [UCI Machine Learning Repository]9. The first one [Computer Hardware]10 proposes a model to evaluate the performance of Central Processing Units (CPU), considering one output (CPU relative performance) and 6 inputs (cache memory size, u1; minimum number of I/O channels, u2; maximum number of I/O channels, u3; machine cycle time, u4; minimum main memory, u5; and maximum main memory, u6). The other two case studies are related to predicting the toxicity of chemicals for different species of fish11,12. In both cases, the output is the concentration of product in water that results in the death of 50 percent of the aquatic test specimens within 96 hours. Eight and six different molecular descriptors are considered as inputs in the first (QSAR Aquatic Toxicity) and second (QSAR Fish Toxicity) case studies, respectively (xi, i = 1, …, 8; zj, j = 1, …, 6).

 

Table 1: Data sets.

| Data set              | Sample size | Number of features | Output                   | Output type | Reference |
|-----------------------|-------------|--------------------|--------------------------|-------------|-----------|
| Computer Hardware     | 209         | 6                  | CPU relative performance | Integer     | 10        |
| QSAR Aquatic Toxicity | 546         | 8                  | Acute aquatic toxicity   | Continuous  | 11        |
| QSAR Fish Toxicity    | 908         | 6                  | Acute aquatic toxicity   | Continuous  | 12        |

 

The three data sets (Table 1) were used as case studies in a recent work13 that proposes an innovative approach (Weight Initialization based on Linearization combined with a Constructive Algorithm for Regression problems, WILCAR) for initializing weights and defining the number of hidden units in a typical Single-layer Feedforward Neural Network (SFNN) aimed at developing regression models. Two classic weight initialization and training methods were also used to evaluate the performance of the proposed approach, namely Random Initialization by the Xavier Method [RIXM]14 and the Extreme Learning Machine [ELM]15,16. While both WILCAR and RIXM involve a gradient-based training approach (backpropagation), the ELM method belongs to the class of gradient-free learning algorithms. The methods were applied to the three data sets (Table 1) without incorporating any physical information, since no theoretical model was available, at any level of complexity, capable of describing the physics of each phenomenon. Validation results based on consolidated metrics (Root Mean Square Error, RMSE; and coefficient of determination, R2) showed good performance of the neural models obtained with all learning methods, especially with the newly proposed WILCAR approach13.
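Two of the ideas mentioned above can be sketched compactly. The snippet below shows a generic Xavier/Glorot uniform initialization and a minimal ELM fit, in which the random hidden-layer weights are kept fixed and only the output weights are obtained by least squares; it is an assumed generic form of these methods, not the exact implementations compared in the cited work.

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_init(n_in, n_out):
    """Xavier/Glorot uniform initialization for a weight matrix."""
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

def elm_fit(X, y, n_hidden=50):
    """ELM: fixed random hidden layer; output weights solved by least squares."""
    W = xavier_init(X.shape[1], n_hidden)   # hidden weights are never trained
    H = np.tanh(X @ W)                      # hidden-layer activations
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, beta

def elm_predict(X, W, beta):
    return np.tanh(X @ W) @ beta

# Toy steady-state regression: output is a smooth function of two inputs
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(X @ np.array([1.5, -0.7]))
W, beta = elm_fit(X, y)
rmse = np.sqrt(np.mean((elm_predict(X, W, beta) - y) ** 2))
print(f"training RMSE: {rmse:.4f}")
```

Because the least-squares step is a one-shot linear solve, ELM needs no backpropagation, which is why it is classed above as a gradient-free learning algorithm.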

 

Although there is no mathematical description of the phenomenon, even in situations of this type it is possible to have, for example, prior knowledge of the direction of the effect of a given input on the output (among other qualitative characteristics of the phenomenon's behavior), which corresponds to the sign of the static gain. Therefore, it is appropriate to check whether the neural model predicts static gain signs consistent with what is expected for the entire set (or a subset) of input-output pairs.

 

Table 2 to Table 4 present the obtained and expected gain signs for each case study. The expected signs are based on information collected from experts on each phenomenon, in accordance with the references related to each of the experiments10-12.

 

Table 2: Computer hardware.

| Input | WILCAR | RIXM | ELM | Expected |
|-------|--------|------|-----|----------|
| u1    | +      | +    | +   | -        |
| u2    | +      | +    | +   | +        |
| u3    | +      | +    | -   | +        |
| u4    | +      | +    | +   | +        |
| u5    | -      | -    | +   | +        |
| u6    | +      | +    | +   | +        |

 

Table 3: QSAR aquatic toxicity.

| Input | WILCAR | RIXM | ELM | Expected |
|-------|--------|------|-----|----------|
| x1    | +      | -    | +   | +        |
| x2    | -      | -    | -   | +        |
| x3    | -      | +    | +   | +        |
| x4    | +      | +    | +   | +        |
| x5    | +      | +    | +   | +        |
| x6    | -      | -    | -   | -        |
| x7    | -      | -    | -   | +        |
| x8    | -      | -    | +   | +        |

 

Table 4: QSAR fish toxicity.

| Input | WILCAR | RIXM | ELM | Expected |
|-------|--------|------|-----|----------|
| z1    | +      | +    | +   | -        |
| z2    | +      | +    | +   | +        |
| z3    | -      | -    | -   | -        |
| z4    | +      | +    | +   | +        |
| z5    | +      | +    | -   | +        |
| z6    | +      | +    | +   | +        |

 

Table 2 to Table 4 show that static gain sign prediction errors occur in all 3 case studies and with all weight initialization and training methods (WILCAR, RIXM, and ELM). It is reasonable to consider that the direction of the effect of a given input on a given output in a regression model (dynamic or static) constitutes an elementary but important feature that must be evaluated, verified, and satisfied, even when there is no phenomenological model capable of describing the physics of the problem (if sufficient information is available, other characteristic behaviors of the phenomenon should also be verified and satisfied). The results presented in Tables 2-4 show that properly validated neural models obtained with different initialization and training strategies do not necessarily guarantee static gains consistent with expectations, nor other possibly relevant qualitative behaviors.
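The closing point can be made concrete with a small synthetic illustration (the data and "fitted model" below are invented for demonstration only): when one input has a weak effect on the output, a model can achieve an excellent R2 while predicting the wrong sign for that input's static gain.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(500, 2))
y_true = X[:, 0] + 0.05 * X[:, 1]      # true static gain of input 1 is +0.05

y_model = X[:, 0] - 0.05 * X[:, 1]     # hypothetical fitted model: wrong sign on input 1

# Standard validation metric: coefficient of determination
ss_res = np.sum((y_true - y_model) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(f"R^2 = {r2:.3f}")               # high, despite the inconsistent gain sign
```

The misfit contributed by the weak input is small relative to the output variance, so RMSE and R2 barely register the error; only an explicit gain-sign check exposes the physical inconsistency.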

 

3. References

  1. Kiyani, E., Shukla, K., Karniadakis, G.E., Karttunen, M., 2023. A framework based on symbolic regression coupled with eXtended Physics-Informed Neural Networks for gray-box learning of equations of motion from data. Comput. Methods Appl. Mech. Engrg. 415, 116258.
  2. Liu, J., Jiang, R., Zhao, J., Shen, W., 2023. A quantile-regression physics-informed deep learning for car-following model. Transp. Res. Part C 154, 104275.
  3. Xu, J., Wei, H., Bao, H., 2023. Physics-informed neural networks for studying heat transfer in porous media. Int. J. Heat Mass Transf. 217, 124671.
  4. Zhou, T., Zhang, X., Droguett, E.L., Mosleh, A., 2023. A generic physics-informed neural network-based framework for reliability assessment of multi-state systems. Reliab. Eng. Syst. Saf. 229, 108835.
  5. Mai, H.T., Truong, T.T., Kang, J., Mai, D.D., Lee, J., 2023. A robust physics-informed neural network approach for predicting structural instability. Finite Elem. Anal. Des. 216, 103893.
  6. Qu, H.-Y., Zhang, J.-L., Zhou, F.-J., Peng, Y., Pan, Z.-J., Wu, X.-Y., 2023. Evaluation of hydraulic fracturing of horizontal wells in tight reservoirs based on the deep neural network with physical constraints. Pet. Sci. 20, 1129–1141.
  7. Borkowski, L., Sorini, C., Chattopadhyay, A., 2022. Recurrent neural network-based multiaxial plasticity model with regularization for physics-informed constraints. Comput. Struct. 258, 106678.
  8. Chen, Y., Xu, Y., Wang, L., Li, T., 2023. Modeling water flow in unsaturated soils through physics-informed neural network with principled loss function. Comput. Geotech. 161, 105546.
  9. Bache, K., Lichman, M., 2013. UCI Machine Learning Repository.
  10. Ein-Dor, P., Feldmesser, J., 1987. Attributes of the performance of central processing units: a relative performance prediction model. Commun. ACM 30, 308–317.
  11. Cassotti, M., Ballabio, D., Consonni, V., Mauri, A., Tetko, I.V., Todeschini, R., 2014. Prediction of acute aquatic toxicity towards Daphnia magna using GA-kNN method. Altern. Lab. Anim. 42, 31–41.
  12. Cassotti, M., Ballabio, D., Todeschini, R., Consonni, V., 2015. A similarity-based QSAR model for predicting acute toxicity towards the fathead minnow (Pimephales promelas). SAR QSAR Environ. Res. 26, 217–243.
  13. Sá, G.A.G. de, Fontes, C.H., Embiruçu, M., 2022. A new method for building single feedforward neural network models for multivariate static regression problems: a combined weight initialization and constructive algorithm. Evol. Intell.
  14. Glorot, X., Bengio, Y., 2010. Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS). Volume 9 of JMLR: W&CP 9, Sardinia, Italy.
  15. Cao, W., Wang, X., Ming, Z., Gao, J., 2018. A review on neural networks with random weights. Neurocomputing 275, 278–287.
  16. Khan, W.A., Chung, S.H., Awan, M.U., Wen, X., 2020. Machine learning facilitated business intelligence (Part I) - Neural networks learning algorithms and applications. Ind. Manag. Data Syst. 120, 164–195.