2. Consistency of Neural Models with Respect to Elementary Physical Features. Is This Always Guaranteed?
Physics-Informed Neural Networks (PINN) and
Physics-Guided Neural Networks (PGNN) incorporate equations derived from
conservation laws (partial or ordinary differential equations and algebraic
equations) into the training of shallow and deep neural networks [1].
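In its most common form, this incorporation amounts to adding a penalty on the residual of the governing equations to the usual data-fitting loss. A generic sketch follows; the notation is introduced here for illustration only and is not taken from the cited works: network output \hat{y}_\theta, governing-equation operator \mathcal{F}, N_d measured pairs (x_k, y_k), N_c collocation points x_c, and weighting factor \lambda.

```latex
\mathcal{L}(\theta) =
  \frac{1}{N_d}\sum_{k=1}^{N_d} \left\| \hat{y}_\theta(x_k) - y_k \right\|^2
  + \lambda \, \frac{1}{N_c}\sum_{c=1}^{N_c}
    \left\| \mathcal{F}\!\left[\hat{y}_\theta\right](x_c) \right\|^2
```

The second term can only be written down when \mathcal{F} is known, which is the requirement discussed next.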
PINNs have been applied in many areas, in addition to providing a strategy
for solving partial differential equations [2-8].
All these applications rely on the availability of a phenomenological
model capable of describing, at some level of complexity, the physics of
the phenomenon from which the training data were extracted. On the other hand,
there are many situations in which no theoretical model is capable of
mathematically describing the problem under analysis, and the available
knowledge is essentially restricted to experimental and/or operational data.
Table 1 presents three case studies comprising real data sets used as
benchmarks in several works involving steady-state regression (UCI Machine
Learning Repository) [9]. The first one (Computer Hardware) [10] proposes a
model to evaluate the performance of Central Processing Units (CPUs), with one
output (CPU relative performance) and six inputs (cache memory size, u1;
minimum number of I/O channels, u2; maximum number of I/O channels, u3; machine
cycle time, u4; minimum main memory, u5; and maximum main memory, u6). The
other two case studies concern the prediction of the toxicity of chemicals
to different species of fish [11,12]. In both cases, the output is the
concentration of the chemical in water that results in the death of 50 percent
of the aquatic test specimens within 96 hours. Eight and six molecular
descriptors are considered as inputs in the first (QSAR Aquatic Toxicity) and
second (QSAR Fish Toxicity) of these case studies, respectively
(x_i, i = 1, …, 8; z_j, j = 1, …, 6).
Table 1: Data sets.
| Data set | Sample size | Number of features | Output | Output type | Reference |
|---|---|---|---|---|---|
| Computer Hardware | 209 | 6 | CPU relative performance | Integer | 10 |
| QSAR Aquatic Toxicity | 546 | 8 | Acute aquatic toxicity | Continuous | 11 |
| QSAR Fish Toxicity | 908 | 6 | Acute aquatic toxicity | Continuous | 12 |
The three data sets (Table 1) were used as case studies in a recent work [13]
that proposes an innovative approach (Weight Initialization based on
Linearization combined with a Constructive Algorithm for Regression problems,
WILCAR) for initializing the weights and defining the number of hidden units
of a typical Single-layer Feedforward Neural Network (SFNN) intended for
regression. Two classic weight initialization and training methods were also
used to evaluate the performance of the proposed approach: Random
Initialization by the Xavier Method (RIXM) [14] and the Extreme Learning
Machine (ELM) [15,16]. While both WILCAR and RIXM involve gradient-based
training (backpropagation), ELM belongs to the class of gradient-free
learning algorithms. The methods were applied to the three data sets (Table 1)
without incorporating any physical information. No theoretical model was
available, at any level of complexity, capable of describing the physics of
each phenomenon. Validation results based on established metrics (Root Mean
Square Error, RMSE, and coefficient of determination, R²) showed good
performance of the neural models obtained with all learning methods,
especially with the newly proposed WILCAR approach [13].
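As a rough illustration of the gradient-free idea behind ELM and of the two validation metrics, the NumPy sketch below draws the hidden-layer weights at random and obtains the output weights by linear least squares. The tanh activation, the hidden-layer size, the function names and the synthetic data are assumptions made for the example; this is not the implementation used in [13].

```python
import numpy as np

def elm_fit(U, y, n_hidden=20, seed=0):
    """Fit a single-hidden-layer network ELM-style (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(U.shape[1], n_hidden))   # random input weights, never trained
    b = rng.normal(size=n_hidden)                 # random hidden biases, never trained
    H = np.tanh(U @ W + b)                        # hidden-layer outputs
    H1 = np.column_stack([H, np.ones(len(U))])    # extra column for the output bias
    beta, *_ = np.linalg.lstsq(H1, y, rcond=None) # output weights by least squares
    return W, b, beta

def elm_predict(U, W, b, beta):
    H = np.tanh(U @ W + b)
    return np.column_stack([H, np.ones(len(U))]) @ beta

def rmse(y, y_hat):
    """Root Mean Square Error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def r2(y, y_hat):
    """Coefficient of determination."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Synthetic data (hypothetical, not one of the benchmark data sets)
rng = np.random.default_rng(1)
U = rng.uniform(-1.0, 1.0, size=(200, 2))
y = 2.0 * U[:, 0] - U[:, 1] ** 2

W, b, beta = elm_fit(U, y)
y_hat = elm_predict(U, W, b, beta)
print("RMSE:", rmse(y, y_hat), "R2:", r2(y, y_hat))
```

Note that both metrics measure only how closely predictions match the data; neither says anything about the direction in which each input moves the output.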
Although there is no mathematical description of the phenomenon, even in
situations of this type it is possible to have prior qualitative knowledge of
its behavior, for example the direction of the effect of a given input on the
output, which corresponds to the sign of the static gain. It is therefore
appropriate to check whether the neural model predicts static gain signs
consistent with what is expected over the entire set (or a subset) of
input-output pairs.
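A consistency check of this kind can be sketched as follows: the static gain of each input is approximated by a central finite difference of the model output at every sample, and its sign is compared with the expected direction of effect. The functions `gain_signs` and `consistency`, the step size and the toy model are hypothetical choices for illustration; any trained regression model exposing a prediction function could take the place of `toy_model`.

```python
import numpy as np

def gain_signs(predict, U, rel_step=1e-3):
    """Sign of dy/du_i at every sample, by central finite differences.

    predict: callable mapping an (n, m) array to an (n,) array
    U:       (n, m) array of operating points
    Returns an (n, m) integer array with entries in {-1, 0, +1}.
    """
    U = np.asarray(U, dtype=float)
    n, m = U.shape
    signs = np.zeros((n, m), dtype=int)
    span = U.max(axis=0) - U.min(axis=0)
    for i in range(m):
        # Perturbation scaled to the observed range of input i
        h = rel_step * (span[i] if span[i] > 0 else 1.0)
        up, down = U.copy(), U.copy()
        up[:, i] += h
        down[:, i] -= h
        gain = (predict(up) - predict(down)) / (2.0 * h)
        signs[:, i] = np.sign(gain).astype(int)
    return signs

def consistency(signs, expected):
    """Fraction of samples whose gain sign matches the expected one, per input."""
    return np.mean(signs == np.asarray(expected), axis=0)

# Toy stand-in for a trained model (assumption): y rises with u1 and falls with u2
def toy_model(U):
    return np.tanh(U[:, 0]) - 0.5 * U[:, 1]

rng = np.random.default_rng(0)
U = rng.uniform(-1.0, 1.0, size=(50, 2))
signs = gain_signs(toy_model, U)
print(consistency(signs, expected=[+1, -1]))  # expectations that match the toy model
print(consistency(signs, expected=[+1, +1]))  # second expectation contradicts it
```

Reporting a fraction per input, instead of a single sign, makes it possible to distinguish a model that is inconsistent everywhere from one that is inconsistent only in part of the input space.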
Tables 2 to 4 present the obtained and expected gain signs for
each case study. The expected signs are based on information collected from
experts in each phenomenon, in accordance with the references associated with
each experiment [10-12].
Table 2: Computer hardware.
| Input | WILCAR | RIXM | ELM | Expected |
|---|---|---|---|---|
| u1 | + | + | + | - |
| u2 | + | + | + | + |
| u3 | + | + | - | + |
| u4 | + | + | + | + |
| u5 | - | - | + | + |
| u6 | + | + | + | + |
Table 3: QSAR aquatic toxicity.
| Input | WILCAR | RIXM | ELM | Expected |
|---|---|---|---|---|
| x1 | + | - | + | + |
| x2 | - | - | - | + |
| x3 | - | + | + | + |
| x4 | + | + | + | + |
| x5 | + | + | + | + |
| x6 | - | - | - | - |
| x7 | - | - | - | + |
| x8 | - | - | + | + |
Table 4: QSAR fish toxicity.
| Input | WILCAR | RIXM | ELM | Expected |
|---|---|---|---|---|
| z1 | + | + | + | - |
| z2 | + | + | + | + |
| z3 | - | - | - | - |
| z4 | + | + | + | + |
| z5 | + | + | - | + |
| z6 | + | + | + | + |
Tables 2 to 4 show that errors in the predicted sign of the static gain occur
in all three case studies and with all weight initialization and training
methods (WILCAR, RIXM, and ELM). It is reasonable to consider that the
direction of the effect of a given input on a given output in a regression
model (dynamic or static) is an elementary but important feature that
must be verified and satisfied, even when there is no
phenomenological model capable of describing the physics of the problem (if
sufficient information is available, other characteristic behaviors of the
phenomenon must also be verified and satisfied). The results presented in
Tables 2 to 4 show that neural models obtained with different initialization
and training strategies, even when properly validated, do not necessarily
yield static gains consistent with expectations, nor other possibly
relevant qualitative behaviors.
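The extent of the disagreement can be tallied directly from the tables. In the short sketch below, each column of Tables 2 to 4 is transcribed as a string of signs, one character per input in table order; the dictionary layout and the function name are illustrative choices.

```python
# Signs transcribed from Tables 2 to 4, one character per input, in table order
TABLES = {
    "Computer hardware": {
        "WILCAR": "++++-+", "RIXM": "++++-+", "ELM": "++-+++",
        "Expected": "-+++++",
    },
    "QSAR aquatic toxicity": {
        "WILCAR": "+--++---", "RIXM": "--+++---", "ELM": "+-+++--+",
        "Expected": "+++++-++",
    },
    "QSAR fish toxicity": {
        "WILCAR": "++-+++", "RIXM": "++-+++", "ELM": "++-+-+",
        "Expected": "-+-+++",
    },
}

def count_mismatches(table):
    """Number of inputs whose obtained sign differs from the expected one, per method."""
    expected = table["Expected"]
    return {
        method: sum(got != exp for got, exp in zip(signs, expected))
        for method, signs in table.items()
        if method != "Expected"
    }

MISMATCHES = {name: count_mismatches(table) for name, table in TABLES.items()}
for name, counts in MISMATCHES.items():
    print(name, counts)
```

Every method shows at least one sign mismatch in every case study, with up to four of the eight inputs affected in the QSAR aquatic toxicity case.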
3. References