20230505-MTOA
Date: 2023-05-05 (Andreas Kalaitzakis)
1) The more deciding for different tasks relies on common properties, the more tackling additional tasks improves accuracy.
2) The more deciding for different tasks relies on common properties, the higher the success rate.
Success rate evaluates the interoperability among agents. It is defined as the proportion of successful interactions over all interactions performed up to the $n^{th}$ interaction.
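As a minimal illustration of this measure (the function and variable names below are ours, not taken from the experiment code), the proportion can be sketched in Python as:

```python
def success_rate(outcomes):
    """Proportion of successful interactions over all interactions
    performed so far. `outcomes` holds one boolean per interaction
    (True = successful); the names are illustrative placeholders."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

# e.g. 3 successful interactions out of 4 performed
print(success_rate([True, True, False, True]))  # 0.75
```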
Task accuracy evaluates the quality of agent ontologies. It adapts the accuracy measure introduced in \cite{Bourahla2021a} to different tasks. It is defined as the proportion of object types for which a correct decision would be taken with respect to a task $ t $ by an agent $ \alpha $ at the $ n^{th} $ iteration of the experiment. Task accuracy is used to compute the average and best task accuracy of agents.
\begin{align*} tacc(\alpha,n,t) = \frac{\vert\{o \in \mathcal{I} : h_n^\alpha(o,t) = h^*(o,t) \}\vert}{\vert \mathcal{I} \vert} \end{align*}
The experiment is executed under 6 setups. Each setup is run 20 times and its results are averaged. One run consists of 80000 interactions, each taking place between two agents randomly selected out of a total population of 18 agents. Their environment contains 64 different object types, each perceivable through 6 different binary properties. The agents are initially trained with respect to all $|\mathcal{T}|=3$ tasks. Deciding with respect to each task relies on 2 of the 6 perceivable binary properties. These properties are either the same for all tasks, or different for each task. Agents induce an initial ontology based on a random 10\% of all existing labeled examples. Each agent is assigned 1 to 3 tasks ($|\mathcal{T}_{ass}| \in \{1,2,3\}$). For each task, 4 different decisions exist. Between two consecutive interactions, the environment attributes a score to each agent, calculated over 60\% of all samples.
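A minimal Python sketch of the task accuracy measure above (all names are illustrative placeholders; `hypothesis` stands for $h_n^\alpha$ and `reference` for $h^*$):

```python
def task_accuracy(hypothesis, reference, objects, task):
    """tacc(alpha, n, t): proportion of object types for which the
    agent's decision matches the correct decision for the given task.
    `hypothesis` and `reference` map (object, task) to a decision."""
    correct = sum(1 for o in objects if hypothesis(o, task) == reference(o, task))
    return correct / len(objects)

# Toy example: 64 object types encoded as integers, one task, 4 decisions
objects = range(64)
reference = lambda o, t: o % 4                       # correct decision per object
hypothesis = lambda o, t: (o % 4) if o < 32 else 0   # agent correct on part of them
print(task_accuracy(hypothesis, reference, objects, "t1"))  # 0.625
```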
Independent variables: ['maxAdaptingRank']
dependent variables: ['avg_accuracy', 'avg_max_accuracy', 'success_rate']
Date: 2023-04-01 (Andreas Kalaitzakis)
Computer: Dell Precision-5540 (CC: 12 * Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz with 16GB RAM OS: Linux 5.4.0-92-generic)
Duration : 720 minutes
Lazy lavender hash: ceb1c5d1ca8109373d293b687fc55953fce5241d
Parameter file: params.sh
Executed command (script.sh):
Table 1 consists of the final achieved average success rate values, i.e., the average success rate after the last iteration. Each column corresponds to a different number of adapting tasks, while each row corresponds to a different run, for the same size of scope.
Table 2 consists of the final average minimum accuracy values with respect to the worst task, i.e., the accuracy after the last iteration for the task for which the agent scores the lowest accuracies. Each column corresponds to a different number of undertaken tasks, while each row corresponds to a different run, for the same size of scope.
Table 3 consists of the final achieved average ontology accuracy with respect to all tasks, i.e., the accuracy after the last iteration averaged on all tasks and agents. Each column corresponds to a different number of adapting tasks, while each row corresponds to a different run, for the same size of scope.
Table 4 consists of the final average best task accuracy values with respect to the best task, i.e., the accuracy after the last iteration for the task for which the agent scores the highest accuracies. Each column corresponds to a different number of undertaken tasks, while each row corresponds to a different run, for the same size of scope.
We perform one-way ANOVA, testing if the independent variable 'maxAdaptingRank' has a statistically significant effect on different dependent variables.
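For reference, the F statistic underlying such a test can be sketched in pure Python as follows. The grouping of runs by 'maxAdaptingRank' level and the data below are illustrative, not the experiment's:

```python
def one_way_anova_F(groups):
    """One-way ANOVA F statistic: between-group mean square over
    within-group mean square, for one list of observations per level
    of the independent variable."""
    k = len(groups)                      # number of levels
    n = sum(len(g) for g in groups)      # total observations
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Illustrative observations only: one list of per-run values per level
print(one_way_anova_F([[1, 2, 3], [2, 3, 4], [7, 8, 9]]))  # 31.0
```

In practice the p-value is read from the F distribution with $(k-1, n-k)$ degrees of freedom, e.g. via `scipy.stats.f_oneway`.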
The presented figure depicts the evolution of the agents' (a) average accuracy, (b) accuracy on their best task and (c) success rate, for different numbers of tasks and common properties.
(a) shows that assigning more tasks to agents significantly improves their average accuracy. This improvement is higher when agents tackle tasks that rely on the same properties. On the one hand, when tasks rely on different properties, agents tackling 3 tasks are 9\% more accurate than agents tackling 1 task. On the other hand, when tasks rely on common properties, agents tackling 3 tasks are 55\% more accurate than agents tackling 1 task. This shows that when tasks rely on common properties, knowledge is transferable from one task to another. Put differently, agents tackling tasks relying on a common set of properties may improve their accuracy on one task by carrying out another task. These results thus support our hypothesis.
(b) shows two things. First, when tasks rely on different properties, the number of tasks does not affect the agents' accuracy on their best task. This indicates that when tasks rely on different properties, learning to decide with respect to one task is not related to learning to decide with respect to a different task. Second, when agents tackle tasks that rely on common properties, tackling additional tasks improves their accuracy on their best task. Finally, results show that even when agents tackle only 1 task, these agents benefit from tasks that rely on common properties. This indicates that while agents abstain from all tasks that are not assigned to them, their ontologies contain general-purpose knowledge acquired during the initial ontology induction phase. These results agree with subfigure (a), further supporting our hypothesis.
(c) shows that tackling fewer tasks or having tasks that rely on common properties improves the success rate. This is due to two reasons. The first is that the fewer the assigned tasks, the fewer the decisions over which agents need to agree. The second is that the more tasks rely on common properties, the less irrelevant knowledge is present in an agent's initially induced ontology. Furthermore, while success rate improves over the course of the experiment, it does not converge to 1. This indicates that the final ontologies do not allow agents to reach consensus. This can be explained by the limitation of resources: agents may lack the resources required to learn to decide accurately for all assigned tasks and objects. As a result, at any given time, they are able to decide accurately only for different subsets of the existing object types. The latter is true even when agents interact over one task.
Analysis of variance shows that the number of common properties among different tasks has a statistically significant impact (p $\leq$ 0.01) on all measures. The number of assigned tasks has a statistically significant impact on (1) the success rate and (2) the average accuracy.
Based on the results, two conclusions are drawn. The first is that when agents tackle additional tasks relying on common properties, the agents may transfer knowledge from one task to another. The second is that when agents tackle additional tasks that rely on different properties, the number of assigned tasks does not affect their accuracy on their best task.