Experiment 20201001-DOLA

Experiment design

Agents adapt their ontologies to agree on decision taking

Date: 20201001 (Yasser Bourahla)

5 runs; 40000 games

Hypothesis: none as such; the goal is to situate how much agents improve compared to A-MAIL.

Variation of: 20200623-DOLA

Environment: ../input/zoology

test set proportion: 0.1
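A test set proportion of 0.1 corresponds to 10-fold cross-validation: the data is partitioned into 10 folds and each fold serves once as the test set, as in the per-fold results reported in the data exploration. A minimal sketch of such a partition (function name and parameters are illustrative, not the LazyLavender implementation):

```python
import random

def kfold(items, k=10, seed=0):
    """Partition items into k folds; each fold serves once as the 10% test set."""
    items = list(items)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test
```

With 10 folds, each train/test split uses 90%/10% of the examples, and every example appears in exactly one test fold.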

Experimental setting: Agents learn decision trees (transformed into ontologies); get income from environment; adapt by splitting their leaf nodes; environment generated from dataset
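The leaf-splitting adaptation can be sketched as follows. All names here (Leaf, Node, classify, split) are hypothetical and only illustrate the mechanism, not the actual LazyLavender classes: an agent classifies an example by walking its tree, and when the environment signals a wrong decision, the leaf that produced it is replaced by a test on a feature not yet used on that path.

```python
# Hypothetical sketch of decision-tree adaptation by leaf splitting.

class Leaf:
    def __init__(self, decision):
        self.decision = decision

class Node:
    def __init__(self, feature, if_true, if_false):
        self.feature = feature          # name of a boolean attribute
        self.if_true = if_true
        self.if_false = if_false

def classify(tree, example):
    """Walk the tree; return the reached leaf (its decision is the answer)."""
    while isinstance(tree, Node):
        tree = tree.if_true if example[tree.feature] else tree.if_false
    return tree

def split(leaf, example, correct_decision, unused_features):
    """Replace a wrong leaf by a test separating this example from the rest."""
    feature = next(f for f in unused_features if f in example)
    new_leaf = Leaf(correct_decision)
    old_leaf = Leaf(leaf.decision)
    if example[feature]:
        return Node(feature, new_leaf, old_leaf)
    return Node(feature, old_leaf, new_leaf)
```

For instance, an agent whose tree is a single leaf deciding "mammal" would, after a negative income on an egg-laying animal, split that leaf on the lays_eggs attribute.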

Variables

independent variables: ['numberOfAgents']

dependent variables: ['accuracy', 'precision', 'recall', 'taccuracy', 'tprecision', 'trecall', 'fmeasure', 'tfmeasure']

Experiment

Date: 20201001 (Yasser Bourahla)

LazyLavender hash: 6edb6ab1e77240dec68475d1263707f2086b22f3

Link to lazylavender

Parameter file: params.sh

Executed command (script.sh):

BEWARE: REPRODUCING THE ANALYSIS TAKES A CONSIDERABLE AMOUNT OF TIME.

#!/bin/bash

. paramsCrossVal.sh

OUTPUT=${LABEL}


bash scripts/runexp.sh -p ${OUTPUT} -d ${DIRPREF} java -Dlog.level=INFO -cp ${JPATH} fr.inria.exmo.lazylavender.engine.ExperimentalPlan -Dexperiment=fr.inria.exmo.lazylavender.decisiontaking.CrossValidationExperiment ${OPT} -DresultDir=${OUTPUT}/${DIRPREF}

Experimental plan

The independent variables have been varied as follows:

number of agents: [2, 5, 10, 20, 40]

Raw results

Full results are available on zenodo

Data exploration

average precision/recall/fmeasure/accuracy by number of agents

                      number of agents
                      2         5         10        20        40
precision  training   0.851953  0.899113  0.933084  0.986072  0.990276
           test       0.878333  0.907333  0.941325  0.963298  0.952559
fmeasure   training   0.812074  0.85934   0.91888   0.971244  0.984074
           test       0.871374  0.893278  0.924962  0.944523  0.939294
recall     training   0.775762  0.822937  0.905101  0.956856  0.977949
           test       0.864524  0.879651  0.909159  0.926466  0.926394
accuracy   training   0.962     0.978     0.990     0.997     0.998
           test       0.951     0.965     0.977     0.985     0.983
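The reported fmeasure is the harmonic mean of the corresponding precision and recall, which can be checked against the table; for instance, for 2 agents on the training set:

```python
def f_measure(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

p, r = 0.851953, 0.775762           # 2 agents, training set
print(round(f_measure(p, r), 6))    # matches the reported 0.812074
```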

average precision/recall/fmeasure/accuracy for each targeted decision class

                      target class
                      0         1         2         3         4         5         6
precision  training   0.998256  0.986339  0.788459  0.998235  0.909054  0.962577  0.881776
           test       0.996744  0.97876   0.769017  0.997333  0.8882    0.976467  0.893467
fmeasure   training   0.994517  0.989156  0.71132   0.999117  0.937307  0.934884  0.784903
           test       0.990904  0.979924  0.671162  0.998665  0.921549  0.949377  0.879416
recall     training   0.990806  0.991989  0.64793   1         0.967371  0.908741  0.707208
           test       0.985131  0.98109   0.5954    1         0.9575    0.92375   0.8658
accuracy   training   0.990     0.991     0.967     0.999     0.985     0.985     0.979
           test       0.984     0.982     0.924     0.999     0.974     0.979     0.964

average precision/recall/fmeasure/accuracy for each test fold

                      testing set
                      0         1         2         3         4         5         6         7         8         9
precision  training   0.933686  0.944817  0.922078  0.927746  0.926387  0.91744   0.932947  0.936907  0.947494  0.931491
           test       0.993857  0.881119  0.986286  0.961167  0.938429  0.86573   0.914429  0.839143  0.980157  0.925381
fmeasure   training   0.910942  0.928622  0.90589   0.897458  0.907592  0.915444  0.919371  0.888799  0.912593  0.90547
           test       0.937313  0.911262  0.993096  0.925523  0.888753  0.912034  0.945496  0.833621  0.882951  0.897746
recall     training   0.88928   0.912974  0.89026   0.869085  0.889545  0.913456  0.906185  0.845389  0.880171  0.880864
           test       0.886857  0.94354   1         0.892429  0.844071  0.963571  0.978748  0.828171  0.803286  0.871714
accuracy   training   0.987     0.989     0.983     0.984     0.983     0.983     0.985     0.984     0.987     0.986
           test       0.984     0.977     0.995     0.981     0.967     0.974     0.983     0.924     0.969     0.967

Discussion

Agents mostly started with already high precision/recall/accuracy: from a few examples of the zoology dataset, they are able to generalise and perform well on the rest of the dataset, leaving little room for improvement.

Results are on par with A-MAIL*, a coordinated inductive learning approach, which reported the following results.

(The comparison is only indicative, since the two approaches do not consider the same information.)

on the training set:

Number of agents   Precision   F-measure   Recall   Accuracy
2                  1.00        0.87        0.77     0.988
3                  1.00        0.95        0.91     0.997
4                  0.99        0.96        0.93     0.992
5                  1.00        0.97        0.95     0.997

on the test set:

Number of agents   Precision   F-measure   Recall   Accuracy
2                  0.97        0.85        0.75     0.950
3                  0.98        0.89        0.81     0.968
4                  0.97        0.90        0.84     0.966
5                  0.98        0.93        0.88     0.980

* Santiago Ontañón and Enric Plaza. 2015. Coordinated Inductive Learning Using Argumentation-Based Communication. Autonomous Agents and Multi-Agent Systems 29, 2 (2015), 266–304.

To compare, the results of our simulations by number of agents:

number of   precision           fmeasure            recall              accuracy
agents      training  test      training  test      training  test      training  test
2           0.851953  0.878333  0.812074  0.871374  0.775762  0.864524  0.962     0.951
5           0.899113  0.907333  0.85934   0.893278  0.822937  0.879651  0.978     0.965
10          0.933084  0.941325  0.91888   0.924962  0.905101  0.909159  0.990     0.977
20          0.986072  0.963298  0.971244  0.944523  0.956856  0.926466  0.997     0.985
40          0.990276  0.952559  0.984074  0.939294  0.977949  0.926394  0.998     0.983