Experiment 20140314-NOOR

Six agents converge and improve semantic F-measures over logical repair [euzenat2014c]
Repeated in [20150219-NOOR]

Experiment design

DockerOS DockerEXP

Date: 2014-03-14

Hypotheses: Confirm where the process converges and that it provides a better F-measure than logical repair.

Variation of: 20140304-NOOR

6 agents; 10 runs; 100000 games

Adaptation operators: add

Experimental setting: As [20140304-NOOR], except that the number of iterations is 100000.


Controlled variables: []

Dependent variables: ['srate', 'size', 'inc', 'fmeas', 'conv']


Date: 2014-03-14

Performer: Jérôme Euzenat (INRIA)

Lazy lavender hash: 2aec5fe496c2b95760dba0ef87e82ac13264879b

Classpath: lib/lazylav/ll.jar:lib/slf4j/logback-classic-1.2.3.jar:lib/slf4j/logback-core-1.2.3.jar:.

OS: wheezy

Parameter file: params.sh

Executed command (script.sh):


. params.sh

java -Xms500M -Xmx1G -cp ${JPATH} fr.inria.exmo.lazylavender.engine.Monitor -DrevisionModality=${OPS} -DnbRuns=${NBRUNS} -DnbAgents=${NBAGENTS} -DnbIterations=${NBITERATIONS} -o results/${LABEL}-Log${NBAGENTS}-${NBITERATIONS}.tsv > results/${LABEL}-Log${NBAGENTS}-${NBITERATIONS}.txt 

Classes used: NOOEnvironment, AlignmentAdjustingAgent, AlignmentRevisionExperiment, ActionLogger, AverageLogger, Monitor.

Execution environment: Debian Linux virtual machine configured with four processors and 20GB of RAM running under a Dell PowerEdge T610 with 2*Intel Xeon Quad Core 2.26GHz E5607 processors and 32GB of RAM, under Linux ProxMox 2 (Debian). - Java 1.6.0 HotSpot

Takes a whole night.

Raw results


Initial results

Modality   Size  Success rate  Incoherence  Syntactic F-measure  Convergence
Reference   783           nan         0.00                 1.00          nan
Initial     495           nan         0.64                 0.06          nan
add         208          0.97         0.00                 0.15        13362
Alcomo      180           nan         0.00                 0.11            1
LogMap      227           nan         0.00                 0.12            1

The convergence figure given in the table above is, as of 2023, the average of the individual convergence values.

In addition, here are the last games in which the F-measure changed, i.e., in which a failure occurred (in reverse order). This combines 10 rounds:

  • 16957
  • 13706
  • 13362
  • 12820
  • 11931
  • 11868
  • 11673
  • 11537
  • 11359
  • 11352
  • 11314


Key points

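  • Convergence occurs well before 20000 games: the last listed F-measure change is at game 16957.
  • In the long run, the average F-measure of the add operator (0.15) is better than that of logical repair (Alcomo: 0.11, LogMap: 0.12).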
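The F-measure comparison behind the second key point can be made explicit with a little arithmetic. This is only an illustrative sketch: the values are copied from the results table, the dictionary and variable names are hypothetical, and it assumes that the Alcomo and LogMap rows are the logical repair baselines referred to in the hypotheses.

```python
# Syntactic F-measures copied from the results table of this experiment.
fmeasure = {"Initial": 0.06, "add": 0.15, "Alcomo": 0.11, "LogMap": 0.12}

# Assumption: these two rows are the logical repair baselines.
repair_systems = ["Alcomo", "LogMap"]

# Strongest baseline and the margin by which the add operator exceeds it.
best_repair = max(repair_systems, key=fmeasure.get)
gain = fmeasure["add"] - fmeasure[best_repair]

for name in repair_systems:
    print(f"add vs {name}: {fmeasure['add']:.2f} vs {fmeasure[name]:.2f}")
print(f"margin over best repair ({best_repair}): {gain:.2f}")
```

The margin is computed on rounded, averaged table values, so it should be read as indicative rather than as a significance test.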

This file can be retrieved from URL https://sake.re/20140314-NOOR

It is possible to check out the repository by cloning https://felapton.inrialpes.fr/cakes/20140314-NOOR.git

This experiment has been transferred from its initial location at https://gforge.inria.fr (no longer available).

See original markdown (20140314-NOOR.md) or HTML (20140314-NOOR.html) files.