# TENET - Terminology Extraction using Net Extension by (semantic) Transduction
-------------------------------------------------------------------------------
This tool exploits an intermediate semantic representation (UNL-RDF graphs) to
construct an ontology representation of NL sentences. [TODO: complete]
The treatment is carried out in five stages:
1. Initialization: TODO.
2. UNL Sentences Loading: TODO.
3. Transduction Process: the UNL-RDF graphs are extended to obtain semantic nets.
4. Classification / Instantiation
5. Reasoning
[TODO: complete the description]
## 1 - Implementation
This implementation is written in Python, with UNL as the pivot structure.
[TODO: talk about UNL-RDF graphs (obtained using the UNL-RDF schemas)]
The following module is included as the main process:
1. Semantic Transduction Process (stp) for semantic analysis with transduction schemes
The Python script _tenet.py_ manages the tool's commands, using components from the directory _scripts_.
The data to be processed must be placed in the directory _corpus_. All working data, including the results,
are handled in the directory _workdata_.
The transduction process configuration includes an ontology definition for the semantic nets,
and several transduction schemes expressed as SPARQL requests.
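As an illustration only (not taken from the TENET sources), the following minimal sketch shows how a transduction scheme written as a SPARQL INSERT request can extend an RDF graph, here using the rdflib library; whether TENET itself relies on rdflib is not stated here, and the namespaces and the `net:Entity` class are illustrative assumptions.
```python
# Minimal sketch: applying a SPARQL "transduction scheme" to an RDF graph.
# The vocabulary (unl:, ex:, net:) is hypothetical and only serves the example.
from rdflib import Graph

# Toy UNL-like graph: a single relation between two nodes.
UNL_SAMPLE = """
@prefix unl: <http://unl.example.org/> .
@prefix ex:  <http://example.org/> .

ex:node1 unl:agt ex:node2 .
"""

# Hypothetical transduction scheme: every subject of an 'agt' relation
# is promoted to a semantic-net entity.
TRANSDUCTION_SCHEME = """
PREFIX unl: <http://unl.example.org/>
PREFIX net: <http://example.org/net/>

INSERT { ?s a net:Entity . }
WHERE  { ?s unl:agt ?o . }
"""

graph = Graph()
graph.parse(data=UNL_SAMPLE, format="turtle")
graph.update(TRANSDUCTION_SCHEME)  # extends the graph with semantic-net triples

for triple in graph:
    print(triple)
```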
## 2 - Environment Setup
[TODO: external module sources?]
The Python code has been tested with Python 3.7.
All dependencies (external modules) are listed in requirements.txt.
The input directories contain evaluation files with some test corpora.
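A typical setup might look as follows (a sketch assuming a Unix-like shell; the repository does not prescribe a particular installation procedure):
```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```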
## 3 - Execution
The application runs in a terminal using the _tenet.py_ script: **python3 tenet.py <command> <args>**.
This prototype was tested with a standard computer configuration. The processing time is reasonable for all processing steps.
The following times were measured for the processing of a file of 10 sentences:
* about xxx seconds for initialization and UNL sentence loading;
* about xxx seconds for the transduction, classification and instantiation process;
* about xxx seconds for the reasoning process.
## 4 - Commands
The following commands are provided to execute the different steps of the process:
* **select**: command to select a corpus.
* **load**: command to load the UNL sentences of a given corpus.
* **extraction**: command to extract terminology data from the UNL-RDF graphs.
* **reason**: command to reason on the terminology.
* **clean**: command to clean the working directories.
These commands are used with the Python script _tenet.py_, as sketched below.
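A possible sequence of runs is sketched below; the corpus name is a placeholder, and the exact arguments expected by each command may differ from this sketch.
```
python3 tenet.py select <corpus-name>
python3 tenet.py load <corpus-name>
python3 tenet.py extraction
python3 tenet.py reason
python3 tenet.py clean
```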
## 5 - Example
[TODO: end-to-end example]
## 6 - Evaluation
[TODO: experimentation]
-------------------------------------------------------------------------------
# References
-------------------------------------------------------------------------------