Commit ed5ca629 authored by Aurélien Lamercerie

Update README.md

parent b8246cca
# CM-Tool: Corpus Making Tool
-------------------------------------------------------------------------------
This repository gathers some useful programs to obtain experimental data and
This repository gathers some useful Python scripts to obtain experimental data and
enable the construction of corpora on various topics.
## Source
## Input Data
The "source" directory contains source data, which are raw text data
from [DBPedia](https://dbpedia.org/).
The "inputData" directory contains source data, which are raw text data
from different sources (such as [DBPedia](https://dbpedia.org/)).
## Data
## Output Data
The "data" directory contains data in different representations:
The "outputData" directory contains produced data, including a sequence of sentences
('dataRef.sentence.txt'), and, for each sentence, some files corresponding to various
representations such as:
- sequence of sentences ('dataRef.sentence.txt')
- AMR graph ('dataRef.amr.graph')
- AMR Linked Data ('dataRef.amr.rdf')
- Textual AMR graph in PENMAN format ('dataRef.amr.penman')
- Visual AMR graph in DOT and PNG formats ('dataRef.amr.dot', 'dataRef.amr.png')
- AMR Linked Data in N-Triples and Turtle formats ('dataRef.amr.nt', 'dataRef.amr.ttl')
These data were obtained from the sources, by applying the script
'convert_text_to_amr.py'.
'convert_text_to_amr.py'. The module used can be specified in the file name
(e.g. .stog for the STOG model).
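
The naming scheme listed above can be sketched in Python as follows. This is only an illustration, not code from the repository: the function `expected_output_files` and the constant `AMR_SUFFIXES` are hypothetical names, and the sketch assumes that 'dataRef' is simply replaced by the data reference (for example, 'test'). It does not model any per-sentence numbering or the optional model marker (such as .stog), since the README does not say where those appear in the file name.

```python
from pathlib import Path

# Suffixes copied from the README list; the rest of this sketch is an assumption.
AMR_SUFFIXES = ["amr.penman", "amr.dot", "amr.png", "amr.nt", "amr.ttl"]


def expected_output_files(data_ref, output_dir="outputData"):
    """Return the output paths suggested by the README naming scheme.

    Hypothetical helper: it only builds path names and does not run
    'convert_text_to_amr.py' or touch the file system.
    """
    base = Path(output_dir)
    names = [f"{data_ref}.sentence.txt"]
    names += [f"{data_ref}.{suffix}" for suffix in AMR_SUFFIXES]
    return [base / name for name in names]


if __name__ == "__main__":
    # List the files one might expect for the data reference 'test'.
    for path in expected_output_files("test"):
        print(path)
```

A helper like this could be used to check which representations are present after a conversion run, for instance by testing `path.exists()` on each returned path.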
## Script <convert_text_to_amr.py>
@@ -74,7 +77,7 @@ as argument (for example, 'test'). It can be run using command line:
## Library
The "lib" directory contains useful library.
The "lib" directory contains useful libraries.
# References
......