diff --git a/README.md b/README.md
index 6985530b6794fc5b2504e80539daadfad312abf9..1c86e7e1b77665f6db8a3c88e5a0d650822b8059 100644
--- a/README.md
+++ b/README.md
@@ -1,26 +1,29 @@
 # CM-Tool: Corpus Making Tool
 -------------------------------------------------------------------------------
 
-This repository gathers some useful programs to obtain experimental data and
+This repository gathers some useful python scripts to obtain experimental data and
 enable the construction of corpus about various topic.
 
 
-## Source
+## Input Data
 
-The "source" directory contains source data, which are raw text data
-from [DBPedia](https://dbpedia.org/).
+The "inputData" directory contains source data, which are raw text data
+from different sources (such as [DBPedia](https://dbpedia.org/)).
 
 
-## Data
+## Output Data
 
-The "data" directery contains data in different representations:
+The "outputData" directery contains produced data, including a sequence of sentences
+('dataRef.sentence.txt'), and, for each sentence, some files corresponding to various
+representations such as:
 
-- sequence of sentences ('dataRef.sentence.txt')
-- AMRs Graph ('dataRef.amr.graph')
-- AMR Linked Data ('dataRef.amr.rdf')
+- Textual AMRs Graph in PENMAN format ('dataRef.amr.penman')
+- Vizual AMRs Graph in DOT and PNG format ('dataRef.amr.dot', 'dataRef.amr.png')
+- AMR Linked Data in nTriple and Turtle format ('dataRef.amr.nt', 'dataRef.amr.ttl')
 
 These data were obtained from the sources, by applying the script
-'convert_text_to_amr.py'.
+'convert_text_to_amr.py'. The module used can be specified in the file name
+(e.g. .stog for the STOG model).
 
 
 ## Script <convert_text_to_amr.py>
@@ -74,7 +77,7 @@ as argument (for example, 'test'). It can be run using command line:
 
 ## Library
 
-The "lib" directory contains useful library.
+The "lib" directory contains useful libraries.
 
 
 # References