diff --git a/README.md b/README.md
index f13a0ee3c36fc816006982e36aef5bbc22066930..66f1a64b604db5910a213955527e1e278de229a1 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # nlreqdataset-unl-enco
 
-This repo will contain all or part of the nlreqdataset of system requirements (http://fmt.isti.cnr.it/nlreqdataset/), enconverted in UNL with http://unl.ru/deco.html and possibly post-edited.
+This repo contains all the nlreqdataset of system requirements (http://fmt.isti.cnr.it/nlreqdataset/), enconverted in UNL with http://unl.ru/deco.html.
 
 The dataset is presented in the following abstract:
 
@@ -20,4 +20,42 @@
 http://nlreqdataset.isti.cnr.it
 
 Preprint of the paper available at ResearchGate:
-https://goo.gl/HxJD7X
\ No newline at end of file
+https://goo.gl/HxJD7X
+
+## Usage of the script to encode a document
+
+The encoding script works on XML files conforming to `./data/orig/req_document.xsd`.
+
+Examples of input and output files are provided in `./data/examples/`.
+
+First clone the repo (or at least download the scripts folder):
+```
+git clone https://gitlab.tetras-libre.fr/unl/nlreqdataset-unl-enco.git
+```
+
+Then enter the `scripts` folder:
+```
+cd nlreqdataset-unl-enco/scripts
+```
+
+The main Python 3 script to encode is `unlizeXml.py`.
+
+It relies on the Java executable of `unlTools`, which is included. You might want to update it with a newer version, possibly available at https://gitlab.tetras-libre.fr/unl/unlTools/-/releases
+
+Basic usage is:
+```
+python unlizeXml.py <input-file-path> <output-file-path>
+```
+
+Further options are listed with the `--help` flag:
+```
+$ python unlizeXml.py --help
+Usage: unlizeXml.py [OPTIONS] INPUT OUTPUT
+
+Options:
+  --lang [en|ru]
+  --dry-run / --no-dry-run  if true do not send request to unl.ru
+  --svg / --no-svg          Add svg node representing unl graph
+  --unltools-path FILE      Path of the unltools jar
+  --help                    Show this message and exit.
+```
diff --git a/public/exemple_2007-ertms.out.xml b/data/examples/exemple_2007-ertms.out.xml
similarity index 100%
rename from public/exemple_2007-ertms.out.xml
rename to data/examples/exemple_2007-ertms.out.xml
diff --git a/public/exemple_2007-ertms.xml b/data/examples/exemple_2007-ertms.xml
similarity index 100%
rename from public/exemple_2007-ertms.xml
rename to data/examples/exemple_2007-ertms.xml
diff --git a/XMLZIPFile_02.zip b/data/orig/XMLZIPFile_02.zip
similarity index 100%
rename from XMLZIPFile_02.zip
rename to data/orig/XMLZIPFile_02.zip
diff --git a/original-req.zip b/data/orig/original-req.zip
similarity index 100%
rename from original-req.zip
rename to data/orig/original-req.zip
diff --git a/req_document.xsd b/data/orig/req_document.xsd
similarity index 100%
rename from req_document.xsd
rename to data/orig/req_document.xsd
diff --git a/scripts/unl2rdf-app-1.0-SNAPSHOT-jar-with-dependencies.jar b/scripts/unl2rdf-app-1.0-SNAPSHOT-jar-with-dependencies.jar
new file mode 100644
index 0000000000000000000000000000000000000000..16eded785118a6c0d87e945f1e4b159d0cf1ea3c
Binary files /dev/null and b/scripts/unl2rdf-app-1.0-SNAPSHOT-jar-with-dependencies.jar differ
diff --git a/unlizeXml.py b/scripts/unlizeXml.py
similarity index 100%
rename from unlizeXml.py
rename to scripts/unlizeXml.py
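The usage documented in the README change above covers one file per call. A minimal sketch of how several documents might be encoded in one go, assuming it is run from the `scripts` folder and that the options behave as the `--help` text describes. The helper names `build_command` and `encode_folder` are hypothetical and not part of the repository; the `.out.xml` suffix simply mirrors the naming used in `data/examples/`.

```python
import subprocess
import sys
from pathlib import Path


def build_command(input_path, output_path, lang=None, dry_run=False,
                  svg=False, unltools_path=None, script="unlizeXml.py"):
    """Build the argument list for one unlizeXml.py call (hypothetical helper).

    Only flags listed in the script's --help output are used, and a flag is
    passed only when explicitly requested, so the script's own defaults apply.
    """
    cmd = [sys.executable, script]
    if lang is not None:
        cmd += ["--lang", lang]
    if dry_run:
        cmd.append("--dry-run")
    if svg:
        cmd.append("--svg")
    if unltools_path is not None:
        cmd += ["--unltools-path", str(unltools_path)]
    cmd += [str(input_path), str(output_path)]
    return cmd


def encode_folder(in_dir, out_dir, **options):
    """Encode every input .xml file of in_dir into out_dir/<name>.out.xml."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for xml_file in sorted(Path(in_dir).glob("*.xml")):
        if xml_file.name.endswith(".out.xml"):
            continue  # skip files that are already encoder outputs
        target = out_dir / (xml_file.stem + ".out.xml")
        # check=True raises CalledProcessError if the script exits with an error
        subprocess.run(build_command(xml_file, target, **options), check=True)
```

For instance, `encode_folder("../data/examples", "out", lang="en", dry_run=True)` would walk the example folder without sending requests to unl.ru, which may be a reasonable first check before a real run.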