@@ -22,11 +22,11 @@ Preprint of the paper available at ResearchGate:
https://goo.gl/HxJD7X
## Usage of the script to encode a document
## Usage of the `unlizeXml.py` script to enconvert a document
The encoding script work on xml files conforming to `./data/orig/req_document.xsd`
The encoding script works on xml files conforming to `./data/orig/req_document.xsd`
Examples of an input anf outputs are provided in `./data/examples/`
Examples of input and outputs are provided in `./data/examples/`
Zipped folders of "unlized" XML files of the corpus are available in the `./data` folder.
...
...
@@ -63,3 +63,30 @@ Options:
--unltools-path FILE Path of the unltools jar
--help Show this message and exit.
```
`batch_unlizeXml.sh` is an example script that encodes a batch of files in a folder (the folder name is hardcoded in the script).
## Usage of the `unlizeToNotebook.py` script to convert a document to a Jupyter notebook for UNL graph visualisation and post-editing
The conversion script works on XML files conforming to `./data/orig/req_document.xsd` or on enconverted files (outputs of the `unlizeXml.py` script).
In the first case, it first enconverts the files using http://unl.ru/deco
Examples of input and outputs are provided in `./data/examples/`
Zipped folders of "unlized" XML files of the corpus are (or will be) available in the `./data` folder.
To convert a file, use a command like the following (it takes a moment because the notebook is executed before saving):
```
python unlizeToNotebook.py input.xml output.ipynb
```
`batch_unlizeToNotebook.sh` is an example script that converts a batch of files in a folder (the folder name is hardcoded in the script).
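If a hardcoded folder name is inconvenient, the same batch idea can be sketched in Python. This is only an illustration, not the content of `batch_unlizeToNotebook.sh`: the helper names `build_commands` and `convert_folder` and the folder arguments are assumptions, and the only thing taken from this README is the `python unlizeToNotebook.py input.xml output.ipynb` invocation.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(input_dir, output_dir, script="unlizeToNotebook.py"):
    """Return one command (as an argument list) per XML file in input_dir.

    The argument order mirrors the documented call:
    python unlizeToNotebook.py input.xml output.ipynb
    """
    commands = []
    for xml_file in sorted(Path(input_dir).glob("*.xml")):
        # Name each notebook after its source file (an assumed convention).
        notebook = Path(output_dir) / (xml_file.stem + ".ipynb")
        commands.append([sys.executable, script, str(xml_file), str(notebook)])
    return commands


def convert_folder(input_dir, output_dir):
    """Run the conversions one by one, stopping at the first failure."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for command in build_commands(input_dir, output_dir):
        subprocess.run(command, check=True)
```

Calling `convert_folder("some_input_folder", "some_output_folder")` from the directory that contains `unlizeToNotebook.py` would then convert every `.xml` file in the first folder. Each conversion executes the notebook before saving, so a large folder may take a while.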
By default, the notebook uses the [unlTools](https://gitlab.tetras-libre.fr/unl/unlTools) executable jar to convert UNL texts to SVG graphs.
You can download the jar from the [release section of unlTools](https://gitlab.tetras-libre.fr/unl/unlTools/-/releases); it must be placed in the same directory as the notebook.
As an alternative, you can modify the second cell of the notebook so that it uses the unl2rdf web service instead of a local jar.