First attempt to customise ReadMe

This commit is contained in:
Robert Hunt 2020-11-11 15:21:50 +13:00
parent 94f61c2431
commit 5db574d4f6
1 changed files with 82 additions and 3 deletions

@ -1,5 +1,84 @@
# en_twl
Links from the original languages to Translation Words.
Previously the links in UHB and UGNT were used (but that didn't enable them to be customized for Gateway Languages).
## Editing the UTWL
To edit the UTWL files there are three options:
* Use LibreOffice (Recommended)
* Use a text editor on your computer
* Use the online web editor in DCS
Each of these options and their caveats are described below.
The first two options require you to clone the repository to your computer first. You may do this on the command line or using a program such as SmartGit. After making changes to the files you will need to commit and push your changes to the server and then create a Pull Request to merge them to the `master` branch.
Alternatively, you may [download the master branch as a zip file](https://git.door43.org/unfoldingWord/en_tn2/archive/master.zip) and extract it locally. After editing, you will need to use the upload file feature in DCS to get your changes ready for a Pull Request.
### Editing in LibreOffice
This is the recommended way to edit the TSV files. You may [download LibreOffice](https://www.libreoffice.org/download/download/) for free.
After you have the file on your computer, you may open the respective TSV file with LibreOffice. Follow these notes on the Text Import Screen:
* Set “Separated by” to “Tab”
* Set “Text Delimiter” to blank (highlight the character and press Backspace or Delete to remove it)
It should look like this:
![](https://cdn.door43.org/assets/img/twl/LibreOfficeTextImport.png)
When you are done editing, click Save and then select “Use Text CSV Format” on the pop up dialogue. Note that even though it says CSV, it will use tab characters as the field separators.
**Note:** Other spreadsheet editors **should not** be used because they will add or remove quotation marks which will affect the entries negatively.
### Editing in a Text Editor
You may also use a regular text editor to make changes to the files.
**Note:** You must be careful not to delete or add any tab characters when editing with this method.
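
One way to catch an accidentally added or deleted tab before committing is to compare the number of fields on every row with the header row. The following is a minimal sketch only: `find_bad_lines` is a hypothetical helper name and the sample text is made up for illustration.

```python
def find_bad_lines(text):
    """Return (line_number, field_count) pairs for rows that disagree with the header."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # ignore the empty string left by a trailing newline
    if not lines:
        return []
    expected = lines[0].count("\t") + 1  # fields = tabs + 1
    bad = []
    for number, line in enumerate(lines[1:], start=2):
        fields = line.count("\t") + 1
        if fields != expected:
            bad.append((number, fields))
    return bad


# Made-up sample: the third line has lost one tab.
sample = "Reference\tID\tTags\n1:1\tabcd\tname\n1:2\tefgh name\n"
print(find_bad_lines(sample))
```

An empty list means every row has the same number of fields as the header.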
### Editing in DCS
If you only need to change a word or two, this may be the quickest way to make your change. See the [protected branch workflow](https://help.door43.org/en/knowledgebase/15-door43-content-service/docs/46-protected-branch-workflow) document for step by step instructions.
**Note:** You must be careful not to delete any tab characters when editing with this method.
## Structure
The UTWL are structured as TSV files to simplify importing and exporting into various formats for translation and presentation. This enables the tWLs to be keyed to the original Greek and Hebrew text instead of only a Gateway Language translation.
### TSV Format Overview
A Tab Separated Value (TSV) file is like a Comma Separated Value file except that the tab character is what divides the values instead of a comma. This makes it easier to include prose text in the files because many languages require the use of commas, single quotes, and double quotes in their sentences and paragraphs.
The UTWL are structured as one file per book of the Bible and encoded in TSV format, for example, `GEN_twl.tsv`. The columns are `Reference`, `ID`, `Tags`, `SupportReference`, `Quote`, `Occurrence`, and `OccurrenceNote`.
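
As a rough illustration of reading such a file, the sketch below uses Python's standard `csv` module with quoting disabled, so that single and double quotes inside a field are kept as ordinary text. The function name `read_tsv` and the sample rows are assumptions for illustration; the sample uses only three of the columns, and a real file would be opened from disk instead.

```python
import csv
import io

# Made-up sample with a subset of columns; not taken from a real file.
SAMPLE = (
    "Reference\tID\tQuote\n"
    '1:3\tswi9\tHe said, "Go"\n'
)


def read_tsv(text):
    """Return rows as dicts keyed by the header row, treating quotes as plain text."""
    reader = csv.DictReader(
        io.StringIO(text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # do not interpret " as a text delimiter
    )
    return list(reader)


rows = read_tsv(SAMPLE)
print(rows[0]["Reference"], rows[0]["Quote"])
```

Disabling quoting mirrors the blank “Text Delimiter” setting described for LibreOffice above.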
### UTWL TSV Column Description
The following lists each column with a brief description and example.
* `Reference` Chapter number (e.g. `1`), then a colon, then the verse number (e.g. `3`) or `intro`
* `ID` Four character **alphanumeric** string unique *within* the verse for the resource (e.g. `swi9`)
    * This will be helpful in identifying which links came from the English resources and which links have been added by GLs.
    * The Universal ID (UID) of a note is the combination of the `Book`, `Chapter`, `Verse`, and `ID` fields. For example, `tit/1/3/swi9`.
        * This is a useful way to unambiguously refer to links.
        * An [RC link](http://resource-container.readthedocs.io/en/latest/linking.html) can resolve to a specific note like this: `rc://en/tn/help/tit/01/01/swi9`.
* `Tags` (optional) any of `keyterm` or `name`, separated by `; ` if there's more than one
* `SupportReference`
    * A link to a translation word, like `rc://*/tw/dict/bible/names/paul` or `rc://*/tw/dict/bible/kt/pray`
* `OrigQuote` Original language quote (e.g. `ἐφανέρωσεν↔τὸν λόγον αὐτοῦ`)
    * Software (such as tC) should use this for determining what is highlighted
    * A left-right arrow character (↔) indicates that the quote is discontinuous; software should interpret this in a non-greedy manner
* `Occurrence` Specifies which occurrence in the original language text the entry applies to.
    * `-1`: entry applies to every occurrence of OrigQuote in the verse
    * `0`: entry does not occur in original language (for example, “Connecting Statement:”)
    * `1`: entry applies to first occurrence of OrigQuote only
    * `2`: entry applies to second occurrence of OrigQuote only
    * etc.
* `Annotation` not used for Translation Word Links
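
The column rules above can be sketched in code. This is one possible reading, not how tC or any other tool actually implements them. All function names are hypothetical, the book code is assumed to be supplied by the caller (for example, taken from the file name), and "non-greedy" is interpreted here as a regular-expression `.*?` gap between the parts of a discontinuous quote. The verse text is a Latin-letter placeholder.

```python
import re


def build_uid(book, reference, row_id):
    """Join book, chapter, verse, and ID into a UID such as tit/1/3/swi9."""
    chapter, verse = reference.split(":")
    return f"{book}/{chapter}/{verse}/{row_id}"


def parse_tags(tags):
    """Split the optional Tags field on '; ' and drop empty entries."""
    return [tag for tag in tags.split("; ") if tag]


def quote_pattern(orig_quote):
    """Build a regex in which each ↔ becomes a non-greedy gap."""
    parts = [re.escape(part.strip()) for part in orig_quote.split("↔")]
    return re.compile(".*?".join(parts))


def select_spans(verse_text, orig_quote, occurrence):
    """Return (start, end) spans of the quote selected by Occurrence."""
    spans = [m.span() for m in quote_pattern(orig_quote).finditer(verse_text)]
    if occurrence == -1:
        return spans  # every occurrence in the verse
    if occurrence < 1:
        return []  # 0: the entry does not occur in the text
    return spans[occurrence - 1:occurrence]  # empty if there are too few matches


verse = "a b a c a b"  # placeholder standing in for original-language text
print(build_uid("tit", "1:3", "swi9"))
print(parse_tags("keyterm; name"))
print(select_spans(verse, "a", 2))
print(select_spans(verse, "a↔c", 1))
```

Returning character spans keeps the sketch independent of how a given program tokenizes or highlights the verse.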