SCHEMATOPOIESIS aims to develop a Greek style checker for technical documents, exploiting techniques from the area of Language Technology.
Both humans and computers may have difficulty understanding natural language because of its inherent ambiguity and complexity. Controlled languages address this problem by simplifying the language: a controlled language has a restricted syntax and vocabulary, and is typically applied to technical documents. A consistent technical writing style improves comprehensibility and so adds to the quality of technical documentation, which is why this approach is widely followed. For example, a simple set of style guidelines for user documentation might be: make positive statements; keep sentences short; use only one idea per sentence; use simple sentence structures; use the active voice; avoid conditional tenses; use correct punctuation.
These restrictions help to preserve uniformity in the writing style, especially in cases where authors tend to follow diverse writing approaches, and to reduce ambiguities in the resulting text.
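As an illustration only, two of the guidelines above ("keep sentences short" and "make positive statements") could be approximated with simple surface checks. The sketch below is not the project's checker: the 20-word limit, the English negation list, the rule names and the naive sentence splitter are all assumptions chosen for the example, and a real Greek checker would rely on proper linguistic analysis.

```python
import re

# Assumed values for illustration; the project does not specify them.
MAX_WORDS = 20
NEGATIONS = {"not", "never", "no", "cannot"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def check_sentence(sentence):
    """Return the (hypothetical) rule names that the sentence violates."""
    # Runs of letters only; this also matches Greek and other Unicode letters.
    words = re.findall(r"[^\W\d_]+", sentence.lower())
    issues = []
    if len(words) > MAX_WORDS:
        issues.append("long-sentence")
    if NEGATIONS & set(words):
        issues.append("negative-statement")
    return issues


def check_text(text):
    """Pair every sentence of the text with its list of violations."""
    return [(s, check_sentence(s)) for s in split_sentences(text)]
```

Calling `check_text` on a short passage returns one entry per sentence, so an author can see which sentence triggered which rule. Guidelines such as "use only one idea per sentence" or "use the active voice" cannot be checked this way and need syntactic analysis.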
Using a controlled language also affects translation. Translation requires a good understanding of the vocabulary, including terminology, and an unambiguous understanding of the syntactic constructs used in the original text. Because a controlled language keeps the original text unambiguous, it makes translation more efficient and improves the quality of the translated version. It also opens the way to machine translation, because the resources already provided for the controlled language (vocabulary, terminology support and syntax rules) can be used to train a machine translation system. This in turn removes the need for post-editing of translated texts and significantly reduces both the turnaround time and the resources required for translation.
SCHEMATOPOIESIS aims to develop the first Greek prototype style checker to assist Greek technical writers as well as to facilitate translation from Greek to other languages. The project covers technical documents from the domain of computational equipment.
User Interface. Technical writers will be able to call the checker from their word processor. The technical document is first converted into an XML format so that the checker can process it. The checker then outputs XML text containing error tags in a format that the word processor can interpret, so that the errors can be shown to the user.
Checker. This tool checks both text structure (e.g. line spacing, fonts used) and language (correct application of the controlled language grammar and vocabulary). First, the XML text is checked against a structure DTD (Document Type Definition) and the corresponding error tags are attached. Then the text is processed with linguistic tools (tokeniser, sentence splitter, lexical analyser, syntactic analyser) so that the language checker can be applied using the corresponding language DTD. This results in the attachment of error tags related to language use.
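The two-pass flow described above (structure check, then language check, each attaching error tags to the XML) might be sketched as follows. This is a hedged illustration, not the project's design: the element and attribute names (`doc`, `p`, `font`, `error`, `type`, `rule`), the allowed-font set and the 20-word limit are hypothetical, and plain Python checks stand in for the DTD validation and linguistic tools that the checker actually uses.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical house rules, assumed for this example only.
ALLOWED_FONTS = {"Arial"}
MAX_WORDS = 20


def check_structure(root):
    """Pass 1: attach a structure error tag to paragraphs with a wrong font."""
    for p in list(root.iter("p")):
        if p.get("font") not in ALLOWED_FONTS:
            ET.SubElement(p, "error", {"type": "structure", "rule": "font"})


def check_language(root):
    """Pass 2: attach a language error tag for each over-long sentence."""
    for p in list(root.iter("p")):
        text = (p.text or "").strip()
        # Naive sentence splitting; a real checker would use a sentence splitter.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            words = re.findall(r"[^\W\d_]+", sentence)
            if len(words) > MAX_WORDS:
                ET.SubElement(
                    p, "error", {"type": "language", "rule": "sentence-length"}
                )


def run_checker(xml_text):
    """Run both passes and return the XML text annotated with error tags."""
    root = ET.fromstring(xml_text)
    check_structure(root)
    check_language(root)
    return ET.tostring(root, encoding="unicode")
```

For example, `run_checker('<doc><p font="Courier">Press the button.</p></doc>')` returns the same document with one `error` element of type `structure` inside the paragraph, which a word-processor plug-in could then map back to the offending text.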
SCHEMATOPOIESIS’s technology will facilitate the writing of Greek technical documents as well as their translation into other languages. It can also be exploited in other domains of technical documents.