Unicode character for marking original line break locations


We are going to digitize many books, and we want to mark the locations of the line breaks in the original book without affecting the flow of the digital book.

(\n will be used to indicate the end of paragraphs.)

For example:

  This is a sentence in the original book that I want to mark line break locations.

What is the proper Unicode character to use for marking them?

With * standing in for the marker:

  This is a sentence in the original * book that I want to mark line * break locations.

There is no concept of a hidden character in Unicode that represents a line break in some original but does not cause a line break in rendering. Unicode encodes plain text data, and in plain text, line breaks are represented by control characters.

What matters here is how the files will be used. If they need to be processed with plain text editors, then you need to decide: either the line breaks are reproduced in default rendering, or they are omitted when you create the files. You cannot make them invisible. And different text editors, like Notepad and Emacs, can use different line control conventions; the end of a line in one program is the end of a paragraph in another.

If the files will only be processed by programs that you create, then you can use whatever conventions you like. The most logical choice is the one described in the Unicode Standard:

  Line and Paragraph Separator. The Unicode Standard provides two unambiguous characters, U+2028 line separator and U+2029 paragraph separator, to separate lines and paragraphs. They are considered the default form of denoting line and paragraph boundaries in Unicode plain text. A new line is begun after each line separator. A new paragraph is begun after each paragraph separator. As these characters are separator codes, it is not necessary either to start the first line or paragraph or to end the last line or paragraph with them. Doing so would indicate that there was an empty paragraph or line following. The paragraph separator can be inserted between paragraphs of text. Its use allows the creation of plain text files, which can be laid out on a different line width at the receiving end. The line separator can be used to indicate an unconditional end of line. (Pages 6 and 7 in the PDF.)
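As a rough illustration of the convention quoted above, the following Python sketch stores original line breaks as U+2028 and paragraph boundaries as U+2029, then recovers the structure. The names `encode_book` and `decode_book` and the list-of-lists input shape are assumptions made for this example, not part of any standard API.

```python
LINE_SEP = "\u2028"  # U+2028 LINE SEPARATOR
PARA_SEP = "\u2029"  # U+2029 PARAGRAPH SEPARATOR


def encode_book(paragraphs):
    """Join paragraphs (each a list of original lines) into one string.

    Separators are written only between items, so there is no leading
    or trailing separator.
    """
    return PARA_SEP.join(LINE_SEP.join(lines) for lines in paragraphs)


def decode_book(text):
    """Recover the list-of-lists structure from an encoded string."""
    return [para.split(LINE_SEP) for para in text.split(PARA_SEP)]


# Hypothetical sample data based on the sentence in the question.
book = [
    ["This is a sentence in the original",
     "book that I want to mark line",
     "break locations."],
    ["A second paragraph."],
]

encoded = encode_book(book)
assert decode_book(encoded) == book
```

Explicit `split` calls are used here because Python's `str.splitlines()` treats both U+2028 and U+2029 as line boundaries and so would not tell lines and paragraphs apart. Note that in this sketch an empty string decodes to one paragraph holding one empty line.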

Beware that U+2028 and U+2029 are not generally understood by text editors; they are suitable for storing data in a plain text format. To present the text, rendering software has the option of ignoring the original division into lines and treating U+2028 as equivalent to a space, except after a hyphen. That case is a problem that cannot be solved without higher-level information: a line that ends with "foo-" and is followed by a line starting with "bar" may represent the word "foobar" hyphenated for line breaking, the hyphenated compound "foo-bar", or in some cases the combination "foo- bar".
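The rendering option described above might be sketched as follows. This is only one possible policy, under assumptions of my own: the function name `reflow` is hypothetical, paragraph separators are turned into \n to match the question's convention, and a line separator that follows a hyphen is simply left in place because the code has no way to decide how the two halves should be joined.

```python
import re

LINE_SEP = "\u2028"  # U+2028 LINE SEPARATOR
PARA_SEP = "\u2029"  # U+2029 PARAGRAPH SEPARATOR

# Match a line separator that is NOT directly preceded by a hyphen.
_SOFT_BREAK = re.compile("(?<!-)" + LINE_SEP)


def reflow(text):
    """Prepare stored text for display, ignoring original line division.

    Line separators become spaces, except after a hyphen, where the
    separator is kept so that a later step (or a human) can resolve it.
    Paragraph separators become newline characters.
    """
    text = _SOFT_BREAK.sub(" ", text)
    return text.replace(PARA_SEP, "\n")


sample = "mark line\u2028break locations, foo-\u2028bar\u2029Next paragraph"
display = reflow(sample)
```

After this call, `display` still contains one U+2028, directly after "foo-", which marks the spot where higher-level information would be needed.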
