Saturday, January 30, 2010

New Line Character's Character Count

Dear All,

Each newline (Enter key) can occupy 2 characters instead of 1 wherever a line break is stored as CR followed by LF.



A newline (Enter key), also known as a line break or end-of-line (EOL) marker, is a special character. In some configurations (operating system + application) it is actually a sequence of characters signifying the end of a line of text.



The name comes from the fact that the next character after the newline will appear on a new line—that is, on the next line below the text immediately preceding the newline. The actual codes representing a newline vary across operating systems, which can be a problem when exchanging data between systems with different representations.



In the above example I entered
"Hi<NewLine><NewLine>Bye", which is 7 characters in total if we count each "NewLine" as a single character. Until now, this is what we would expect.

I selected the preview option and clicked Send.



In the above image the character count says
9 characters, because each "NewLine" (Enter key) occupies 2 characters: CR followed by LF.
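The count above can be checked with a few lines of Python. This is only a minimal sketch: the variable names and the `count_characters` helper are made up for illustration, and it assumes the application stores each Enter key press as CR+LF.

```python
# Hypothetical message text: "Hi", two line breaks, then "Bye".
message_lf = "Hi\n\nBye"        # each line break stored as LF only
message_crlf = "Hi\r\n\r\nBye"  # each line break stored as CR+LF

def count_characters(text: str) -> int:
    """Count every character, including the CR and LF control characters."""
    return len(text)

print(count_characters(message_lf))    # 7
print(count_characters(message_crlf))  # 9
```

The 5 letters are the same in both strings; the difference of 2 comes from the extra CR in front of each LF.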


How To Use New Line In Different Programming Languages


VB : vbCrLf
VB.Net : vbCrLf
C & C++ : "\n"
HTML : "<br />"



Now, what does CrLf mean?


The Unicode standard defines several characters and character sequences that conforming applications should recognize as line terminators:


LF: Line Feed, U+000A
CR: Carriage Return, U+000D
CR+LF: CR (U+000D) followed by LF (U+000A)
NEL: Next Line, U+0085
FF: Form Feed, U+000C
LS: Line Separator, U+2028
PS: Paragraph Separator, U+2029
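As a rough illustration of the list above, Python's built-in `str.splitlines()` treats all of these terminators as line boundaries (it also splits on a few others, such as vertical tab). The sample string and the `split_lines` wrapper below are made up for this sketch.

```python
def split_lines(text: str) -> list:
    """Split text at every line terminator that str.splitlines() recognizes."""
    return text.splitlines()

# One sample string using each terminator from the list once:
# LF, CR, CR+LF, NEL, FF, LS, PS.
sample = "a\nb\rc\r\nd\u0085e\u000Cf\u2028g\u2029h"

print(split_lines(sample))  # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
```

Note that CR+LF is treated as a single line break, not as two.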


LF – Line feed


The line feed character is one of the characters in the ASCII character set that has been misused. Originally, the LF character was meant to move the head of a printer one line down. A second control character, CR, would then be used to move the printing head to the left margin. This is the way it was implemented in many serial protocols and in operating systems like MS-DOS and Windows. On the other hand, the C programming language and the Unix operating system redefined this character as newline, which meant a combination of line feed and carriage return. You can argue about which use is wrong. The way C and Unix handle it is certainly more natural from a programming point of view. On the other hand, the MS-DOS implementation is closer to the original definition. It would have been better if both line feed and newline had been part of the original ASCII definition, because the first defines a typical device control function whereas the latter is a logical text separator. But this separation does not exist. Nowadays people tend to use the LF character mainly as a newline, and most software that handles plain ASCII text files is capable of handling both single LF and CR/LF combinations. In the C programming language the control character is available as \n.
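To see the original printer meaning of the two codes, here is a small toy simulation. It is only a sketch: `render_on_teletype` is a made-up name, and it models nothing more than a print head with a row and a column.

```python
def render_on_teletype(data: str) -> list:
    """Simulate a print head: CR returns to column 0, LF moves one line down."""
    rows = [[]]
    row, col = 0, 0
    for ch in data:
        if ch == "\r":
            col = 0                      # carriage return: back to the left margin
        elif ch == "\n":
            row += 1                     # line feed: one line down, same column
            if row == len(rows):
                rows.append([])
        else:
            line = rows[row]
            # Pad with spaces if the head is to the right of the existing text.
            while len(line) <= col:
                line.append(" ")
            line[col] = ch
            col += 1
    return ["".join(r) for r in rows]

print(render_on_teletype("Hi\r\nBye"))  # ['Hi', 'Bye']
print(render_on_teletype("Hi\nBye"))    # ['Hi', '  Bye']
```

With LF alone the head moves down but stays in column 2, so "Bye" is printed indented. That is why the original definition needs both codes, while C and Unix let LF alone stand for the whole newline.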

CR – Carriage return


The carriage return in the ASCII character set is, in its original form, meant to move the printing head back to the left margin without moving to the next line. Over time this code has also been assigned to the Enter key on keyboards to signal that the input of text is finished. With screen-oriented representation of data, people wanted entering data to also move the cursor to the next line. Therefore, in the C programming language and the Unix operating system, the LF control code was redefined as newline. Often software now silently translates an entered CR to the LF ASCII code when the data is stored.
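The silent translation described above could look roughly like this. It is a minimal sketch, not how any particular application does it, and `normalize_newlines` is a made-up helper name.

```python
def normalize_newlines(text: str) -> str:
    """Store every line break as a single LF, whatever was entered."""
    # Replace CR+LF first so the pair does not become two LFs,
    # then translate any remaining lone CR.
    return text.replace("\r\n", "\n").replace("\r", "\n")

entered = "Hi\r\n\r\nBye"          # 9 characters as typed with CR+LF
stored = normalize_newlines(entered)

print(len(entered), len(stored))   # 9 7
```

After this kind of translation, the same message is back to the 7 characters we expected at the start.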