"pageEncoding" is one of the attributes supported by the JSP page directive.
pageEncoding tells the JSP container which character encoding to use when it reads the JSP file from the OS file system. If the programmer omits the pageEncoding attribute, non-ASCII text such as Cyrillic and Chinese characters may be read incorrectly and appear garbled.
A file may contain only text. Even so, it is, after all, a sequence of bytes. The JSP container must be told how to interpret those bytes as characters. That is the job of pageEncoding, whose value is the name of a charset (character set).
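As a small illustration of why the charset name matters, the sketch below decodes the same two bytes under two different charsets. It is plain Java, not part of any JSP container; the class name DecodeBytesDemo and the method decode are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows that identical bytes yield different text
// depending on the charset used to decode them.
class DecodeBytesDemo {

    // Interpret the raw bytes as characters using the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the UTF-8 encoding of U+00E9 (e with acute accent).
        byte[] raw = {(byte) 0xC3, (byte) 0xA9};

        String asUtf8 = decode(raw, StandardCharsets.UTF_8);
        String asLatin1 = decode(raw, StandardCharsets.ISO_8859_1);

        // Lengths are printed instead of the text itself so the output
        // does not depend on the console encoding.
        System.out.println("Decoded as UTF-8: " + asUtf8.length() + " character(s)");
        System.out.println("Decoded as ISO-8859-1: " + asLatin1.length() + " character(s)");
    }
}
```

Read as UTF-8 the two bytes form one character; read as ISO-8859-1 they become two unrelated characters, which is the kind of garbling a wrong pageEncoding can cause.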
1. What is UTF-8?
- UTF stands for Unicode Transformation Format (originally UCS Transformation Format, where UCS is the Universal Coded Character Set defined by ISO/IEC 10646).
- A character in UTF-8 occupies 1 to 4 bytes.
- UTF-8 is an encoding of Unicode characters. (Java also supports Unicode; its char type is 2 bytes and holds one UTF-16 code unit.)
- ASCII is a subset of Unicode, and UTF-8 encodes ASCII characters with the same single-byte values, so UTF-8 supports ASCII characters also.
- UTF-8 is the most widely used encoding for email and Web pages.
- XML conforms to UTF-8 by default (a document without an encoding declaration or byte order mark is read as UTF-8).
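The variable size of UTF-8 characters can be checked with a few lines of Java. This is only a sketch; the class name Utf8LengthDemo and the method utf8Length are invented for illustration, and the sample characters are written as Unicode escapes so the source file's own encoding does not matter.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: counts the bytes UTF-8 needs for a piece of text.
class Utf8LengthDemo {

    // Number of bytes produced when the text is encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // U+0041, ASCII letter: prints 1
        System.out.println(utf8Length("\u00e9"));       // U+00E9, e with acute: prints 2
        System.out.println(utf8Length("\u20ac"));       // U+20AC, euro sign: prints 3
        System.out.println(utf8Length("\uD83D\uDE00")); // U+1F600, two Java chars: prints 4
    }
}
```

The last sample also shows the Java side of the story: a character outside the 2-byte char range is stored as a pair of char values in a String, yet it is a single 4-byte sequence in UTF-8.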
2. What is ISO-8859-1?
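- ISO-8859-1 and ISO-8859-15 are mostly similar in usage; ISO-8859-15 replaces a few symbols, for example to make room for the euro sign.
- ISO-8859-1 covers the characters of Western European languages, which also makes it usable for the main languages of the Americas and much of Africa.
- ISO-8859-1 conforms to ASCII: its first 128 codes are the ASCII characters.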
- ISO-8859-1 is the default in JSP.
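The points above can be probed from Java with the standard charset classes. The following is a minimal sketch; Latin1Demo, fitsInLatin1 and sameBytesAsAscii are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: checks what ISO-8859-1 can and cannot represent.
class Latin1Demo {

    // True if every character of the text has a code in ISO-8859-1.
    static boolean fitsInLatin1(String text) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(text);
    }

    // True if the text encodes to the same bytes in US-ASCII and ISO-8859-1.
    static boolean sameBytesAsAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(ascii, latin1);
    }

    public static void main(String[] args) {
        System.out.println(fitsInLatin1("caf\u00e9"));  // accented Western European letter
        System.out.println(fitsInLatin1("\u20ac"));     // euro sign
        System.out.println(sameBytesAsAscii("Hello"));  // plain ASCII text
    }
}
```

The accented letter fits in ISO-8859-1 while the euro sign does not, and plain ASCII text produces the same bytes under both charsets.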
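In real practice, UTF-8 and ISO-8859-1 can be used interchangeably only for plain ASCII text, including symbols like ~, %, & and +, because both charsets encode those characters identically. Problems come with non-ASCII characters such as accented letters: if a file is read with a charset different from the one it was saved in, those characters come out garbled. If pageEncoding is set to a charset name the container does not support, it is typically reported as a page-translation error (raised when a JSP is converted to a Servlet source file).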
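To see which characters are actually affected by a UTF-8 versus ISO-8859-1 mix-up, one can compare the bytes each charset produces. This is a sketch in plain Java under the assumption that a byte-for-byte comparison is a fair stand-in for "interchangeable"; the names EncodingCompareDemo and sameInBothEncodings are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: compares UTF-8 and ISO-8859-1 byte output.
class EncodingCompareDemo {

    // True if the text is stored as identical bytes in UTF-8 and ISO-8859-1,
    // so reading it back with either charset gives the same characters.
    static boolean sameInBothEncodings(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(utf8, latin1);
    }

    public static void main(String[] args) {
        System.out.println(sameInBothEncodings("~%&+"));         // ASCII symbols
        System.out.println(sameInBothEncodings("<html>Hi</html>")); // ASCII markup
        System.out.println(sameInBothEncodings("\u00e9"));       // e with acute accent
    }
}
```

ASCII symbols and markup come out identical in both charsets; only the accented letter differs, which is where a mismatched pageEncoding becomes visible.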
Some Examples of pageEncoding in JSP
<%@ page contentType="text/xml" pageEncoding="UTF-8" %>
<%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8" %>
charset and pageEncoding are equally important and should not be confused with each other. charset is the encoding of the data sent as the response to the client by the Web server. pageEncoding is the encoding used to read the JSP file from the file system. For plain English (ASCII) text files, the two make no difference.
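The two roles can be imitated outside a container: one charset is used to read the bytes on disk, another to produce the bytes that go out. The sketch below is only an analogy in plain Java, not how any particular JSP container is implemented; TranscodeDemo and transcode are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: reads bytes with one charset, writes with another.
class TranscodeDemo {

    // Decode the file bytes using the page encoding, then encode the
    // resulting text using the response charset.
    static byte[] transcode(byte[] fileBytes, Charset pageEncoding, Charset responseCharset) {
        String text = new String(fileBytes, pageEncoding);
        return text.getBytes(responseCharset);
    }

    public static void main(String[] args) {
        // "cafe" with an accented final letter, as stored in ISO-8859-1 (4 bytes).
        byte[] onDisk = {0x63, 0x61, 0x66, (byte) 0xE9};

        byte[] sent = transcode(onDisk, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        // The accented letter takes 2 bytes in UTF-8, so 4 bytes become 5.
        System.out.println(onDisk.length + " bytes on disk, " + sent.length + " bytes sent");
    }
}
```

Here the first charset plays the part of pageEncoding="ISO-8859-1" and the second plays the part of charset=UTF-8 in contentType; the text is the same, only its byte representation changes between file and response.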