Character Conversions from Browser to Database in Java



Characters en route to their final storage destination on the World Wide Web
move through various layers of programming interfaces and can cross software and hardware boundaries.
This article provides hints and best practices for accurately transporting character data from browser to
database and back again.

What if you want to POST non-ASCII data? All is well since you set that URIEncoding flag, right? Wrong. Tomcat doesn’t use the URIEncoding flag for POSTed form data. So, what does it use? By default, ISO-8859-1.

So now, you’re back where you started, and the simple application still greets Mr. ç°ä instead of Mr. 田中 (Tanaka). Not good. Sun’s application server, however, does correctly
interpret both GET and POST data after setting the parameter-encoding tag as shown earlier, so this coding works well for users on that system.
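
The garbled greeting can be reproduced outside any server. The sketch below is only an illustration, not what Tomcat runs internally: it encodes the name as UTF-8 bytes, as a browser submitting a UTF-8 form would, and then decodes those bytes as ISO-8859-1. The class and method names (MojibakeDemo, misread) are made up for this example.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encode as UTF-8, then decode the same bytes with the wrong charset,
    // which is what happens when a server assumes ISO-8859-1.
    public static String misread(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String name = "\u7530\u4e2d"; // the two characters of "Tanaka"
        String garbled = misread(name);
        // Each of the six UTF-8 bytes becomes its own character.
        System.out.println("original length: " + name.length());
        System.out.println("garbled length:  " + garbled.length());
        System.out.println("garbled text:    " + garbled);
    }
}
```

Three of the six resulting characters are invisible control or formatting characters, which is why only "ç°ä" shows up on screen.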

Unfortunately, these solutions are completely server dependent, and you can’t always control where your applications will be deployed. Fortunately, server-independent solutions exist.

Perhaps the most basic server-independent solution is to set a context parameter indicating the character encoding choice for all forms in the application. Then your application can read the
context parameter and set the request character encoding before reading any request parameters. You can set the request encoding in either a Java servlet or a JSP page.

Setting the context parameter is done in the WEB-INF/web.xml file.

<context-param>
    <param-name>PARAMETER_ENCODING</param-name>
    <param-value>UTF-8</param-value>
</context-param>

Add the following code just before reading any parameters in the JSP file.

String paramEncoding = application.getInitParameter("PARAMETER_ENCODING");
request.setCharacterEncoding(paramEncoding);

From a control servlet, you can read the parameter during servlet initialization and set the encoding before processing the parameters:

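import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class MsgHandler extends HttpServlet {

    private String encoding;

    @Override
    public void init() throws ServletException {
        // PARAMETER_ENCODING is declared as a <context-param>, so it is read
        // from the ServletContext rather than from the servlet's own init-params.
        encoding = getServletContext().getInitParameter("PARAMETER_ENCODING");
    }

    protected void processRequest(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // Set the encoding before the first call that reads request parameters.
        if (encoding != null) {
            request.setCharacterEncoding(encoding);
        }
        // Read request parameters and build the response here.
    }

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        processRequest(request, response);
    }

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        processRequest(request, response);
    }
}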
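
The servlet only guards against a missing parameter. Since setCharacterEncoding throws UnsupportedEncodingException for a charset name the JVM does not know, one possible refinement is to validate the configured value once at startup. The helper below is a sketch under that assumption; EncodingConfig and resolveEncoding are hypothetical names, not part of the servlet API.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

public class EncodingConfig {

    // Return the configured charset name if this JVM supports it,
    // otherwise return the fallback supplied by the caller.
    public static String resolveEncoding(String configured, String fallback) {
        if (configured == null || configured.trim().isEmpty()) {
            return fallback;
        }
        String name = configured.trim();
        try {
            return Charset.isSupported(name) ? name : fallback;
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveEncoding("UTF-8", "ISO-8859-1"));
        System.out.println(resolveEncoding(null, "ISO-8859-1"));
        System.out.println(resolveEncoding("no-such-charset", "ISO-8859-1"));
    }
}
```

In init(), the value read from the context parameter could be passed through such a helper, so a typo in web.xml surfaces as a predictable fallback instead of an exception on every request.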

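Now whether you GET or POST form data, your processing JSP or servlet code can correctly read parameters because you’ve explicitly told the web server
which character encoding to use when processing the request. (On Tomcat, GET query strings still depend on the URIEncoding setting, because setCharacterEncoding applies only to the request body.)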
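
To see why the declared encoding matters, it helps to look at what a form value looks like on the wire. The sketch below uses the standard java.net.URLDecoder to decode the same percent-encoded value under two charsets. It illustrates the principle only; it is not the code a servlet container runs, and FormDecodeDemo and decodeParam are names invented for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FormDecodeDemo {

    // Decode one percent-encoded form value using the given charset name.
    public static String decodeParam(String rawValue, String encoding) {
        try {
            return URLDecoder.decode(rawValue, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        // How a browser sends the two-character name from a UTF-8 form.
        String raw = "%E7%94%B0%E4%B8%AD";
        String right = decodeParam(raw, "UTF-8");
        String wrong = decodeParam(raw, "ISO-8859-1");
        System.out.println("UTF-8 decode matches name: " + right.equals("\u7530\u4e2d"));
        System.out.println("ISO-8859-1 decode length:  " + wrong.length());
    }
}
```

The bytes are identical in both calls; only the charset used to interpret them differs, and that is the choice request.setCharacterEncoding makes for the request body.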

For more detailed information, visit:

http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/
http://wiki.apache.org/tomcat/FAQ/CharacterEncoding
