First results: puzzling
I guess my JSP is extremely unfair to our production servers. I'll just put it right here, for future reference.
So, yesterday I was complaining about how my Opteron workstation handles 500 threads without breaking a sweat. Today I ran the same test on a dual-Xeon box, and the results are much worse: 300 threads bring this one down to 50% CPU idle, and 500 eventually make it time out as it gets extremely busy. I wonder what the major issue is here. The differences between the two setups are numerous.
My workstation is a uniprocessor Opteron (1 MB L2 cache) with 1 GB of memory, running Solaris and the 64-bit JDK 1.5.
The production server has 2 Xeon processors with 512 kB L2 cache and 2 GB of memory; it runs a not very modern (but not too old either) version of RedHat Workstation and uses the 32-bit JDK 1.5.
A priori, the results should have been quite different. I'm begging for an explanation! Is threading in Java on RedHat so much worse than on Solaris? Do too many (500, how's that too many?!) open network sockets overload the TCP/IP stack? Does the fact that I put the thread to sleep 4 times while generating this page put the *ell server at a disadvantage?
Note that in this particular test I was using the query string "?load=0&size=200&delay=30000". With load=0 the loop in workSome() never runs, so at least the cost of adding long integers is out of the picture.
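One way to separate the threading question from the socket question would be to run the same sleeping pattern outside any servlet container. The following is only a sketch under that assumption: the class and method names are made up for illustration, and it exercises nothing but thread creation and Thread.sleep(). With delay=30000 each thread sleeps 4 times for 7500 ms, just as sleepAbit() does in the page.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical standalone test; not part of the JSP. */
class SleepingThreadsSketch {

    /**
     * Starts threadCount threads. Each one sleeps four times for
     * delayMillis / 4, mirroring sleepAbit() in the page.
     * Returns the number of threads that completed all four sleeps.
     */
    static int run(int threadCount, final long delayMillis) throws InterruptedException {
        final AtomicInteger finished = new AtomicInteger();
        final CountDownLatch done = new CountDownLatch(threadCount);
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    try {
                        for (int s = 0; s < 4; s++) {
                            Thread.sleep(delayMillis / 4);
                        }
                        finished.incrementAndGet();
                    } catch (InterruptedException e) {
                        // Restore the interrupt flag and give up on this thread.
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();
                    }
                }
            });
            t.start();
        }
        done.await(); // wait until every thread has either finished or been interrupted
        return finished.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Defaults are the values from the test described above.
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 500;
        long delay = args.length > 1 ? Long.parseLong(args[1]) : 30000L;
        long start = System.currentTimeMillis();
        int finished = run(threads, delay);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(finished + " of " + threads
                + " threads finished in " + elapsed + " ms");
    }
}
```

If both machines get through 500 threads in roughly 30 seconds here, sleeping threads alone are probably not the problem, and the sockets or the container deserve a closer look.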
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<%@ page import="java.io.IOException,
java.text.NumberFormat,
java.text.DecimalFormat"
contentType="text/html;charset=UTF-8" language="java" %>
<html>
<head><title>Simple jsp page</title></head>
<body>
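<%!
// Declarations in a <%! %> block become fields of the generated servlet class,
// so all request threads share them. Overwriting them per request is harmless
// only while every request passes the same parameters.
int delay = 300; // total sleep time per page, in ms
int size = 100/* 4 * 256 */; // loop iterations per output chunk
int load = 100; /* number of heavy operations
taking significant CPU and
some time to execute */
long tempLoadVar; // accumulator for the CPU loop
static int LOAD_ITERATIONS = 1000000; // additions per heavy operation
%>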
<%
if (request.getParameter("delay") != null) {
    delay = Integer.parseInt(request.getParameter("delay"));
}
if (request.getParameter("size") != null) {
    size = Integer.parseInt(request.getParameter("size"));
}
if (request.getParameter("load") != null) {
    load = Integer.parseInt(request.getParameter("load"));
}
%>
<%
outputSome(out);
sleepAbit();
workSome();
outputSome(out);
sleepAbit();
workSome();
outputSome(out);
sleepAbit();
outputSome(out);
sleepAbit();
%>
</body>
</html>
<%!
/** Sleep for just a bit.
 * Expect to sleep 4 times while page executes. */
private void sleepAbit() {
    try {
        if (delay != 0) {
            Thread.sleep(delay/4);
        }
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}

/** Output part of the content. 4 chunks per page. */
private void outputSome(JspWriter out) throws IOException {
    NumberFormat format36 = new DecimalFormat();
    format36.setMinimumIntegerDigits(36);
    format36.setMaximumFractionDigits(0);
    format36.setGroupingUsed(false);
    NumberFormat format10 = new DecimalFormat();
    format10.setMinimumIntegerDigits(10);
    format10.setMaximumFractionDigits(0);
    format10.setGroupingUsed(false);
    NumberFormat format3 = new DecimalFormat();
    format3.setMinimumIntegerDigits(3);
    format3.setMaximumFractionDigits(0);
    for (int i = 0; i < size; i++) {
        out.write("<p><div><pre>\n"); // 14
        out.write("\t\r\n"); // 3
        out.write(format36.format(tempLoadVar)); // 36
        out.write(" : "); // 3
        out.write(format10.format(i)); // 10
        out.write(": blah\n"); // 7
        for (int j = 0; j < 4; j++) {
            for (int k = 0; k < 4; k++) {
                out.write("\t\t" + format3.format(j)
                        + " " + format3.format(k) + "\n"); // 10
            }
        }
        out.write("</pre></div></p>\n<br />"); // 23
    }
}

/** Use up some CPU. Do that twice. */
private void workSome() {
    for (int i = 0; i < load; i++) {
        for (int j = 0; j < LOAD_ITERATIONS; j++) {
            tempLoadVar = tempLoadVar + 1;
        }
    }
}
%>
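The numbers in the trailing comments of outputSome() are character counts per write. Taking them at face value, one loop iteration writes 256 characters, which is where the "4 * 256" note next to size comes from. Here is a small sketch of that arithmetic; the class and method names are hypothetical, and the static HTML around the chunks is not counted.

```java
/** Hypothetical helper that adds up the counts from outputSome()'s comments. */
class PageSizeSketch {

    /** Characters written by one iteration of the outer loop in outputSome(). */
    static int charsPerIteration() {
        int header = 14 + 3 + 36 + 3 + 10 + 7; // fixed strings plus the two zero-padded numbers
        int grid = 4 * 4 * 10;                 // 16 lines: two tabs, 3 digits, space, 3 digits, newline
        int footer = 23;                       // closing tags and <br />
        return header + grid + footer;
    }

    /** Characters written by the four outputSome() calls for a given size parameter. */
    static long bodyChars(int size) {
        return 4L * size * charsPerIteration();
    }

    public static void main(String[] args) {
        System.out.println(charsPerIteration() + " characters per iteration");
        System.out.println(bodyChars(200) + " characters for size=200");
    }
}
```

For size=200 that comes to 204800 characters per page. Everything written is plain ASCII as long as the default locale formats digits as 0-9, so under that assumption the UTF-8 byte count is the same.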
2 Comments:
Just to be sure: are you comparing a 2.4 kernel (for Xeon) to 2.6 or even 2.4-el (Opteron)? The old 2.4 had an outrageous thread model: 500 threads was not an option in my experiment.
Besides, the Opteron architecture is superior beyond compare, IMHO. Xeon is only usable if it's given away for free.
The dual-Xeon box runs RHWS with a 2.4 kernel. But the Opteron machine I tested under Solaris 10 x86 only... Life's unfair :)