Discussion:
Help with a benchmark.
Eric Berry
2011-10-27 05:28:51 UTC
Hello, I've been setting up a benchmark for my company lately where I'm
testing Grails against PHP. The PHP framework is a home-brewed MVC that the
company built in the last few years.

I'm seeing a gross difference in favor of PHP at higher connection counts
(200+).

The PHP app was done by another developer as I don't know PHP well enough.

But here's a summary of the set up.

The benchmark is set up on 4 nodes in EC2 - Large Standard Instances.

Node1: Benchmark tool (Faban)
  Set up to do 3 HTTP requests
    Request 1: GET the "create" page.
      Simple "user" form with fields for first and last name.
    Request 2: POST to the "save" action, which results in just the
      new User's ID rendered as content.
    Request 3: GET the "show" page based on the ID from the previous
      POST request.

Node2: Mock User-API (REST - Jersey, Jackson)
  User objects are stored in memory using a hashmap
  IDs created using System.currentTimeMillis()
  Maximum recorded request time 93ms over the last few days.

Node3: Grails app (Tomcat 6)
  Set up in Production mode using an in-memory DB. The DB is not used,
  all data comes from a UserService
  UserService uses RESTClient (rest plugin) to make REST calls to the mock
  user-api.
  UserService marked with transactional = false

Node4: PHP app
  Uses APC cache

Up to about 200 concurrent users Grails and PHP are neck and neck. After 200
concurrent users Grails starts to fall behind drastically.

I start seeing Tomcat errors complaining about the 200 thread limit being
reached. If I increase it, I get "Too Many Files Opened" errors.

One thing I notice is that there are significantly long requests. Especially
at the beginning of the run and randomly throughout the steady-state. These
long requests take upwards of 45 seconds to complete.

The heap is set to 1024m, and I'm using ConcMarkSweep, and ClassUnloading
(thinking it might be a GC issue).

The odd thing about these long requests is that the monitoring I have in
place on the mock api side shows that the longest running request rarely
goes over 100ms. Once I had a long request go to 200ms.

My issue is that I'm not seeing the same behavior from the PHP app. At 400
concurrent users the Grails app struggles, and fails to keep the avg request
time below 1 sec and maxes out at around 300 req/sec and I start to see
connection timeout errors. The PHP app seems to run smoothly and keeps the
avg. request time below 1 sec. and takes around 500 req/sec with no
timeouts.

I'd like to ask the Grails community for some help. Are there areas in
Grails I should look out for, known issues, or common performance tunes I
can do to Grails/Tomcat? I can't imagine PHP doing so significantly better
than Grails.

Any help, or advice would be greatly appreciated.

Thanks,
Eric
--
Learn from the past. Live in the present. Plan for the future.
Blog: http://eric-berry.blogspot.com
jEdit <http://www.jedit.org> - Programmer's Text Editor
Bazaar <http://bazaar.canonical.com> - Version Control for Humans
Lari Hotari
2011-10-28 05:36:00 UTC
Hi,
All dynamic methods will create a new http client when invoked unless
you define an |id:| attribute. When this attribute is supplied the
client will be stored as a property on the instance's metaClass. You
will be able to access it via regular property access or using
the |id:| again.
Have you specified an id-attribute in grails-rest with* calls?

Relates to
http://stackoverflow.com/questions/1281219/best-practice-to-use-httpclient-in-multithreaded-environment
All the while, I am using HttpClient in multithreaded environment. For
every threads, when they initiate a connection, they will create a
complete new HttpClient instance.
Recently, I discover, by using this approach, it can cause the user is
having too many port being opened, and most of the connections are in
TIME_WAIT state.
So a new HttpClient instance leaves a lot of connections in TIME_WAIT
state for some time.
The "too many open files" error you got is probably caused by this
behaviour (Googled:
http://www.linuxquestions.org/questions/linux-newbie-8/var-log-messages-too-many-files-open-335384/)
.

Before fixing anything, re-run your benchmark and check with "netstat
-an|grep WAIT" command to see if it's a long list.
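
A per-state tally can be easier to read than a long grep listing. The
following is only a minimal sketch: the function name count_tcp_states is
made up for illustration, and it assumes the Linux net-tools netstat layout,
where the connection state is the last column of each tcp line.

```shell
#!/bin/sh
# Tally TCP connections by state from `netstat -an`-style input on stdin.
# Assumption: every line whose first column starts with "tcp" carries the
# connection state (TIME_WAIT, ESTABLISHED, LISTEN, ...) in its last column.
count_tcp_states() {
    awk '$1 ~ /^tcp/ { counts[$NF]++ }
         END { for (state in counts) print state, counts[state] }' | sort
}

# Typical use while the benchmark is running:
#   netstat -an | count_tcp_states
```

A large TIME_WAIT count next to a small ESTABLISHED count would suggest that
connections are being opened and closed per request rather than re-used.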

--

Grails creates a session for session-scope and flash-scope.
For a benchmark, I'd recommend setting the session timeout to a low
value since a huge amount of hanging sessions could be created if the
benchmark never "logs out" and keeps creating new sessions. You can set
the timeout in web.xml (after doing "grails install-templates") to
something like 1 minute.

Add this to src/templates/war/web.xml after the servlet-mapping element:
<session-config>
<!-- 1 minute timeout for benchmarking -->
<session-timeout>1</session-timeout>
</session-config>

--

btw. What Grails version are you testing with? What exact JVM version?
exact JVM params?
What Tomcat version? OS version?
Have you set Permgen size to something reasonable (for example:
-XX:PermSize=128M -XX:MaxPermSize=256M)?

Example settings for Tomcat:
Create <tomcat home directory>/bin/setenv.sh file with settings:
CATALINA_OPTS="-server -noverify -Xshare:off -Xms512M -Xmx512M -XX:MaxPermSize=256M -XX:PermSize=128M -XX:+UseParallelGC"
CATALINA_OPTS="${CATALINA_OPTS} -Dgrails.env=production"
CATALINA_OPTS="${CATALINA_OPTS} -Djava.net.preferIPv4Stack=true -XX:+EliminateLocks -XX:+UseBiasedLocking"
# Optional setting: MaxJavaStackTraceDepth. It reduces the performance
# overhead of long exception stacktraces by limiting their size to 100
# stacktrace elements.
# You should eliminate all exceptions (even caught ones) to get good Grails
# performance.
# Always use some JVM profiler (for example YourKit YJP) to see if any
# exceptions are created during normal program flow.
# Grails and Groovy create some exceptions during initialization, but after
# a few requests no new exceptions should be created.
CATALINA_OPTS="${CATALINA_OPTS} -XX:MaxJavaStackTraceDepth=100"

--

Have you checked if your OS starts swapping? (Monitoring in Linux:
"vmstat 1", make sure "swap si / so" are 0 during the benchmark).
For Linux, you should usually tune swappiness for running a JVM:
in /etc/sysctl.conf
vm.swappiness=5
(changing temporarily/immediately: "echo 5 | sudo tee
/proc/sys/vm/swappiness")
Checking current value: cat /proc/sys/vm/swappiness

vm.swappiness defaults to 60 on many distros. It is not a hard threshold:
higher values make the kernel more willing to swap out application memory
in favour of the page cache. Since the JVM allocates a lot of memory, the
default will usually cause some swapping if you don't tune swappiness.
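
Watching "vmstat 1" by eye for a whole benchmark run is tedious, so the check
can be scripted. This is a rough sketch only: count_swap_samples is a
hypothetical helper name, and it assumes the usual procps vmstat layout with
a "procs ..." banner line and a column-name line containing "si" and "so".

```shell
#!/bin/sh
# Count the vmstat samples on stdin that show any swap-in (si) or swap-out
# (so) activity. Column positions are read from the header line rather than
# hard-coded, since vmstat repeats the header periodically.
# Note: the first sample vmstat prints reports averages since boot.
count_swap_samples() {
    awk '
        $1 == "procs" { next }              # banner line
        $1 == "r" {                         # column-name line
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        si && so && ($si > 0 || $so > 0) { swapping++ }
        END { print swapping + 0 }
    '
}

# Typical use, sampling once per second for a minute:
#   vmstat 1 60 | count_swap_samples
```

A result of 0 over the whole run means swapping can be ruled out as a cause
of the long requests.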

You might also have to tune the Linux OS TCP/IP settings if "netstat
-an" still shows a lot of hanging connections after adding the
id-parameter to grails-rest calls.

--

In benchmarking Grails you usually have to first do about 20000 requests
and give the JVM a breath for about 10 seconds after that (I assume the
JVM delays background compilation when CPU usage is high and it does JIT
compilation during this sleep time).
After this, the throughput performance is usually about 5x better than
in the beginning.
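
The warm-up described above could be scripted roughly like this before the
measured run starts. It is a sketch under assumptions: warm_up is a made-up
helper, the URL in the usage comment is hypothetical, and a load tool's own
ramp-up setting may be a better fit than a sequential loop.

```shell
#!/bin/sh
# Run a command a fixed number of times, discarding its output, then pause
# so the JVM gets some idle time before the measured run.
#   $1 = number of warm-up requests
#   $2 = pause in seconds after the warm-up
#   remaining arguments = the command that issues one request
warm_up() {
    requests=$1
    pause=$2
    shift 2
    i=0
    while [ "$i" -lt "$requests" ]; do
        "$@" >/dev/null 2>&1 || true    # ignore individual request failures
        i=$((i + 1))
    done
    sleep "$pause"
}

# Example with the numbers mentioned above (the URL is hypothetical):
#   warm_up 20000 10 curl -s http://grails-node:8080/app/user/create
```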

Regards,

Lari
Eric Berry
2011-10-28 06:48:33 UTC
Hi Lari,
Thank you very much for the pointers. My comments below.
Post by Lari Hotari
Have you specified an id-attribute in grails-rest with* calls?
No I did not, I was just using path: 'query/path'.
Post by Lari Hotari
All the while, I am using HttpClient in multithreaded environment. For
every threads, when they initiate a connection, they will create a complete
new HttpClient instance.
Recently, I discover, by using this approach, it can cause the user is
having too many port being opened, and most of the connections are in
TIME_WAIT state.
So a new HttpClient instance leaves a lot of connections in TIME_WAIT state
for some time.
The "too many open files" error you got is probably caused by this
behaviour (Googled:
http://www.linuxquestions.org/questions/linux-newbie-8/var-log-messages-too-many-files-open-335384/).
Before fixing anything, re-run your benchmark and check with "netstat
-an|grep WAIT" command to see if it's a long list.
I suspected HttpClient might be leaking connections somehow, and I found
this:
http://stackoverflow.com/questions/4724193/how-can-i-ensure-that-my-httpclient-4-1-does-not-leak-sockets

I added some code to call the closeExpiredConnections() method on the
connection manager, which does seem to help some, but I still get the too
many ports being opened exception occasionally and it seems to slow down the
requests some (30-50 ms).

I can't try netstat atm as I'm away from the benchmarking environment, but I
will try this tomorrow and report back any findings. Do you think adding the
ID would help any, or would I still need to call closeExpiredConnections()?
Post by Lari Hotari
Grails creates a session for session-scope and flash-scope .
For a benchmark, I'd recommend setting the session timeout to a low value
since a huge amount of hanging sessions could be created if the benchmark
never "logs out" and keeps creating new sessions. You can set the timeout in
web.xml (after doing "grails install-templates") to something like 1 minute.
<session-config>
<!-- 1 minute timeout for benchmarking -->
<session-timeout>1</session-timeout>
</session-config>
Ah, nice. I hadn't thought of that. I will try this tomorrow as well. Thank
you. :)
Post by Lari Hotari
btw. What Grails version are you testing with? What exact JVM version?
exact JVM params?
What Tomcat version? OS version?
Have you set Permgen size to something reasonable (for example:
-XX:PermSize=128M -XX:MaxPermSize=256M)?
Ah. Sorry, I should have included this: It's Cent OS (not sure of the exact
version, will get it tomorrow too, but I think it's 4.something)

My JAVA_OPTS are as follows:
-Xmx1024m -XX:MaxPermSize=256m -server -Djava.awt.headless=true
-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled

I'm using Grails 1.3.7, Java is 1.6.0 update 14 (1.6.0_14-b08), and it's
Tomcat 6.0.33
Post by Lari Hotari
Have you checked if your OS starts swapping? (Monitoring in Linux: "vmstat
1", make sure "swap si / so" are 0 during the benchmark).
in /etc/sysctl.conf
vm.swappiness=5
(changing temporarily/immediately: "echo 5 | sudo tee
/proc/sys/vm/swappiness")
Checking current value: cat /proc/sys/vm/swappiness
vm.swappiness defaults to 60 on many distros. It is not a hard threshold:
higher values make the kernel more willing to swap out application memory
in favour of the page cache. Since the JVM allocates a lot of memory, the
default will usually cause some swapping if you don't tune swappiness.
Great. Thank you very much for the pointer, I'll watch it tomorrow when I run
the benchmark again and see what comes up.
I want to avoid tuning the OS too much to get Grails to perform as well, or
better, because the tunings would have to be "fair". I'll try watching both
the PHP and Grails instances to see how they do with swapping during the
benchmark. If they're both doing it, then I can tune the swappiness on both
to improve both.
Post by Lari Hotari
--
In benchmarking Grails you usually have to first do about 20000 requests
and give the JVM a breath for about 10 seconds after that (I assume the JVM
delays background compilation when CPU usage is high and it does JIT
compilation during this sleep time).
After this, the throughput performance is usually about 5x better than in
the beginning.
That explains why I see really long requests right at the start, and not as
much later on. Indeed running the benchmark for longer resulted in more
stable results.

Thank you very much for all the advice. I have a lot to try tomorrow.

Cheers,
Eric
Lari Hotari
2011-10-28 07:25:05 UTC
Post by Eric Berry
No I did not, I was just using path: 'query/path'.
I can't try netstat atm as I'm away from the benchmarking environment,
but I will try this tomorrow and report back any findings. Do you
think adding the ID would help any, or would I still need to call
closeExpiredConnections()?
Adding an id makes it re-use a single HttpClient instance. Currently, with
the id missing, it creates a new HttpClient instance for each call.
Even after adding the id there are still things to tune for grails-rest /
HttpBuilder.

grails-rest creates a HttpBuilder/RestClient instance with no arguments:
http://svn.codehaus.org/grails-plugins/grails-rest/trunk/RestGrailsPlugin.groovy

HttpBuilder creates a DefaultHttpClient:
http://svn.codehaus.org/gmod/httpbuilder/trunk/src/main/java/groovyx/net/http/HTTPBuilder.java

Looking at the code in RestGrailsPlugin you should be able to add the
HttpBuilder initialization to the Service where you use grails-rest
withHttp closures.
(Adding initialization to Controller like this doesn't work since
Controllers are request-scoped.)

add id: 'someClientId' to withHttp closure params.

add this to your Service class where you use withHttp closures:

def someClientId = new CustomHttpBuilder()

private static class CustomHttpBuilder extends HttpBuilder {
    protected AbstractHttpClient createClient( HttpParams params ) {
        println "\n---------------- custom HttpClient gets created --------------------"
        def connManager = new MultiThreadedHttpConnectionManager()
        def connManagerParams = new HttpConnectionManagerParams()
        connManagerParams.maxTotalConnections = 50 // default is 20
        connManagerParams.defaultMaxConnectionsPerHost = 50 // default is 2
        connManager.params = connManagerParams
        new HttpClient(connManager)
    }
}

I didn't test the code, but this is the main idea how you get
HttpBuilder to use MultiThreadedHttpConnectionManager with custom settings.
Maybe there is also an easier way to tweak grails-rest/HttpBuilder to
use these kind of settings.

I don't use grails-rest myself and maybe someone using grails-rest /
HttpBuilder could give better advice. I've been using Spring3
RestTemplate as a Rest client
(http://blog.springsource.com/2009/03/27/rest-in-spring-3-resttemplate/)
in a Spring3 + Java (no groovy) project and it also has to be properly
configured for good performance (examples:
http://aruld.info/resttemplate-the-spring-way-of-accessing-restful-services/
, old one but shows how to configure HttpClient for RestTemplate).

Lari
Lari Hotari
2011-10-28 07:44:25 UTC
Post by Eric Berry
-Xmx1024m -XX:MaxPermSize=256m -server -Djava.awt.headless=true
-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled
I'm using Grails 1.3.7, Java is 1.6.0 update 14 (1.6.0_14-b08), and
it's Tomcat 6.0.33
Is it a 32-bit or 64-bit JVM? (I recommend 32-bit JVM even on 64-bit OS
for heap sizes <1.9GB).
Your JVM is very old. I've run into problems with old 1.6.0
versions. I can recommend 1.6.0 update 24 or update 27 since I've tested
Grails performance with those (we had problems with update 25, JVM
crashes). I guess latest 1.6.0 update 29 is ok too.

Lari

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email
Eric Berry
2011-10-28 19:17:12 UTC
Post by Lari Hotari
Post by Eric Berry
-Xmx1024m -XX:MaxPermSize=256m -server -Djava.awt.headless=true
-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled
I'm using Grails 1.3.7, Java is 1.6.0 update 14 (1.6.0_14-b08), and
it's Tomcat 6.0.33
Is it a 32-bit or 64-bit JVM? (I recommend 32-bit JVM even on 64-bit OS
for heap sizes <1.9GB).
Your JVM is very old. I've run into problems with old 1.6.0
versions. I can recommend 1.6.0 update 24 or update 27 since I've tested
Grails performance with those (we had problems with update 25, JVM
crashes). I guess latest 1.6.0 update 29 is ok too.
Lari
It's 64bit JVM, and the OS version is CentOS release 5.5 (Final).

Also, I tried to create the custom HTTPBuilder as you suggested, but it
looks like MultiThreadedHttpConnectionManager isn't on the classpath. The
HttpClient version is 4.0.1, and that class is no longer in there. There is
a ThreadSafeClientConnManager, but at that version it wasn't as easy to set
the pool settings.

I switched over to use Jersey Client since we are using it in other
applications. This seems to have solved the "Too many files open" errors. It
also seems to eliminate all of the connection errors I would get at the
higher thread count. Response times are still slow, but much better than
before, and they're not failing. Where I used to consistently get odd 45
second requests (at least once each run), the longest request time is now ~9
seconds.

Also, some of the monitoring stuff you asked for. So even with Jersey client
there seem to be a ton of TIME_WAIT entries from netstat. Not sure what I
can do about this though.
[quote] // The tomcat instance is running on port 6085
netstat -an|grep WAIT|grep 6085|wc -l
9403
[/quote]

I don't think wc is giving me exactly what I want, but probably a good idea.

Swapping doesn't seem to be an issue though, both si and so report 0
consistently throughout the benchmark.

Any ideas about the TIME_WAIT issue? I don't get "Too many files open"
errors anymore (after switching to Jersey Client), so I'm not sure it's
really an issue still.

Thanks again for all the help and pointers.

Eric
Jeff Scadden
2011-10-28 20:54:45 UTC
I believe we had a similar issue with httpBuilder. There is a method called shutdown that will clean things up.

Eric Berry
2011-11-01 08:05:40 UTC
Thanks for all the help folks. No matter what I did, I couldn't get the
performance close enough at higher conc. users to convince our developers
it was worth the change. I also wrote the benchmark in plain Spring
MVC/Java, so there's a chance we'll migrate to that instead, which would
put us in a good position for future interest in Grails as our front-end
framework, but doesn't look like it'll fit our needs for now.

I'll be blogging about the results from my benchmark in the next week or
two, but here's a quick summary:

Grails: tapped out at ~370-400 req/sec, running at ~80% CPU (400 conc.
users)
PHP: tapped out at ~500 req/sec, running at ~40% CPU (600 conc. users)
Java/Spring MVC: Got past 1600 req/sec, running at ~45% CPU (800 conc.
users)

Java and plain Spring MVC rocked it, and my benchmark box ran out of
sockets before I could find the breaking point, but I reckon it could have
gotten past 2000 req/sec before breaking.

I have clearance to publicize most of the code used:
https://github.com/townsfolk/Chegg-Grails-Benchmark

I can't include the PHP code, but the code above includes the Grails App,
Spring MVC app, mock user-api, and the Faban benchmark.

One more question for you though. I haven't been following Grails dev in
depth, but I know there's a 2.0 RC available, does this work with the
latest Groovy, and do you know if it makes use of any Java 7 benefits
(invoke dynamic?). Would it be worth my time to try Grails 2.0 RC with Java
7 in the benchmark?

Cheers, and many thanks again for all the help.
Eric
Peter Ledbrook
2011-11-01 10:42:12 UTC
Post by Eric Berry
One more question for you though. I haven't been following Grails dev in
depth, but I know there's a 2.0 RC available, does this work with the latest
Groovy, and do you know if it makes use of any Java 7 benefits (invoke
dynamic?). Would it be worth my time to try Grails 2.0 RC with Java 7 in the
benchmark?
Grails 2.0 uses Groovy 1.8. The invokeDynamic changes will be going
into Groovy 1.9, so you won't see any improvements from that side.

BTW, have you tried Spring Insight for profiling the application (both
Grails and Spring MVC versions)? It's trivially easy to set up from
STS:



Regards,

Peter
--
Peter Ledbrook
Grails Advocate
SpringSource - A Division of VMware

Eric Berry
2011-11-01 20:55:47 UTC
Hi Peter. Thanks for your input. I'm not familiar with Spring Insight, but
it looks very interesting. I was looking at the grails-melody plugin, but
Insight seems to have a lot more functionality. Is it something that we can
use outside of STS and TCServer? Eg. Is it something we can just plug in to
standard Tomcat?
Lari Hotari
2011-11-01 13:09:39 UTC
Thanks for opensourcing the benchmark. Great work Eric!

Two things came to mind from a quick look at the grails-app source:

Major issues:
* *uninstall hibernate plugin* since you don't need it. The side effect:
The OSIV interceptor will bind a Hibernate session and get a HSQL
connection from the datasource for each request. The pool size is 8
connections by default.
uninstalling:
grails uninstall-plugin hibernate
rm grails-app/conf/DataSource.groovy
* *also uninstall the rest and melody plugins*
(https://github.com/townsfolk/Chegg-Grails-Benchmark/blob/master/grails/application.properties)
*to be sure* that they don't cause any side-effects.
* *Set the session timeout in the Grails application to a low value (1
minute)*:
grails install-templates
Then add this to src/templates/war/web.xml after the servlet-mapping
element:
<session-config>
<!-- 1 minute timeout for benchmarking -->
<session-timeout>1</session-timeout>
</session-config>
Grails will create a new session for each request (since flash scope is
used) and that will fill up the Tomcat active session list.

-

A minor issue:
* In Grails 1.3.x, I remember that the command object binding is
relatively slow and un-optimized. Please try to replace CommandObject
binding in each controller method:
def save = { UserCommand userCommand ->
with:
def save = { ->
    UserCommand userCommand = new UserCommand()
    bindData(userCommand, params)
to see if it makes any difference.

(do these steps separately)

-

It is worth testing also with Grails 2.0 RC1 + Java 1.6.0_29 (uninstall
hibernate plugin if upgrade re-installs it).
Your app should be compatible with 2.0RC1 without any changes (just
"grails upgrade").


Regarding the TCP/IP timeout settings, I've been using these values in
/etc/sysctl.conf (actually /etc/sysctl.d/99-mysettings.conf in Ubuntu
server):
# Tune TCP/IP keepalive settings
# http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 20
(reboot to make sure they are used)


Lari
Eric Berry
2011-11-01 20:30:57 UTC
Lari, Thank you so much for taking a look at the source code.

Removing hibernate has made a HUGE increase in throughput. I reran the 400
conc. user benchmark against the same grails code with only hibernate,
rest, and grails-melody uninstalled and the throughput doubled.

I'm now getting 800+ req/sec. The box still runs pretty hot at ~70% CPU,
but the increase in req/sec is dramatic.

I'm going to run a 600 conc. user benchmark.

Cheers, and many thanks again!
Eric
7 in the benchmark?
Cheers, and many thanks again for all the help.
Eric
--
Learn from the past. Live in the present. Plan for the future.
Blog: http://eric-berry.blogspot.com
jEdit <http://www.jedit.org> - Programmer's Text Editor
Bazaar <http://bazaar.canonical.com> - Version Control for Humans
Eric Berry
2011-11-01 20:32:48 UTC
Permalink
Sorry, not = now. I'm NOW getting 800+ req/sec. :)
Lari Hotari
2011-11-02 06:15:47 UTC
Permalink
Thanks for your patience and doing the re-testing.

I've added a Jira issue about increasing the default connection pool size:
http://jira.grails.org/browse/GRAILS-8238

It would be interesting to get your benchmark results for Grails 2.0RC1.
If you have enough time, convert the controller closures to method
actions before testing:
http://grails.org/doc/2.0.0.M1/guide/introduction.html#webFeatures

I believe that one of the bottlenecks of your benchmark in 1.3.7 is the
command object binding. (I haven't had time to set up and run the
benchmark myself.)

-

Some information and tips I'd like to share about Grails profiling:

I've been using YJP (Yourkit Java Profiler,
http://www.yourkit.com/java/profiler/index.jsp) for profiling Grails.
You can get a 15-day evaluation license by registering (or just buy it)
if you don't own a license.
There used to be a free license available for Grails core and plugin
developers (for developing Codehaus opensource projects), but it doesn't
seem to be valid anymore.

I've got the most useful information from profiling by using sampling
profiling with "monitors" and "exception telemetry" enabled.
This is the Java command line option I use for profiling a 32-bit JVM
(change linux-x86-32 to linux-x86-64 on a 64-bit JVM):
-agentpath:/opt/yjp/bin/linux-x86-32/libyjpagent.so=delay=30000,disablealloc,disabletracing,disablej2ee,noj2ee,disablestacktelemetry,builtinprobes=none,sampling,monitors,onexit=snapshot
(http://www.yourkit.com/docs/90/help/additional_agent_options.jsp for
list of available command line options)

For only sampling profiling (minimum overhead), use:
-agentpath:/opt/yjp/bin/linux-x86-32/libyjpagent.so=delay=30000,disableall,sampling,onexit=snapshot

An easier way to profile "grails run-war" with YJP is to add these
lines to BuildConfig.groovy:

grails.tomcat.jvmArgs = ["-XX:+DisableExplicitGC", '-Xmx512M',
    '-Xms512M', '-XX:PermSize=192m', '-XX:MaxPermSize=192m']
if (System.getProperty('grails.yjp')) {
    grails.tomcat.jvmArgs +=
        ["-agentpath:/opt/yjp/bin/linux-x86-32/libyjpagent.so=delay=30000,disablealloc,disabletracing,disablej2ee,noj2ee,disablestacktelemetry,builtinprobes=none,sampling,monitors,onexit=snapshot"]
}

After this you can simply run:
grails -Dgrails.yjp=1 run-war

And grails-tomcat will be profiled with YJP.

===

Comments about object monitor / thread blocking seen in profiling Grails:

I usually look for CPU hotspots, blocking monitors/threads and created
exceptions when I profile Grails.

In Grails 1.3.x and 2.0.0RC1, object monitors are constantly blocking
because of this Groovy bug:
http://jira.codehaus.org/browse/GROOVY-3557
It's been fixed recently and might be available in Groovy 1.8.4.

After running Grails 2.0.0-SNAPSHOT with Groovy snapshot, the only
remaining blocker was the Hibernate query cache. There is a Hibernate
bug about that:
https://hibernate.onjira.com/browse/HHH-5927

Hibernate's legacy 2nd level cache interface (the cache provider API) is
also a source of monitor blocking. The Hibernate version that Grails 2.0
uses has a new interface called the RegionFactory API (the
hibernate.cache.region.factory_class setting). The
hibernate.cache.provider_class setting shouldn't be used any more with
newer Hibernate versions.
Grails 2.0 will convert to RegionFactory API on-the-fly if ehcache is used:
http://jira.grails.org/browse/GRAILS-8094

HHH-5927 and GROOVY-3557 are causing most of the object monitor / thread
blocking seen in profiling Grails 2.0RC1.

===

Comments about exceptions seen in profiling:

In the profiler, you should always make sure that the application
doesn't use exceptions for the "normal execution flow".

Exceptions are quite heavyweight, and filling in the stack trace causes
most of the overhead. If you have no other option than using exceptions
in normal execution flow, you could override the fillInStackTrace()
method in a custom exception class to minimize the overhead (public
Throwable fillInStackTrace() { return this; }).
The -XX:MaxJavaStackTraceDepth=100 JVM option is also a way to reduce
overhead if you cannot eliminate the exceptions and want better performance.
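
To make the fillInStackTrace() override above concrete, here is a minimal
sketch in plain Java. The class name ControlFlowException is made up for
this example and is not part of Grails, Groovy or Tomcat. Treat it as an
illustration of the technique rather than a recommended pattern.

```java
// Hypothetical exception type for illustration only; not a Grails/Groovy class.
class ControlFlowException extends RuntimeException {
    ControlFlowException(String message) {
        super(message);
    }

    // Skip the stack walk that normally happens when a Throwable is constructed.
    // The trade-off: this exception carries no stack trace at all.
    @Override
    public synchronized Throwable fillInStackTrace() {
        return this;
    }

    public static void main(String[] args) {
        try {
            throw new ControlFlowException("user not found");
        } catch (ControlFlowException e) {
            // With the override in place, no frames were recorded.
            System.out.println("frames recorded: " + e.getStackTrace().length);
        }
    }
}
```

Because the stack trace is empty, such an exception is hard to debug when
it escapes unexpectedly, so it only makes sense for exceptions that are
always caught close to where they are thrown.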

Groovy, Tomcat and Grails create a lot of exceptions in startup. That's
not a problem. You could reset the exception counters in YJP before
starting the load test to see if any new exceptions are thrown during
"normal execution flow".

===

Interpreting the Grails YJP profiling results is quite hard, but usually
you get the idea by experimenting and putting a lot of time into it. :)
Sampling CPU profiling is better than tracing CPU profiling since the
overhead is minimal.
As I understand it, CPU sampling profiling takes stack traces of
each thread every x milliseconds (where x = telemetryperiod in YJP,
1000ms by default) and does statistical analysis based on that
information. Execution times are calculated by a statistical
algorithm, so they are never exact. That's not a problem
as long as you keep it in mind when you interpret the results.

btw. A poor man's sampling profiler is to do a thread dump with "kill -3
[pid_of_java]" (SIGQUIT=3 triggers a thread dump in the JVM) several times in a
row and analyze that information by viewing the dumps. I usually do
this when there is a sudden performance problem in production. It
usually shows the correct performance hotspot. :)
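
The idea shared by sampling profilers and repeated thread dumps (take
periodic stack snapshots, then count what shows up most often) can be
sketched in-process with the standard Thread.getAllStackTraces() call.
This is only a rough illustration, not how YJP works internally. The
class name StackSampler and the sample counts and intervals are
assumptions for the example, and it assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: tallies the top stack frame of every other live thread.
class StackSampler {
    // Takes `samples` snapshots, `intervalMillis` apart, and counts top frames.
    static Map<String, Integer> sample(int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            for (Map.Entry<Thread, StackTraceElement[]> e
                    : Thread.getAllStackTraces().entrySet()) {
                StackTraceElement[] stack = e.getValue();
                // Ignore the sampling thread itself and threads with no Java frames.
                if (e.getKey() == Thread.currentThread() || stack.length == 0) {
                    continue;
                }
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // A deliberately busy thread so that something shows up in the tally.
        Thread busy = new Thread(() -> {
            long x = 0;
            while (!Thread.currentThread().isInterrupted()) {
                x += System.nanoTime() % 7;
            }
        }, "busy-worker");
        busy.setDaemon(true);
        busy.start();

        Map<String, Integer> hits = sample(20, 50);
        busy.interrupt();

        // Print the five most frequently seen top frames.
        hits.entrySet().stream()
            .sorted((a, b) -> b.getValue() - a.getValue())
            .limit(5)
            .forEach(en -> System.out.println(en.getValue() + "  " + en.getKey()));
    }
}
```

As with any sampling approach, the counts are statistical: a frame seen
in most snapshots is a likely hotspot (or a thread that is simply parked
or waiting), but short-lived calls can be missed entirely.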

I hope this information helps someone interested in profiling Grails.


Lari
Burt Beckwith
2011-11-02 06:47:54 UTC
Permalink
One small note - you need to specify the environment with run-war. "grails war" defaults to prod, but run-war doesn't - it runs in dev by default. So you probably want "grails prod run-war", "grails -Dgrails.yjp=1 prod run-war", etc.

Burt