mjfan80
2011-04-07 12:55:02 UTC
In my project I need to parse an HTML file (well-formed, as XHTML).
The file is not big: some styles, a table with many td cells, and
other stuff. It is 8 KB in total.
The same file is parsed by the PDF plugin (so Flying Saucer) to make a PDF,
and that is really quick (less than one second, I think).
But if the same file is parsed by XmlSlurper it takes 80 seconds... yes,
80 seconds.
I tried XmlSlurper, XmlParser and also the Java XMLStreamReader, and each
takes between 70 and 80 seconds.
I don't know why it is so slow.
The HTML file is stored locally on the server (so no download time).
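One way to narrow this down is to time the parse on its own, separately from the search. The sketch below is in plain Java using the JDK's DOM parser, not the Groovy classes above. It rests on an assumption that nothing in this post confirms: that the file carries an XHTML DOCTYPE and that the parser spends its time fetching the referenced DTD over the network. The class name `ParseTiming`, the method name and the sample document are made up for illustration. The feature URI is the Xerces `load-external-dtd` feature, which the JDK's built-in parser recognizes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ParseTiming {
    // Xerces feature: when false, a non-validating parser skips the external DTD.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    /** Parses the given XHTML text without loading the DTD named in its DOCTYPE. */
    public static Document parseWithoutExternalDtd(String xhtml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(LOAD_EXTERNAL_DTD, false);
            byte[] bytes = xhtml.getBytes(StandardCharsets.UTF_8);
            return factory.newDocumentBuilder().parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the document", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample: a tiny XHTML document with a remote DTD reference.
        String sample = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
                + "<html><body><table class=\"report\"/></body></html>";
        long start = System.nanoTime();
        Document doc = parseWithoutExternalDtd(sample);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("root=" + doc.getDocumentElement().getTagName()
                + " parsed in " + elapsedMs + " ms");
    }
}
```

If the parse becomes fast with the feature switched off, the DTD lookup is the likely cost. If it stays slow, the time is going somewhere else and this assumption can be dropped.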
This is what I do to find, in the HTML file, a table
with the class set to "report" (I will then do something with this
table):
import groovy.util.slurpersupport.GPathResult

// XmlSlurper (not XmlParser) yields the GPathResult the method below expects
def doc = new XmlSlurper().parse(urlFile)
def body = doc.'body'
def report = trovaTableReport(body)

// Recursive search for a <table> whose class attribute contains "report"
public GPathResult trovaTableReport(GPathResult nodo) {
    if (nodo == null) return null
    // check the <table> children of this node first
    def report = nodo.'table'.find { it.@class.text().contains("report") }
    if (report != null && !report.isEmpty()) return report
    // otherwise descend into every child; the last match found wins
    nodo.children().each { figlio ->
        def reportInterno = trovaTableReport(figlio)
        if (reportInterno != null && !reportInterno.isEmpty()) report = reportInterno
    }
    return (report != null && !report.isEmpty()) ? report : null
}
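For comparison, the same lookup can be written as a single XPath query instead of a hand-written recursion. This is only a sketch in plain Java with the JDK's DOM and XPath APIs, not the Groovy code above. The class name `ReportTableFinder` and the sample markup are hypothetical, and the sample leaves out any DOCTYPE or namespace declaration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class ReportTableFinder {
    /** Returns the first table whose class attribute contains "report", or null. */
    public static Element findReportTable(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            // "//table" searches the whole tree; the predicate filters on the class value
            return (Element) XPathFactory.newInstance().newXPath()
                    .evaluate("//table[contains(@class, 'report')]", doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse or search the document", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample with the target table nested below the body
        String sample = "<html><body><div><table class=\"layout\"/>"
                + "<table id=\"r\" class=\"big report\"/></div></body></html>";
        Element table = findReportTable(sample);
        System.out.println(table == null ? "no report table" : "found id=" + table.getAttribute("id"));
    }
}
```

Unlike the recursive method, which keeps the last match found among the children, the XPath query returns the first matching table in document order.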
Can someone tell me why it takes so long to parse a simple HTML file?
--
View this message in context: http://grails.1312388.n4.nabble.com/XMLSlurper-really-slow-reading-parsing-html-xml-file-tp3433305p3433305.html
Sent from the Grails - user mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe from this list, please visit:
http://xircles.codehaus.org/manage_email