Looking for a solution to extract data from HTML
Thanks
Rob
You can use Python’s built-in HTMLParser module, but if you’re not handy with programming it probably won’t be an easy task. See here for documentation on how it works.
The following is a quick example that you can paste into a button's actionPerformed script to see it in action:
from HTMLParser import HTMLParser

class MyHTMLParser(HTMLParser):
    # Called once for every opening tag, e.g. <div>
    def handle_starttag(self, tag, attrs):
        print "Encountered the beginning of a %s tag" % tag

    # Called once for every closing tag, e.g. </div>
    def handle_endtag(self, tag):
        print "Encountered the end of a %s tag" % tag

parser = MyHTMLParser()
parser.feed("<html><body><div>hi</div></body></html>")
The following will be printed to the console:
Encountered the beginning of a html tag
Encountered the beginning of a body tag
Encountered the beginning of a div tag
Encountered the end of a div tag
Encountered the end of a body tag
Encountered the end of a html tag
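The example above only reports tags. To actually pull data out, you would also override handle_data, which the parser calls with the text between tags. Here is a rough sketch of that idea. The TextExtractor class and extract_text function are names I made up for illustration, not part of the HTMLParser module, and the try/except import is only there so the same script also runs on Python 3.

```python
# Works on Python 2 / Jython (HTMLParser) and on Python 3 (html.parser).
try:
    from HTMLParser import HTMLParser
except ImportError:
    from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects the text found inside one kind of tag."""

    def __init__(self, target_tag):
        HTMLParser.__init__(self)
        self.target_tag = target_tag
        self.depth = 0      # number of target tags we are currently inside
        self.results = []   # text fragments collected so far

    def handle_starttag(self, tag, attrs):
        if tag == self.target_tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.target_tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that appears inside the target tag.
        text = data.strip()
        if self.depth > 0 and text:
            self.results.append(text)


def extract_text(html, tag):
    """Return a list of the text fragments found inside <tag> elements."""
    parser = TextExtractor(tag)
    parser.feed(html)
    parser.close()
    return parser.results


sample = "<html><body><div>hi</div><p>skip</p><div>there</div></body></html>"
cells = extract_text(sample, "div")
print(cells)
```

With the sample string above, cells should end up holding the two div texts, "hi" and "there", while the paragraph text is skipped. Real pages are messier than this, so treat it as a starting point and not a finished scraper.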