org.htmlcleaner
Class HtmlTokenizer
java.lang.Object
org.htmlcleaner.HtmlTokenizer
public abstract class HtmlTokenizer
extends Object
Main HTML tokenizer.
Its task is to parse HTML and produce a list of valid tokens:
open tag tokens, end tag tokens, content (text) tokens and comment tokens.
As soon as a new item is added to the token list, the cleaner is invoked
to clean the current list at the end.
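To make the four token kinds concrete, here is a minimal, self-contained sketch of a tokenizer that splits a string into open tag, end tag, text and comment tokens. It is not the HtmlCleaner implementation: the names TokenizerSketch, Kind, Token and tokenize are hypothetical, attributes are kept as raw text inside the tag value, and doctype handling, malformed-markup recovery and the cleaner callback described above are all left out.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; all names here are hypothetical and
// do not come from the org.htmlcleaner package.
final class TokenizerSketch {

    enum Kind { OPEN_TAG, END_TAG, TEXT, COMMENT }

    static final class Token {
        final Kind kind;
        final String value;

        Token(Kind kind, String value) {
            this.kind = kind;
            this.value = value;
        }

        @Override
        public String toString() {
            return kind + "(" + value + ")";
        }
    }

    // Scans the input left to right and appends one token per
    // comment, tag, or run of text.
    static List<Token> tokenize(String html) {
        List<Token> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < html.length()) {
            if (html.startsWith("<!--", pos)) {
                // Comment: everything up to "-->", or to end of input if unterminated.
                int end = html.indexOf("-->", pos + 4);
                int stop = (end < 0) ? html.length() : end;
                tokens.add(new Token(Kind.COMMENT, html.substring(pos + 4, stop)));
                pos = (end < 0) ? html.length() : end + 3;
            } else if (html.charAt(pos) == '<' && html.indexOf('>', pos) > 0) {
                // Tag: "</name>" is an end tag, anything else an open tag.
                int end = html.indexOf('>', pos);
                boolean closing = html.startsWith("</", pos);
                String body = html.substring(pos + (closing ? 2 : 1), end).trim();
                tokens.add(new Token(closing ? Kind.END_TAG : Kind.OPEN_TAG, body));
                pos = end + 1;
            } else {
                // Text: everything up to the next '<' (a lone '<' with no '>' lands here too).
                int next = html.indexOf('<', pos + 1);
                int stop = (next < 0) ? html.length() : next;
                tokens.add(new Token(Kind.TEXT, html.substring(pos, stop)));
                pos = stop;
            }
        }
        return tokens;
    }
}
```

Calling TokenizerSketch.tokenize("<p>Hi<!-- note --></p>") yields four tokens in order: an open tag, a text token, a comment and an end tag. A real tokenizer would hand each new token to the cleaner at the point where this sketch calls tokens.add.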
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
HtmlTokenizer
public HtmlTokenizer(Reader reader,
CleanerProperties props,
CleanerTransformations transformations,
ITagInfoProvider tagInfoProvider)
throws IOException
- Constructor - creates an instance of the parser with the specified content.
- Parameters:
reader
props
transformations
tagInfoProvider
- Throws:
IOException
getDocType
public DoctypeToken getDocType()
Copyright © 2006-2011. All Rights Reserved.