org.htmlcleaner
Class TagNode

java.lang.Object
  extended by org.htmlcleaner.TagToken
      extended by org.htmlcleaner.TagNode
All Implemented Interfaces:
BaseToken, HtmlNode

public class TagNode
extends TagToken
implements HtmlNode

XML node tag: the basic node of the cleaned HTML tree. It also represents a start tag token after the HTML parsing phase and before the cleaning phase. After cleaning, the resulting tree contains tag nodes (TagNode), text content (ContentNode), comments (CommentNode) and, optionally, a doctype node (DoctypeToken).
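
As a rough illustration of the node kinds listed above, the sketch below cleans a small fragment and counts the children of a p element by type. It assumes the HtmlCleaner library is on the classpath and that comments are kept during cleaning; the class name NodeKindsExample, the method countKinds and the sample markup are invented for this example.

```java
import org.htmlcleaner.CommentNode;
import org.htmlcleaner.ContentNode;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;

public class NodeKindsExample {

    // Cleans the markup and counts the direct children of the first <p> by node kind.
    static String countKinds(String html) {
        TagNode root = new HtmlCleaner().clean(html);
        TagNode p = root.findElementByName("p", true);
        int tags = 0, texts = 0, comments = 0;
        for (Object child : p.getChildren()) {
            if (child instanceof TagNode) {
                tags++;
            } else if (child instanceof ContentNode) {
                texts++;
            } else if (child instanceof CommentNode) {
                comments++;
            }
        }
        return "tags=" + tags + " text=" + texts + " comments=" + comments;
    }

    public static void main(String[] args) {
        // Hypothetical input: one text run, one comment, one nested tag.
        System.out.println(countKinds("<p>Hello<!-- note --><b>bold</b></p>"));
    }
}
```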


Nested Class Summary
static interface TagNode.ITagNodeCondition
          Used as base for different node checkers.
 class TagNode.TagAllCondition
          All nodes.
 class TagNode.TagNodeAttExistsCondition
          Checks if node contains specified attribute.
 class TagNode.TagNodeAttValueCondition
          Checks if node has specified attribute with specified value.
 class TagNode.TagNodeNameCondition
          Checks if node has specified name.
 
Field Summary
 
Fields inherited from class org.htmlcleaner.TagToken
name
 
Constructor Summary
TagNode(String name)
           
 
Method Summary
 void addAttribute(String attName, String attValue)
          Deprecated. Use setAttribute instead. Adds the specified attribute to this tag or overrides an existing one.
 void addChild(Object child)
           
 void addChildren(List newChildren)
          Add all elements from specified list to this node.
 void addNamespaceDeclaration(String nsPrefix, String nsURI)
          Adds namespace declaration to the node
 Object[] evaluateXPath(String xPathExpression)
          Evaluates an XPath expression on the given node.
 TagNode findElementByAttValue(String attName, String attValue, boolean isRecursive, boolean isCaseSensitive)
           
 TagNode findElementByName(String findName, boolean isRecursive)
           
 TagNode findElementHavingAttribute(String attName, boolean isRecursive)
           
 TagNode[] getAllElements(boolean isRecursive)
           
 List getAllElementsList(boolean isRecursive)
           
 String getAttributeByName(String attName)
           
 Map<String,String> getAttributes()
           
 int getChildIndex(HtmlNode child)
           
 List getChildren()
           
 List getChildTagList()
           
 TagNode[] getChildTags()
           
 DoctypeToken getDocType()
           
 List getElementListByAttValue(String attName, String attValue, boolean isRecursive, boolean isCaseSensitive)
           
 List getElementListByName(String findName, boolean isRecursive)
           
 List getElementListHavingAttribute(String attName, boolean isRecursive)
           
 TagNode[] getElementsByAttValue(String attName, String attValue, boolean isRecursive, boolean isCaseSensitive)
           
 TagNode[] getElementsByName(String findName, boolean isRecursive)
           
 TagNode[] getElementsHavingAttribute(String attName, boolean isRecursive)
           
 Map<String,String> getNamespaceDeclarations()
           
 TagNode getParent()
           
 StringBuffer getText()
           
 boolean hasAttribute(String attName)
          Checks existence of the specified attribute.
 boolean hasChildren()
           
 void insertChild(int index, HtmlNode childToAdd)
          Inserts specified node at specified position in array of children
 void insertChildAfter(HtmlNode node, HtmlNode nodeToInsert)
          Inserts specified node in the list of children after specified child
 void insertChildBefore(HtmlNode node, HtmlNode nodeToInsert)
          Inserts specified node in the list of children before specified child
 void removeAllChildren()
          Removes all children (subelements and text content).
 void removeAttribute(String attName)
          Removes specified attribute from this tag.
 boolean removeChild(Object child)
          Remove specified child element from this node.
 boolean removeFromTree()
          Remove this node from the tree.
 void replaceChild(HtmlNode childToReplace, HtmlNode replacement)
          Replaces specified child node with specified replacement node.
 void serialize(Serializer serializer, Writer writer)
           
 void setAttribute(String attName, String attValue)
          Adds a new attribute or overrides an existing one.
 void setDocType(DoctypeToken docType)
           
 boolean setName(String name)
          Changes name of the tag
 void traverse(TagNodeVisitor visitor)
          Traverses the tree and performs visitor's action on each node.
 
Methods inherited from class org.htmlcleaner.TagToken
getName, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

TagNode

public TagNode(String name)
Method Detail

setName

public boolean setName(String name)
Changes name of the tag

Parameters:
name -
Returns:
True if new name is valid, false otherwise

getAttributeByName

public String getAttributeByName(String attName)
Parameters:
attName - Name of the attribute
Returns:
Value of the specified attribute, or null if this tag doesn't contain it.

getAttributes

public Map<String,String> getAttributes()
Returns:
Map instance containing all attribute name/value pairs.

hasAttribute

public boolean hasAttribute(String attName)
Checks existence of the specified attribute.

Parameters:
attName - Name of the attribute to look for

addAttribute

@Deprecated
public void addAttribute(String attName,
                                    String attValue)
Deprecated. Use setAttribute instead. Adds the specified attribute to this tag or overrides an existing one.

Parameters:
attName -
attValue -

setAttribute

public void setAttribute(String attName,
                         String attValue)
Adds a new attribute or overrides an existing one.

Parameters:
attName -
attValue -

addNamespaceDeclaration

public void addNamespaceDeclaration(String nsPrefix,
                                    String nsURI)
Adds namespace declaration to the node

Parameters:
nsPrefix - Namespace prefix
nsURI - Namespace URI

getNamespaceDeclarations

public Map<String,String> getNamespaceDeclarations()
Returns:
Map of namespace declarations for this node

removeAttribute

public void removeAttribute(String attName)
Removes specified attribute from this tag.

Parameters:
attName -
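
The attribute methods above can be exercised on a node built by hand. The following is a minimal sketch, assuming the HtmlCleaner library is on the classpath; the class name AttributeExample, the method describeLink and the attribute values are made up for illustration.

```java
import org.htmlcleaner.TagNode;

public class AttributeExample {

    // Builds an <a> node, edits its attributes, and reports the final state.
    static String describeLink() {
        TagNode link = new TagNode("a");
        link.setAttribute("href", "/home");
        link.setAttribute("class", "nav");
        link.setAttribute("href", "/index");   // overrides the earlier href value
        link.removeAttribute("class");         // class is no longer present

        return "href=" + link.getAttributeByName("href")
                + " hasClass=" + link.hasAttribute("class")
                + " title=" + link.getAttributeByName("title"); // never set, so null
    }

    public static void main(String[] args) {
        System.out.println(describeLink());
    }
}
```

Because getAttributeByName returns null for a missing attribute, callers that need to tell "absent" from "empty" may prefer to check hasAttribute first.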

getChildren

public List getChildren()
Returns:
List of child objects. During the cleanup process the list may hold different kinds of children; after cleaning, it should contain only TagNode instances.

hasChildren

public boolean hasChildren()
Returns:
Whether this node has child elements or not.

getChildTagList

public List getChildTagList()

getChildTags

public TagNode[] getChildTags()
Returns:
An array of child TagNode instances.

getText

public StringBuffer getText()
Returns:
Text content of this node and its subelements.

getParent

public TagNode getParent()
Returns:
Parent of this node, or null if this is the root node.

getDocType

public DoctypeToken getDocType()

setDocType

public void setDocType(DoctypeToken docType)

addChild

public void addChild(Object child)

addChildren

public void addChildren(List newChildren)
Add all elements from specified list to this node.

Parameters:
newChildren -

getAllElementsList

public List getAllElementsList(boolean isRecursive)

getAllElements

public TagNode[] getAllElements(boolean isRecursive)

findElementByName

public TagNode findElementByName(String findName,
                                 boolean isRecursive)

getElementListByName

public List getElementListByName(String findName,
                                 boolean isRecursive)

getElementsByName

public TagNode[] getElementsByName(String findName,
                                   boolean isRecursive)

findElementHavingAttribute

public TagNode findElementHavingAttribute(String attName,
                                          boolean isRecursive)

getElementListHavingAttribute

public List getElementListHavingAttribute(String attName,
                                          boolean isRecursive)

getElementsHavingAttribute

public TagNode[] getElementsHavingAttribute(String attName,
                                            boolean isRecursive)

findElementByAttValue

public TagNode findElementByAttValue(String attName,
                                     String attValue,
                                     boolean isRecursive,
                                     boolean isCaseSensitive)

getElementListByAttValue

public List getElementListByAttValue(String attName,
                                     String attValue,
                                     boolean isRecursive,
                                     boolean isCaseSensitive)

getElementsByAttValue

public TagNode[] getElementsByAttValue(String attName,
                                       String attValue,
                                       boolean isRecursive,
                                       boolean isCaseSensitive)
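
The find/get methods above carry no descriptions, so the sketch below shows how they appear to behave on a cleaned document: isRecursive selects between searching only direct children and searching all descendants. It assumes the HtmlCleaner library is on the classpath and that clean(String) returns the html root node; SearchExample, its search method and the sample markup are hypothetical.

```java
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;

public class SearchExample {

    // Hypothetical sample document: two links, one of them nested in a <p>.
    static final String HTML =
            "<div><a href=\"/a\" class=\"nav\">A</a>"
            + "<p><a href=\"/b\">B</a></p>"
            + "<span id=\"x\">S</span></div>";

    static String search(String html) {
        TagNode root = new HtmlCleaner().clean(html);

        // Recursive search looks at all descendants of the root.
        int deep = root.getElementsByName("a", true).length;
        // Non-recursive search looks at direct children of the root only.
        int shallow = root.getElementsByName("a", false).length;

        // First element whose class attribute equals "nav" (case-sensitive match).
        TagNode nav = root.findElementByAttValue("class", "nav", true, true);
        // All elements that carry an id attribute, whatever its value.
        int withId = root.getElementsHavingAttribute("id", true).length;

        return deep + " " + shallow + " " + nav.getAttributeByName("href") + " " + withId;
    }

    public static void main(String[] args) {
        System.out.println(search(HTML));
    }
}
```

The getElementList... variants take the same arguments and return a List instead of an array.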

evaluateXPath

public Object[] evaluateXPath(String xPathExpression)
                       throws XPatherException
Evaluates an XPath expression on the given node.
This is not a fully featured XPath parser and evaluator; the examples below show the supported elements:
  • //div//a
  • //div//a[@id][@class]
  • /body/*[1]/@type
  • //div[3]//a[@id][@href='r/n4']
  • //div[last() >= 4]//./div[position() = last()])[position() > 22]//li[2]//a
  • //div[2]/@*[2]
  • data(//div//a[@id][@class])
  • //p/last()
  • //body//div[3][@class]//span[12.2
  • data(//a['v' < @id])

Parameters:
xPathExpression - XPath expression to evaluate
Returns:
Array of objects produced by evaluating the expression
Throws:
XPatherException
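
A minimal sketch of calling evaluateXPath with an expression in the style of the supported examples above. It assumes the HtmlCleaner library is on the classpath and that element matches come back as TagNode instances in the Object[] result; XPathExample, hrefsOfClassedLinks and the sample markup are invented for illustration.

```java
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;
import org.htmlcleaner.XPatherException;

public class XPathExample {

    // Returns the href values of all <a> elements under a <div> that have a class attribute.
    static String hrefsOfClassedLinks(String html) {
        TagNode root = new HtmlCleaner().clean(html);
        StringBuilder out = new StringBuilder();
        try {
            Object[] hits = root.evaluateXPath("//div//a[@class]");
            for (Object hit : hits) {
                // Results are untyped; only TagNode matches are used here.
                if (hit instanceof TagNode) {
                    if (out.length() > 0) {
                        out.append(',');
                    }
                    out.append(((TagNode) hit).getAttributeByName("href"));
                }
            }
        } catch (XPatherException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalStateException("XPath evaluation failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(hrefsOfClassedLinks(
                "<div><a href=\"/a\" class=\"nav\">A</a><a href=\"/b\">B</a></div>"));
    }
}
```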

removeFromTree

public boolean removeFromTree()
Remove this node from the tree.

Returns:
True if element is removed (if it is not root node).

removeChild

public boolean removeChild(Object child)
Remove specified child element from this node.

Parameters:
child -
Returns:
True if child object existed in the children list.

removeAllChildren

public void removeAllChildren()
Removes all children (subelements and text content).


replaceChild

public void replaceChild(HtmlNode childToReplace,
                         HtmlNode replacement)
Replaces specified child node with specified replacement node.

Parameters:
childToReplace - Child node to be replaced
replacement - Replacement node

getChildIndex

public int getChildIndex(HtmlNode child)
Parameters:
child - Child to find index of
Returns:
Index of the specified child node within this node's children, or -1 if the node is not a child of this node

insertChild

public void insertChild(int index,
                        HtmlNode childToAdd)
Inserts specified node at specified position in array of children

Parameters:
index -
childToAdd -

insertChildBefore

public void insertChildBefore(HtmlNode node,
                              HtmlNode nodeToInsert)
Inserts specified node in the list of children before specified child

Parameters:
node - Child before which to insert new node
nodeToInsert - Node to be inserted at specified position

insertChildAfter

public void insertChildAfter(HtmlNode node,
                             HtmlNode nodeToInsert)
Inserts specified node in the list of children after specified child

Parameters:
node - Child after which to insert new node
nodeToInsert - Node to be inserted at specified position
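
The child-editing methods above can be combined as in the following sketch, which builds a small list by hand rather than parsing markup. It assumes the HtmlCleaner library is on the classpath and that ContentNode has a public String constructor; ChildEditExample and its helper item are hypothetical names.

```java
import org.htmlcleaner.ContentNode;
import org.htmlcleaner.TagNode;

public class ChildEditExample {

    // Hypothetical helper: an <li> node holding a single text child.
    static TagNode item(String text) {
        TagNode li = new TagNode("li");
        li.addChild(new ContentNode(text));
        return li;
    }

    static String buildList() {
        TagNode ul = new TagNode("ul");
        TagNode a = item("a");
        TagNode c = item("c");
        ul.addChild(a);
        ul.addChild(c);

        ul.insertChildAfter(a, item("b"));   // children: a, b, c
        ul.insertChildBefore(a, item("0"));  // children: 0, a, b, c
        ul.removeChild(c);                   // children: 0, a, b

        // getText() concatenates the text of this node and its subelements.
        return ul.getText().toString() + " index(a)=" + ul.getChildIndex(a);
    }

    public static void main(String[] args) {
        System.out.println(buildList());
    }
}
```

Both insert methods locate the reference child first, so passing a node that is not a child of the target appears to leave the children unchanged.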

traverse

public void traverse(TagNodeVisitor visitor)
Traverses the tree and performs the visitor's action on each node. Traversal stops when the whole tree has been visited or when the visitor returns false.

Parameters:
visitor - TagNodeVisitor implementation
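
The early-stop behaviour described above can be sketched with an anonymous visitor. This assumes the HtmlCleaner library is on the classpath and that TagNodeVisitor declares a single method visit(TagNode parentNode, HtmlNode htmlNode) returning boolean; TraverseExample and visitedUntil are made-up names, and the tree is built by hand.

```java
import java.util.ArrayList;
import java.util.List;

import org.htmlcleaner.HtmlNode;
import org.htmlcleaner.TagNode;
import org.htmlcleaner.TagNodeVisitor;

public class TraverseExample {

    // Records tag names in visiting order and stops once stopTag has been seen.
    static List<String> visitedUntil(final String stopTag) {
        TagNode div = new TagNode("div");
        div.addChild(new TagNode("p"));
        div.addChild(new TagNode("hr"));
        div.addChild(new TagNode("span"));

        final List<String> visited = new ArrayList<String>();
        div.traverse(new TagNodeVisitor() {
            public boolean visit(TagNode parentNode, HtmlNode htmlNode) {
                if (htmlNode instanceof TagNode) {
                    String name = ((TagNode) htmlNode).getName();
                    visited.add(name);
                    // Returning false ends the traversal.
                    return !name.equals(stopTag);
                }
                return true; // keep going for text and comment nodes
            }
        });
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(visitedUntil("hr"));
    }
}
```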

serialize

public void serialize(Serializer serializer,
                      Writer writer)
               throws IOException
Specified by:
serialize in interface BaseToken
Throws:
IOException


Copyright © 2006-2011. All Rights Reserved.