The DOMNode class

(PHP 5, PHP 7)

Class synopsis

DOMNode {

/* Properties */

public readonly string $nodeName ;

public string $nodeValue ;

public readonly int $nodeType ;

public readonly DOMNode $parentNode ;

public readonly DOMNodeList $childNodes ;

public readonly DOMNode $firstChild ;

public readonly DOMNode $lastChild ;

public readonly DOMNode $previousSibling ;

public readonly DOMNode $nextSibling ;

public readonly DOMNamedNodeMap $attributes ;

public readonly DOMDocument $ownerDocument ;

public readonly string $namespaceURI ;

public string $prefix ;

public readonly string $localName ;

public readonly string $baseURI ;

public string $textContent ;

/* Methods */

public appendChild ( DOMNode $newnode ) : DOMNode

public C14N ([ bool $exclusive [, bool $with_comments [, array $xpath [, array $ns_prefixes ]]]] ) : string

public C14NFile ( string $uri [, bool $exclusive = FALSE [, bool $with_comments = FALSE [, array $xpath [, array $ns_prefixes ]]]] ) : int

public cloneNode ([ bool $deep ] ) : DOMNode

public getLineNo ( void ) : int

public getNodePath ( void ) : string

public hasAttributes ( void ) : bool

public hasChildNodes ( void ) : bool

public insertBefore ( DOMNode $newnode [, DOMNode $refnode ] ) : DOMNode

public isDefaultNamespace ( string $namespaceURI ) : bool

public isSameNode ( DOMNode $node ) : bool

public isSupported ( string $feature , string $version ) : bool

public lookupNamespaceUri ( string $prefix ) : string

public lookupPrefix ( string $namespaceURI ) : string

public normalize ( void ) : void

public removeChild ( DOMNode $oldnode ) : DOMNode

public replaceChild ( DOMNode $newnode , DOMNode $oldnode ) : DOMNode

}

Properties

nodeName: Returns the most accurate name for the current node type
nodeValue: The value of this node, depending on its type. Contrary to the W3C specification, the node value of DOMElement nodes is equal to DOMNode::textContent instead of NULL.
nodeType: Gets the type of the node, which is one of the predefined XML_xxx_NODE constants.
parentNode: The parent of this node. If there is no such node, this returns NULL.
childNodes: A DOMNodeList that contains all children of this node. If there are no children, this is an empty DOMNodeList.
firstChild: The first child of this node. If there is no such node, this returns NULL.
lastChild: The last child of this node. If there is no such node, this returns NULL.
previousSibling: The node immediately preceding this node. If there is no such node, this returns NULL.
nextSibling: The node immediately following this node. If there is no such node, this returns NULL.
attributes: A DOMNamedNodeMap containing the attributes of this node (if it is a DOMElement) or NULL otherwise.
ownerDocument: The DOMDocument object associated with this node, or NULL if this node is a DOMDocument.
namespaceURI: The namespace URI of this node, or NULL if it is unspecified.
prefix: The namespace prefix of this node, or NULL if it is unspecified.
localName: Returns the local part of the qualified name of this node.
baseURI: The absolute base URI of this node or NULL if the implementation wasn't able to obtain an absolute URI.
textContent: The text content of this node and its descendants.
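
The properties above can be explored with a short sketch like the following. The sample XML and the variable names are illustrative assumptions, not part of the manual; the expected values in the comments follow from the property descriptions above.

```php
<?php
// Minimal sketch: the sample document and variable names are made up for illustration.
$doc = new DOMDocument();
$doc->loadXML('<book id="b1"><title>DOM</title><year>2005</year></book>');

$book = $doc->documentElement;

echo $book->nodeName, "\n";                                  // book
echo $book->nodeType === XML_ELEMENT_NODE ? "element\n" : "other\n"; // element
echo $book->childNodes->length, "\n";                        // 2
echo $book->firstChild->nodeName, "\n";                      // title
echo $book->lastChild->nodeName, "\n";                       // year
echo $book->firstChild->nextSibling->nodeName, "\n";         // year
echo $book->attributes->getNamedItem('id')->nodeValue, "\n"; // b1

// For element nodes, nodeValue equals textContent (the concatenated descendant text).
echo $book->nodeValue, "\n";                                 // DOM2005

// The document element's parent is the document itself.
var_dump($book->parentNode->isSameNode($doc));               // bool(true)

// A DOMDocument has no owner document.
var_dump($doc->ownerDocument);                               // NULL
```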

Notes

Note:
The DOM extension uses UTF-8 encoding. Use utf8_encode() and utf8_decode() to work with texts in ISO-8859-1 encoding or Iconv for other encodings.
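
As a rough illustration of the note above, the sketch below converts ISO-8859-1 text to UTF-8 with iconv() before handing it to the DOM extension, and converts it back on the way out. The sample string and the element name are assumptions chosen for the example, and the iconv extension must be available.

```php
<?php
// Assumption for illustration: $latin1 holds ISO-8859-1 bytes (0xE9 is "é" in that encoding).
$latin1 = "caf\xE9";

// Convert to UTF-8 before passing the text to the DOM extension.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

$doc = new DOMDocument('1.0', 'UTF-8');
$node = $doc->appendChild($doc->createElement('word'));
$node->appendChild($doc->createTextNode($utf8));

echo $doc->saveXML($node), "\n";

// Convert back when a legacy consumer expects ISO-8859-1.
$back = iconv('UTF-8', 'ISO-8859-1', $node->textContent);
var_dump($back === $latin1); // bool(true)
```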

Changelog

Version	Description
5.6.1	The textContent property has been made writable (formerly it was read-only).

See Also

» W3C specification of Node

DOMNode::appendChild — Adds new child at the end of the children
DOMNode::C14N — Canonicalize nodes to a string
DOMNode::C14NFile — Canonicalize nodes to a file
DOMNode::cloneNode — Clones a node
DOMNode::getLineNo — Get line number for a node
DOMNode::getNodePath — Get an XPath for a node
DOMNode::hasAttributes — Checks if node has attributes
DOMNode::hasChildNodes — Checks if node has children
DOMNode::insertBefore — Adds a new child before a reference node
DOMNode::isDefaultNamespace — Checks if the specified namespaceURI is the default namespace or not
DOMNode::isSameNode — Indicates if two nodes are the same node
DOMNode::isSupported — Checks if feature is supported for specified version
DOMNode::lookupNamespaceUri — Gets the namespace URI of the node based on the prefix
DOMNode::lookupPrefix — Gets the namespace prefix of the node based on the namespace URI
DOMNode::normalize — Normalizes the node
DOMNode::removeChild — Removes child from list of children
DOMNode::replaceChild — Replaces a child
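
The tree-manipulation methods listed above can be combined as in this minimal sketch. The element names are made up for the example; only methods from the synopsis plus DOMDocument::createElement() and DOMDocument::saveXML() are used.

```php
<?php
// Sketch with made-up element names.
$doc = new DOMDocument();
$list = $doc->appendChild($doc->createElement('list'));

$b = $list->appendChild($doc->createElement('b'));      // children: b
$a = $list->insertBefore($doc->createElement('a'), $b); // children: a, b
$c = $list->appendChild($doc->createElement('c'));      // children: a, b, c

$list->replaceChild($doc->createElement('x'), $b);      // children: a, x, c
$list->removeChild($c);                                 // children: a, x

$copy = $list->cloneNode(true);                         // deep copy, not attached to the tree

echo $doc->saveXML($list), "\n";      // <list><a/><x/></list>
var_dump($list->hasChildNodes());     // bool(true)
var_dump($a->hasAttributes());        // bool(false)
var_dump($copy->isSameNode($list));   // bool(false)
echo $copy->childNodes->length, "\n"; // 2
```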

User Contributed Notes

Anonymous 06-May-2018 04:12


It would be helpful if the docs for individual properties mentioned the readonly status of some properties:

"ownerDocument

    The DOMDocument object associated with this node."

zlk1214 at gmail dot com 15-Jan-2016 01:07


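A function that sets the inner HTML of a node without encoding errors. $html may be broken markup such as "<a ID=id20>ssss".

function setInnerHTML($node, $html) {
    // Remove the existing children first
    while ($node->firstChild) {
        $node->removeChild($node->firstChild);
    }
    if ($html === null || $html === '') {
        return;
    }

    $doc = $node->ownerDocument;
    $htmlclip = new DOMDocument();
    // Collect parser warnings about broken markup instead of emitting them
    $previous = libxml_use_internal_errors(true);
    // The charset in the meta tag must match the encoding of $html (gb2312 here)
    $htmlclip->loadHTML('<meta http-equiv="Content-Type" content="text/html;CHARSET=gb2312"><div>' . $html . '</div>');
    libxml_use_internal_errors($previous);
    // documentElement is <html>; its last child is <body>, whose first child is the wrapper <div>
    $clipNode = $doc->importNode($htmlclip->documentElement->lastChild->firstChild, true);
    while ($item = $clipNode->firstChild) {
        $node->appendChild($item);
    }
}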

metanull 24-Jul-2014 03:11


Yet another DOMNode to PHP array conversion function.

The other ones on this page generate arrays that are too "complex"; this one should keep the array as tidy as possible.

Note: make sure to set LIBXML_NOBLANKS when calling DOMDocument::load, loadXML or loadHTML.

See: http://be2.php.net/manual/en/libxml.constants.php

See: http://be2.php.net/manual/en/domdocument.loadxml.php

<?php
/**
 * Returns an array representation of a DOMNode.
 * Note: make sure to use the LIBXML_NOBLANKS flag when loading XML into the DOMDocument.
 * Sibling elements that share a name overwrite each other; only the last one is kept.
 * @param DOMDocument $dom
 * @param DOMNode $node
 * @return array|string|false
 */
function nodeToArray($dom, $node) {
    if (!($dom instanceof DOMDocument) || !($node instanceof DOMNode)) {
        return false;
    }
    // Text nodes have no localName, so handle them before the name check
    if (XML_TEXT_NODE == $node->nodeType) {
        return $node->nodeValue;
    }
    if ('' === trim((string) $node->localName)) { // Discard unnamed nodes
        return false;
    }
    $array = array();
    if ($node->hasAttributes()) {
        foreach ($node->attributes as $attr) {
            $array['@' . $attr->localName] = $attr->nodeValue;
        }
    }
    foreach ($node->childNodes as $childNode) {
        if (XML_ELEMENT_NODE != $childNode->nodeType) {
            continue; // Only element children become array keys
        }
        if (1 == $childNode->childNodes->length && XML_TEXT_NODE == $childNode->firstChild->nodeType) {
            $array[$childNode->localName] = $childNode->nodeValue;
        } elseif (false !== ($a = nodeToArray($dom, $childNode))) {
            $array[$childNode->localName] = $a;
        }
    }
    // As in the original version, an element without attributes or usable children yields false
    return $array ? $array : false;
}
?>

pizarropablo at gmail dot com 16-Apr-2014 02:36


In response to alastair dot dallas at gmail dot com about "#text" nodes:

"#text" nodes appear when there are spaces or newlines between an end tag and the next start tag.

E.g. "<data><age>10</age>[SPACES]<other>20</other>[SPACES]</data>"

The childNodes of "data" contain 4 children:

- age = 10
- #text = spaces
- other = 20
- #text = spaces
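
The effect described above can be reproduced with a small sketch. The sample XML mirrors the example in this note; setting DOMDocument::$preserveWhiteSpace to false before loading is one way to drop the whitespace-only text nodes, and filtering them while iterating is another.

```php
<?php
$xml = "<data><age>10</age>\n  <other>20</other>\n</data>";

// Default loading keeps the whitespace between the tags as #text nodes.
$doc = new DOMDocument();
$doc->loadXML($xml);
echo $doc->documentElement->childNodes->length, "\n"; // 4

// With preserveWhiteSpace disabled (it must be set before loading),
// the whitespace-only text nodes are dropped.
$compact = new DOMDocument();
$compact->preserveWhiteSpace = false;
$compact->loadXML($xml);
echo $compact->documentElement->childNodes->length, "\n"; // 2

// Alternatively, skip whitespace-only text nodes while iterating.
foreach ($doc->documentElement->childNodes as $child) {
    if ($child->nodeType === XML_TEXT_NODE && trim($child->nodeValue) === '') {
        continue;
    }
    echo $child->nodeName, ' = ', $child->nodeValue, "\n";
}
```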

matej dot golian at gmail dot com 29-Aug-2013 01:15


Here is a little function that truncates a DOMNode to a specified number of text characters. I use it to generate HTML excerpts for my blog entries.

<?php

// Uses the mbstring extension so that multi-byte UTF-8 characters are counted and cut correctly.
function makehtmlexcerpt(DOMNode $html, $excerptlength)
{
    $remove = 0;
    // textContent is already entity-decoded, so it can be measured directly
    $htmllength = mb_strlen($html->textContent, 'UTF-8');
    $truncate = $htmllength - $excerptlength; // number of characters to cut off
    if ($htmllength > $excerptlength) {
        if ($html->hasChildNodes()) {
            $children = $html->childNodes;
            // Walk the children from last to first
            for ($counter = 0; $counter < $children->length; $counter++) {
                $child = $children->item($children->length - ($counter + 1));
                $childlength = mb_strlen($child->textContent, 'UTF-8');
                if ($childlength <= $truncate) {
                    // The whole child lies beyond the excerpt: mark it for removal
                    $remove++;
                    $truncate = $truncate - $childlength;
                } else {
                    // The cut falls inside this child: truncate it recursively
                    makehtmlexcerpt($child, $childlength - $truncate);
                    break;
                }
            }
            for ($counter = 0; $counter < $remove; $counter++) {
                $html->removeChild($html->lastChild);
            }
        } elseif ($html->nodeName == '#text') {
            $html->nodeValue = mb_substr($html->nodeValue, 0, $excerptlength, 'UTF-8');
        }
    }
    return $html;
}

?>

alastair dot dallas at gmail dot com 25-Sep-2011 08:44


The issues around mixed content took me some experimentation to remember, so I thought I'd add this note to save others time.



When your markup is something like: <div><p>First text.</p><ul><li><p>First bullet</p></li></ul></div>, you'll get XML_ELEMENT_NODEs that are quite regular. The <div> has children <p> and <ul> and the nodeValue for both <p>s yields the text you expect.



But when your markup is more like <p>This is <b>bold</b> and this is <i>italic</i>.</p>, you realize that the nodeValue for XML_ELEMENT_NODEs is not reliable. In this case, you need to look at the <p>'s child nodes. For this example, the <p> has children: #text, <b>, #text, <i>, #text. 



In this example, the nodeValue of <b> and <i> is the same as their #text children. But you could have markup like: <p>This <b>is bold and <i>bold italic</i></b>, you see?</p>. In this case, you need to look at the children of <b>, which will be #text, <i>, because the nodeValue of <b> will not be sufficient.



XML_TEXT_NODEs have no children and are always named '#text'. Depending on how whitespace is handled, your tree may have "empty" #text nodes as children of <body> and elsewhere.



Attributes are nodes, but I had forgotten that they are not in the tree expressed by childNodes. Walking the full tree using childNodes will not visit any attribute nodes.
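
The mixed-content behaviour described in this note can be checked with a sketch like the following. The helper name describeChildren() and the class attribute are hypothetical additions for illustration.

```php
<?php
// Hypothetical helper: returns the node names of the direct children.
function describeChildren(DOMNode $node)
{
    $names = array();
    foreach ($node->childNodes as $child) {
        $names[] = $child->nodeName;
    }
    return $names;
}

$doc = new DOMDocument();
$doc->loadXML('<p class="intro">This is <b>bold</b> and this is <i>italic</i>.</p>');
$p = $doc->documentElement;

echo implode(', ', describeChildren($p)), "\n"; // #text, b, #text, i, #text
echo $p->nodeValue, "\n";                       // This is bold and this is italic.

// Attributes are reached through $attributes, not through childNodes.
foreach ($p->attributes as $attr) {
    echo $attr->nodeName, ' = ', $attr->nodeValue, "\n"; // class = intro
}
```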

imranomar at gmail dot com 20-Mar-2011 06:10


Just discovered that $node->nodeValue strips out all the tags: for an element it returns only the text of its descendants, not the markup.
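
A short sketch of the difference, assuming a made-up sample document: nodeValue yields only the text, while DOMDocument::saveXML() called with the node as its argument serializes the markup as well.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<p>This is <b>bold</b></p>');
$p = $doc->documentElement;

echo $p->nodeValue, "\n";     // This is bold
echo $doc->saveXML($p), "\n"; // <p>This is <b>bold</b></p>
```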

I. Cook 19-Apr-2010 02:43


For a reference with more information about the XML DOM node types, see http://www.w3schools.com/dom/dom_nodetype.asp



(When using PHP DOMNode, these constants need to be prefixed with "XML_".)

R. Studer 13-Jan-2010 09:03


For clarification:

The seemingly undocumented methods that previous posters believe they 'discovered' on this class (getElementsByTagName and getAttribute) are in fact methods of the class DOMElement, which inherits from DOMNode.



See: http://www.php.net/manual/en/class.domelement.php
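
In practice this means checking the concrete class before calling element-only methods. A minimal sketch, with an invented sample document and the example.com URL as a placeholder:

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<links><a href="http://example.com/">Example</a> trailing text</links>');

$hrefs = array();
foreach ($doc->documentElement->childNodes as $node) {
    // childNodes yields DOMNode objects of various subclasses (elements, text, ...);
    // only DOMElement instances offer hasAttribute() and getAttribute().
    if ($node instanceof DOMElement && $node->hasAttribute('href')) {
        $hrefs[] = $node->getAttribute('href');
    }
}

print_r($hrefs);
```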

David Rekowski 08-Jan-2010 01:54


On PHP versions before 5.6.1 you cannot simply assign to $textContent to replace the text content of a DOMNode, even though the missing readonly flag suggests that you can. Instead you have to do something like this:



<?php



$node->removeChild($node->firstChild);

$node->appendChild(new DOMText('new text content'));



?>



This example shows what happens:



<?php

// loadXML() is called on an instance; calling it statically is not supported on current PHP versions
$doc = new DOMDocument();
$doc->loadXML('<node>old content</node>');
$node = $doc->getElementsByTagName('node')->item(0);
echo "Content 1: ".$node->textContent."\n";

$node->textContent = 'new content';
echo "Content 2: ".$node->textContent."\n";

$newText = new DOMText('new content');

$node->appendChild($newText);
echo "Content 3: ".$node->textContent."\n";

$node->removeChild($node->firstChild);
$node->appendChild($newText);
echo "Content 4: ".$node->textContent."\n";

?>

The output (on PHP versions before 5.6.1, where textContent is read-only) is:

Content 1: old content // starting content
Content 2: old content // trying to replace the content by overwriting $node->textContent
Content 3: old contentnew content // simply appending the new text node
Content 4: new content // removing firstChild before appending the new text node



If you want to have a CDATA section, use this:



<?php
// loadXML() is called on an instance; calling it statically is not supported on current PHP versions
$doc = new DOMDocument();
$doc->loadXML('<node>old content</node>');
$node = $doc->getElementsByTagName('node')->item(0);
$node->removeChild($node->firstChild);
$newText = $doc->createCDATASection('new cdata content');
$node->appendChild($newText);
echo "Content with CDATA: ".$doc->saveXML($node)."\n";
?>

Steve K 03-Nov-2009 11:47


This class apparently also has a getElementsByTagName method.



I was able to confirm this by evaluating the output from DOMNodeList->item() against various tests with the is_a() function.

marc at ermshaus dot org 05-May-2009 08:36


It took me forever to find a mapping for the XML_*_NODE constants. So I thought, it'd be handy to paste it here:



 1 XML_ELEMENT_NODE

 2 XML_ATTRIBUTE_NODE

 3 XML_TEXT_NODE

 4 XML_CDATA_SECTION_NODE

 5 XML_ENTITY_REFERENCE_NODE

 6 XML_ENTITY_NODE

 7 XML_PROCESSING_INSTRUCTION_NODE

 8 XML_COMMENT_NODE

 9 XML_DOCUMENT_NODE

10 XML_DOCUMENT_TYPE_NODE

11 XML_DOCUMENT_FRAGMENT_NODE

12 XML_NOTATION_NODE
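
To turn a nodeType number back into a readable name, a small lookup like the one below may help. The helper nodeTypeName() is hypothetical and covers only some of the constants from the list above.

```php
<?php
// Hypothetical helper: maps a node's numeric nodeType to the name of its constant.
function nodeTypeName(DOMNode $node)
{
    static $names = array(
        XML_ELEMENT_NODE       => 'XML_ELEMENT_NODE',
        XML_ATTRIBUTE_NODE     => 'XML_ATTRIBUTE_NODE',
        XML_TEXT_NODE          => 'XML_TEXT_NODE',
        XML_CDATA_SECTION_NODE => 'XML_CDATA_SECTION_NODE',
        XML_COMMENT_NODE       => 'XML_COMMENT_NODE',
        XML_DOCUMENT_NODE      => 'XML_DOCUMENT_NODE',
    );
    return isset($names[$node->nodeType]) ? $names[$node->nodeType] : 'unknown';
}

$doc = new DOMDocument();
$doc->loadXML('<r>text<!-- c --></r>');
$root = $doc->documentElement;

echo nodeTypeName($doc), "\n";              // XML_DOCUMENT_NODE
echo nodeTypeName($root), "\n";             // XML_ELEMENT_NODE
echo nodeTypeName($root->firstChild), "\n"; // XML_TEXT_NODE
echo nodeTypeName($root->lastChild), "\n";  // XML_COMMENT_NODE
```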

matt at lamplightdb dot co dot uk 06-Apr-2009 05:39


And apparently a setAttribute method too:



$node->setAttribute( 'attrName' , 'value' );

brian wildwoodassociates.info 07-Dec-2008 11:27


This class has a getAttribute method.

Assume that a DOMNode object $ref contains an anchor taken out of a DOMNodeList. Then

    $url = $ref->getAttribute('href');

isolates the URL associated with the href part of the anchor.
