Get html of a node
$html .= $dom->saveHTML($node);
(PHP 5, PHP 7)
$exclusive
[, bool $with_comments
[, array $xpath
[, array $ns_prefixes
]]]] ) : string$uri
[, bool $exclusive
= FALSE
[, bool $with_comments
= FALSE
[, array $xpath
[, array $ns_prefixes
]]]] ) : intNot implemented yet, always return NULL
The element name
Note:
The DOM extension uses UTF-8 encoding. Use utf8_encode() and utf8_decode() to work with texts in ISO-8859-1 encoding or Iconv for other encodings.
Get html of a node
$html .= $dom->saveHTML($node);
This page doesn't list the inherited properties from DOMNode, e.g. the quite important textContent property. It would be immensely helpful if it would list those as well.
How to rename an element and preserve attributes:
<?php
// Changes the name of element $element to $newName.
function renameElement($element, $newName) {
$newElement = $element->ownerDocument->createElement($newName);
$parentElement = $element->parentNode;
$parentElement->insertBefore($newElement, $element);
$childNodes = $element->childNodes;
while ($childNodes->length > 0) {
$newElement->appendChild($childNodes->item(0));
}
$attributes = $element->attributes;
while ($attributes->length > 0) {
$attribute = $attributes->item(0);
if (!is_null($attribute->namespaceURI)) {
$newElement->setAttributeNS('http://www.w3.org/2000/xmlns/',
'xmlns:'.$attribute->prefix,
$attribute->namespaceURI);
}
$newElement->setAttributeNode($attribute);
}
$parentElement->removeChild($element);
}
function prettyPrint($d) {
$d->formatOutput = true;
echo '<pre>'.htmlspecialchars($d->saveXML()).'</pre>';
}
$d = new DOMDocument( '1.0' );
$d->loadXML('<?xml version="1.0"?>
<library>
<data a:foo="1" x="bar" xmlns:a="http://example.com/a">
<invite>
<username>jmansa</username>
<userid>1</userid>
</invite>
<update>1</update>
</data>
</library>');
$xpath = new DOMXPath($d);
$elements = $xpath->query('/library/data');
if ($elements->length == 1) {
$element = $elements->item(0);
renameElement($element, 'invites');
}
prettyPrint($d);
?>
This works perfect for me as well:
<?php $xml = $domElement->ownerDocument->saveXML($domElement); ?>
you can use DOMNode::nodeValue
DOMElement inherits this public property.
$elem->nodeValue
Hi!
Combining all th comments, the easiest way to get inner HTML of the node is to use this function:
<?php
function get_inner_html( $node ) {
$innerHTML= '';
$children = $node->childNodes;
foreach ($children as $child) {
$innerHTML .= $child->ownerDocument->saveXML( $child );
}
return $innerHTML;
}
?>
The following code shows can text-only content be extracted from a document.
<?php
function getTextFromNode($Node, $Text = "") {
if ($Node->tagName == null)
return $Text.$Node->textContent;
$Node = $Node->firstChild;
if ($Node != null)
$Text = getTextFromNode($Node, $Text);
while($Node->nextSibling != null) {
$Text = getTextFromNode($Node->nextSibling, $Text);
$Node = $Node->nextSibling;
}
return $Text;
}
function getTextFromDocument($DOMDoc) {
return getTextFromNode($DOMDoc->documentElement);
}
$Doc = new DOMDocument();
$Doc->loadHTMLFile("Test.html");
echo getTextFromDocument($Doc)."\n";
?>
It would be nice to have a function which converts a document/node/element into a string.
Anyways, I use the following code snippet to get the innerHTML value of a DOMNode:
<?php
function getInnerHTML($Node)
{
$Body = $Node->ownerDocument->documentElement->firstChild->firstChild;
$Document = new DOMDocument();
$Document->appendChild($Document->importNode($Body,true));
return $Document->saveHTML();
}
?>
Although it may be preferable to use the dom to manipulate elements, sometimes it's useful to actually get the innerHTML from a document element (e.g. to load into a client-side editor).
To get the innerHTML of a specific element ($elem_id) in a specific html file ($filepath):
<?php
$innerHTML = '';
$doc = new DOMDocument();
$doc->loadHTMLFile($filepath);
$elem = $doc->getElementById($elem_id);
// loop through all childNodes, getting html
$children = $elem->childNodes;
foreach ($children as $child) {
$tmp_doc = new DOMDocument();
$tmp_doc->appendChild($tmp_doc->importNode($child,true));
$innerHTML .= $tmp_doc->saveHTML();
}
?>
Hi to get the value of DOMElement just get the nodeValue public parameter (it is inherited from DOMNode):
<?php
echo $domElement->nodeValue;
?>
Everything is obvious if you now about this thing ;-)
Caveat!
It took me almost an hour to figure this out, so I hope it saves at least one of you some time.
If you want to debug your DOM tree and try var_dump() or similar you will be fooled into thinking the DOMElement that you are looking at is empty, because var_dump() says: object(DOMElement)#1 (0) { }
After much debugging I found out that all DOM objects are invisible to var_dump() and print_r(), my guess is because they are C objects and not PHP objects. So I tried saveXML(), which works fine on DOMDocument, but is not implemented on DOMElement.
The solution is simple (if you know it):
$xml = $domElement->ownerDocument->saveXML($domElement);
This will give you an XML representation of $domElement.
I wanted to find similar Elements - thats why I built an Xpath-String like this - maybe somebody needs it... its not very pretty - but neither is domdocument :)
<?php
$dom->load($xmlFile))
$xpathQuery = '//*';
$xmlNodes = $xpath->query($xpathQuery);
$pathlist = array();
$attrlist = array();
foreach ($xmlNodes as $node) {
$depth = $this->_getDomDepth($node); //get Path-Depth (for array key)
$pathlist[$depth] = $node->tagName; // tagname
$attrs = $node->attributes;
$attr='';
$a=0;
foreach ($attrs as $attrName => $attrNode) // attributes
{
if ($attrName !='reg')
{
if ($a++!=0) $attr .= ' and ';
$attr .= '@'.$attrName.'='."'".$attrNode->value."'";
}
}
$attrlist[$depth] = $attr?'['.$attr.']':'';
$path = ''; for ($i=0;$i<=$depth;$i++) $path .= '/'.$pathlist[$i].$attrlist[$i]; // the xpath of the actual Element
// ... now you can go on and user $path to find similar elements
}
}
}
private function _getDomDepth(DomNode $node)
{
$r = -2;
while ($node) {
$r++;
$node = $node->parentNode;
}
return $r;
}
?>
Hi there.
Remember to append a DOMNode (or any of its descendants) to a DOMDocument __BEFORE__ you try to append a child to it.
I don't know why it has to be this way but it can't be done without it.
bye