DOM
Wrappers around the PHP DOM classes that handle the common DOM extension pitfalls.
Contents
- Features
- Requirements
- Container methods
- Loading documents
- Getting or changing document encoding
- Saving documents
- Getting DOM instances
- Running XPath queries
- Escaping strings
- DOM manipulation and traversal helpers
- Usage examples
- HTML documents
- Loading an existing document
- Creating a new document
- HTML fragments
- Loading an existing fragment
- Creating a new fragment
- XML documents
- Loading an existing document
- Creating a new document
- Handling XML namespaces in XPath queries
- XML fragments
- Loading an existing fragment
- Creating a new fragment
Features
- HTML documents
- encoding sniffing
- optional tidy support (automatically fix broken HTML)
- HTML fragments
- XML documents
- XML fragments
- XPath queries
- creating documents from scratch
- optional error suppression
- helper methods for common tasks, such as:
- querying multiple or a single node
- checking for containment
- removing a node
- removing all nodes from a list
- prepending a child node
- inserting a node after another node
- fetching <head> and <body> elements (HTML)
- fetching root elements (XML)
Requirements
- PHP 7.1+
Container methods
These methods are shared by both HTML and XML containers.
Loading documents
use Kuria\Dom\HtmlDocument; // or XmlDocument, HtmlFragment, etc.
// using loadString()
$dom = new HtmlDocument();
$dom->setLibxmlFlags($customLibxmlFlags); // optional
$dom->setIgnoreErrors($ignoreErrors); // optional
$dom->loadString($html);
// using the static shortcut for loadString()
$dom = HtmlDocument::fromString($html);
// using existing document instance
$dom = new HtmlDocument();
$dom->loadDocument($document);
// using the static shortcut for loadDocument()
$dom = HtmlDocument::fromDocument($document);
// creating an empty document
$dom = new HtmlDocument();
$dom->loadEmpty();
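How setIgnoreErrors() works internally is not shown here. As a rough illustration of what error suppression means at the DOM extension level, the following sketch uses only native PHP functions (no Kuria classes) to collect libxml errors instead of emitting warnings. The malformed markup is made up for the example.

```php
<?php
// Collect libxml errors internally instead of raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$document = new DOMDocument();
// Deliberately malformed XML: <unclosed> is never closed.
$loaded = $document->loadXML('<root><unclosed></root>');

// Inspect what went wrong, then reset the error buffer.
$errors = libxml_get_errors();
libxml_clear_errors();

// Restore the previous error handling mode.
libxml_use_internal_errors($previous);

var_dump($loaded);        // whether parsing succeeded
var_dump(count($errors)); // number of collected libxml errors
```

Restoring the previous mode keeps the change local, which matters when other code in the same process relies on libxml warnings.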
Getting or changing document encoding
// get encoding
$encoding = $dom->getEncoding();
// set encoding
$dom->setEncoding($newEncoding);
Note
The DOM extension uses UTF-8 encoding.
This means that text nodes, attributes, etc.:
- will be encoded using UTF-8 when read (e.g. $elem->textContent)
- should be encoded using UTF-8 when written (e.g. $elem->setAttribute())
The encoding configured by setEncoding() is used when saving the document,
see Saving documents.
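The UTF-8 behaviour described in the note can be observed with the native DOM extension alone. This sketch, which does not use the Kuria classes and whose sample document is invented for illustration, loads an ISO-8859-1 document and checks that the text read back is UTF-8.

```php
<?php
// An ISO-8859-1 encoded document; "\xE9" is "é" in that encoding.
$xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><root>caf\xE9</root>";

$document = new DOMDocument();
$document->loadXML($xml);

// Values read from the DOM are UTF-8: "é" becomes the two bytes \xC3\xA9.
$text = $document->documentElement->textContent;
var_dump($text === "caf\xC3\xA9");

// The declared encoding is still remembered by the document.
$declaredEncoding = $document->encoding;
var_dump($declaredEncoding);
```

Strings passed into the DOM (for example to setAttribute()) should likewise be UTF-8, regardless of the encoding the document is later saved in.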
Saving documents
// entire document
$content = $dom->save();
// single element
$content = $dom->save($elem);
// children of a single element
$content = $dom->save($elem, true);
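To make the difference between saving an element and saving its children concrete, here is a sketch using only the native DOMDocument API. It is not the library's implementation, and the sample markup is an assumption made for the example.

```php
<?php
$document = new DOMDocument();
$document->loadXML('<root><item>a</item><item>b</item></root>');
$root = $document->documentElement;

// Serializing the element itself includes its own tags.
$outer = $document->saveXML($root);

// Serializing only the children: concatenate each child node's markup.
$inner = '';
foreach ($root->childNodes as $child) {
    $inner .= $document->saveXML($child);
}

var_dump($outer);
var_dump($inner);
```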
Getting DOM instances
After a document has been loaded, the DOM instances are available via getters:
$document = $dom->getDocument();
$xpath = $dom->getXpath();
Running XPath queries
// get a DOMNodeList
$divs = $dom->query('//div');
// get a single DOMNode (or null)
$div = $dom->queryOne('//div');
// check if a query matches
$divExists = $dom->exists('//div');
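For comparison, the three operations above can be approximated with the native DOMXPath class. This is only a sketch of the equivalent low-level calls, not the library's code, and the sample document is made up.

```php
<?php
$document = new DOMDocument();
$document->loadXML('<root><div>first</div><div>second</div></root>');
$xpath = new DOMXPath($document);

// All matches: a DOMNodeList.
$divs = $xpath->query('//div');

// A single match: the first item, or null when nothing matched.
$firstDiv = $divs->item(0);
$missing = $xpath->query('//span')->item(0);

// Existence check: compare the list length against zero.
$divExists = $xpath->query('//div')->length > 0;

var_dump($divs->length, $firstDiv->textContent, $missing, $divExists);
```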
Escaping strings
$escapedString = $dom->escape($string);
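What escape() does internally is not documented in this section. As an illustration of why escaping matters when text is embedded into markup, the sketch below uses PHP's native htmlspecialchars(). The function name escapeForMarkup() is hypothetical, and treating it as equivalent to escape() is an assumption.

```php
<?php
// Hypothetical helper: convert markup-significant characters to entities.
function escapeForMarkup(string $string): string
{
    return htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
}

$escaped = escapeForMarkup('<a href="x">Tom & Jerry</a>');

// The result can be placed into markup without being parsed as tags.
var_dump($escaped);
```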
DOM manipulation and traversal helpers
Helpers for commonly needed tasks that aren't easily achieved via existing DOM methods:
// check if the document contains a node
$hasNode = $dom->contains($node);
// check if a node contains another node
$hasNode = $dom->contains($node, $parentNode);
// remove a node
$dom->remove($node);
// remove a list of nodes
$dom->removeAll($nodes);
// prepend a child node
$dom->prependChild($newNode, $existingNode);
// insert a node after another node
$dom->insertAfter($newNode, $existingNode);
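The native DOM API has insertBefore() but no direct counterpart for prepending, inserting after, or containment checks, which is presumably why these helpers exist. The following sketch shows one way such operations can be built from native calls. The function names are hypothetical and the code is not taken from the library.

```php
<?php
// Insert $newNode right after $existingNode.
// insertBefore() appends when the reference node is null.
function insertAfterSketch(DOMNode $newNode, DOMNode $existingNode): void
{
    $existingNode->parentNode->insertBefore($newNode, $existingNode->nextSibling);
}

// Insert $newNode as the first child of $parentNode.
function prependChildSketch(DOMNode $newNode, DOMNode $parentNode): void
{
    $parentNode->insertBefore($newNode, $parentNode->firstChild);
}

// Check whether $node is a descendant of $parentNode by walking up the tree.
function containsSketch(DOMNode $node, DOMNode $parentNode): bool
{
    for ($current = $node->parentNode; $current !== null; $current = $current->parentNode) {
        if ($current->isSameNode($parentNode)) {
            return true;
        }
    }
    return false;
}

$document = new DOMDocument();
$document->loadXML('<list><b/></list>');
$list = $document->documentElement;
$b = $list->firstChild;

prependChildSketch($document->createElement('a'), $list);
insertAfterSketch($document->createElement('c'), $b);

$markup = $document->saveXML($list);
var_dump($markup);
```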
Usage examples
HTML documents
Loading an existing document
use Kuria\Dom\HtmlDocument;
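$html = <<<HTML
<!doctype html>
<html>
<head><title>Example document</title></head>
<body><h1>Hello world!</h1></body>
</html>
HTML;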
$dom = HtmlDocument::fromString($html);
var_dump($dom->queryOne('//title')->textContent);
var_dump($dom->queryOne('//h1')->textContent);
Output:
string(16) "Example document"
string(12) "Hello world!"
Optionally, the markup can be fixed by Tidy prior to being loaded.
$dom = new HtmlDocument();
$dom->setTidyEnabled(true);
$dom->loadString($html);
Note
HTML documents ignore errors by default, so there is no need to call
$dom->setIgnoreErrors(true).
Creating a new document
use Kuria\Dom\HtmlDocument;
// initialize empty document
$dom = new HtmlDocument();
$dom->loadEmpty(['formatOutput' => true]);
// add <title>
$title = $dom->getDocument()->createElement('title');
$title->textContent = 'Lorem ipsum';
$dom->getHead()->appendChild($title);
// save
echo $dom->save();
Output:
HTML fragments
Loading an existing fragment
use Kuria\Dom\HtmlFragment;
$dom = HtmlFragment::fromString('<div id="test"><span>Hello</span></div>');
$element = $dom->queryOne('/div[@id="test"]/span');
if ($element) {
    var_dump($element->textContent);
}