kuria/dom

DOM

Wrappers around the PHP DOM classes that handle the common DOM extension pitfalls.

Contents

  • Features
  • Requirements
  • Container methods
    • Loading documents
    • Getting or changing document encoding
    • Saving documents
    • Getting DOM instances
    • Running XPath queries
    • Escaping strings
    • DOM manipulation and traversal helpers
  • Usage examples
    • HTML documents
      • Loading an existing document
      • Creating a new document
    • HTML fragments
      • Loading an existing fragment
      • Creating a new fragment
    • XML documents
      • Loading an existing document
      • Creating a new document
      • Handling XML namespaces in XPath queries
    • XML fragments
      • Loading an existing fragment
      • Creating a new fragment

Features

  • HTML documents
    • encoding sniffing
    • optional tidy support (automatically fix broken HTML)
  • HTML fragments
  • XML documents
  • XML fragments
  • XPath queries
  • creating documents from scratch
  • optional error suppression
  • helper methods for common tasks, such as:
    • querying multiple or a single node
    • checking for containment
    • removing a node
    • removing all nodes from a list
    • prepending a child node
    • inserting a node after another node
    • fetching <head> and <body> elements (HTML)
    • fetching root elements (XML)

Requirements

  • PHP 7.1+

Container methods

These methods are shared by both HTML and XML containers.

Loading documents



use Kuria\Dom\HtmlDocument; // or XmlDocument, HtmlFragment, etc.

// using loadString()
$dom = new HtmlDocument();
$dom->setLibxmlFlags($customLibxmlFlags); // optional
$dom->setIgnoreErrors($ignoreErrors); // optional
$dom->loadString($html);

// using static loadString() shortcut
$dom = HtmlDocument::fromString($html);

// using existing document instance
$dom = new HtmlDocument();
$dom->loadDocument($document);

// using static loadDocument() shortcut
$dom = HtmlDocument::fromDocument($document);

// creating an empty document
$dom = new HtmlDocument();
$dom->loadEmpty();

Getting or changing document encoding



// get encoding
$encoding = $dom->getEncoding();

// set encoding
$dom->setEncoding($newEncoding);

Note

The DOM extension uses UTF-8 encoding.

This means that text nodes, attributes, etc.:

  • will be encoded using UTF-8 when read (e.g. $elem->textContent)
  • should be encoded using UTF-8 when written (e.g. $elem->setAttribute())

The encoding configured by setEncoding() is used when saving the document, see Saving documents.
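
The split between the internal UTF-8 representation and the output encoding can be observed with the native DOM extension alone. The following is a minimal sketch that does not use the Kuria wrappers; the sample markup and variable names are made up for illustration.

```php
<?php
// Native DOM extension only; no Kuria classes involved.
// Source document declared and encoded as ISO-8859-1 ("\xE9" is "é").
$latin1 = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n<p>caf\xE9</p>";

$document = new DOMDocument();
$document->loadXML($latin1);

// Values read from the tree are UTF-8: "é" becomes the two bytes C3 A9.
$text = $document->documentElement->textContent;
var_dump(bin2hex($text)); // string(10) "636166c3a9"

// Serialization converts back to the document's own encoding.
$saved = $document->saveXML();
var_dump($document->encoding); // string(10) "ISO-8859-1"
var_dump(strpos($saved, "caf\xE9") !== false); // bool(true)
```

Strings passed into the tree (for example via setAttribute()) should therefore be converted to UTF-8 first, whatever encoding the document is saved in.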

Saving documents



// entire document
$content = $dom->save();

// single element
$content = $dom->save($elem);

// children of a single element
$content = $dom->save($elem, true);
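
For comparison, the three save modes can be approximated with the native extension. This is a sketch of the idea only, not the container's implementation; the sample list markup is invented.

```php
<?php
// Native DOM extension only; illustrates "element" vs. "children only" output.
$document = new DOMDocument();
$document->loadXML('<ul><li>one</li><li>two</li></ul>');
$list = $document->documentElement;

// Single element: pass the node to saveXML() (saveHTML() for HTML documents).
$outer = $document->saveXML($list);

// Children only: serialize each child node and concatenate the results.
$inner = '';
foreach ($list->childNodes as $child) {
    $inner .= $document->saveXML($child);
}

echo $outer, "\n"; // <ul><li>one</li><li>two</li></ul>
echo $inner, "\n"; // <li>one</li><li>two</li>
```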

Getting DOM instances

After a document has been loaded, the DOM instances are available via getters:



$document = $dom->getDocument();
$xpath = $dom->getXpath();

Running XPath queries

// get a DOMNodeList
$divs = $dom->query('//div');

// get a single DOMNode (or null)
$div = $dom->queryOne('//div');

// check if a query matches
$divExists = $dom->exists('//div');
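
The three query styles map onto plain DOMXPath calls roughly as follows. This is a sketch using the native extension, not the container's actual code; the sample document is invented.

```php
<?php
// Native DOM extension only; shows list, single-node and existence queries.
$document = new DOMDocument();
$document->loadXML('<root><div id="a"/><div id="b"/></root>');
$xpath = new DOMXPath($document);

// All matches as a DOMNodeList.
$divs = $xpath->query('//div');

// First match or null, which is what a "query one" helper boils down to.
$first = $divs->length > 0 ? $divs->item(0) : null;

// Existence check: does the expression match at least one node?
$spanExists = $xpath->query('//span')->length > 0;

var_dump($divs->length);              // int(2)
var_dump($first->getAttribute('id')); // string(1) "a"
var_dump($spanExists);                // bool(false)
```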

Escaping strings



$escapedString = $dom->escape($string);
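
The exact rules applied by escape() are not stated here. As an assumption, the sketch below uses PHP's htmlspecialchars() as a stand-in to show the kind of escaping that is typically needed before a string is embedded in markup.

```php
<?php
// Assumption: htmlspecialchars() stands in for the container's escape() method.
$string = '<a href="x">Tom & Jerry</a>';
$escaped = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n"; // &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```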

DOM manipulation and traversal helpers

Helpers for commonly needed tasks that aren't easily achieved via existing DOM methods:



// check if the document contains a node
$hasNode = $dom->contains($node);

// check if a node contains another node
$hasNode = $dom->contains($node, $parentNode);

// remove a node
$dom->remove($node);

// remove a list of nodes
$dom->removeAll($nodes);

// prepend a child node
$dom->prependChild($newNode, $existingNode);

// insert a node after another node
$dom->insertAfter($newNode, $existingNode);
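
To show why these helpers are convenient, here is how the same operations look with the native DOM methods only. The function names are hypothetical and the argument order is assumed to mirror the calls above; this is a sketch, not the library's implementation.

```php
<?php
// Hypothetical helpers built on native DOM methods only.
function prependChildSketch(DOMNode $newNode, DOMNode $parent): void
{
    // insertBefore() appends when the reference node is null (empty parent).
    $parent->insertBefore($newNode, $parent->firstChild);
}

function insertAfterSketch(DOMNode $newNode, DOMNode $existing): void
{
    // A null nextSibling means $existing is last, so the new node is appended.
    $existing->parentNode->insertBefore($newNode, $existing->nextSibling);
}

function removeAllSketch(iterable $nodes): void
{
    // Copy first: some node lists are live and shrink while nodes are removed.
    $copy = [];
    foreach ($nodes as $node) {
        $copy[] = $node;
    }
    foreach ($copy as $node) {
        $node->parentNode->removeChild($node);
    }
}

$document = new DOMDocument();
$document->loadXML('<list><b/><c/><x/><x/></list>');
$list = $document->documentElement;

prependChildSketch($document->createElement('a'), $list);
insertAfterSketch($document->createElement('d'), $list->childNodes->item(1)); // after <b>
removeAllSketch($list->getElementsByTagName('x')); // a live node list

$result = $document->saveXML($list);
echo $result, "\n"; // <list><a/><b/><d/><c/></list>
```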

Usage examples

HTML documents

Loading an existing document

use Kuria\Dom\HtmlDocument;

$html = <<<HTML
<!doctype html>
<html>
    <head>
        <title>Example document</title>
    </head>
    <body>
        <h1>Hello world!</h1>
    </body>
</html>
HTML;

$dom = HtmlDocument::fromString($html);

var_dump($dom->queryOne('//title')->textContent);
var_dump($dom->queryOne('//h1')->textContent);

Output:

string(16) "Example document"
string(12) "Hello world!"

Optionally, the markup can be fixed by Tidy prior to being loaded.



$dom = new HtmlDocument();
$dom->setTidyEnabled(true);
$dom->loadString($html);

Note

HTML documents ignore errors by default, so there is no need to call $dom->setIgnoreErrors(true).
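
How the container suppresses errors internally is not described here. With the native extension, a comparable effect is usually achieved with libxml_use_internal_errors(), as in this sketch (the broken markup is invented for illustration):

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$document = new DOMDocument();
$loaded = $document->loadHTML('<html><body><p>Unclosed <b>markup</p></body></html>');

// Parse problems (if any) can be inspected here, then discarded.
$errors = libxml_get_errors();
libxml_clear_errors();

// Restore the previous setting so other code is unaffected.
libxml_use_internal_errors($previous);

var_dump($loaded); // bool(true)
```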

Creating a new document

use Kuria\Dom\HtmlDocument;

// initialize empty document
$dom = new HtmlDocument();
$dom->loadEmpty(['formatOutput' => true]);

// add <title>
$title = $dom->getDocument()->createElement('title');
$title->textContent = 'Lorem ipsum';

$dom->getHead()->appendChild($title);

// save
echo $dom->save();

Output:

<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Lorem ipsum</title>
</head>
<body>
</body>
</html>

HTML fragments

Loading an existing fragment

use Kuria\Dom\HtmlFragment;

$dom = HtmlFragment::fromString('<div id="test"><span>Hello</span></div>');

$element = $dom->queryOne('/div[@id="test"]/span');

if ($element) {
    var_dump($element->textContent);
}

Output:

string(5) "Hello"

Note

HTML fragments ignore errors by default, so there is no need to call $dom->setIgnoreErrors(true).

Creating a new fragment

Output:

example

XML documents

Loading an existing document

use Kuria\Dom\XmlDocument;

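$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<library>
    <book name="Don Quixote" author="Miguel de Cervantes" />
    <book name="Hamlet" author="William Shakespeare" />
    <book name="Alice's Adventures in Wonderland" author="Lewis Carroll" />
</library>
XML;

$dom = XmlDocument::fromString($xml);

foreach ($dom->query('/library/book') as $book) {
    /** @var \DOMElement $book */
    var_dump("{$book->getAttribute('name')} by {$book->getAttribute('author')}");
}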

Output:

string(34) "Don Quixote by Miguel de Cervantes"
string(29) "Hamlet by William Shakespeare"
string(49) "Alice's Adventures in Wonderland by Lewis Carroll"

Creating a new document

use Kuria\Dom\XmlDocument;

// initialize empty document
$dom = new XmlDocument();
$dom->loadEmpty(['formatOutput' => true]);

// add <users> root element
$document = $dom->getDocument();
$document->appendChild($document->createElement('users'));

// add some users
$bob = $document->createElement('user');
$bob->setAttribute('username', 'bob');
$bob->setAttribute('access-token', '123456');

$john = $document->createElement('user');
$john->setAttribute('username', 'john');
$john->setAttribute('access-token', 'foobar');

$dom->getRoot()->appendChild($bob);
$dom->getRoot()->appendChild($john);

// save
echo $dom->save();

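Output:

<?xml version="1.0" encoding="UTF-8"?>
<users>
  <user username="bob" access-token="123456"/>
  <user username="john" access-token="foobar"/>
</users>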

Handling XML namespaces in XPath queries

use Kuria\Dom\XmlDocument;

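$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<lib:library xmlns:lib="http://example.com/">
    <lib:book name="Don Quixote" />
    <lib:book name="Hamlet" />
    <lib:book name="Alice's Adventures in Wonderland" />
</lib:library>
XML;

$dom = XmlDocument::fromString($xml);

// register namespace in XPath
$dom->getXpath()->registerNamespace('lib', 'http://example.com/');

// query using the prefix
foreach ($dom->query('//lib:book') as $book) {
    /** @var \DOMElement $book */
    var_dump($book->getAttribute('name'));
}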

Output:

string(11) "Don Quixote"
string(6) "Hamlet"
string(32) "Alice's Adventures in Wonderland"
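
Registration is also needed when the document uses a default (unprefixed) namespace, because XPath 1.0 treats unprefixed names as belonging to no namespace. The sketch below shows this with the native DOMXPath class; the sample document and the lib prefix are chosen for illustration.

```php
<?php
// Native DOM extension only; the document uses a default namespace.
$document = new DOMDocument();
$document->loadXML('<library xmlns="http://example.com/"><book name="Hamlet"/></library>');
$xpath = new DOMXPath($document);

// Unprefixed names in XPath match only elements in no namespace.
$unprefixed = $xpath->query('//book')->length;

// After binding a prefix to the namespace URI, the query matches.
$xpath->registerNamespace('lib', 'http://example.com/');
$prefixed = $xpath->query('//lib:book')->length;

var_dump($unprefixed); // int(0)
var_dump($prefixed);   // int(1)
```

The prefix used in the query does not have to match any prefix in the document; only the namespace URI matters.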

XML fragments

Loading an existing fragment

use Kuria\Dom\XmlFragment;

$dom = XmlFragment::fromString('<fruits><fruit name="Apple" /><fruit name="Banana" /></fruits>');

foreach ($dom->query('/fruits/fruit') as $fruit) {
    /** @var \DOMElement $fruit */
    var_dump($fruit->getAttribute('name'));
}

Output:

string(5) "Apple"
string(6) "Banana"

Creating a new fragment

use Kuria\Dom\XmlFragment;

// initialize empty fragment
$dom = new XmlFragment();
$dom->loadEmpty(['formatOutput' => true]);

// add a new element
$person = $dom->getDocument()->createElement('person');
$person->setAttribute('name', 'John Smith');

$dom->getRoot()->appendChild($person);

// save
echo $dom->save();

Output:

<person name="John Smith"/>

License

MIT license
