How to select values between two nodes in DOM Crawler and PHP?

To select the values between two nodes with Symfony's DOM Crawler, pass the filterXPath method an XPath expression that matches the sibling elements lying between the two nodes you want to use as anchors.

Here is some sample code that prints the text content of every element between the h1 and h2 elements:

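<?php
// Assumption: symfony/dom-crawler was installed with Composer in this directory
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;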

$html = <<<EOD
  <div>
    <h1>Header 1</h1>
    <p>Paragraph 1</p>
    <p>Paragraph 2</p>
    <h2>Header 2</h2>
    <p>Paragraph 3</p>
  </div>
EOD;

// Load the HTML document
$crawler = new Crawler($html);

// Find all nodes between the h1 and h2 elements
$nodesBetweenHeadings = $crawler->filterXPath('//h1/following-sibling::h2/preceding-sibling::*[preceding-sibling::h1]');

// Loop over the nodes and print their text content
foreach ($nodesBetweenHeadings as $node) {
    echo $node->textContent . PHP_EOL;
}

The XPath expression used above can be read like this:

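  1. //h1: Select every h1 element in the document
  2. /following-sibling::h2: From each h1, move to the h2 elements that follow it under the same parent
  3. /preceding-sibling::*[preceding-sibling::h1]: Find all the preceding siblings of the h2 element that themselves have an h1 element as a preceding sibling (* matches any element, but not bare text nodes)

Because the expression relies on sibling axes, the h1, the h2, and the elements between them must all share the same parent.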
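
The three steps above can be traced one at a time. The sketch below is not DOM Crawler itself: it assumes PHP's bundled DOM extension is enabled and uses DOMDocument and DOMXPath to evaluate each prefix of the expression against the same sample HTML. The $steps and $trace variables are illustrative names, not part of any library.

```php
<?php
// Sketch: evaluate each step of the XPath with PHP's built-in DOM extension
// (DOMDocument + DOMXPath) instead of DOM Crawler.
$html = <<<EOD
<div>
  <h1>Header 1</h1>
  <p>Paragraph 1</p>
  <p>Paragraph 2</p>
  <h2>Header 2</h2>
  <p>Paragraph 3</p>
</div>
EOD;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Each entry extends the previous one by one location step
$steps = [
    '//h1',
    '//h1/following-sibling::h2',
    '//h1/following-sibling::h2/preceding-sibling::*[preceding-sibling::h1]',
];

// Map each expression to the "tag: text" labels of the elements it matches
$trace = [];
foreach ($steps as $expression) {
    $labels = [];
    foreach ($xpath->query($expression) as $node) {
        $labels[] = $node->nodeName . ': ' . trim($node->textContent);
    }
    $trace[$expression] = $labels;
    echo $expression . ' => ' . implode(', ', $labels) . PHP_EOL;
}
```

Printing the intermediate matches this way makes it easier to see which step goes wrong when an expression returns too many or too few nodes on a real page.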
