PHP Classes

HTML Parser: Parse HTML using DOMDocument

Ratings: Not enough user ratings
Unique user downloads: Total 721, this week 1
Download rankings: All time 4,593, this week 560 (up)
Version: html-parser 1.0
License: GNU General Publi...
PHP version: 5
Categories: HTML, PHP 5, Parsers
Description 


This class can parse HTML documents using DOMDocument.

It can load the HTML markup either from a file or from a text string.
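
The two loading modes can be illustrated with PHP's built-in DOMDocument, which the class is described as using. This is a minimal sketch that calls DOMDocument directly rather than the html_parser class, whose internals are not shown here; the sample markup and the temporary file are assumptions made for illustration.

```php
<?php
// Sketch only: calls DOMDocument directly, not the html_parser class.
$markup = '<html><body><p id="intro">Hello</p></body></html>'; // hypothetical sample markup

// Collect libxml parse warnings instead of printing them.
libxml_use_internal_errors(true);

// Load the markup from a text string.
$fromString = new DOMDocument();
$loadedString = $fromString->loadHTML($markup);

// Write the same markup to a temporary file, then load it from that file.
$path = tempnam(sys_get_temp_dir(), 'html');
file_put_contents($path, $markup);

$fromFile = new DOMDocument();
$loadedFile = $fromFile->loadHTMLFile($path);
unlink($path);

libxml_clear_errors();

// Both documents now contain the same paragraph.
echo $fromString->getElementsByTagName('p')->item(0)->textContent, "\n";
echo $fromFile->getElementsByTagName('p')->item(0)->textContent, "\n";
```

Both calls return a boolean, so a wrapper can record an error message when loading fails.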

It can parse the entire document, returning an array of elements.
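
One way to turn a whole document into an array of elements is a recursive walk over the DOM tree. The sketch below is an assumption about how this could be done with DOMDocument; the function name elementToArray and the array keys (tag, attributes, children) are hypothetical and may not match the format the class actually returns.

```php
<?php
// Sketch only: the array layout is an assumption, not the class's real output format.
function elementToArray(DOMElement $element)
{
    // Copy the element's attributes into a name => value map.
    $attributes = array();
    foreach ($element->attributes as $attribute) {
        $attributes[$attribute->name] = $attribute->value;
    }

    // Recurse into child elements, skipping text and comment nodes.
    $children = array();
    foreach ($element->childNodes as $child) {
        if ($child instanceof DOMElement) {
            $children[] = elementToArray($child);
        }
    }

    return array(
        'tag'        => $element->tagName,
        'attributes' => $attributes,
        'children'   => $children,
    );
}

$doc = new DOMDocument();
$doc->loadHTML('<html><body><div class="box"><p>One</p><p>Two</p></div></body></html>');

// Start the walk at the root <html> element.
$tree = elementToArray($doc->documentElement);
print_r($tree);
```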

It can search the document for a specific element by tag name, returning an array with each element found. It can also return each element's child elements.
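
A tag-name search of this kind can be sketched with DOMDocument::getElementsByTagName. The helper findByTagName, its $includeChildren flag, and the returned array keys are hypothetical names chosen for illustration; this version lists direct child elements only, which may differ from what the class returns.

```php
<?php
// Sketch only: a hypothetical helper built on DOMDocument, not the class's own method.
function findByTagName(DOMDocument $doc, $tagName, $includeChildren = false)
{
    $found = array();
    foreach ($doc->getElementsByTagName($tagName) as $element) {
        $entry = array(
            'tag'  => $element->tagName,
            'text' => trim($element->textContent),
        );
        if ($includeChildren) {
            // List the tag names of direct child elements only.
            $entry['children'] = array();
            foreach ($element->childNodes as $child) {
                if ($child instanceof DOMElement) {
                    $entry['children'][] = $child->tagName;
                }
            }
        }
        $found[] = $entry;
    }
    return $found;
}

$doc = new DOMDocument();
$doc->loadHTML('<html><body><ul><li>A</li><li>B <b>bold</b></li></ul></body></html>');

$items = findByTagName($doc, 'li', true);
print_r($items);
```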

It can return an element referenced by a given ID.
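
Looking up an element by ID could be done as sketched below, using DOMDocument::getElementById with a DOMXPath query as a fallback. The helper name tagNameForId is hypothetical, and the sketch assumes the ID contains no double quotes; the class may resolve IDs differently.

```php
<?php
// Sketch only: a hypothetical helper, assuming the id contains no double quotes.
function tagNameForId(DOMDocument $doc, $id)
{
    $element = $doc->getElementById($id);

    if ($element === null) {
        // Fallback: match the id attribute with an XPath query.
        $xpath = new DOMXPath($doc);
        $matches = $xpath->query('//*[@id="' . $id . '"]');
        $element = $matches->length > 0 ? $matches->item(0) : null;
    }

    // Return the tag name of the matching element, or null if none exists.
    return $element === null ? null : $element->tagName;
}

$doc = new DOMDocument();
$doc->loadHTML('<html><body><table id="salesTable"><tr><td>42</td></tr></table></body></html>');

var_dump(tagNameForId($doc, 'salesTable'));
var_dump(tagNameForId($doc, 'missing'));
```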

It can display the returned results in a human-readable form.

Author

Name: Dave Smith (available for paid consulting)
Classes: 51 packages
Country: United States
Age: 58
All time rank: 618 in United States
Week rank: 21 in United States (up 4)
Innovation award: nominee 32 times, winner 7 times

Recommendations

Retrieve page content
I need a crawler to get data from a URL

HTML config
Need to parse

Extract body text in an HTML document
I need to parse an HTML document and extract the text part

Best crawler for specific Web sites
How to choose pertinent paragraphs for indexing a specific site

DOMDocument find tag, child elements and attributes
Find a tag and its child elements in DOMDocument

Example

<?php
// Report all errors except notices
error_reporting(E_ALL ^ E_NOTICE);

// Instantiate the class
include('html.parser.class.php');
$parser = new html_parser;

// Use this method to load the markup from a file.
// Comment out the line below if you are using the string method.
$parser->loadFile('example.html');

// Use this method to load the markup from a string.
// Uncomment both lines to load from a string.
//$page = file_get_contents('example.html');
//$parser->loadString($page);

// Uncomment the method you want to use to process the markup:
// processDocument parses the entire markup
// processTagName parses the document for the specified tag name; set the option to true to include descendants
// processElementID parses and returns the tag name of the element containing the specified id
$parseResult = $parser->processDocument();
//$parseResult = $parser->processTagName('p', false);
//$parseResult = $parser->processElementID('salesTable');

// Show the result, or the error message if parsing failed
if (empty($parser->error)) {
    echo $parser->showResult($parseResult);
} else {
    echo $parser->error;
}

?>


Files

File                    Role      Description
example.html            Data      Example markup for parsing
example.php             Example   Example Usage
html.parser.class.php   Class     Main Class
license.txt             Lic.      License
manual.txt              Doc.      Documentation

All files except html.parser.class.php are accessible without login.
