PHP: Atom Feed Reader: Source Code
Based on the RSS Feed Reader, this is a similar class designed to parse Atom feeds and display them on a webpage as HTML. It's not been tested on all versions of the Atom format, but should be easy enough to customise.
Embedding an Atom Feed as HTML
Again, it's a simple class with a constructor and two public functions: getOutput returns an HTML-formatted version of the Atom feed, while getRawOutput returns all the attributes in a single multi-level array.
<?PHP
use \Chirp\AtomParser;
// where is the feed located?
$url = "http://www.example.net/atom.xml";
// create object to hold data and display output
$atom_parser = new AtomParser($url);
$output = $atom_parser->getOutput(); # returns string containing HTML
echo $output;
?>
You can limit the number of entries displayed by passing an integer as the first argument to getOutput(). If the encoding of your webpage doesn't match that of the feed you're subscribing to, pass the desired encoding (e.g. ISO-8859-1) as the second argument. By default the output of this class uses UTF-8 (Unicode).
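To make the encoding argument more concrete, here is a minimal sketch of the kind of conversion involved when a UTF-8 feed is shown on an ISO-8859-1 page. It calls PHP's mb_convert_encoding directly rather than going through the class, and the sample string is made up for illustration.

```php
<?php
// Illustrative only: a UTF-8 string as it might arrive from a feed.
$utf8_title = "Caf\u{E9} news"; // "Café news", the é takes two bytes in UTF-8

// Convert to the page's encoding, much as getOutput(5, 'ISO-8859-1') would request.
$latin1_title = mb_convert_encoding($utf8_title, 'ISO-8859-1', 'UTF-8');

// In ISO-8859-1 the é is a single byte (0xE9), so the string is one byte shorter.
printf("%d bytes in UTF-8, %d bytes in ISO-8859-1\n", strlen($utf8_title), strlen($latin1_title));
```

This needs the mbstring extension, which is also what the class checks for before converting anything.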
In some cases, for your request to get through, you will need to set a User Agent before calling the parser script.
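As a rough sketch, one way to do this is with PHP's user_agent setting, which applies to requests made through the HTTP stream wrapper (fopen, file_get_contents). The agent string below is a made-up example, and whether this setting is honoured depends on how your copy of the script fetches the feed; a cURL-based fetch needs the option set on the cURL handle instead.

```php
<?php
// Assumption: the feed is fetched through PHP's HTTP stream wrapper
// (fopen/file_get_contents). The agent string is a made-up example.
ini_set('user_agent', 'ExampleFeedReader/1.0 (+http://www.example.net/)');

// If the fetching is done with cURL instead, the equivalent is set per request:
// curl_setopt($ch, CURLOPT_USERAGENT, 'ExampleFeedReader/1.0');

echo ini_get('user_agent'), "\n";
```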
Source code of atomparser.php
This class is by no means the be-all and end-all of Atom parsing. It's designed to be simple, functional and easily customisable. Any feedback would be welcome.
File: atomparser.php
<?PHP
namespace Chirp;
// Original PHP code by Chirp Internet: www.chirpinternet.eu
// Please acknowledge use of this code by including this header.
class AtomParser
{
// keeps track of current and preceding elements
var $tags = [];
// array containing all feed data
var $output = [];
// return value for display functions
var $retval = "";
var $errorlevel = 0;
var $encoding = [];
// constructor for new object
function __construct($file)
{
$errorlevel = error_reporting();
error_reporting($errorlevel & ~E_NOTICE);
// instantiate xml-parser and assign event handlers
$xml_parser = xml_parser_create("");
xml_set_object($xml_parser, $this);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "parseData");
// open file for reading and send data to xml-parser
$data = preg_match("/^http/", $file) ? CurlTools::http_get_contents($file) : file_get_contents($file);
xml_parse($xml_parser, $data) or die(
sprintf("myAtomParser: Error <b>%s</b> at line <b>%d</b><br>",
xml_error_string(xml_get_error_code($xml_parser)),
xml_get_current_line_number($xml_parser))
);
// dismiss xml parser
xml_parser_free($xml_parser);
error_reporting($errorlevel);
}
function startElement($parser, $tagname, $attrs = [])
{
if($this->encoding) {
// content is encoded - so keep elements intact
$tmpdata = "<$tagname";
if($attrs) foreach($attrs as $key => $val) $tmpdata .= " $key=\"$val\"";
$tmpdata .= ">";
$this->parseData($parser, $tmpdata);
} else {
if(isset($attrs['HREF'], $attrs['REL']) && $attrs['HREF'] && $attrs['REL'] && ($attrs['REL'] == 'alternate')) {
$this->startElement($parser, 'LINK', []);
$this->parseData($parser, $attrs['HREF']);
$this->endElement($parser, 'LINK');
}
if(isset($attrs['TYPE']) && $attrs['TYPE']) {
$this->encoding[$tagname] = $attrs['TYPE'];
}
// check if this element can contain others - list may be edited
if(preg_match("/^(FEED|ENTRY)$/", $tagname)) {
if($this->tags) {
$depth = count($this->tags);
$tmp = end($this->tags);
$parent = key($tmp);
$num = current($tmp);
if($parent) {
$this->tags[$depth-1][$parent][$tagname] = ($this->tags[$depth-1][$parent][$tagname] ?? 0) + 1;
}
}
array_push($this->tags, [$tagname => []]);
} else {
// add tag to tags array
array_push($this->tags, $tagname);
}
}
}
function endElement($parser, $tagname)
{
// remove tag from tags array
if($this->encoding) {
if(isset($this->encoding[$tagname])) {
unset($this->encoding[$tagname]);
array_pop($this->tags);
} else {
if(!preg_match("/^(BR|IMG)$/", $tagname)) $this->parseData($parser, "</$tagname>");
}
} else {
array_pop($this->tags);
}
}
function parseData($parser, $data)
{
// return if data contains no text
if(!trim($data)) return;
$evalcode = "\$this->output";
foreach($this->tags as $tag) {
if(is_array($tag)) {
$tagname = key($tag);
$indexes = current($tag);
$evalcode .= "[\"$tagname\"]";
if(isset(${$tagname}) && ${$tagname}) {
$evalcode .= "[" . (${$tagname} - 1) . "]";
}
if($indexes) extract($indexes);
} else {
if(preg_match("/^([A-Z]+):([A-Z]+)$/", $tag, $matches)) {
$evalcode .= "[\"$matches[1]\"][\"$matches[2]\"]";
} else {
$evalcode .= "[\"$tag\"]";
}
}
}
if(isset($this->encoding['CONTENT']) && $this->encoding['CONTENT'] == "text/plain") {
$data = "<pre>$data</pre>";
}
try {
@eval("$evalcode = $evalcode . '" . addslashes($data) . "';");
} catch(\ParseError $e) {
// ParseError lives in the global namespace; the message is read via its getter
error_log($e->getMessage());
}
}
// display a single feed as HTML
function display_feed($data, $limit)
{
extract($data);
if(!empty($TITLE)) {
// display feed information
$this->retval .= "<h1>";
if(isset($LINK) && $LINK) {
$this->retval .= "<a href=\"$LINK\" target=\"_blank\">";
}
$this->retval .= stripslashes($TITLE);
if(isset($LINK) && $LINK) {
$this->retval .= "</a>";
}
$this->retval .= "</h1>\n";
if(isset($TAGLINE) && $TAGLINE) {
$this->retval .= "<P>" . stripslashes($TAGLINE) . "</P>\n\n";
}
$this->retval .= "<div class=\"divider\"><!-- --></div>\n\n";
}
if(!empty($ENTRY)) {
// display feed entry(s)
foreach($ENTRY as $item) {
$this->display_entry($item, "FEED");
if(is_int($limit) && --$limit <= 0) break;
}
}
}
// display a single entry as HTML
function display_entry($data, $parent)
{
extract($data);
if(empty($TITLE)) return;
$this->retval .= "<p><b>";
if(!empty($LINK)) $this->retval .= "<a href=\"$LINK\" target=\"_blank\">";
$this->retval .= stripslashes($TITLE);
if(!empty($LINK)) $this->retval .= "</a>";
$this->retval .= "</b>";
if(isset($ISSUED) && $ISSUED) {
$this->retval .= " <small>($ISSUED)</small>";
}
$this->retval .= "</p>\n";
if(isset($AUTHOR) && $AUTHOR) {
$this->retval .= "<P><b>Author:</b> " . stripslashes($AUTHOR['NAME']) . "</P>\n\n";
}
if(!empty($CONTENT)) {
$this->retval .= "<P>" . stripslashes($CONTENT) . "</P>\n\n";
} elseif(!empty($SUMMARY)) {
$this->retval .= "<P>" . stripslashes($SUMMARY) . "</P>\n\n";
}
}
function fixEncoding(&$input, $key, $output_encoding)
{
if(!function_exists('mb_detect_encoding')) return $input;
$encoding = mb_detect_encoding($input);
switch($encoding)
{
case 'ASCII':
case $output_encoding:
break;
case '':
$input = mb_convert_encoding($input, $output_encoding);
break;
default:
$input = mb_convert_encoding($input, $output_encoding, $encoding);
}
}
// display entire feed as HTML
function getOutput($limit = FALSE, $output_encoding = 'UTF-8')
{
$this->retval = "";
$start_tag = key($this->output);
switch($start_tag)
{
case "FEED":
foreach($this->output as $feed) $this->display_feed($feed, $limit);
break;
default:
die("Error: unrecognized start tag '$start_tag' in getOutput()");
}
if($this->retval) {
// retval is a string, so convert it directly
$this->fixEncoding($this->retval, 0, $output_encoding);
}
return $this->retval;
}
// return raw data as array
function getRawOutput($output_encoding='UTF-8')
{
array_walk_recursive($this->output, [$this, 'fixEncoding'], $output_encoding);
return $this->output;
}
}
Fields Supported by Default
This script supports the following attributes (fields) by default but can easily be extended. See the Feed Reader Demonstration for examples of parsed Atom (and RSS) feeds.
Channel (FEED)
- Title
- Link (HREF, Rel)
- Tagline
Item (ENTRY)
- Title (required)
- Link (HREF, Rel)
- Issued
- Author (Name)
- Content or Summary
If you think it's worth adding support for other Atom attributes, please let us know using the Feedback link below.
Related Articles - Feed Readers
- PHP RSS Feed Reader Code Example
- PHP Combined RSS and Atom Feed Reader
- PHP Feed Reader with Ajax Updating
- PHP Atom Feed Reader: Source Code
- PHP RSS Feed Reader: Source Code
- PHP RSS and Atom Feed Reader
- PHP YouTube API Feed Reader: Source Code
- PHP Displaying and updating RSS Content using Ajax
Mohammed 3 October, 2016
Hello I got the following error
Fatal error: Call to undefined function Chirp\http_get_contents()
The missing function is defined here.
ABC 18 November, 2011
Is it possible to define a number of characters to display per entry? i.e. I want to display the first 40 characters of each entry in the feed.
You can add code to truncate the text for each feed entry. It would need to be added to the display_entry function.
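A minimal sketch of such a helper is shown here. The function name truncate_entry_text is hypothetical and not part of the class; it strips markup first so that a tag is never cut in half, and assumes the text is UTF-8.

```php
<?php
// Hypothetical helper (not part of the class): shorten entry text for display.
function truncate_entry_text(string $html, int $max_chars = 40): string
{
    // strip markup first so a tag is never cut in half
    $text = trim(strip_tags($html));
    if (mb_strlen($text, 'UTF-8') <= $max_chars) {
        return $text;
    }
    // cut at the character limit and mark the text as shortened
    return rtrim(mb_substr($text, 0, $max_chars, 'UTF-8')) . '...';
}

echo truncate_entry_text("<p>The quick brown fox jumps over the lazy dog and keeps running.</p>"), "\n";
```

In display_entry you could then wrap the existing value, for example truncate_entry_text(stripslashes($CONTENT)), in place of the plain stripslashes call.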
Dionn Schaffner 15 April, 2011
I installed all the files, but I keep getting a "Could not open URL" error. I double checked and my php.ini for PHP5 file has the allow_url_fopen option enabled. What am I missing?
There are no problems accessing your feed from our server. If it's still not working you should check your server logs for errors - on the site running the script (for PHP warnings) and the one hosting the RSS feed (for Apache errors).
Ed 15 March, 2011
Excellent! Looked for a couple of days for a php script that does exactly this.
I want this to show just the last two feeds, is this done by changing this line
function getOutput($limit=false, $output_encoding='UTF-8')
to
function getOutput($limit=2, $output_encoding='UTF-8')
??
No, no, no. You change the function call from getOutput() to getOutput(2). Those are optional parameters for the function with default values. Read the instructions at the top of the page.
Emery Wooten 29 January, 2011
The reader is not detecting links. Try it with this atom 1.0 feed.
www.weather.gov/alerts-beta/wwaatmget.php?x=ALZ064
The Atom parser class looks for LINK elements with both an 'HREF' attribute and 'REL="alternate"'.
if($attrs['HREF'] && $attrs['REL'] && $attrs['REL'] == 'alternate') {
The feed you're using is missing the 'REL' attribute so you need to change that line to just:
if($attrs['HREF']) {
Some other Atom feeds have multiple LINK tags and normally looking for 'rel="alternate"' tells us which one to use when displaying the feed.
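A middle ground, sketched below, is to treat a missing 'REL' as "alternate", which is how the Atom 1.0 specification (RFC 4287) says a link without a rel attribute should be interpreted. The helper name is_alternate_link is hypothetical; the attribute keys are upper-case because that is how the XML parser delivers them to startElement.

```php
<?php
// Hypothetical helper: decide whether a LINK element's attributes describe
// the "alternate" link. Keys are upper-case, as the XML parser delivers them.
function is_alternate_link(array $attrs): bool
{
    if (empty($attrs['HREF'])) {
        return false;
    }
    // RFC 4287: a link without a rel attribute is treated as rel="alternate"
    $rel = $attrs['REL'] ?? 'alternate';
    return $rel === 'alternate';
}

var_dump(is_alternate_link(['HREF' => 'http://www.example.net/entry/1']));
var_dump(is_alternate_link(['HREF' => 'http://www.example.net/atom.xml', 'REL' => 'self']));
```

The first call reports true and the second false, so feeds with several LINK tags still pick the right one while feeds that omit 'REL' are no longer skipped.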
Emery Wooten 29 January, 2011
Thanks for the atom feed reader code. Since the National Weather Service switched from RSS to Atom I needed an atom reader and found yours on Google. It worked right out of the box!
I modified the code to use CSS instead of inserting the html tags. I also modified it to use cURL instead of fopen and also to create cache files.
Thanks again!
gw2297 29 January, 2011
Getting the following error:
Fatal error: Cannot use assign-op operators with overloaded objects nor string offsets in ~/public_html/class.myatomparser.php: eval()'d code on line 1
Any suggestions?
Thanks in advance!
Sorry, not something I've seen. Perhaps you can try echoing the command being eval'd to see what's actually being executed and causing problems?
avantegrate 14 January, 2011
Please explain how to get time field
like "6 hours ago" using the class
First use strtotime to convert the time string into a timestamp, then pass that to the function here.
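For illustration, the two steps might look something like this. The time_ago function is a hypothetical stand-in written for this example, not the function referred to above, and the optional second argument exists only so the result can be checked against a fixed "now".

```php
<?php
// Hypothetical helper: turn a timestamp into a rough "N units ago" string.
function time_ago(int $timestamp, ?int $now = null): string
{
    $elapsed = max(0, ($now ?? time()) - $timestamp);
    // largest unit first, so the first match is the most readable one
    $units = [
        'day'    => 86400,
        'hour'   => 3600,
        'minute' => 60,
    ];
    foreach ($units as $name => $seconds) {
        if ($elapsed >= $seconds) {
            $count = intdiv($elapsed, $seconds);
            return $count . ' ' . $name . ($count === 1 ? '' : 's') . ' ago';
        }
    }
    return 'just now';
}

// Step 1: convert the feed's time string; step 2: pass the timestamp on.
$issued = strtotime('2011-01-14T06:00:00Z');
$now    = strtotime('2011-01-14T12:00:00Z');
echo time_ago($issued, $now), "\n";
```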
Josh 2 September, 2010
I am getting an error : "myAtomParser Error: Could not open URL" when I upload to a production server. This works fine locally for testing, any thoughts?
Try removing the '@' from in front of the fopen() command so you can see what the actual error is:
# open file for reading and send data to xml-parser
$fp = @fopen($file, "r") or die("myAtomParser Error: Could not open URL $file for input");
Most likely your request is being blocked/denied by the target server based on your IP address or user agent. Or the URL/path is invalid.
Anthony Clough 9 January, 2010
Great resource. I am just beginning to use php and found this very useful in parsing xml feeds from a variety of sources! Thanks for posting your work!
Michael Blow 17 December, 2009
Using your atom feed reader and I have the same problem as Dave (28/02/2009): a whole list of 'undefined index' statements. I've validated the feed and it comes back as valid atom 1.0 - it's just a feed from blogspot. Also I have used your class in the past and it worked fine.
Those are just notices/warnings and not actual errors. You can hide them by changing your error_reporting level in PHP. See the comments here. The way the parseData function is written with the eval() statement makes it almost impossible to avoid referencing undefined variables.
G 7 July, 2009
Is there a way to show the publish date of the post? And if possible, the time.
The feed parser extracts all fields from the feed into an array. You can then modify the display_entry function to display them how you want. Or use the raw data as shown in some of the related articles above.
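As a sketch of the first option, a small helper like the one below could be called from display_entry. The name entry_date is hypothetical, and it assumes the date ends up under an upper-case key named after the feed element: Atom 1.0 feeds use published and updated, while the older 0.3 format used issued.

```php
<?php
// Hypothetical helper: pick a date field from one parsed entry and format it.
// Assumption: keys are upper-case element names, as in display_entry().
function entry_date(array $entry, string $format = 'j F Y, H:i'): string
{
    // Atom 1.0 uses <published>/<updated>; the older 0.3 format used <issued>
    foreach (['PUBLISHED', 'UPDATED', 'ISSUED'] as $key) {
        if (!empty($entry[$key]) && ($ts = strtotime($entry[$key])) !== false) {
            return gmdate($format, $ts); // formatted in UTC
        }
    }
    return '';
}

echo entry_date(['TITLE' => 'Example', 'UPDATED' => '2009-07-07T14:30:00Z']), "\n";
```

The format string follows PHP's date() conventions, so the time can be dropped or rearranged by changing the second argument.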
dave 1 March, 2009
i am happy to have found this. question though. not sure what i am doing wrong. i get back tons of these:
Notice: Undefined index: HREF in ...
Notice: Undefined index: TYPE in ...
can't figure it out
Are you sure your feed contains valid XML? It looks like it might have some HTML that is not properly encoded. You should try validating your feed as a first step.
Evan Robinson 19 September, 2008
I couldn't have added the twitter feed nearly as easily without this code. I'm afraid that I butchered the output unmercifully in order to get it into mySQL instead of on the page, but the underlying code worked great and was easy to understand and modify. Thanks for a great resource!
Kevin Creighton 18 July, 2008
One quick question: Is it possible for me to remove the "Author" field from the output, and if so, how is that accomplished?
Again, let me congratulate you on this simple, elegant solution that works even for PHP dummies like me.
Just remove the code block from display_entry() that starts with if($AUTHOR) as that's where the Author is displayed.
Pascal 25 June, 2008
I've tested this PHP class and therefore I'd like to use it in other PHP projects. Therefore I have to know under which license this PHP class is published. Can you please add this information? Thanks!
Hi Pascal, this class is available under an open source licence.
Ken 6 January, 2008
You say the function getRawOutput() returns an array. What are the keys in the array?
The best way to find that out would be to use the print_r function to display all the contents of the array. There's also an example on this page.
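As a rough guide, the sample below is hand-written in the shape that display_feed and display_entry expect, with upper-case element names as keys. The values are invented, and a real feed will usually contain more keys than this, so print_r on your own data remains the reference.

```php
<?php
// Hand-written sample in the shape suggested by display_feed() and display_entry();
// the values are invented and a real feed will contain more keys.
$raw = [
    'FEED' => [
        'TITLE' => 'Example Feed',
        'LINK'  => 'http://www.example.net/',
        'ENTRY' => [
            [
                'TITLE'   => 'First post',
                'LINK'    => 'http://www.example.net/first',
                'AUTHOR'  => ['NAME' => 'A. Writer'],
                'SUMMARY' => 'A short summary.',
            ],
        ],
    ],
];

// In practice the array would come from: $raw = $atom_parser->getRawOutput();
foreach ($raw['FEED']['ENTRY'] as $entry) {
    echo $entry['TITLE'], ' by ', $entry['AUTHOR']['NAME'], "\n";
}
```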
Hank 21 December, 2007
Is there a way to limit the output to say 5 or 6 entries? I am not very good at PHP but I know there has got to be a way to do it...
Just replace "getOutput()" with "getOutput(5)" in the example code to limit the output to the first 5 entries.
trevor 10 October, 2007
This code totally rocks!
I was banging my head against the wall for 3 days until i found your solution for blogspot atom feeds. I called the xml from my php server and granted access to the flash movie on the third server and from that server called the php file which basically echos the blogspot feed.
That sounds a bit complicated to me, but as long as it works.
Botond Zalai 22 September, 2007
This script is great, but in order to use a blogger atom feed in my page I had to do an encoding conversion.
Thanks for your feedback Botond. You can see that I've now added some code to handle character encoding differences between the feed source and the page where it's displayed.