Posts Tagged ‘parse’

Getting the Text Between HTML Tags in PHP

Suppose you’re automatically parsing a webpage, and you come across the following kind of thing:
blah blah
some starting text
some useful content
some ending text
blah blah
We want to parse out the useful content from among the non-useful stuff, and we know that some starting text and some ending text wrap the useful content.
A better example:
I like chicken
<div [...]
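The idea described above, taking whatever sits between a known starting text and a known ending text, might be sketched roughly like this. The excerpt cuts off before the post's own solution, so this is not necessarily the author's approach. The helper name text_between is hypothetical, and the sketch assumes plain string matching with strpos and substr is enough for the page at hand.

```php
<?php
// Hypothetical helper (not from the original post): returns the text between
// the first occurrence of $start and the next occurrence of $end,
// or null when either marker cannot be found.
function text_between(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;
    }
    // Skip past the starting marker itself
    $from += strlen($start);

    // Look for the ending marker only after the starting marker
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null;
    }
    return substr($haystack, $from, $to - $from);
}

$page = "blah blah some starting text some useful content some ending text blah blah";
$found = text_between($page, "some starting text", "some ending text");
echo trim($found ?? "");
```

Returning null rather than an empty string makes it possible to tell "markers not found" apart from "markers found with nothing between them".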

Parsing Full RSS Content Text in Ruby on Rails

I was parsing an RSS feed in Ruby on Rails today and found that RSS::Parser.parse seems to throw away the actual content of the RSS items, leaving only the description.
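require 'rss'
require 'open-uri'

source = "http://www.website.com/rss.xml"
content = ""
# On Ruby 3.0+ Kernel#open no longer fetches URLs; use URI.open from open-uri
URI.open(source) { |s| content = s.read }
# The second argument asks the parser to validate the feed
rss = RSS::Parser.parse(content, true)

# Print the description of each item in the feed
rss.items.each do |item|
  puts item.description
end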
This didn’t do me [...]

Obtaining the HTML Title of a URL

Here’s a little PHP snippet that grabs the contents of the title tag for a given URL:
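$url = "http://www.folksonomy.org";
// file() returns the page as an array of lines; join them into one string
$page = file($url);
$page = implode("", $page);

// Lazy match with the "s" flag, so a title that spans lines is still found.
// $t[0] is the whole match; $t[1] is the text captured between the tags.
if (preg_match("/<title>(.+?)<\/title>/is", $page, $t))
    print "$url has the title: $t[1]";
else
    print "No title was found";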