Channel: Python feedparser not using atom/WordPress namespace? - Stack Overflow

Python feedparser not using atom/WordPress namespace?


I'm trying to use feedparser (an excellent library) to parse WordPress export files, and a (minor) inconsistency between WordPress versions is causing me a huge headache.

WordPress 2.x doesn't include atom:link tags in the XML output (without_atom_tags.xml). When parsed, namespaced elements are available without the prefix:

>>> feed = feedparser.parse("without_atom_tags.xml")
>>> print feed.entries[0].comment_status
u'open'

The XML from WordPress 3.x does contain atom:link tags (with_atom_tags.xml), and there you must use the wp_ prefix to reach namespaced elements:

>>> feed = feedparser.parse("with_atom_tags.xml")
>>> feed.entries[0].wp_comment_status              # <-- Note wp_ prefix
u'open'
>>> feed.entries[0].comment_status
AttributeError: object has no attribute 'comment_status'

Interestingly, the prefixes aren't needed if I add xmlns:atom="http://www.w3.org/2005/Atom" to the root RSS element (with_atom_tags_and_namespace.xml).

I need to parse all these different formats without modifying the XML. Is feedparser broken, or am I doing it wrong? Can I do this without a bunch of nasty conditional code?
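
For reference, the kind of conditional code I'd like to avoid can at least be confined to one place. The sketch below is only a workaround, not a fix for the namespace handling itself. The helper name wp_field and its parameters are hypothetical (they are not part of feedparser); it assumes entries offer a dict-style .get(), which feedparser entries do, and plain dicts stand in for real entries so the snippet runs on its own.

```python
_MISSING = object()  # sentinel, so falsy values such as '' are still returned


def wp_field(entry, name, prefixes=("", "wp_"), default=None):
    """Look up `name` on an entry, trying each prefix in order.

    Hypothetical helper: `entry` is anything with a dict-style .get(),
    such as a feedparser entry or a plain dict.
    """
    for prefix in prefixes:
        value = entry.get(prefix + name, _MISSING)
        if value is not _MISSING:
            return value
    return default


# Plain dicts standing in for entries parsed from the two export styles.
old_style_entry = {"comment_status": "open"}     # as in without_atom_tags.xml
new_style_entry = {"wp_comment_status": "open"}  # as in with_atom_tags.xml

print(wp_field(old_style_entry, "comment_status"))
print(wp_field(new_style_entry, "comment_status"))
```

With a real feed this would be called as wp_field(feed.entries[0], "comment_status"), whichever WordPress version produced the file.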

