A lax Web news feed parser
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 

178 lines
4.9 KiB

Empty entry:
input: >
<rss><channel>
<item/>
</channel></rss>
output:
format: rss
GUID:
input: >
<rss><channel>
<item>
<guid isPermaLink="false">blah</guid>
</item>
</channel></rss>
output:
format: rss
entries:
- id: blah
Language:
input: >
<rss><channel>
<item>
<guid isPermaLink="false">blah</guid>
<language>fr</language>
</item>
</channel></rss>
output:
format: rss
entries:
- id: blah
lang: fr
Entry link:
input: >
<rss><channel>
<item>
<guid isPermaLink="true">http://example.com/</guid>
</item>
<item>
<guid>http://example.com/</guid>
</item>
<item>
<link>http://example.com/</link>
</item>
</channel></rss>
output:
format: rss
entries:
- id: 'http://example.com/'
link: 'http://example.com/'
- id: 'http://example.com/'
link: 'http://example.com/'
- link: 'http://example.com/'
Related link:
input: >
<rss><channel>
<item>
<guid isPermaLink="true">http://example.com/</guid>
<link>http://example.net/</link>
</item>
<item>
<guid>http://example.com/</guid>
<link>http://example.net/</link>
</item>
<item>
<guid isPermaLink="false">http://example.com/</guid>
<link>http://example.net/</link>
</item>
<item>
<guid>http://example.com/</guid>
<link>http://example.com/</link>
</item>
<item>
<guid>http://example.com/</guid>
<link>http://example.com/blah</link>
</item>
</channel></rss>
output:
format: rss
entries:
- id: 'http://example.com/'
link: 'http://example.com/'
relatedLink: 'http://example.net/'
- id: 'http://example.com/'
link: 'http://example.com/'
relatedLink: 'http://example.net/'
- id: 'http://example.com/'
link: 'http://example.net/'
- id: 'http://example.com/'
link: 'http://example.com/'
- id: 'http://example.com/'
link: 'http://example.com/blah'
Update and creation dates:
input: >
<rss><channel>
<item>
<pubDate>2020-03-03T00:00:00Z</pubDate>
</item>
<item>
<pubDate>2020-03-03T00:00:00Z</pubDate>
<lastBuildDate>2020-01-01T00:00:00Z</lastBuildDate>
</item>
<item>
<pubDate>2020-03-03T01:00:00+01:00</pubDate>
<pubDate>2020-03-03T00:00:00Z</pubDate>
</item>
</channel></rss>
output:
format: rss
entries:
- dateModified: '2020-03-03T00:00:00Z'
- dateCreated: '2020-01-01T00:00:00Z'
dateModified: '2020-03-03T00:00:00Z'
- dateModified: '2020-03-03T01:00:00+01:00'
Entry title:
input: >
<rss><channel>
<item>
<title> Loose title</title>
</item>
</channel></rss>
output:
format: rss
entries:
- title: {loose: 'Loose title'}
Entry content:
input: >
<rss><channel>
<item>
<description> Loose content</description>
</item>
</channel></rss>
output:
format: rss
entries:
- content: {loose: 'Loose content'}
Entry author:
input: >
<rss><channel>
<item>
<author>jane.doe@example.com (Jane Doe)</author>
<author>john.doe@example.com (John Doe)</author>
</item>
</channel></rss>
output:
format: rss
entries:
- people:
- name: 'Jane Doe'
mail: 'jane.doe@example.com'
role: author
- name: 'John Doe'
mail: 'john.doe@example.com'
role: author
Categories:
input: >
<rss><channel>
<item>
<category>Category the first </category>
<category domain="ook eek">Category the second </category>
<category/>
</item>
</channel></rss>
output:
format: rss
entries:
- categories:
- name: 'Category the first'
- name: 'Category the second'
domain: 'ook eek'