A lax Web news feed parser

Empty entry:
  input: >
    <rss><channel>
    <item/>
    </channel></rss>
  output:
    format: rss
GUID:
  input: >
    <rss><channel>
    <item>
    <guid isPermaLink="false">blah</guid>
    </item>
    </channel></rss>
  output:
    format: rss
    entries:
      - id: blah
Language:
  input: >
    <rss><channel>
    <item>
    <guid isPermaLink="false">blah</guid>
    <language>fr</language>
    </item>
    </channel></rss>
  output:
    format: rss
    entries:
      - id: blah
        lang: fr
Entry link:
  input: >
    <rss><channel>
    <item>
    <guid isPermaLink="true">http://example.com/</guid>
    </item>
    <item>
    <guid>http://example.com/</guid>
    </item>
    <item>
    <link>http://example.com/</link>
    </item>
    </channel></rss>
  output:
    format: rss
    entries:
      - id: 'http://example.com/'
        link: 'http://example.com/'
      - id: 'http://example.com/'
        link: 'http://example.com/'
      - link: 'http://example.com/'
Related link:
  input: >
    <rss><channel>
    <item>
    <guid isPermaLink="true">http://example.com/</guid>
    <link>http://example.net/</link>
    </item>
    <item>
    <guid>http://example.com/</guid>
    <link>http://example.net/</link>
    </item>
    <item>
    <guid isPermaLink="false">http://example.com/</guid>
    <link>http://example.net/</link>
    </item>
    <item>
    <guid>http://example.com/</guid>
    <link>http://example.com/</link>
    </item>
    <item>
    <guid>http://example.com/</guid>
    <link>http://example.com/blah</link>
    </item>
    </channel></rss>
  output:
    format: rss
    entries:
      - id: 'http://example.com/'
        link: 'http://example.com/'
        relatedLink: 'http://example.net/'
      - id: 'http://example.com/'
        link: 'http://example.com/'
        relatedLink: 'http://example.net/'
      - id: 'http://example.com/'
        link: 'http://example.net/'
      - id: 'http://example.com/'
        link: 'http://example.com/'
      - id: 'http://example.com/'
        link: 'http://example.com/blah'
Update and creation dates:
  input: >
    <rss><channel>
    <item>
    <pubDate>2020-03-03T00:00:00Z</pubDate>
    </item>
    <item>
    <pubDate>2020-03-03T00:00:00Z</pubDate>
    <lastBuildDate>2020-01-01T00:00:00Z</lastBuildDate>
    </item>
    <item>
    <pubDate>2020-03-03T01:00:00+01:00</pubDate>
    <pubDate>2020-03-03T00:00:00Z</pubDate>
    </item>
    </channel></rss>
  output:
    format: rss
    entries:
      - dateModified: '2020-03-03T00:00:00Z'
      - dateCreated: '2020-01-01T00:00:00Z'
        dateModified: '2020-03-03T00:00:00Z'
      - dateModified: '2020-03-03T01:00:00+01:00'