Also transfer any existing scraper booleans on database upgrade. It was
previously possible to enable scraping manually by editing the database,
and these settings will be honoured.
- The Database class is now responsible for preparing rules
- Rules are now returned in an array keyed by user
- Empty strings are now passed through during rule preparation
- Whitespace is now collapsed before evaluating rules
- Feed tests are fixed to retrieve a dummy set of rules
- Rule evaluation during feed parsing also filled out
The driver itself has not been expanded; more is probably required to ensure
metadata is kept in sync and users are created when the internal database does
not list a user an external database claims to have.
This finally brings PostgreSQL to parity with SQLite and MySQL.
Two tests casting binary data to text were removed since behaviour here
should in fact be undefined.
Accounting for any encoding when retrieving data will be addressed by
a later commit.
@ -6,7 +6,7 @@ Information on how to install and use the software can be found in [the manual](
# Installing from source
The main repository for The Arsse can be found at [code.mensbeam.com](https://code.mensbeam.com/MensBeam/arsse/), with a mirror also available [at GitHub](https://github.com/meansbeam/arsse/). The main repository is preferred, as the GitHub mirror can sometimes be out of date.
The main repository for The Arsse can be found at [code.mensbeam.com](https://code.mensbeam.com/MensBeam/arsse/), with a mirror also available [at GitHub](https://github.com/mensbeam/arsse/). The GitHub mirror does not accept bug reports, but the two should otherwise be equivalent.
[Composer](https://getcomposer.org/) is required to manage PHP dependencies. After cloning the repository or downloading a source code tarball, running `composer install` will download all the required dependencies, and will advise if any PHP extensions need to be installed. If not installing as a programming environment, running `composer install --no-dev` is recommended.
@ -88,7 +88,7 @@ There is also a `test:quick` Robo task which excludes slower tests, and a `test:
### Test coverage
Computing the coverage of tests can be done by running `./robo coverage`. Either [phpdbg](https://php.net/manual/en/book.phpdbg.php) or [Xdebug](https://xdebug.org) is required for this. An HTML-format coverage report will be written to `/tests/coverage/`.
Computing the coverage of tests can be done by running `./robo coverage`, after which an HTML-format coverage report will be written to `/tests/coverage/`. Either [PCOV](https://github.com/krakjoe/pcov), [Xdebug](https://xdebug.org), or [phpdbg](https://php.net/manual/en/book.phpdbg.php) is required for this. PCOV is generally recommended as it is faster than Xdebug; phpdbg is faster still, but less accurate. If using either PCOV or Xdebug, the extension need not be enabled globally; PHPUnit will enable it when needed.
## Enforcing coding style
@ -105,8 +105,6 @@ The Arsse's user manual, made using [Daux](https://daux.io/), can be compiled by
The manual employs a custom theme derived from the standard Daux theme. If the standard Daux theme receives improvements, the custom theme can be rebuilt by running `./robo manual:theme`. This requires that [NodeJS](https://nodejs.org) and [Yarn](https://yarnpkg.com/) be installed, but JavaScript tools are not required to modify The Arsse itself, nor the content of the manual.
The Robo task `manual:css` will recompile the theme's stylesheet without rebuilding the entire theme.
## Packaging a release
Producing a release package is done by running `./robo package`. This performs the following operations:
The Advanced RSS Environment (affectionately called "The Arsse") is a news aggregator server which implements multiple synchronization protocols. Unlike most other aggregator servers, The Arsse does not include a Web front-end (though one is planned as a separate project), and it relies on [existing protocols](Supported_Protocols) to maximize compatibility with [existing clients](Compatible_Clients). Supported protocols are:
- A Linux server running Nginx or Apache 2.4 (tested on Ubuntu 16.04 and 18.04)
- PHP 7.0.7 or later with the following extensions:
- A Linux server running Nginx or Apache 2.4
- PHP 7.1.0 or later with the following extensions:
- [intl](http://php.net/manual/en/book.intl.php), [json](http://php.net/manual/en/book.json.php), [hash](http://php.net/manual/en/book.hash.php), and [dom](http://php.net/manual/en/book.dom.php)
- [simplexml](http://php.net/manual/en/book.simplexml.php), and [iconv](http://php.net/manual/en/book.iconv.php)
- One of:
- [sqlite3](http://php.net/manual/en/book.sqlite3.php) or [pdo_sqlite](http://php.net/manual/en/ref.pdo-sqlite.php) for SQLite databases
- [pgsql](http://php.net/manual/en/book.pgsql.php) or [pdo_pgsql](http://php.net/manual/en/ref.pdo-pgsql.php) for PostgreSQL 10 or later databases
- [mysqli](http://php.net/manual/en/book.mysqli.php) or [pdo_mysql](http://php.net/manual/en/ref.pdo-mysql.php) for MySQL/Percona 8.0.11 or later databases
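
As a rough sketch, the required extensions listed above could be checked before installing with a small shell function. The function name `missing_extensions` is hypothetical and not part of The Arsse; `php -m` is PHP's own command for listing loaded modules. Database drivers are not covered, since only one of them is needed.

```sh
#!/bin/sh
# Required extensions as listed above (database drivers are not checked here).
REQUIRED="intl json hash dom simplexml iconv"

# Hypothetical helper: reads a module list on standard input, one name per
# line as printed by `php -m`, and prints each required extension not found.
# Names are compared case-insensitively, since PHP reports "SimpleXML".
missing_extensions() {
    modules=$(cat)
    for ext in $REQUIRED; do
        printf '%s\n' "$modules" | grep -qix -- "$ext" || printf '%s\n' "$ext"
    done
    return 0
}
```

It would typically be used as `php -m | missing_extensions`; empty output means nothing in the list is missing.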
The latest version of The Arsse can be downloaded [from our releases page](https://code.mensbeam.com/MensBeam/arsse/releases). The attachments named _arsse-x.x.x.tar.gz_ should be used rather than those marked "Source Code".
The latest version of The Arsse can be downloaded [from our Web site](https://thearsse.com/). If installing an older release from our archives, the attachments named _arsse-x.x.x.tar.gz_ should be used rather than those marked "Source Code".
Installation from source code is also possible, but the release packages are recommended.
Tha Arsse must then be configured to use the created database. A suitable [configuration file](/en/Getting_Started/Configuration) might look like this:
The Arsse must then be configured to use the created database. A suitable [configuration file](/en/Getting_Started/Configuration) might look like this:
While MySQL can be used as a database for The Arsse, this is **not recommended** due to MySQL's technical limitations. It is fully functional, but may fail with some newsfeeds where other database systems do not. Additionally, it is particularly important before upgrading from one version of The Arsse to the next to back up your database: a failure in a database upgrade can corrupt your database much more easily than when using other database systems.
You are therefore strongly advised not to use MySQL. Though our MySQL test suite ensures functionally identical behaviour to SQLite and PostgreSQL for the supplied test data in a default MySQL configuration, there are [many other subtle ways in which it can fail](https://grimoire.ca/mysql/choose-something-else), and we do not have the manpower to account for most of these with certainty.
You are therefore strongly advised not to use MySQL. Though our MySQL test suite ensures functionally identical behaviour to SQLite and PostgreSQL for the supplied test data in a default MySQL configuration, there are [many other subtle ways in which it can fail](https://web.archive.org/web/20190929090114/https://grimoire.ca/mysql/choose-something-else), and we do not have the manpower to account for most of these with certainty.
Also please note that as of this writing MariaDB cannot be used in place of MySQL as it lacks features of MySQL 8 which The Arsse requires. The awkwardly-named [_Percona Server for MySQL_](https://www.percona.com/software/mysql-database/percona-server), on the other hand, should work, though this has not been tested.
Also please note that as of this writing MariaDB cannot be used in place of MySQL as it lacks features of MySQL 8 which The Arsse requires (see the [relevant MariaDB issue](https://jira.mariadb.org/browse/MDEV-18511) for details). The awkwardly-named [_Percona Server for MySQL_](https://www.percona.com/software/mysql-database/percona-server), on the other hand, will work.
# Set-up
@ -27,7 +27,7 @@ sudo mysql -e "CREATE DATABASE arssedb"
sudo mysql -e "GRANT ALL ON arssedb.* TO 'arsseuser'@'localhost'"
```
Tha Arsse must then be configured to use the created database. A suitable [configuration file](/en/Getting_Started/Configuration) might look like this:
The Arsse must then be configured to use the created database. A suitable [configuration file](/en/Getting_Started/Configuration) might look like this:
@ -321,7 +321,7 @@ It is also possible to specify the fully-qualified name of a class which impleme
The interval the newsfeed fetching service observes between checks for new articles. Note that requests to foreign servers are not necessarily made at this frequency: each newsfeed is assigned its own time at which to be next retrieved. This setting instead defines the length of time the fetching service will sleep between periods of activity.
Consult "[How Often Newsfeeds Are Fetched](/en/Using_The_Arsse/Keeping_Newsfeeds_Up_to_Date#page_Appendix-How-Often-Newsfeeds-Are-Fetched)" for details on how often newsfeeds are fetched.
Consult "[How Often Newsfeeds Are Fetched](/en/Using_The_Arsse/Keeping_Newsfeeds_Up_to_Date#page_Appendix-how-often-newsfeeds-are-fetched)" for details on how often newsfeeds are fetched.
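
The distinction between the service's sleep interval and each newsfeed's own schedule can be illustrated with a minimal sketch. This is not The Arsse's implementation: the `due_feeds` function and the "name timestamp" input format are assumptions made purely for illustration.

```sh
#!/bin/sh
# Hypothetical helper: prints the feeds whose next-fetch time has arrived.
# Standard input: one "feed-name next-fetch-time" pair per line, where the
# time is a Unix timestamp (an assumed format, for illustration only).
# $1: the current time as a Unix timestamp.
due_feeds() {
    now=$1
    while read -r feed next; do
        if [ "$next" -le "$now" ]; then
            printf '%s\n' "$feed"
        fi
    done
    return 0
}

# A service loop would wake up, fetch only the feeds that are due, then sleep
# for the configured interval, for example:
#   while :; do due_feeds "$(date +%s)" < schedule.txt; sleep "$interval"; done
```

On a given wake-up the service may therefore contact few or no foreign servers, regardless of how short the interval is.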
Presently installing and setting up The Arsse is a manual process. We hope to have pre-configured installation packages available for various operating systems eventually, but for now the pages in this section should help get you up and running.
Though The Arsse itself makes no assumptions about the operating system which hosts it, we use and have the most experience with Debian; the instructions contained here therefore are for Debian systems will will probably either not work with other systems or not be consistent with their conventions. Nevertheless, they should still serve as a useful guide.
Though The Arsse itself makes no assumptions about the operating system which hosts it, we use and have the most experience with Debian; the instructions contained here therefore are for Debian systems and will probably either not work with other systems or not be consistent with their conventions. Nevertheless, they should still serve as a useful guide.
[Miniflux](/en/Supported_Protocols/Miniflux) clients may optionally log in using tokens: randomly-generated strings which act as persistent passwords. For now these must be generated using the command-line interface:
There are also commands for listing and revoking tokens. Please consult the integrated help for more details.
# Setting and changing user metadata
Users may also have various metadata properties set. These largely exist for compatibility with [the Miniflux protocol](/en/Supported_Protocols/Miniflux) and have no significant effect. One exception to this, however, is the `admin` flag, which signals whether the user may perform privileged operations where they exist in the supported protocols.
The flag may be changed using the following command:
```sh
sudo -u www-data php arsse.php user set "jane.doe" admin true
```
As a shortcut it is also possible to create administrators directly:
The Miniflux protocol is a fairly well-designed protocol supporting a wide variety of operations on newsfeeds, folders (termed "categories"), and articles; it also allows for user administration, and native OPML importing and exporting. Architecturally it is similar to the Nextcloud News protocol, but has more capabilities.
Miniflux version 2.0.28 is emulated, though not all features are implemented.
# Missing features
- JSON Feed format is not supported
- Various feed-related features are not supported; attempting to use them has no effect
- Rewrite rules and scraper rules
- Custom User-Agent strings
- The `disabled`, `ignore_http_cache`, and `fetch_via_proxy` flags
- Changing the URL, username, or password of a feed
- Manually refreshing feeds
- Titles and types are not available during feed discovery and are filled with generic data
- Reading time is not calculated and will always be zero
- Only the first enclosure of an article is retained
- Comment URLs of articles are not exposed
# Differences
- Various error codes and messages differ due to significant implementation differences
- `PUT` requests which return a body respond with `200 OK` rather than `201 Created`
- The "All" category is treated specially (see below for details)
- Feed and category titles consisting only of whitespace are rejected along with the empty string
- Filtering rules may not function identically (see below for details)
- The `checked_at` field of feeds indicates when the feed was last updated rather than when it was last checked
- Creating a feed with the `scrape` property set to `true` might not return scraped content for the initial synchronization
- Querying articles for both read/unread and removed statuses will not return all removed articles
- Search strings will match partial words
- OPML import either succeeds or fails atomically: if one feed fails, no feeds are imported
# Behaviour of filtering (block and keep) rules
The Miniflux documentation gives only a brief example of a pattern for its filtering rules; the allowed syntax is described in full [in Google's documentation for RE2](https://github.com/google/re2/wiki/Syntax). Being a PHP application, The Arsse instead accepts [PCRE syntax](http://www.pcre.org/original/doc/html/pcresyntax.html) (or since PHP 7.3 [PCRE2 syntax](https://www.pcre.org/current/doc/html/pcre2syntax.html)), specifically in UTF-8 mode. Delimiters should not be included, and slashes should not be escaped; anchors may be used if desired. For example `^(?i)RE/MAX$` is a valid pattern.
For convenience the patterns are tested after collapsing whitespace. Unlike Miniflux, The Arsse tests the patterns against an article's author-supplied categories if they do not match its title. Also unlike Miniflux, when filter rules are modified they are re-evaluated against all applicable articles immediately.
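
The behaviour described above can be approximated in a short sketch using GNU `grep -P`, which accepts PCRE syntax without delimiters. The function names are hypothetical, the exact whitespace handling (collapsing runs to one space and trimming the ends) is an assumption, and The Arsse itself performs this matching in PHP rather than in the shell.

```sh
#!/bin/sh
# Hypothetical helper: collapse each run of whitespace into a single space
# and trim one leading and one trailing space (assumed behaviour).
collapse_ws() {
    printf '%s' "$1" | tr -s '[:space:]' ' ' | sed -e 's/^ //' -e 's/ $//'
}

# Hypothetical helper: succeeds if the pattern matches any of the candidates.
# $1 is a PCRE pattern without delimiters; the remaining arguments are the
# article's title followed by its author-supplied categories, tried in order.
rule_matches() {
    pattern=$1
    shift
    for candidate in "$@"; do
        if collapse_ws "$candidate" | grep -qP -- "$pattern"; then
            return 0
        fi
    done
    return 1
}
```

For instance, `rule_matches '^(?i)RE/MAX$' "$title" "$category"` tests the title first and falls back to the category only if the title does not match. Whether `grep` treats the input as UTF-8 depends on the current locale.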
# Special handling of the "All" category
Nextcloud News' root folder and Tiny Tiny RSS' "Uncategorized" category are mapped to Miniflux's initial "All" category. This Miniflux category can be renamed, but it cannot be deleted. Attempting to do so will delete the child feeds it contains, but not the category itself.
Because the root folder does not exist in the database as a separate entity, it will always sort first when ordering by `category_id` or `category_title`.
# Interaction with nested categories
Tiny Tiny RSS is unique in allowing newsfeeds to be grouped into categories nested to arbitrary depth. When newsfeeds are placed into nested categories, they simply appear in the top-level category when accessed via the Miniflux protocol. This does not affect OPML exports, where full nesting is preserved.
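
As a minimal illustration of this flattening, suppose a nested category is written as a `/`-separated path. Both that representation and the function name `top_level_category` are assumptions for this example only; the category a Miniflux client sees would then be the first path component.

```sh
#!/bin/sh
# Hypothetical helper: given a "/"-separated category path (an assumed
# representation), print the top-level category a Miniflux client would see.
top_level_category() {
    printf '%s\n' "${1%%/*}"
}
```

A feed filed under a hypothetical `Tech/Linux/Kernel` hierarchy would thus be reported under `Tech`, while a feed already in a top-level category is unaffected.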
The NextCloud News protocol was the first supported by The Arsse, and has been supported in full since version 0.3.0.
The Nextcloud News protocol was the first supported by The Arsse, and has been supported in full since version 0.3.0.
It allows organizing newsfeeds into single-level folders, and supports a wide range of operations on newsfeeds, folders, and articles.
@ -24,8 +24,7 @@ It allows organizing newsfeeds into single-level folders, and supports a wide ra
- When marking articles as starred the feed ID is ignored, as it is not needed to establish uniqueness
- The feed updater ignores the `userId` parameter: feeds in The Arsse are deduplicated, and have no owner
- The `/feeds/all` route lists only feeds which should be checked for updates, and it also returns all `userId` attributes as empty strings: feeds in The Arsse are deduplicated, and have no owner
- The API's "updater" routes do not require administrator priviledges as The Arsse has no concept of user classes
- The "updater" console commands mentioned in the protocol specification are not implemented, as The Arsse does not implement the required NextCloud subsystems
- The "updater" console commands mentioned in the protocol specification are not implemented, as The Arsse does not implement the required Nextcloud subsystems
- The `lastLoginTimestamp` attribute of the user metadata is always the current time: The Arsse's implementation of the protocol is fully stateless
- Syntactically invalid JSON input will yield a `400 Bad Request` response instead of falling back to GET parameters
- Folder names consisting only of whitespace are rejected along with the empty string
@ -36,4 +35,4 @@ It allows organizing newsfeeds into single-level folders, and supports a wide ra
# Interaction with nested folders
Tiny Tiny RSS is unique in allowing newsfeeds to be grouped into folders nested to arbitrary depth. When newsfeeds are placed into nested folders, they simply appear in the top-level folder when accessed via the NextCloud News protocol.
Tiny Tiny RSS is unique in allowing newsfeeds to be grouped into folders nested to arbitrary depth. When newsfeeds are placed into nested folders, they simply appear in the top-level folder when accessed via the Nextcloud News protocol.
@ -59,7 +59,7 @@ The Arsse does not currently support the entire protocol. Notably missing featur
# Interaction with HTTP authentication
Tiny Tiny RSS itself is unaware of HTTP authentication: if HTTP authentication is used in the server configuration, it has no effect on authentication in the API. The Arsse, however, makes use of HTTP authentication for NextCloud News, and can do so for TT-RSS as well. In a default configuration The Arsse functions in the same way as TT-RSS: HTTP authentication and API authentication are completely separate and independent. Alternative behaviour is summarized below:
Tiny Tiny RSS itself is unaware of HTTP authentication: if HTTP authentication is used in the server configuration, it has no effect on authentication in the API. The Arsse, however, makes use of HTTP authentication for Nextcloud News, and can do so for TT-RSS as well. In a default configuration The Arsse functions in the same way as TT-RSS: HTTP authentication and API authentication are completely separate and independent. Alternative behaviour is summarized below:
@ -23,7 +23,6 @@ The Fever protocol is incomplete, unusual, _and_ a product of proprietary softwa
- All feeds are considered "Kindling"
- The "Hot Links" feature is not implemented; when requested, an empty array will be returned. As there is no way to classify a feed as a "Spark" in the protocol itself and no documentation exists on how link temperature was calculated, an implementation is unlikely to appear in the future
- Favicons are not currently supported; all feeds have a simple blank image as their favicon unless the client finds the icons itself
The Arsse was designed from the start as a server for multiple synchronization protocols which clients can make use of. Currently the following protocols are supported: