Twigfeeds RSS Feed Labelling/Categorisation

Started by DrPen 1 year ago · 64 replies · 743 views
10 months ago

Took me some time to get back to this; beta.7 is a quick fix that addresses the Admin issues. They stem from changes or regressions in the selectize field. I've been down that road before, and the same problems seem to have returned. Note that, unlike before, both categories and tags are pluralized and defined as lists, so they cannot be string values and won't get picked up by category or tag.

user/config/plugins/twigfeeds.yaml bears no relation to the cache; it only defines the settings for the parser and is interpreted by Admin for the GUI. I can see the cache issue, however: it occurs because the config is not carried over into the result. I'll look for a fix shortly. EDIT: Should be fixed with beta.8.

last edited 08/24/25 by Ole Vik
10 months ago

Thanks for this, and your continued great effort to evolve the plugin. Please don't feel harassed by me. Soon I will have less time also ;)

Update on tests/experience with beta.6:

  • I'm testing about 25 feeds. I'll limit it at 30 I think.
  • I'm running categories, tags, a merged full feed and a 'snapshot' of 3 random entries from the full feed, plus a source list.
  • All tags & categories behave exactly as expected. Using a dynamically rendered page for tag and one for category to call items works brilliantly.
  • Full feed works great, overall as expected. Pagination works great.
  • Anomaly with caching/refreshing Bluesky feeds (see below)
  • Anomaly while using randomize when calling 3 random items from the merged feed as a small preview of the full feed (see below)

Bluesky cache_time and lastModified

lastModified is returning the time of the most recent refresh of the BS RSS by twigfeeds, NOT the original pubDate of the BS item, so it applies the newest refresh date as if it's the pubDate to all the items I'm pulling from BS (3 items each for 2 feeds). This only happens with BS feeds; all other feeds retain the original item pubDate and populate the merged feed etc. as expected. But when BS feeds are refreshed, all 6 items have the same or very similar new lastModified date, go to the top of the merged feed list, and stay there until another feed is refreshed/updated later. They very quickly return to the top if they have a short or null cache_time.
I tested this by adding cache_time: 3600 to every feed except the two Bluesky feeds, which have cache_time: 21600. This prevents the BS feeds from continuously occupying the top 6 items of the merged feed list. Checking the original BS pubDate in the raw RSS confirms what I'm thinking: lastModified in the case of BS is the time twigfeeds refreshes the BS feed, and the original pubDate of an item is lost in the twigfeeds cached data. I hope this is understandable.

Odd randomize behaviour

I'm doing this to create a small preview of random items from the merged feed. When calling the retrievedTitle field, an odd behaviour occurs: the feed name often just does not show up in the template, even though the item always shows. I cannot work out why this is happening. It affects any feed source. I've tried numerous configurations of how to call the random 3 entries.

  • For retrievedTitle as
    {% set item = item|merge({ 'retrievedTitle': feed.config.name }) %} or {% set item = item|merge({ 'retrievedTitle': name }) %} both work, but only randomly.

  • For the item call:
    {% for index, item in feed_items|sort_by_key('sortDate')|randomize|slice(0, 3) %}
    Or
    {% for item in feed_items|ksort('retrievedTitle')|randomize|slice(0, 3) %}
    Or
    {% for item in feed_items|sort('sortDate')|randomize|slice(0, 3) %}
    etc

  • The item is called as usual:

    TWIG
    {# If the item has no title, use the description #}
    {% if not item.title %}
      <li class="">
        <em style="font-weight: 600">{{ item.retrievedTitle }}</em>
        <a class="feed-desc-url" href="{{ item.link }}">{{ item.content|safe_truncate_html(8)|striptags|raw }}</a>
        <small>{{ item.lastModified }}</small>
      </li>
    {% else %}
      {# Otherwise, show the title field #}
      <li class="">
        <em style="font-weight: 600">{{ item.retrievedTitle }}</em>
        <a class="feed-desc-url" href="{{ item.link }}">{{ item.title }}</a>
        <small>{{ item.lastModified }}</small>
      </li>
    {% endif %}
    

    I cannot explain why retrievedTitle only shows up randomly. Any ideas on this are appreciated. Maybe I need to do a set random thing, but I don't know how.

I'll keep updating the shared Obsidian note with other info as I go along.

last edited 08/25/25 by DrPen
10 months ago

BlueSky's lack of conformance and server setup will cause this, and as you've discovered, the workaround of aligning TwigFeeds' cache_time to make it behave is insufficient. The offending line is in Parser.php, used as a fallback for this lack of conformance. The parser library's getLastModified should handle pubDate, but for the time being this looks like an edge case with BlueSky that would need further debugging.

Thanks for the sort_by_key reminder, and for updating the Obsidian note; I'd forgotten whether Twig can do datetime parsing and comparison by itself. Seemingly not through the native |sort((a, b) => a.lastModified - b.lastModified).

retrievedTitle is not a standardized property, but seemingly based on the pagination-example. My best guess would be misalignment between TwigFeed's and Twig's caching, but the use of the underlying properties feed.config.name and name shouldn't be affected. A trick for debugging is to add in

PHP
echo '<script>window.twig_feeds = ' . json_encode($feed_items) . ';</script>';

before line 372 in twigfeeds.php, to view the actual data passed to Twig at runtime. That is taken directly from the raw data, cached or not. Beyond that, all cached data is in cache://twigfeeds or user://data/twigfeeds, and you'd have to check whether there are any changes to the files that manifest.json refers to in order to discover unexpected variations between processing runs.

For posterity, the following is an example of flipping the feed-data to group and sort it by taxonomy. Grouping depends on a bit of pre-arranging of metadata:

TWIG
{# Iterate and find unique tag-values #}
{% set twig_feeds_tags = [] %}
{% for name, feed in twig_feeds %}
  {% for value in feed.config['tags'] %}
    {% if value not in twig_feeds_tags %}
      {% set twig_feeds_tags = twig_feeds_tags|merge([value]) %}
    {% endif %}
  {% endfor %}
{% endfor %}
{# Iterate and find unique category-values #}
{% set twig_feeds_categories = [] %}
{% for name, feed in twig_feeds %}
  {% for value in feed.config['categories'] %}
    {% if value not in twig_feeds_categories %}
      {% set twig_feeds_categories = twig_feeds_categories|merge([value]) %}
    {% endif %}
  {% endfor %}
{% endfor %}

So that you could efficiently regroup, sort, and render them:

TWIG
{% set twig_feeds_tags_items = [] %}
{% for value in twig_feeds_tags %}
  {% set twig_feeds_filtered = twig_feeds|filter(v => value in v.config.tags) %}
  {% set twig_feeds_filtered_items = [] %}
  {% for name, feed in twig_feeds_filtered %}
    {% set twig_feeds_filtered_items = twig_feeds_filtered_items|merge(feed.items) %}
  {% endfor %}
  <h4>{{ value }} ({{ print_r(twig_feeds_filtered_items|count) }})</h4>
  {% for item in twig_feeds_filtered_items|sort_by_key('lastModified') %}
    <time>{{ item.lastModified }}</time>
    <small>
      <a href="{{ item.link }}">{{ item.title|default(item.link) }}</a>
    </small>
    <br />
  {% endfor %}
{% endfor %}
10 months ago

(forgot to mention, I'll test beta.7 asap over next few days.)

"The offending line is in Parser.php .... parser-library's getLastModified should handle pubDate, but for the time being this looks like an edge-case with BlueSky"

Yes, definitely the fault of Bluesky; I can see how it's behaving. Whether or not there are updated items, all Bluesky sources appear again at the top whenever Twigfeeds checks the server at cache_time (atm every 6 hrs). Every other feed behaves as it should re pubDate and refreshing, so I keep those at 30 minutes.

"updating the Obsidian-note"

I'll tidy this up so it's more logical and usable, so you can see how users encounter and deal with things, and I'll document the issues I come across more clearly.

"retrievedTitle is not a standardized property"

Yes I know, but I was using that for convenience to call the name or feed.config.name, as per the merged feed code pattern/snippet. I will test the code line in twigfeeds.php and see what I can find out. It's very odd. This does not affect item randomization at all.

Thanks also for the more expert code on the tag and category sorting and parsing. I do have a simpler version of doing this but will test your advanced way.

btw I just sorted the source list alphabetically, which is a useful thing to also have in this more advanced and dynamic way of using Twigfeeds with a lot of feeds, it's very simple - {% set twig_feeds = twig_feeds|sort_by_key('name') %}, then calling as normal.
It looks like this currently:
[Image: the A-Z sorted source list, with category and tag allocations]

last edited 08/26/25 by DrPen
10 months ago

Quick update. I'm a bit slow atm on this as I'm sunk in other time-sensitive work (mid-Sept deadline). I haven't tested v7 yet, sorry. But I'm testing how much load the plugin can handle: I'm running 36 feeds (3 posts each) and can say there are no real issues at all, aside from slightly slow loading of the website PWA when I'm on a phone. But once it opens after a few seconds' wait, everything works very fast. On desktop the load time is not noticeable. Nearly every feed I've used works properly. Obv some feeds update less than others, but most are fairly active.

I'll test beta7 asap.

10 months ago

So I just broke the site - error in twigfeeds "Call to a member function getFeed() on array", Error user/plugins/twigfeeds/classes/Parser.php:124

This breaks the front end completely, but admin is intact. I was doing some work to get a dynamic page for single sources using uri.param and it was partially working, but I couldn't work out why only some of the feeds worked. So I cleared the cache, but instead of using php8.1-cli, I used php8.3-cli (Ionos requires this method of using cli commands, and the site configuration/info lists php8.3.24). It was at this exact moment it broke, so I don't think it's anything to do with my templates or with user/config/plugins/twigfeeds.yaml, as I removed that to test. I have since updated everything on the site (Grav, the standard plugins) and moved to twigfeeds v5beta7, but it makes no difference.

I also tried testing whether it's a single feed in the twigfeeds list that broke it (this is possible) but, as said, even with no feeds at all, it's still broken.

I'll carry on trying to see how to get it back and meanwhile move to beta8.

UPDATES
EDIT: beta8 has cured the problem and accepts my full twigfeeds list. I do think the php8.3 version cli command to clear all caches (inc twigfeeds) did something to cause the error/break, as all was working fine prior to that. (I'm still puzzled about my single feed page not working for some feeds though!!)

EDIT2: The error has re-occurred since, but then disappears and the front end works again. I'm testing on phones and desktop. It may be to do with twigfeeds overloading (and timing out) when it refreshes feeds because there are too many. Atm I'm testing 36. But it had not happened before yesterday.

last edited 09/06/25 by DrPen
10 months ago

The plugin itself cannot overload, even running without limits on an endless number of feeds, but PHP will if time, processing, memory or space is exhausted. The error Call to a member function getFeed() on array, Error user/plugins/twigfeeds/classes/Parser.php:124 is rather the result of an underlying error that was logged, but whose empty result was not handled in the static query method. These errors all arise from your site speaking to the target server, and any errors will be logged in PHP's error_log rather than Grav's or TwigFeeds'. For the next beta version I'll resolve/improve both of these, but you may want to examine that log for any specific feed giving a bad result.

last edited 09/11/25 by Ole Vik
10 months ago

Thanks for the info about twigfeeds 'load', and php on my server. So, it sounds like a php memory timeout. I'll look into it (php error logs are difficult to locate on Ionos). Maybe I should run a local php.ini with a large memory limit? That may help. I think, depending on each feed and its server, this kind of problem can happen at any time.

FYI This is currently what I have in user/config/plugins/twigfeeds.yaml settings. Let me know if anything should be changed here.

YAML
enabled: true
cache: true
static_cache: false
debug: false
log_file: twigfeeds.log
cache_time: 900
pass_headers: true
silence_security: false
request_options:
  allow_redefaults: true
  connect_timeout: 30
  timeout: 30
  http_errors: false

The error has not re-occurred since this morning on the phone. All feeds are working and being updated. The Admin for twigfeeds is now working lovely - such great work!

btw I sorted the issue with the single source template calls, so that all feeds now work. But that template was not the cause of the error.

I've updated the Obsidian shared note for my own benefit but you may find stuff in there of interest.

9 months ago

You should test with a configuration locally that matches the one on your production-server, to ensure stability is as expected. Enabling Error Logs on IONOS would be worthwhile to help you locate any errors that occur.

What will have more of an impact on performance is spreading the caching out when there are many feeds and/or if they contain a lot of data. A large data-load in itself doesn't take much to retrieve, parse, and store, but connecting to multiple remote servers in sequence takes time when they are slow to respond.

That is, align the cache_time with the frequency of the sources' updates, to avoid querying them all at once, and enable pass_headers of course. The default request_options connect_timeout and timeout should be sufficient to avoid waiting unnecessarily for servers that do not respond in a timely fashion. I would perhaps prefer static_cache enabled, to keep the feeds' data separate from Grav's cache, making it easier to clear them independently.

Further, you could set up a cronjob to run the CLI to cache routinely, so that the user is less exposed to re-caches. It will respect your plugin configuration, so running it frequently will not unnecessarily process anything that would not otherwise run on a visit to the site.
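
A cron wrapper for this could look roughly like the sketch below. It is only an illustration: the Grav path, the lock directory, and the command name buildcache are assumptions, so check which commands the installed plugin version actually exposes (php bin/plugin twigfeeds list) before relying on it. The mkdir-based lock is an extra, not part of the plugin; it keeps a slow run from overlapping with the next one.

```shell
#!/bin/sh
# Hypothetical cron wrapper for pre-building the TwigFeeds cache.
# ASSUMPTIONS: GRAV_ROOT is a placeholder path, and "buildcache" is the
# assumed command name; verify with `php bin/plugin twigfeeds list`.
GRAV_ROOT="${GRAV_ROOT:-/path/to/grav}"
BUILD_CMD="${BUILD_CMD:-php bin/plugin twigfeeds buildcache}"
LOCK_DIR="${LOCK_DIR:-/tmp/twigfeeds-cron.lock}"

build_cache() {
  # mkdir either creates the directory or fails, so it works as a simple lock.
  if ! mkdir "$LOCK_DIR" 2>/dev/null; then
    echo "skipped: previous run still active"
    return 0
  fi
  status=0
  # The subshell keeps the cd local; $BUILD_CMD is unquoted on purpose so it
  # splits into the command and its arguments.
  ( cd "$GRAV_ROOT" && $BUILD_CMD ) || status=$?
  rmdir "$LOCK_DIR"
  return "$status"
}

# Only act when called with "run", so sourcing the file has no side effects.
if [ "${1:-}" = "run" ]; then
  build_cache
fi
```

Saved under a placeholder name such as /path/to/twigfeeds-cron.sh, it could be scheduled with a crontab line like */30 * * * * /bin/sh /path/to/twigfeeds-cron.sh run, with the interval chosen to match the shortest cache_time in use.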

9 months ago

The errors are more plainly apparent in beta.9, and beta.10 adds a more extensive way to test feeds using PHPUnit, as well as sort out some basic normalization for the direct-mode. beta.11 goes a few steps further to add redundancies to handle errors that happen upstream, and make the CLI easier to use.

last edited 09/14/25 by Ole Vik
9 months ago

Hey. Apologies for not being around. V busy on work stuff.

I need to re-read your last couple of responses bc there's a lot of info there.

I've not had time yet to move to beta10, but have been using the reader a lot. I'm now testing much longer cache times for most sources as I'm trying to get the initial site load to be quicker. Once it's loaded it's very fast, but it can take 10+ seconds to load. So I'm trying 2hrs, 3hrs, 6hrs, 12 hrs for different feeds. I've enabled the static cache and will look into the cron setup, as I run crons for another site.

Occasionally I still come across a feed that will break the reader (the array error). Eg this work related feed https://www.3cl.org/feed/. Very odd as it looks ok. Not a problem to not have but thought you might be curious.

I'll update here when I've moved to beta10.

last edited 09/23/25 by DrPen
8 months ago

The 3CL Foundation feed fails with the library, because it cannot parse the feed. It does, however, work in direct mode as far as I can tell.

8 months ago

Hi. Sorry I'm not around much atm.

I'm running it at beta 9 (as said) but will update to v10 today and then v11 after a few days.

In beta 9, it won't run without grav cache: true; it crashes the front end. It gives this error:
"Call to protected method FeedIo\Adapter\Guzzle\Client::request() from scope Grav\Plugin\TwigFeedsPlugin\API\Parser"

It makes no difference whether twigfeeds static cache is enabled or not. So currently I'm running with grav cache enabled and testing if twigfeeds static cache makes any difference to speed - <s>it does not appear to affect this</s>. It can affect this. Eg just now, the root page feeds loaded but I could not navigate to any feed/source/category, or any other section of the site. I tried logging in from a phone and that was totally unresponsive. I cleared grav cache again and enabled static cache and everything worked. I don't understand this.

THIS IS CLARIFIED IN THE OBSIDIAN NOTE and also affects beta 11

Also, if I remember correctly, clearing grav cache via cli usually gives a resulting line that twigfeeds cache is cleared. But this is no longer there. Maybe I'm remembering incorrectly. (Possible, I'm doing a lot of other things!)

I have had no other problems with any other feeds. All load as expected.

Re 3CL (or other error feeds), I'll check direct mode more.

I'll update again once I've tested a bit in v10 and v11, unless odd things happen.

last edited 10/16/25 by DrPen
8 months ago

For the error you encountered in beta.10 - Declaration of Psr\Log\NullLogger::log must be compatible with Psr\Log\LoggerInterface - I've tested and made a note in the repository with an explainer. In essence, because of version-resolution TwigFeeds v5 lands in the tight space where it requires PHP 8 and Grav 1.7, but is not compatible with Grav 1.8.

I am looking to see whether I can overload the inherited method-signatures without making the plugin unusable for either Grav 1.7 or 1.8

8 months ago

The method-signature error is resolved in beta.13.

7 months ago

Hi @OleVik. Sorry I'm not around much atm. But I've been using vbeta11 almost every day and have not encountered any problems. I'm still using grav cache and twigfeeds cache enabled. Initial load time is still slow, until I can set up a cron to run every couple of hours maybe. But after feeds are refreshed everything is very fast.

Over the next few days/by end of next week I will update to vbeta13.

Grav version is v1.7.49.5 - Admin v1.10.49.1, I think this is the latest. My remote server is PHP 8.3.

My localhost (Laragon 6, updated to php8.1) issue (fyi) is that the certificate is self-signed, which is now not permitted. I think Grav is up to date. Everything works on the localhost twigfeeds except the live feeds themselves.

I'll update you again once I've moved to vbeta13.

5 months ago

Hi @OleVik. Hope all good. I've been running v.beta11 with no issues since I last messaged you. I have now found time to update to v.beta13. All went smoothly. I'm still running both caches. Everything shows up in Admin.

I have also added a cron at server level (not using the Grav Scheduler) to help with site load time, which was excessive. All I'm doing is pinging the fullfeed template once an hour with the URL, not the filepath. Eg
59 * * * * wget https://domain/folder/fullfeed >/dev/null

Obv I could change 59 to 1 but was messing about ;) . I think this is correct, but you may know a lot more, such as whether it's better to use the filepath rather than the URL, or to do this a different, better way.

I've only been testing this for a day or so, so can't say for certain, but it seems to have improved load time a lot, cutting it down from 20+ seconds to a few seconds to load the login page. After that it's superfast.

5 months ago

Great to hear! With wget the URL is correct, and it works simply enough. You could improve performance by using the plugin's CLI, which won't have any interference from other caching-layers or impose a load on the website's front-end. That will run with or without Grav's scheduler, which is only a wrapper for cron anyway.
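
For the URL approach, one detail worth checking: without -O, wget saves each response as a file (fullfeed, fullfeed.1, ...) in the cron job's working directory, and redirecting stdout does not stop that. A quieter variant might look like the sketch below. The URL is the placeholder from the post above, and the warm function and FETCH variable are made-up names for illustration only.

```shell
#!/bin/sh
# Hypothetical cache-warming helper; names and URL are placeholders.
# -q silences wget's progress log, -O /dev/null discards the page body
# instead of saving it to disk.
FETCH="${FETCH:-wget -q -O /dev/null}"

warm() {
  # $1 is the URL to request. $FETCH is unquoted on purpose so that it
  # splits into the command and its flags.
  if $FETCH "$1"; then
    echo "ok $1"
  else
    echo "fail $1"
    return 1
  fi
}

# Only fetch when a URL is passed, so sourcing the file has no side effects.
if [ "$#" -gt 0 ]; then
  warm "$1"
fi
```

Saved under a placeholder name such as /path/to/warm-feeds.sh, the crontab entry would become 59 * * * * /bin/sh /path/to/warm-feeds.sh https://domain/folder/fullfeed. The one-line ok/fail message goes to stdout, so cron can mail or log it if that is configured.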

5 months ago

I forgot about the Twigfeeds cli - I'll check that out asap because it's probably better. But I do now think the webpage ping is making a huge difference to initial load time.

When I said the login page was taking 20+ seconds to load, this is when the site has been dormant, eg overnight or for a few hours. If it's not visited for a while, then when you first go to the site it was taking a long time to load - I assume because all the new feed items must be loading. This was even if the session was still alive (3hrs limit), so not to do with that. But pinging every hour overcomes this.

Anyway, I'll look at your cli to check it out. I'll also remove the meta refresh I've got in two of the templates because I don't think that makes any difference (it's only clientside triggered?).

I will also add more feeds! If the ping or similar means we can overcome the load time then I can go to my anticipated limit which was 50 feeds. I currently have about 40.
