Creating posts from RSS feeds in Flarum
One of my projects has been successfully querying RSS feeds for a few months, then using the Flarum API to create discussions from the items it finds.
Want this? Sure you do! Below are the steps, including all the scripts needed to make this work.
First, you will need the Flarum API client, maicol07/flarum-api-client.
Installation
composer require maicol07/flarum-api-client
Configuration
In order to start working with the client you might need a Flarum master key:
- Generate a random, hard-to-guess string of 40 characters. This is the token needed for this package (you can use a generator for this).
- Manually add it to the api_keys table using phpmyadmin, adminer or a similar tool.
The master key is required to access non-public discussions and to run actions otherwise reserved for Flarum administrators.
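If you would rather not rely on an online generator, the 40-character string can be produced locally. The following is a minimal sketch; the function name generateApiToken is hypothetical, and it assumes a hexadecimal token is acceptable for the api_keys table.

```php
<?php
// Hypothetical helper: returns a 40-character hexadecimal token.
function generateApiToken(): string
{
    // 20 bytes from PHP's cryptographically secure generator
    // become 40 hexadecimal characters.
    return bin2hex(random_bytes(20));
}

$token = generateApiToken();
echo $token . "\n";
```

Run it once from PHP-CLI and paste the printed value into the api_keys table.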
Install SimplePie
Next, install SimplePie to parse the RSS feeds
composer require simplepie/simplepie
Create storage DB
Now access your database server using phpmyadmin (or something similar) and create a new database called “feed”.
With the database created, run the following script, which will create a table called “queue” with a few simple columns
CREATE TABLE `queue` (
  `id` bigint(20) NOT NULL,
  `url` varchar(500) NOT NULL,
  `title` varchar(500) NOT NULL,
  `seen` int(1) NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
As the “queue” table grows, it will need indexes to keep lookups fast. Create them as follows in phpmyadmin
ALTER TABLE `queue`
  ADD PRIMARY KEY (`id`),
  ADD KEY `title` (`title`),
  ADD KEY `url` (`url`);
Finally, we’ll set AUTO_INCREMENT on the id column of the table
ALTER TABLE `queue`
  MODIFY `id` bigint(20) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=1;
COMMIT;
Create credentials file
For security reasons, we “include” a details.php file (you can call this whatever you like, just remember to reflect any change of name in the main script below) outside of the web root. We are going to be running this from PHP-CLI anyway, so it shouldn’t be exposed.
In my case, details.php is included as shown below. It’s located at the root of my domain, but outside of the web root
include("/var/www/vhosts/metabullet.com/details.php");
Your details.php file should contain this
<?php
// Variables for posting to Twitter
define('CONSUMER_KEY', 'YOUR_KEY');
define('CONSUMER_SECRET', 'YOUR_SECRET');
define('ACCESS_TOKEN', 'YOUR_ACCESS_TOKEN');
define('ACCESS_TOKEN_SECRET', 'YOUR_ACCESS_TOKEN_SECRET');

// Headers sent with every Flarum API request
$header = array(
    "Authorization: Token THE_TOKEN_YOU_GENERATED_EARLIER",
    "Content-Type: application/json",
);

// Create DB connection
$servername = "localhost";
$login = "YOUR_DB_USER";
$dbpw = "YOUR_DB_PASSWORD";
$dbname = "feed";
$conn = new mysqli($servername, $login, $dbpw, $dbname);

// Check connection
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
} else {
    echo "Connected to database\n";
}
?>
Create the RSS parser script
Create a new PHP file called rssparser.php, again located outside of the web root
<?php
//use Abraham\TwitterOAuth\TwitterOAuth;

// Command-line arguments: the feed URL and the number of items to process
@$url = $argv[1];
@$max = $argv[2];
if (!$url) {
    die("\n ***** You must provide a URL to process *****\n");
}
if (!$max) {
    die("\n ***** You must provide a quantity to process *****\n");
}

include "details.php";
require 'vendor/autoload.php';

$feed = new SimplePie();
$feed->enable_cache();
$feed->set_cache_location("/home/phenomlab/system/.cache");
$feed->force_feed();
$feed->set_timeout(30);
$feed->set_feed_url("$url");
$feed->init();
$feed->handle_content_type();
$feed->enable_order_by_date(true);
$number = $feed->get_item_quantity($max);

foreach ($feed->get_items(0, $number) as $items) {
    echo "\033[32m\nProcessing story | " . $items->get_title() . "\n\033[0m";

    // Clean up the description. Each step works on the result of the previous one
    $description = str_replace("View Entire Post ›", "", $items->get_description());
    $description = str_replace("<img", "\n\n<img", $description);
    $description = str_replace('<img src="', '', $description);
    $description = str_replace('" />', '', $description);
    $description = strip_tags(html_entity_decode($description), "<img>") . "\n";
    $description .= "\n" . '[Link to original article](' . $items->get_link() . ')' . "\n\n";
    //echo 'Description: ' . $description . "\n";
    $content = $items->get_content(true);
    //echo '[Link to original article](' .$item->get_link() . ')'."\n";

    // Define variables for use later on in the script
    $subject = $items->get_title();
    $body = trim($description);
    $link = $items->get_link();

    // Query the database for each item. Perform action based on results
    // Reset the result variables so values from a previous iteration are not reused
    $checklink = null;
    $seen = null;
    $stmt = $conn->prepare('SELECT url, seen FROM queue WHERE url = ?');
    $stmt->bind_param('s', $link);
    $stmt->execute();
    $stmt->store_result();
    $stmt->bind_result($checklink, $seen);
    $stmt->fetch();

    // Test to see if we have processed these before. If we have, skip them to avoid duplicates
    if (!$checklink || !$seen) {
        echo "Checking " . $link . " \nLine item does not exist - \033[32m[Processing]\n\033[0m ";

        // Processing new items. Insert record into database to prevent duplication on subsequent processing runs
        $seen = 1;
        $stmt = $conn->prepare('INSERT INTO queue (url, title, seen) VALUES(?, ?, ?)');
        $stmt->bind_param("ssi", $link, $subject, $seen);
        $stmt->execute();

        // Process each newly identified unique post into Flarum using the API
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, 'https://hub.phenomlab.net/api/discussions');
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
        curl_setopt($ch, CURLOPT_POST, 22);
        curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode((array(
            'data' => array(
                'type' => "discussions",
                'attributes' => array(
                    'title' => "$subject",
                    'content' => "$body",
                ),
                'relationships' => array(
                    'tags' => array(
                        'data' => array(
                            array(
                                'type' => 'tags',
                                'id' => "23",
                            ),
                        ),
                    ),
                ),
            ),
        ))));
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
        $result = curl_exec($ch);
        echo $result;

        //$connection = new TwitterOAuth(CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET);
        //$status = $subject . ' ' . $link . ' #infosec #security #technology #phenomlab';
        //$post_tweets = $connection->post("statuses/update", ["status" => $status]);
    }
    // Item has already been processed. Continue loop until count exhausted
    else {
        echo "Checking " . $checklink . "\nLine item already processed - \033[33m[Ignored]\n\033[0m";
    }
}
Important notes
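- @$max = $argv[2]; is the number of RSS items that the script will parse for each resource URL.
- curl_setopt($ch, CURLOPT_POST, 22); switches the request to POST. cURL treats this value as an on/off flag, so “22” does not select a user here; it simply matches the ID of the user I want to post as. The user a request acts as comes from the API key itself (the user_id column of its row in api_keys) or from appending "; userId=22" to the token in the Authorization header. This user needs admin rights.
- array( 'type' => 'tags', 'id' => "23" ) tells the Flarum API which tag to post in. In this case, “23” is the ID of the “news” tag.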
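The headers and JSON body sent to the API can be sketched as two small helpers. This is an illustration rather than part of the script above: the function names buildFlarumHeaders and buildDiscussionPayload are hypothetical, the token is the placeholder from details.php, and the optional userId suffix follows the form given in Flarum's REST API documentation as I understand it (Flarum is documented to ignore it when the key's row in api_keys already has a user_id). Verify both points against your Flarum version.

```php
<?php
// Hypothetical helper: builds the request headers for the Flarum API.
// $userId is optional; when given it is appended as "; userId=N".
function buildFlarumHeaders(string $token, ?int $userId = null): array
{
    $auth = 'Authorization: Token ' . $token;
    if ($userId !== null) {
        $auth .= '; userId=' . $userId;
    }
    return [$auth, 'Content-Type: application/json'];
}

// Hypothetical helper: builds the JSON body for POST /api/discussions,
// mirroring the array structure used in rssparser.php.
function buildDiscussionPayload(string $title, string $content, int $tagId): string
{
    return json_encode([
        'data' => [
            'type' => 'discussions',
            'attributes' => ['title' => $title, 'content' => $content],
            'relationships' => [
                'tags' => [
                    'data' => [['type' => 'tags', 'id' => (string) $tagId]],
                ],
            ],
        ],
    ]);
}

// Placeholder values: user 22 and tag 23 are the IDs used in this post.
$headers = buildFlarumHeaders('THE_TOKEN_YOU_GENERATED_EARLIER', 22);
$payload = buildDiscussionPayload('Example title', 'Example body', 23);
print_r($headers);
echo $payload . "\n";
```

The returned array can be passed to CURLOPT_HTTPHEADER and the string to CURLOPT_POSTFIELDS.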
Test it!
To check that your script works, run it from the CLI in the directory where your files are located. Change the RSS URL to the feed you want to target; the number after it is how many articles to pull at once.
php rssparser.php http://feeds.bbci.co.uk/news/rss.xml 10
Watch the output on the screen. The first time this is run, the script will create a post for every feed item it has no record of. As each post is created, its URL is recorded in the “feed” database so that subsequent runs do not create duplicates.
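The duplicate check can be illustrated without a database. In this sketch a plain PHP array stands in for the “queue” table, and processFeed is a hypothetical function, not part of rssparser.php; the example URLs and titles are made up.

```php
<?php
// Hypothetical stand-in for the "queue" table: an array keyed by item URL.
function processFeed(array $items, array &$queue): array
{
    $posted = [];
    foreach ($items as $item) {
        // Skip anything whose URL was recorded on an earlier run.
        if (isset($queue[$item['url']])) {
            continue;
        }
        // Record the item, then treat it as posted.
        $queue[$item['url']] = $item['title'];
        $posted[] = $item['title'];
    }
    return $posted;
}

$queue = [];
$items = [
    ['url' => 'https://example.com/a', 'title' => 'Story A'],
    ['url' => 'https://example.com/b', 'title' => 'Story B'],
];
$firstRun = processFeed($items, $queue);
$secondRun = processFeed($items, $queue);
echo count($firstRun) . " posted on first run, " . count($secondRun) . " on second\n";
// prints "2 posted on first run, 0 on second"
```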
Now what?
I have this rssparser.php scheduled to run every hour.
Enjoy - let me know if you have any issues getting this to work.