OGProxy - a replacement for iFramely

  • No problem

  • Mark, I’m always here, as you know 🙂

  • Something of an update…

    I’ve been looking at the overall design for the CORS proxy again today - perhaps with a fresher perspective (and less tired eyes) and found several ways to improve the overall efficiency of the proxy itself.

    I also changed the code base so it will now require the generation of an API key. Don’t worry too much about this, as there are tons of generators online that can easily convert a phrase to either a SHA-256 (preferred) or MD5 hash. You simply copy this key and place it in both the proxy server and the custom js code in NodeBB, restart the proxy service, and you’re good to go.
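If you would rather not paste a phrase into an online generator, the same kind of key can be produced locally. This is only a minimal sketch using Node's built-in crypto module; the function name `makeApiKey` and the example phrase are made up for illustration and are not part of OGProxy.

```javascript
const { createHash } = require("crypto");

// Derive a hex-encoded SHA-256 hash from a phrase.
// makeApiKey is a hypothetical helper name, not something OGProxy ships.
function makeApiKey(phrase) {
  return createHash("sha256").update(phrase, "utf8").digest("hex");
}

// Example phrase only: pick your own and keep it private.
const apiKey = makeApiKey("correct horse battery staple");
console.log(apiKey.length); // a SHA-256 hex digest is always 64 characters long
```

The resulting string is what would be copied into both the proxy server and the custom js code.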

    I suspect you’re wanting to know how the proxy part works… This is a relatively simple Node.js based server that runs as a daemon (using systemd). It takes input from the js code and processes it as if it were a GET request from a browser. In short, it collects OpenGraph (og) data from the target URL and returns it to the js function, which then creates a div block based on the Bootstrap card element. That block contains the information about the website (typically title, description, image and url), which is then used to render the preview card.
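To make the "collects OpenGraph data" step more concrete, here is a rough sketch of pulling og tags out of a page once its HTML has been fetched. It is not the actual OGProxy code: `extractOpenGraph` is a hypothetical helper, and the regex approach is a deliberate simplification (a proper HTML parser or scraping library copes far better with real-world markup).

```javascript
// Pull OpenGraph <meta> tags out of an HTML string.
// Simplified stand-in for a real parser: it handles attributes in either
// order, but not every quirk of real-world markup.
function extractOpenGraph(html) {
  const og = {};
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    const property = /\bproperty\s*=\s*["']og:([^"']+)["']/i.exec(tag);
    const content = /\bcontent\s*=\s*(?:"([^"]*)"|'([^']*)')/i.exec(tag);
    if (property && content) {
      // Key by the part after "og:", e.g. title, description, image, url.
      og[property[1]] = content[1] !== undefined ? content[1] : content[2];
    }
  }
  return og;
}

// Made-up page used purely as an example.
const sampleHtml = `
  <html><head>
    <meta property="og:title" content="Example Site">
    <meta content="An example page" property="og:description" />
    <meta property="og:image" content="https://example.com/cover.png">
  </head><body></body></html>`;

console.log(extractOpenGraph(sampleHtml));
```

The object returned here is the kind of data the js function would receive and turn into a card.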

    I’ve switched development of the newer proxy to dev so it doesn’t impact production. As soon as this is ready, the current version running on this site will be replaced.

    Once final testing is completed, I will be in a position to provide an installation guide so you can try this out for yourselves.

    For info, the proxy will not render preview links pointing to your own domain by design. I’d recommend you leave it that way otherwise you’ll encounter issues with performance and odd artefacts that aren’t rendered properly because the lookup failed. I’ll be placing a check inside the code that ignores links that have the origin of your own domain.
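As an illustration of what such a check might look like (an assumption about the approach, not the final OGProxy code), comparing origins with the standard URL class is enough. `isOwnOrigin` and the example domains are hypothetical.

```javascript
// Return true when a link points at the forum's own origin.
// isOwnOrigin is a hypothetical helper; in the browser, siteOrigin
// would typically be window.location.origin.
function isOwnOrigin(href, siteOrigin) {
  try {
    // Relative links resolve against the site, so they count as "own".
    return new URL(href, siteOrigin).origin === new URL(siteOrigin).origin;
  } catch (err) {
    return false; // unparsable link: not treated as own-origin
  }
}

// Example: keep only the links that should be sent to the proxy.
const links = [
  "https://forum.example.com/topic/498",
  "/user/someone",
  "https://github.com/NodeBB/NodeBB",
];
const external = links.filter((href) => !isOwnOrigin(href, "https://forum.example.com"));
console.log(external);
```

Note that a subdomain (such as one used for the proxy) is a different origin, so it would not be caught by this check.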

    Finally, I’d recommend setting up a subdomain for the proxy service to run under. It’s cleaner, easier to manage, and will run independently of the root domain where your NodeBB instance runs. A simple nginx reverse proxy is required, and I will provide this as part of the guide.

  • @julian has just released nodebb-plugin-link-preview which might be a better fit depending on your needs. I will continue development for this solution though as it performs all requests client-side.

  • @phenomlab

    I have installed the first published version.
    After a few tests, I get quite random results, but the solution has the advantage of being easy to use.

  • @DownPW Yes, that’s true, but I don’t think it’s as efficient. I’ve looked at the library being used in more detail last night, and whilst I like the approach, it has some nuances where I think performance could be an issue.

  • Another quick update. Julian’s link preview plugin has gone through numerous iterations and now appears to work very well.

    However, there are some limitations from what I can see, given that the NodeBB plugin works server side. The implementation I am working on runs client side, and is extremely quick.

    I’ve chosen to use an existing library as this made more sense than effectively reinventing the wheel. This one is updated frequently (in fact yesterday was the last update) so less of a concern from the longevity perspective.

    I expect to have a fully functional beta available in the coming days. If you want to see progress around the latest code, this is available and running on https://sudonix.dev.

  • @phenomlab said in Potential replacement for iFramely:

    and now appears to work very well.

    Are you sure 😆 After One hour !!

    fa4e9a3f-12ed-41cb-8c86-aae5c0934f59-image.png

  • Your solution is working fine. It should be as simple to implement as Julian’s plugin, and then it would be perfect ^^

  • @DownPW yes, mine is doing the same now.

  • @DownPW it’s not going to be a plugin, as it requires its own reverse proxy on a subdomain (as I mentioned previously) and everything is client side, so a plugin would be pointless as it has no settings - only custom JS code.

  • I have literally one bug (a simple one) to resolve, then clean up the code and restyle some of the elements (@DownPW is right in the sense that they are a bit big 🤦) and it’s ready for testing.

    In reality, I expect this to work almost flawlessly on other user forums, as the code has been through several iterations to ensure it’s both lean and efficient. If you test it on sudonix.dev you’ll see what I mean. I found an ingenious way to extract the favicon - believe it or not, despite the year being 2023, web site owners don’t understand how to structure meta tags properly 🤬🤬.

    I learned today that Google in fact has a hidden API that you can use to get the favicon from any site which not only works flawlessly, but it’s extremely efficient and so, I’m using it in my code - pointless reinventing the wheel when you don’t have to.
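The post doesn’t name the endpoint, so treat the following purely as an assumption: one widely used Google favicon URL is `https://www.google.com/s2/favicons`, which takes a `domain` and an optional `sz` (size) query parameter. A minimal sketch of building such a URL for a link, with `faviconUrl` being a hypothetical helper name:

```javascript
// Build a favicon URL for the site that a link points to.
// ASSUMPTION: this uses Google's s2/favicons endpoint; the post does not
// say which endpoint OGProxy actually calls.
function faviconUrl(pageUrl, size = 32) {
  const host = new URL(pageUrl).hostname;
  return `https://www.google.com/s2/favicons?domain=${encodeURIComponent(host)}&sz=${size}`;
}

console.log(faviconUrl("https://github.com/NodeBB/NodeBB"));
```

The appeal of this approach is that the favicon no longer depends on the target site declaring its icon correctly in its own markup.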

    The final “bug” isn’t really a “bug” in the traditional sense. In fact, it’s nothing to do with my code, but one specific variable I look for in the meta tags of each URL being crawled is

    ogSiteName
    

    The “og” part is short for OpenGraph - an industry standard for years, yet often missing from a page’s meta tags. This has the undesired effect of returning undefined when it doesn’t exist, meaning previews look like this

    Screenshot_2023-06-11-22-40-27-43_e4424258c8b8649f6e67d283a50a2cbc.jpg

    Notice the “undefined” text where the site title should be - that looks bad, so I’ll likely replace this with the domain name of the site if the title is missing.
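A fallback along those lines could be as small as the sketch below. It assumes the proxy response is an object with an optional `ogSiteName` property; `siteLabel` is a hypothetical helper, and stripping a leading "www." is just a cosmetic choice.

```javascript
// Choose the label shown for the site on a preview card.
// Falls back to the link's hostname when ogSiteName is missing or blank.
function siteLabel(og, pageUrl) {
  const name = typeof og.ogSiteName === "string" ? og.ogSiteName.trim() : "";
  if (name) return name;
  return new URL(pageUrl).hostname.replace(/^www\./, "");
}

console.log(siteLabel({ ogSiteName: "GitHub" }, "https://github.com/NodeBB/NodeBB"));
console.log(siteLabel({}, "https://www.reddit.com/r/selfhosted"));
```

With this in place, the card never shows the literal text "undefined" for the site name.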

    Seriously, if you haven’t tested this out yet, I’d suggest doing so on sudonix.dev as soon as you can.

    Currently, this site is running nodebb-plugin-link-preview but will be switching to my client side version in the coming days.

    As soon as it’s running here, I’ll release the code and a guide.

  • @phenomlab said in Potential replacement for iFramely:

    believe it or not, despite the year being 2023, web site owners don’t understand how to structure meta tags properly 🤬🤬.

    Ha ha I believe you !!

    @phenomlab said in Potential replacement for iFramely:

    Notice the “undefined” text where the site title should be - that looks bad, so I’ll likely replace this with the domain name of the site if the title is missing.

    Yep, it would clearly be better to show the domain name when the OpenGraph data is absent.

    @phenomlab said in Potential replacement for iFramely:

    Seriously, if you haven’t tested this out yet, I’d suggest doing so on sudonix.dev as soon as you can.

    Already tested, very fast 😉

    @phenomlab said in Potential replacement for iFramely:

    Currently, this site is running nodebb-plugin-link-preview but will be switching to my client side version in the coming days.

    As soon as it’s running here, I’ll release the code and a guide.

    Hell yeah 😉

  • Ok, spent some time on this late last night, and the good news is that it’s finished, and ready for you to try out.

    Please do remember that this isn’t a plugin, but a client side js function with a server side proxy. However, don’t be put off as the installation is simple, and you should be up and running in around 30 minutes maximum.

    It does require some technical knowledge and ability, but if you’ve set up NodeBB, then you can do this easily - besides, there will be full documentation so you are taken through each step.

    Some other points to note are that not every site returns valid Open Graph data - in fact, some don’t even return an image (yes, Reddit, I’m talking about you) when they are closed to the public, or behind a registration form / membership grant, or in some cases, a paywall.

    When this scenario is met, the problem arises that no valid image is being returned. I did toy with the idea of using a free random image API, and even wrote the code for it - then realised nature scenery didn’t quite align with a tech site like Reddit.

    Ok - the only thing to do here is to generate your own image and bundle it with the proxy. For this purpose, I chose an image from stockphotosecrets.com (I have an annual subscription there, which I’ve cancelled as I don’t use it, but provided I downloaded the image before the term expires, I have the right to use it after the fact) and then added some text to it so that it could be used as a placeholder when the image is absent.

    Here’s that image

    404.webp

    It’s sparse, but functional. And given my comment earlier around membership and paywalls, here’s what that specific scenario would look like when encountered

    Screenshot_2023-06-12-22-37-11-58_e4424258c8b8649f6e67d283a50a2cbc.jpg

    Before I go ahead and provide the documentation, code, and proxy server, I’d recommend you try out the latest code on sudonix.dev.

    Enjoy.

  • Mark, could you please install it on my site as well?

  • @cagatay Let me put the guide together first, and see if you can get it working by yourself. It’s not a difficult installation, and once you understand how the components interact and work together, I think you’ll be fine. If the worst comes to the worst, however, I’m always happy to help.

  • UPDATE: OGProxy is now live on this forum 🙂

    https://sudonix.org/topic/498/setup-ogproxy-for-use-in-nodebb

  • Already found 2 bugs, and the fixes have been committed to live code

    • A relative image path is returned in some instances, so a function now exists to resolve it to the full path so the image is rendered
    • OGProxy did not target chat - this has been fixed
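For the relative path case, the standard URL class already does the heavy lifting. The sketch below is an illustration of the idea rather than the committed fix; `absoluteImageUrl` is a hypothetical name.

```javascript
// Resolve an og:image value against the page it came from.
// Absolute URLs pass through unchanged; relative ones are completed.
function absoluteImageUrl(imagePath, pageUrl) {
  return new URL(imagePath, pageUrl).href;
}

console.log(absoluteImageUrl("/img/logo.png", "https://example.com/blog/post"));
console.log(absoluteImageUrl("https://cdn.example.com/a.png", "https://example.com/"));
```

Passing the page URL as the base means root-relative, path-relative and protocol-relative image values all resolve correctly.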
  • One thing I never really took into account when developing OGProxy was the potential for CLS (Cumulative Layout Shift). If you’re sitting there thinking “what is he talking about?” then this goes some way to explain it

    https://blog.hubspot.com/marketing/cumulative-layout-shift

    Not only does this harm the user experience, in the sense that the page jumps all over the place when you are trying to read content, but it is also harmful when it comes to site performance and SEO, as this is a key metric when checking page performance and speed.

    Based on this, I’ve made several changes to how OGProxy works - some of which I will outline below

    1. Link Selection and Filtering

    The function selects certain links in the document and filters out unnecessary ones, such as internal links or links with specific classes.
    It then filters the remaining links by checking for domains or file types that should be ignored.

    2. Placeholders for Preventing Cumulative Layout Shift (CLS)

    Placeholders are inserted initially with a generic “Loading…” message and a temporary image. This prevents CLS, which happens when the page layout shifts due to asynchronously loaded content. By having a placeholder occupy the same space as the final preview, the layout stays stable while data is fetched.

    3. AJAX Request to Fetch Link Metadata

    The function sends an AJAX request to an OpenGraph proxy service to retrieve metadata (title, description, and image) about each link. It uses the same proxy server and API key to fetch this information as its predecessor.

    4. Dynamic Update of Placeholder with Real Data

    Once data is retrieved, the placeholder is replaced with the actual preview. Title, description, and image are updated based on the fetched metadata. If an image is missing or invalid, it defaults to a specified “404” image.

    5. Error Handling and Debugging

    Debug logging and error handling ensure that if data can’t be fetched, the placeholder is left unchanged and the failure is logged for troubleshooting.

    This approach provides a smoother user experience by managing both loading time and visual stability, which are critical for preventing CLS in dynamically loaded content.
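The filtering and placeholder steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the released OGProxy code: the ignore lists, the 120px dimensions, the `/images/404.webp` path and all function names are hypothetical, and for brevity the same image is used both as the temporary placeholder and as the fallback.

```javascript
// Hypothetical settings, for illustration only.
const IGNORED_DOMAINS = ["localhost"];
const IGNORED_EXTENSIONS = [".pdf", ".zip", ".png", ".jpg"];
const FALLBACK_IMAGE = "/images/404.webp"; // assumed path of the bundled image

// Link filtering: decide whether a link should get a preview at all.
function shouldPreview(href) {
  let url;
  try {
    url = new URL(href);
  } catch {
    return false; // not a parsable absolute URL
  }
  if (!/^https?:$/.test(url.protocol)) return false;
  if (IGNORED_DOMAINS.includes(url.hostname)) return false;
  const path = url.pathname.toLowerCase();
  return !IGNORED_EXTENSIONS.some((ext) => path.endsWith(ext));
}

// Escape text before placing it inside HTML.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => ({
    "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;",
  }[ch]));
}

// Placeholder and final card share one template with fixed image
// dimensions, so the space reserved up front matches the finished preview.
function cardHtml(href, data = {}) {
  const title = data.title || "Loading…";
  const description = data.description || "";
  const image = data.image || FALLBACK_IMAGE;
  return (
    `<div class="card og-preview" style="min-height:120px">` +
    `<img class="card-img-top" src="${escapeHtml(image)}" width="120" height="120" alt="">` +
    `<div class="card-body">` +
    `<a class="card-title" href="${escapeHtml(href)}">${escapeHtml(title)}</a>` +
    `<p class="card-text">${escapeHtml(description)}</p>` +
    `</div></div>`
  );
}

// Placeholder first, then the same card again once data has arrived.
const link = "https://example.com/article";
if (shouldPreview(link)) {
  console.log(cardHtml(link));
  console.log(cardHtml(link, { title: "An article", description: "Short summary" }));
}
```

In the browser, the placeholder string would be inserted next to the link (for example with `insertAdjacentHTML`) and swapped for the filled version when the proxy responds. Because both versions declare the same image width and height, the surrounding content does not move.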

    The new code is active on this site, and there’s not only a huge visual improvement, but also serious performance gains. I’ll give it a few weeks, then formally release the new code.

  • And now, changes made to the back-end Proxy Server to increase performance

    Key Changes Made:

    1. Rate Limiting

    Added express-rate-limit to limit requests from a single IP.

    2. Logging

    Integrated morgan for logging HTTP requests.

    3. Health Check Endpoint

    Added a simple endpoint to check the server’s status.

    4. Data Validation

    Implemented input validation for the URL using Joi.

    5. Environment Variables

    Used dotenv for managing sensitive data like API keys and port configuration.

    6. Error Handling

    Enhanced error logging for debugging purposes.

    7. Asynchronous Error Handling

    Utilize a centralized error-handling middleware to manage errors in one place.

    8. Environment Variable Management

    Use environment variables for more configuration options, such as cache duration or allowed origins, making it easier to change configurations without altering the code.

    9. Static Response Handling

    Use a middleware for handling static responses or messages instead of duplicating logic.

    10. Compression Middleware

    Add compression middleware to reduce the size of the response bodies, which can improve performance, especially for larger responses.

    11. Timeout Handling on Requests

    Handle timeouts for the requests made to the target URLs and provide appropriate error responses.

    12. Security Improvements

    Implement security best practices, such as Helmet for setting HTTP headers, which can help protect against well-known vulnerabilities.

    13. Logging Configuration

    Improve logging with different levels (e.g., info, error) using a logging library like winston, which provides more control over logging output.

    14. Graceful Shutdown

    Implement graceful shutdown logic to handle server termination more smoothly, especially during deployment.

    15. Monitoring and Metrics

    Integrate monitoring tools like Prometheus or an APM tool for better insights into the application’s performance and resource usage.

    16. Response Schema Validation

    Use libraries like Joi or Ajv to validate responses sent back to the client, ensuring they conform to expected formats.

    Again, this new code is running here in test for a few weeks.
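To show the ideas behind two of the items above (URL validation and per-IP rate limiting) without depending on any package, here is a dependency-free sketch. It deliberately does not use the Joi or express-rate-limit APIs; `validateTargetUrl`, `createRateLimiter` and the limits chosen are hypothetical stand-ins for illustration only.

```javascript
// Validate the target URL supplied by the client
// (a plain stand-in for what a Joi schema would check).
function validateTargetUrl(input) {
  if (typeof input !== "string" || input.length > 2048) {
    return { ok: false, error: "url must be a string of at most 2048 characters" };
  }
  let url;
  try {
    url = new URL(input);
  } catch {
    return { ok: false, error: "url is not valid" };
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return { ok: false, error: "only http and https are allowed" };
  }
  return { ok: true, url: url.href };
}

// Fixed-window limiter keyed by client IP. This mirrors the general idea
// of rate limiting, not the express-rate-limit API. `now` can be injected
// so the logic can be exercised without waiting for real time to pass.
function createRateLimiter({ windowMs, max }) {
  const hits = new Map(); // ip -> { count, windowStart }
  return function allow(ip, now = Date.now()) {
    const entry = hits.get(ip);
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(ip, { count: 1, windowStart: now }); // start a new window
      return true;
    }
    entry.count += 1;
    return entry.count <= max;
  };
}

// Example: at most 2 requests per IP per second.
const allow = createRateLimiter({ windowMs: 1000, max: 2 });
console.log(validateTargetUrl("https://example.com/article"));
console.log(allow("203.0.113.5"), allow("203.0.113.5"), allow("203.0.113.5"));
```

In the real server these checks would sit in front of the lookup, so that invalid or excessive requests are rejected before any outbound fetch is made.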

