OGProxy - a replacement for iFramely

  • @DownPW it’s not going to be a plugin, as it requires its own reverse proxy in a subdomain, as I mentioned previously. Everything is client side, so a plugin would be pointless as it has no settings - only custom JS code.

  • I have literally one bug (a simple one) to resolve, then clean up the code and restyle some of the elements (@DownPW is right in the sense that they are a bit big 🤦) and it’s ready for testing.

    In reality, I expect this to work almost flawlessly on other users’ forums as the code has been through several iterations to ensure it’s both lean and efficient. If you test it on sudonix.dev you’ll see what I mean. I found an ingenious way to extract the favicon - believe it or not, despite the year being 2023, web site owners don’t understand how to structure meta tags properly 🤬🤬.

    I learned today that Google in fact has a hidden API that you can use to get the favicon from any site which not only works flawlessly, but it’s extremely efficient and so, I’m using it in my code - pointless reinventing the wheel when you don’t have to.
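
    The post doesn’t name the API, so the following is only a sketch under an assumption: that it refers to the widely used `s2/favicons` endpoint on google.com. The function name `faviconUrl` and the default size are made up for illustration.

```javascript
// Hypothetical helper: build a favicon URL for the site a link points to.
// Assumption: the API meant in the post is Google's s2/favicons endpoint.
function faviconUrl(link, size = 32) {
  const host = new URL(link).hostname; // throws on a malformed link
  const params = new URLSearchParams({ domain: host, sz: String(size) });
  return `https://www.google.com/s2/favicons?${params}`;
}

console.log(faviconUrl("https://sudonix.org/topic/498/setup-ogproxy-for-use-in-nodebb"));
```

    Only the hostname is sent, so every link to the same site resolves to the same icon URL and the browser can cache it.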

    The final “bug” isn’t really a “bug” in the traditional sense. In fact, it’s nothing to do with my code, but one specific variable I look for in the headers of each URL being crawled is

    ogSiteName
    

    The “og” part is short for OpenGraph - an industry standard for years, yet often missing in headers. This has the undesired effect of returning undefined when it doesn’t exist, meaning previews look like this

    Screenshot_2023-06-11-22-40-27-43_e4424258c8b8649f6e67d283a50a2cbc.jpg

    Notice the “undefined” text where the site title should be - that looks bad, so I’ll likely replace it with the domain name of the site if the title is missing.
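
    A minimal sketch of that fallback, not the actual OGProxy code. Only `ogSiteName` comes from the post; the helper name `siteNameFor` and the shape of the `meta` object are assumptions.

```javascript
// Hypothetical helper: choose the text shown as the site name in a preview.
// `meta` stands for whatever object the proxy returns for a link.
function siteNameFor(meta, link) {
  const name = meta && typeof meta.ogSiteName === "string" ? meta.ogSiteName.trim() : "";
  if (name) return name;
  // No usable ogSiteName: fall back to the domain, without a leading "www.".
  return new URL(link).hostname.replace(/^www\./, "");
}

console.log(siteNameFor({ ogSiteName: "Sudonix" }, "https://sudonix.org/"));
console.log(siteNameFor({}, "https://www.reddit.com/r/node/"));
```

    The second call has no `ogSiteName`, so it prints the domain rather than “undefined”.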

    Seriously, if you haven’t tested this out yet, I’d suggest doing so on sudonix.dev as soon as you can.

    Currently, this site is running nodebb-plugin-link-previews but will be switching to my client side version in the coming days.

    As soon as it’s running here, I’ll release the code and a guide.

  • @phenomlab said in Potential replacement for iFramely:

    believe it or not, despite the year being 2023, web site owners don’t understand how to structure meta tags properly 🤬🤬.

    Ha ha I believe you !!

    @phenomlab said in Potential replacement for iFramely:

    Notice the “undefined” text where the site title should be - that looks bad, so I’ll likely replace it with the domain name of the site if the title is missing.

    Yep, it would clearly be better with the domain name when the OpenGraph data is absent from the headers.

    @phenomlab said in Potential replacement for iFramely:

    Seriously, if you haven’t tested this out yet, I’d suggest doing so on sudonix.dev as soon as you can.

    Already tested, very fast 😉

    @phenomlab said in Potential replacement for iFramely:

    Currently, this site is running nodebb-plugin-link-previews but will be switching to my client side version in the coming days.

    As soon as it’s running here, I’ll release the code and a guide.

    Hell yeah 😉

  • Ok, spent some time on this late last night, and the good news is that it’s finished, and ready for you to try out.

    Please do remember that this isn’t a plugin, but a client side js function with a server side proxy. However, don’t be put off as the installation is simple, and you should be up and running in around 30 minutes maximum.

    It does require some technical knowledge and ability, but if you’ve set up NodeBB, then you can do this easily - besides, there will be full documentation so you are taken through each step.

    Some other points to note are that not every site returns valid Open Graph data - in fact, some don’t even return an image (yes, Reddit, I’m talking about you) when they are closed to the public, or behind a registration form / membership grant, or in some cases, a paywall.

    When this scenario is met, the problem arises that no valid image is being returned. I did toy with the idea of using a free random image API, and even wrote the code for it - then realised nature scenery didn’t quite align with a tech site like Reddit.

    Ok - the only thing to do here is to generate your own image and bundle it with the proxy. For this purpose, I chose an image from stockphotosecrets.com (I have an annual subscription, which I’ve cancelled as I don’t use it, but provided I download the image before the term expires, I keep the right to use it after the fact) and then added some text to it so that it can be used as a placeholder when the image is absent.

    Here’s that image

    404.webp

    It’s sparse, but functional. And given my comment earlier around membership and paywalls, here’s what that specific scenario would look like when encountered

    Screenshot_2023-06-12-22-37-11-58_e4424258c8b8649f6e67d283a50a2cbc.jpg

    Before I go ahead and provide the documentation, code, and proxy server, I’d recommend you try out the latest code on sudonix.dev.

    Enjoy.

  • Mark, could you please install it on my site as well?

  • @cagatay Let me put the guide together first, and see if you can get it working by yourself. It’s not a difficult installation, and once you understand how the components intersect and work together, I think you’ll be fine. If the worst comes to the worst, however, I’m always happy to help.

  • UPDATE: OGProxy is now live on this forum 🙂

    https://sudonix.org/topic/498/setup-ogproxy-for-use-in-nodebb

  • Already found 2 bugs, which have been committed to live code

    • Relative path is provided in some instances, so a function now exists to return the full path instead so the image is rendered
    • OGProxy does not target chat - this has been fixed
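
    The relative-path fix can be sketched with the standard `URL` constructor, which resolves a path against a base. This is an illustration rather than the committed code, and `absoluteImageUrl` is a hypothetical name.

```javascript
// Hypothetical helper: turn the image path found in the metadata into an
// absolute URL, using the crawled page's own URL as the base.
function absoluteImageUrl(imagePath, pageUrl) {
  if (typeof imagePath !== "string" || imagePath === "") return null;
  try {
    // Resolves "/img/a.png", "img/a.png" and "//cdn.example.com/a.png";
    // an already-absolute URL is returned unchanged.
    return new URL(imagePath, pageUrl).href;
  } catch {
    return null; // malformed input; the caller can show a placeholder instead
  }
}

console.log(absoluteImageUrl("/img/a.png", "https://example.com/post/1"));
```

    Returning `null` for a missing or malformed path leaves the choice of placeholder image to the caller.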
  • One thing I never really took into account when developing OGProxy was the potential for CLS (Cumulative Layout Shift). If you’re sitting there thinking “what is he talking about?” then this goes some way to explain it

    https://blog.hubspot.com/marketing/cumulative-layout-shift

    Not only does this harm the user experience in the sense that the page jumps all over the place when you are trying to read content, but it is also harmful when it comes to site performance and SEO, as this is a key measurable when checking page performance and speed.

    Based on this, I’ve made several changes to how OGProxy works - some of which I will outline below

    1. Link Selection and Filtering

    The function selects certain links in the document and filters out unnecessary ones, such as internal links or links with specific classes.
    It further filters links by checking for domains or file types that should be ignored.

    2. Placeholders for Preventing Cumulative Layout Shift (CLS)

    Placeholders are inserted initially with a generic “Loading…” message and a temporary image. This prevents CLS, which happens when the page layout shifts due to asynchronously loaded content. By having a placeholder occupy the same space as the final preview, the layout stays stable while data is fetched.

    3. AJAX Request to Fetch Link Metadata

    The function sends an AJAX request to an OpenGraph proxy service to retrieve metadata (title, description, and image) about each link. It uses the same proxy server and API key to fetch this information as its predecessor.

    4. Dynamic Update of Placeholder with Real Data

    Once data is retrieved, the placeholder is replaced with the actual preview. Title, description, and image are updated based on the fetched metadata. If an image is missing or invalid, it defaults to a specified “404” image.

    5. Error Handling and Debugging

    Debug logging and error handling ensure that if data can’t be fetched, the placeholders are left unchanged and the failure is logged for troubleshooting.

    This approach provides a smoother user experience by managing both loading time and visual stability, which are critical for preventing CLS in dynamically loaded content.
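
    The filtering, placeholder and update steps above can be sketched as plain functions. This is not the released OGProxy code: the ignore lists, the asset paths and all function names are assumptions, and the AJAX call and DOM handling are left out so the example runs anywhere.

```javascript
// Hypothetical settings; the real ignore lists are not given in the post.
const IGNORED_DOMAINS = ["localhost"];
const IGNORED_EXTENSIONS = [".zip", ".pdf", ".exe"];

// Link filtering: decide whether a link should get a preview at all.
function shouldPreview(href, forumHost) {
  let url;
  try {
    url = new URL(href);
  } catch {
    return false; // relative or malformed href: treat it as internal
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  if (url.hostname === forumHost) return false; // internal link
  if (IGNORED_DOMAINS.includes(url.hostname)) return false;
  const path = url.pathname.toLowerCase();
  return !IGNORED_EXTENSIONS.some((ext) => path.endsWith(ext));
}

// Placeholder: same fields as the final preview, so swapping the real data
// in later changes the content of the card but not its shape.
function placeholderFor(href) {
  return { url: href, title: "Loading…", description: "", image: "/assets/loading.webp" };
}

// Update and error handling: merge fetched metadata into the placeholder,
// use the 404 image when no usable image came back, and leave the
// placeholder unchanged when nothing could be fetched.
function resolvePreview(placeholder, meta) {
  if (!meta) return placeholder;
  const hasImage = typeof meta.image === "string" && meta.image !== "";
  return {
    url: placeholder.url,
    title: meta.title || placeholder.url,
    description: meta.description || "",
    image: hasImage ? meta.image : "/assets/404.webp",
  };
}

const card = placeholderFor("https://example.com/article");
console.log(resolvePreview(card, { title: "An article" }));
```

    In the browser, the CLS benefit comes from giving the placeholder element fixed dimensions in CSS; the data-only version here just shows that the placeholder and the final preview share one structure.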

    The new code is active on this site, and there’s not only a huge visual improvement, but also serious performance gains. I’ll give it a few weeks, then formally release the new code.

  • And now, changes made to the back-end Proxy Server to increase performance

    Key Changes Made:

    1. Rate Limiting

    Added express-rate-limit to limit requests from a single IP.

    2. Logging

    Integrated morgan for logging HTTP requests.

    3. Health Check Endpoint

    Added a simple endpoint to check the server’s status.

    4. Data Validation

    Implemented input validation for the URL using Joi.

    5. Environment Variables

    Used dotenv for managing sensitive data like API keys and port configuration.

    6. Error Handling

    Enhanced error logging for debugging purposes.

    7. Asynchronous Error Handling

    Utilize a centralized error-handling middleware to manage errors in one place.

    8. Environment Variable Management

    Use environment variables for more configuration options, such as cache duration or allowed origins, making it easier to change configurations without altering the code.

    9. Static Response Handling

    Use a middleware for handling static responses or messages instead of duplicating logic.

    10. Compression Middleware

    Add compression middleware to reduce the size of the response bodies, which can improve performance, especially for larger responses.

    11. Timeout Handling on Requests

    Handle timeouts for the requests made to the target URLs and provide appropriate error responses.

    12. Security Improvements

    Implement security best practices, such as Helmet for setting HTTP headers, which can help protect against well-known vulnerabilities.

    13. Logging Configuration

    Improve logging with different levels (e.g., info, error) using a logging library like winston, which provides more control over logging output.

    14. Graceful Shutdown

    Implement graceful shutdown logic to handle server termination more smoothly, especially during deployment.

    15. Monitoring and Metrics

    Integrate monitoring tools like Prometheus or an APM tool for better insights into the application’s performance and resource usage.

    16. Response Schema Validation

    Use libraries like Joi or Ajv to validate responses sent back to the client, ensuring they conform to expected formats.

    Again, this new code is running here in test for a few weeks.
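
    To make two of these items concrete, here is a dependency-free sketch of URL validation and per-IP rate limiting. The proxy itself is described as using Joi and express-rate-limit; the plain-JavaScript stand-ins below only illustrate the ideas, and the names, length cap and limits are all assumptions.

```javascript
// Data validation stand-in: accept only absolute http(s) URLs of sane length.
function isValidTarget(input) {
  if (typeof input !== "string" || input.length > 2048) return false;
  try {
    const url = new URL(input);
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false; // not parseable as an absolute URL
  }
}

// Rate limiting stand-in: a fixed-window counter per client IP.
// windowMs and max are example values, not the proxy's real settings.
function createRateLimiter({ windowMs = 60000, max = 100 } = {}) {
  const hits = new Map(); // ip -> { count, windowStart }
  return function allow(ip, now = Date.now()) {
    const entry = hits.get(ip);
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(ip, { count: 1, windowStart: now }); // start a new window
      return true;
    }
    entry.count += 1;
    return entry.count <= max;
  };
}

const allow = createRateLimiter({ windowMs: 1000, max: 2 });
console.log(isValidTarget("https://example.com/page"), allow("203.0.113.7"));
```

    A long-running server would also need to evict old entries from the map, which is one reason to prefer a maintained library over a hand-rolled limiter.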

  • wowww

    Very good work my friend!!

  • @phenomlab the best of the best, great work Mark 👏🏻👏🏻👏🏻👏🏻.

  • Seeing as not every site on the planet has relevant CORS headers that permit data scraping, I thought I’d make this a bit more obvious in the response. The link is still rendered, but using the image below if the remote site refuses to respond to the request

    image.png

    Not entirely sold on the image yet - likely will change it, but you get the idea 🙂 It’s more along the lines of graceful failure rather than a URL that simply does nothing.

  • @phenomlab I love that image and think it is perfect! LOL

  • @phenomlab said in OGProxy - a replacement for iFramely:

    And now, changes made to the back-end Proxy Server to increase performance

    Is the code updated on GitHub or not?

  • @DownPW Not yet. One or two bugs left to resolve, then it’ll be posted.

  • Ok I’ll install the old version in the meantime… Or maybe just wait if it’s not too long

  • Can’t wait 😲

    🙂

  • @DownPW Coming soon…

  • A release date, maybe?

