OGProxy - a replacement for iFramely

  • @phenomlab

    I have installed the first published version.
    After a few tests, the results are quite random, but the solution has the advantage of being easy to use.

  • @DownPW Yes, that’s true, but I don’t think it’s as efficient. I looked at the library being used in more detail last night, and whilst I like the approach, it has some nuances where I think performance could be an issue.

  • Another quick update. Julian’s image preview plugin has gone through numerous iterations and now appears to work very well.

    However, there are some limitations from what I can see, given that the NodeBB plugin works server side. The implementation I am working on runs client side, and is extremely quick.

    I’ve chosen to use an existing library as this made more sense than effectively reinventing the wheel. This one is updated frequently (in fact yesterday was the last update) so less of a concern from the longevity perspective.

    I expect to have a fully functional beta available in the coming days. If you want to see progress around the latest code, this is available and running on https://sudonix.dev.

  • @phenomlab said in Potential replacement for iFramely:

    and now appears to work very well.

    Are you sure 😆 After One hour !!

    fa4e9a3f-12ed-41cb-8c86-aae5c0934f59-image.png

  • Your solution is working fine. If it were as simple to implement as Julian’s plugin, it would be perfect ^^

  • @DownPW yes, mine is doing the same now.

  • @DownPW it’s not going to be a plugin, as it requires its own reverse proxy in a subdomain, as I mentioned previously. Everything is client side, so a plugin would be pointless as it has no settings - only custom JS code.

  • I have literally one bug (a simple one) to resolve, then clean up the code and restyle some of the elements (@DownPW is right in the sense that they are a bit big 🤦) and it’s ready for testing.

    In reality, I expect this to work almost flawlessly on other users’ forums, as the code has been through several iterations to ensure it’s both lean and efficient. If you test it on sudonix.dev you’ll see what I mean. I found an ingenious way to extract the favicon - believe it or not, despite the year being 2023, web site owners don’t understand how to structure meta tags properly 🤬🤬.

    I learned today that Google in fact has a hidden API that you can use to get the favicon from any site which not only works flawlessly, but it’s extremely efficient and so, I’m using it in my code - pointless reinventing the wheel when you don’t have to.
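
    The post does not name the endpoint. A minimal sketch, assuming it is the widely used but undocumented Google `s2/favicons` endpoint (the function name `faviconUrl` and the default size are illustrative, not taken from OGProxy):

    ```typescript
    // Build a favicon URL for the site that hosts `pageUrl`.
    // Assumption: the "hidden API" is Google's undocumented s2/favicons endpoint.
    function faviconUrl(pageUrl: string, size: number = 32): string {
      // Only the hostname is sent; path and query of the page are irrelevant here.
      const host = new URL(pageUrl).hostname;
      return `https://www.google.com/s2/favicons?domain=${encodeURIComponent(host)}&sz=${size}`;
    }
    ```

    Because the endpoint is undocumented, it could change without notice, so a local fallback icon is probably still worth having.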

    The final “bug” isn’t really a “bug” in the traditional sense. In fact, it’s nothing to do with my code, but one specific variable I look for in the headers of each URL being crawled is

    ogSiteName
    

    The “og” part is short for OpenGraph - an industry standard for years, yet often missing in headers. This has the undesired effect of returning undefined when it doesn’t exist, meaning previews look like this

    Screenshot_2023-06-11-22-40-27-43_e4424258c8b8649f6e67d283a50a2cbc.jpg

    Notice the “undefined” text where the site title should be - that looks bad, so I’ll likely replace it with the domain name of the site if the title is missing.
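
    The fallback described above could look something like this. It is only a sketch: `siteLabel` is a hypothetical helper, and stripping a leading `www.` is an extra assumption rather than something the post states.

    ```typescript
    // Return the site name for a preview, falling back to the domain
    // when the Open Graph site name is missing or blank.
    function siteLabel(ogSiteName: string | undefined, pageUrl: string): string {
      const trimmed = (ogSiteName ?? "").trim();
      if (trimmed.length > 0) {
        return trimmed;
      }
      // Assumption: dropping "www." gives a cleaner label; not required.
      return new URL(pageUrl).hostname.replace(/^www\./, "");
    }
    ```

    With this in place, a page without `ogSiteName` shows its domain instead of the literal text “undefined”.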

    Seriously, if you haven’t tested this out yet, I’d suggest doing so on sudonix.dev as soon as you can.

    Currently, this site is running nodebb-plugin-link-previews but will be switching to my client side version in the coming days.

    As soon as it’s running here, I’ll release the code and a guide.

  • @phenomlab said in Potential replacement for iFramely:

    believe it or not, despite the year being 2023, web site owners don’t understand how to structure meta tags properly 🤬🤬.

    Ha ha I believe you !!

    @phenomlab said in Potential replacement for iFramely:

    Notice the “undefined” text where the site title should be - that looks bad, so I’ll likely replace it with the domain name of the site if the title is missing.

    Yep, it would clearly be better with the domain name when the OpenGraph data is absent from the headers.

    @phenomlab said in Potential replacement for iFramely:

    Seriously, if you haven’t tested this out yet, I’d suggest doing so on sudonix.dev as soon as you can.

    Already tested, very fast 😉

    @phenomlab said in Potential replacement for iFramely:

    Currently, this site is running nodebb-plugin-link-previews but will be switching to my client side version in the coming days.

    As soon as it’s running here, I’ll release the code and a guide.

    Hell yeah 😉

  • Ok, spent some time on this late last night, and the good news is that it’s finished, and ready for you to try out.

    Please do remember that this isn’t a plugin, but a client side js function with a server side proxy. However, don’t be put off as the installation is simple, and you should be up and running in around 30 minutes maximum.

    It does require some technical knowledge and ability, but if you’ve set up NodeBB, then you can do this easily - besides, there will be full documentation so you are taken through each step.

    Some other points to note are that not every site returns valid Open Graph data - in fact, some don’t even return an image (yes, Reddit, I’m talking about you) when they are closed to the public, or behind a registration form / membership grant, or in some cases, a paywall.

    When this scenario is met, the problem arises that no valid image is being returned. I did toy with the idea of using a free random image API, and even wrote the code for it - then realised nature scenery didn’t quite align with a tech site like Reddit.

    Ok - the only thing to do here is to generate your own image and bundle it with the proxy. For this purpose, I chose an image from stockphotosecrets.com and added some text to it so that it can be used as a placeholder when the image is absent. (I have an annual subscription there, which I’ve cancelled as I don’t use it, but provided I download an image before the term expires, I have the right to use it after the fact.)

    Here’s that image

    404.webp

    It’s sparse, but functional. And given my comment earlier around membership and paywalls, here’s what that specific scenario would look like when encountered

    Screenshot_2023-06-12-22-37-11-58_e4424258c8b8649f6e67d283a50a2cbc.jpg

    Before I go ahead and provide the documentation, code, and proxy server, I’d recommend you try out the latest code on sudonix.dev.

    Enjoy.

  • Mark, could you please install it on my site as well?

  • @cagatay Let me put the guide together first, and see if you can get it working by yourself. It’s not a difficult installation, and once you understand how the components intersect and work together, I think you’ll be fine. If the worst comes to the worst, however, I’m always happy to help.

  • UPDATE: OGProxy is now live on this forum 🙂

    https://sudonix.org/topic/498/setup-ogproxy-for-use-in-nodebb

  • Already found 2 bugs; the fixes have been committed to the live code

    • Relative path is provided in some instances, so a function now exists to return the full path instead so the image is rendered
    • OGProxy does not target chat - this has been fixed
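
    A sketch of the relative-path fix, assuming the standard `URL` constructor is used to resolve the image against the page it came from (`absoluteImageUrl` is a hypothetical name, not the function in OGProxy):

    ```typescript
    // Resolve an og:image value against the URL of the page it was found on.
    // Absolute URLs pass through unchanged; relative ones gain scheme and host.
    function absoluteImageUrl(imagePath: string, pageUrl: string): string {
      return new URL(imagePath, pageUrl).href;
    }
    ```

    The same call also covers protocol-relative values such as `//cdn.example.net/a.png`, which inherit the scheme of the page.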
  • One thing I never really took into account when developing OGProxy was the potential for CLS (Cumulative Layout Shift). If you’re sitting there thinking “what is he talking about?” then this goes some way to explain it

    https://blog.hubspot.com/marketing/cumulative-layout-shift

    Not only does this harm the user experience in the sense that the page jumps all over the place when you are trying to read content, but it is also harmful when it comes to site performance and SEO, as CLS is a key metric when checking page performance and speed.

    Based on this, I’ve made several changes to how OGProxy works - some of which I will outline below

    1. Link Selection and Filtering

    The function selects certain links in the document and filters out unnecessary ones, such as internal links or links with specific classes.
    It further filters links by checking for domains or file types that should be ignored.

    2. Placeholders for Preventing Cumulative Layout Shift (CLS)

    Placeholders are inserted initially with a generic “Loading…” message and a temporary image. This prevents CLS, which happens when the page layout shifts due to asynchronously loaded content. By having a placeholder occupy the same space as the final preview, the layout stays stable while data is fetched.

    3. AJAX Request to Fetch Link Metadata

    The function sends an AJAX request to an OpenGraph proxy service to retrieve metadata (title, description, and image) about each link. It uses the same proxy server and API key to fetch this information as its predecessor.

    4. Dynamic Update of Placeholder with Real Data

    Once data is retrieved, the placeholder is replaced with the actual preview. Title, description, and image are updated based on the fetched metadata. If an image is missing or invalid, it defaults to a specified “404” image.

    5. Error Handling and Debugging

    Debug logging and error handling ensure that if data can’t be fetched, the placeholder is left unchanged and the failure can be logged for troubleshooting.
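
    The link selection and filtering step in the list above might be sketched as follows. It covers only the URL-level checks (internal links, ignored domains, ignored file types), since the class-based filtering needs the DOM. All names and the example lists are hypothetical.

    ```typescript
    interface LinkFilterOptions {
      siteHost: string;            // the forum's own hostname (internal links are skipped)
      ignoredDomains: string[];    // domains that should never get a preview
      ignoredExtensions: string[]; // lower-case file extensions, e.g. ".pdf"
    }

    // Parse an href, returning null when it is not an absolute URL.
    function parseHref(href: string): URL | null {
      try {
        return new URL(href);
      } catch {
        return null;
      }
    }

    // Decide whether a link qualifies for an Open Graph preview.
    function shouldPreview(href: string, opts: LinkFilterOptions): boolean {
      const target = parseHref(href);
      if (target === null) return false;
      if (target.protocol !== "http:" && target.protocol !== "https:") return false;

      const host = target.hostname;
      if (host === opts.siteHost) return false; // internal link
      if (opts.ignoredDomains.some((d) => host === d || host.endsWith("." + d))) return false;

      const path = target.pathname.toLowerCase();
      return !opts.ignoredExtensions.some((ext) => path.endsWith(ext));
    }
    ```

    Filtering before any request is made keeps unnecessary calls away from the proxy, which fits the performance goal described in the post.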

    This approach provides a smoother user experience by managing both loading time and visual stability, which are critical for preventing CLS in dynamically loaded content.
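
    To illustrate the placeholder idea, here is a minimal data-level sketch: a card is created immediately with “Loading...” content and is later filled in from the fetched metadata. The field names, image paths, and function names are assumptions for illustration. The fixed dimensions that actually prevent the layout shift would live in CSS and are not shown.

    ```typescript
    interface PreviewCard {
      url: string;
      title: string;
      description: string;
      image: string;
      loading: boolean;
    }

    interface OgMeta {
      ogTitle?: string;
      ogDescription?: string;
      ogImage?: string;
    }

    const LOADING_IMAGE = "/images/loading.webp"; // hypothetical path
    const MISSING_IMAGE = "/images/404.webp";     // hypothetical path to the "404" image

    // Card shown immediately, occupying the space of the final preview.
    function makePlaceholder(url: string): PreviewCard {
      return { url, title: "Loading...", description: "", image: LOADING_IMAGE, loading: true };
    }

    // Swap placeholder content for real data; a null result leaves it unchanged.
    function fillPlaceholder(card: PreviewCard, meta: OgMeta | null): PreviewCard {
      if (meta === null) return card;
      return {
        url: card.url,
        title: meta.ogTitle || new URL(card.url).hostname,
        description: meta.ogDescription || "",
        image: meta.ogImage || MISSING_IMAGE,
        loading: false,
      };
    }
    ```

    Because the placeholder and the final card share one shape, the element can be updated in place rather than inserted later, which is what keeps the surrounding content from jumping.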

    The new code is active on this site, and there’s not only a huge visual improvement, but also serious performance gains. I’ll give it a few weeks, then formally release the new code.

  • And now, changes made to the back-end Proxy Server to increase performance

    Key Changes Made:

    1. Rate Limiting

    Added express-rate-limit to limit requests from a single IP.

    2. Logging

    Integrated morgan for logging HTTP requests.

    3. Health Check Endpoint

    Added a simple endpoint to check the server’s status.

    4. Data Validation

    Implemented input validation for the URL using Joi.

    5. Environment Variables

    Used dotenv for managing sensitive data like API keys and port configuration.

    6. Error Handling

    Enhanced error logging for debugging purposes.

    7. Asynchronous Error Handling

    Utilize a centralized error-handling middleware to manage errors in one place.

    8. Environment Variable Management

    Use environment variables for more configuration options, such as cache duration or allowed origins, making it easier to change configurations without altering the code.

    9. Static Response Handling

    Use a middleware for handling static responses or messages instead of duplicating logic.

    10. Compression Middleware

    Add compression middleware to reduce the size of the response bodies, which can improve performance, especially for larger responses.

    11. Timeout Handling on Requests

    Handle timeouts for the requests made to the target URLs and provide appropriate error responses.

    12. Security Improvements

    Implement security best practices, such as Helmet for setting HTTP headers, which can help protect against well-known vulnerabilities.

    13. Logging Configuration

    Improve logging with different levels (e.g., info, error) using a logging library like winston, which provides more control over logging output.

    14. Graceful Shutdown

    Implement graceful shutdown logic to handle server termination more smoothly, especially during deployment.

    15. Monitoring and Metrics

    Integrate monitoring tools like Prometheus or an APM tool for better insights into the application’s performance and resource usage.

    16. Response Schema Validation

    Use libraries like Joi or Ajv to validate responses sent back to the client, ensuring they conform to expected formats.
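
    The actual proxy code is not shown in the thread. As a dependency-free illustration of what the rate limiting and URL validation items do conceptually, here is a hand-rolled sketch. It is a stand-in for express-rate-limit and Joi, not their APIs, and every name and limit in it is an assumption.

    ```typescript
    // Accept only absolute http(s) URLs of a sane length.
    function isValidTargetUrl(raw: unknown): boolean {
      if (typeof raw !== "string" || raw.length === 0 || raw.length > 2048) return false;
      try {
        const parsed = new URL(raw);
        return parsed.protocol === "http:" || parsed.protocol === "https:";
      } catch {
        return false;
      }
    }

    interface WindowEntry {
      windowStart: number;
      count: number;
    }

    // Fixed-window limiter: at most `maxRequests` per `windowMs` for each client IP.
    class FixedWindowLimiter {
      private readonly maxRequests: number;
      private readonly windowMs: number;
      private readonly hits = new Map<string, WindowEntry>();

      constructor(maxRequests: number, windowMs: number) {
        this.maxRequests = maxRequests;
        this.windowMs = windowMs;
      }

      // `now` is injectable so the behaviour can be checked without waiting.
      allow(clientIp: string, now: number = Date.now()): boolean {
        const entry = this.hits.get(clientIp);
        if (entry === undefined || now - entry.windowStart >= this.windowMs) {
          this.hits.set(clientIp, { windowStart: now, count: 1 });
          return true;
        }
        entry.count += 1;
        return entry.count <= this.maxRequests;
      }
    }
    ```

    In a real Express app these checks would normally run as middleware before the request reaches the scraping logic, so invalid or excessive requests never trigger an outbound fetch.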

    Again, this new code is running here in test for a few weeks.

  • wowww

    Very good work my friend!!

  • @phenomlab the best of the best, great work Mark 👏🏻👏🏻👏🏻👏🏻.

  • Seeing as not every site on the planet has relevant CORS headers that permit data scraping, I thought I’d make this a bit more obvious in the response. The link is still rendered, but using the image below if the remote site refuses to respond to the request

    image.png

    Not entirely sold on the image yet - likely will change it, but you get the idea 🙂 It’s more along the lines of graceful failure rather than a URL that simply does nothing.
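
    One way to express this kind of graceful failure is sketched below: the metadata fetch is wrapped in a timeout, and any refusal or timeout yields a fallback card instead of a dead link. The fetcher is injected, and `BLOCKED_IMAGE`, the default timeout, and all function names are assumptions rather than the code running on the site.

    ```typescript
    interface LinkPreview {
      url: string;
      title: string;
      image: string;
      failed: boolean;
    }

    type MetaFetcher = (url: string) => Promise<{ title: string; image: string }>;

    const BLOCKED_IMAGE = "/images/blocked.webp"; // hypothetical fallback image

    // Reject if `work` has not settled within `ms` milliseconds.
    function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
      return new Promise<T>((resolve, reject) => {
        const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
        work.then(
          (value) => { clearTimeout(timer); resolve(value); },
          (error) => { clearTimeout(timer); reject(error); }
        );
      });
    }

    // Return the real preview, or a fallback card when the remote site
    // refuses or is too slow. Assumes `url` is already a valid absolute URL.
    async function previewOrFallback(url: string, fetchMeta: MetaFetcher, timeoutMs: number = 5000): Promise<LinkPreview> {
      try {
        const meta = await withTimeout(fetchMeta(url), timeoutMs);
        return { url, title: meta.title, image: meta.image, failed: false };
      } catch {
        return { url, title: new URL(url).hostname, image: BLOCKED_IMAGE, failed: true };
      }
    }
    ```

    The link itself stays usable either way. Only the card content changes.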

  • @phenomlab I love that image and think it is perfect! LOL

