OGProxy follow-up: second memory leak found & fixed
Context
After this morning’s fixes (download limit, cache TTL, systemd MemoryMax), the server stayed up, but during the afternoon OGProxy slowly climbed to 464 MB RSS with all 4 GB of swap consumed. The systemd MemoryMax=512M guard rail did its job (it capped OGProxy instead of letting it take the whole box down like before), which bought time to diagnose calmly. This was a second, separate leak, slower than the first.
Root cause
The logs showed the smoking gun:
MaxListenersExceededWarning: Possible EventEmitter memory leak detected.
22 terminated listeners added to [Fetch]. MaxListeners is 21.
Stack: ogs 6.1.0 → undici 5.22.1 on Node 24. ogs v6 implements its timeout option via an AbortSignal passed to undici. With this version combo, when a request is aborted by that internal timeout, the abort listener attached to the Fetch object is not removed. Every timed-out request leaks one listener, and they accumulate in memory.
Trigger: a 10-day-old forum post listing ~10 store.ubisoft.com links. Opening that topic fires ~10 previews in parallel, all hitting the timeout, each leaking a listener. Repeated views over the day pushed it to 464 MB + full swap.
There was also a vicious circle: as the process bloated, its own outbound fetches got slow enough to time out, which created more timeouts, which leaked more listeners. That explains the flood of Connect Timeout Error entries in the afternoon logs: they were a symptom of the leak, not an external block. Once restarted fresh, those same Ubisoft URLs returned success: true in ~2.4 s.
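To make the mechanism concrete, here is a minimal sketch of how a listener leaks when an early-exit path skips cleanup. It is a simplified model, not undici's actual code: the emitter, the 'terminated' handler and both function names are stand-ins chosen for illustration.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical stand-in for a long-lived emitter (like the [Fetch] object in the warning).
const emitter = new EventEmitter();

// Leaky pattern: the timeout path returns before the listener is removed.
function leakyRequest(timedOut) {
  const onTerminated = () => {};
  emitter.on('terminated', onTerminated);
  if (timedOut) return; // early exit skips the cleanup below
  emitter.off('terminated', onTerminated);
}

// Safe pattern: cleanup sits in finally, so it runs on every exit path.
function cleanRequest(timedOut) {
  const onTerminated = () => {};
  emitter.on('terminated', onTerminated);
  try {
    if (timedOut) return;
  } finally {
    emitter.off('terminated', onTerminated);
  }
}

// Ten timed-out requests, like the ~10 parallel previews of that topic.
for (let i = 0; i < 10; i++) leakyRequest(true);
const leaked = emitter.listenerCount('terminated');

emitter.removeAllListeners('terminated');
for (let i = 0; i < 10; i++) cleanRequest(true);
const afterClean = emitter.listenerCount('terminated');

console.log({ leaked, afterClean });
```

In this model every timed-out call through the leaky path leaves one listener behind, while the finally-based version leaves none.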
Fix
Stop using ogs’s internal timeout option (the leaking path). Instead, manage the timeout with our own AbortController + setTimeout, pass the signal via fetchOptions, and always call clearTimeout() in a finally block. That cancels the pending timer on every exit path (success, failure, or timeout), so nothing outlives the request and the controller, its signal, and its abort listener can be garbage-collected. Also raised EventEmitter.defaultMaxListeners to 50 as a safety net for legitimate concurrency bursts (like that 10-link post).
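The same pattern can be pulled out into a small reusable helper. This is only a sketch: withTimeout and fakeFetch are hypothetical names, and fakeFetch is a stand-in for a real network call so the example runs without any network access.

```javascript
// Hypothetical helper: runs task(signal) and aborts it after `ms` milliseconds.
async function withTimeout(task, ms) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await task(controller.signal);
  } finally {
    // Runs on success, failure and timeout alike.
    clearTimeout(timer);
  }
}

// Stand-in for a slow fetch: resolves after `delay` ms unless the signal aborts first.
function fakeFetch(delay, signal) {
  return new Promise((resolve, reject) => {
    const done = setTimeout(() => {
      signal.removeEventListener('abort', onAbort);
      resolve('ok');
    }, delay);
    const onAbort = () => {
      clearTimeout(done);
      reject(new Error('aborted'));
    };
    signal.addEventListener('abort', onAbort, { once: true });
  });
}

// Finishes well inside its 200 ms budget.
const fast = withTimeout((signal) => fakeFetch(10, signal), 200);
// Exceeds its 50 ms budget, so the controller aborts it.
const slow = withTimeout((signal) => fakeFetch(500, signal), 50).catch((e) => e.message);
```

Because the timer is cleared in finally, a request that finishes early does not leave a pending timeout holding a reference to its controller.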
Verified: with our own signal aborting at 3 s, a Ubisoft URL completed in 2.4 s (signal is respected by ogs 6.1). After deploy, no more MaxListenersExceededWarning, no more cascade timeouts, and memory now oscillates (217 MB under load → back down to 91 MB at rest) instead of climbing and staying climbed.
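For anyone who wants to watch the "oscillates instead of climbing" behaviour over the next 24 to 48 h, a periodic RSS log line is enough. The sketch below is an assumption about how one might do it, not something in the deployed server.js; toMegabytes, rssLine and startMemoryLog are hypothetical names.

```javascript
// Convert a byte count to whole megabytes (1 MB = 1024 * 1024 bytes here).
function toMegabytes(bytes) {
  return Math.round(bytes / (1024 * 1024));
}

// Hypothetical helper: one log line with the current resident set size and heap usage.
function rssLine(now = new Date()) {
  const { rss, heapUsed } = process.memoryUsage();
  return `${now.toISOString()} rss=${toMegabytes(rss)}MB heapUsed=${toMegabytes(heapUsed)}MB`;
}

// Log once a minute by default; unref() so this timer never keeps the process alive by itself.
function startMemoryLog(intervalMs = 60_000) {
  return setInterval(() => console.log(rssLine()), intervalMs).unref();
}
```

Calling startMemoryLog() once at startup would put a memory line in the journal every minute, which makes a slow climb easy to spot next to the request logs.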
Note on RuntimeMaxSec
The existing RuntimeMaxSec=86400 (forced daily restart) was almost certainly an earlier band-aid masking exactly this leak. Now that the cause is fixed, it can be removed once stability is confirmed over 24–48 h, but it’s harmless to keep for now.
Only server.js changed (client ACP + systemd unit unchanged)
const express = require('express');
const ogs = require('open-graph-scraper');
const cors = require('cors');
const { URL } = require('url');
const cache = require('memory-cache');
const dns = require('dns').promises;
const net = require('net');
// Raise the listener ceiling as a safety net against transient concurrency spikes
require('events').EventEmitter.defaultMaxListeners = 50;
const app = express();
const port = 2000;
// API key from environment, fallback to inline value for compatibility
const apiKey = process.env.OGPROXY_API_KEY || 'YOUR_API_KEY_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';
// --- Limits / safeguards ---
const REQUEST_TIMEOUT = 15000; // 15s max per fetch
const MAX_CONTENT_BYTES = 5 * 1024 * 1024; // 5 MB max downloaded page
const CACHE_TTL_MS = 60 * 60 * 1000; // success cache: 1h
const FAIL_CACHE_TTL_MS = 10 * 60 * 1000; // negative cache: 10 min
const CACHE_MAX_ENTRIES = 1000; // max cached entries
const MAX_REDIRECTS = 3; // cap redirect hops
// Returns true if an IP string is private / loopback / link-local / reserved
function isBlockedIp(ip) {
  if (!ip) return true;
  if (net.isIPv4(ip)) {
    const p = ip.split('.').map(Number);
    if (p[0] === 10) return true;
    if (p[0] === 127) return true;
    if (p[0] === 0) return true;
    if (p[0] === 169 && p[1] === 254) return true; // link-local / cloud metadata
    if (p[0] === 192 && p[1] === 168) return true;
    if (p[0] === 172 && p[1] >= 16 && p[1] <= 31) return true;
    if (p[0] === 100 && p[1] >= 64 && p[1] <= 127) return true; // CGNAT
    return false;
  }
  if (net.isIPv6(ip)) {
    const v = ip.toLowerCase();
    if (v === '::1' || v === '::') return true; // loopback / unspecified
    if (v.startsWith('fc') || v.startsWith('fd')) return true; // unique local
    if (v.startsWith('fe80')) return true; // link-local
    // IPv4-mapped: check the trailing dotted IPv4 part; the hex form
    // (e.g. ::ffff:7f00:1) is not a valid IPv4 string and ends up blocked by default
    if (v.startsWith('::ffff:')) return isBlockedIp(v.split(':').pop());
    return false;
  }
  return true; // not a valid IP -> block by default
}
// Static hostname guard (fast reject before any DNS work)
function isBlockedHost(hostname) {
  if (!hostname) return true;
  // URL.hostname keeps the brackets around IPv6 literals ("[::1]");
  // strip them so net.isIP() recognizes the address
  const h = hostname.toLowerCase().replace(/^\[|\]$/g, '');
  return (
    h === 'localhost' ||
    h.endsWith('.localhost') ||
    h.endsWith('.internal') ||
    h.endsWith('.local') ||
    (net.isIP(h) !== 0 && isBlockedIp(h)) // literal IP in URL
  );
}
// Resolve hostname and ensure no resolved IP is private (anti-SSRF via DNS)
async function resolvesToPublicIp(hostname) {
  try {
    const records = await dns.lookup(hostname, { all: true });
    if (!records || records.length === 0) return false;
    return records.every(r => !isBlockedIp(r.address));
  } catch (e) {
    return false; // DNS failure -> treat as unsafe
  }
}
app.use(cors({ origin: 'https://YOUR_DOMAIN.EXT' }));
app.get('/ogproxy', async (req, res) => {
  let { url } = req.query;
  const requestApiKey = req.headers['x-api-key'];
  if (requestApiKey !== apiKey) {
    return res.status(401).send('Unauthorized');
  }
  if (!url || typeof url !== 'string') {
    return res.status(400).send('Missing URL parameter');
  }
  if (!url.startsWith('http')) {
    try {
      url = new URL(url, `${req.protocol}://${req.get('host')}`).href;
    } catch (e) {
      return res.status(400).send('Invalid URL');
    }
  }
  // Parse + protocol check
  let parsedUrl;
  try {
    parsedUrl = new URL(url);
  } catch (e) {
    console.warn(`OGProxy reject [${url}]: invalid URL`);
    return res.status(400).send('Invalid URL');
  }
  if (!['http:', 'https:'].includes(parsedUrl.protocol)) {
    console.warn(`OGProxy reject [${url}]: invalid protocol`);
    return res.status(400).send('Invalid protocol');
  }
  // Static host guard
  if (isBlockedHost(parsedUrl.hostname)) {
    console.warn(`OGProxy reject [${url}]: forbidden host (static guard)`);
    return res.status(403).send('Forbidden host');
  }
  // Cache hit (success OR negative), checked before DNS to stay fast
  const cachedResult = cache.get(url);
  if (cachedResult) {
    if (cachedResult.__ogproxyFail === true) {
      return res.status(500).send('Error scraping Open Graph data (cached)');
    }
    return res.json(cachedResult);
  }
  // DNS-based SSRF guard: make sure the hostname doesn't resolve to a private IP
  if (!(await resolvesToPublicIp(parsedUrl.hostname))) {
    console.warn(`OGProxy reject [${url}]: resolves to private IP or DNS fail (SSRF guard)`);
    cache.put(url, { __ogproxyFail: true }, FAIL_CACHE_TTL_MS);
    return res.status(403).send('Forbidden host');
  }
  // Enforce cache cap before inserting a new entry
  if (cache.keys().length >= CACHE_MAX_ENTRIES) {
    cache.clear();
  }
  // Manage the timeout ourselves with an AbortController we clean up explicitly.
  // This avoids the listener leak from ogs/undici's internal `timeout` option
  // (ogs 6.x + undici 5.x on Node 24 leaks an abort listener per timed-out request,
  // which slowly fills RAM/swap). clearTimeout() in finally cancels the pending
  // timer on every exit path.
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), REQUEST_TIMEOUT);
  const options = {
    url,
    downloadLimit: MAX_CONTENT_BYTES,
    fetchOptions: {
      signal: controller.signal,
      redirect: 'follow',
      // Note: `follow` is a node-fetch option, not part of the standard fetch
      // init that undici implements, so this cap is probably ignored here.
      follow: MAX_REDIRECTS,
      headers: {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
        'Accept-Language': 'fr-FR,fr;q=0.9,en;q=0.8',
      },
    },
  };
  try {
    const results = await ogs(options);
    cache.put(url, results, CACHE_TTL_MS);
    return res.json(results);
  } catch (error) {
    const reason =
      (error && error.result && error.result.error) ||
      (error && error.message) ||
      'unknown';
    const status =
      (error && error.response && error.response.status) || 'n/a';
    console.error(`OGProxy fail [${url}]: ${reason} (HTTP ${status})`);
    cache.put(url, { __ogproxyFail: true }, FAIL_CACHE_TTL_MS);
    return res.status(500).send('Error scraping Open Graph data');
  } finally {
    // Always clear the timer so it never fires after the request has settled
    clearTimeout(timer);
  }
});
app.listen(port, () => {
  console.log(`OGProxy server listening on port ${port}`);
});
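For a quick manual check after deploying, the endpoint can be called from any Node script. The sketch below assumes the proxy is reachable at http://localhost:2000 (the port comes from server.js above, the host is an assumption); buildProxyRequest and fetchPreview are hypothetical helper names, and the global fetch assumes Node 18 or newer.

```javascript
// Assumed base address for illustration; adjust to your deployment.
const PROXY_BASE = 'http://localhost:2000';

// Build the URL and headers for a GET /ogproxy call.
function buildProxyRequest(targetUrl, apiKey, base = PROXY_BASE) {
  const endpoint = new URL('/ogproxy', base);
  // searchParams.set() percent-encodes the target URL for us.
  endpoint.searchParams.set('url', targetUrl);
  return { href: endpoint.href, headers: { 'x-api-key': apiKey } };
}

// Performs the request; needs the proxy to be running, so it is only defined here.
async function fetchPreview(targetUrl, apiKey) {
  const { href, headers } = buildProxyRequest(targetUrl, apiKey);
  const response = await fetch(href, { headers });
  if (!response.ok) throw new Error(`OGProxy returned HTTP ${response.status}`);
  return response.json();
}
```

Calling fetchPreview() twice with the same target should return the second answer from the cache, and a target that resolves to a private address should come back as HTTP 403.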
Possible upstream-clean alternative (optional)
Upgrading open-graph-scraper to its latest 6.x (which bundles a newer undici) may fix the listener cleanup at the source, letting you go back to the simpler built-in timeout option. Worth checking when convenient, but the AbortController approach above is robust regardless of the undici version, so there’s no rush.