Fetching fails with ECONNRESET error

vicious_mongoose · April 3, 2023, 12:01pm

I also have a problem that occurs very often. At first, the fetch works fine. But after the server has been running for some hours, one of the requests suddenly fails, and from then on it always gives an error:

Apr 03 11:14:01 <my-host> npm[204617]: FetchError: request to https://codegen.plasmic.app/api/v1/loader/code/versioned?cb=16&platform=nextjs&loaderVersion=9&projectId=<my-id>%409.8.0 failed, reason: read ECONNRESET
Apr 03 11:14:01 <my-host> npm[204617]:     at ClientRequest.<anonymous> (/home/<my-path>/node_modules/next/dist/compiled/node-fetch/index.js:1:65756)
Apr 03 11:14:01 <my-host> npm[204617]:     at ClientRequest.emit (node:events:513:28)
Apr 03 11:14:01 <my-host> npm[204617]:     at TLSSocket.socketErrorListener (node:_http_client:494:9)
Apr 03 11:14:01 <my-host> npm[204617]:     at TLSSocket.emit (node:events:513:28)
Apr 03 11:14:01 <my-host> npm[204617]:     at emitErrorNT (node:internal/streams/destroy:151:8)
Apr 03 11:14:01 <my-host> npm[204617]:     at emitErrorCloseNT (node:internal/streams/destroy:116:3)
Apr 03 11:14:01 <my-host> npm[204617]:     at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
Apr 03 11:14:01 <my-host> npm[204617]:   type: 'system',
Apr 03 11:14:01 <my-host> npm[204617]:   errno: 'ECONNRESET',
Apr 03 11:14:01 <my-host> npm[204617]:   code: 'ECONNRESET'
Apr 03 11:14:01 <my-host> npm[204617]: }

vicious_mongoose · April 3, 2023, 12:05pm

If I restart the Node.js server, it starts working again, until the next time, some hours later.

very_crab · April 3, 2023, 1:37pm

We have the same issue. Are you hosting on AWS?
Our Datadog catches it from time to time, and publishing doesn’t work when it happens.

very_crab · April 3, 2023, 1:38pm

Apologies for the tag, @chungwu. I guess it’s still happening.

vicious_mongoose · April 3, 2023, 1:38pm

Yes, I use AWS for hosting.

very_crab · April 3, 2023, 1:40pm

We’ve had it for 4-5 months now. We solved it by detecting the error and just restarting our service automatically.

My guess is that AWS loses its connection to Plasmic’s CloudFront and the IP changes. It really feels low-level.

vicious_mongoose · April 3, 2023, 2:18pm

Thanks for the info. How do you detect this? Do you use a tool, or just a handmade script?

vicious_mongoose · April 3, 2023, 2:57pm

Or have you changed the Node app? Do you know how we can detect it inside the application?

very_crab · April 3, 2023, 3:07pm

No, we catch it at a higher level inside AWS and act accordingly. Not sure what the script looks like, but it’s a self-healing server restart.

very_crab · April 3, 2023, 3:08pm

@able_stoat Any idea what our script looks like? Andres also has the disconnect issue.

able_stoat · April 3, 2023, 3:19pm

@vicious_mongoose our app runs in AWS ECS. When we hit that issue our app should exit, so ECS detects that the container failed and starts it again.
I haven’t had the opportunity to test what’s going on, but it could be something with DNS caching: the CloudFront IP can change, so if the app keeps trying to connect to the same IP after that change, we can get an RST.
We run a few similar containers on the same host, and the issue showed up on, for example, only one of them, so it wasn’t host- or network-related.

If you see it very often, you can try using tcpdump to check where it’s trying to connect, what is sending the RST, and what the current IPs are for the host [codegen.plasmic.app](http://codegen.plasmic.app)
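
To go with the tcpdump capture, here is a rough sketch of logging what the host currently resolves to from inside Node, so the address in a failing connection can be compared against recent DNS answers. `diffIps` and `watchHostIps` are hypothetical helper names, not part of Plasmic or Next.js, and the DNS-caching theory itself is still unconfirmed. It uses `dns.promises.resolve4`, which queries the DNS servers directly instead of going through the OS lookup path.

```javascript
const dns = require("node:dns");

// Compare two resolver answers and report which addresses appeared or vanished.
function diffIps(previous, current) {
  const prev = new Set(previous);
  const curr = new Set(current);
  return {
    added: current.filter((ip) => !prev.has(ip)),
    removed: previous.filter((ip) => !curr.has(ip)),
  };
}

// Hypothetical watcher: resolves `host` every `intervalMs` and logs any change.
function watchHostIps(host, intervalMs) {
  let last = [];
  const timer = setInterval(async () => {
    try {
      const ips = await dns.promises.resolve4(host);
      const { added, removed } = diffIps(last, ips);
      if (added.length || removed.length) {
        console.log(new Date().toISOString(), host, { added, removed });
      }
      last = ips;
    } catch (err) {
      console.error("DNS resolve failed:", err.code);
    }
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for this logger
  return timer;
}
```

Calling `watchHostIps("codegen.plasmic.app", 60000)` at startup would log once a minute at most. CDN answers tend to rotate, so changes by themselves are probably normal; the interesting case would be a reset on an address that no longer shows up in the answers.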

vicious_mongoose · April 4, 2023, 1:26pm

thanks! I’ll try to investigate it

vicious_mongoose · April 4, 2023, 1:27pm

In our case, when the fetch fails, the app does not exit. Instead, it keeps retrying the request every minute and logs an error. I don’t know a good way of detecting it.
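
One possible way to detect it inside the app, as a minimal sketch: the error in the log above carries `code: 'ECONNRESET'`, so the process can count consecutive resets and exit once a threshold is hit, letting whatever supervises it (ECS, systemd, etc.) restart it. `isConnReset`, `createResetWatchdog` and `withWatchdog` are hypothetical names, the threshold of 3 is arbitrary, and the `cause.code` check is an assumption to cover fetch implementations other than node-fetch. Where to hook it in depends on which call in your app ends up throwing the FetchError.

```javascript
const RESET_CODE = "ECONNRESET";

// node-fetch's FetchError exposes `code` directly (as in the log above);
// checking `cause.code` too is an assumption for other fetch implementations.
function isConnReset(err) {
  if (!err) return false;
  return (
    err.code === RESET_CODE ||
    (err.cause != null && err.cause.code === RESET_CODE)
  );
}

// Counts consecutive resets and calls `onTrip` once `threshold` is reached.
function createResetWatchdog(threshold, onTrip) {
  let consecutive = 0;
  return {
    success() {
      consecutive = 0; // any successful request clears the streak
    },
    failure(err) {
      if (!isConnReset(err)) return false;
      consecutive += 1;
      if (consecutive >= threshold) {
        onTrip(consecutive);
        return true;
      }
      return false;
    },
  };
}

// Runs an async function and reports its outcome to the watchdog.
async function withWatchdog(watchdog, fn) {
  try {
    const result = await fn();
    watchdog.success();
    return result;
  } catch (err) {
    watchdog.failure(err);
    throw err;
  }
}

// Example wiring: exit non-zero so the supervisor restarts the process.
const watchdog = createResetWatchdog(3, (n) => {
  console.error(`${n} consecutive ECONNRESET errors, exiting for restart`);
  process.exit(1);
});
```

Usage would be something like `await withWatchdog(watchdog, () => loadData())`, where `loadData` stands in for whatever function triggers the failing request. Exiting only after several resets in a row avoids restarting on a single transient error.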