geth consumes all ram; drops blocks, peers #20963
Comments
Could you please provide the command you use to run Geth? The reason for the resync is that the recent state is kept in memory for garbage collection. This, however, means that a crash loses all of it, so when you restart you need to reprocess the lost blocks.
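For context, a rough sketch of the flags that control this trade-off (illustrative values only, not the reporter's actual configuration):

```bash
# Illustrative only, not the reporter's command. --cache bounds geth's
# in-memory allowance (MB); the recent state held there for garbage
# collection is lost on a crash, so blocks must be reprocessed on restart.
geth --cache 4096

# Assumption: if extra disk usage is acceptable, archive mode persists every
# state to disk instead of keeping recent tries only in memory.
geth --gcmode archive
```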
Could you also provide a memory chart? It would be nice to see the consumption.
The command we are using to run Geth:
Memory / system charts for the node in question: FYI, we are observing this exact same behavior on a node that is 100% idle, with absolutely zero requests being sent to it. In case you are wondering why the memory chart has a bunch of sudden drops: we also have a bash script that checks total memory in use and restarts geth whenever usage exceeds 80%. This was put in place as a band-aid until we get to the root cause of the issue.
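For reference, a minimal sketch of the kind of watchdog described above (not the actual script; the systemd service name "geth" and log path are assumptions):

```bash
#!/usr/bin/env bash
# Sketch of a memory watchdog: restart geth when system memory use passes 80%.
THRESHOLD=80

# Percentage of total memory currently in use, taken from free(1).
USED_PCT=$(free | awk '/^Mem:/ {printf "%d", $3/$2*100}')

if [ "$USED_PCT" -ge "$THRESHOLD" ]; then
    echo "$(date -Is) memory at ${USED_PCT}%, restarting geth" >> /var/log/geth-watchdog.log
    systemctl restart geth
fi
```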
Which version of Go did you build it with? Go 1.14.0 and 1.14.1 had a GC bug (golang/go#37525) that caused Geth's memory use to explode. It was fixed in 1.14.2.
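One way to check which Go toolchain a geth binary was built with (the exact output format may differ between releases):

```bash
# Prints the embedded build info, including a "Go Version: ..." line.
geth version | grep -i 'go version'
```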
Another thing that could help: when your node enters this strange high-memory state, dangerously close to being killed, please run a
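The exact command requested above was cut off in the thread. As an assumption only, one common way to capture memory diagnostics from a running node is via the built-in pprof server:

```bash
# Illustration only; not necessarily the command that was asked for above.
# --pprof exposes a pprof HTTP server (127.0.0.1:6060 by default).
geth --pprof

# While memory use is high, take a heap snapshot for offline analysis:
go tool pprof http://127.0.0.1:6060/debug/pprof/heap
```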
Looks like we are running Go 1.13.8:
Thank you for the tip! I'll definitely get back to you with the
By adding
We are running into a similar memory-leak issue, first with 1.9.12; we upgraded to 1.9.13 but encountered the same problem on multiple production servers. No other changes were made.
@mtbitcoin What flags are you running with?
@karalabe I am running further tests, but it appears it might be related to someone intentionally DoSing the nodes with eth_call or gas-estimate requests; applying rpc.gascap appears to have helped.
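For anyone else hitting this, a sketch of how the cap is applied (the value below is illustrative, not a recommendation):

```bash
# Caps the gas available to eth_call / eth_estimateGas so expensive
# synthetic calls cannot be used to exhaust the node's memory.
geth --rpc.gascap 25000000
```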
As requested - https://pastebin.com/1mGvk4SQ
Thanks for the heads-up! Will definitely give it a try.
Thanks! I'll try that as well :)
Péter, I replied with the requested information a couple of weeks back; can you please review it and let us know if anything stands out? My organization and I were hoping this would be fixed in Please let me know if you require any additional information from our end. Thanks!
We've been looking into this today and can't find any obvious culprit. Does this issue still appear with the most recent version? If it does, it would also be great to get a new stack trace, since the code lines have changed.
Please check with the latest Geth and the latest Go. What would really help is to minimize the moving components: let's try a 16GB machine, idling without RPC calls, just syncing with the network. That is what we run all the time and it should not go OOM. If that works stably, let's add RPC into the mix. It would really help if you could tell us which API calls you are making; there are very easy ways to make a node go boom with the "correct" RPC requests.
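A sketch of such a stripped-down setup, with illustrative flags (fast sync was the default at the time; no HTTP/WS RPC enabled):

```bash
# Minimal moving parts: just sync with the network, no RPC exposed.
geth --syncmode fast --cache 4096
```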
This is no longer an issue for us (archive and default-sync nodes) with the latest version. Edit: the gascap helped on our end (from what we could see, someone figured out they could DoS the nodes and was sending in calls with high gas limits).
I've got the same issue with the following set-up: Geth version: Geth eats up all my RAM after running for a long time. I started with the line: Afterwards I tried the solution presented by @mtbitcoin: However, a gascap of 500000 seemed to be too high, so I changed it to 300000. @mtbitcoin
Also interested in @mtbitcoin's reply to the above question.
Should there be a gas limit? Interested in following this issue.
Hi, we are running several Geth nodes.
Every week, at least one node which is synced to tip 'loses' 4-10k blocks and begins re-syncing them. At the same time, all peers are dropped/disconnected.
We are seeing OOM errors around the time this happens.
RAM usage creeps up to 100%, then the blocks & peers are dropped.
We upgraded one node from 8GB to 16GB, and it slowly consumed the additional memory until the issue happened again.
What could this be related to, where should we look, and which flags could we modify to potentially resolve this issue?
System information
Geth version: 1.9.12-stable
OS & Version: Ubuntu 16.04.6 LTS
Expected behaviour
node stays in-sync with the network
Actual behaviour
eats up available RAM (8-16GB), drops 4-10k blocks, and begins re-syncing them
also drops peers
Steps to reproduce the behaviour
unclear; the node is running and answering RPC requests via WebSocket. Sometimes the issue coincides with a sudden influx of requests to the node, but not always.
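For completeness, a typical way such a WebSocket endpoint is exposed on 1.9.x (illustrative flags, not this node's actual command; newer releases spell these as --ws.* instead):

```bash
# Illustration only: expose the WebSocket RPC locally with a limited API set.
geth --ws --wsaddr 127.0.0.1 --wsport 8546 --wsapi eth,net,web3
```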