Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Applying Unbound blacklist fails with error #6026

Closed
2 tasks done
haarp opened this issue Sep 16, 2022 · 15 comments
Closed
2 tasks done

Applying Unbound blacklist fails with error #6026

haarp opened this issue Sep 16, 2022 · 15 comments
Labels
support Community support

Comments

@haarp
Copy link

haarp commented Sep 16, 2022

Describe the bug

After adding a new entry in the Unbound blocklist whitelist (going from 2 to 3 elements) and pressing Download&Apply, it waits a while and then I am greeted with an error box stating

Error updating blocklists
An error occurred during script execution. Check the logs for details

Backend log shows

Timeout (120) executing : unbound dnsbl
Script action failed with Command ' /usr/local/opnsense/scripts/unbound/blocklists.py && /usr/local/opnsense/scripts/unbound/wrapper.py -b ' returned non-zero exit status 99. at Traceback (most recent call last): File "/usr/local/opnsense/service/modules/processhandler.py", line 482, in execute subprocess.check_call(script_command, env=self.config_environment, shell=True, File "/usr/local/lib/python3.9/subprocess.py", line 373, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command ' /usr/local/opnsense/scripts/unbound/blocklists.py && /usr/local/opnsense/scripts/unbound/wrapper.py -b ' returned non-zero exit status 99.

Subsequent attempts at pressing Download&Apply immediately return with the error box. I am currently unable to save at all.

To Reproduce

Steps to reproduce the behavior (I am not sure how reproducible it is on other installs):

  1. Have Unbound blocklist active with Blocksite.list Fraud,Malware,Phishing,Ransomware,Scam; EasyList; EasyPrivacy; WindowsSpyblocker (spy)'
  2. Have two elements in whitelist: foo.com, .foo.com
  3. Download&Apply
  4. Use for a while
  5. Later, add third element bar.com to whitelist
  6. Press Download&Apply

Expected behavior

Applies without trouble

Describe alternatives you considered

Rebooting the box perhaps? Not an option until at least the weekend. And I'd be workarounding, nox fixing things :)

Screenshots

Relevant log files

See above for log

Additional context

Add any other context about the problem here.

Environment

Software version used and hardware type if relevant, e.g.:

OPNsense 22.7.4-amd64
Intel(R) Celeron(R) J4125 CPU @ 2.00GHz (4 cores, 4 threads)

@AdSchellevis
Copy link
Member

Can you try to run /usr/local/opnsense/scripts/unbound/blocklists.py && /usr/local/opnsense/scripts/unbound/wrapper.py -b from a console to see if this produces a readable error?

@haarp
Copy link
Author

haarp commented Sep 16, 2022

OK, reboot didn't fix it.

Can you try to run /usr/local/opnsense/scripts/unbound/blocklists.py && /usr/local/opnsense/scripts/unbound/wrapper.py -b from a console to see if this produces a readable error?

Nope, no output unfortunately. Just the exit code.

/usr/local/opnsense/scripts/unbound/blocklists.py is the one exiting 99. Problems with locking?

    # check for a running download process, this may take a while so it's better to check...
    try:
        lck = open('/tmp/unbound-download_blocklists.tmp', 'w+')
        fcntl.flock(lck, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except IOError:
        # already running, exit status 99
        sys.exit(99)

@haarp
Copy link
Author

haarp commented Sep 16, 2022

I deleted the lockfile and tested the script again. It gets stuck somewhere and doesn't return.

After 2 minutes, the web interface gives up, but leaves the script running, which also keeps the lock file around, failing subsequent attempts.

@AdSchellevis AdSchellevis added the support Community support label Sep 16, 2022
@AdSchellevis
Copy link
Member

After 2 minutes, the web interface gives up, but leaves the script running, which also keeps the lock file around, failing subsequent attempts.

Could relate to connectivity issues, 2 minutes is just the max the frontend will wait (execution will continue, but restarts will exit while the first one is still busy).

@haarp
Copy link
Author

haarp commented Sep 16, 2022

Yes, I also suspect connectivity, along with long timeouts.

After 40 minutes, blocklists.py finally returns (if no other instance was running), but leaves /var/unbound/etc/dnsbl.conf untouched (i.e. didn't remove the newly whitelisted domain) :/

The script needs far lower timeouts, and should apply the whitelist even if some blocklists couldn't be downloaded. Lastly, the frontend's handling of this script can lead to a lot of confusion, esp. when it leaves an instance running, blocking further attempts to do anything from the web UI.

@AdSchellevis
Copy link
Member

It depends on the cause, at a first glance I don't expect there's much we can do, every request (list) has a timeout (120 seconds likely) and the script finishes when all are processed in sequential order. To remove items, it would need to download the items (whitelist is an exclude on the download operation), which apparently didn't happen. Fixing local connectivity is likely the best cause of action here.

Relevant code blocks:

for line in uri_reader(cnf['blocklists'][blocklist]):

if entry:
domain = entry.lower()
if whitelist_pattern.match(entry):
file_stats['skip'] += 1
else:
if domain_pattern.match(domain):
file_stats['blocklist'] += 1
blocklist_items.add(entry)
else:
file_stats['skip'] += 1

@kulikov-a
Copy link
Member

kulikov-a commented Sep 16, 2022

Hi!
@haarp

After 40 minutes, blocklists.py finally returns (if no other instance was running), but leaves /var/unbound/etc/dnsbl.conf untouched (i.e. didn't remove the newly whitelisted domain) :/

in this case there should be records in ubound log with blocklists download\parsing results and timing (and may be more useful info)?

@AdSchellevis i'm a bit concerned about request timeout value in blocklists.py. afaik Requests does not allow to control the total request\download time ( . It only controls the server response time and time between bytes (ref)
So i checked this with unresponsive server and indeed the script spends a lot of time only waiting for connection (of course this does not happen in more typical cases when the server name is not resolved or the server responds, but with an error code).
May be changing the timeout val at

to 5 sec makes sense?

@AdSchellevis
Copy link
Member

@kulikov-a I'm ok with a short time-out, a couple of seconds is usually enough indeed.

@haarp
Copy link
Author

haarp commented Sep 18, 2022

I've tracked down what caused the connectivity problems in the first place. IPv6 was not working correctly. OPNsense did have an IPv6, but couldn't establish connections.

In Reply to @kulikov-a

in this case there should be records in ubound log with blocklists download\parsing results and timing (and may be more useful info)?

Yes indeed. It shows that each blocklist took exactly 5 minutes to "download", which is unlikely. Presumably there's a timeout after 300s, and the actual download never took place. It also doesn't seem to detect download failure correctly. It probably aborted somewhere after downloading, because the newly-added third whitelist item never got added either.

<165>1 2022-09-16T13:08:37+02:00 OPNsense.domain unbound 87197 - [meta sequenceId="1"] blocklist download : exclude domains matching .*localhost$|sentry.io|.sentry.io|analytics.google.com|^(?![a-zA-Z_\d]).*
<165>1 2022-09-16T13:13:39+02:00 OPNsense.domain unbound 87197 - [meta sequenceId="1"] blocklist download https://blocklistproject.github.io/Lists/alt-version/fraud-nl.txt (lines: 196117 exclude: 1 block: 196091)
<165>1 2022-09-16T13:18:43+02:00 OPNsense.domain unbound 87197 - [meta sequenceId="1"] blocklist download https://blocklistproject.github.io/Lists/alt-version/malware-nl.txt (lines: 435457 exclude: 32 block: 435284)
<165>1 2022-09-16T13:23:45+02:00 OPNsense.domain unbound 87197 - [meta sequenceId="1"] blocklist download https://blocklistproject.github.io/Lists/alt-version/phishing-nl.txt (lines: 190316 exclude: 8 block: 190237)
<165>1 2022-09-16T13:28:45+02:00 OPNsense.domain unbound 87197 - [meta sequenceId="1"] blocklist download https://blocklistproject.github.io/Lists/alt-version/ransomware-nl.txt (lines: 1919 exclude: 0 block: 1904)
<165>1 2022-09-16T13:33:45+02:00 OPNsense.domain unbound 87197 - [meta sequenceId="1"] blocklist download https://blocklistproject.github.io/Lists/alt-version/scam-nl.txt (lines: 1283 exclude: 0 block: 1265)
<165>1 2022-09-16T13:38:45+02:00 OPNsense.domain unbound 87197 - [meta sequenceId="1"] blocklist download https://justdomains.github.io/blocklists/lists/easylist-justdomains.txt (lines: 21707 exclude: 3 block: 21704)
<165>1 2022-09-16T13:43:46+02:00 OPNsense.domain unbound 87197 - [meta sequenceId="1"] blocklist download https://justdomains.github.io/blocklists/lists/easyprivacy-justdomains.txt (lines: 20246 exclude: 1 block: 20245)
<165>1 2022-09-16T13:48:46+02:00 OPNsense.domain unbound 87197 - [meta sequenceId="1"] blocklist download https://raw.githubusercontent.com/crazy-max/WindowsSpyBlocker/master/data/hosts/spy.txt (lines: 353 exclude: 0 block: 347)
<165>1 2022-09-16T13:48:47+02:00 OPNsense.domain unbound 87197 - [meta sequenceId="2"] blocklist download done in 2410.23 seconds (851479 records)

@kulikov-a
Copy link
Member

kulikov-a commented Sep 19, 2022

@haarp blocklist script sends this strings to log (with blocklist url and parsing stats) only after receiving and processing the file. therefore, it seems to me that the files were still downloaded and processed (although, of course, it would be extremely interesting to understand the reason for such a large and, most importantly, the same time for downloading and processing files. May be there any traffic limiting effect?)

because the newly-added third whitelist item never got added either

analytics.google.com ?

@kulikov-a
Copy link
Member

@haarp

After 40 minutes, blocklists.py finally returns (if no other instance was running), but leaves /var/unbound/etc/dnsbl.conf untouched (i.e. didn't remove the newly whitelisted domain) :/

it should be in /usr/local/etc/unbound.opnsense.d/dnsbl.conf first and will be copied to /var/unbound/etc only on template reload imho. can you check /usr/local/etc/unbound.opnsense.d/dnsbl.conf for analytics.google.com?

@haarp
Copy link
Author

haarp commented Sep 20, 2022

it would be extremely interesting to understand the reason for such a large and, most importantly, the same time for downloading and processing files.

Indeed. The only thing I can think of is a timeout of 300 seconds in python's requests before it falls back to IPv4 and then successfully downloads within 1 second.

it should be in /usr/local/etc/unbound.opnsense.d/dnsbl.conf first and will be copied to /var/unbound/etc only on template reload imho. can you check /usr/local/etc/unbound.opnsense.d/dnsbl.conf for analytics.google.com?

Ah, I see. /usr/local/etc/unbound.opnsense.d/dnsbl.conf is used for DNSBL reloads exclusively. The only time /var/unbound/etc/dnsbl.conf is written is when unbound is to be restarted (due to config changes or otherwise)

The third whitelist entry (yup, it was analytics.google.com), has been successfully whitelisted after I fixed IPv6. Unfortunately I don't understand what I did to fix it, so I can't unfix it to test what prevented the whitelisting from working :)

@kulikov-a
Copy link
Member

@haarp so we agree that one of the reasons for the extremely slow blocklist download was incorrect IPv6 working on the local host?
possible explanation can be that requests (and underlying urllib (and underlying sockets)) prefer ipv6 (if present). and it may lead to double timeouts (excellent investigation)
don't have the suitable environment to check, but let's hope that the reduced timeout value (#6030) will work in this case as well

so I can't unfix it to test what prevented the whitelisting from working :)

if you can break it again and test the behavior with the new timeout value, that would be great. but I don't think it's necessary :)

@haarp
Copy link
Author

haarp commented Sep 20, 2022

so we agree that one of the reasons for the extremely slow blocklist download was incorrect IPv6 working on the local host?

Correct!

Thank you very much for yours and @AdSchellevis super quick response and fix. It really confirms why I should've switched over from pfSense earlier :)

Feel free to close at your discretion. If my IPv6 ever breaks again, I'll add a comment here and reopen.

@AdSchellevis
Copy link
Member

closed with #6030

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
support Community support
Development

No branches or pull requests

3 participants