-
Notifications
You must be signed in to change notification settings - Fork 29
Implement 1459203 (remove NSSM), 1436274 (use eventlog) #214
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks awesome, thanks so much for all your hard work, especially your care and attention to detail, ensuring we have some solid tests, and configurable options where they make sense, over hardcoded values.
@@ -225,7 +225,8 @@ func execute(t *testing.T, expectedExitCode ExitCode) { | |||
if err != nil { | |||
t.Fatalf("Test setup failure - could not write to tasks-resolved-count.txt file: %v", err) | |||
} | |||
exitCode := RunWorker() | |||
interruptChan := make(chan os.Signal, 1) | |||
exitCode := RunWorker(interruptChan) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this allow us to Ctrl-C
out of tests now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it does, yes. Were we not able to before?
I changed this to match the signature of the function, which I had changed before running the tests (yikes!) :P
simple_docker.go
Outdated
func remove(arguments map[string]interface{}) (err error) { | ||
return nil | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the remove
target only needed on Windows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I added this to match install()
but I don't think either are needed now that we're doing things in platformTargets()
usage_windows.go
Outdated
installed under. [default: Generic Worker]` | ||
installed under. [default: Generic Worker] | ||
--working-directory DIRECTORY The working directory the Generic Worker | ||
service will use. [default: C:\Windows\system32]` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the worker writes files to the working directory, can we change the default to e.g. C:\generic-worker
? Perhaps the correct drive letter can be determined from the system, in case the operating system isn't installed on C:.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would the behavior be if that directory doesn't exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm going to change the behavior here.
If no --working-directory
is specified, we default to the parent directory of the generic-worker executable. If one is specified, we use it.
} | ||
log.Printf("Successfully configured eventlog source %q.", name) | ||
|
||
// start service manually in order to fail fast |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we create a machine image with generic-worker on it, we typically want to install it without starting it up (since it is just the machine image that real workers will use, and is not a real worker itself), so I think we don't want to start it up here, otherwise it would run on the machine which is not a worker, but it used to create the machine image for the actual workers.
var logWriter = io.MultiWriter(ioutil.Discard) | ||
|
||
func init() { | ||
// TODO make this work with --working-directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is done now, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This always changes the working directiry to the parent directory of the generic worker binary.
multiuser_service_windows.go
Outdated
@@ -282,6 +284,8 @@ func removeService(name string) error { | |||
return fmt.Errorf("service %s is not installed", name) | |||
} | |||
defer s.Close() | |||
// throw away this error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a note to say why we ignore the error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this fails it's essentially because something went wrong interacting with the service, i.e. it's not running, it didn't respond, etc. We don't care about an error in this scenario, we're just trying to delete it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect. Could you include that as a code comment?
// but not actually removed until reboot | ||
|
||
// TODO | ||
cleanupService(t, name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What made you decide to move cleanupService
out of the defer block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commented out blocks below, if uncommented, would happen before a deferred call to cleanupService()
multiuser_windows.go
Outdated
os.Chdir(cwd) | ||
// default to generic-worker executable parent dir | ||
if cwd == "" { | ||
dir := path.Dir(os.Args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dir := path.Dir(os.Args[0]) | |
dir := filepath.Dir(os.Args[0]) |
@@ -62,7 +62,7 @@ func platformCommandLineParameters() string { | |||
--service-name SERVICE-NAME The name that the Windows service should be | |||
installed under. [default: Generic Worker] | |||
--working-directory DIRECTORY The working directory the Generic Worker | |||
service will use. [default: C:\Windows\system32]` | |||
service will use.` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we determine the system default at runtime and include it in the usage text?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think defaulting to the parent directory of the generic worker binary is a reasonable default. It's also trivial to implement in a cross platform manner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine, but we should describe what the default is in the docs.
012d1ce
to
ef04a9b
Compare
build.cmd: use multiuser build tag keep existing gitattributes directive
logging working w/ multiwriter!
add TestRunServiceWithoutEventlogSource, TestRunServiceWithBrokenWriter, use SKIP_ADMINISTRATOR_TESTS env var to skip service install/remove tests
…s locally rewrite runservice to return ExitCodes properly
use io.MultiWriter() instead of ioutil.Discard
20b84d8
to
518722b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Miles,
I had a play with the Windows binaries today, and found some issues:
-
The
generic-worker
creates a log file even when you rungeneric-worker --help
. The Windows service should direct stderr/stdout to a log file, but when generic-worker is invoked any other way, it should write to stdout/stderr without redirecting them somewhere else. -
The
remove service
subcommand is not documented in usage docs. Please include some help text. -
Same for
run-service
. -
Same for the crash file.
-
When I had a crash, the crash file content said
Error loading configuration file %v
. I suspect this might just be alog.Print
somewhere that should be alog.Printf
. -
I'm not sure we have a use case for calling
run-service
from the command line with arguments (such as--config
). My understanding was that runninggeneric-worker install service ...
would install the worker as a Windows Service, and from that point on, the service is configured, so doesn't need options to be passed to it. The user would just start it or stop it. I thought runninggeneric-worker run-service
would be equivalent to runningnet start "Generic Worker"
.I think we can get away entirely without having a
run-service
subcommand, becausenet start "Generic Worker"
is already available, and then the Windows service would invoke therun
subcommand, running as LocalSystem, and redirect stdout/stderr to the log file. Therun
subcommand can test to see whether it is being run interactively or not, if it is necessary for switching the logging mode, etc. Alternatively, you could add a--log LOG-FILE
option torun
that gets passed when invoked as a Windows service. -
generic-worker run
isn't logging to stdout/stderr when run interactively. -
run-service
is named inconsistently withremove service
andinstall service
which use separate words (rather than a hyphen), but this issue should disappear when we removerun-service
and adaptrun
instead (see point 6). -
generic-worker install service
is also directed to a log rather than stderr/stdout. -
I get a panic running the install step as Administrator. If the panic is due to missing files or anything else, we should try to report that and avoid crashing the worker:
C:\Users\Peter James Moore\Desktop>.\generic-worker-windows-386.exe install service
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x8 pc=0x7a023c]
goroutine 1 [running]:
github.com/taskcluster/generic-worker/vendor/golang.org/x/sys/windows/svc/mgr.(*Service).Start(0x0, 0x1265ef60, 0x3, 0x4, 0x1, 0x0)
Z:/task_1563291642/gopath1.10.8/src/github.com/taskcluster/generic-worker/vendor/golang.org/x/sys/windows/svc/mgr/service.go:45 +0xcc
main.deployService(0x126378a9, 0x15, 0x124cd32b, 0xe, 0x126ac190, 0x41, 0xaf000000, 0x0, 0x0)
Z:/task_1563291642/gopath1.10.8/src/github.com/taskcluster/generic-worker/multiuser_service_windows.go:206 +0x334
main.install(0x1265ef40, 0x1265ef40, 0x8f7dda)
Z:/task_1563291642/gopath1.10.8/src/github.com/taskcluster/generic-worker/multiuser_windows.go:242 +0x25a
main.platformTargets(0x1265ef40, 0x1265ef40)
Z:/task_1563291642/gopath1.10.8/src/github.com/taskcluster/generic-worker/multiuser_windows.go:428 +0x6db
main.main()
Z:/task_1563291642/gopath1.10.8/src/github.com/taskcluster/generic-worker/main.go:123 +0x1b5
-
If the
generic-worker
binary is being installed to a different location than it currently resides (e.g. if it is downloaded toDownloads
orDesktop
etc) then the binary should probably be copied to the target destination folder as part of the installation process.Admittedly we didn't do this before, so this is minor scope creep. I think it would make sense to tackle this at the same time though. If we don't do this, we should at least highlight in the usage docs, that the Windows service will be configured to use the generic-worker binary from its current location, so the generic-worker binary should be moved to its desired installation location before running the
install
subcommand.
Thanks for all your hard work so far on this so far. It is looking really good!
@milescrabill This has kind of bitrotted. Should we close it, or do you want to return to it? |
There's a ton here. Too much!
Continued from #210
adds dependency on
gopkg.in/natefinch/lumberjack.v2
for logfile managementremoves NSSM and modifies docs, userdata in
worker_types
, etc. as suchchanges the way the main
RunWorker
function is called for flexibility, it now accepts a signal chansplit out config handling and exit code handling into functions
adds debug logic for panic scenario exits including writing out a crash log in scenarios where other logging might be failing
support for installing/removing the generic worker service using native go
support for specifying configuration location at install time
support for windows event log when run as a service
logging to logfile by default in windows
remove need for a bat script wrapper to change the service working directory, cwd is set to parent dir of worker exe
Usage testing:
TODO:
io.Writer
to thelogWriter
MultiWriter
will be caught (worth testing?)lumberjack
)Weird things: