Questions on implementing a watcher for Visual Studio (VSIX Plugin)

LaggAt · February 6, 2019, 6:59pm

Hi, as mentioned in another thread I’m developing a watcher for Visual Studio as a plugin.
It’s already working well, but I have some questions about how things are best handled:

Events: currently I’m accumulating similar events and setting the duration. Is it better to push events as heartbeats with duration=0, or should I accumulate them? I think accumulating should be better, as I really know when the user is working on the code and could provide better detail, but I’d like your opinions on it.
aw-server.ini: I found this file in the local %AppData% folder. I want to use it to show the user 1) if AW is installed, and 2) if it is running (using the port config in there). When is this file created (Installation, first run) and does it always end up in that folder?
(this one isn’t related to this watcher, but I want it to happen) I’ve found some documentation on sync - I really want this - how far is the planning/development for this?

That’s it for now, as soon as I am ready I’ll release it to Github or so, and publish it as VS plugin.

Kind regards,
Florian.

johan-bjareholt · February 7, 2019, 10:10am

The heartbeat API simplifies things a lot so you don’t need to remember states in the watchers and just push the latest activity. In aw-client we actually do pre-merging of heartbeats to reduce the amount of events sent to aw-server (this was an optimization we did a while back to save writes to the database, the next coming versions we will have a very
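As an illustration of the pre-merging idea, here is a minimal sketch of how two heartbeats with identical data could be folded into one event. The `Event` class, the `merge_heartbeat` function and the `pulsetime` window are hypothetical names chosen for this example; this is not necessarily how aw-client implements it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Event:
    timestamp: datetime
    duration: timedelta
    data: dict


def merge_heartbeat(last: Event, heartbeat: Event, pulsetime: float) -> Optional[Event]:
    """Return the merged event, or None if the heartbeat should start a new event.

    Assumption: two heartbeats are merged when their data is identical and the
    new one arrives no later than `pulsetime` seconds after the end of `last`.
    """
    if last.data != heartbeat.data:
        return None
    window_end = last.timestamp + last.duration + timedelta(seconds=pulsetime)
    if not (last.timestamp <= heartbeat.timestamp <= window_end):
        return None
    # Extend the previous event so that it covers the new heartbeat.
    covered = heartbeat.timestamp - last.timestamp + heartbeat.duration
    return Event(last.timestamp, max(last.duration, covered), last.data)


if __name__ == "__main__":
    t0 = datetime(2019, 2, 6, 12, 0, tzinfo=timezone.utc)
    first = Event(t0, timedelta(0), {"file": "AWPackage.cs"})
    second = Event(t0 + timedelta(seconds=10), timedelta(0), {"file": "AWPackage.cs"})
    print(merge_heartbeat(first, second, pulsetime=30))
```

With this approach the watcher only reports "what is active right now" and does not have to track durations itself.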

It’s created the first time aw-server is started and yes it will always be placed exactly there. Very few actually modify these configs, so assuming port 5600 should also be fine.

It’s also worth mentioning that a few people have asked about having aw-server on a remote host (which is a really weird use-case we don’t recommend as the data is not encrypted, but it works) and in that case reading from aw-server’s config would not work. I think the best approach would be to have a setting in your VS extension where you can manually set host+port.
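The lookup order described above (manual setting first, then the config file, then port 5600) could be sketched roughly like this. The `[server]` section with `host` and `port` keys is an assumption about the layout of aw-server.ini, the path to the file is left to the caller, and the function names are made up for this example.

```python
import configparser
import socket
from typing import Optional, Tuple

DEFAULT_HOST = "localhost"
DEFAULT_PORT = 5600  # the default port mentioned above


def resolve_server(
    ini_path: Optional[str],
    manual: Optional[Tuple[str, int]] = None,
) -> Tuple[str, int]:
    """Pick host and port: manual setting, then aw-server.ini, then defaults."""
    if manual is not None:
        return manual
    parser = configparser.ConfigParser()
    # read() returns the list of files it could parse; empty if the file is missing.
    if ini_path and parser.read(ini_path):
        # Assumption: the file has a [server] section with `host` and `port`.
        host = parser.get("server", "host", fallback=DEFAULT_HOST)
        port = parser.getint("server", "port", fallback=DEFAULT_PORT)
        return host, port
    return DEFAULT_HOST, DEFAULT_PORT


def is_running(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Note that `is_running` only shows that some process is listening on that port, not that it is actually aw-server.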

Well, we have an idea of the architecture but we have barely started implementing it. Our idea of how we want it to work though is kind of complicated, so it’s likely going to take a while (my guess would be 1 year minimum, likely 2 years as we don’t work on this full-time). @ErikBjare has been working on a minimal prototype. We currently prioritize finishing our manual import/export though which should be finished in the next beta or two.

We really appreciate that people take their time and develop editor watchers for ActivityWatch! It will help us a lot to have mature editor watchers when we add more visualizations to aw-webui in the future so we can become a true competitor with the closed-source alternatives (because let’s face it, our feature set in terms of editor activity visualization is seriously lacking currently, but it works and is good enough for simple use-cases).

LaggAt · February 8, 2019, 3:57pm

Thanks for your precise and complete answers. I need some logging, and some UI things like these settings you mentioned to finish and publish an alpha.

I haven’t seen the exact idea of how it should be done. So I’m sorry if I’m repeating someone, but I want to share my thoughts.

We could write an exporter that picks up every created bucket or event. This exporter needs to remember the timestamp up to which it has already exported. The easiest exporter implementation could write JSON files for every single entity (it could also be a stream to an importer, or whatever, but let’s stay basic).
Then we need some transport (Syncthing as mentioned somewhere, …)
Last, we need an importer that takes these files, imports them and deletes them when done. The importer could use the API like any watcher client.

That’s it to sync one-way.
Let’s set some higher goals:

For two-way we could think of folders in the path the exporter is exporting to, like: ./MACHINENAME/Export.

To think it to the end:
For n-way-sync the exporter/importer know a list of machine names. Then, we could use Folders again:
./EXPORTER1/IMPORTER2
./EXPORTER1/IMPORTER3
…

There may be a pitfall choosing the timestamp to know what was exported, if events from the past arrive at the aw_server. Maybe we should keep a timestamp per bucket_id.

All in all, aw_server is ready to handle this in the API, and it could possibly be implemented in a few hours.

-edit-
Just forgot one detail: I assumed all bucket_ids start with the machine name, so the exporter must not export bucket_ids from other machines, to avoid loops.
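A rough sketch of the exporter half of this proposal, under the assumptions stated in the post: bucket_ids start with the machine name, one JSON file is written per event, and one cursor timestamp is kept per bucket_id. The function name, the in-memory `buckets` structure and the `id` field used for file names are hypothetical; a real exporter would read from aw-server instead.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Dict, List


def _ts(event: dict) -> datetime:
    return datetime.fromisoformat(event["timestamp"])


def export_new_events(
    buckets: Dict[str, List[dict]],
    machine: str,
    cursors: Dict[str, datetime],
    out_dir: Path,
) -> int:
    """Write one JSON file per not-yet-exported event; return the number written."""
    written = 0
    for bucket_id, events in buckets.items():
        # Skip buckets that came from other machines, to avoid sync loops.
        if not bucket_id.startswith(machine):
            continue
        cursor = cursors.get(bucket_id)
        # Pitfall from the post: an event that arrives later with an older
        # timestamp than the cursor is never exported by this simple scheme.
        new = sorted((e for e in events if cursor is None or _ts(e) > cursor), key=_ts)
        target = out_dir / machine / "Export" / bucket_id
        target.mkdir(parents=True, exist_ok=True)
        for event in new:
            (target / f"{event['id']}.json").write_text(json.dumps(event))
            written += 1
        if new:
            cursors[bucket_id] = _ts(new[-1])  # remember the export position per bucket
    return written
```

The transport (for example Syncthing) would then only have to move the `./MACHINENAME/Export` folder, and the importer on the other side would read and delete the files.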

johan-bjareholt · February 8, 2019, 6:59pm

I guess there’s no good documented overview of it, as it’s mostly just something that @ErikBjare and I have been discussing IRL.

Your simple implementation idea is good, but it has one large flaw: it cannot remove or modify events prior to the last sync without a lot of hassle. That is very important if, for example, you want to remove old events that you don’t want to be remembered; you don’t want them to still be there on your other synced devices. Just keeping the timestamp is not enough: we would have to track a list of removed IDs, plus the whole event if modified, for all events prior to the last sync.
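To make the extra bookkeeping concrete, here is a minimal sketch of such a change journal: deletions are recorded by ID and modifications carry the whole event. The `ChangeLog` class and `apply_changes` function are hypothetical, and the sketch assumes both machines agree on event IDs.

```python
from typing import Dict, List


class ChangeLog:
    """Append-only journal of edits made to events that were already synced."""

    def __init__(self) -> None:
        self.entries: List[dict] = []

    def record_delete(self, event_id: int) -> None:
        self.entries.append({"op": "delete", "id": event_id})

    def record_update(self, event: dict) -> None:
        # The whole modified event is stored, not just the changed fields.
        self.entries.append({"op": "update", "id": event["id"], "event": dict(event)})

    def since(self, position: int) -> List[dict]:
        """Return all entries recorded after the given journal position."""
        return self.entries[position:]


def apply_changes(replica: Dict[int, dict], changes: List[dict]) -> None:
    """Replay journal entries on another machine's copy of the events."""
    for change in changes:
        if change["op"] == "delete":
            replica.pop(change["id"], None)
        elif change["op"] == "update":
            replica[change["id"]] = change["event"]
```

Each synced peer would have to remember its own position in the journal, which is part of the hassle mentioned above.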

While syncthing has a protocol, it’s a huge protocol and implementing it ourselves would be a daunting task. There is no syncthing library. The program is pretty nicely split, but it’s written in Go, which makes it pretty hard to work with from other programming languages since Go pretty much only uses static linking (it has support for dynamic linking, but that’s rarely used and only supported on Linux).

Otherwise though, syncthing has pretty much all we need, such as the ability to only sync parts of large files, discoverability both on a local network and remotely, and configurability with backup and one-way syncing. So if we succeeded with this it would pretty much be the holy grail. I’d want to try to do it this way and see where it goes: if the code gets clean and reliable we’d choose that, and if it just becomes a large hack we have to try something else. But all of those features I mentioned before are a requirement for us, and implementing all of them ourselves would be a whole project in itself.

We want to do the architecture right from the start so we don’t have to do any large architectural changes afterwards; we have made that mistake before and don’t want to do that again. Even if we went with the simple version you suggest, though, it would still take a few months for us with proper testing, since we only work on ActivityWatch in our spare time.

LaggAt · February 8, 2019, 8:32pm

Right, that wouldn’t work. I thought about deleting events in the future, but probably something like “all events older than some months”, which is an easy SQL task. Anyway, this could be solved if the aw-server itself implements an event pump (a queue a plugin could listen on) that pushes everything that happens to this exporter.

My first task will be finishing the VS Watcher. Think I will have a look at aw-server’s code later. Where is your development effort now? aw-server (python) or aw-server-rust?

Weeks of coding can avoid hours of planning - right
I’m a big fan of good architecture, but dislike overengineering. And I love minimalistic approaches, doing just what is needed

I didn’t want to ask you to implement it; I want to offer some help. I will need a solution to this if I continue using AW, and this is something I could possibly help with. I know there are loads of other important tasks, UI improvements to mention just one, but I’m not that good at that.

That said, this was just my way of asking where you are on this topic, to find out if I could do some work to fulfil my use case (and possibly help with yours if it’s on my way).

So - back to coding. Thanks for your time so far.

johan-bjareholt · February 11, 2019, 12:21pm

We still fully maintain aw-server and will continue to do so for a while. Our plan though is to replace it with aw-server-rust (hopefully within this year). aw-server-rust needs extensive testing and we need to improve the transforms a bit: they have been verified to work, but their accuracy has not been verified. aw-server-rust is also missing some nice stuff, such as a config.

Yeah, the size of it is a bit daunting. If we somehow were able to just utilize the relay+discovery protocol in syncthing I might find your solution viable, but I’m not sure how that would be done.

Yeah I know, I just wanted to point out that I thought your estimate was a bit low, and that our lack of available time to spend on this project would make it take a while.

johan-bjareholt · February 11, 2019, 12:48pm

Talked with some others, just realized that there’s a project called libp2p which looks to be pretty awesome. Might actually be a viable alternative.

LaggAt · February 11, 2019, 7:07pm

Had some minutes to think about it: how did you plan to support delete with int identifiers? Did you plan to use some GUID?

LaggAt · February 11, 2019, 11:04pm

Hi again, if you want to take a look, here is the source:

I’ll test it on a second machine tomorrow, before deploying it somewhere.

johan-bjareholt · February 12, 2019, 1:59pm

Events get an “id” property assigned in aw-server which is essentially just the primary key in the database table.
It is already possible to remove single events from the REST API via DELETE /0/buckets/{bucket_id}/events/{event_id}. We will add an endpoint later to do batch deletions.
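For reference, a request against that endpoint could be built as sketched below using only the standard library. The base URL `http://localhost:5600/api` and the bucket name in the example are assumptions, not taken from this thread; adjust them to the actual server.

```python
import urllib.request
from urllib.parse import quote


def build_delete_request(base_url: str, bucket_id: str, event_id: int) -> urllib.request.Request:
    """Build a DELETE request for a single event (the request is not sent here)."""
    # Path layout as given above: /0/buckets/{bucket_id}/events/{event_id}
    url = f"{base_url}/0/buckets/{quote(bucket_id, safe='')}/events/{event_id}"
    return urllib.request.Request(url, method="DELETE")


if __name__ == "__main__":
    # Hypothetical base URL and bucket name, for illustration only.
    req = build_delete_request("http://localhost:5600/api", "aw-watcher-vs_PC1", 42)
    print(req.get_method(), req.full_url)
    # To actually send it while aw-server is running:
    # urllib.request.urlopen(req)
```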

I haven’t coded C# or done proper Windows development in 6 years, so I’m finding it hard to read the code and find stuff since there are so many files. Will take a proper look at it some other time and get more used to that environment lol.

LaggAt · February 12, 2019, 6:10pm

Hi, yeah, I know. I was thinking of the issue with int IDs when syncing: the event_id for the same event would be different on different nodes, which is something you could avoid by using a GUID instead of an int.

I agree. This isn’t a C# issue; a Visual Studio plugin has a lot of files. Just start at AWPackage.cs, which is the AsyncPackage that Visual Studio initializes. In there, look at InitializeAsync(…).

I’ll try to find some time to get AppVeyor running to get installable packages, a logo and some screenshots. It is also missing tests for now. I just wanted to get it running fast.

btw: it ran well today. Hope we get a new Windows bin soon to see graphs for editor events.

LaggAt · February 12, 2019, 10:15pm

In the meantime, for those who don’t want to wait for a release, here is the current alpha version 0.0.0.2.

Quick Start:

download & extract ActivityWatch https://github.com/ActivityWatch/activitywatch/releases
start aw-server.exe (optional: start it with windows by putting a link in “shell:startup”)
install this plugin (click)

Should work out of the box. Please report issues on github.

johan-bjareholt · February 13, 2019, 2:59pm

That is a very good point, we would need a database migration to solve that issue in the future. Will take that into consideration.

LaggAt · February 13, 2019, 6:06pm

yes, + this would change the api. Maybe start early with a v2 API to get plugin writers prepared.

johan-bjareholt · February 14, 2019, 6:29pm

How would it change the API? It’s still just an int in the end (just a longer and guaranteed unique int).

Also, I realized that if the sync is only implemented in aw-server-rust we might not need a database migration (only an aw-server -> aw-server-rust migration, which we’d need anyway).

LaggAt · February 14, 2019, 9:40pm

Well, a UUID is 128 bits. I never thought of converting it to a very big int.
Is there some sync code in aw-server-rust? Hope I find some time to look into it (and to get my hands on Rust).

johan-bjareholt · February 15, 2019, 7:30am

The API is JSON where there’s no limit to the size of the int, that was my point.
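A small sketch of that point: a 128-bit UUID can be carried as a plain JSON integer, and Python's `json` module round-trips it without loss. The helper names are made up for this example. One caveat that is independent of ActivityWatch: JSON itself sets no limit, but parsers that store numbers as 64-bit floats (JavaScript's `JSON.parse`, for instance) lose precision above 2**53, so clients in such languages would need to handle large IDs with care.

```python
import json
import uuid


def new_event_id() -> int:
    """Generate a random 128-bit identifier and return it as an int."""
    return uuid.uuid4().int


def roundtrip(event_id: int) -> int:
    """Serialize an event ID to JSON and parse it back."""
    return json.loads(json.dumps({"id": event_id}))["id"]


if __name__ == "__main__":
    event_id = new_event_id()
    print(event_id, roundtrip(event_id) == event_id)
```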

There’s not yet, but I’ve been experimenting a couple of hours with libp2p on Rust the past days.