ALsats: active learning (for a few) sats - continuation

ALsats is a project that aims to reduce the investment needed to create minimum viable datasets for supervised machine learning. The current proposal is a follow-up to the round 4 proposal, "ALsats: active learning (for a few) sats".

ALsats provides inexpensive, intelligent training-as-a-service, as described in my post on ALsats. It combines active learning, which guides the labeling of training data for supervised machine learning, with micropayments in satoshis (and millisatoshis) over the Lightning Network.
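To make the idea concrete, here is a minimal sketch of one common active learning strategy, least-confidence sampling, together with a per-label price in millisatoshis. It is not the ALsats implementation: the function names, the example probabilities and the price are assumptions made for illustration. The only fixed fact is that 1 satoshi equals 1000 millisatoshis.

```python
MSAT_PER_SAT = 1000  # 1 satoshi = 1000 millisatoshis


def least_confident(probabilities, k):
    """Return the indices of the k samples the model is least sure about.

    `probabilities` is a list of per-class probability lists, one per
    unlabeled sample. Confidence is taken as the highest class probability.
    """
    confidence = [max(p) for p in probabilities]
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return ranked[:k]


def session_cost_msat(num_labels, price_msat_per_label):
    """Total cost of a labeling session in millisatoshis (hypothetical pricing)."""
    return num_labels * price_msat_per_label


# Hypothetical model outputs for four unlabeled images, two classes each.
probs = [[0.90, 0.10], [0.55, 0.45], [0.60, 0.40], [0.99, 0.01]]
to_label = least_confident(probs, k=2)

# Assumed price of 500 msat per label, chosen only for the example.
cost_msat = session_cost_msat(len(to_label), price_msat_per_label=500)
print(to_label, cost_msat, cost_msat / MSAT_PER_SAT)
```

The point of the sketch is that only the samples the model finds hardest are sent for labeling, so each payment buys the most informative label available.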

The funding in round four was tied to two main deliverables:

  1. ALsats hosted on an AWS EC2 server with publicly accessible API endpoints.
  2. A Streamlit/Gradio/equivalent app that allows invited data scientists to train and label.

Both deliverables were met, as outlined in the linked video.

(NB: If you are an Algovera-associated data scientist looking to use ALsats to label image data, please contact me directly on Discord or DM me on Twitter. I'm presently beta testing the infrastructure myself.)

Round 6 proposal:
The round 6 proposal focuses heavily on data science usability. Specifically, the following mandatory deliverables are proposed:

  1. Label checkpointing - data scientists should be able to download a checkpoint file at the end of each labeling session and upload it later to continue labeling without losing earlier progress.
  2. Model checkpointing - at the end of a labeling run, a data scientist should be able to save the currently trained active learning model for future labeling sessions and/or download it.
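The two deliverables above could be sketched roughly as follows. This is an illustration under stated assumptions, not the planned implementation: the file formats (JSON for labels, pickle for the model), the function names and the sample IDs are all hypothetical.

```python
import json
import pickle
from pathlib import Path


def save_label_checkpoint(path, labels):
    """Write {sample_id: label} to a JSON file the user can download."""
    Path(path).write_text(json.dumps({"version": 1, "labels": labels}))


def load_label_checkpoint(path):
    """Read labels back from an uploaded checkpoint file."""
    payload = json.loads(Path(path).read_text())
    return payload["labels"]


def remaining_samples(all_ids, labels):
    """Sample IDs that still need a label, in their original order."""
    return [sample_id for sample_id in all_ids if sample_id not in labels]


def save_model_checkpoint(path, model):
    """Serialize the current model object (assumed to be picklable)."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh)


def load_model_checkpoint(path):
    """Restore a model saved by save_model_checkpoint."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

A resumed session would call `load_label_checkpoint`, pass the result to `remaining_samples`, and continue from there. One design caveat: unpickling a file runs arbitrary code, so a hosted service should not load user-uploaded pickle files directly. A format that stores only weights would be safer for uploads.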

How allotted funds have been used:
Allotted funds in round 4 were used to pay for infrastructure: the EC2 instance in use costs 8.2 cents an hour, which is about USD 720 per year, plus domain registration fees.
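The annual figure follows from the hourly rate, assuming the instance runs continuously for a 365-day year:

```python
HOURLY_RATE_USD = 0.082  # 8.2 cents per hour, as stated above
HOURS_PER_YEAR = 24 * 365  # assumes the instance is never stopped

annual_cost_usd = HOURLY_RATE_USD * HOURS_PER_YEAR
print(f"Annual EC2 cost: USD {annual_cost_usd:.2f}")
```

This comes to roughly USD 718, which rounds to the USD 720 quoted above. Domain registration fees are not included.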

How round 6 allotments will be used if funded:
The deliverables in round 6 will require S3 usage, a larger RAM allocation (potentially to hold multiple concurrent models), and ingress/egress of model data, all of which will add to costs. A larger RAM and storage allocation may also be needed, because the current allocation (8 GB RAM, 30 GB storage) can run only a single Streamlit app alongside Polar (a Docker-based Lightning regtest and simnet simulator), with just one client and server.

I look forward to receiving your support in round 6.

Squad Members:
antaraxia/antaraxia.eth: Algovera member.
Twitter Handle: https://twitter.com/antaraxia_kk
