Student Groups
This page documents the University of Toronto Mississauga (UTM) student groups that contributed to the LibreCode / Annotator ecosystem between mid-2024 and late 2025. Their work focused on converting asciinema terminal recordings into structured events, annotations, and derived documentation artifacts using LLMs.
Project context
Mentors
- Julia Longtin — https://github.com/julialongtin
- Arthur Wolf — https://github.com/arthurwolf
Mentorship program
- Human Feedback Foundation (Linux Foundation entity): https://humanfeedback.io/
- University of Toronto Mississauga: https://www.utm.utoronto.ca/
Core LibreCode resources
- Annotator repository: https://github.com/arthurwolf/annotator
- LibreCode / FaikVM wiki: https://wiki.faikvm.com/mediawiki/index.php/Main_Page
- Public hosted annotator instance: https://linuxpmi.org/
- Prompting guidelines/advice: Prompting-guidelines
Most student repositories are hosted under this GitHub organization: https://github.com/CSC392-CSC492-Building-AI-ML-systems
Fall 2024 – Early AutoDocs prototype
What they worked on
This group produced an early prototype of what later became AutoDocs: tooling to segment asciinema terminal recordings into meaningful chunks and generate higher-level annotations.
This work served primarily as a proof of concept and a starting point for later cohorts.
Contributors
(TODO: add names / GitHub links if identified)
Code and artifacts
- Archived Fall 2024 code base (referenced in later repos):
- Mentioned in the AutoDocs README as “Fall 2024 Team’s Code Base”
- Linked from the AutoDocs repository:
Notes
Later documentation notes that much of this code is outdated or non-functional, but it remains historically important.
Winter 2025 – AutoDocs expansion + documentation
What they worked on
This cohort rebuilt and extended the AutoDocs pipeline into a more complete system and produced formal documentation of their work.
The repository includes a tagged release explicitly described as a rewrite of the project by the Winter 2025 team. Release link: https://github.com/CSC392-CSC492-Building-AI-ML-systems/educational-AI-agent/releases/tag/winter-2025
Contributors
Known from the Winter 2025 release note and repository contributors list
- Brian Zhang — https://github.com/Pyosimros
- Vraj Patel — https://github.com/Vraj-Patel1
- Dan Nguyen — https://github.com/nuhgooyin
- Adreano La Rosa — (listed in release note; GitHub handle not yet confirmed)
Additional contributors shown by GitHub
- Abdallah Enaya — https://github.com/abdullah-enaya
- Renee K — https://github.com/renee-k
- aml-8 — https://github.com/aml-8
- Christopher Flores — https://github.com/cfstar188
- Uyiosa Iyekekpolor — https://github.com/uyoyo0
- eyexjay — https://github.com/eyexjay
Code and artifacts
- Main AutoDocs repository:
- Release tag capturing the Winter 2025 state:
- Public talk page referencing this pipeline (AI Tinkerers Toronto, March 2025):
Notes
The current AutoDocs repository states that it was “modified and extended from the Winter 2025 team’s code base”.
AutoDocs (consolidated / ongoing repository)
This is a living repository spanning multiple student cohorts rather than a single group.
Purpose
AutoDocs processes asciinema terminal recordings and produces structured outputs such as:
- segmented command events,
- annotated explanations,
- derived artifacts (for example, Dockerfiles).
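For orientation, the input side of such a pipeline can be sketched as follows. This is a minimal sketch, assuming recordings use the asciicast v2 layout (one JSON header line, then one ``[time, type, data]`` JSON array per event). The names ``CastEvent`` and ``parse_cast`` are hypothetical and are not taken from the AutoDocs repository.

```python
import json
from dataclasses import dataclass


@dataclass
class CastEvent:
    time: float  # seconds since the start of the recording
    kind: str    # "o" for terminal output, "i" for keyboard input
    data: str    # raw text written to or read from the terminal


def parse_cast(text):
    """Split an asciicast v2 document into its header and event list.

    Hypothetical helper for illustration; assumes the v2 layout
    described in the lead-in, not the AutoDocs parsers themselves.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = json.loads(lines[0])
    events = [CastEvent(*json.loads(ln)) for ln in lines[1:]]
    return header, events


# Small in-memory sample, built with json.dumps so escapes stay valid.
SAMPLE = "\n".join([
    json.dumps({"version": 2, "width": 80, "height": 24}),
    json.dumps([0.1, "o", "$ "]),
    json.dumps([1.2, "o", "ls\r\n"]),
    json.dumps([1.3, "o", "README.md\r\n$ "]),
])

header, events = parse_cast(SAMPLE)
for ev in events:
    print(f"{ev.time:6.2f} {ev.kind} {ev.data!r}")
```

Keeping the header separate from the events lets later stages read terminal dimensions without re-parsing the file.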
People
- Julia Longtin (lead contact): https://github.com/julialongtin
- Model publisher referenced in the README:
Repository
Contents (high level)
- ``data/``, ``frontend/``, ``models/`` directories
- Multiple parser scripts (Parser 0 / 1 / 2)
- References to fine-tuned model checkpoints via Hugging Face links
Autumn 2025 – DocStream consolidation (Educational AI Agent)
This appears to be a later cohort or iteration that built on AutoDocs and reframed the system as DocStream. The core idea is unchanged: streamed asciinema logs → events → hierarchical annotations → documentation. Repository: https://github.com/CSC392-CSC492-Building-AI-ML-systems/Autumn2025EducationalAIAgent
What they worked on
From the repository README, DocStream:
- converts raw, noisy terminal activity into structured, reproducible developer documentation,
- processes streamed asciinema logs,
- segments them into meaningful events,
- generates hierarchical annotations explaining terminal activity,
- includes an evaluation harness (based on an extended EleutherAI LM Evaluation Harness) with task and metric scaffolding under ``data/llm_Evaluation/``.
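To make the segmentation step concrete, the sketch below splits a stream of timestamped output into chunks wherever the terminal goes quiet. This idle-gap heuristic is a stand-in chosen for illustration; DocStream uses a trained model for segmentation. The function name ``segment_by_idle_gap`` and the ``max_gap`` threshold are assumptions, not part of the repository.

```python
def segment_by_idle_gap(events, max_gap=2.0):
    """Group (time, data) pairs into segments separated by idle periods.

    Illustrative heuristic only: a new segment starts whenever more than
    `max_gap` seconds pass between consecutive events. This is NOT the
    learned segmentation used by DocStream.
    """
    segments = []
    current = []
    last_time = None
    for time, data in events:
        if current and time - last_time > max_gap:
            segments.append(current)
            current = []
        current.append((time, data))
        last_time = time
    if current:
        segments.append(current)
    return segments


stream = [
    (0.0, "$ git status\r\n"),
    (0.5, "nothing to commit\r\n"),
    (5.0, "$ make\r\n"),
    (5.2, "gcc -o app main.c\r\n"),
    (9.0, "$ ./app\r\n"),
]

for i, seg in enumerate(segment_by_idle_gap(stream)):
    print(i, [data.strip() for _, data in seg])
```

A timing heuristic like this can serve as a simple baseline when comparing against a learned segmenter.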
Code and artifacts
- Main repository:
- Repository structure pointers:
- ``data/`` — datasets and evaluation harness inputs
- ``models/model_0/`` — segmentation training and inference
- ``models/model_1/`` — annotation training and inference
- ``demo/`` — front-end visualization demo
- ``runpod/`` — deployment and runpod materials
- White paper included in the repository root:
- Previous iteration explicitly linked from the DocStream README:
Models
- Model 0 — Event Segmentation
- Segments streamed terminal logs into XML-structured events
- Model 1 — Hierarchical Annotation
- Reads Model 0 event chunks and generates summaries with hierarchical depth (goal / subtask structure)
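The hand-off between the two models can be pictured with a small sketch. The XML schema below (an ``<events>`` root with ``<event start=... end=...>`` children) and the ``(depth, summary)`` annotation pairs are hypothetical shapes invented for illustration; the actual formats are defined in the repository. The helper names ``load_events`` and ``build_outline`` are likewise assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical Model 0 output: one <event> element per segmented chunk.
SAMPLE_XML = """
<events>
  <event start="0.1" end="1.3">$ ls</event>
  <event start="4.0" end="6.5">$ make build</event>
</events>
"""


def load_events(xml_text):
    """Parse the assumed XML event format into plain dictionaries."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "start": float(e.get("start")),
            "end": float(e.get("end")),
            "text": (e.text or "").strip(),
        }
        for e in root.iter("event")
    ]


def build_outline(annotations):
    """Nest (depth, summary) pairs into a tree.

    Depth 0 is treated as a top-level goal and deeper values as
    subtasks of the most recent shallower entry.
    """
    root = {"summary": None, "children": []}
    stack = [root]  # stack[d + 1] is the latest node seen at depth d
    for depth, summary in annotations:
        node = {"summary": summary, "children": []}
        del stack[depth + 1:]  # discard nodes at this depth or deeper
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


events = load_events(SAMPLE_XML)

# Hypothetical Model 1 output: a depth and a summary per annotation.
outline = build_outline([
    (0, "Build the project"),
    (1, "Inspect the working directory"),
    (1, "Run the build"),
    (0, "Deploy"),
])
```

Representing depth as an integer per annotation keeps the model output flat, while the tree can be rebuilt on the consumer side.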
Contributors
GitHub accounts that appear repeatedly in the commit history and are likely the core Autumn 2025 contributors:
- Ryan Pankratz — https://github.com/ryan-pankratz
- (also appears as: https://github.com/RyanPankratz)
- Victor Shea — https://github.com/VictorShea
- Moe Reda — https://github.com/Moe-Reda
- Patea4 — https://github.com/Patea4