Test Collab’s QA Copilot turns plain-English instructions into runnable Playwright scripts. Our current accuracy isn’t production-ready, so we’re crowdsourcing smarter approaches.
### Challenge

We give you:

- *Public test set* (CSV of instructions) + metric script
- *Invite-only demo URL* so you can poke the live app and extract any visual context you need

Build any tool (LLM, RAG, classic NLP, computer-vision add-ons, hybrids) that consumes the context and outputs the correct Playwright code. Run the metric locally and report your accuracy; we re-score on a private hold-out set.
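
To make the local scoring loop concrete, here is a minimal sketch of what it could look like. The official metric script and the CSV schema are only shared after sign-up, so everything specific here is an assumption for illustration: the column names `instruction` and `expected`, the whitespace-normalized exact-match rule, and the `toy_generator` placeholder that stands in for a real model.

```python
import csv
import io
from typing import Callable, Iterable


def normalize(code: str) -> str:
    """Collapse whitespace so formatting differences do not count as errors."""
    return " ".join(code.split())


def accuracy(rows: Iterable[dict], generate: Callable[[str], str]) -> float:
    """Fraction of rows whose generated code equals the reference after normalization.

    Exact match is an assumed scoring rule, not the official metric.
    """
    total = 0
    correct = 0
    for row in rows:
        total += 1
        predicted = generate(row["instruction"])
        if normalize(predicted) == normalize(row["expected"]):
            correct += 1
    return correct / total if total else 0.0


# Tiny in-memory stand-in for the public CSV (column names are assumptions).
SAMPLE_CSV = """instruction,expected
Click the Login button,"await page.click('text=Login');"
Open the dashboard,"await page.goto('/dashboard');"
"""


def toy_generator(instruction: str) -> str:
    """Hypothetical placeholder; a real entry would call an LLM, RAG pipeline, etc."""
    prefix, suffix = "Click the ", " button"
    if instruction.startswith(prefix):
        label = instruction[len(prefix):]
        if label.endswith(suffix):
            label = label[: -len(suffix)]
        return f"await page.click('text={label}');"
    return ""  # no rule matched, so this row will score as a miss


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
score = accuracy(rows, toy_generator)
print(f"accuracy: {score:.0%}")
```

Swapping `toy_generator` for your own function keeps the scoring loop unchanged, which makes it easy to compare approaches on the same rows.
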
### Reward tiers (first verified submission wins each)

- 40–60% accuracy → $250
- 60–70% accuracy → $500
- 70%+ accuracy → $750

*Total pool: $1500*
### Rules

- Submit fully reproducible code/weights/setup.
- Earliest validated score grabs the tier; no duplicate payouts.
- Payment within 14 days of validation.

### Join

Email *[email protected]* with the subject *“Model Bounty”* and a brief intro to get the NDA and data link.

Happy hacking, Abhimanyu @ Test Collab


