Two things I encountered recently share a structural insight about how to use LLMs well — not as a replacement for human judgment, but as a visible test of it.
The first is Unravel Tech's hiring post, written by founder Vedang Manerikar. The second is my own विपरीत केकावली exercise — a classroom assignment where students use LLMs to draft classical Marathi poetry and then must critically evaluate and fix what the LLM produces. Both converge on a single principle: the process of using the AI is the test.
Unravel Tech (@unraveldottech) is a Pune-based company building agentic AI systems. Their hiring post opens with a line that should make you pause:
> We are looking for serious engineers who take their craft seriously. We lean heavily into using AI. These two sentences are not in conflict with each other.
>
> — Unravel Tech hiring post
The post then lays out five instructions for applying, each one a hidden test.
The genius is that every instruction simultaneously filters for competence and character. An applicant who can't get an agent to parse LinkedIn profiles isn't going to thrive building agentic systems. An applicant who hides the AI's involvement doesn't share Unravel's values. An applicant who picks a boring rhyming word might lack the creative spark they want.
Take the subject-line instruction: "Apply, DSPy, and a third rhyming word of your choice." Consider the options: *AI* is on-the-nose, *fly* is aspirational, *spy* is playful, *awry* is self-aware, *alibi* is clever, *samurai* is bold. Each choice says something different about the applicant, and Unravel knows this. The constraint forces a micro-creative decision that doubles as a personality signal.
In my earlier post, I proposed an exercise for Marathi classrooms: after deeply studying Moropant's Kekavali (1790), students compose its विपरीत (inverse), in which the original's invocation of every good thing becomes a warning against every corresponding evil. The twist: students must use an LLM for their first draft, but then must critically evaluate and perfect the output using their own knowledge of Marathi prosody, vocabulary, and meaning.
The exercise was inspired by the University of Michigan Law School's admissions essay (July 2025), which requires applicants to use generative AI and then demonstrate how thoughtfully they used it. Same principle — different domain.
The LLM will produce something. But it will almost certainly get the meter wrong (मात्रावृत्त, a मात्रा-counting meter of 11+13 मात्रा per line), use Hindi-influenced vocabulary instead of authentic Marathi, repeat "न घडो" ("may it not happen") monotonously instead of varying verb endings, and fail to properly invert the theological metaphors (यशोदा-कृष्ण, दयामेघ-मयूर). The student who can spot and fix these mistakes demonstrates genuine understanding. The student who submits the raw output demonstrates nothing.
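The meter check, at least, is partly mechanizable, which is what makes it such a clean grading criterion. As a rough illustration, here is a simplified मात्रा counter in Python. The function name and the rule set are my own sketch, not part of the exercise: it applies only the basic rule that short vowels count one मात्रा and long vowels count two, and deliberately omits refinements such as the classical rule that a syllable before a conjunct becomes guru.

```python
# A simplified मात्रा (metrical-unit) counter for Devanagari text.
# Assumptions (a sketch, not a full prosody engine):
#   - short vowels count 1 मात्रा, long vowels count 2
#   - anusvara (ं) adds 1 to its syllable
#   - "syllable before a conjunct is guru" is deliberately omitted

CONSONANTS   = {chr(c) for c in range(0x0915, 0x093A)}  # क through ह
SHORT_VOWELS = set("अइउऋ")      # independent short vowels: 1 मात्रा
LONG_VOWELS  = set("आईऊएऐओऔ")   # independent long vowels: 2 मात्रा
LONG_SIGNS   = set("ाीूेैोौ")    # long vowel signs: syllable becomes 2
SHORT_SIGNS  = set("िुृ")        # short vowel signs: syllable stays at 1
HALANT       = "\u094D"          # ् suppresses the inherent vowel
ANUSVARA     = "\u0902"          # ं weights the syllable by 1

def count_matras(text: str) -> int:
    total = 0
    for ch in text:
        if ch in CONSONANTS:
            total += 1      # inherent अ = 1 मात्रा
        elif ch in SHORT_VOWELS:
            total += 1
        elif ch in LONG_VOWELS:
            total += 2
        elif ch in LONG_SIGNS:
            total += 1      # upgrades the preceding consonant's syllable to 2
        elif ch == HALANT:
            total -= 1      # conjunct: preceding consonant carries no vowel
        elif ch == ANUSVARA:
            total += 1
        # short vowel signs, spaces, punctuation: no change
    return total

print(count_matras("यशोदा"))  # य(1) + शो(2) + दा(2) = 5
```

A line of the Kekavali meter would then need the two half-lines to come to 11 and 13 respectively. A real checker would also need the conjunct-guru rule, nukta forms, and other edge cases, and that gap between sketch and reality is exactly the kind of thing the student is expected to know.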
Both approaches share a structural DNA:
| Dimension | Unravel Hiring | विपरीत केकावली |
|---|---|---|
| Domain | Software engineering recruitment | Marathi classical poetry education |
| AI role | Agent researches profiles, drafts cover letter | LLM generates first draft of विपरीत poetry |
| Human role | Judgment on tone, rhyme choice, quality of argument | Checking meter, vocabulary, imagery, meaning |
| Transparency | "Signed by you and your agent" | Submit raw LLM draft alongside corrected version |
| What raw AI output lacks | Personality, cultural fit signal, genuine motivation | Correct मात्रा, authentic Marathi, proper metaphor inversion |
| The real test | Quality of collaboration with the agent | Quality of corrections to the LLM's draft |
Drawing from both examples, here are principles that any organization could adopt when designing hiring processes for the AI era:
The old approach — "don't use AI for your application" — is unenforceable and counterproductive. Candidates who follow the rule are disadvantaged. Candidates who break it learn to hide it. Unravel flips this: use the AI, but show us how. Michigan Law does the same. My classroom exercise does the same. Once you require AI use, you can evaluate the quality of that use — which is a far more useful signal than whether someone used it at all.
This is the key insight: choose evaluation criteria where current LLMs predictably struggle.
The task should be designed so that submitting raw LLM output produces a visibly mediocre result. The evaluator should be able to tell, at a glance, whether the human added value.
Unravel's "signed by you and your agent" is a small but powerful norm. It normalizes human-AI collaboration while making it legible. In a classroom, this is the reflection paragraph ("what the LLM got wrong, what I fixed, why"). In hiring, this could be a brief note on which parts of the application were AI-assisted and how the candidate directed the process.
This isn't about policing honesty — it's about selecting for people who think about AI as a collaborator rather than a crutch or a secret weapon.
Unravel doesn't have a separate "AI skills assessment." The application is the assessment. Every instruction — finding the email, composing the subject line, writing the cover letter — tests a different capability. This is more ecologically valid than any whiteboard test. You're seeing exactly how candidates work, with exactly the tools they'd use on the job.
In both the classroom and the hiring process, the LLM serves as a revealing foil. It produces something that looks right at a surface level but is wrong in ways that only domain expertise can detect. The ability to spot and fix those errors — in Marathi prosody or in a cover letter's argument — is the skill being tested. The AI doesn't replace the test; it sharpens it.
This pattern is generalizable. Here's how any team could design an LLM-integrated hiring process:
Give candidates a broken codebase and an LLM. Ask them to use the LLM to diagnose and fix the bugs. Evaluate not just the fix, but the prompting strategy, the ability to catch when the LLM hallucinates a solution, and the quality of the final commit message.
Ask candidates to use an LLM to draft a piece, then submit both the draft and their edited version with annotations. The editing reveals taste, voice, and judgment — things the LLM can't fake.
Give candidates a research question and access to LLM tools with web search. Evaluate the quality of their search strategy, their ability to identify when the LLM is confabulating sources, and the synthesis of findings.
Ask candidates to design an exercise like the विपरीत केकावली — one that requires students to use LLMs but tests domain knowledge through the quality of their corrections. The exercise design itself is the assessment.
Unravel's hiring post says it plainly: "The old way of building software is dying." The same is true for hiring, for education, and for any domain where LLMs have crossed the threshold from novelty to infrastructure.
The organizations that adapt fastest will be those that stop asking "did they use AI?" and start asking "how well did they use it?" — and, more importantly, "what did they add that the AI couldn't?"
Moropant wrote the Kekavali in 1790. Vedang wrote his hiring post in 2026. Separated by 236 years and entirely different domains, both point to the same truth: the value is never in the tool — it's in the judgment applied to the tool's output.
> मूळ तत्त्व (core principle): Kids are going to use LLMs no matter what. Unless we make them think while they use it, how are they going to learn?
>
> — From the विपरीत केकावली post
Replace "kids" with "candidates" and "learn" with "demonstrate competence," and you have the future of recruiting.
Note: This post was itself composed with assistance from Claude. The ideas, connections, and editorial judgment are mine. That's the point.