Shreya Shankar

@sh_reya

7 Tweets · Dec 09, 2022
Of course prompt engineering is a job. ML systems and software have always been an amalgamation of models and rules. Now, with GPT et al., it is really hard to figure out what rules (explicit and implicit) to construct and how to write them.
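To make the "models and rules" point concrete, here is a minimal sketch of an AI-powered feature that is really a model wrapped in explicit and implicit rules. All names here are hypothetical, including the call_llm stub, which stands in for whatever inference endpoint you use:

```python
from typing import Callable

# Hypothetical stub standing in for any model inference call.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in a model provider here")

def answer_support_question(question: str,
                            llm: Callable[[str], str] = call_llm) -> str:
    # Explicit rule: route out-of-scope requests before spending a model call.
    if "refund" in question.lower():
        return "Refunds are handled by a human agent; forwarding your request."

    draft = llm(f"Answer the customer's question concisely:\n{question}")

    # Implicit rule made explicit: cap response length for the UI.
    return draft[:500]
```

The hard part is exactly what the tweet says: deciding which rules belong around the model, and how to write them.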
It is amazing to me how little thought is given to software engineering expertise when building AI-powered software, beyond “we made a REST endpoint”
The more experienced I get, the more I see this pattern of innovation: enumerating the cross product {algorithmic advances} x {data storage and partitioning schemes} x {SWE buzzwords} x {UI/UX assets}. E.g., a dashboard for CI/CD for patching LLMs in the cloud. Cool idea at a glance.
But the problem with this idea isn't a lack of existence, it's a lack of definition. Unit tests in ML are different from unit tests in software: they require a massive, constant effort to collect live failures (see the interview study paper), along the lines of the sketch below.
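As a rough illustration of what those ML "unit tests" look like in practice, here is a minimal sketch. It assumes a hypothetical JSONL log of production failures, each record holding the input and the flagged bad output:

```python
import json
from pathlib import Path

# Hypothetical log: one JSON object per line, appended whenever a model
# output is flagged as bad in production (by a user report or a monitor).
FAILURE_LOG = Path("logs/live_failures.jsonl")

def load_regression_cases(log_path: Path = FAILURE_LOG) -> list[dict]:
    """Turn logged live failures into regression test cases."""
    return [json.loads(line) for line in log_path.read_text().splitlines() if line]

def still_failing(predict, log_path: Path = FAILURE_LOG) -> list[dict]:
    """Re-run the current model on past failures; return the ones it still gets wrong."""
    return [case for case in load_regression_cases(log_path)
            if predict(case["input"]) == case["bad_output"]]
```

Unlike a software unit test, this suite is never "done": it only stays meaningful as long as someone keeps harvesting new live failures into the log.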
CI/CD for ML should be fundamentally collaborative, among diverse stakeholders (PMs, eng, DS), in a way that SWE doesn't even compare to. How do we design it? We can look to SWE for some invariants, e.g., no new dependent changes introduced during a build. How do you make sure upstream models are "edited" before downstream ones? (One way to phrase that invariant is sketched below.) And then, getting to the patching part: who's to say patching > adding filters? SWE learnings tell us people would rather add code than delete code actively used in prod. If you can't define the change's behavior before building CI, will you use the change?
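To make the ordering invariant concrete: if the pipeline's model dependencies form a DAG, "upstream before downstream" is just a topological order. A minimal sketch using Python's standard-library graphlib, with a hypothetical three-model pipeline:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical pipeline DAG: each model maps to the upstream models it consumes.
PIPELINE = {
    "embedder": set(),
    "retriever": {"embedder"},   # retriever consumes embedder outputs
    "ranker": {"retriever"},     # ranker consumes retriever outputs
}

def edit_order(pipeline: dict[str, set[str]]) -> list[str]:
    """An order in which models can be edited so that every upstream
    model is updated before anything that depends on it."""
    return list(TopologicalSorter(pipeline).static_order())

print(edit_order(PIPELINE))  # ['embedder', 'retriever', 'ranker']
```

A CI check could then reject an edit to "ranker" while a pending "embedder" edit hasn't landed, much like a build system refusing out-of-order dependent changes.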
Anyways, I'm not the only one having thoughts like this. I've been thinking about this more recently as I try to develop "vision." Curious to read about other people's long-term visions, particularly in the space of end-to-end systems for building ML pipelines.
