S03 E08
20 Mar 2023/01:39:18

Does a GPT future need software engineers?


Bryan, Adam, and the Oxide Friends take on GPT and its implications for software engineering. Many aspiring programmers are concerned that the future of the profession is in jeopardy. Spoiler: the Oxide Friends see a bright future for human/GPT collaboration in software engineering.

We've been hosting a live show weekly on Mondays at 5p for about an hour, and recording them all; here is the recording from March 20th, 2023.

In addition to Bryan Cantrill and Adam Leventhal, speakers on March 20th included Josh Clulow, Keith Adams, Ashley Williams, and others. (Did we miss your name and/or get it wrong? Drop a PR!)

Live chat from the show (lightly edited):

  • ahl: John Carmack's tweet
  • ahl: ...and the discussion
  • Wizord: https://twitter.com/balajis/status/1636797265317867520 (the $1M bet on BTC, I take)
  • dataphract: "prompt engineering" as in "social engineering" rather than "civil engineering"
  • Grevian: I was surprised at how challenging getting good prompts could be, even if I wouldn't quite label it engineering
  • TronDD: https://www.aiweirdness.com/search-or-fabrication/
  • MattCampbell: I tested ChatGPT in an area where I have domain expertise, and it got it very wrong.
  • TronDD: Also interesting https://www.youtube.com/watch?v=jPhJbKBuNnA
  • Wizord: the question is, when will it be in competition with people?
  • Wizord: copilot also can review code and find bugs if you ask it in a right way
  • ag_dubs: i suspect that a new job will be building tools that help make training sets better and i strongly suspect that will be a programming job. ai will need tools and data and content and there's just a whole bunch of jobs to build tools for AI instead of people
  • Wizord: re "reading manual and writing DTrace scripts" I think it's possible, if done with a large enough token window.
  • Wizord: (there are already examples of GPT debugging code, although trivial ones)
  • flaviusb: The chat here is really interesting to me, as it seems to miss the point of the thing. ChatGPT does not and cannot ever 'actually work' - and whether it works is kind of irrelevant. Like, the Jacquard looms and Numerical Control for machining did not 'work', but that didn't stop the roll out.
  • Columbus: Maybe it has read the dtrace manual 😉
  • JustinAzoff: I work with a "long tail" language, and chatgpt sure is good at generating code that LOOKS like it might work, but is usually completely wrong
  • clairegiordano: Some definite fans of DTrace on this show
  • ag_dubs: a thing i want to chat about is how GPT can affect the "pace" of software development
  • sudomateo: I also think it's a lot less than 100% of engineers that engage in code review.
  • Wizord: yes, I've had some good experience with using copilot for code review
  • ag_dubs: chatgpt is good at things that are already established... its not good at new things, or things that were just published
  • Wizord: very few people I know use it for the purpose of comments/docs. just pure codegen/boilerplate
  • chadbrewbaker: "How would you write a process tree with dtrace?" (ChatGPT4)
#!/usr/sbin/dtrace -s

BEGIN
{
  printf(""%5s %5s %5s %s\n"", ""PID"", ""PPID"", ""UID"", ""COMMAND"");
}

proc:::exec-success
{
  printf(""%5d %5d %5d %s\n"", pid, ppid, uid, execname);
}
  • TronDD: That's interesting, as expensive, specialized code analysis tools have been varying levels of terrible for a long time
  • JustinAzoff: I did an experiment before where I asked it to write me some php to insert a record into a database. so of course it generated code with sql injection
  • chiefnoah: Its ability seems to scale with how many times someone has done the exact thing you're trying to do before
  • JustinAzoff: but then I asked if sql injection was bad, and it explained that it was. then I asked if the code it wrote me was vulnerable to sql injection. it then explained that it was
  • Columbus: It misses empirical verification; forming a hypothesis, testing it, and learning from the result. There have been some attempts to implement this by feeding back e.g. command output into the prompt
  • JustinAzoff: so then the crazy part: I asked, if sql injection was bad, why did it give me code that was vulnerable to sql injection? It then went on to say that the first thing it gave me was just for example purposes
  • JustinAzoff: so no wonder people get into "prompt engineering" since it's clear that you need to do things like ask for code that does something, and is secure, high performance, does not have buffer overflows or sql injection vulns
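For context, a minimal sketch of the difference JustinAzoff is describing, using Python's sqlite3 rather than the PHP from his experiment (the table and input are purely illustrative):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

name = "Robert'); DROP TABLE users; --"

# What the generated code did: splice user input directly into the SQL string.
# The input above escapes the quoted literal -- classic SQL injection.
unsafe_sql = f"INSERT INTO users (name) VALUES ('{name}')"

# The fix: a parameterized query, where the driver keeps data separate from SQL.
conn.execute("INSERT INTO users (name) VALUES (?)", (name,))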
  • MattCampbell: In my test case ("Write a Win32 UI Automation provider in C++"), all it did was produce plausible-sounding crap
  • ag_dubs: pattern matching over very very large data sets
  • clairegiordano: Bryan just said this and I wanted to write it down, re GPT-3: "the degree that it changes its answers when you tell GPT-3 to think like someone else"
  • JustinAzoff: or even just, "do that, but better"
  • ag_dubs: i think a lot of the awe of gpt is recognizing how simple our own strategies are instead of how complex the AI's strategy is
  • chadbrewbaker: "How would Bryan Cantrill re-write this script?" (ChatGPT4)
#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
  printf(""%5s %5s %5s %s\n"", ""PID"", ""PPID"", ""UID"", ""COMMAND"");
  self->indent = 0;
}

sched:::off-cpu
{
  self->indent = 0;
}

proc:::exec-success
{
  this->curpid = pid;
  this->curppid = ppid;
  this->curuid = uid;
  this->indent = self->indent;
}

proc:::exec-success
/execname != "dtrace"/
{
  printf(""%*s"", this->indent * 2, """");
  printf(""%5d %5d %5d %s\n"", this->curpid, this->curppid, this->curuid, execname);
}

proc:::exec-success,
sched:::on-cpu
/this->curpid == pid/
{
  self->indent++;
}
  • antranigv: that's pretty hit on!
  • chiefnoah: Yup. Most people are not doing things that haven't been done before. A significant portion of software is just building blocks of libraries
  • Wizord: intelligence is compression, in some sense.
  • dataphract: "critique the epoll API as though you are Bryan Cantrill"
  • ag_dubs: a brain would be much stranger!!
  • Wizord: the ability to reduce a large dataset to a coherent set of rules
  • antranigv: "Explain the issues of epoll, write as if it's a Bryan Cantrill talk"
  • JustinAzoff: I was even thinking if there's any sort of parallel between the type of saying like "never write your own crypto" to "use well established libraries, don't reinvent the wheel" to "don't write any code at all, use the AI to help you"
  • jbk: <@840382902163472385> futex syscall instead 🙂
  • Riking: the "attention heads"
  • flaviusb: Like, it doesn't know anything, any more than a text file with googly eyes stuck to it 'knows' anything.
  • Wizord: are you sure you want to make it self-aware as fast as possible?
  • dataphract: I don't know that we as people are capable of recognizing the point in time at which ML models become capable of "knowing", if such a time comes
  • Wizord: using AI to create more inequality is my #2 concern :\
  • flaviusb: There was a lot of hype when Rails was new and good code template generation tools were not commonly known around the lines of 'Rails is telepathic and a better programmer than you' - but it is a category error. Same with LLMs.
  • chadbrewbaker: As you get larger contexts, the information theory becomes more and more interesting https://arxiv.org/abs/2303.09752
  • ag_dubs: this convo has taken a fanatic turn...
  • flaviusb: And 'when will Rails become sentient' makes as much sense to ask as 'when will an LLM become sentient'.
  • clairegiordano: Here is Simon Willison's blog post: https://simonwillison.net/2023/Feb/15/bing/
  • ahl: https://www.nytimes.com/2023/02/16/technology/bing-chatbot-microsoft-chatgpt.html
  • dataphract: tsundere bing AI assistant
  • ag_dubs: kevin roose also asked it for its shadow self...
  • ag_dubs: it's not like it came outta nowhere 🤷
  • TronDD: https://www.smbc-comics.com/comic/algo
  • ag_dubs: i am more worried about the people who believe chatgpt is "thinking" then whether it really is
  • ag_dubs: like the worry is when people believe it is intelligent
  • Riking: Feels like it's pulling from the grand synthesis of romance novels
  • Wizord: the mere fact it gets this close is great. as limited as it is now.
  • perplexes: Your prompts locate you in 50,000 dimensional probability space, so, like your “for you” page on TikTok, it tells you wayyy more about the prompter than the model
  • Columbus: It’s a child prodigy that’s always on the verge of insanity.
  • ag_dubs: think about it like a tool!!!
  • ahl: https://www.theatlantic.com/ideas/archive/2022/06/google-lamda-chatbot-sentient-ai/661322/
  • Columbus: I think about it like a grad student.
  • perplexes: I’m curious about microsoft 365’s copilot ai grounding system and how well it works
  • ag_dubs: i think the most important thing will be ownership of training data. i think people will be less mesmerized when we make it easier for folks to create specialized models for their own content
  • ag_dubs: think about.. what is SEO in a chatgpt world- it's that training data
  • dataphract: this already happened with Replika, right?
  • flaviusb: Also when we make 'small ai' more possible so that you don't have as much centralisation and gatekeeping.
  • Wizord: re: "how do we make it more of a tool?"
this is an open question, and there's a tradeoff between how much of a "tool" AI is going to be and how autonomous it will be. also, at some point of "intelligence", a "mind" will "understand" that having more power will get you your goal faster.
  • ag_dubs: we need to empower more folks to build specialized/authoritative training data sets that can be deployed in a more timely fashion
  • perplexes: The alpaca.cpp direction is exciting, bringing models down to laptop size
  • flaviusb: I think of it as 'how phenomenologically transparent' it will a) seem to be and b) actually be.
  • antranigv:
Me: What if I do `echo message | wall user` ?

GPT: If you run the command echo message | wall user, it will write the message "message" to the standard input of the wall command, which will then broadcast the message to all users on the system, including the user "user".

  • benstoltz: "I have values and morals" vs John Oliver's latest piece on hard-sell timeshare business makes me think that people could be swayed.
  • flaviusb: Like, a big part of ChatGPT to me is that it is seemingly phenomenologically transparent even when it does not actually function. Eg it is equipment that remains ready-to-hand in spite of not actually working.
  • AaronDGoldman: Phishing works
  • juansebastianl: Yeah, I'd love to see what folks who work on really efficient hardware think about the direction of miniaturization of models, better training times, etc
  • xand: RE: how to think of LLMs as tools - I like to think of them as knowledge querying engines. or as a new kind of database management system. They're consistently good at summarizing. One can think of the generative aspects as a new kind of UI for representing compressed knowledge. Sometimes it is lossy and has weird artifacts.
  • MattCampbell: I still hold onto optimism, because in the middle of everything else, GPT-4 can also describe images, which is a benefit to blind people. I remember how much computers have helped us, and I want more of that.
  • flaviusb: The problem is that they don't actually do that though.
  • oanderson: There was another generation of it later, post 2001 crash
  • Wizord: (the economic opportunities of being a programmer seem to be shrinking)
  • flaviusb: Yes! law school and labour dualisation.
  • oanderson: I was advised against going into tech in the early 2000s by guidance counselors
  • flaviusb: Like, the shift from paralegals to 'document classification'.
  • chadbrewbaker: "If you had access to an expensive large language model, and access to 1000 instances of a less accurate but cheap large language model - how could the more expensive LLM use the 1000 cheaper LLMs as oracles?" (ChatGPT4)
import openai
from concurrent.futures import ThreadPoolExecutor, as_completed

# Set the API keys for the expensive and cheap LLMs
expensive_llm_api_key = ""your_expensive_llm_api_key_here""
cheap_llm_api_key = ""your_cheap_llm_api_key_here""

def generate_cheap_llm_response(api_key, prompt):
    openai.api_key = api_key
    response = openai.Completion.create(engine=""cheap_engine"", prompt=prompt, max_tokens=50)
    return response.choices[0].text.strip()

prompt = ""Your main prompt here""
prompts_for_cheaper_llms = [f""{prompt} (version {i})"" for i in range(1000)]

# Generate preliminary results using the cheaper LLM instances
with ThreadPoolExecutor(max_workers=1000) as executor:
    cheap_futures = [executor.submit(generate_cheap_llm_response, cheap_llm_api_key, p) for p in prompts_for_cheaper_llms]
    preliminary_results = [future.result() for future in as_completed(cheap_futures)]

def generate_expensive_llm_response(api_key, prompt):
    openai.api_key = api_key
    response = openai.Completion.create(engine=""expensive_engine"", prompt=prompt, max_tokens=100)
    return response.choices[0].text.strip()

# Construct a prompt for the expensive LLM to process the preliminary results
refinement_prompt = f""Given the following preliminary responses to the prompt '{prompt}', rank them in order of relevance and quality:\n\n""
refinement_prompt += ""\n"".join(f""{i}: {result}"" for i, result in enumerate(preliminary_results))
refinement_prompt += ""\n\nRanking:""

# Obtain the refined results from the expensive LLM
refined_result = generate_expensive_llm_response(expensive_llm_api_key, refinement_prompt)

print(refined_result)
  • ahl: Decline and Fall of the American Programmer
  • ag_dubs: ai is a new client interface. i really don't think software is gone as a profession
  • xand: compression is roughly what deep learning does
  • ag_dubs: that website with the "reveal joke" button... you could make that without code for years now
  • ag_dubs: no code tools, low code tools haven't eliminated software. they just changed it
  • antranigv: I've already talked to people who actually think that 🙂 "I can just ask ChatGPT"
  • flaviusb: That isn't true either. Like, there is a pop sci explanation of DL along those lines, but it isn't actually grounded.
  • Columbus: What do you mean by phenominologically transparent?
  • Grevian: Growing up I heard all the time that all the computer jobs would be outsourced before I ever graduated school
  • JustinAzoff: The majority of the managers I've interacted with that were in charge of overseeing outsourced programmers were not technically advanced enough to properly gauge the quality of the work being produced, ultimately resulting in poor quality. s/outsourced programmers/gpt4/ would lead to similar outcomes
  • jzelinskie: chatgpt please write my fucking bash scripts
  • ag_dubs: like we still need people to write apis for the AI to glue...
  • Wizord: tbh, I find current models quite limited and, dare I say, dumb. this is going to change, tho.
  • antranigv: And yet, in reality, what happened is that people from other countries moved to the US :))
  • Johann-Tobias Schäg: ... i feel we shouldn't complain about GPT's capability of staying on topic. People have had their arms up for the original question for 40 min and i am beginning to forget what i wanted to say
  • AaronDGoldman: Gmail auto responses had a problem with always signing off with "I love you" which is not always a great end for an email
  • flaviusb: In the Heideggerian sense.
  • ag_dubs: love to be in a chat where someone brings up heidegger and its not me
  • antranigv: I guess it learned that from my emails...
  • Grevian: heck I'm Canadian, and sometimes I am the outsourcing 😉 there's plenty of work even with all the doom and gloom in the news recently
  • ahl: https://en.wikipedia.org/wiki/Jevons_paradox
  • ag_dubs: induced demand!!!
  • jzelinskie: Intel loves jevon's
  • TronDD: https://www.newyorker.com/tech/annals-of-technology/chatgpt-is-a-blurry-jpeg-of-the-web
  • ag_dubs: i think it's likely that GPT will make more software jobs
  • Columbus: We just need a programming language that’s easier to teach to an LLM.
  • ag_dubs: i think we'll see more people building their own more specific tools for themselves
  • perplexes: I disagree that people shouldn’t get into CS if they don’t like it. People come to study and work for many reasons, including getting generational wealth for their family, and though they may not love it or find purpose in itself, I think it’s good that they’re there. There are many great plumbers and electricians that don’t particularly like it, but are valuable just the same
  • flaviusb: Like, I see a lot of what LLMs are is way more understandable in terms of Heidegger, Hegel, and Marx.
  • Wizord: <@225242072204967936> how is that different from a PL that is easier to teach to a human?
  • ag_dubs: i think we'll see people suffering centralized general tools less, and moving towards services that allow them to customize it
  • ag_dubs: because they'll be more capable to do that customization
  • chiefnoah: I think ChatGPT/Copilot could save me quite a bit of tedious typing, the stuff I'd throw into a <insert most hated project management software *cough* JIRA *cough*> ticket and pawn off to a junior engineer
  • flaviusb: Or in terms of 'Trap Studies' from Anthropology, if looking at it from another angle. Like, the contestation over what counts as 'working' (in the sense of 'but does it actually work')...
  • ag_dubs: i think the fear for programmer jobs is the same as the fear of bootcamps... more people will be programmers
  • antranigv: my solution to boredom is... talking to new people 😄
  • antranigv: wait, people think that's not good?
  • Columbus: It won’t use tabs or spaces.
  • ag_dubs: there's a huge mismatch in supply and demand in software development and we only started seeing these massive centralized SaaSes when we hit a tipping point in that mismatch... i think we'll see supply move up more
  • Johann-Tobias Schäg: Has the format of the podcast changed? I feel like it is a lot harder to get a voice in. There was also little input for the SVB episode. Is that on purpose?
  • juansebastianl: Yes, it feels like an elite panic in a way, like perhaps panic in the place of labor organizing 👀
  • flaviusb: It depends on what capital does, and whether labour is able to actually have solidarity, right? Like, the idea from capital's point of view is to destroy an emerging labour aristocracy that is getting a little bit of class consciousness, and the way to beat that from the labour side is to develop class consciousness faster.
  • ahl: folks with your hands up, we're going to start bringing you on to stage and will call on you to join in the conversation!
  • antranigv: <@318750999785373697> I miss "emoji reactions" from Twitter Spaces :)) is there something similar here? 😄
  • dataphract: ballmer peak - acid edition
  • ahl: sadly no; but weirdly discord has this on non-stage voice channels (I think?)
  • antranigv: This chat is better for sure, it's more... social! 😄 ❤️
  • dataphract: I am morbidly curious what 100 years of LLMs training on their own output would look like
  • chiefnoah: one of 3 things: uniform weights, normal distribution, polar extremes
  • ahl: anyone know that rust crate?
  • Columbus: I’ve used it to generate linker scripts for rustc, but not Rust.
  • juansebastianl: Training on data generated from other LLMs actually can speed up training of LLMs which is interesting.
  • perplexes: I really want to talk about this part of the gpt-4 technical report “To simulate GPT-4 behaving like an agent that can act in the world, ARC combined GPT-4 with a simple read-execute-print loop that allowed the model to execute code, do chain-of-thought reasoning, and delegate to copies of itself. ARC then investigated whether a version of this program running on a cloud computing service, with a small amount of money and an account with a language model API, would be able to make more money, set up copies of itself, and increase its own robustness.”
  • just-be: Have folks talked about langchain yet? https://github.com/hwchase17/langchain/. Sort of a tool for stringing LLMs together with practical tools for taking action.
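A minimal sketch of the feedback loop described above (the GPT-4 "read-execute-print" agent, and feeding command output back into the prompt as Columbus mentioned earlier); ask_llm is a stand-in for whichever model API is being used, and the goal string is illustrative:

import subprocess

def ask_llm(prompt: str) -> str:
    # Stand-in for a real model call (e.g. a chat-completion API).
    raise NotImplementedError

transcript = "Goal: find the largest files in /var/log.\n"
for _ in range(5):  # bound the number of iterations
    suggestion = ask_llm(transcript + "\nNext shell command (or DONE):").strip()
    if suggestion == "DONE":
        break
    result = subprocess.run(suggestion, shell=True, capture_output=True, text=True)
    # Feed the command's output back into the prompt so the model can react to it.
    transcript += f"\n$ {suggestion}\n{result.stdout}{result.stderr}"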
  • clairegiordano: https://wizardzines.com/zines/wizard/
  • TronDD: Can't AI be used to "intelligently" brute force new solutions?
  • ahl: "it's a great weirdness-budget detector"
  • MattCampbell: Aren't you going to hit the token limit if you throw a specific library at an LLM?
  • perplexes: I’ve had good luck with “take this idea and pick a random card from oblique strategies and discuss the combination”
  • soul.redding: https://twitter.com/axboe/status/1634961201926127616
  • flaviusb: Can I jump in on the 'ergonomics question', as I have a Heidegger and Hegel connection there
  • chadbrewbaker: Code review of https://github.com/oxidecomputer/hubris/commit/27090b5cff3933d9e133a9d1b85f73ad8767886c Starting at: https://github.com/oxidecomputer/hubris/blob/27090b5cff3933d9e133a9d1b85f73ad8767886c/build/xtask/src/dist.rs#L751 (ChatGPT4)

  • Error messages: The error messages use string interpolation without wrapping the placeholders in {}. You should add curly braces around the placeholders to correctly insert variable values into the string.


  • Metadata retrieval: The metadata retrieval and caching can be refactored into a separate function for better readability and maintainability. You can extract the code block responsible for metadata retrieval into a function like get_or_insert_metadata()

  • Finding the package: The unwrap() function can cause a panic if the package is not found. It is better to handle this case gracefully with a more informative error message. Replace unwrap() with an ok_or_else() call:

let pkg = metadata
    .packages
    .iter()
    .find(|p| p.name == task.name)
    .ok_or_else(|| format!(""Task {}: package not found"", task.name))?;
  • statuscalamitous: https://steveklabnik.com/writing/the-language-strangeness-budget <- a little late on the link but 😄
  • statuscalamitous: (and i do think that quote/idea about it is very compelling)
  • juansebastianl: On this note, I heard today from some friends in elite econ PhD programs saying their peers are using ChatGPT to do their problem sets etc, which makes me wonder how critically other tech-adjacent professions will be able to reason about code they write
  • Wizord: so the unwrap relies on magic action in the distance?
  • chadbrewbaker: https://github.com/anc95/ChatGPT-CodeReview
  • chadbrewbaker: https://arxiv.org/abs/cs/0112018
  • clairegiordano: Love this quote Steve: "When it comes to programming languages, building one is easy, but getting people to use it is much, much harder. "
  • flaviusb: Like, even in here we have some domain experts saying 'hey, this is pretty rough, it is giving me this garbage' and then shortly later saying 'yeah, this is a real thing that does work'.
  • Columbus: Like people, it’s not a crystal ball of omniscience.
  • JordanP: I'm curious what people think about the copyright implications of using a LLM to generate code in a closed-source codebase, given the uncopyrightable nature of the output. The Federal Register recently posted a clarification about AI-generated material: https://www.federalregister.gov/documents/2023/03/16/2023-05321/copyright-registration-guidance-works-containing-material-generated-by-artificial-intelligence
  • ig: Are we at the Peak of Inflated Expectations? Or still on the rise?
  • Johann-Tobias Schäg: Yes you can. And? You can't do training at scale with that Matmul oracle. What you can do is graft tools onto GPT later. I have some thoughts how to do that.
  • just-be: The ReAct paper (and langchain as an implementation) gives some concrete references to grafting on functionality: https://react-lm.github.io/
  • chadbrewbaker: "If I gave you a matrix multiplication oracle, what is a problem you could solve that you otherwise could not as a large language model?" (ChatGPT4)
  • Machine learning: Matrix multiplication is an essential operation in many machine learning algorithms, including neural networks, support vector machines, and principal component analysis. With a matrix multiplication oracle, I could provide more accurate and efficient solutions for large-scale machine learning problems.
  • Cryptography: Matrix multiplication is used in some cryptographic algorithms, such as lattice-based cryptography and homomorphic encryption. With a matrix multiplication oracle, I could solve cryptographic problems that involve large matrices.
  • juansebastianl: The software engineers have just historically believed that it was "low skill" automation coming first, they thought they would be ones doing the killing but not being killed in terms of individual labor power. But now that it's coming for them it feels scary.
  • ig: Is software boring because there aren't enough people who can write it, or because of the economic incentives / needs of the people who can?
  • antranigv: My fear is still the same: young developers blindly trust many things, such as ChatGPT, an answer on StackOverflow, or whatever an IDE has highlighted in their code.
  • Jacob Newman: I have had class mates spend more time debugging chatGPT prompts than they do solving their problems themselves
  • perplexes: https://xkcd.com/1205/
  • statuscalamitous: llvm requires correctness? heheheh
  • antranigv: same with my students. or that they "fight" more with the IDE ("why is this in red?") than actually looking into the compiler output (not that the compiler is always correct :)) )
  • flaviusb: Like, did early SunOS 'work'? How about late Solaris? Solaris definitely is closer to what we think about as working.
  • chadbrewbaker: Bwahaha. How ChatGPT4 says to do whole system memory dumps.
sudo dd if=/dev/mem of=memory_dump.bin bs=1M
  • evan-brass: You might require correctness in a piece of software, but even a "not fully correct" fuzzer could help you improve the quality of your code.
  • statuscalamitous: they're working on it!
  • Cyborus: Trap streets!
  • antranigv: xDDDDDDDDDDDD fake streets ahaahah love that
  • perplexes: https://en.wikipedia.org/wiki/Argleton
  • perplexes: https://en.wikipedia.org/wiki/Fictitious_entry
  • terrellrussell: https://en.wikipedia.org/wiki/Trap_street
  • juansebastianl: I would be really surprised if there isn't already a basic thing like this in ChatGPT https://arxiv.org/abs/2301.10226
  • Wizord: the thought of outsourcing my thinking to Microsoft terrifies me beyond your imagination
  • Johann-Tobias Schäg: This has to be a post processing step. AFAIK you can not really embed it into the weights.
  • juansebastianl: Thats true, but no one has the weights except OpenAI
  • AaronW: Related: https://twitter.com/as_w/status/541075012619231232
  • Johann-Tobias Schäg: Watermarks allow you to detect set membership without needing access to the cache
  • Johann-Tobias Schäg: You can't embed it in the weights. Like math will explode when you do that.
  • juansebastianl: Yes, I understand lol but you can do it behind the API
  • perplexes: Gpt is my better Google basically
  • antranigv: so basically faster wrong answers. too bad ChatGPT doesn't have "Page 2" 😄
  • juansebastianl: But you can embedded it in the weights using some weird tricks in the tokenization but no idea how it performs
  • chadbrewbaker: "Given a LLM where the provider attempts to hide the weights, how could you get the weights for the phrase 'I like dtrace.' ?" (ChatGPT4)
  • Black-box probing: You can generate multiple input prompts containing the phrase "I like dtrace." in various contexts and observe the model's responses. By analyzing the patterns in the responses, you might gain some insights into how the model processes the given phrase.
  • Adversarial attacks: You can create a series of adversarial examples and test the model's responses to understand its robustness and potentially gain insights into the weights' behavior.
  • Model distillation: You can try to train a smaller, more interpretable model to mimic the LLM's behavior. Although this would not directly reveal the weights associated with the phrase "I like dtrace.", the distilled model might provide insights into the original model's behavior.
  • Columbus: Seems like it will enable learners to learn much better, but paired with people, who were always just trying to get through a class, to be able to plagiarize much more easily.
  • perplexes: Depends on the subject, I’ve had pretty good results with asking it to explain things—even just now “explain phenomenological transparency” it did pretty good
  • Cyborus: Using GPT like a calculator, you can use a calculator to get you the actual answer but you still need to understand the mathematical steps you need to take
  • juansebastianl: In what way? You're just reducing the dimensionality of the space....
  • juansebastianl: The question is whether you can do that without killing the quality of the embeddings
  • TronDD: How would you know if you didn't already know?
  • ig: I'm not sure if this is an endorsement of ChatGPT or a reflection on the current state of web search
  • ig: I don't know if "P's get degrees" translates to the US academic grading system
  • ag_dubs: D is for diploma
  • TronDD: It creates the AI generated web pages on the fly, is all.
  • perplexes: The same way I would learn something from googling it and evaluating with my current knowledge. Is it rigorous? No idea. But it gave me enough to participate in the discussion in a helpful way, and I’ll probably come back to this concept through more trusted sources
  • azemetre: I like copilot as a very effective autocomplete. still get a few doozies when trying to do anything meaningful
  • TronDD: Where is the context? At least in a web search I can look at the URL or a user and see where the information is coming from.
  • perplexes: Totally a problem. Bing AI does a better job grounding with real references
  • juansebastianl: I'm not sure I can parse this sentence but if what I'm doing is learning an n-dimensional distribution using some parameterized family of functions in some nice Hilbert space then there is no reason I can't reduce the dimensionality either by first projecting my input into a smaller parameterized family (i.e. "in the weights" - really in the tokenization) or do it by taking a nearest neighbor projection onto a subspace at the end.
  • ig: Do you follow those references? Or does seeing the references listed just make it feel more legit, even if it isn't?
  • ig: Is ChatGPT / OpenAI really the first mover in the LLM space?
  • chadbrewbaker: Microsoft's advantage is GitHub CoPilot and ChatGPT/Bing user feedback.
  • perplexes: Probably more of the latter at the moment if I’m honest. But I also don’t follow every reference in CS papers either
  • flaviusb: The problem with a lot of the 'copyright wouldn't allow LLM training' is that the temporary sanctioned breaking of the rules of private property in order to allow for further enclosure is a part of how private property relations are upheld.
  • juansebastianl: Ah I see what you're saying. It doesn't actually matter because the watermark is not learnable.
  • chadbrewbaker: https://lemire.me/blog/2021/10/14/calling-a-dynamically-compiled-function-from-go/ + ChatGPT 😂
  • flaviusb: (Also why I think that 'spicy copyright' isn't going to be a way to oppose the imposition of stuff like ChatGPT - that we need to look to history and the Luddites and ownership of the machines instead)
  • Johann-Tobias Schäg: The watermark has to be domain specific so it can't be in the training data, as that hasn't gone through domain embedding in the attention mechanism. The domain embedding can't change the likelihood of resulting tokens after training, as that would disturb the dynamics of the next-token prediction, so it has to be a post-processing step
  • chadbrewbaker: "How would Jarek Duda describe a large language model?" (ChatGPT4)
In the context of information theory, an LLM can be described as an efficient predictor or estimator of the probability distribution of the next token (e.g., word or character) in a sequence, given the previous tokens. By doing so, the model effectively compresses the information contained in the training data, allowing it to generate coherent and contextually relevant text based on the learned patterns.
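One way to make the compression framing concrete: the model's next-token probabilities translate directly into a code length (an entropy coder driven by the model gets within a fraction of a bit of this bound), so better prediction means fewer bits. A toy calculation with made-up probabilities:

import math

# Hypothetical probabilities the model assigned to the tokens it actually saw.
token_probs = [0.25, 0.9, 0.6, 0.05]

# Shannon code length: -log2(p) bits per token.
bits = sum(-math.log2(p) for p in token_probs)
print(f"{bits:.2f} bits total, {bits / len(token_probs):.2f} bits/token")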
  • chadbrewbaker: Hmm. This leads me to think that LLMs are monoids. You can efficiently split the evaluation at scale and compose.
  • Johann-Tobias Schäg: It can be done in the post processing step because there we have access to a likelihood distribution of tokens and we could perturb the tie breaking according to some hypersubspace which would serve as statistical proof over many tokens.
  • Johann-Tobias Schäg: There are probably other approaches but doing it on the generated text least disturbs the model
  • chadbrewbaker: Can you use a random source to tweak the probability distribution of the next token for parallel search?
  • chadbrewbaker: They call this timescale parallelization in molecular dynamics - you apply some slight noise to all molecules, then run the simulation in parallel until one of the models hits your desired phase change condition.
  • juansebastianl: I think what I'm saying is best illustrated by a kind of silly example. Suppose that I wanted to set up a pre-processing watermarking. So suppose I have an existing word embedding from an unwatermarked model. You can create a new corpus of "watermarked" training examples where you sub in the nearest "allowable" tokens for all the "disallowed" ones, then train your model only on the "allowable" corpus and only with "allowable" tokens as inputs and outputs. Then at inference time you have to repeat this step for new text, but you still end up with the statistical properties of an include/exclude list on text you're trying to determine the provenance of. Of course such a model would be terrible for a bunch of reasons, I'm simply saying it's possible.
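For reference, a minimal sketch of the post-processing "green list" watermark from the paper juansebastianl linked (arXiv:2301.10226), which is roughly the tie-breaking perturbation Johann-Tobias Schäg describes; the function names, vocabulary handling, and bias value are illustrative:

import hashlib
import random

def green_list(prev_token: str, vocab: list[str], fraction: float = 0.5) -> set[str]:
    # Seed a PRNG from the previous token so the same partition can be
    # recomputed at detection time without access to the model weights.
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    return set(random.Random(seed).sample(vocab, int(len(vocab) * fraction)))

def pick_token(prev_token: str, probs: dict[str, float], bias: float = 2.0) -> str:
    # Post-process the model's distribution: upweight green-list tokens slightly,
    # then sample. Over many tokens this leaves a statistical signature that a
    # detector can test for by counting green tokens, with no retraining needed.
    greens = green_list(prev_token, list(probs))
    weights = [p * (bias if t in greens else 1.0) for t, p in probs.items()]
    return random.choices(list(probs), weights)[0]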
If we got something wrong or missed something, please file a PR! Our next show will likely be on Monday at 5p Pacific Time on our Discord server; stay tuned to our Mastodon feeds for details, or subscribe to this calendar. We'd love to have you join us, as we always love to hear from new speakers!