I asked ChatGPT-4 to do some stats modelling – it was okay…ish
Hello guys! There’s been some debate, particularly on here, about the “future of data science” and “whose job is going to be taken” and so on and so on. Imo I don’t know the answer, but I think LLMs have definitely changed the landscape.
One of the really interesting things ChatGPT has unlocked is that people can now code without really knowing how to. I think if you’re already familiar with coding, using ChatGPT to improve productivity is awesome. But if you’re just starting out and use it to generate code you can’t explain, then I think you can get into a lot of trouble.
And I think this is especially true when there’s a mathematical modelling choice aspect to your code. My thought was that just because something works / compiles doesn’t mean it’s a good model, and doesn’t mean that the specific choices / assumptions make sense. This, of course, isn’t ChatGPT’s fault, it’s the user’s fault for not checking!
Anyway, to investigate this point, I recently tested ChatGPT by asking it to write Stan code (Bayesian inference) to predict Premier League matches. My feeling was that the task was simple enough for it to do an okay job, but not so generic that there are a million examples on the internet.
I put the results on YouTube (links below), but in summary I found the following:
1. ChatGPT made a decent model, but with some really weird choices. E.g. it decided to use a normal distribution to model goal differences, where I think a Skellam would have been better. It also decided not to model the variance of this distribution, instead fixing it at 1. Super weird!
2. It wasn’t able to reason about things like over-parameterisation. The model it built had way too many parameters, unnecessarily. The idea of parsimony wasn’t really there. Maybe with better prompts it would have been, but out of the box it made the model overly complex.
3. Prompt engineering really makes a difference. I think with better prompts, the model could have been better. There was even a point where I spotted an error and prompted ChatGPT to fix it, and it did! But again, this all relied on me being able to read Stan code and know what was good and bad.
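To make the distribution point concrete, here’s a quick Python sketch (using scipy; the scoring rates are made-up numbers, not from my actual model). If each team’s goals are Poisson, the goal difference is Skellam-distributed: it lives on the integers and is naturally asymmetric whenever the two rates differ, both properties a normal with variance fixed at 1 throws away.

```python
import numpy as np
from scipy.stats import skellam, poisson

# Hypothetical expected-goal rates for the home and away sides
# (assumed values for illustration only).
mu_home, mu_away = 1.7, 1.1

# Goal difference (home - away) under the Poisson-goals assumption:
# Skellam pmf over a range of plausible score differences.
ks = np.arange(-10, 11)
pmf = skellam.pmf(ks, mu_home, mu_away)

# Sanity check: the Skellam pmf is the convolution of two Poissons,
# P(diff = k) = sum_j Pois(j + k; mu_home) * Pois(j; mu_away).
def diff_pmf(k, mu1, mu2, n_max=80):
    js = np.arange(max(0, -k), n_max)
    return np.sum(poisson.pmf(js + k, mu1) * poisson.pmf(js, mu2))

print(round(float(pmf.sum()), 6))  # essentially all mass on -10..10
print(np.isclose(diff_pmf(2, mu_home, mu_away),
                 skellam.pmf(2, mu_home, mu_away)))
```

Note the asymmetry: with the home rate above the away rate, a +1 difference is more likely than a −1, which a symmetric normal centred on the mean with a hard-coded variance can’t capture cleanly.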
For me, I learnt that at least for tasks where a lot of modelling choices have to be made, humans still beat GPT. But perhaps in the future, the ones who win will be the data scientists / engineers who know what they’re doing but are able to prompt GPT optimally to maximise their productivity boost.
The videos are here:
Part 1: https://m.youtube.com/watch?v=4LTUYTxKuIk&t=66s&pp=ygUTbGVhcm4gc3RhbiB3aXRoIHJpYw%3D%3D
Part 2: https://m.youtube.com/watch?v=XjQpV6c9K5g&t=1s&pp=ygUTbGVhcm4gc3RhbiB3aXRoIHJpYw%3D%3D
Comments (3)
Did you do anything like iterate with GPT, or was the problem statement given and then the first answer taken as the result? I.e., did you feed back your opinions about its choices to see if it would improve on them? Also, did you use anything like chain-of-thought prompting?
Also, I see this as an interesting use/test case for AutoGPT, as that allows the agent to reflect on its thoughts and show its reasoning, if you have API access. It would be an interesting experiment to plug in the Stan documentation as a knowledge base to see how it uses it.
Look forward to watching the videos in full!
I use ChatGPT to build complex, fractional design experiments, choosing my test stat and substantive stat measures, and it works beautifully. I get paid a lot and I don’t do a lot, so I’m not sure what application you’re talking about, but it’s a fucking perfect tool. I iterate with it like 14-20 times before I’m done, but it has only improved.
Thanks, this was very insightful.